* [RFC PATCH 0/6] Introduce a vringh accessor for IO memory
@ 2022-12-27  2:25 ` Shunsuke Mie
  0 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: kvm, virtualization, netdev, linux-kernel, Shunsuke Mie

Vringh is a host-side implementation of virtio rings that supports vrings
located in three kinds of memory: userspace, kernel space, and address
space translated by an IOTLB.

The goal of this patchset is to refactor vringh and introduce a new vringh
accessor for vrings located in I/O memory regions. The I/O memory accessor
(iomem) is used by a driver that has not been published yet, but I'm
planning to publish it. Drivers affected by these changes, e.g. the
caif_virtio and vdpa (sim_net, sim_blk and net/mlx5) drivers, are not
included in this patchset.

This patchset is separated into 3 parts:
1. Fix and prepare some code related to vringh [1, 2, 3/6]
2. Unify the vringh APIs and update related code [4, 5/6]
3. Support IOMEM in vringh [6/6]

The first part is preparation for the second, consisting of a small fix
and some changes. The vringh test code, vringh_test, is also updated along
with these changes. The second part unifies the vringh APIs across the
accessors: user, kern and iotlb. The key piece is struct vringh_ops, which
bridges the gap between the accessors. The final part introduces iomem
support to vringh on top of the unified API from the second part.

These changes are tested with vringh_test for the user accessor, and with
an unpublished driver for the kern and iomem accessors. I expect to add a
link to that driver's patchset in the next version of this series.

Shunsuke Mie (6):
  vringh: fix a typo in comments for vringh_kiov
  vringh: remove vringh_iov and unite to vringh_kiov
  tools/virtio: convert to new vringh user APIs
  vringh: unify the APIs for all accessors
  tools/virtio: convert to use new unified vringh APIs
  vringh: IOMEM support

 drivers/vhost/Kconfig      |   6 +
 drivers/vhost/vringh.c     | 721 ++++++++++++-------------------------
 include/linux/vringh.h     | 147 +++-----
 tools/virtio/vringh_test.c | 123 ++++---
 4 files changed, 356 insertions(+), 641 deletions(-)

--
2.25.1



* [RFC PATCH 1/9] vringh: fix a typo in comments for vringh_kiov
  2022-12-27  2:25 ` Shunsuke Mie
@ 2022-12-27  2:25   ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: kvm, virtualization, netdev, linux-kernel, Shunsuke Mie

This is probably a simple copy-and-paste error from the struct vringh_iov
comment.

Fixes: f87d0fbb5798 ("vringh: host-side implementation of virtio rings.")
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 include/linux/vringh.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 212892cf9822..1991a02c6431 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -92,7 +92,7 @@ struct vringh_iov {
 };
 
 /**
- * struct vringh_iov - kvec mangler.
+ * struct vringh_kiov - kvec mangler.
  *
  * Mangles kvec in place, and restores it.
  * Remaining data is iov + i, of used - i elements.
-- 
2.25.1



* [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  2:25 ` Shunsuke Mie
@ 2022-12-27  2:25   ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: kvm, virtualization, netdev, linux-kernel, Shunsuke Mie

struct vringh_iov is defined to hold userland addresses. However, to use
the common function __vringh_iov(), a struct vringh_iov is ultimately
converted to a struct vringh_kiov with a simple cast, guarded by
compile-time checks that make sure the cast is valid.

To simplify the code, this patch removes struct vringh_iov and unifies the
APIs around struct vringh_kiov.
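
The cast is only valid because the two structs are layout-identical, which
is what the deleted BUILD_BUG_ON lines assert at compile time. A minimal
userspace analogue of that technique (hypothetical struct names, C11
_Static_assert standing in for the kernel's BUILD_BUG_ON):

```c
/* Userspace analogue of the layout checks this patch removes: two structs
 * may only be cast to one another if their size and field offsets match.
 * demo_iov/demo_kiov are illustrative stand-ins, not kernel types. */
#include <assert.h>
#include <stddef.h>

struct demo_iov  { void *base; size_t consumed; unsigned i, used, max_num; };
struct demo_kiov { void *base; size_t consumed; unsigned i, used, max_num; };

_Static_assert(sizeof(struct demo_iov) == sizeof(struct demo_kiov),
	       "layouts must match for the cast to be valid");
_Static_assert(offsetof(struct demo_iov, used) == offsetof(struct demo_kiov, used),
	       "field offsets must match");

/* With identical layouts, common code written for one type can serve the
 * other through a cast. */
static unsigned demo_remaining(const struct demo_kiov *kiov)
{
	return kiov->used - kiov->i;
}

static unsigned demo_remaining_iov(const struct demo_iov *iov)
{
	return demo_remaining((const struct demo_kiov *)iov);
}
```

Removing the duplicate type removes the cast and the checks together, which
is the simplification this patch is after.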

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 drivers/vhost/vringh.c | 32 ++++++------------------------
 include/linux/vringh.h | 45 ++++--------------------------------------
 2 files changed, 10 insertions(+), 67 deletions(-)

diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index 828c29306565..aa3cd27d2384 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
  * calling vringh_iov_cleanup() to release the memory, even on error!
  */
 int vringh_getdesc_user(struct vringh *vrh,
-			struct vringh_iov *riov,
-			struct vringh_iov *wiov,
+			struct vringh_kiov *riov,
+			struct vringh_kiov *wiov,
 			bool (*getrange)(struct vringh *vrh,
 					 u64 addr, struct vringh_range *r),
 			u16 *head)
@@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
 	if (err == vrh->vring.num)
 		return 0;
 
-	/* We need the layouts to be the identical for this to work */
-	BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
-	BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
-		     offsetof(struct vringh_iov, iov));
-	BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
-		     offsetof(struct vringh_iov, i));
-	BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
-		     offsetof(struct vringh_iov, used));
-	BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
-		     offsetof(struct vringh_iov, max_num));
-	BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
-	BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
-		     offsetof(struct kvec, iov_base));
-	BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
-		     offsetof(struct kvec, iov_len));
-	BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
-		     != sizeof(((struct kvec *)NULL)->iov_base));
-	BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
-		     != sizeof(((struct kvec *)NULL)->iov_len));
-
 	*head = err;
 	err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
 			   (struct vringh_kiov *)wiov,
@@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
 EXPORT_SYMBOL(vringh_getdesc_user);
 
 /**
- * vringh_iov_pull_user - copy bytes from vring_iov.
+ * vringh_iov_pull_user - copy bytes from vring_kiov.
  * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
  * @dst: the place to copy.
  * @len: the maximum length to copy.
  *
  * Returns the bytes copied <= len or a negative errno.
  */
-ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
+ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
 {
 	return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
 			       dst, len, xfer_from_user);
@@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
 EXPORT_SYMBOL(vringh_iov_pull_user);
 
 /**
- * vringh_iov_push_user - copy bytes into vring_iov.
+ * vringh_iov_push_user - copy bytes into vring_kiov.
  * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
  * @src: the place to copy from.
  * @len: the maximum length to copy.
  *
  * Returns the bytes copied <= len or a negative errno.
  */
-ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
+ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
 			     const void *src, size_t len)
 {
 	return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 1991a02c6431..733d948e8123 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -79,18 +79,6 @@ struct vringh_range {
 	u64 offset;
 };
 
-/**
- * struct vringh_iov - iovec mangler.
- *
- * Mangles iovec in place, and restores it.
- * Remaining data is iov + i, of used - i elements.
- */
-struct vringh_iov {
-	struct iovec *iov;
-	size_t consumed; /* Within iov[i] */
-	unsigned i, used, max_num;
-};
-
 /**
  * struct vringh_kiov - kvec mangler.
  *
@@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
 		     vring_avail_t __user *avail,
 		     vring_used_t __user *used);
 
-static inline void vringh_iov_init(struct vringh_iov *iov,
-				   struct iovec *iovec, unsigned num)
-{
-	iov->used = iov->i = 0;
-	iov->consumed = 0;
-	iov->max_num = num;
-	iov->iov = iovec;
-}
-
-static inline void vringh_iov_reset(struct vringh_iov *iov)
-{
-	iov->iov[iov->i].iov_len += iov->consumed;
-	iov->iov[iov->i].iov_base -= iov->consumed;
-	iov->consumed = 0;
-	iov->i = 0;
-}
-
-static inline void vringh_iov_cleanup(struct vringh_iov *iov)
-{
-	if (iov->max_num & VRINGH_IOV_ALLOCATED)
-		kfree(iov->iov);
-	iov->max_num = iov->used = iov->i = iov->consumed = 0;
-	iov->iov = NULL;
-}
-
 /* Convert a descriptor into iovecs. */
 int vringh_getdesc_user(struct vringh *vrh,
-			struct vringh_iov *riov,
-			struct vringh_iov *wiov,
+			struct vringh_kiov *riov,
+			struct vringh_kiov *wiov,
 			bool (*getrange)(struct vringh *vrh,
 					 u64 addr, struct vringh_range *r),
 			u16 *head);
 
 /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
-ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
+ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
 
 /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
-ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
+ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
 			     const void *src, size_t len);
 
 /* Mark a descriptor as used. */
-- 
2.25.1



* [RFC PATCH 3/9] tools/virtio: convert to new vringh user APIs
  2022-12-27  2:25 ` Shunsuke Mie
@ 2022-12-27  2:25   ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: kvm, virtualization, netdev, linux-kernel, Shunsuke Mie

struct vringh_iov is being removed, so convert vringh_test to the new
vringh user APIs. This changes it to use struct vringh_kiov instead of
struct vringh_iov.
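
The conversion pattern in the diff below is mechanical: iovec arrays become
kvec arrays and vringh_iov_* calls become vringh_kiov_*. A userspace sketch
of the resulting setup, with struct kvec and kiov_init() as local stand-ins
for the kernel definitions:

```c
/* Sketch of the converted pattern used in vringh_test: a kvec-backed iov
 * tracker set up the way vringh_kiov_init() is called. These types and the
 * init helper are illustrative stand-ins, not the kernel definitions. */
#include <assert.h>
#include <stddef.h>

struct kvec {
	void *iov_base;
	size_t iov_len;
};

struct kiov {
	struct kvec *iov;
	size_t consumed;	/* within iov[i] */
	unsigned i, used, max_num;
};

static void kiov_init(struct kiov *kiov, struct kvec *kvec, unsigned num)
{
	kiov->used = kiov->i = 0;
	kiov->consumed = 0;
	kiov->max_num = num;
	kiov->iov = kvec;
}
```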

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 tools/virtio/vringh_test.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/tools/virtio/vringh_test.c b/tools/virtio/vringh_test.c
index 98ff808d6f0c..6c9533b8a2ca 100644
--- a/tools/virtio/vringh_test.c
+++ b/tools/virtio/vringh_test.c
@@ -193,8 +193,8 @@ static int parallel_test(u64 features,
 			errx(1, "Could not set affinity to cpu %u", first_cpu);
 
 		while (xfers < NUM_XFERS) {
-			struct iovec host_riov[2], host_wiov[2];
-			struct vringh_iov riov, wiov;
+			struct kvec host_riov[2], host_wiov[2];
+			struct vringh_kiov riov, wiov;
 			u16 head, written;
 
 			if (fast_vringh) {
@@ -216,10 +216,10 @@ static int parallel_test(u64 features,
 				written = 0;
 				goto complete;
 			} else {
-				vringh_iov_init(&riov,
+				vringh_kiov_init(&riov,
 						host_riov,
 						ARRAY_SIZE(host_riov));
-				vringh_iov_init(&wiov,
+				vringh_kiov_init(&wiov,
 						host_wiov,
 						ARRAY_SIZE(host_wiov));
 
@@ -442,8 +442,8 @@ int main(int argc, char *argv[])
 	struct virtqueue *vq;
 	struct vringh vrh;
 	struct scatterlist guest_sg[RINGSIZE], *sgs[2];
-	struct iovec host_riov[2], host_wiov[2];
-	struct vringh_iov riov, wiov;
+	struct kvec host_riov[2], host_wiov[2];
+	struct vringh_kiov riov, wiov;
 	struct vring_used_elem used[RINGSIZE];
 	char buf[28];
 	u16 head;
@@ -517,8 +517,8 @@ int main(int argc, char *argv[])
 	__kmalloc_fake = NULL;
 
 	/* Host retreives it. */
-	vringh_iov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
-	vringh_iov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
+	vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
+	vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
 	err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
 	if (err != 1)
@@ -586,8 +586,8 @@ int main(int argc, char *argv[])
 	__kmalloc_fake = NULL;
 
 	/* Host picks it up (allocates new iov). */
-	vringh_iov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
-	vringh_iov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
+	vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
+	vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
 	err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
 	if (err != 1)
@@ -613,8 +613,8 @@ int main(int argc, char *argv[])
 		assert(err < 3 || buf[2] == (char)(i + 2));
 	}
 	assert(riov.i == riov.used);
-	vringh_iov_cleanup(&riov);
-	vringh_iov_cleanup(&wiov);
+	vringh_kiov_cleanup(&riov);
+	vringh_kiov_cleanup(&wiov);
 
 	/* Complete using multi interface, just because we can. */
 	used[0].id = head;
@@ -638,8 +638,8 @@ int main(int argc, char *argv[])
 	}
 
 	/* Now get many, and consume them all at once. */
-	vringh_iov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
-	vringh_iov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
+	vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
+	vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
 	for (i = 0; i < RINGSIZE; i++) {
 		err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
@@ -723,8 +723,8 @@ int main(int argc, char *argv[])
 		d[5].flags = 0;
 
 		/* Host picks it up (allocates new iov). */
-		vringh_iov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
-		vringh_iov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
+		vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
+		vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
 		err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
 		if (err != 1)
@@ -744,7 +744,7 @@ int main(int argc, char *argv[])
 		/* Data should be linear. */
 		for (i = 0; i < err; i++)
 			assert(buf[i] == i);
-		vringh_iov_cleanup(&riov);
+		vringh_kiov_cleanup(&riov);
 	}
 
 	/* Don't leak memory... */
-- 
2.25.1



* [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-27  2:25 ` Shunsuke Mie
@ 2022-12-27  2:25   ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: kvm, virtualization, netdev, linux-kernel, Shunsuke Mie

Each vringh memory accessor (user, kern and iotlb) has its own interfaces
that call into common code, but some of that code is duplicated, which
hurts extensibility.

Introduce a struct vringh_ops and provide common APIs for all accessors.
This makes it easy to extend vringh with new memory accessors and
simplifies the caller code.
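
The ops-based dispatch can be sketched in userspace as a per-accessor
function table that common ring code calls through, instead of threading
callbacks through every helper's parameter list; the names below are
illustrative stand-ins, not the exact kernel API:

```c
/* Minimal sketch of the ops-table idea: common code dispatches through
 * vrh->ops, and each accessor just fills in the table. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vrh;

struct vrh_ops {
	/* Read a 16-bit ring field through this accessor. */
	int (*getu16)(const struct vrh *vrh, uint16_t *val, const uint16_t *p);
	/* Copy descriptor bytes through this accessor. */
	int (*copydesc)(const struct vrh *vrh, void *dst, const void *src,
			size_t len);
};

struct vrh {
	struct vrh_ops ops;
};

/* "kern"-style accessor: plain memory, direct load and memcpy. A user or
 * iomem accessor would supply different callbacks of the same shape. */
static int kern_getu16(const struct vrh *vrh, uint16_t *val, const uint16_t *p)
{
	(void)vrh;
	*val = *p;
	return 0;
}

static int kern_copydesc(const struct vrh *vrh, void *dst, const void *src,
			 size_t len)
{
	(void)vrh;
	memcpy(dst, src, len);
	return 0;
}

/* Common code: one implementation for all accessors. */
static int read_idx(const struct vrh *vrh, uint16_t *idx, const uint16_t *p)
{
	return vrh->ops.getu16(vrh, idx, p);
}
```

Compare this with the original __vringh_get_head(), which took getu16 as a
function-pointer parameter at every call site; the table moves that choice
to init time.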

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 drivers/vhost/vringh.c | 667 +++++++++++------------------------------
 include/linux/vringh.h | 100 +++---
 2 files changed, 225 insertions(+), 542 deletions(-)

diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index aa3cd27d2384..ebfd3644a1a3 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
 }
 
 /* Returns vring->num if empty, -ve on error. */
-static inline int __vringh_get_head(const struct vringh *vrh,
-				    int (*getu16)(const struct vringh *vrh,
-						  u16 *val, const __virtio16 *p),
-				    u16 *last_avail_idx)
+static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
 {
 	u16 avail_idx, i, head;
 	int err;
 
-	err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
+	err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
 	if (err) {
 		vringh_bad("Failed to access avail idx at %p",
 			   &vrh->vring.avail->idx);
@@ -58,7 +55,7 @@ static inline int __vringh_get_head(const struct vringh *vrh,
 
 	i = *last_avail_idx & (vrh->vring.num - 1);
 
-	err = getu16(vrh, &head, &vrh->vring.avail->ring[i]);
+	err = vrh->ops.getu16(vrh, &head, &vrh->vring.avail->ring[i]);
 	if (err) {
 		vringh_bad("Failed to read head: idx %d address %p",
 			   *last_avail_idx, &vrh->vring.avail->ring[i]);
@@ -131,12 +128,10 @@ static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
 
 /* May reduce *len if range is shorter. */
 static inline bool range_check(struct vringh *vrh, u64 addr, size_t *len,
-			       struct vringh_range *range,
-			       bool (*getrange)(struct vringh *,
-						u64, struct vringh_range *))
+			       struct vringh_range *range)
 {
 	if (addr < range->start || addr > range->end_incl) {
-		if (!getrange(vrh, addr, range))
+		if (!vrh->ops.getrange(vrh, addr, range))
 			return false;
 	}
 	BUG_ON(addr < range->start || addr > range->end_incl);
@@ -165,9 +160,7 @@ static inline bool range_check(struct vringh *vrh, u64 addr, size_t *len,
 }
 
 static inline bool no_range_check(struct vringh *vrh, u64 addr, size_t *len,
-				  struct vringh_range *range,
-				  bool (*getrange)(struct vringh *,
-						   u64, struct vringh_range *))
+				  struct vringh_range *range)
 {
 	return true;
 }
@@ -244,17 +237,7 @@ static u16 __cold return_from_indirect(const struct vringh *vrh, int *up_next,
 }
 
 static int slow_copy(struct vringh *vrh, void *dst, const void *src,
-		     bool (*rcheck)(struct vringh *vrh, u64 addr, size_t *len,
-				    struct vringh_range *range,
-				    bool (*getrange)(struct vringh *vrh,
-						     u64,
-						     struct vringh_range *)),
-		     bool (*getrange)(struct vringh *vrh,
-				      u64 addr,
-				      struct vringh_range *r),
-		     struct vringh_range *range,
-		     int (*copy)(const struct vringh *vrh,
-				 void *dst, const void *src, size_t len))
+		     struct vringh_range *range)
 {
 	size_t part, len = sizeof(struct vring_desc);
 
@@ -265,10 +248,10 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src,
 		part = len;
 		addr = (u64)(unsigned long)src - range->offset;
 
-		if (!rcheck(vrh, addr, &part, range, getrange))
+		if (!vrh->ops.range_check(vrh, addr, &part, range))
 			return -EINVAL;
 
-		err = copy(vrh, dst, src, part);
+		err = vrh->ops.copydesc(vrh, dst, src, part);
 		if (err)
 			return err;
 
@@ -279,18 +262,35 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src,
 	return 0;
 }
 
+static int __vringh_init(struct vringh *vrh, u64 features, unsigned int num,
+			 bool weak_barriers, gfp_t gfp, struct vring_desc *desc,
+			 struct vring_avail *avail, struct vring_used *used)
+{
+	/* Sane power of 2 please! */
+	if (!num || num > 0xffff || (num & (num - 1))) {
+		vringh_bad("Bad ring size %u", num);
+		return -EINVAL;
+	}
+
+	vrh->little_endian = (features & (1ULL << VIRTIO_F_VERSION_1));
+	vrh->event_indices = (features & (1 << VIRTIO_RING_F_EVENT_IDX));
+	vrh->weak_barriers = weak_barriers;
+	vrh->completed = 0;
+	vrh->last_avail_idx = 0;
+	vrh->last_used_idx = 0;
+	vrh->vring.num = num;
+	vrh->vring.desc = desc;
+	vrh->vring.avail = avail;
+	vrh->vring.used = used;
+	vrh->desc_gfp = gfp;
+
+	return 0;
+}
+
 static inline int
 __vringh_iov(struct vringh *vrh, u16 i,
 	     struct vringh_kiov *riov,
-	     struct vringh_kiov *wiov,
-	     bool (*rcheck)(struct vringh *vrh, u64 addr, size_t *len,
-			    struct vringh_range *range,
-			    bool (*getrange)(struct vringh *, u64,
-					     struct vringh_range *)),
-	     bool (*getrange)(struct vringh *, u64, struct vringh_range *),
-	     gfp_t gfp,
-	     int (*copy)(const struct vringh *vrh,
-			 void *dst, const void *src, size_t len))
+	     struct vringh_kiov *wiov, gfp_t gfp)
 {
 	int err, count = 0, indirect_count = 0, up_next, desc_max;
 	struct vring_desc desc, *descs;
@@ -317,10 +317,9 @@ __vringh_iov(struct vringh *vrh, u16 i,
 		size_t len;
 
 		if (unlikely(slow))
-			err = slow_copy(vrh, &desc, &descs[i], rcheck, getrange,
-					&slowrange, copy);
+			err = slow_copy(vrh, &desc, &descs[i], &slowrange);
 		else
-			err = copy(vrh, &desc, &descs[i], sizeof(desc));
+			err = vrh->ops.copydesc(vrh, &desc, &descs[i], sizeof(desc));
 		if (unlikely(err))
 			goto fail;
 
@@ -330,7 +329,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
 
 			/* Make sure it's OK, and get offset. */
 			len = vringh32_to_cpu(vrh, desc.len);
-			if (!rcheck(vrh, a, &len, &range, getrange)) {
+			if (!vrh->ops.range_check(vrh, a, &len, &range)) {
 				err = -EINVAL;
 				goto fail;
 			}
@@ -382,8 +381,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
 	again:
 		/* Make sure it's OK, and get offset. */
 		len = vringh32_to_cpu(vrh, desc.len);
-		if (!rcheck(vrh, vringh64_to_cpu(vrh, desc.addr), &len, &range,
-			    getrange)) {
+		if (!vrh->ops.range_check(vrh, vringh64_to_cpu(vrh, desc.addr), &len, &range)) {
 			err = -EINVAL;
 			goto fail;
 		}
@@ -436,13 +434,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
 
 static inline int __vringh_complete(struct vringh *vrh,
 				    const struct vring_used_elem *used,
-				    unsigned int num_used,
-				    int (*putu16)(const struct vringh *vrh,
-						  __virtio16 *p, u16 val),
-				    int (*putused)(const struct vringh *vrh,
-						   struct vring_used_elem *dst,
-						   const struct vring_used_elem
-						   *src, unsigned num))
+				    unsigned int num_used)
 {
 	struct vring_used *used_ring;
 	int err;
@@ -456,12 +448,12 @@ static inline int __vringh_complete(struct vringh *vrh,
 	/* Compiler knows num_used == 1 sometimes, hence extra check */
 	if (num_used > 1 && unlikely(off + num_used >= vrh->vring.num)) {
 		u16 part = vrh->vring.num - off;
-		err = putused(vrh, &used_ring->ring[off], used, part);
+		err = vrh->ops.putused(vrh, &used_ring->ring[off], used, part);
 		if (!err)
-			err = putused(vrh, &used_ring->ring[0], used + part,
+			err = vrh->ops.putused(vrh, &used_ring->ring[0], used + part,
 				      num_used - part);
 	} else
-		err = putused(vrh, &used_ring->ring[off], used, num_used);
+		err = vrh->ops.putused(vrh, &used_ring->ring[off], used, num_used);
 
 	if (err) {
 		vringh_bad("Failed to write %u used entries %u at %p",
@@ -472,7 +464,7 @@ static inline int __vringh_complete(struct vringh *vrh,
 	/* Make sure buffer is written before we update index. */
 	virtio_wmb(vrh->weak_barriers);
 
-	err = putu16(vrh, &vrh->vring.used->idx, used_idx + num_used);
+	err = vrh->ops.putu16(vrh, &vrh->vring.used->idx, used_idx + num_used);
 	if (err) {
 		vringh_bad("Failed to update used index at %p",
 			   &vrh->vring.used->idx);
@@ -483,11 +475,13 @@ static inline int __vringh_complete(struct vringh *vrh,
 	return 0;
 }
 
-
-static inline int __vringh_need_notify(struct vringh *vrh,
-				       int (*getu16)(const struct vringh *vrh,
-						     u16 *val,
-						     const __virtio16 *p))
+/**
+ * vringh_need_notify - must we tell the other side about used buffers?
+ * @vrh: the vring we've called vringh_complete() on.
+ *
+ * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
+ */
+int vringh_need_notify(struct vringh *vrh)
 {
 	bool notify;
 	u16 used_event;
@@ -501,7 +495,7 @@ static inline int __vringh_need_notify(struct vringh *vrh,
 	/* Old-style, without event indices. */
 	if (!vrh->event_indices) {
 		u16 flags;
-		err = getu16(vrh, &flags, &vrh->vring.avail->flags);
+		err = vrh->ops.getu16(vrh, &flags, &vrh->vring.avail->flags);
 		if (err) {
 			vringh_bad("Failed to get flags at %p",
 				   &vrh->vring.avail->flags);
@@ -511,7 +505,7 @@ static inline int __vringh_need_notify(struct vringh *vrh,
 	}
 
 	/* Modern: we know when other side wants to know. */
-	err = getu16(vrh, &used_event, &vring_used_event(&vrh->vring));
+	err = vrh->ops.getu16(vrh, &used_event, &vring_used_event(&vrh->vring));
 	if (err) {
 		vringh_bad("Failed to get used event idx at %p",
 			   &vring_used_event(&vrh->vring));
@@ -530,24 +524,28 @@ static inline int __vringh_need_notify(struct vringh *vrh,
 	vrh->completed = 0;
 	return notify;
 }
+EXPORT_SYMBOL(vringh_need_notify);
 
-static inline bool __vringh_notify_enable(struct vringh *vrh,
-					  int (*getu16)(const struct vringh *vrh,
-							u16 *val, const __virtio16 *p),
-					  int (*putu16)(const struct vringh *vrh,
-							__virtio16 *p, u16 val))
+/**
+ * vringh_notify_enable - we want to know if something changes.
+ * @vrh: the vring.
+ *
+ * This always enables notifications, but returns false if there are
+ * now more buffers available in the vring.
+ */
+bool vringh_notify_enable(struct vringh *vrh)
 {
 	u16 avail;
 
 	if (!vrh->event_indices) {
 		/* Old-school; update flags. */
-		if (putu16(vrh, &vrh->vring.used->flags, 0) != 0) {
+		if (vrh->ops.putu16(vrh, &vrh->vring.used->flags, 0) != 0) {
 			vringh_bad("Clearing used flags %p",
 				   &vrh->vring.used->flags);
 			return true;
 		}
 	} else {
-		if (putu16(vrh, &vring_avail_event(&vrh->vring),
+		if (vrh->ops.putu16(vrh, &vring_avail_event(&vrh->vring),
 			   vrh->last_avail_idx) != 0) {
 			vringh_bad("Updating avail event index %p",
 				   &vring_avail_event(&vrh->vring));
@@ -559,7 +557,7 @@ static inline bool __vringh_notify_enable(struct vringh *vrh,
 	 * sure it's written, then check again. */
 	virtio_mb(vrh->weak_barriers);
 
-	if (getu16(vrh, &avail, &vrh->vring.avail->idx) != 0) {
+	if (vrh->ops.getu16(vrh, &avail, &vrh->vring.avail->idx) != 0) {
 		vringh_bad("Failed to check avail idx at %p",
 			   &vrh->vring.avail->idx);
 		return true;
@@ -570,20 +568,27 @@ static inline bool __vringh_notify_enable(struct vringh *vrh,
 	 * notification anyway). */
 	return avail == vrh->last_avail_idx;
 }
+EXPORT_SYMBOL(vringh_notify_enable);
 
-static inline void __vringh_notify_disable(struct vringh *vrh,
-					   int (*putu16)(const struct vringh *vrh,
-							 __virtio16 *p, u16 val))
+/**
+ * vringh_notify_disable - don't tell us if something changes.
+ * @vrh: the vring.
+ *
+ * This is our normal running state: we disable and then only enable when
+ * we're going to sleep.
+ */
+void vringh_notify_disable(struct vringh *vrh)
 {
 	if (!vrh->event_indices) {
 		/* Old-school; update flags. */
-		if (putu16(vrh, &vrh->vring.used->flags,
+		if (vrh->ops.putu16(vrh, &vrh->vring.used->flags,
 			   VRING_USED_F_NO_NOTIFY)) {
 			vringh_bad("Setting used flags %p",
 				   &vrh->vring.used->flags);
 		}
 	}
 }
+EXPORT_SYMBOL(vringh_notify_disable);
 
 /* Userspace access helpers: in this case, addresses are really userspace. */
 static inline int getu16_user(const struct vringh *vrh, u16 *val, const __virtio16 *p)
@@ -630,6 +635,16 @@ static inline int xfer_to_user(const struct vringh *vrh,
 		-EFAULT : 0;
 }
 
+static const struct vringh_ops user_vringh_ops = {
+	.getu16 = getu16_user,
+	.putu16 = putu16_user,
+	.xfer_from = xfer_from_user,
+	.xfer_to = xfer_to_user,
+	.putused = putused_user,
+	.copydesc = copydesc_user,
+	.range_check = range_check,
+};
+
 /**
  * vringh_init_user - initialize a vringh for a userspace vring.
  * @vrh: the vringh to initialize.
@@ -639,6 +654,7 @@ static inline int xfer_to_user(const struct vringh *vrh,
  * @desc: the userpace descriptor pointer.
  * @avail: the userpace avail pointer.
  * @used: the userpace used pointer.
+ * @getrange: a function that returns a range that the vring can access.
  *
  * Returns an error if num is invalid: you should check pointers
  * yourself!
@@ -647,36 +663,32 @@ int vringh_init_user(struct vringh *vrh, u64 features,
 		     unsigned int num, bool weak_barriers,
 		     vring_desc_t __user *desc,
 		     vring_avail_t __user *avail,
-		     vring_used_t __user *used)
+		     vring_used_t __user *used,
+		     bool (*getrange)(struct vringh *vrh, u64 addr, struct vringh_range *r))
 {
-	/* Sane power of 2 please! */
-	if (!num || num > 0xffff || (num & (num - 1))) {
-		vringh_bad("Bad ring size %u", num);
-		return -EINVAL;
-	}
+	int err;
+
+	err = __vringh_init(vrh, features, num, weak_barriers, GFP_KERNEL,
+			(__force struct vring_desc *)desc,
+			(__force struct vring_avail *)avail,
+			(__force struct vring_used *)used);
+	if (err)
+		return err;
+
+	memcpy(&vrh->ops, &user_vringh_ops, sizeof(user_vringh_ops));
+	vrh->ops.getrange = getrange;
 
-	vrh->little_endian = (features & (1ULL << VIRTIO_F_VERSION_1));
-	vrh->event_indices = (features & (1 << VIRTIO_RING_F_EVENT_IDX));
-	vrh->weak_barriers = weak_barriers;
-	vrh->completed = 0;
-	vrh->last_avail_idx = 0;
-	vrh->last_used_idx = 0;
-	vrh->vring.num = num;
-	/* vring expects kernel addresses, but only used via accessors. */
-	vrh->vring.desc = (__force struct vring_desc *)desc;
-	vrh->vring.avail = (__force struct vring_avail *)avail;
-	vrh->vring.used = (__force struct vring_used *)used;
 	return 0;
 }
 EXPORT_SYMBOL(vringh_init_user);
 
 /**
- * vringh_getdesc_user - get next available descriptor from userspace ring.
- * @vrh: the userspace vring.
+ * vringh_getdesc - get next available descriptor from ring.
+ * @vrh: the vring to get a descriptor from.
  * @riov: where to put the readable descriptors (or NULL)
  * @wiov: where to put the writable descriptors (or NULL)
- * @getrange: function to call to check ranges.
- * @head: head index we received, for passing to vringh_complete_user().
+ * @head: head index we received, for passing to vringh_complete().
  *
  * Returns 0 if there was no descriptor, 1 if there was, or -errno.
  *
@@ -690,17 +702,15 @@ EXPORT_SYMBOL(vringh_init_user);
  * When you don't have to use riov and wiov anymore, you should clean up them
  * calling vringh_iov_cleanup() to release the memory, even on error!
  */
-int vringh_getdesc_user(struct vringh *vrh,
+int vringh_getdesc(struct vringh *vrh,
 			struct vringh_kiov *riov,
 			struct vringh_kiov *wiov,
-			bool (*getrange)(struct vringh *vrh,
-					 u64 addr, struct vringh_range *r),
 			u16 *head)
 {
 	int err;
 
 	*head = vrh->vring.num;
-	err = __vringh_get_head(vrh, getu16_user, &vrh->last_avail_idx);
+	err = __vringh_get_head(vrh, &vrh->last_avail_idx);
 	if (err < 0)
 		return err;
 
@@ -709,137 +719,100 @@ int vringh_getdesc_user(struct vringh *vrh,
 		return 0;
 
 	*head = err;
-	err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
-			   (struct vringh_kiov *)wiov,
-			   range_check, getrange, GFP_KERNEL, copydesc_user);
+	err = __vringh_iov(vrh, *head, riov, wiov, GFP_KERNEL);
 	if (err)
 		return err;
 
 	return 1;
 }
-EXPORT_SYMBOL(vringh_getdesc_user);
+EXPORT_SYMBOL(vringh_getdesc);
 
 /**
- * vringh_iov_pull_user - copy bytes from vring_kiov.
- * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
+ * vringh_iov_pull - copy bytes from vring_kiov.
+ * @vrh: the vring to read data from.
+ * @riov: the riov as passed to vringh_getdesc() (updated as we consume)
  * @dst: the place to copy.
  * @len: the maximum length to copy.
  *
  * Returns the bytes copied <= len or a negative errno.
  */
-ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
+ssize_t vringh_iov_pull(struct vringh *vrh, struct vringh_kiov *riov, void *dst, size_t len)
 {
 	return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
-			       dst, len, xfer_from_user);
+			       dst, len, vrh->ops.xfer_from);
 }
-EXPORT_SYMBOL(vringh_iov_pull_user);
+EXPORT_SYMBOL(vringh_iov_pull);
 
 /**
- * vringh_iov_push_user - copy bytes into vring_kiov.
- * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
+ * vringh_iov_push - copy bytes into vring_kiov.
+ * @vrh: the vring to write data to.
+ * @wiov: the wiov as passed to vringh_getdesc() (updated as we consume)
  * @src: the place to copy from.
  * @len: the maximum length to copy.
  *
  * Returns the bytes copied <= len or a negative errno.
  */
-ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
+ssize_t vringh_iov_push(struct vringh *vrh, struct vringh_kiov *wiov,
 			     const void *src, size_t len)
 {
 	return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
-			       (void *)src, len, xfer_to_user);
+			       (void *)src, len, vrh->ops.xfer_to);
 }
-EXPORT_SYMBOL(vringh_iov_push_user);
+EXPORT_SYMBOL(vringh_iov_push);
 
 /**
- * vringh_abandon_user - we've decided not to handle the descriptor(s).
+ * vringh_abandon - we've decided not to handle the descriptor(s).
  * @vrh: the vring.
  * @num: the number of descriptors to put back (ie. num
  *	 vringh_get_user() to undo).
  *
  * The next vringh_get_user() will return the old descriptor(s) again.
  */
-void vringh_abandon_user(struct vringh *vrh, unsigned int num)
+void vringh_abandon(struct vringh *vrh, unsigned int num)
 {
 	/* We only update vring_avail_event(vr) when we want to be notified,
 	 * so we haven't changed that yet. */
 	vrh->last_avail_idx -= num;
 }
-EXPORT_SYMBOL(vringh_abandon_user);
+EXPORT_SYMBOL(vringh_abandon);
 
 /**
- * vringh_complete_user - we've finished with descriptor, publish it.
+ * vringh_complete - we've finished with descriptor, publish it.
  * @vrh: the vring.
- * @head: the head as filled in by vringh_getdesc_user.
+ * @head: the head as filled in by vringh_getdesc.
  * @len: the length of data we have written.
  *
- * You should check vringh_need_notify_user() after one or more calls
+ * You should check vringh_need_notify() after one or more calls
  * to this function.
  */
-int vringh_complete_user(struct vringh *vrh, u16 head, u32 len)
+int vringh_complete(struct vringh *vrh, u16 head, u32 len)
 {
 	struct vring_used_elem used;
 
 	used.id = cpu_to_vringh32(vrh, head);
 	used.len = cpu_to_vringh32(vrh, len);
-	return __vringh_complete(vrh, &used, 1, putu16_user, putused_user);
+	return __vringh_complete(vrh, &used, 1);
 }
-EXPORT_SYMBOL(vringh_complete_user);
+EXPORT_SYMBOL(vringh_complete);
 
 /**
- * vringh_complete_multi_user - we've finished with many descriptors.
+ * vringh_complete_multi - we've finished with many descriptors.
  * @vrh: the vring.
  * @used: the head, length pairs.
  * @num_used: the number of used elements.
  *
- * You should check vringh_need_notify_user() after one or more calls
+ * You should check vringh_need_notify() after one or more calls
  * to this function.
  */
-int vringh_complete_multi_user(struct vringh *vrh,
+int vringh_complete_multi(struct vringh *vrh,
 			       const struct vring_used_elem used[],
 			       unsigned num_used)
 {
-	return __vringh_complete(vrh, used, num_used,
-				 putu16_user, putused_user);
-}
-EXPORT_SYMBOL(vringh_complete_multi_user);
-
-/**
- * vringh_notify_enable_user - we want to know if something changes.
- * @vrh: the vring.
- *
- * This always enables notifications, but returns false if there are
- * now more buffers available in the vring.
- */
-bool vringh_notify_enable_user(struct vringh *vrh)
-{
-	return __vringh_notify_enable(vrh, getu16_user, putu16_user);
+	return __vringh_complete(vrh, used, num_used);
 }
-EXPORT_SYMBOL(vringh_notify_enable_user);
+EXPORT_SYMBOL(vringh_complete_multi);
 
-/**
- * vringh_notify_disable_user - don't tell us if something changes.
- * @vrh: the vring.
- *
- * This is our normal running state: we disable and then only enable when
- * we're going to sleep.
- */
-void vringh_notify_disable_user(struct vringh *vrh)
-{
-	__vringh_notify_disable(vrh, putu16_user);
-}
-EXPORT_SYMBOL(vringh_notify_disable_user);
 
-/**
- * vringh_need_notify_user - must we tell the other side about used buffers?
- * @vrh: the vring we've called vringh_complete_user() on.
- *
- * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
- */
-int vringh_need_notify_user(struct vringh *vrh)
-{
-	return __vringh_need_notify(vrh, getu16_user);
-}
-EXPORT_SYMBOL(vringh_need_notify_user);
 
 /* Kernelspace access helpers. */
 static inline int getu16_kern(const struct vringh *vrh,
@@ -885,6 +858,17 @@ static inline int kern_xfer(const struct vringh *vrh, void *dst,
 	return 0;
 }
 
+static const struct vringh_ops kern_vringh_ops = {
+	.getu16 = getu16_kern,
+	.putu16 = putu16_kern,
+	.xfer_from = xfer_kern,
+	.xfer_to = xfer_kern,
+	.putused = putused_kern,
+	.copydesc = copydesc_kern,
+	.range_check = no_range_check,
+	.getrange = NULL,
+};
+
 /**
  * vringh_init_kern - initialize a vringh for a kernelspace vring.
  * @vrh: the vringh to initialize.
@@ -898,179 +882,22 @@ static inline int kern_xfer(const struct vringh *vrh, void *dst,
  * Returns an error if num is invalid.
  */
 int vringh_init_kern(struct vringh *vrh, u64 features,
-		     unsigned int num, bool weak_barriers,
+		     unsigned int num, bool weak_barriers, gfp_t gfp,
 		     struct vring_desc *desc,
 		     struct vring_avail *avail,
 		     struct vring_used *used)
-{
-	/* Sane power of 2 please! */
-	if (!num || num > 0xffff || (num & (num - 1))) {
-		vringh_bad("Bad ring size %u", num);
-		return -EINVAL;
-	}
-
-	vrh->little_endian = (features & (1ULL << VIRTIO_F_VERSION_1));
-	vrh->event_indices = (features & (1 << VIRTIO_RING_F_EVENT_IDX));
-	vrh->weak_barriers = weak_barriers;
-	vrh->completed = 0;
-	vrh->last_avail_idx = 0;
-	vrh->last_used_idx = 0;
-	vrh->vring.num = num;
-	vrh->vring.desc = desc;
-	vrh->vring.avail = avail;
-	vrh->vring.used = used;
-	return 0;
-}
-EXPORT_SYMBOL(vringh_init_kern);
-
-/**
- * vringh_getdesc_kern - get next available descriptor from kernelspace ring.
- * @vrh: the kernelspace vring.
- * @riov: where to put the readable descriptors (or NULL)
- * @wiov: where to put the writable descriptors (or NULL)
- * @head: head index we received, for passing to vringh_complete_kern().
- * @gfp: flags for allocating larger riov/wiov.
- *
- * Returns 0 if there was no descriptor, 1 if there was, or -errno.
- *
- * Note that on error return, you can tell the difference between an
- * invalid ring and a single invalid descriptor: in the former case,
- * *head will be vrh->vring.num.  You may be able to ignore an invalid
- * descriptor, but there's not much you can do with an invalid ring.
- *
- * Note that you can reuse riov and wiov with subsequent calls. Content is
- * overwritten and memory reallocated if more space is needed.
- * When you don't have to use riov and wiov anymore, you should clean up them
- * calling vringh_kiov_cleanup() to release the memory, even on error!
- */
-int vringh_getdesc_kern(struct vringh *vrh,
-			struct vringh_kiov *riov,
-			struct vringh_kiov *wiov,
-			u16 *head,
-			gfp_t gfp)
 {
 	int err;
 
-	err = __vringh_get_head(vrh, getu16_kern, &vrh->last_avail_idx);
-	if (err < 0)
-		return err;
-
-	/* Empty... */
-	if (err == vrh->vring.num)
-		return 0;
-
-	*head = err;
-	err = __vringh_iov(vrh, *head, riov, wiov, no_range_check, NULL,
-			   gfp, copydesc_kern);
+	err = __vringh_init(vrh, features, num, weak_barriers, gfp, desc, avail, used);
 	if (err)
 		return err;
 
-	return 1;
-}
-EXPORT_SYMBOL(vringh_getdesc_kern);
-
-/**
- * vringh_iov_pull_kern - copy bytes from vring_iov.
- * @riov: the riov as passed to vringh_getdesc_kern() (updated as we consume)
- * @dst: the place to copy.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_pull_kern(struct vringh_kiov *riov, void *dst, size_t len)
-{
-	return vringh_iov_xfer(NULL, riov, dst, len, xfer_kern);
-}
-EXPORT_SYMBOL(vringh_iov_pull_kern);
-
-/**
- * vringh_iov_push_kern - copy bytes into vring_iov.
- * @wiov: the wiov as passed to vringh_getdesc_kern() (updated as we consume)
- * @src: the place to copy from.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_push_kern(struct vringh_kiov *wiov,
-			     const void *src, size_t len)
-{
-	return vringh_iov_xfer(NULL, wiov, (void *)src, len, kern_xfer);
-}
-EXPORT_SYMBOL(vringh_iov_push_kern);
+	memcpy(&vrh->ops, &kern_vringh_ops, sizeof(kern_vringh_ops));
 
-/**
- * vringh_abandon_kern - we've decided not to handle the descriptor(s).
- * @vrh: the vring.
- * @num: the number of descriptors to put back (ie. num
- *	 vringh_get_kern() to undo).
- *
- * The next vringh_get_kern() will return the old descriptor(s) again.
- */
-void vringh_abandon_kern(struct vringh *vrh, unsigned int num)
-{
-	/* We only update vring_avail_event(vr) when we want to be notified,
-	 * so we haven't changed that yet. */
-	vrh->last_avail_idx -= num;
-}
-EXPORT_SYMBOL(vringh_abandon_kern);
-
-/**
- * vringh_complete_kern - we've finished with descriptor, publish it.
- * @vrh: the vring.
- * @head: the head as filled in by vringh_getdesc_kern.
- * @len: the length of data we have written.
- *
- * You should check vringh_need_notify_kern() after one or more calls
- * to this function.
- */
-int vringh_complete_kern(struct vringh *vrh, u16 head, u32 len)
-{
-	struct vring_used_elem used;
-
-	used.id = cpu_to_vringh32(vrh, head);
-	used.len = cpu_to_vringh32(vrh, len);
-
-	return __vringh_complete(vrh, &used, 1, putu16_kern, putused_kern);
-}
-EXPORT_SYMBOL(vringh_complete_kern);
-
-/**
- * vringh_notify_enable_kern - we want to know if something changes.
- * @vrh: the vring.
- *
- * This always enables notifications, but returns false if there are
- * now more buffers available in the vring.
- */
-bool vringh_notify_enable_kern(struct vringh *vrh)
-{
-	return __vringh_notify_enable(vrh, getu16_kern, putu16_kern);
-}
-EXPORT_SYMBOL(vringh_notify_enable_kern);
-
-/**
- * vringh_notify_disable_kern - don't tell us if something changes.
- * @vrh: the vring.
- *
- * This is our normal running state: we disable and then only enable when
- * we're going to sleep.
- */
-void vringh_notify_disable_kern(struct vringh *vrh)
-{
-	__vringh_notify_disable(vrh, putu16_kern);
-}
-EXPORT_SYMBOL(vringh_notify_disable_kern);
-
-/**
- * vringh_need_notify_kern - must we tell the other side about used buffers?
- * @vrh: the vring we've called vringh_complete_kern() on.
- *
- * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
- */
-int vringh_need_notify_kern(struct vringh *vrh)
-{
-	return __vringh_need_notify(vrh, getu16_kern);
+	return 0;
 }
-EXPORT_SYMBOL(vringh_need_notify_kern);
+EXPORT_SYMBOL(vringh_init_kern);
 
 #if IS_REACHABLE(CONFIG_VHOST_IOTLB)
 
@@ -1122,7 +949,7 @@ static int iotlb_translate(const struct vringh *vrh,
 	return ret;
 }
 
-static inline int copy_from_iotlb(const struct vringh *vrh, void *dst,
+static int copy_from_iotlb(const struct vringh *vrh, void *dst,
 				  void *src, size_t len)
 {
 	u64 total_translated = 0;
@@ -1155,7 +982,7 @@ static inline int copy_from_iotlb(const struct vringh *vrh, void *dst,
 	return total_translated;
 }
 
-static inline int copy_to_iotlb(const struct vringh *vrh, void *dst,
+static int copy_to_iotlb(const struct vringh *vrh, void *dst,
 				void *src, size_t len)
 {
 	u64 total_translated = 0;
@@ -1188,7 +1015,7 @@ static inline int copy_to_iotlb(const struct vringh *vrh, void *dst,
 	return total_translated;
 }
 
-static inline int getu16_iotlb(const struct vringh *vrh,
+static int getu16_iotlb(const struct vringh *vrh,
 			       u16 *val, const __virtio16 *p)
 {
 	struct bio_vec iov;
@@ -1209,7 +1036,7 @@ static inline int getu16_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int putu16_iotlb(const struct vringh *vrh,
+static int putu16_iotlb(const struct vringh *vrh,
 			       __virtio16 *p, u16 val)
 {
 	struct bio_vec iov;
@@ -1230,7 +1057,7 @@ static inline int putu16_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int copydesc_iotlb(const struct vringh *vrh,
+static int copydesc_iotlb(const struct vringh *vrh,
 				 void *dst, const void *src, size_t len)
 {
 	int ret;
@@ -1242,7 +1069,7 @@ static inline int copydesc_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int xfer_from_iotlb(const struct vringh *vrh, void *src,
+static int xfer_from_iotlb(const struct vringh *vrh, void *src,
 				  void *dst, size_t len)
 {
 	int ret;
@@ -1254,7 +1081,7 @@ static inline int xfer_from_iotlb(const struct vringh *vrh, void *src,
 	return 0;
 }
 
-static inline int xfer_to_iotlb(const struct vringh *vrh,
+static int xfer_to_iotlb(const struct vringh *vrh,
 			       void *dst, void *src, size_t len)
 {
 	int ret;
@@ -1266,7 +1093,7 @@ static inline int xfer_to_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int putused_iotlb(const struct vringh *vrh,
+static int putused_iotlb(const struct vringh *vrh,
 				struct vring_used_elem *dst,
 				const struct vring_used_elem *src,
 				unsigned int num)
@@ -1281,6 +1108,17 @@ static inline int putused_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
+static const struct vringh_ops iotlb_vringh_ops = {
+	.getu16 = getu16_iotlb,
+	.putu16 = putu16_iotlb,
+	.xfer_from = xfer_from_iotlb,
+	.xfer_to = xfer_to_iotlb,
+	.putused = putused_iotlb,
+	.copydesc = copydesc_iotlb,
+	.range_check = no_range_check,
+	.getrange = NULL,
+};
+
 /**
  * vringh_init_iotlb - initialize a vringh for a ring with IOTLB.
  * @vrh: the vringh to initialize.
@@ -1294,13 +1132,20 @@ static inline int putused_iotlb(const struct vringh *vrh,
  * Returns an error if num is invalid.
  */
 int vringh_init_iotlb(struct vringh *vrh, u64 features,
-		      unsigned int num, bool weak_barriers,
+		      unsigned int num, bool weak_barriers, gfp_t gfp,
 		      struct vring_desc *desc,
 		      struct vring_avail *avail,
 		      struct vring_used *used)
 {
-	return vringh_init_kern(vrh, features, num, weak_barriers,
-				desc, avail, used);
+	int err;
+
+	err = __vringh_init(vrh, features, num, weak_barriers, gfp, desc, avail, used);
+	if (err)
+		return err;
+
+	memcpy(&vrh->ops, &iotlb_vringh_ops, sizeof(iotlb_vringh_ops));
+
+	return 0;
 }
 EXPORT_SYMBOL(vringh_init_iotlb);
 
@@ -1318,162 +1163,6 @@ void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
 }
 EXPORT_SYMBOL(vringh_set_iotlb);
 
-/**
- * vringh_getdesc_iotlb - get next available descriptor from ring with
- * IOTLB.
- * @vrh: the kernelspace vring.
- * @riov: where to put the readable descriptors (or NULL)
- * @wiov: where to put the writable descriptors (or NULL)
- * @head: head index we received, for passing to vringh_complete_iotlb().
- * @gfp: flags for allocating larger riov/wiov.
- *
- * Returns 0 if there was no descriptor, 1 if there was, or -errno.
- *
- * Note that on error return, you can tell the difference between an
- * invalid ring and a single invalid descriptor: in the former case,
- * *head will be vrh->vring.num.  You may be able to ignore an invalid
- * descriptor, but there's not much you can do with an invalid ring.
- *
- * Note that you can reuse riov and wiov with subsequent calls. Content is
- * overwritten and memory reallocated if more space is needed.
- * When you don't have to use riov and wiov anymore, you should clean up them
- * calling vringh_kiov_cleanup() to release the memory, even on error!
- */
-int vringh_getdesc_iotlb(struct vringh *vrh,
-			 struct vringh_kiov *riov,
-			 struct vringh_kiov *wiov,
-			 u16 *head,
-			 gfp_t gfp)
-{
-	int err;
-
-	err = __vringh_get_head(vrh, getu16_iotlb, &vrh->last_avail_idx);
-	if (err < 0)
-		return err;
-
-	/* Empty... */
-	if (err == vrh->vring.num)
-		return 0;
-
-	*head = err;
-	err = __vringh_iov(vrh, *head, riov, wiov, no_range_check, NULL,
-			   gfp, copydesc_iotlb);
-	if (err)
-		return err;
-
-	return 1;
-}
-EXPORT_SYMBOL(vringh_getdesc_iotlb);
-
-/**
- * vringh_iov_pull_iotlb - copy bytes from vring_iov.
- * @vrh: the vring.
- * @riov: the riov as passed to vringh_getdesc_iotlb() (updated as we consume)
- * @dst: the place to copy.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_pull_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *riov,
-			      void *dst, size_t len)
-{
-	return vringh_iov_xfer(vrh, riov, dst, len, xfer_from_iotlb);
-}
-EXPORT_SYMBOL(vringh_iov_pull_iotlb);
-
-/**
- * vringh_iov_push_iotlb - copy bytes into vring_iov.
- * @vrh: the vring.
- * @wiov: the wiov as passed to vringh_getdesc_iotlb() (updated as we consume)
- * @src: the place to copy from.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_push_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *wiov,
-			      const void *src, size_t len)
-{
-	return vringh_iov_xfer(vrh, wiov, (void *)src, len, xfer_to_iotlb);
-}
-EXPORT_SYMBOL(vringh_iov_push_iotlb);
-
-/**
- * vringh_abandon_iotlb - we've decided not to handle the descriptor(s).
- * @vrh: the vring.
- * @num: the number of descriptors to put back (ie. num
- *	 vringh_get_iotlb() to undo).
- *
- * The next vringh_get_iotlb() will return the old descriptor(s) again.
- */
-void vringh_abandon_iotlb(struct vringh *vrh, unsigned int num)
-{
-	/* We only update vring_avail_event(vr) when we want to be notified,
-	 * so we haven't changed that yet.
-	 */
-	vrh->last_avail_idx -= num;
-}
-EXPORT_SYMBOL(vringh_abandon_iotlb);
-
-/**
- * vringh_complete_iotlb - we've finished with descriptor, publish it.
- * @vrh: the vring.
- * @head: the head as filled in by vringh_getdesc_iotlb.
- * @len: the length of data we have written.
- *
- * You should check vringh_need_notify_iotlb() after one or more calls
- * to this function.
- */
-int vringh_complete_iotlb(struct vringh *vrh, u16 head, u32 len)
-{
-	struct vring_used_elem used;
-
-	used.id = cpu_to_vringh32(vrh, head);
-	used.len = cpu_to_vringh32(vrh, len);
-
-	return __vringh_complete(vrh, &used, 1, putu16_iotlb, putused_iotlb);
-}
-EXPORT_SYMBOL(vringh_complete_iotlb);
-
-/**
- * vringh_notify_enable_iotlb - we want to know if something changes.
- * @vrh: the vring.
- *
- * This always enables notifications, but returns false if there are
- * now more buffers available in the vring.
- */
-bool vringh_notify_enable_iotlb(struct vringh *vrh)
-{
-	return __vringh_notify_enable(vrh, getu16_iotlb, putu16_iotlb);
-}
-EXPORT_SYMBOL(vringh_notify_enable_iotlb);
-
-/**
- * vringh_notify_disable_iotlb - don't tell us if something changes.
- * @vrh: the vring.
- *
- * This is our normal running state: we disable and then only enable when
- * we're going to sleep.
- */
-void vringh_notify_disable_iotlb(struct vringh *vrh)
-{
-	__vringh_notify_disable(vrh, putu16_iotlb);
-}
-EXPORT_SYMBOL(vringh_notify_disable_iotlb);
-
-/**
- * vringh_need_notify_iotlb - must we tell the other side about used buffers?
- * @vrh: the vring we've called vringh_complete_iotlb() on.
- *
- * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
- */
-int vringh_need_notify_iotlb(struct vringh *vrh)
-{
-	return __vringh_need_notify(vrh, getu16_iotlb);
-}
-EXPORT_SYMBOL(vringh_need_notify_iotlb);
-
 #endif
 
 MODULE_LICENSE("GPL");
diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 733d948e8123..89c73605c85f 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -21,6 +21,36 @@
 #endif
 #include <asm/barrier.h>
 
+struct vringh;
+struct vringh_range;
+
+/**
+ * struct vringh_ops - ops for accessing a vring and checking its accessible range.
+ * @getu16: read a u16 value from a pointer
+ * @putu16: write a u16 value to a pointer
+ * @xfer_from: copy a memory range from the specified address to a local virtual address
+ * @xfer_to: copy a memory range from a local virtual address to the specified address
+ * @putused: update a vring used descriptor
+ * @copydesc: copy a descriptor from the target to a local virtual address
+ * @range_check: check whether a region is accessible
+ * @getrange: return a range that the vring can access
+ */
+struct vringh_ops {
+	int (*getu16)(const struct vringh *vrh, u16 *val, const __virtio16 *p);
+	int (*putu16)(const struct vringh *vrh, __virtio16 *p, u16 val);
+	int (*xfer_from)(const struct vringh *vrh, void *src, void *dst,
+			 size_t len);
+	int (*xfer_to)(const struct vringh *vrh, void *dst, void *src,
+		       size_t len);
+	int (*putused)(const struct vringh *vrh, struct vring_used_elem *dst,
+		       const struct vring_used_elem *src, unsigned int num);
+	int (*copydesc)(const struct vringh *vrh, void *dst, const void *src,
+			size_t len);
+	bool (*range_check)(struct vringh *vrh, u64 addr, size_t *len,
+			    struct vringh_range *range);
+	bool (*getrange)(struct vringh *vrh, u64 addr, struct vringh_range *r);
+};
+
 /* virtio_ring with information needed for host access. */
 struct vringh {
 	/* Everything is little endian */
@@ -52,6 +82,10 @@ struct vringh {
 
 	/* The function to call to notify the guest about added buffers */
 	void (*notify)(struct vringh *);
+
+	struct vringh_ops ops;
+
+	gfp_t desc_gfp;
 };
 
 /**
@@ -99,41 +133,40 @@ int vringh_init_user(struct vringh *vrh, u64 features,
 		     unsigned int num, bool weak_barriers,
 		     vring_desc_t __user *desc,
 		     vring_avail_t __user *avail,
-		     vring_used_t __user *used);
+		     vring_used_t __user *used,
+			 bool (*getrange)(struct vringh *vrh, u64 addr, struct vringh_range *r));
 
 /* Convert a descriptor into iovecs. */
-int vringh_getdesc_user(struct vringh *vrh,
+int vringh_getdesc(struct vringh *vrh,
 			struct vringh_kiov *riov,
 			struct vringh_kiov *wiov,
-			bool (*getrange)(struct vringh *vrh,
-					 u64 addr, struct vringh_range *r),
 			u16 *head);
 
 /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
-ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
+ssize_t vringh_iov_pull(struct vringh *vrh, struct vringh_kiov *riov, void *dst, size_t len);
 
 /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
-ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
+ssize_t vringh_iov_push(struct vringh *vrh, struct vringh_kiov *wiov,
 			     const void *src, size_t len);
 
 /* Mark a descriptor as used. */
-int vringh_complete_user(struct vringh *vrh, u16 head, u32 len);
-int vringh_complete_multi_user(struct vringh *vrh,
+int vringh_complete(struct vringh *vrh, u16 head, u32 len);
+int vringh_complete_multi(struct vringh *vrh,
 			       const struct vring_used_elem used[],
 			       unsigned num_used);
 
 /* Pretend we've never seen descriptor (for easy error handling). */
-void vringh_abandon_user(struct vringh *vrh, unsigned int num);
+void vringh_abandon(struct vringh *vrh, unsigned int num);
 
 /* Do we need to fire the eventfd to notify the other side? */
-int vringh_need_notify_user(struct vringh *vrh);
+int vringh_need_notify(struct vringh *vrh);
 
-bool vringh_notify_enable_user(struct vringh *vrh);
-void vringh_notify_disable_user(struct vringh *vrh);
+bool vringh_notify_enable(struct vringh *vrh);
+void vringh_notify_disable(struct vringh *vrh);
 
 /* Helpers for kernelspace vrings. */
 int vringh_init_kern(struct vringh *vrh, u64 features,
-		     unsigned int num, bool weak_barriers,
+		     unsigned int num, bool weak_barriers, gfp_t gfp,
 		     struct vring_desc *desc,
 		     struct vring_avail *avail,
 		     struct vring_used *used);
@@ -176,23 +209,6 @@ static inline size_t vringh_kiov_length(struct vringh_kiov *kiov)
 
 void vringh_kiov_advance(struct vringh_kiov *kiov, size_t len);
 
-int vringh_getdesc_kern(struct vringh *vrh,
-			struct vringh_kiov *riov,
-			struct vringh_kiov *wiov,
-			u16 *head,
-			gfp_t gfp);
-
-ssize_t vringh_iov_pull_kern(struct vringh_kiov *riov, void *dst, size_t len);
-ssize_t vringh_iov_push_kern(struct vringh_kiov *wiov,
-			     const void *src, size_t len);
-void vringh_abandon_kern(struct vringh *vrh, unsigned int num);
-int vringh_complete_kern(struct vringh *vrh, u16 head, u32 len);
-
-bool vringh_notify_enable_kern(struct vringh *vrh);
-void vringh_notify_disable_kern(struct vringh *vrh);
-
-int vringh_need_notify_kern(struct vringh *vrh);
-
 /* Notify the guest about buffers added to the used ring */
 static inline void vringh_notify(struct vringh *vrh)
 {
@@ -242,33 +258,11 @@ void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
 		      spinlock_t *iotlb_lock);
 
 int vringh_init_iotlb(struct vringh *vrh, u64 features,
-		      unsigned int num, bool weak_barriers,
+		      unsigned int num, bool weak_barriers, gfp_t gfp,
 		      struct vring_desc *desc,
 		      struct vring_avail *avail,
 		      struct vring_used *used);
 
-int vringh_getdesc_iotlb(struct vringh *vrh,
-			 struct vringh_kiov *riov,
-			 struct vringh_kiov *wiov,
-			 u16 *head,
-			 gfp_t gfp);
-
-ssize_t vringh_iov_pull_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *riov,
-			      void *dst, size_t len);
-ssize_t vringh_iov_push_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *wiov,
-			      const void *src, size_t len);
-
-void vringh_abandon_iotlb(struct vringh *vrh, unsigned int num);
-
-int vringh_complete_iotlb(struct vringh *vrh, u16 head, u32 len);
-
-bool vringh_notify_enable_iotlb(struct vringh *vrh);
-void vringh_notify_disable_iotlb(struct vringh *vrh);
-
-int vringh_need_notify_iotlb(struct vringh *vrh);
-
 #endif /* CONFIG_VHOST_IOTLB */
 
 #endif /* _LINUX_VRINGH_H */
-- 
2.25.1



* [RFC PATCH 4/6] vringh: unify the APIs for all accessors
@ 2022-12-27  2:25   ` Shunsuke Mie
  0 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: netdev, Shunsuke Mie, linux-kernel, kvm, virtualization

Each vringh memory accessor, for user, kern and iotlb, has its own
interface that calls into common code. However, some of that code is
duplicated, which limits extensibility.

Introduce a struct vringh_ops and provide common APIs for all
accessors. This makes it easy to extend vringh with new memory
accessors and simplifies the caller code.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
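For reviewers, here is a rough before/after sketch of a kernel-space
caller under this series. This is hypothetical driver code, not taken
from any in-tree user (buf, desc, avail, used, etc. are assumed to be
set up by the driver):

```c
/* Before: accessor-specific API, gfp passed on each getdesc call. */
err = vringh_init_kern(&vrh, features, num, false, desc, avail, used);
err = vringh_getdesc_kern(&vrh, &riov, NULL, &head, GFP_KERNEL);
len = vringh_iov_pull_kern(&riov, buf, sizeof(buf));
vringh_complete_kern(&vrh, head, len);
if (vringh_need_notify_kern(&vrh) > 0)
	vringh_notify(&vrh);

/* After: unified API; gfp is given once at init time and the accessor
 * is selected by the ops installed by vringh_init_kern(). */
err = vringh_init_kern(&vrh, features, num, false, GFP_KERNEL,
		       desc, avail, used);
err = vringh_getdesc(&vrh, &riov, NULL, &head);
len = vringh_iov_pull(&vrh, &riov, buf, sizeof(buf));
vringh_complete(&vrh, head, len);
if (vringh_need_notify(&vrh) > 0)
	vringh_notify(&vrh);
```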
 drivers/vhost/vringh.c | 667 +++++++++++------------------------------
 include/linux/vringh.h | 100 +++---
 2 files changed, 225 insertions(+), 542 deletions(-)

diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index aa3cd27d2384..ebfd3644a1a3 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
 }
 
 /* Returns vring->num if empty, -ve on error. */
-static inline int __vringh_get_head(const struct vringh *vrh,
-				    int (*getu16)(const struct vringh *vrh,
-						  u16 *val, const __virtio16 *p),
-				    u16 *last_avail_idx)
+static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
 {
 	u16 avail_idx, i, head;
 	int err;
 
-	err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
+	err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
 	if (err) {
 		vringh_bad("Failed to access avail idx at %p",
 			   &vrh->vring.avail->idx);
@@ -58,7 +55,7 @@ static inline int __vringh_get_head(const struct vringh *vrh,
 
 	i = *last_avail_idx & (vrh->vring.num - 1);
 
-	err = getu16(vrh, &head, &vrh->vring.avail->ring[i]);
+	err = vrh->ops.getu16(vrh, &head, &vrh->vring.avail->ring[i]);
 	if (err) {
 		vringh_bad("Failed to read head: idx %d address %p",
 			   *last_avail_idx, &vrh->vring.avail->ring[i]);
@@ -131,12 +128,10 @@ static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
 
 /* May reduce *len if range is shorter. */
 static inline bool range_check(struct vringh *vrh, u64 addr, size_t *len,
-			       struct vringh_range *range,
-			       bool (*getrange)(struct vringh *,
-						u64, struct vringh_range *))
+			       struct vringh_range *range)
 {
 	if (addr < range->start || addr > range->end_incl) {
-		if (!getrange(vrh, addr, range))
+		if (!vrh->ops.getrange(vrh, addr, range))
 			return false;
 	}
 	BUG_ON(addr < range->start || addr > range->end_incl);
@@ -165,9 +160,7 @@ static inline bool range_check(struct vringh *vrh, u64 addr, size_t *len,
 }
 
 static inline bool no_range_check(struct vringh *vrh, u64 addr, size_t *len,
-				  struct vringh_range *range,
-				  bool (*getrange)(struct vringh *,
-						   u64, struct vringh_range *))
+				  struct vringh_range *range)
 {
 	return true;
 }
@@ -244,17 +237,7 @@ static u16 __cold return_from_indirect(const struct vringh *vrh, int *up_next,
 }
 
 static int slow_copy(struct vringh *vrh, void *dst, const void *src,
-		     bool (*rcheck)(struct vringh *vrh, u64 addr, size_t *len,
-				    struct vringh_range *range,
-				    bool (*getrange)(struct vringh *vrh,
-						     u64,
-						     struct vringh_range *)),
-		     bool (*getrange)(struct vringh *vrh,
-				      u64 addr,
-				      struct vringh_range *r),
-		     struct vringh_range *range,
-		     int (*copy)(const struct vringh *vrh,
-				 void *dst, const void *src, size_t len))
+		     struct vringh_range *range)
 {
 	size_t part, len = sizeof(struct vring_desc);
 
@@ -265,10 +248,10 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src,
 		part = len;
 		addr = (u64)(unsigned long)src - range->offset;
 
-		if (!rcheck(vrh, addr, &part, range, getrange))
+		if (!vrh->ops.range_check(vrh, addr, &part, range))
 			return -EINVAL;
 
-		err = copy(vrh, dst, src, part);
+		err = vrh->ops.copydesc(vrh, dst, src, part);
 		if (err)
 			return err;
 
@@ -279,18 +262,35 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src,
 	return 0;
 }
 
+static int __vringh_init(struct vringh *vrh, u64 features, unsigned int num,
+			 bool weak_barriers, gfp_t gfp, struct vring_desc *desc,
+			 struct vring_avail *avail, struct vring_used *used)
+{
+	/* Sane power of 2 please! */
+	if (!num || num > 0xffff || (num & (num - 1))) {
+		vringh_bad("Bad ring size %u", num);
+		return -EINVAL;
+	}
+
+	vrh->little_endian = (features & (1ULL << VIRTIO_F_VERSION_1));
+	vrh->event_indices = (features & (1 << VIRTIO_RING_F_EVENT_IDX));
+	vrh->weak_barriers = weak_barriers;
+	vrh->completed = 0;
+	vrh->last_avail_idx = 0;
+	vrh->last_used_idx = 0;
+	vrh->vring.num = num;
+	vrh->vring.desc = desc;
+	vrh->vring.avail = avail;
+	vrh->vring.used = used;
+	vrh->desc_gfp = gfp;
+
+	return 0;
+}
+
 static inline int
 __vringh_iov(struct vringh *vrh, u16 i,
 	     struct vringh_kiov *riov,
-	     struct vringh_kiov *wiov,
-	     bool (*rcheck)(struct vringh *vrh, u64 addr, size_t *len,
-			    struct vringh_range *range,
-			    bool (*getrange)(struct vringh *, u64,
-					     struct vringh_range *)),
-	     bool (*getrange)(struct vringh *, u64, struct vringh_range *),
-	     gfp_t gfp,
-	     int (*copy)(const struct vringh *vrh,
-			 void *dst, const void *src, size_t len))
+	     struct vringh_kiov *wiov, gfp_t gfp)
 {
 	int err, count = 0, indirect_count = 0, up_next, desc_max;
 	struct vring_desc desc, *descs;
@@ -317,10 +317,9 @@ __vringh_iov(struct vringh *vrh, u16 i,
 		size_t len;
 
 		if (unlikely(slow))
-			err = slow_copy(vrh, &desc, &descs[i], rcheck, getrange,
-					&slowrange, copy);
+			err = slow_copy(vrh, &desc, &descs[i], &slowrange);
 		else
-			err = copy(vrh, &desc, &descs[i], sizeof(desc));
+			err = vrh->ops.copydesc(vrh, &desc, &descs[i], sizeof(desc));
 		if (unlikely(err))
 			goto fail;
 
@@ -330,7 +329,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
 
 			/* Make sure it's OK, and get offset. */
 			len = vringh32_to_cpu(vrh, desc.len);
-			if (!rcheck(vrh, a, &len, &range, getrange)) {
+			if (!vrh->ops.range_check(vrh, a, &len, &range)) {
 				err = -EINVAL;
 				goto fail;
 			}
@@ -382,8 +381,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
 	again:
 		/* Make sure it's OK, and get offset. */
 		len = vringh32_to_cpu(vrh, desc.len);
-		if (!rcheck(vrh, vringh64_to_cpu(vrh, desc.addr), &len, &range,
-			    getrange)) {
+		if (!vrh->ops.range_check(vrh, vringh64_to_cpu(vrh, desc.addr), &len, &range)) {
 			err = -EINVAL;
 			goto fail;
 		}
@@ -436,13 +434,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
 
 static inline int __vringh_complete(struct vringh *vrh,
 				    const struct vring_used_elem *used,
-				    unsigned int num_used,
-				    int (*putu16)(const struct vringh *vrh,
-						  __virtio16 *p, u16 val),
-				    int (*putused)(const struct vringh *vrh,
-						   struct vring_used_elem *dst,
-						   const struct vring_used_elem
-						   *src, unsigned num))
+				    unsigned int num_used)
 {
 	struct vring_used *used_ring;
 	int err;
@@ -456,12 +448,12 @@ static inline int __vringh_complete(struct vringh *vrh,
 	/* Compiler knows num_used == 1 sometimes, hence extra check */
 	if (num_used > 1 && unlikely(off + num_used >= vrh->vring.num)) {
 		u16 part = vrh->vring.num - off;
-		err = putused(vrh, &used_ring->ring[off], used, part);
+		err = vrh->ops.putused(vrh, &used_ring->ring[off], used, part);
 		if (!err)
-			err = putused(vrh, &used_ring->ring[0], used + part,
+			err = vrh->ops.putused(vrh, &used_ring->ring[0], used + part,
 				      num_used - part);
 	} else
-		err = putused(vrh, &used_ring->ring[off], used, num_used);
+		err = vrh->ops.putused(vrh, &used_ring->ring[off], used, num_used);
 
 	if (err) {
 		vringh_bad("Failed to write %u used entries %u at %p",
@@ -472,7 +464,7 @@ static inline int __vringh_complete(struct vringh *vrh,
 	/* Make sure buffer is written before we update index. */
 	virtio_wmb(vrh->weak_barriers);
 
-	err = putu16(vrh, &vrh->vring.used->idx, used_idx + num_used);
+	err = vrh->ops.putu16(vrh, &vrh->vring.used->idx, used_idx + num_used);
 	if (err) {
 		vringh_bad("Failed to update used index at %p",
 			   &vrh->vring.used->idx);
@@ -483,11 +475,13 @@ static inline int __vringh_complete(struct vringh *vrh,
 	return 0;
 }
 
-
-static inline int __vringh_need_notify(struct vringh *vrh,
-				       int (*getu16)(const struct vringh *vrh,
-						     u16 *val,
-						     const __virtio16 *p))
+/**
+ * vringh_need_notify - must we tell the other side about used buffers?
+ * @vrh: the vring we've called vringh_complete() on.
+ *
+ * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
+ */
+int vringh_need_notify(struct vringh *vrh)
 {
 	bool notify;
 	u16 used_event;
@@ -501,7 +495,7 @@ static inline int __vringh_need_notify(struct vringh *vrh,
 	/* Old-style, without event indices. */
 	if (!vrh->event_indices) {
 		u16 flags;
-		err = getu16(vrh, &flags, &vrh->vring.avail->flags);
+		err = vrh->ops.getu16(vrh, &flags, &vrh->vring.avail->flags);
 		if (err) {
 			vringh_bad("Failed to get flags at %p",
 				   &vrh->vring.avail->flags);
@@ -511,7 +505,7 @@ static inline int __vringh_need_notify(struct vringh *vrh,
 	}
 
 	/* Modern: we know when other side wants to know. */
-	err = getu16(vrh, &used_event, &vring_used_event(&vrh->vring));
+	err = vrh->ops.getu16(vrh, &used_event, &vring_used_event(&vrh->vring));
 	if (err) {
 		vringh_bad("Failed to get used event idx at %p",
 			   &vring_used_event(&vrh->vring));
@@ -530,24 +524,28 @@ static inline int __vringh_need_notify(struct vringh *vrh,
 	vrh->completed = 0;
 	return notify;
 }
+EXPORT_SYMBOL(vringh_need_notify);
 
-static inline bool __vringh_notify_enable(struct vringh *vrh,
-					  int (*getu16)(const struct vringh *vrh,
-							u16 *val, const __virtio16 *p),
-					  int (*putu16)(const struct vringh *vrh,
-							__virtio16 *p, u16 val))
+/**
+ * vringh_notify_enable - we want to know if something changes.
+ * @vrh: the vring.
+ *
+ * This always enables notifications, but returns false if there are
+ * now more buffers available in the vring.
+ */
+bool vringh_notify_enable(struct vringh *vrh)
 {
 	u16 avail;
 
 	if (!vrh->event_indices) {
 		/* Old-school; update flags. */
-		if (putu16(vrh, &vrh->vring.used->flags, 0) != 0) {
+		if (vrh->ops.putu16(vrh, &vrh->vring.used->flags, 0) != 0) {
 			vringh_bad("Clearing used flags %p",
 				   &vrh->vring.used->flags);
 			return true;
 		}
 	} else {
-		if (putu16(vrh, &vring_avail_event(&vrh->vring),
+		if (vrh->ops.putu16(vrh, &vring_avail_event(&vrh->vring),
 			   vrh->last_avail_idx) != 0) {
 			vringh_bad("Updating avail event index %p",
 				   &vring_avail_event(&vrh->vring));
@@ -559,7 +557,7 @@ static inline bool __vringh_notify_enable(struct vringh *vrh,
 	 * sure it's written, then check again. */
 	virtio_mb(vrh->weak_barriers);
 
-	if (getu16(vrh, &avail, &vrh->vring.avail->idx) != 0) {
+	if (vrh->ops.getu16(vrh, &avail, &vrh->vring.avail->idx) != 0) {
 		vringh_bad("Failed to check avail idx at %p",
 			   &vrh->vring.avail->idx);
 		return true;
@@ -570,20 +568,27 @@ static inline bool __vringh_notify_enable(struct vringh *vrh,
 	 * notification anyway). */
 	return avail == vrh->last_avail_idx;
 }
+EXPORT_SYMBOL(vringh_notify_enable);
 
-static inline void __vringh_notify_disable(struct vringh *vrh,
-					   int (*putu16)(const struct vringh *vrh,
-							 __virtio16 *p, u16 val))
+/**
+ * vringh_notify_disable - don't tell us if something changes.
+ * @vrh: the vring.
+ *
+ * This is our normal running state: we disable and then only enable when
+ * we're going to sleep.
+ */
+void vringh_notify_disable(struct vringh *vrh)
 {
 	if (!vrh->event_indices) {
 		/* Old-school; update flags. */
-		if (putu16(vrh, &vrh->vring.used->flags,
+		if (vrh->ops.putu16(vrh, &vrh->vring.used->flags,
 			   VRING_USED_F_NO_NOTIFY)) {
 			vringh_bad("Setting used flags %p",
 				   &vrh->vring.used->flags);
 		}
 	}
 }
+EXPORT_SYMBOL(vringh_notify_disable);
 
 /* Userspace access helpers: in this case, addresses are really userspace. */
 static inline int getu16_user(const struct vringh *vrh, u16 *val, const __virtio16 *p)
@@ -630,6 +635,16 @@ static inline int xfer_to_user(const struct vringh *vrh,
 		-EFAULT : 0;
 }
 
+static const struct vringh_ops user_vringh_ops = {
+	.getu16 = getu16_user,
+	.putu16 = putu16_user,
+	.xfer_from = xfer_from_user,
+	.xfer_to = xfer_to_user,
+	.putused = putused_user,
+	.copydesc = copydesc_user,
+	.range_check = range_check,
+};
+
 /**
  * vringh_init_user - initialize a vringh for a userspace vring.
  * @vrh: the vringh to initialize.
@@ -639,6 +654,7 @@ static inline int xfer_to_user(const struct vringh *vrh,
  * @desc: the userpace descriptor pointer.
  * @avail: the userpace avail pointer.
  * @used: the userpace used pointer.
+ * @getrange: a function that returns the range that the vring can access.
  *
  * Returns an error if num is invalid: you should check pointers
  * yourself!
@@ -647,36 +663,32 @@ int vringh_init_user(struct vringh *vrh, u64 features,
 		     unsigned int num, bool weak_barriers,
 		     vring_desc_t __user *desc,
 		     vring_avail_t __user *avail,
-		     vring_used_t __user *used)
+		     vring_used_t __user *used,
+			 bool (*getrange)(struct vringh *vrh, u64 addr, struct vringh_range *r))
 {
-	/* Sane power of 2 please! */
-	if (!num || num > 0xffff || (num & (num - 1))) {
-		vringh_bad("Bad ring size %u", num);
-		return -EINVAL;
-	}
+	int err;
+
+	err = __vringh_init(vrh, features, num, weak_barriers, GFP_KERNEL,
+			(__force struct vring_desc *)desc,
+			(__force struct vring_avail *)avail,
+			(__force struct vring_used *)used);
+	if (err)
+		return err;
+
+	memcpy(&vrh->ops, &user_vringh_ops, sizeof(user_vringh_ops));
+	vrh->ops.getrange = getrange;
 
-	vrh->little_endian = (features & (1ULL << VIRTIO_F_VERSION_1));
-	vrh->event_indices = (features & (1 << VIRTIO_RING_F_EVENT_IDX));
-	vrh->weak_barriers = weak_barriers;
-	vrh->completed = 0;
-	vrh->last_avail_idx = 0;
-	vrh->last_used_idx = 0;
-	vrh->vring.num = num;
-	/* vring expects kernel addresses, but only used via accessors. */
-	vrh->vring.desc = (__force struct vring_desc *)desc;
-	vrh->vring.avail = (__force struct vring_avail *)avail;
-	vrh->vring.used = (__force struct vring_used *)used;
 	return 0;
 }
 EXPORT_SYMBOL(vringh_init_user);
 
 /**
- * vringh_getdesc_user - get next available descriptor from userspace ring.
- * @vrh: the userspace vring.
+ * vringh_getdesc - get next available descriptor from the ring.
+ * @vrh: the vring.
  * @riov: where to put the readable descriptors (or NULL)
  * @wiov: where to put the writable descriptors (or NULL)
- * @getrange: function to call to check ranges.
- * @head: head index we received, for passing to vringh_complete_user().
+ * @head: head index we received, for passing to vringh_complete().
  *
  * Returns 0 if there was no descriptor, 1 if there was, or -errno.
  *
@@ -690,17 +702,15 @@ EXPORT_SYMBOL(vringh_init_user);
  * When you don't have to use riov and wiov anymore, you should clean up them
  * calling vringh_iov_cleanup() to release the memory, even on error!
  */
-int vringh_getdesc_user(struct vringh *vrh,
+int vringh_getdesc(struct vringh *vrh,
 			struct vringh_kiov *riov,
 			struct vringh_kiov *wiov,
-			bool (*getrange)(struct vringh *vrh,
-					 u64 addr, struct vringh_range *r),
 			u16 *head)
 {
 	int err;
 
 	*head = vrh->vring.num;
-	err = __vringh_get_head(vrh, getu16_user, &vrh->last_avail_idx);
+	err = __vringh_get_head(vrh, &vrh->last_avail_idx);
 	if (err < 0)
 		return err;
 
@@ -709,137 +719,100 @@ int vringh_getdesc_user(struct vringh *vrh,
 		return 0;
 
 	*head = err;
-	err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
-			   (struct vringh_kiov *)wiov,
-			   range_check, getrange, GFP_KERNEL, copydesc_user);
+	err = __vringh_iov(vrh, *head, riov, wiov, GFP_KERNEL);
 	if (err)
 		return err;
 
 	return 1;
 }
-EXPORT_SYMBOL(vringh_getdesc_user);
+EXPORT_SYMBOL(vringh_getdesc);
 
 /**
- * vringh_iov_pull_user - copy bytes from vring_kiov.
- * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
+ * vringh_iov_pull - copy bytes from vring_kiov.
+ * @vrh: the vring to pull data from.
+ * @riov: the riov as passed to vringh_getdesc() (updated as we consume)
  * @dst: the place to copy.
  * @len: the maximum length to copy.
  *
  * Returns the bytes copied <= len or a negative errno.
  */
-ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
+ssize_t vringh_iov_pull(struct vringh *vrh, struct vringh_kiov *riov, void *dst, size_t len)
 {
-	return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
-			       dst, len, xfer_from_user);
+	return vringh_iov_xfer(vrh, riov, dst, len, vrh->ops.xfer_from);
 }
-EXPORT_SYMBOL(vringh_iov_pull_user);
+EXPORT_SYMBOL(vringh_iov_pull);
 
 /**
- * vringh_iov_push_user - copy bytes into vring_kiov.
- * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
+ * vringh_iov_push - copy bytes into vring_kiov.
+ * @vrh: the vring to push data into.
+ * @wiov: the wiov as passed to vringh_getdesc() (updated as we consume)
  * @src: the place to copy from.
  * @len: the maximum length to copy.
  *
  * Returns the bytes copied <= len or a negative errno.
  */
-ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
+ssize_t vringh_iov_push(struct vringh *vrh, struct vringh_kiov *wiov,
 			     const void *src, size_t len)
 {
-	return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
-			       (void *)src, len, xfer_to_user);
+	return vringh_iov_xfer(vrh, wiov, (void *)src, len, vrh->ops.xfer_to);
 }
-EXPORT_SYMBOL(vringh_iov_push_user);
+EXPORT_SYMBOL(vringh_iov_push);
 
 /**
- * vringh_abandon_user - we've decided not to handle the descriptor(s).
+ * vringh_abandon - we've decided not to handle the descriptor(s).
  * @vrh: the vring.
  * @num: the number of descriptors to put back (ie. num
  *	 vringh_get_user() to undo).
  *
  * The next vringh_get_user() will return the old descriptor(s) again.
  */
-void vringh_abandon_user(struct vringh *vrh, unsigned int num)
+void vringh_abandon(struct vringh *vrh, unsigned int num)
 {
 	/* We only update vring_avail_event(vr) when we want to be notified,
 	 * so we haven't changed that yet. */
 	vrh->last_avail_idx -= num;
 }
-EXPORT_SYMBOL(vringh_abandon_user);
+EXPORT_SYMBOL(vringh_abandon);
 
 /**
- * vringh_complete_user - we've finished with descriptor, publish it.
+ * vringh_complete - we've finished with descriptor, publish it.
  * @vrh: the vring.
- * @head: the head as filled in by vringh_getdesc_user.
+ * @head: the head as filled in by vringh_getdesc.
  * @len: the length of data we have written.
  *
- * You should check vringh_need_notify_user() after one or more calls
+ * You should check vringh_need_notify() after one or more calls
  * to this function.
  */
-int vringh_complete_user(struct vringh *vrh, u16 head, u32 len)
+int vringh_complete(struct vringh *vrh, u16 head, u32 len)
 {
 	struct vring_used_elem used;
 
 	used.id = cpu_to_vringh32(vrh, head);
 	used.len = cpu_to_vringh32(vrh, len);
-	return __vringh_complete(vrh, &used, 1, putu16_user, putused_user);
+	return __vringh_complete(vrh, &used, 1);
 }
-EXPORT_SYMBOL(vringh_complete_user);
+EXPORT_SYMBOL(vringh_complete);
 
 /**
- * vringh_complete_multi_user - we've finished with many descriptors.
+ * vringh_complete_multi - we've finished with many descriptors.
  * @vrh: the vring.
  * @used: the head, length pairs.
  * @num_used: the number of used elements.
  *
- * You should check vringh_need_notify_user() after one or more calls
+ * You should check vringh_need_notify() after one or more calls
  * to this function.
  */
-int vringh_complete_multi_user(struct vringh *vrh,
+int vringh_complete_multi(struct vringh *vrh,
 			       const struct vring_used_elem used[],
 			       unsigned num_used)
 {
-	return __vringh_complete(vrh, used, num_used,
-				 putu16_user, putused_user);
-}
-EXPORT_SYMBOL(vringh_complete_multi_user);
-
-/**
- * vringh_notify_enable_user - we want to know if something changes.
- * @vrh: the vring.
- *
- * This always enables notifications, but returns false if there are
- * now more buffers available in the vring.
- */
-bool vringh_notify_enable_user(struct vringh *vrh)
-{
-	return __vringh_notify_enable(vrh, getu16_user, putu16_user);
+	return __vringh_complete(vrh, used, num_used);
 }
-EXPORT_SYMBOL(vringh_notify_enable_user);
+EXPORT_SYMBOL(vringh_complete_multi);
 
-/**
- * vringh_notify_disable_user - don't tell us if something changes.
- * @vrh: the vring.
- *
- * This is our normal running state: we disable and then only enable when
- * we're going to sleep.
- */
-void vringh_notify_disable_user(struct vringh *vrh)
-{
-	__vringh_notify_disable(vrh, putu16_user);
-}
-EXPORT_SYMBOL(vringh_notify_disable_user);
 
-/**
- * vringh_need_notify_user - must we tell the other side about used buffers?
- * @vrh: the vring we've called vringh_complete_user() on.
- *
- * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
- */
-int vringh_need_notify_user(struct vringh *vrh)
-{
-	return __vringh_need_notify(vrh, getu16_user);
-}
-EXPORT_SYMBOL(vringh_need_notify_user);
 
 /* Kernelspace access helpers. */
 static inline int getu16_kern(const struct vringh *vrh,
@@ -885,6 +858,17 @@ static inline int kern_xfer(const struct vringh *vrh, void *dst,
 	return 0;
 }
 
+static const struct vringh_ops kern_vringh_ops = {
+	.getu16 = getu16_kern,
+	.putu16 = putu16_kern,
+	.xfer_from = xfer_kern,
+	.xfer_to = kern_xfer,
+	.putused = putused_kern,
+	.copydesc = copydesc_kern,
+	.range_check = no_range_check,
+	.getrange = NULL,
+};
+
 /**
  * vringh_init_kern - initialize a vringh for a kernelspace vring.
  * @vrh: the vringh to initialize.
@@ -898,179 +882,22 @@ static inline int kern_xfer(const struct vringh *vrh, void *dst,
  * Returns an error if num is invalid.
  */
 int vringh_init_kern(struct vringh *vrh, u64 features,
-		     unsigned int num, bool weak_barriers,
+		     unsigned int num, bool weak_barriers, gfp_t gfp,
 		     struct vring_desc *desc,
 		     struct vring_avail *avail,
 		     struct vring_used *used)
-{
-	/* Sane power of 2 please! */
-	if (!num || num > 0xffff || (num & (num - 1))) {
-		vringh_bad("Bad ring size %u", num);
-		return -EINVAL;
-	}
-
-	vrh->little_endian = (features & (1ULL << VIRTIO_F_VERSION_1));
-	vrh->event_indices = (features & (1 << VIRTIO_RING_F_EVENT_IDX));
-	vrh->weak_barriers = weak_barriers;
-	vrh->completed = 0;
-	vrh->last_avail_idx = 0;
-	vrh->last_used_idx = 0;
-	vrh->vring.num = num;
-	vrh->vring.desc = desc;
-	vrh->vring.avail = avail;
-	vrh->vring.used = used;
-	return 0;
-}
-EXPORT_SYMBOL(vringh_init_kern);
-
-/**
- * vringh_getdesc_kern - get next available descriptor from kernelspace ring.
- * @vrh: the kernelspace vring.
- * @riov: where to put the readable descriptors (or NULL)
- * @wiov: where to put the writable descriptors (or NULL)
- * @head: head index we received, for passing to vringh_complete_kern().
- * @gfp: flags for allocating larger riov/wiov.
- *
- * Returns 0 if there was no descriptor, 1 if there was, or -errno.
- *
- * Note that on error return, you can tell the difference between an
- * invalid ring and a single invalid descriptor: in the former case,
- * *head will be vrh->vring.num.  You may be able to ignore an invalid
- * descriptor, but there's not much you can do with an invalid ring.
- *
- * Note that you can reuse riov and wiov with subsequent calls. Content is
- * overwritten and memory reallocated if more space is needed.
- * When you don't have to use riov and wiov anymore, you should clean up them
- * calling vringh_kiov_cleanup() to release the memory, even on error!
- */
-int vringh_getdesc_kern(struct vringh *vrh,
-			struct vringh_kiov *riov,
-			struct vringh_kiov *wiov,
-			u16 *head,
-			gfp_t gfp)
 {
 	int err;
 
-	err = __vringh_get_head(vrh, getu16_kern, &vrh->last_avail_idx);
-	if (err < 0)
-		return err;
-
-	/* Empty... */
-	if (err == vrh->vring.num)
-		return 0;
-
-	*head = err;
-	err = __vringh_iov(vrh, *head, riov, wiov, no_range_check, NULL,
-			   gfp, copydesc_kern);
+	err = __vringh_init(vrh, features, num, weak_barriers, gfp, desc, avail, used);
 	if (err)
 		return err;
 
-	return 1;
-}
-EXPORT_SYMBOL(vringh_getdesc_kern);
-
-/**
- * vringh_iov_pull_kern - copy bytes from vring_iov.
- * @riov: the riov as passed to vringh_getdesc_kern() (updated as we consume)
- * @dst: the place to copy.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_pull_kern(struct vringh_kiov *riov, void *dst, size_t len)
-{
-	return vringh_iov_xfer(NULL, riov, dst, len, xfer_kern);
-}
-EXPORT_SYMBOL(vringh_iov_pull_kern);
-
-/**
- * vringh_iov_push_kern - copy bytes into vring_iov.
- * @wiov: the wiov as passed to vringh_getdesc_kern() (updated as we consume)
- * @src: the place to copy from.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_push_kern(struct vringh_kiov *wiov,
-			     const void *src, size_t len)
-{
-	return vringh_iov_xfer(NULL, wiov, (void *)src, len, kern_xfer);
-}
-EXPORT_SYMBOL(vringh_iov_push_kern);
+	memcpy(&vrh->ops, &kern_vringh_ops, sizeof(kern_vringh_ops));
 
-/**
- * vringh_abandon_kern - we've decided not to handle the descriptor(s).
- * @vrh: the vring.
- * @num: the number of descriptors to put back (ie. num
- *	 vringh_get_kern() to undo).
- *
- * The next vringh_get_kern() will return the old descriptor(s) again.
- */
-void vringh_abandon_kern(struct vringh *vrh, unsigned int num)
-{
-	/* We only update vring_avail_event(vr) when we want to be notified,
-	 * so we haven't changed that yet. */
-	vrh->last_avail_idx -= num;
-}
-EXPORT_SYMBOL(vringh_abandon_kern);
-
-/**
- * vringh_complete_kern - we've finished with descriptor, publish it.
- * @vrh: the vring.
- * @head: the head as filled in by vringh_getdesc_kern.
- * @len: the length of data we have written.
- *
- * You should check vringh_need_notify_kern() after one or more calls
- * to this function.
- */
-int vringh_complete_kern(struct vringh *vrh, u16 head, u32 len)
-{
-	struct vring_used_elem used;
-
-	used.id = cpu_to_vringh32(vrh, head);
-	used.len = cpu_to_vringh32(vrh, len);
-
-	return __vringh_complete(vrh, &used, 1, putu16_kern, putused_kern);
-}
-EXPORT_SYMBOL(vringh_complete_kern);
-
-/**
- * vringh_notify_enable_kern - we want to know if something changes.
- * @vrh: the vring.
- *
- * This always enables notifications, but returns false if there are
- * now more buffers available in the vring.
- */
-bool vringh_notify_enable_kern(struct vringh *vrh)
-{
-	return __vringh_notify_enable(vrh, getu16_kern, putu16_kern);
-}
-EXPORT_SYMBOL(vringh_notify_enable_kern);
-
-/**
- * vringh_notify_disable_kern - don't tell us if something changes.
- * @vrh: the vring.
- *
- * This is our normal running state: we disable and then only enable when
- * we're going to sleep.
- */
-void vringh_notify_disable_kern(struct vringh *vrh)
-{
-	__vringh_notify_disable(vrh, putu16_kern);
-}
-EXPORT_SYMBOL(vringh_notify_disable_kern);
-
-/**
- * vringh_need_notify_kern - must we tell the other side about used buffers?
- * @vrh: the vring we've called vringh_complete_kern() on.
- *
- * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
- */
-int vringh_need_notify_kern(struct vringh *vrh)
-{
-	return __vringh_need_notify(vrh, getu16_kern);
+	return 0;
 }
-EXPORT_SYMBOL(vringh_need_notify_kern);
+EXPORT_SYMBOL(vringh_init_kern);
 
 #if IS_REACHABLE(CONFIG_VHOST_IOTLB)
 
@@ -1122,7 +949,7 @@ static int iotlb_translate(const struct vringh *vrh,
 	return ret;
 }
 
-static inline int copy_from_iotlb(const struct vringh *vrh, void *dst,
+static int copy_from_iotlb(const struct vringh *vrh, void *dst,
 				  void *src, size_t len)
 {
 	u64 total_translated = 0;
@@ -1155,7 +982,7 @@ static inline int copy_from_iotlb(const struct vringh *vrh, void *dst,
 	return total_translated;
 }
 
-static inline int copy_to_iotlb(const struct vringh *vrh, void *dst,
+static int copy_to_iotlb(const struct vringh *vrh, void *dst,
 				void *src, size_t len)
 {
 	u64 total_translated = 0;
@@ -1188,7 +1015,7 @@ static inline int copy_to_iotlb(const struct vringh *vrh, void *dst,
 	return total_translated;
 }
 
-static inline int getu16_iotlb(const struct vringh *vrh,
+static int getu16_iotlb(const struct vringh *vrh,
 			       u16 *val, const __virtio16 *p)
 {
 	struct bio_vec iov;
@@ -1209,7 +1036,7 @@ static inline int getu16_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int putu16_iotlb(const struct vringh *vrh,
+static int putu16_iotlb(const struct vringh *vrh,
 			       __virtio16 *p, u16 val)
 {
 	struct bio_vec iov;
@@ -1230,7 +1057,7 @@ static inline int putu16_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int copydesc_iotlb(const struct vringh *vrh,
+static int copydesc_iotlb(const struct vringh *vrh,
 				 void *dst, const void *src, size_t len)
 {
 	int ret;
@@ -1242,7 +1069,7 @@ static inline int copydesc_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int xfer_from_iotlb(const struct vringh *vrh, void *src,
+static int xfer_from_iotlb(const struct vringh *vrh, void *src,
 				  void *dst, size_t len)
 {
 	int ret;
@@ -1254,7 +1081,7 @@ static inline int xfer_from_iotlb(const struct vringh *vrh, void *src,
 	return 0;
 }
 
-static inline int xfer_to_iotlb(const struct vringh *vrh,
+static int xfer_to_iotlb(const struct vringh *vrh,
 			       void *dst, void *src, size_t len)
 {
 	int ret;
@@ -1266,7 +1093,7 @@ static inline int xfer_to_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
-static inline int putused_iotlb(const struct vringh *vrh,
+static int putused_iotlb(const struct vringh *vrh,
 				struct vring_used_elem *dst,
 				const struct vring_used_elem *src,
 				unsigned int num)
@@ -1281,6 +1108,17 @@ static inline int putused_iotlb(const struct vringh *vrh,
 	return 0;
 }
 
+static const struct vringh_ops iotlb_vringh_ops = {
+	.getu16 = getu16_iotlb,
+	.putu16 = putu16_iotlb,
+	.xfer_from = xfer_from_iotlb,
+	.xfer_to = xfer_to_iotlb,
+	.putused = putused_iotlb,
+	.copydesc = copydesc_iotlb,
+	.range_check = no_range_check,
+	.getrange = NULL,
+};
+
 /**
  * vringh_init_iotlb - initialize a vringh for a ring with IOTLB.
  * @vrh: the vringh to initialize.
@@ -1294,13 +1132,20 @@ static inline int putused_iotlb(const struct vringh *vrh,
  * Returns an error if num is invalid.
  */
 int vringh_init_iotlb(struct vringh *vrh, u64 features,
-		      unsigned int num, bool weak_barriers,
+		      unsigned int num, bool weak_barriers, gfp_t gfp,
 		      struct vring_desc *desc,
 		      struct vring_avail *avail,
 		      struct vring_used *used)
 {
-	return vringh_init_kern(vrh, features, num, weak_barriers,
-				desc, avail, used);
+	int err;
+
+	err = __vringh_init(vrh, features, num, weak_barriers, gfp, desc, avail, used);
+	if (err)
+		return err;
+
+	memcpy(&vrh->ops, &iotlb_vringh_ops, sizeof(iotlb_vringh_ops));
+
+	return 0;
 }
 EXPORT_SYMBOL(vringh_init_iotlb);
 
@@ -1318,162 +1163,6 @@ void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
 }
 EXPORT_SYMBOL(vringh_set_iotlb);
 
-/**
- * vringh_getdesc_iotlb - get next available descriptor from ring with
- * IOTLB.
- * @vrh: the kernelspace vring.
- * @riov: where to put the readable descriptors (or NULL)
- * @wiov: where to put the writable descriptors (or NULL)
- * @head: head index we received, for passing to vringh_complete_iotlb().
- * @gfp: flags for allocating larger riov/wiov.
- *
- * Returns 0 if there was no descriptor, 1 if there was, or -errno.
- *
- * Note that on error return, you can tell the difference between an
- * invalid ring and a single invalid descriptor: in the former case,
- * *head will be vrh->vring.num.  You may be able to ignore an invalid
- * descriptor, but there's not much you can do with an invalid ring.
- *
- * Note that you can reuse riov and wiov with subsequent calls. Content is
- * overwritten and memory reallocated if more space is needed.
- * When you don't have to use riov and wiov anymore, you should clean up them
- * calling vringh_kiov_cleanup() to release the memory, even on error!
- */
-int vringh_getdesc_iotlb(struct vringh *vrh,
-			 struct vringh_kiov *riov,
-			 struct vringh_kiov *wiov,
-			 u16 *head,
-			 gfp_t gfp)
-{
-	int err;
-
-	err = __vringh_get_head(vrh, getu16_iotlb, &vrh->last_avail_idx);
-	if (err < 0)
-		return err;
-
-	/* Empty... */
-	if (err == vrh->vring.num)
-		return 0;
-
-	*head = err;
-	err = __vringh_iov(vrh, *head, riov, wiov, no_range_check, NULL,
-			   gfp, copydesc_iotlb);
-	if (err)
-		return err;
-
-	return 1;
-}
-EXPORT_SYMBOL(vringh_getdesc_iotlb);
-
-/**
- * vringh_iov_pull_iotlb - copy bytes from vring_iov.
- * @vrh: the vring.
- * @riov: the riov as passed to vringh_getdesc_iotlb() (updated as we consume)
- * @dst: the place to copy.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_pull_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *riov,
-			      void *dst, size_t len)
-{
-	return vringh_iov_xfer(vrh, riov, dst, len, xfer_from_iotlb);
-}
-EXPORT_SYMBOL(vringh_iov_pull_iotlb);
-
-/**
- * vringh_iov_push_iotlb - copy bytes into vring_iov.
- * @vrh: the vring.
- * @wiov: the wiov as passed to vringh_getdesc_iotlb() (updated as we consume)
- * @src: the place to copy from.
- * @len: the maximum length to copy.
- *
- * Returns the bytes copied <= len or a negative errno.
- */
-ssize_t vringh_iov_push_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *wiov,
-			      const void *src, size_t len)
-{
-	return vringh_iov_xfer(vrh, wiov, (void *)src, len, xfer_to_iotlb);
-}
-EXPORT_SYMBOL(vringh_iov_push_iotlb);
-
-/**
- * vringh_abandon_iotlb - we've decided not to handle the descriptor(s).
- * @vrh: the vring.
- * @num: the number of descriptors to put back (ie. num
- *	 vringh_get_iotlb() to undo).
- *
- * The next vringh_get_iotlb() will return the old descriptor(s) again.
- */
-void vringh_abandon_iotlb(struct vringh *vrh, unsigned int num)
-{
-	/* We only update vring_avail_event(vr) when we want to be notified,
-	 * so we haven't changed that yet.
-	 */
-	vrh->last_avail_idx -= num;
-}
-EXPORT_SYMBOL(vringh_abandon_iotlb);
-
-/**
- * vringh_complete_iotlb - we've finished with descriptor, publish it.
- * @vrh: the vring.
- * @head: the head as filled in by vringh_getdesc_iotlb.
- * @len: the length of data we have written.
- *
- * You should check vringh_need_notify_iotlb() after one or more calls
- * to this function.
- */
-int vringh_complete_iotlb(struct vringh *vrh, u16 head, u32 len)
-{
-	struct vring_used_elem used;
-
-	used.id = cpu_to_vringh32(vrh, head);
-	used.len = cpu_to_vringh32(vrh, len);
-
-	return __vringh_complete(vrh, &used, 1, putu16_iotlb, putused_iotlb);
-}
-EXPORT_SYMBOL(vringh_complete_iotlb);
-
-/**
- * vringh_notify_enable_iotlb - we want to know if something changes.
- * @vrh: the vring.
- *
- * This always enables notifications, but returns false if there are
- * now more buffers available in the vring.
- */
-bool vringh_notify_enable_iotlb(struct vringh *vrh)
-{
-	return __vringh_notify_enable(vrh, getu16_iotlb, putu16_iotlb);
-}
-EXPORT_SYMBOL(vringh_notify_enable_iotlb);
-
-/**
- * vringh_notify_disable_iotlb - don't tell us if something changes.
- * @vrh: the vring.
- *
- * This is our normal running state: we disable and then only enable when
- * we're going to sleep.
- */
-void vringh_notify_disable_iotlb(struct vringh *vrh)
-{
-	__vringh_notify_disable(vrh, putu16_iotlb);
-}
-EXPORT_SYMBOL(vringh_notify_disable_iotlb);
-
-/**
- * vringh_need_notify_iotlb - must we tell the other side about used buffers?
- * @vrh: the vring we've called vringh_complete_iotlb() on.
- *
- * Returns -errno or 0 if we don't need to tell the other side, 1 if we do.
- */
-int vringh_need_notify_iotlb(struct vringh *vrh)
-{
-	return __vringh_need_notify(vrh, getu16_iotlb);
-}
-EXPORT_SYMBOL(vringh_need_notify_iotlb);
-
 #endif
 
 MODULE_LICENSE("GPL");
diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 733d948e8123..89c73605c85f 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -21,6 +21,36 @@
 #endif
 #include <asm/barrier.h>
 
+struct vringh;
+struct vringh_range;
+
+/**
+ * struct vringh_ops - ops for accessing a vring and checking access ranges.
+ * @getu16: read a u16 value from a pointer
+ * @putu16: write a u16 value to a pointer
+ * @xfer_from: copy a memory range from a specified address to a local virtual address
+ * @xfer_to: copy a memory range from a local virtual address to a specified address
+ * @putused: update a vring used descriptor
+ * @copydesc: copy a descriptor from the target to a local virtual address
+ * @range_check: check if the region is accessible
+ * @getrange: return a range that the vring can access
+ */
+struct vringh_ops {
+	int (*getu16)(const struct vringh *vrh, u16 *val, const __virtio16 *p);
+	int (*putu16)(const struct vringh *vrh, __virtio16 *p, u16 val);
+	int (*xfer_from)(const struct vringh *vrh, void *src, void *dst,
+			 size_t len);
+	int (*xfer_to)(const struct vringh *vrh, void *dst, void *src,
+		       size_t len);
+	int (*putused)(const struct vringh *vrh, struct vring_used_elem *dst,
+		       const struct vring_used_elem *src, unsigned int num);
+	int (*copydesc)(const struct vringh *vrh, void *dst, const void *src,
+			size_t len);
+	bool (*range_check)(struct vringh *vrh, u64 addr, size_t *len,
+			    struct vringh_range *range);
+	bool (*getrange)(struct vringh *vrh, u64 addr, struct vringh_range *r);
+};
+
 /* virtio_ring with information needed for host access. */
 struct vringh {
 	/* Everything is little endian */
@@ -52,6 +82,10 @@ struct vringh {
 
 	/* The function to call to notify the guest about added buffers */
 	void (*notify)(struct vringh *);
+
+	struct vringh_ops ops;
+
+	gfp_t desc_gfp;
 };
 
 /**
@@ -99,41 +133,40 @@ int vringh_init_user(struct vringh *vrh, u64 features,
 		     unsigned int num, bool weak_barriers,
 		     vring_desc_t __user *desc,
 		     vring_avail_t __user *avail,
-		     vring_used_t __user *used);
+		     vring_used_t __user *used,
+		     bool (*getrange)(struct vringh *vrh, u64 addr, struct vringh_range *r));
 
 /* Convert a descriptor into iovecs. */
-int vringh_getdesc_user(struct vringh *vrh,
+int vringh_getdesc(struct vringh *vrh,
 			struct vringh_kiov *riov,
 			struct vringh_kiov *wiov,
-			bool (*getrange)(struct vringh *vrh,
-					 u64 addr, struct vringh_range *r),
 			u16 *head);
 
 /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
-ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
+ssize_t vringh_iov_pull(struct vringh *vrh, struct vringh_kiov *riov, void *dst, size_t len);
 
 /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
-ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
+ssize_t vringh_iov_push(struct vringh *vrh, struct vringh_kiov *wiov,
 			     const void *src, size_t len);
 
 /* Mark a descriptor as used. */
-int vringh_complete_user(struct vringh *vrh, u16 head, u32 len);
-int vringh_complete_multi_user(struct vringh *vrh,
+int vringh_complete(struct vringh *vrh, u16 head, u32 len);
+int vringh_complete_multi(struct vringh *vrh,
 			       const struct vring_used_elem used[],
 			       unsigned num_used);
 
 /* Pretend we've never seen descriptor (for easy error handling). */
-void vringh_abandon_user(struct vringh *vrh, unsigned int num);
+void vringh_abandon(struct vringh *vrh, unsigned int num);
 
 /* Do we need to fire the eventfd to notify the other side? */
-int vringh_need_notify_user(struct vringh *vrh);
+int vringh_need_notify(struct vringh *vrh);
 
-bool vringh_notify_enable_user(struct vringh *vrh);
-void vringh_notify_disable_user(struct vringh *vrh);
+bool vringh_notify_enable(struct vringh *vrh);
+void vringh_notify_disable(struct vringh *vrh);
 
 /* Helpers for kernelspace vrings. */
 int vringh_init_kern(struct vringh *vrh, u64 features,
-		     unsigned int num, bool weak_barriers,
+		     unsigned int num, bool weak_barriers, gfp_t gfp,
 		     struct vring_desc *desc,
 		     struct vring_avail *avail,
 		     struct vring_used *used);
@@ -176,23 +209,6 @@ static inline size_t vringh_kiov_length(struct vringh_kiov *kiov)
 
 void vringh_kiov_advance(struct vringh_kiov *kiov, size_t len);
 
-int vringh_getdesc_kern(struct vringh *vrh,
-			struct vringh_kiov *riov,
-			struct vringh_kiov *wiov,
-			u16 *head,
-			gfp_t gfp);
-
-ssize_t vringh_iov_pull_kern(struct vringh_kiov *riov, void *dst, size_t len);
-ssize_t vringh_iov_push_kern(struct vringh_kiov *wiov,
-			     const void *src, size_t len);
-void vringh_abandon_kern(struct vringh *vrh, unsigned int num);
-int vringh_complete_kern(struct vringh *vrh, u16 head, u32 len);
-
-bool vringh_notify_enable_kern(struct vringh *vrh);
-void vringh_notify_disable_kern(struct vringh *vrh);
-
-int vringh_need_notify_kern(struct vringh *vrh);
-
 /* Notify the guest about buffers added to the used ring */
 static inline void vringh_notify(struct vringh *vrh)
 {
@@ -242,33 +258,11 @@ void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
 		      spinlock_t *iotlb_lock);
 
 int vringh_init_iotlb(struct vringh *vrh, u64 features,
-		      unsigned int num, bool weak_barriers,
+		      unsigned int num, bool weak_barriers, gfp_t gfp,
 		      struct vring_desc *desc,
 		      struct vring_avail *avail,
 		      struct vring_used *used);
 
-int vringh_getdesc_iotlb(struct vringh *vrh,
-			 struct vringh_kiov *riov,
-			 struct vringh_kiov *wiov,
-			 u16 *head,
-			 gfp_t gfp);
-
-ssize_t vringh_iov_pull_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *riov,
-			      void *dst, size_t len);
-ssize_t vringh_iov_push_iotlb(struct vringh *vrh,
-			      struct vringh_kiov *wiov,
-			      const void *src, size_t len);
-
-void vringh_abandon_iotlb(struct vringh *vrh, unsigned int num);
-
-int vringh_complete_iotlb(struct vringh *vrh, u16 head, u32 len);
-
-bool vringh_notify_enable_iotlb(struct vringh *vrh);
-void vringh_notify_disable_iotlb(struct vringh *vrh);
-
-int vringh_need_notify_iotlb(struct vringh *vrh);
-
 #endif /* CONFIG_VHOST_IOTLB */
 
 #endif /* _LINUX_VRINGH_H */
-- 
2.25.1


_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC PATCH 5/9] tools/virtio: convert to use new unified vringh APIs
  2022-12-27  2:25 ` Shunsuke Mie
@ 2022-12-27  2:25   ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: kvm, virtualization, netdev, linux-kernel, Shunsuke Mie

The vringh_*_user APIs are being removed, except for vringh_init_user(), so
convert the test code to use the new unified APIs.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 tools/virtio/vringh_test.c | 89 +++++++++++++++++++-------------------
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/tools/virtio/vringh_test.c b/tools/virtio/vringh_test.c
index 6c9533b8a2ca..068c6d5aa4fd 100644
--- a/tools/virtio/vringh_test.c
+++ b/tools/virtio/vringh_test.c
@@ -187,7 +187,7 @@ static int parallel_test(u64 features,
 
 		vring_init(&vrh.vring, RINGSIZE, host_map, ALIGN);
 		vringh_init_user(&vrh, features, RINGSIZE, true,
-				 vrh.vring.desc, vrh.vring.avail, vrh.vring.used);
+				 vrh.vring.desc, vrh.vring.avail, vrh.vring.used, getrange);
 		CPU_SET(first_cpu, &cpu_set);
 		if (sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set))
 			errx(1, "Could not set affinity to cpu %u", first_cpu);
@@ -202,9 +202,9 @@ static int parallel_test(u64 features,
 					err = vringh_get_head(&vrh, &head);
 					if (err != 0)
 						break;
-					err = vringh_need_notify_user(&vrh);
+					err = vringh_need_notify(&vrh);
 					if (err < 0)
-						errx(1, "vringh_need_notify_user: %i",
+						errx(1, "vringh_need_notify: %i",
 						     err);
 					if (err) {
 						write(to_guest[1], "", 1);
@@ -223,46 +223,45 @@ static int parallel_test(u64 features,
 						host_wiov,
 						ARRAY_SIZE(host_wiov));
 
-				err = vringh_getdesc_user(&vrh, &riov, &wiov,
-							  getrange, &head);
+				err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 			}
 			if (err == 0) {
-				err = vringh_need_notify_user(&vrh);
+				err = vringh_need_notify(&vrh);
 				if (err < 0)
-					errx(1, "vringh_need_notify_user: %i",
+					errx(1, "vringh_need_notify: %i",
 					     err);
 				if (err) {
 					write(to_guest[1], "", 1);
 					notifies++;
 				}
 
-				if (!vringh_notify_enable_user(&vrh))
+				if (!vringh_notify_enable(&vrh))
 					continue;
 
 				/* Swallow all notifies at once. */
 				if (read(to_host[0], buf, sizeof(buf)) < 1)
 					break;
 
-				vringh_notify_disable_user(&vrh);
+				vringh_notify_disable(&vrh);
 				receives++;
 				continue;
 			}
 			if (err != 1)
-				errx(1, "vringh_getdesc_user: %i", err);
+				errx(1, "vringh_getdesc: %i", err);
 
 			/* We simply copy bytes. */
 			if (riov.used) {
-				rlen = vringh_iov_pull_user(&riov, rbuf,
+				rlen = vringh_iov_pull(&vrh, &riov, rbuf,
 							    sizeof(rbuf));
 				if (rlen != 4)
-					errx(1, "vringh_iov_pull_user: %i",
+					errx(1, "vringh_iov_pull: %i",
 					     rlen);
 				assert(riov.i == riov.used);
 				written = 0;
 			} else {
-				err = vringh_iov_push_user(&wiov, rbuf, rlen);
+				err = vringh_iov_push(&vrh, &wiov, rbuf, rlen);
 				if (err != rlen)
-					errx(1, "vringh_iov_push_user: %i",
+					errx(1, "vringh_iov_push: %i",
 					     err);
 				assert(wiov.i == wiov.used);
 				written = err;
@@ -270,14 +269,14 @@ static int parallel_test(u64 features,
 		complete:
 			xfers++;
 
-			err = vringh_complete_user(&vrh, head, written);
+			err = vringh_complete(&vrh, head, written);
 			if (err != 0)
-				errx(1, "vringh_complete_user: %i", err);
+				errx(1, "vringh_complete: %i", err);
 		}
 
-		err = vringh_need_notify_user(&vrh);
+		err = vringh_need_notify(&vrh);
 		if (err < 0)
-			errx(1, "vringh_need_notify_user: %i", err);
+			errx(1, "vringh_need_notify: %i", err);
 		if (err) {
 			write(to_guest[1], "", 1);
 			notifies++;
@@ -493,12 +492,12 @@ int main(int argc, char *argv[])
 	/* Set up host side. */
 	vring_init(&vrh.vring, RINGSIZE, __user_addr_min, ALIGN);
 	vringh_init_user(&vrh, vdev.features, RINGSIZE, true,
-			 vrh.vring.desc, vrh.vring.avail, vrh.vring.used);
+			 vrh.vring.desc, vrh.vring.avail, vrh.vring.used, getrange);
 
 	/* No descriptor to get yet... */
-	err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
+	err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 	if (err != 0)
-		errx(1, "vringh_getdesc_user: %i", err);
+		errx(1, "vringh_getdesc: %i", err);
 
 	/* Guest puts in a descriptor. */
 	memcpy(__user_addr_max - 1, "a", 1);
@@ -520,9 +519,9 @@ int main(int argc, char *argv[])
 	vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
 	vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
-	err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
+	err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 	if (err != 1)
-		errx(1, "vringh_getdesc_user: %i", err);
+		errx(1, "vringh_getdesc: %i", err);
 
 	assert(riov.used == 1);
 	assert(riov.iov[0].iov_base == __user_addr_max - 1);
@@ -539,25 +538,25 @@ int main(int argc, char *argv[])
 		assert(wiov.iov[1].iov_len == 1);
 	}
 
-	err = vringh_iov_pull_user(&riov, buf, 5);
+	err = vringh_iov_pull(&vrh, &riov, buf, 5);
 	if (err != 1)
-		errx(1, "vringh_iov_pull_user: %i", err);
+		errx(1, "vringh_iov_pull: %i", err);
 	assert(buf[0] == 'a');
 	assert(riov.i == 1);
-	assert(vringh_iov_pull_user(&riov, buf, 5) == 0);
+	assert(vringh_iov_pull(&vrh, &riov, buf, 5) == 0);
 
 	memcpy(buf, "bcdef", 5);
-	err = vringh_iov_push_user(&wiov, buf, 5);
+	err = vringh_iov_push(&vrh, &wiov, buf, 5);
 	if (err != 2)
-		errx(1, "vringh_iov_push_user: %i", err);
+		errx(1, "vringh_iov_push: %i", err);
 	assert(memcmp(__user_addr_max - 3, "bc", 2) == 0);
 	assert(wiov.i == wiov.used);
-	assert(vringh_iov_push_user(&wiov, buf, 5) == 0);
+	assert(vringh_iov_push(&vrh, &wiov, buf, 5) == 0);
 
 	/* Host is done. */
-	err = vringh_complete_user(&vrh, head, err);
+	err = vringh_complete(&vrh, head, err);
 	if (err != 0)
-		errx(1, "vringh_complete_user: %i", err);
+		errx(1, "vringh_complete: %i", err);
 
 	/* Guest should see used token now. */
 	__kfree_ignore_start = __user_addr_min + vring_size(RINGSIZE, ALIGN);
@@ -589,9 +588,9 @@ int main(int argc, char *argv[])
 	vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
 	vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
-	err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
+	err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 	if (err != 1)
-		errx(1, "vringh_getdesc_user: %i", err);
+		errx(1, "vringh_getdesc: %i", err);
 
 	assert(riov.max_num & VRINGH_IOV_ALLOCATED);
 	assert(riov.iov != host_riov);
@@ -605,9 +604,9 @@ int main(int argc, char *argv[])
 
 	/* Pull data back out (in odd chunks), should be as expected. */
 	for (i = 0; i < RINGSIZE * USER_MEM/4; i += 3) {
-		err = vringh_iov_pull_user(&riov, buf, 3);
+		err = vringh_iov_pull(&vrh, &riov, buf, 3);
 		if (err != 3 && i + err != RINGSIZE * USER_MEM/4)
-			errx(1, "vringh_iov_pull_user large: %i", err);
+					errx(1, "vringh_iov_pull large: %i", err);
 		assert(buf[0] == (char)i);
 		assert(err < 2 || buf[1] == (char)(i + 1));
 		assert(err < 3 || buf[2] == (char)(i + 2));
@@ -619,9 +618,9 @@ int main(int argc, char *argv[])
 	/* Complete using multi interface, just because we can. */
 	used[0].id = head;
 	used[0].len = 0;
-	err = vringh_complete_multi_user(&vrh, used, 1);
+	err = vringh_complete_multi(&vrh, used, 1);
 	if (err)
-		errx(1, "vringh_complete_multi_user(1): %i", err);
+		errx(1, "vringh_complete_multi(1): %i", err);
 
 	/* Free up those descriptors. */
 	ret = virtqueue_get_buf(vq, &i);
@@ -642,17 +641,17 @@ int main(int argc, char *argv[])
 	vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
 	for (i = 0; i < RINGSIZE; i++) {
-		err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
+		err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 		if (err != 1)
-			errx(1, "vringh_getdesc_user: %i", err);
+			errx(1, "vringh_getdesc: %i", err);
 		used[i].id = head;
 		used[i].len = 0;
 	}
 	/* Make sure it wraps around ring, to test! */
 	assert(vrh.vring.used->idx % RINGSIZE != 0);
-	err = vringh_complete_multi_user(&vrh, used, RINGSIZE);
+	err = vringh_complete_multi(&vrh, used, RINGSIZE);
 	if (err)
-		errx(1, "vringh_complete_multi_user: %i", err);
+		errx(1, "vringh_complete_multi: %i", err);
 
 	/* Free those buffers. */
 	for (i = 0; i < RINGSIZE; i++) {
@@ -726,19 +725,19 @@ int main(int argc, char *argv[])
 		vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
 		vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
-		err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
+		err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 		if (err != 1)
-			errx(1, "vringh_getdesc_user: %i", err);
+			errx(1, "vringh_getdesc: %i", err);
 
 		if (head != 0)
-			errx(1, "vringh_getdesc_user: head %i not 0", head);
+			errx(1, "vringh_getdesc: head %i not 0", head);
 
 		assert(riov.max_num & VRINGH_IOV_ALLOCATED);
 		if (getrange != getrange_slow)
 			assert(riov.used == 7);
 		else
 			assert(riov.used == 28);
-		err = vringh_iov_pull_user(&riov, buf, 29);
+		err = vringh_iov_pull(&vrh, &riov, buf, 29);
 		assert(err == 28);
 
 		/* Data should be linear. */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

@@ -605,9 +604,9 @@ int main(int argc, char *argv[])
 
 	/* Pull data back out (in odd chunks), should be as expected. */
 	for (i = 0; i < RINGSIZE * USER_MEM/4; i += 3) {
-		err = vringh_iov_pull_user(&riov, buf, 3);
+		err = vringh_iov_pull(&vrh, &riov, buf, 3);
 		if (err != 3 && i + err != RINGSIZE * USER_MEM/4)
-			errx(1, "vringh_iov_pull_user large: %i", err);
+			errx(1, "vringh_iov_pull large: %i", err);
 		assert(buf[0] == (char)i);
 		assert(err < 2 || buf[1] == (char)(i + 1));
 		assert(err < 3 || buf[2] == (char)(i + 2));
@@ -619,9 +618,9 @@ int main(int argc, char *argv[])
 	/* Complete using multi interface, just because we can. */
 	used[0].id = head;
 	used[0].len = 0;
-	err = vringh_complete_multi_user(&vrh, used, 1);
+	err = vringh_complete_multi(&vrh, used, 1);
 	if (err)
-		errx(1, "vringh_complete_multi_user(1): %i", err);
+		errx(1, "vringh_complete_multi(1): %i", err);
 
 	/* Free up those descriptors. */
 	ret = virtqueue_get_buf(vq, &i);
@@ -642,17 +641,17 @@ int main(int argc, char *argv[])
 	vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
 	for (i = 0; i < RINGSIZE; i++) {
-		err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
+		err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 		if (err != 1)
-			errx(1, "vringh_getdesc_user: %i", err);
+			errx(1, "vringh_getdesc: %i", err);
 		used[i].id = head;
 		used[i].len = 0;
 	}
 	/* Make sure it wraps around ring, to test! */
 	assert(vrh.vring.used->idx % RINGSIZE != 0);
-	err = vringh_complete_multi_user(&vrh, used, RINGSIZE);
+	err = vringh_complete_multi(&vrh, used, RINGSIZE);
 	if (err)
-		errx(1, "vringh_complete_multi_user: %i", err);
+		errx(1, "vringh_complete_multi: %i", err);
 
 	/* Free those buffers. */
 	for (i = 0; i < RINGSIZE; i++) {
@@ -726,19 +725,19 @@ int main(int argc, char *argv[])
 		vringh_kiov_init(&riov, host_riov, ARRAY_SIZE(host_riov));
 		vringh_kiov_init(&wiov, host_wiov, ARRAY_SIZE(host_wiov));
 
-		err = vringh_getdesc_user(&vrh, &riov, &wiov, getrange, &head);
+		err = vringh_getdesc(&vrh, &riov, &wiov, &head);
 		if (err != 1)
-			errx(1, "vringh_getdesc_user: %i", err);
+			errx(1, "vringh_getdesc: %i", err);
 
 		if (head != 0)
-			errx(1, "vringh_getdesc_user: head %i not 0", head);
+			errx(1, "vringh_getdesc: head %i not 0", head);
 
 		assert(riov.max_num & VRINGH_IOV_ALLOCATED);
 		if (getrange != getrange_slow)
 			assert(riov.used == 7);
 		else
 			assert(riov.used == 28);
-		err = vringh_iov_pull_user(&riov, buf, 29);
+		err = vringh_iov_pull(&vrh, &riov, buf, 29);
 		assert(err == 28);
 
 		/* Data should be linear. */
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC PATCH 6/9] caif_virtio: convert to new unified vringh APIs
  2022-12-27  2:25 ` Shunsuke Mie
@ 2022-12-27  2:25   ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang, Rusty Russell
  Cc: netdev, Shunsuke Mie, linux-kernel, kvm, virtualization

The vringh_*_kern APIs, with the exception of vringh_init_kern(), are being
removed, so convert the driver to the new unified APIs.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
---
 drivers/net/caif/caif_virtio.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/net/caif/caif_virtio.c b/drivers/net/caif/caif_virtio.c
index 0b0f234b0b50..f9dd79807afa 100644
--- a/drivers/net/caif/caif_virtio.c
+++ b/drivers/net/caif/caif_virtio.c
@@ -265,18 +265,12 @@ static int cfv_rx_poll(struct napi_struct *napi, int quota)
 		 */
 		if (riov->i == riov->used) {
 			if (cfv->ctx.head != USHRT_MAX) {
-				vringh_complete_kern(cfv->vr_rx,
-						     cfv->ctx.head,
-						     0);
+				vringh_complete(cfv->vr_rx, cfv->ctx.head, 0);
 				cfv->ctx.head = USHRT_MAX;
 			}
 
-			err = vringh_getdesc_kern(
-				cfv->vr_rx,
-				riov,
-				NULL,
-				&cfv->ctx.head,
-				GFP_ATOMIC);
+			err = vringh_getdesc(cfv->vr_rx, riov, NULL,
+					     &cfv->ctx.head);
 
 			if (err <= 0)
 				goto exit;
@@ -317,9 +311,9 @@ static int cfv_rx_poll(struct napi_struct *napi, int quota)
 
 		/* Really out of packets? (stolen from virtio_net)*/
 		napi_complete(napi);
-		if (unlikely(!vringh_notify_enable_kern(cfv->vr_rx)) &&
+		if (unlikely(!vringh_notify_enable(cfv->vr_rx)) &&
 		    napi_schedule_prep(napi)) {
-			vringh_notify_disable_kern(cfv->vr_rx);
+			vringh_notify_disable(cfv->vr_rx);
 			__napi_schedule(napi);
 		}
 		break;
@@ -329,7 +323,7 @@ static int cfv_rx_poll(struct napi_struct *napi, int quota)
 		dev_kfree_skb(skb);
 		/* Stop NAPI poll on OOM, we hope to be polled later */
 		napi_complete(napi);
-		vringh_notify_enable_kern(cfv->vr_rx);
+		vringh_notify_enable(cfv->vr_rx);
 		break;
 
 	default:
@@ -337,12 +331,12 @@ static int cfv_rx_poll(struct napi_struct *napi, int quota)
 		netdev_warn(cfv->ndev, "Bad ring, disable device\n");
 		cfv->ndev->stats.rx_dropped = riov->used - riov->i;
 		napi_complete(napi);
-		vringh_notify_disable_kern(cfv->vr_rx);
+		vringh_notify_disable(cfv->vr_rx);
 		netif_carrier_off(cfv->ndev);
 		break;
 	}
 out:
-	if (rxcnt && vringh_need_notify_kern(cfv->vr_rx) > 0)
+	if (rxcnt && vringh_need_notify(cfv->vr_rx) > 0)
 		vringh_notify(cfv->vr_rx);
 	return rxcnt;
 }
@@ -352,7 +346,7 @@ static void cfv_recv(struct virtio_device *vdev, struct vringh *vr_rx)
 	struct cfv_info *cfv = vdev->priv;
 
 	++cfv->stats.rx_kicks;
-	vringh_notify_disable_kern(cfv->vr_rx);
+	vringh_notify_disable(cfv->vr_rx);
 	napi_schedule(&cfv->napi);
 }
 
@@ -460,7 +454,7 @@ static int cfv_netdev_close(struct net_device *netdev)
 	/* Disable interrupts, queues and NAPI polling */
 	netif_carrier_off(netdev);
 	virtqueue_disable_cb(cfv->vq_tx);
-	vringh_notify_disable_kern(cfv->vr_rx);
+	vringh_notify_disable(cfv->vr_rx);
 	napi_disable(&cfv->napi);
 
 	/* Release any TX buffers on both used and available rings */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  2:25   ` Shunsuke Mie
@ 2022-12-27  6:04     ` Jason Wang
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Wang @ 2022-12-27  6:04 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Michael S. Tsirkin, Rusty Russell, kvm, virtualization, netdev,
	linux-kernel

On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>
> struct vringh_iov is defined to hold userland addresses. However, to reuse
> the common function __vringh_iov(), the vringh_iov is ultimately converted
> to a vringh_kiov with a simple cast, guarded by compile-time checks that
> the two layouts match.
>
> To simplify the code, this patch removes struct vringh_iov and unifies the
> APIs around struct vringh_kiov.
>
> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>

While at it, I wonder if we need to go further, that is, switch to
using an iov iterator instead of a vringh-customized one.

Thanks
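
As a rough illustration of that direction: an iterator owns the position
state (current segment plus intra-segment offset) instead of exposing
vringh's kiov "mangler" fields to callers. The sketch below is plain
userspace C with hypothetical names, not the kernel's struct iov_iter API.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Minimal userspace sketch of an iov iterator: the iterator carries the
 * consumption state, so callers never touch segment indices directly.
 * Hypothetical names; NOT the kernel's struct iov_iter.
 */
struct iov_iter_sketch {
	const struct iovec *iov;	/* current segment */
	unsigned long nr_segs;		/* segments remaining */
	size_t seg_off;			/* offset within current segment */
};

/* Copy up to @bytes out of the iterator, advancing it as we go
 * (analogous in spirit to copy_from_iter()). Returns bytes copied. */
static size_t copy_from_iter_sketch(void *dst, size_t bytes,
				    struct iov_iter_sketch *it)
{
	size_t copied = 0;

	while (bytes && it->nr_segs) {
		size_t avail = it->iov->iov_len - it->seg_off;
		size_t n = avail < bytes ? avail : bytes;

		memcpy((char *)dst + copied,
		       (char *)it->iov->iov_base + it->seg_off, n);
		copied += n;
		bytes -= n;
		it->seg_off += n;
		if (it->seg_off == it->iov->iov_len) {
			/* segment exhausted: move to the next one */
			it->iov++;
			it->nr_segs--;
			it->seg_off = 0;
		}
	}
	return copied;
}
```

The kernel's real iov_iter would replace both the struct and the copy
helper (iov_iter_kvec() to initialize, copy_from_iter()/copy_to_iter()
to transfer), with the data direction fixed at init time.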

> ---
>  drivers/vhost/vringh.c | 32 ++++++------------------------
>  include/linux/vringh.h | 45 ++++--------------------------------------
>  2 files changed, 10 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> index 828c29306565..aa3cd27d2384 100644
> --- a/drivers/vhost/vringh.c
> +++ b/drivers/vhost/vringh.c
> @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
>   * calling vringh_iov_cleanup() to release the memory, even on error!
>   */
>  int vringh_getdesc_user(struct vringh *vrh,
> -                       struct vringh_iov *riov,
> -                       struct vringh_iov *wiov,
> +                       struct vringh_kiov *riov,
> +                       struct vringh_kiov *wiov,
>                         bool (*getrange)(struct vringh *vrh,
>                                          u64 addr, struct vringh_range *r),
>                         u16 *head)
> @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
>         if (err == vrh->vring.num)
>                 return 0;
>
> -       /* We need the layouts to be the identical for this to work */
> -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
> -                    offsetof(struct vringh_iov, iov));
> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
> -                    offsetof(struct vringh_iov, i));
> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
> -                    offsetof(struct vringh_iov, used));
> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
> -                    offsetof(struct vringh_iov, max_num));
> -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
> -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
> -                    offsetof(struct kvec, iov_base));
> -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
> -                    offsetof(struct kvec, iov_len));
> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
> -                    != sizeof(((struct kvec *)NULL)->iov_base));
> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
> -                    != sizeof(((struct kvec *)NULL)->iov_len));
> -
>         *head = err;
>         err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
>                            (struct vringh_kiov *)wiov,
> @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
>  EXPORT_SYMBOL(vringh_getdesc_user);
>
>  /**
> - * vringh_iov_pull_user - copy bytes from vring_iov.
> + * vringh_iov_pull_user - copy bytes from vring_kiov.
>   * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
>   * @dst: the place to copy.
>   * @len: the maximum length to copy.
>   *
>   * Returns the bytes copied <= len or a negative errno.
>   */
> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
>  {
>         return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
>                                dst, len, xfer_from_user);
> @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
>  EXPORT_SYMBOL(vringh_iov_pull_user);
>
>  /**
> - * vringh_iov_push_user - copy bytes into vring_iov.
> + * vringh_iov_push_user - copy bytes into vring_kiov.
>   * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
>   * @src: the place to copy from.
>   * @len: the maximum length to copy.
>   *
>   * Returns the bytes copied <= len or a negative errno.
>   */
> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
>                              const void *src, size_t len)
>  {
>         return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> index 1991a02c6431..733d948e8123 100644
> --- a/include/linux/vringh.h
> +++ b/include/linux/vringh.h
> @@ -79,18 +79,6 @@ struct vringh_range {
>         u64 offset;
>  };
>
> -/**
> - * struct vringh_iov - iovec mangler.
> - *
> - * Mangles iovec in place, and restores it.
> - * Remaining data is iov + i, of used - i elements.
> - */
> -struct vringh_iov {
> -       struct iovec *iov;
> -       size_t consumed; /* Within iov[i] */
> -       unsigned i, used, max_num;
> -};
> -
>  /**
>   * struct vringh_kiov - kvec mangler.
>   *
> @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
>                      vring_avail_t __user *avail,
>                      vring_used_t __user *used);
>
> -static inline void vringh_iov_init(struct vringh_iov *iov,
> -                                  struct iovec *iovec, unsigned num)
> -{
> -       iov->used = iov->i = 0;
> -       iov->consumed = 0;
> -       iov->max_num = num;
> -       iov->iov = iovec;
> -}
> -
> -static inline void vringh_iov_reset(struct vringh_iov *iov)
> -{
> -       iov->iov[iov->i].iov_len += iov->consumed;
> -       iov->iov[iov->i].iov_base -= iov->consumed;
> -       iov->consumed = 0;
> -       iov->i = 0;
> -}
> -
> -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
> -{
> -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
> -               kfree(iov->iov);
> -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
> -       iov->iov = NULL;
> -}
> -
>  /* Convert a descriptor into iovecs. */
>  int vringh_getdesc_user(struct vringh *vrh,
> -                       struct vringh_iov *riov,
> -                       struct vringh_iov *wiov,
> +                       struct vringh_kiov *riov,
> +                       struct vringh_kiov *wiov,
>                         bool (*getrange)(struct vringh *vrh,
>                                          u64 addr, struct vringh_range *r),
>                         u16 *head);
>
>  /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
>
>  /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
>                              const void *src, size_t len);
>
>  /* Mark a descriptor as used. */
> --
> 2.25.1
>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-27  2:25   ` Shunsuke Mie
@ 2022-12-27  7:04     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2022-12-27  7:04 UTC (permalink / raw)
  To: Shunsuke Mie; +Cc: kvm, netdev, Rusty Russell, linux-kernel, virtualization

On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> Each vringh memory accessor (user, kern and iotlb) has its own interface
> that calls into common code, but some of that code is duplicated, which
> limits extensibility.
> 
> Introduce a struct vringh_ops and provide a common API for all accessors.
> This makes it easy to extend vringh with new memory accessors and
> simplifies the calling code.
> 
> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> ---
>  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
>  include/linux/vringh.h | 100 +++---
>  2 files changed, 225 insertions(+), 542 deletions(-)
> 
> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> index aa3cd27d2384..ebfd3644a1a3 100644
> --- a/drivers/vhost/vringh.c
> +++ b/drivers/vhost/vringh.c
> @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
>  }
>  
>  /* Returns vring->num if empty, -ve on error. */
> -static inline int __vringh_get_head(const struct vringh *vrh,
> -				    int (*getu16)(const struct vringh *vrh,
> -						  u16 *val, const __virtio16 *p),
> -				    u16 *last_avail_idx)
> +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
>  {
>  	u16 avail_idx, i, head;
>  	int err;
>  
> -	err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> +	err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
>  	if (err) {
>  		vringh_bad("Failed to access avail idx at %p",
>  			   &vrh->vring.avail->idx);

I like that this patch removes more lines of code than it adds.

However, one of the design points of the vringh abstractions is that they
were carefully written to be very low overhead.
This is why we pass function pointers to inline functions -
the compiler can optimize those calls out.

I think that introducing indirect function calls through an ops struct here
is going to break these assumptions and hurt performance.
Unless the compiler can somehow figure it out and optimize?
I don't see how that is possible with the ops pointer in memory,
but maybe I'm wrong.

Was any effort made to measure the effect of these patches on performance?

Thanks!
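
For readers following the thread, the two call shapes being contrasted
look roughly like this (a simplified userspace sketch with hypothetical
names, not the real vringh types): in the first, the accessor is an
argument to a static inline helper, so the callee is visible at every
call site and can typically be inlined away; in the second, it is loaded
from an ops struct at run time, a genuinely indirect call.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins; NOT the real vringh types or signatures. */
typedef int (*getu16_fn)(uint16_t *val, const uint16_t *p);

static int getu16_kern(uint16_t *val, const uint16_t *p)
{
	*val = *p;	/* stand-in for the real kernel-space access */
	return 0;
}

/* Shape 1 (pre-patch style): callback passed into a static inline helper.
 * Each expansion sees the concrete callee, so it can be devirtualized. */
static inline int get_head_inline(const uint16_t *avail_idx,
				  getu16_fn getu16, uint16_t *out)
{
	return getu16(out, avail_idx);	/* callee known at the call site */
}

/* Shape 2 (the RFC's proposal): callback fetched from an ops struct.
 * The pointer lives in memory, so this is an opaque indirect call that
 * the compiler generally cannot eliminate. */
struct vrh_ops_sketch {
	getu16_fn getu16;
};

static int get_head_ops(const uint16_t *avail_idx,
			const struct vrh_ops_sketch *ops, uint16_t *out)
{
	return ops->getu16(out, avail_idx);	/* indirect call */
}
```

Both shapes compute the same result; the question raised above is purely
about what the compiler can prove and optimize at each call site.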



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  6:04     ` Jason Wang
@ 2022-12-27  7:05       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2022-12-27  7:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: kvm, netdev, Rusty Russell, linux-kernel, virtualization, Shunsuke Mie

On Tue, Dec 27, 2022 at 02:04:03PM +0800, Jason Wang wrote:
> On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> >
> > struct vringh_iov is defined to hold userland addresses. However, to use
> > the common function __vringh_iov, the vringh_iov is ultimately converted to
> > vringh_kiov with a simple cast. Compile-time checks make sure the cast is
> > valid.
> >
> > To simplify the code, this patch removes the struct vringh_iov and unifies
> > APIs to struct vringh_kiov.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> 
> While at this, I wonder if we need to go further, that is, switch to
> using an iov iterator instead of a vringh customized one.
> 
> Thanks

Possibly, but when doing changes like this, one needs to be careful
to avoid breaking all the inlining tricks vringh relies on for
performance.

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  6:04     ` Jason Wang
@ 2022-12-27  7:05       ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  7:05 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Rusty Russell, kvm, virtualization, netdev,
	linux-kernel

2022年12月27日(火) 15:04 Jason Wang <jasowang@redhat.com>:
>
> On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> >
> > struct vringh_iov is defined to hold userland addresses. However, to use
> > the common function __vringh_iov, the vringh_iov is ultimately converted to
> > vringh_kiov with a simple cast. Compile-time checks make sure the cast is
> > valid.
> >
> > To simplify the code, this patch removes the struct vringh_iov and unifies
> > APIs to struct vringh_kiov.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
>
> While at this, I wonder if we need to go further, that is, switch to
> using an iov iterator instead of a vringh customized one.
I wasn't aware of the iov iterator yet; thank you for pointing it out.
Is that iov_iter? https://lwn.net/Articles/625077/
> Thanks
>
> > ---
> >  drivers/vhost/vringh.c | 32 ++++++------------------------
> >  include/linux/vringh.h | 45 ++++--------------------------------------
> >  2 files changed, 10 insertions(+), 67 deletions(-)
> >
> > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > index 828c29306565..aa3cd27d2384 100644
> > --- a/drivers/vhost/vringh.c
> > +++ b/drivers/vhost/vringh.c
> > @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
> >   * calling vringh_iov_cleanup() to release the memory, even on error!
> >   */
> >  int vringh_getdesc_user(struct vringh *vrh,
> > -                       struct vringh_iov *riov,
> > -                       struct vringh_iov *wiov,
> > +                       struct vringh_kiov *riov,
> > +                       struct vringh_kiov *wiov,
> >                         bool (*getrange)(struct vringh *vrh,
> >                                          u64 addr, struct vringh_range *r),
> >                         u16 *head)
> > @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
> >         if (err == vrh->vring.num)
> >                 return 0;
> >
> > -       /* We need the layouts to be the identical for this to work */
> > -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
> > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
> > -                    offsetof(struct vringh_iov, iov));
> > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
> > -                    offsetof(struct vringh_iov, i));
> > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
> > -                    offsetof(struct vringh_iov, used));
> > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
> > -                    offsetof(struct vringh_iov, max_num));
> > -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
> > -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
> > -                    offsetof(struct kvec, iov_base));
> > -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
> > -                    offsetof(struct kvec, iov_len));
> > -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
> > -                    != sizeof(((struct kvec *)NULL)->iov_base));
> > -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
> > -                    != sizeof(((struct kvec *)NULL)->iov_len));
> > -
> >         *head = err;
> >         err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
> >                            (struct vringh_kiov *)wiov,
> > @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
> >  EXPORT_SYMBOL(vringh_getdesc_user);
> >
> >  /**
> > - * vringh_iov_pull_user - copy bytes from vring_iov.
> > + * vringh_iov_pull_user - copy bytes from vring_kiov.
> >   * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
> >   * @dst: the place to copy.
> >   * @len: the maximum length to copy.
> >   *
> >   * Returns the bytes copied <= len or a negative errno.
> >   */
> > -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> > +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
> >  {
> >         return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
> >                                dst, len, xfer_from_user);
> > @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> >  EXPORT_SYMBOL(vringh_iov_pull_user);
> >
> >  /**
> > - * vringh_iov_push_user - copy bytes into vring_iov.
> > + * vringh_iov_push_user - copy bytes into vring_kiov.
> >   * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
> >   * @src: the place to copy from.
> >   * @len: the maximum length to copy.
> >   *
> >   * Returns the bytes copied <= len or a negative errno.
> >   */
> > -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> > +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> >                              const void *src, size_t len)
> >  {
> >         return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
> > diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> > index 1991a02c6431..733d948e8123 100644
> > --- a/include/linux/vringh.h
> > +++ b/include/linux/vringh.h
> > @@ -79,18 +79,6 @@ struct vringh_range {
> >         u64 offset;
> >  };
> >
> > -/**
> > - * struct vringh_iov - iovec mangler.
> > - *
> > - * Mangles iovec in place, and restores it.
> > - * Remaining data is iov + i, of used - i elements.
> > - */
> > -struct vringh_iov {
> > -       struct iovec *iov;
> > -       size_t consumed; /* Within iov[i] */
> > -       unsigned i, used, max_num;
> > -};
> > -
> >  /**
> >   * struct vringh_kiov - kvec mangler.
> >   *
> > @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
> >                      vring_avail_t __user *avail,
> >                      vring_used_t __user *used);
> >
> > -static inline void vringh_iov_init(struct vringh_iov *iov,
> > -                                  struct iovec *iovec, unsigned num)
> > -{
> > -       iov->used = iov->i = 0;
> > -       iov->consumed = 0;
> > -       iov->max_num = num;
> > -       iov->iov = iovec;
> > -}
> > -
> > -static inline void vringh_iov_reset(struct vringh_iov *iov)
> > -{
> > -       iov->iov[iov->i].iov_len += iov->consumed;
> > -       iov->iov[iov->i].iov_base -= iov->consumed;
> > -       iov->consumed = 0;
> > -       iov->i = 0;
> > -}
> > -
> > -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
> > -{
> > -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
> > -               kfree(iov->iov);
> > -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
> > -       iov->iov = NULL;
> > -}
> > -
> >  /* Convert a descriptor into iovecs. */
> >  int vringh_getdesc_user(struct vringh *vrh,
> > -                       struct vringh_iov *riov,
> > -                       struct vringh_iov *wiov,
> > +                       struct vringh_kiov *riov,
> > +                       struct vringh_kiov *wiov,
> >                         bool (*getrange)(struct vringh *vrh,
> >                                          u64 addr, struct vringh_range *r),
> >                         u16 *head);
> >
> >  /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
> > -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
> > +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
> >
> >  /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
> > -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> > +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> >                              const void *src, size_t len);
> >
> >  /* Mark a descriptor as used. */
> > --
> > 2.25.1
> >
>
Best,
Shunsuke


* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  7:05       ` Michael S. Tsirkin
@ 2022-12-27  7:13         ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  7:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel

2022年12月27日(火) 16:05 Michael S. Tsirkin <mst@redhat.com>:
>
> On Tue, Dec 27, 2022 at 02:04:03PM +0800, Jason Wang wrote:
> > On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> > >
> > > struct vringh_iov is defined to hold userland addresses. However, to use
> > > common function, __vring_iov, finally the vringh_iov converts to the
> > > vringh_kiov with simple cast. It includes compile time check code to make
> > > sure it can be cast correctly.
> > >
> > > To simplify the code, this patch removes the struct vringh_iov and unifies
> > > APIs to struct vringh_kiov.
> > >
> > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> >
> > While at this, I wonder if we need to go further, that is, switch to
> > using an iov iterator instead of a vringh customized one.
> >
> > Thanks
>
> Possibly, but when doing changes like this one needs to be careful
> to avoid breaking all the inlining tricks vringh relies on for
> performance.
Definitely. I'm evaluating the performance using vringh_test and will add the
results of the evaluation. If there are other evaluation methods, could you
please tell me?
> --
> MST
>

Best,
Shunsuke


* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-27  7:04     ` Michael S. Tsirkin
@ 2022-12-27  7:49       ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  7:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel

2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
>
> On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> > Each vringh memory accessor (user, kern and iotlb) has its own
> > interface that calls common code. But some code is duplicated, which
> > hurts extensibility.
> >
> > Introduce a struct vringh_ops and provide a common API for all accessors.
> > This makes the vringh code easy to extend for new memory accessors and
> > simplifies the caller code.
> >
> > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > ---
> >  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
> >  include/linux/vringh.h | 100 +++---
> >  2 files changed, 225 insertions(+), 542 deletions(-)
> >
> > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > index aa3cd27d2384..ebfd3644a1a3 100644
> > --- a/drivers/vhost/vringh.c
> > +++ b/drivers/vhost/vringh.c
> > @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
> >  }
> >
> >  /* Returns vring->num if empty, -ve on error. */
> > -static inline int __vringh_get_head(const struct vringh *vrh,
> > -                                 int (*getu16)(const struct vringh *vrh,
> > -                                               u16 *val, const __virtio16 *p),
> > -                                 u16 *last_avail_idx)
> > +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
> >  {
> >       u16 avail_idx, i, head;
> >       int err;
> >
> > -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> >       if (err) {
> >               vringh_bad("Failed to access avail idx at %p",
> >                          &vrh->vring.avail->idx);
>
> I like that this patch removes more lines of code than it adds.
>
> However, one of the design points of the vringh abstractions is that they were
> carefully written to be very low overhead.
> This is why we are passing function pointers to inline functions -
> the compiler can optimize that out.
>
> I think that introducing indirect function calls through an ops struct here is
> going to break these assumptions and hurt performance.
> Unless the compiler can somehow figure it out and optimize?
> I don't see how that's possible with the ops pointer in memory,
> but maybe I'm wrong.
I think your concern is correct. I need to understand the compiler
optimization involved and redesign this approach if needed.
> Was any effort taken to test effect of these patches on performance?
I just ran vringh_test and already observed a small performance reduction.
I will investigate that, as you said.

Thank you for your comments.
> Thanks!
>
>
Best,
Shunsuke.


* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  7:13         ` Shunsuke Mie
@ 2022-12-27  7:56           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2022-12-27  7:56 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel

On Tue, Dec 27, 2022 at 04:13:49PM +0900, Shunsuke Mie wrote:
> 2022年12月27日(火) 16:05 Michael S. Tsirkin <mst@redhat.com>:
> >
> > On Tue, Dec 27, 2022 at 02:04:03PM +0800, Jason Wang wrote:
> > > On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> > > >
> > > > struct vringh_iov is defined to hold userland addresses. However, to use
> > > > common function, __vring_iov, finally the vringh_iov converts to the
> > > > vringh_kiov with simple cast. It includes compile time check code to make
> > > > sure it can be cast correctly.
> > > >
> > > > To simplify the code, this patch removes the struct vringh_iov and unifies
> > > > APIs to struct vringh_kiov.
> > > >
> > > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > >
> > > While at this, I wonder if we need to go further, that is, switch to
> > > using an iov iterator instead of a vringh customized one.
> > >
> > > Thanks
> >
> > Possibly, but when doing changes like this one needs to be careful
> > to avoid breaking all the inlining tricks vringh relies on for
> > performance.
> Definitely. I'm evaluating the performance using vringh_test and will add the
> results of the evaluation. If there are other evaluation methods, could you
> please tell me?

High-level tests over virtio blk and net are possible, but let's
start with vringh_test.

> > --
> > MST
> >
> 
> Best,
> Shunsuke



* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  7:56           ` Michael S. Tsirkin
@ 2022-12-27  7:57             ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27  7:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel

2022年12月27日(火) 16:56 Michael S. Tsirkin <mst@redhat.com>:
>
> On Tue, Dec 27, 2022 at 04:13:49PM +0900, Shunsuke Mie wrote:
> > 2022年12月27日(火) 16:05 Michael S. Tsirkin <mst@redhat.com>:
> > >
> > > On Tue, Dec 27, 2022 at 02:04:03PM +0800, Jason Wang wrote:
> > > > On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> > > > >
> > > > > struct vringh_iov is defined to hold userland addresses. However, to use
> > > > > common function, __vring_iov, finally the vringh_iov converts to the
> > > > > vringh_kiov with simple cast. It includes compile time check code to make
> > > > > sure it can be cast correctly.
> > > > >
> > > > > To simplify the code, this patch removes the struct vringh_iov and unifies
> > > > > APIs to struct vringh_kiov.
> > > > >
> > > > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > >
> > > > While at this, I wonder if we need to go further, that is, switch to
> > > > using an iov iterator instead of a vringh customized one.
> > > >
> > > > Thanks
> > >
> > > Possibly, but when doing changes like this one needs to be careful
> > > to avoid breaking all the inlining tricks vringh relies on for
> > > performance.
> > Definitely, I'm evaluating the performance using vringh_test. I'll add a
> > result of the evaluation. But, If there are other evaluation methods, could you
> > please tell me?
>
> high level tests over virtio blk and net are possible, but let's
> start with vringh_test.
Ok, I'll do it.
> > > --
> > > MST
> > >
> >
> > Best,
> > Shunsuke
>

^ permalink raw reply	[flat|nested] 48+ messages in thread


* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-27  7:49       ` Shunsuke Mie
@ 2022-12-27 10:22         ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-27 10:22 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel

2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
>
> 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
> >
> > On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> > > Each vringh memory accessors that are for user, kern and iotlb has own
> > > interfaces that calls common code. But some codes are duplicated and that
> > > becomes loss extendability.
> > >
> > > Introduce a struct vringh_ops and provide a common APIs for all accessors.
> > > It can bee easily extended vringh code for new memory accessor and
> > > simplified a caller code.
> > >
> > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > ---
> > >  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
> > >  include/linux/vringh.h | 100 +++---
> > >  2 files changed, 225 insertions(+), 542 deletions(-)
> > >
> > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > index aa3cd27d2384..ebfd3644a1a3 100644
> > > --- a/drivers/vhost/vringh.c
> > > +++ b/drivers/vhost/vringh.c
> > > @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
> > >  }
> > >
> > >  /* Returns vring->num if empty, -ve on error. */
> > > -static inline int __vringh_get_head(const struct vringh *vrh,
> > > -                                 int (*getu16)(const struct vringh *vrh,
> > > -                                               u16 *val, const __virtio16 *p),
> > > -                                 u16 *last_avail_idx)
> > > +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
> > >  {
> > >       u16 avail_idx, i, head;
> > >       int err;
> > >
> > > -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > >       if (err) {
> > >               vringh_bad("Failed to access avail idx at %p",
> > >                          &vrh->vring.avail->idx);
> >
> > I like that this patch removes more lines of code than it adds.
> >
> > However one of the design points of vringh abstractions is that they were
> > carefully written to be very low overhead.
> > This is why we are passing function pointers to inline functions -
> > compiler can optimize that out.
> >
> > I think that introducing ops indirect functions calls here is going to break
> > these assumptions and hurt performance.
> > Unless compiler can somehow figure it out and optimize?
> > I don't see how it's possible with ops pointer in memory
> > but maybe I'm wrong.
> I think your concern is correct. I have to understand the compiler
> optimization and redesign this approach If it is needed.
> > Was any effort taken to test effect of these patches on performance?
> I just tested vringh_test and already faced little performance reduction.
> I have to investigate that, as you said.
I attempted to test with perf. I found that the performance of the patched
code is almost the same as the upstream one. However, I have to investigate
why this patch leads to this result, and the profiling should also be run
on more powerful machines.

environment:
$ grep 'model name' /proc/cpuinfo
model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz

results:
* for patched code
 Performance counter stats for 'nice -n -20 ./vringh_test_patched --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):

          3,028.05 msec task-clock                #    0.995 CPUs utilized            ( +-  0.12% )
            78,150      context-switches          #   25.691 K/sec                    ( +-  0.00% )
                 5      cpu-migrations            #    1.644 /sec                     ( +-  3.33% )
               190      page-faults               #   62.461 /sec                     ( +-  0.41% )
     6,919,025,222      cycles                    #    2.275 GHz                      ( +-  0.13% )
     8,990,220,160      instructions              #    1.29  insn per cycle           ( +-  0.04% )
     1,788,326,786      branches                  #  587.899 M/sec                    ( +-  0.05% )
         4,557,398      branch-misses             #    0.25% of all branches          ( +-  0.43% )

           3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )

* for upstream code
 Performance counter stats for 'nice -n -20 ./vringh_test_base --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):

          3,058.41 msec task-clock                #    0.999 CPUs utilized            ( +-  0.14% )
            78,149      context-switches          #   25.545 K/sec                    ( +-  0.00% )
                 5      cpu-migrations            #    1.634 /sec                     ( +-  2.67% )
               194      page-faults               #   63.414 /sec                     ( +-  0.43% )
     6,988,713,963      cycles                    #    2.284 GHz                      ( +-  0.14% )
     8,512,533,269      instructions              #    1.22  insn per cycle           ( +-  0.04% )
     1,638,375,371      branches                  #  535.549 M/sec                    ( +-  0.05% )
         4,428,866      branch-misses             #    0.27% of all branches          ( +- 22.57% )

           3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )

> Thank you for your comments.
> > Thanks!
> >
> >
> Best,
> Shunsuke.

^ permalink raw reply	[flat|nested] 48+ messages in thread


* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-27 10:22         ` Shunsuke Mie
@ 2022-12-27 14:37           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2022-12-27 14:37 UTC (permalink / raw)
  To: Shunsuke Mie; +Cc: kvm, netdev, Rusty Russell, linux-kernel, virtualization

On Tue, Dec 27, 2022 at 07:22:36PM +0900, Shunsuke Mie wrote:
> 2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
> >
> > 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
> > >
> > > On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> > > > Each vringh memory accessors that are for user, kern and iotlb has own
> > > > interfaces that calls common code. But some codes are duplicated and that
> > > > becomes loss extendability.
> > > >
> > > > Introduce a struct vringh_ops and provide a common APIs for all accessors.
> > > > It can bee easily extended vringh code for new memory accessor and
> > > > simplified a caller code.
> > > >
> > > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > > ---
> > > >  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
> > > >  include/linux/vringh.h | 100 +++---
> > > >  2 files changed, 225 insertions(+), 542 deletions(-)
> > > >
> > > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > > index aa3cd27d2384..ebfd3644a1a3 100644
> > > > --- a/drivers/vhost/vringh.c
> > > > +++ b/drivers/vhost/vringh.c
> > > > @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
> > > >  }
> > > >
> > > >  /* Returns vring->num if empty, -ve on error. */
> > > > -static inline int __vringh_get_head(const struct vringh *vrh,
> > > > -                                 int (*getu16)(const struct vringh *vrh,
> > > > -                                               u16 *val, const __virtio16 *p),
> > > > -                                 u16 *last_avail_idx)
> > > > +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
> > > >  {
> > > >       u16 avail_idx, i, head;
> > > >       int err;
> > > >
> > > > -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > >       if (err) {
> > > >               vringh_bad("Failed to access avail idx at %p",
> > > >                          &vrh->vring.avail->idx);
> > >
> > > I like that this patch removes more lines of code than it adds.
> > >
> > > However one of the design points of vringh abstractions is that they were
> > > carefully written to be very low overhead.
> > > This is why we are passing function pointers to inline functions -
> > > compiler can optimize that out.
> > >
> > > I think that introducing ops indirect functions calls here is going to break
> > > these assumptions and hurt performance.
> > > Unless compiler can somehow figure it out and optimize?
> > > I don't see how it's possible with ops pointer in memory
> > > but maybe I'm wrong.
> > I think your concern is correct. I have to understand the compiler
> > optimization and redesign this approach If it is needed.
> > > Was any effort taken to test effect of these patches on performance?
> > I just tested vringh_test and already faced little performance reduction.
> > I have to investigate that, as you said.
> I attempted to test with perf. I found that the performance of patched code
> is almost the same as the upstream one. However, I have to investigate way
> this patch leads to this result, also the profiling should be run on
> more powerful
> machines too.
> 
> environment:
> $ grep 'model name' /proc/cpuinfo
> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> 
> results:
> * for patched code
>  Performance counter stats for 'nice -n -20 ./vringh_test_patched
> --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):
> 
>           3,028.05 msec task-clock                #    0.995 CPUs
> utilized            ( +-  0.12% )
>             78,150      context-switches          #   25.691 K/sec
>                ( +-  0.00% )
>                  5      cpu-migrations            #    1.644 /sec
>                ( +-  3.33% )
>                190      page-faults               #   62.461 /sec
>                ( +-  0.41% )
>      6,919,025,222      cycles                    #    2.275 GHz
>                ( +-  0.13% )
>      8,990,220,160      instructions              #    1.29  insn per
> cycle           ( +-  0.04% )
>      1,788,326,786      branches                  #  587.899 M/sec
>                ( +-  0.05% )
>          4,557,398      branch-misses             #    0.25% of all
> branches          ( +-  0.43% )
> 
>            3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )
> 
> * for upstream code
>  Performance counter stats for 'nice -n -20 ./vringh_test_base
> --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):
> 
>           3,058.41 msec task-clock                #    0.999 CPUs
> utilized            ( +-  0.14% )
>             78,149      context-switches          #   25.545 K/sec
>                ( +-  0.00% )
>                  5      cpu-migrations            #    1.634 /sec
>                ( +-  2.67% )
>                194      page-faults               #   63.414 /sec
>                ( +-  0.43% )
>      6,988,713,963      cycles                    #    2.284 GHz
>                ( +-  0.14% )
>      8,512,533,269      instructions              #    1.22  insn per
> cycle           ( +-  0.04% )
>      1,638,375,371      branches                  #  535.549 M/sec
>                ( +-  0.05% )
>          4,428,866      branch-misses             #    0.27% of all
> branches          ( +- 22.57% )
> 
>            3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )


How you compiled it also matters. At the moment we don't enable retpolines,
and it did not matter since we didn't have indirect calls, but we should.
I haven't yet investigated how to do that for the virtio tools.
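
For reference, retpolines in a userspace build such as tools/virtio are enabled through compiler flags; a hedged sketch of what that might look like, using the standard GCC spelling (Clang's equivalent is -mretpoline) — this is an assumption, not something proposed in the thread:

```make
# Assumed Makefile fragment, not taken from the thread: GCC retpoline
# flags that make indirect calls go through thunks, as in the kernel.
CFLAGS += -mindirect-branch=thunk -mindirect-branch-register
```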


> > Thank you for your comments.
> > > Thanks!
> > >
> > >
> > Best,
> > Shunsuke.

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 48+ messages in thread


* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-27 14:37           ` Michael S. Tsirkin
@ 2022-12-28  2:24             ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-28  2:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel

2022年12月27日(火) 23:37 Michael S. Tsirkin <mst@redhat.com>:
>
> On Tue, Dec 27, 2022 at 07:22:36PM +0900, Shunsuke Mie wrote:
> > 2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
> > >
> > > 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
> > > >
> > > > On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> > > > > Each vringh memory accessors that are for user, kern and iotlb has own
> > > > > interfaces that calls common code. But some codes are duplicated and that
> > > > > becomes loss extendability.
> > > > >
> > > > > Introduce a struct vringh_ops and provide a common APIs for all accessors.
> > > > > It can bee easily extended vringh code for new memory accessor and
> > > > > simplified a caller code.
> > > > >
> > > > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > > > ---
> > > > >  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
> > > > >  include/linux/vringh.h | 100 +++---
> > > > >  2 files changed, 225 insertions(+), 542 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > > > index aa3cd27d2384..ebfd3644a1a3 100644
> > > > > --- a/drivers/vhost/vringh.c
> > > > > +++ b/drivers/vhost/vringh.c
> > > > > @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
> > > > >  }
> > > > >
> > > > >  /* Returns vring->num if empty, -ve on error. */
> > > > > -static inline int __vringh_get_head(const struct vringh *vrh,
> > > > > -                                 int (*getu16)(const struct vringh *vrh,
> > > > > -                                               u16 *val, const __virtio16 *p),
> > > > > -                                 u16 *last_avail_idx)
> > > > > +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
> > > > >  {
> > > > >       u16 avail_idx, i, head;
> > > > >       int err;
> > > > >
> > > > > -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > > +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > >       if (err) {
> > > > >               vringh_bad("Failed to access avail idx at %p",
> > > > >                          &vrh->vring.avail->idx);
> > > >
> > > > I like that this patch removes more lines of code than it adds.
> > > >
> > > > However one of the design points of vringh abstractions is that they were
> > > > carefully written to be very low overhead.
> > > > This is why we are passing function pointers to inline functions -
> > > > compiler can optimize that out.
> > > >
> > > > I think that introducing ops indirect functions calls here is going to break
> > > > these assumptions and hurt performance.
> > > > Unless compiler can somehow figure it out and optimize?
> > > > I don't see how it's possible with ops pointer in memory
> > > > but maybe I'm wrong.
> > > I think your concern is correct. I have to understand the compiler
> > > optimization and redesign this approach if it is needed.
> > > > Was any effort taken to test the effect of these patches on performance?
> > > I just tested vringh_test and already saw a small performance reduction.
> > > I have to investigate that, as you said.
> > I attempted to test with perf. I found that the performance of the patched code
> > is almost the same as the upstream one. However, I have to investigate why
> > this patch leads to this result, and the profiling should be run on more
> > powerful machines too.
> >
> > environment:
> > $ grep 'model name' /proc/cpuinfo
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> >
> > results:
> > * for patched code
> >  Performance counter stats for 'nice -n -20 ./vringh_test_patched
> > --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):
> >
> >           3,028.05 msec task-clock                #    0.995 CPUs utilized            ( +-  0.12% )
> >             78,150      context-switches          #   25.691 K/sec                ( +-  0.00% )
> >                  5      cpu-migrations            #    1.644 /sec                ( +-  3.33% )
> >                190      page-faults               #   62.461 /sec                ( +-  0.41% )
> >      6,919,025,222      cycles                    #    2.275 GHz                ( +-  0.13% )
> >      8,990,220,160      instructions              #    1.29  insn per cycle           ( +-  0.04% )
> >      1,788,326,786      branches                  #  587.899 M/sec                ( +-  0.05% )
> >          4,557,398      branch-misses             #    0.25% of all branches          ( +-  0.43% )
> >
> >            3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )
> >
> > * for upstream code
> >  Performance counter stats for 'nice -n -20 ./vringh_test_base
> > --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):
> >
> >           3,058.41 msec task-clock                #    0.999 CPUs utilized            ( +-  0.14% )
> >             78,149      context-switches          #   25.545 K/sec                ( +-  0.00% )
> >                  5      cpu-migrations            #    1.634 /sec                ( +-  2.67% )
> >                194      page-faults               #   63.414 /sec                ( +-  0.43% )
> >      6,988,713,963      cycles                    #    2.284 GHz                ( +-  0.14% )
> >      8,512,533,269      instructions              #    1.22  insn per cycle           ( +-  0.04% )
> >      1,638,375,371      branches                  #  535.549 M/sec                ( +-  0.05% )
> >          4,428,866      branch-misses             #    0.27% of all branches          ( +- 22.57% )
> >
> >            3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )
>
>
> How you compiled it also matters. ATM we don't enable retpolines
> and it did not matter since we didn't have indirect calls,
> but we should. Didn't yet investigate how to do that for virtio tools.
I think the retpolines certainly affect performance. Thank you for pointing
it out. I'd like to start investigating how to apply retpolines to the
virtio tools.
> > > Thank you for your comments.
> > > > Thanks!
> > > >
> > > >
> > > Best,
> > > Shunsuke.
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
@ 2022-12-28  2:24             ` Shunsuke Mie
  0 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2022-12-28  2:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, netdev, Rusty Russell, linux-kernel, virtualization

2022年12月27日(火) 23:37 Michael S. Tsirkin <mst@redhat.com>:
>
> On Tue, Dec 27, 2022 at 07:22:36PM +0900, Shunsuke Mie wrote:
> > 2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
> > >
> > > 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
> > > >
> > > > On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> > > > > Each vringh memory accessor for user, kern and iotlb has its own
> > > > > interface that calls common code. But some code is duplicated, which
> > > > > reduces extendability.
> > > > >
> > > > > Introduce a struct vringh_ops and provide a common API for all accessors.
> > > > > It makes it easy to extend the vringh code for new memory accessors and
> > > > > simplifies caller code.
> > > > >
> > > > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > > > ---
> > > > >  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
> > > > >  include/linux/vringh.h | 100 +++---
> > > > >  2 files changed, 225 insertions(+), 542 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > > > index aa3cd27d2384..ebfd3644a1a3 100644
> > > > > --- a/drivers/vhost/vringh.c
> > > > > +++ b/drivers/vhost/vringh.c
> > > > > @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
> > > > >  }
> > > > >
> > > > >  /* Returns vring->num if empty, -ve on error. */
> > > > > -static inline int __vringh_get_head(const struct vringh *vrh,
> > > > > -                                 int (*getu16)(const struct vringh *vrh,
> > > > > -                                               u16 *val, const __virtio16 *p),
> > > > > -                                 u16 *last_avail_idx)
> > > > > +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
> > > > >  {
> > > > >       u16 avail_idx, i, head;
> > > > >       int err;
> > > > >
> > > > > -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > > +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > >       if (err) {
> > > > >               vringh_bad("Failed to access avail idx at %p",
> > > > >                          &vrh->vring.avail->idx);
> > > >
> > > > I like that this patch removes more lines of code than it adds.
> > > >
> > > > However one of the design points of vringh abstractions is that they were
> > > > carefully written to be very low overhead.
> > > > This is why we are passing function pointers to inline functions -
> > > > compiler can optimize that out.
> > > >
> > > > I think that introducing ops indirect functions calls here is going to break
> > > > these assumptions and hurt performance.
> > > > Unless compiler can somehow figure it out and optimize?
> > > > I don't see how it's possible with ops pointer in memory
> > > > but maybe I'm wrong.
> > > I think your concern is correct. I have to understand the compiler
> > > optimization and redesign this approach if it is needed.
> > > > Was any effort taken to test the effect of these patches on performance?
> > > I just tested vringh_test and already saw a small performance reduction.
> > > I have to investigate that, as you said.
> > I attempted to test with perf. I found that the performance of the patched code
> > is almost the same as the upstream one. However, I have to investigate why
> > this patch leads to this result, and the profiling should be run on more
> > powerful machines too.
> >
> > environment:
> > $ grep 'model name' /proc/cpuinfo
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> >
> > results:
> > * for patched code
> >  Performance counter stats for 'nice -n -20 ./vringh_test_patched
> > --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):
> >
> >           3,028.05 msec task-clock                #    0.995 CPUs utilized            ( +-  0.12% )
> >             78,150      context-switches          #   25.691 K/sec                ( +-  0.00% )
> >                  5      cpu-migrations            #    1.644 /sec                ( +-  3.33% )
> >                190      page-faults               #   62.461 /sec                ( +-  0.41% )
> >      6,919,025,222      cycles                    #    2.275 GHz                ( +-  0.13% )
> >      8,990,220,160      instructions              #    1.29  insn per cycle           ( +-  0.04% )
> >      1,788,326,786      branches                  #  587.899 M/sec                ( +-  0.05% )
> >          4,557,398      branch-misses             #    0.25% of all branches          ( +-  0.43% )
> >
> >            3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )
> >
> > * for upstream code
> >  Performance counter stats for 'nice -n -20 ./vringh_test_base
> > --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):
> >
> >           3,058.41 msec task-clock                #    0.999 CPUs utilized            ( +-  0.14% )
> >             78,149      context-switches          #   25.545 K/sec                ( +-  0.00% )
> >                  5      cpu-migrations            #    1.634 /sec                ( +-  2.67% )
> >                194      page-faults               #   63.414 /sec                ( +-  0.43% )
> >      6,988,713,963      cycles                    #    2.284 GHz                ( +-  0.14% )
> >      8,512,533,269      instructions              #    1.22  insn per cycle           ( +-  0.04% )
> >      1,638,375,371      branches                  #  535.549 M/sec                ( +-  0.05% )
> >          4,428,866      branch-misses             #    0.27% of all branches          ( +- 22.57% )
> >
> >            3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )
>
>
> How you compiled it also matters. ATM we don't enable retpolines
> and it did not matter since we didn't have indirect calls,
> but we should. Didn't yet investigate how to do that for virtio tools.
I think the retpolines certainly affect performance. Thank you for pointing
it out. I'd like to start investigating how to apply retpolines to the
virtio tools.
> > > Thank you for your comments.
> > > > Thanks!
> > > >
> > > >
> > > Best,
> > > Shunsuke.
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-27  7:05       ` Shunsuke Mie
@ 2022-12-28  6:36         ` Jason Wang
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Wang @ 2022-12-28  6:36 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: kvm, Michael S. Tsirkin, netdev, Rusty Russell, linux-kernel,
	virtualization

On Tue, Dec 27, 2022 at 3:06 PM Shunsuke Mie <mie@igel.co.jp> wrote:
>
> 2022年12月27日(火) 15:04 Jason Wang <jasowang@redhat.com>:
> >
> > On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> > >
> > > struct vringh_iov is defined to hold userland addresses. However, to use
> > > the common function, __vring_iov, the vringh_iov is finally converted to
> > > vringh_kiov with a simple cast. Compile-time checks make sure it can be
> > > cast correctly.
> > >
> > > To simplify the code, this patch removes struct vringh_iov and unifies
> > > the APIs on struct vringh_kiov.
> > >
> > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> >
> > While at this, I wonder if we need to go further, that is, switch to
> > using an iov iterator instead of a vringh customized one.
> I hadn't seen the iov iterator yet; thank you for informing me.
> Is that iov_iter? https://lwn.net/Articles/625077/

Exactly.

Thanks

> > Thanks
> >
> > > ---
> > >  drivers/vhost/vringh.c | 32 ++++++------------------------
> > >  include/linux/vringh.h | 45 ++++--------------------------------------
> > >  2 files changed, 10 insertions(+), 67 deletions(-)
> > >
> > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > index 828c29306565..aa3cd27d2384 100644
> > > --- a/drivers/vhost/vringh.c
> > > +++ b/drivers/vhost/vringh.c
> > > @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
> > >   * calling vringh_iov_cleanup() to release the memory, even on error!
> > >   */
> > >  int vringh_getdesc_user(struct vringh *vrh,
> > > -                       struct vringh_iov *riov,
> > > -                       struct vringh_iov *wiov,
> > > +                       struct vringh_kiov *riov,
> > > +                       struct vringh_kiov *wiov,
> > >                         bool (*getrange)(struct vringh *vrh,
> > >                                          u64 addr, struct vringh_range *r),
> > >                         u16 *head)
> > > @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
> > >         if (err == vrh->vring.num)
> > >                 return 0;
> > >
> > > -       /* We need the layouts to be the identical for this to work */
> > > -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
> > > -                    offsetof(struct vringh_iov, iov));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
> > > -                    offsetof(struct vringh_iov, i));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
> > > -                    offsetof(struct vringh_iov, used));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
> > > -                    offsetof(struct vringh_iov, max_num));
> > > -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
> > > -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
> > > -                    offsetof(struct kvec, iov_base));
> > > -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
> > > -                    offsetof(struct kvec, iov_len));
> > > -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
> > > -                    != sizeof(((struct kvec *)NULL)->iov_base));
> > > -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
> > > -                    != sizeof(((struct kvec *)NULL)->iov_len));
> > > -
> > >         *head = err;
> > >         err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
> > >                            (struct vringh_kiov *)wiov,
> > > @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
> > >  EXPORT_SYMBOL(vringh_getdesc_user);
> > >
> > >  /**
> > > - * vringh_iov_pull_user - copy bytes from vring_iov.
> > > + * vringh_iov_pull_user - copy bytes from vring_kiov.
> > >   * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
> > >   * @dst: the place to copy.
> > >   * @len: the maximum length to copy.
> > >   *
> > >   * Returns the bytes copied <= len or a negative errno.
> > >   */
> > > -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> > > +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
> > >  {
> > >         return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
> > >                                dst, len, xfer_from_user);
> > > @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> > >  EXPORT_SYMBOL(vringh_iov_pull_user);
> > >
> > >  /**
> > > - * vringh_iov_push_user - copy bytes into vring_iov.
> > > + * vringh_iov_push_user - copy bytes into vring_kiov.
> > >   * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
> > >   * @src: the place to copy from.
> > >   * @len: the maximum length to copy.
> > >   *
> > >   * Returns the bytes copied <= len or a negative errno.
> > >   */
> > > -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> > > +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> > >                              const void *src, size_t len)
> > >  {
> > >         return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
> > > diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> > > index 1991a02c6431..733d948e8123 100644
> > > --- a/include/linux/vringh.h
> > > +++ b/include/linux/vringh.h
> > > @@ -79,18 +79,6 @@ struct vringh_range {
> > >         u64 offset;
> > >  };
> > >
> > > -/**
> > > - * struct vringh_iov - iovec mangler.
> > > - *
> > > - * Mangles iovec in place, and restores it.
> > > - * Remaining data is iov + i, of used - i elements.
> > > - */
> > > -struct vringh_iov {
> > > -       struct iovec *iov;
> > > -       size_t consumed; /* Within iov[i] */
> > > -       unsigned i, used, max_num;
> > > -};
> > > -
> > >  /**
> > >   * struct vringh_kiov - kvec mangler.
> > >   *
> > > @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
> > >                      vring_avail_t __user *avail,
> > >                      vring_used_t __user *used);
> > >
> > > -static inline void vringh_iov_init(struct vringh_iov *iov,
> > > -                                  struct iovec *iovec, unsigned num)
> > > -{
> > > -       iov->used = iov->i = 0;
> > > -       iov->consumed = 0;
> > > -       iov->max_num = num;
> > > -       iov->iov = iovec;
> > > -}
> > > -
> > > -static inline void vringh_iov_reset(struct vringh_iov *iov)
> > > -{
> > > -       iov->iov[iov->i].iov_len += iov->consumed;
> > > -       iov->iov[iov->i].iov_base -= iov->consumed;
> > > -       iov->consumed = 0;
> > > -       iov->i = 0;
> > > -}
> > > -
> > > -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
> > > -{
> > > -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
> > > -               kfree(iov->iov);
> > > -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
> > > -       iov->iov = NULL;
> > > -}
> > > -
> > >  /* Convert a descriptor into iovecs. */
> > >  int vringh_getdesc_user(struct vringh *vrh,
> > > -                       struct vringh_iov *riov,
> > > -                       struct vringh_iov *wiov,
> > > +                       struct vringh_kiov *riov,
> > > +                       struct vringh_kiov *wiov,
> > >                         bool (*getrange)(struct vringh *vrh,
> > >                                          u64 addr, struct vringh_range *r),
> > >                         u16 *head);
> > >
> > >  /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
> > > -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
> > > +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
> > >
> > >  /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
> > > -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> > > +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> > >                              const void *src, size_t len);
> > >
> > >  /* Mark a descriptor as used. */
> > > --
> > > 2.25.1
> > >
> >
> Best,
> Shunsuke
>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
@ 2022-12-28  6:36         ` Jason Wang
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Wang @ 2022-12-28  6:36 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Michael S. Tsirkin, Rusty Russell, kvm, virtualization, netdev,
	linux-kernel

On Tue, Dec 27, 2022 at 3:06 PM Shunsuke Mie <mie@igel.co.jp> wrote:
>
> 2022年12月27日(火) 15:04 Jason Wang <jasowang@redhat.com>:
> >
> > On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> > >
> > > struct vringh_iov is defined to hold userland addresses. However, to use
> > > the common function, __vring_iov, the vringh_iov is finally converted to
> > > vringh_kiov with a simple cast. Compile-time checks make sure it can be
> > > cast correctly.
> > >
> > > To simplify the code, this patch removes struct vringh_iov and unifies
> > > the APIs on struct vringh_kiov.
> > >
> > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> >
> > While at this, I wonder if we need to go further, that is, switch to
> > using an iov iterator instead of a vringh customized one.
> I hadn't seen the iov iterator yet; thank you for informing me.
> Is that iov_iter? https://lwn.net/Articles/625077/

Exactly.

Thanks

> > Thanks
> >
> > > ---
> > >  drivers/vhost/vringh.c | 32 ++++++------------------------
> > >  include/linux/vringh.h | 45 ++++--------------------------------------
> > >  2 files changed, 10 insertions(+), 67 deletions(-)
> > >
> > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > index 828c29306565..aa3cd27d2384 100644
> > > --- a/drivers/vhost/vringh.c
> > > +++ b/drivers/vhost/vringh.c
> > > @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
> > >   * calling vringh_iov_cleanup() to release the memory, even on error!
> > >   */
> > >  int vringh_getdesc_user(struct vringh *vrh,
> > > -                       struct vringh_iov *riov,
> > > -                       struct vringh_iov *wiov,
> > > +                       struct vringh_kiov *riov,
> > > +                       struct vringh_kiov *wiov,
> > >                         bool (*getrange)(struct vringh *vrh,
> > >                                          u64 addr, struct vringh_range *r),
> > >                         u16 *head)
> > > @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
> > >         if (err == vrh->vring.num)
> > >                 return 0;
> > >
> > > -       /* We need the layouts to be the identical for this to work */
> > > -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
> > > -                    offsetof(struct vringh_iov, iov));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
> > > -                    offsetof(struct vringh_iov, i));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
> > > -                    offsetof(struct vringh_iov, used));
> > > -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
> > > -                    offsetof(struct vringh_iov, max_num));
> > > -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
> > > -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
> > > -                    offsetof(struct kvec, iov_base));
> > > -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
> > > -                    offsetof(struct kvec, iov_len));
> > > -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
> > > -                    != sizeof(((struct kvec *)NULL)->iov_base));
> > > -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
> > > -                    != sizeof(((struct kvec *)NULL)->iov_len));
> > > -
> > >         *head = err;
> > >         err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
> > >                            (struct vringh_kiov *)wiov,
> > > @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
> > >  EXPORT_SYMBOL(vringh_getdesc_user);
> > >
> > >  /**
> > > - * vringh_iov_pull_user - copy bytes from vring_iov.
> > > + * vringh_iov_pull_user - copy bytes from vring_kiov.
> > >   * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
> > >   * @dst: the place to copy.
> > >   * @len: the maximum length to copy.
> > >   *
> > >   * Returns the bytes copied <= len or a negative errno.
> > >   */
> > > -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> > > +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
> > >  {
> > >         return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
> > >                                dst, len, xfer_from_user);
> > > @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> > >  EXPORT_SYMBOL(vringh_iov_pull_user);
> > >
> > >  /**
> > > - * vringh_iov_push_user - copy bytes into vring_iov.
> > > + * vringh_iov_push_user - copy bytes into vring_kiov.
> > >   * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
> > >   * @src: the place to copy from.
> > >   * @len: the maximum length to copy.
> > >   *
> > >   * Returns the bytes copied <= len or a negative errno.
> > >   */
> > > -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> > > +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> > >                              const void *src, size_t len)
> > >  {
> > >         return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
> > > diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> > > index 1991a02c6431..733d948e8123 100644
> > > --- a/include/linux/vringh.h
> > > +++ b/include/linux/vringh.h
> > > @@ -79,18 +79,6 @@ struct vringh_range {
> > >         u64 offset;
> > >  };
> > >
> > > -/**
> > > - * struct vringh_iov - iovec mangler.
> > > - *
> > > - * Mangles iovec in place, and restores it.
> > > - * Remaining data is iov + i, of used - i elements.
> > > - */
> > > -struct vringh_iov {
> > > -       struct iovec *iov;
> > > -       size_t consumed; /* Within iov[i] */
> > > -       unsigned i, used, max_num;
> > > -};
> > > -
> > >  /**
> > >   * struct vringh_kiov - kvec mangler.
> > >   *
> > > @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
> > >                      vring_avail_t __user *avail,
> > >                      vring_used_t __user *used);
> > >
> > > -static inline void vringh_iov_init(struct vringh_iov *iov,
> > > -                                  struct iovec *iovec, unsigned num)
> > > -{
> > > -       iov->used = iov->i = 0;
> > > -       iov->consumed = 0;
> > > -       iov->max_num = num;
> > > -       iov->iov = iovec;
> > > -}
> > > -
> > > -static inline void vringh_iov_reset(struct vringh_iov *iov)
> > > -{
> > > -       iov->iov[iov->i].iov_len += iov->consumed;
> > > -       iov->iov[iov->i].iov_base -= iov->consumed;
> > > -       iov->consumed = 0;
> > > -       iov->i = 0;
> > > -}
> > > -
> > > -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
> > > -{
> > > -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
> > > -               kfree(iov->iov);
> > > -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
> > > -       iov->iov = NULL;
> > > -}
> > > -
> > >  /* Convert a descriptor into iovecs. */
> > >  int vringh_getdesc_user(struct vringh *vrh,
> > > -                       struct vringh_iov *riov,
> > > -                       struct vringh_iov *wiov,
> > > +                       struct vringh_kiov *riov,
> > > +                       struct vringh_kiov *wiov,
> > >                         bool (*getrange)(struct vringh *vrh,
> > >                                          u64 addr, struct vringh_range *r),
> > >                         u16 *head);
> > >
> > >  /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
> > > -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
> > > +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
> > >
> > >  /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
> > > -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> > > +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> > >                              const void *src, size_t len);
> > >
> > >  /* Mark a descriptor as used. */
> > > --
> > > 2.25.1
> > >
> >
> Best,
> Shunsuke
>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-28  2:24             ` Shunsuke Mie
@ 2022-12-28  7:20               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2022-12-28  7:20 UTC (permalink / raw)
  To: Shunsuke Mie; +Cc: kvm, netdev, Rusty Russell, linux-kernel, virtualization

On Wed, Dec 28, 2022 at 11:24:10AM +0900, Shunsuke Mie wrote:
> 2022年12月27日(火) 23:37 Michael S. Tsirkin <mst@redhat.com>:
> >
> > On Tue, Dec 27, 2022 at 07:22:36PM +0900, Shunsuke Mie wrote:
> > > 2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
> > > >
> > > > 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
> > > > >
> > > > > On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> > > > > > Each vringh memory accessor for user, kern and iotlb has its own
> > > > > > interface that calls common code. But some code is duplicated, which
> > > > > > reduces extendability.
> > > > > >
> > > > > > Introduce a struct vringh_ops and provide a common API for all accessors.
> > > > > > It makes it easy to extend the vringh code for new memory accessors and
> > > > > > simplifies caller code.
> > > > > >
> > > > > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > > > > ---
> > > > > >  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
> > > > > >  include/linux/vringh.h | 100 +++---
> > > > > >  2 files changed, 225 insertions(+), 542 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > > > > index aa3cd27d2384..ebfd3644a1a3 100644
> > > > > > --- a/drivers/vhost/vringh.c
> > > > > > +++ b/drivers/vhost/vringh.c
> > > > > > @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
> > > > > >  }
> > > > > >
> > > > > >  /* Returns vring->num if empty, -ve on error. */
> > > > > > -static inline int __vringh_get_head(const struct vringh *vrh,
> > > > > > -                                 int (*getu16)(const struct vringh *vrh,
> > > > > > -                                               u16 *val, const __virtio16 *p),
> > > > > > -                                 u16 *last_avail_idx)
> > > > > > +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
> > > > > >  {
> > > > > >       u16 avail_idx, i, head;
> > > > > >       int err;
> > > > > >
> > > > > > -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > > > +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > > >       if (err) {
> > > > > >               vringh_bad("Failed to access avail idx at %p",
> > > > > >                          &vrh->vring.avail->idx);
> > > > >
> > > > > I like that this patch removes more lines of code than it adds.
> > > > >
> > > > > However one of the design points of vringh abstractions is that they were
> > > > > carefully written to be very low overhead.
> > > > > This is why we are passing function pointers to inline functions -
> > > > > compiler can optimize that out.
> > > > >
> > > > > I think that introducing ops indirect functions calls here is going to break
> > > > > these assumptions and hurt performance.
> > > > > Unless compiler can somehow figure it out and optimize?
> > > > > I don't see how it's possible with ops pointer in memory
> > > > > but maybe I'm wrong.
> > > > I think your concern is correct. I have to understand the compiler
> > > > optimization and redesign this approach if it is needed.
> > > > > Was any effort taken to test the effect of these patches on performance?
> > > > I just tested vringh_test and already saw a small performance reduction.
> > > > I have to investigate that, as you said.
> > > I attempted to test with perf. I found that the performance of the patched code
> > > is almost the same as the upstream one. However, I have to investigate why
> > > this patch leads to this result, and the profiling should be run on more
> > > powerful machines too.
> > >
> > > environment:
> > > $ grep 'model name' /proc/cpuinfo
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > >
> > > results:
> > > * for patched code
> > >  Performance counter stats for 'nice -n -20 ./vringh_test_patched
> > > --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):
> > >
> > >           3,028.05 msec task-clock                #    0.995 CPUs
> > > utilized            ( +-  0.12% )
> > >             78,150      context-switches          #   25.691 K/sec
> > >                ( +-  0.00% )
> > >                  5      cpu-migrations            #    1.644 /sec
> > >                ( +-  3.33% )
> > >                190      page-faults               #   62.461 /sec
> > >                ( +-  0.41% )
> > >      6,919,025,222      cycles                    #    2.275 GHz
> > >                ( +-  0.13% )
> > >      8,990,220,160      instructions              #    1.29  insn per
> > > cycle           ( +-  0.04% )
> > >      1,788,326,786      branches                  #  587.899 M/sec
> > >                ( +-  0.05% )
> > >          4,557,398      branch-misses             #    0.25% of all
> > > branches          ( +-  0.43% )
> > >
> > >            3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )
> > >
> > > * for upstream code
> > >  Performance counter stats for 'nice -n -20 ./vringh_test_base
> > > --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):
> > >
> > >           3,058.41 msec task-clock                #    0.999 CPUs
> > > utilized            ( +-  0.14% )
> > >             78,149      context-switches          #   25.545 K/sec
> > >                ( +-  0.00% )
> > >                  5      cpu-migrations            #    1.634 /sec
> > >                ( +-  2.67% )
> > >                194      page-faults               #   63.414 /sec
> > >                ( +-  0.43% )
> > >      6,988,713,963      cycles                    #    2.284 GHz
> > >                ( +-  0.14% )
> > >      8,512,533,269      instructions              #    1.22  insn per
> > > cycle           ( +-  0.04% )
> > >      1,638,375,371      branches                  #  535.549 M/sec
> > >                ( +-  0.05% )
> > >          4,428,866      branch-misses             #    0.27% of all
> > > branches          ( +- 22.57% )
> > >
> > >            3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )
> >
> >
> > How you compiled it also matters. ATM we don't enable retpolines
> > and it did not matter since we didn't have indirect calls,
> > but we should. Didn't yet investigate how to do that for virtio tools.
> I think the retpolines certainly affect performance. Thank you for pointing
> it out. I'd like to start the investigation that how to apply the
> retpolines to the
> virtio tools.
> > > > Thank you for your comments.
> > > > > Thanks!
> > > > >
> > > > >
> > > > Best,
> > > > Shunsuke.

This isn't all that trivial if we want this at runtime.
But compile time is kind of easy.
See Documentation/admin-guide/hw-vuln/spectre.rst



-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
@ 2022-12-28  7:20               ` Michael S. Tsirkin
  0 siblings, 0 replies; 48+ messages in thread
From: Michael S. Tsirkin @ 2022-12-28  7:20 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel

On Wed, Dec 28, 2022 at 11:24:10AM +0900, Shunsuke Mie wrote:
> 2022年12月27日(火) 23:37 Michael S. Tsirkin <mst@redhat.com>:
> >
> > On Tue, Dec 27, 2022 at 07:22:36PM +0900, Shunsuke Mie wrote:
> > > 2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
> > > >
> > > > 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
> > > > >
> > > > > On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
> > > > > > Each vringh memory accessor (user, kern and iotlb) has its own
> > > > > > interface that calls into common code. But some of that code is
> > > > > > duplicated, which hurts extensibility.
> > > > > >
> > > > > > Introduce a struct vringh_ops and provide a common API for all
> > > > > > accessors. This makes it easy to extend the vringh code for new
> > > > > > memory accessors and simplifies caller code.
> > > > > >
> > > > > > Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> > > > > > ---
> > > > > >  drivers/vhost/vringh.c | 667 +++++++++++------------------------------
> > > > > >  include/linux/vringh.h | 100 +++---
> > > > > >  2 files changed, 225 insertions(+), 542 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> > > > > > index aa3cd27d2384..ebfd3644a1a3 100644
> > > > > > --- a/drivers/vhost/vringh.c
> > > > > > +++ b/drivers/vhost/vringh.c
> > > > > > @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
> > > > > >  }
> > > > > >
> > > > > >  /* Returns vring->num if empty, -ve on error. */
> > > > > > -static inline int __vringh_get_head(const struct vringh *vrh,
> > > > > > -                                 int (*getu16)(const struct vringh *vrh,
> > > > > > -                                               u16 *val, const __virtio16 *p),
> > > > > > -                                 u16 *last_avail_idx)
> > > > > > +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
> > > > > >  {
> > > > > >       u16 avail_idx, i, head;
> > > > > >       int err;
> > > > > >
> > > > > > -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > > > +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
> > > > > >       if (err) {
> > > > > >               vringh_bad("Failed to access avail idx at %p",
> > > > > >                          &vrh->vring.avail->idx);
> > > > >
> > > > > I like that this patch removes more lines of code than it adds.
> > > > >
> > > > > However one of the design points of vringh abstractions is that they were
> > > > > carefully written to be very low overhead.
> > > > > This is why we are passing function pointers to inline functions -
> > > > > compiler can optimize that out.
> > > > >
> > > > > I think that introducing ops indirect functions calls here is going to break
> > > > > these assumptions and hurt performance.
> > > > > Unless compiler can somehow figure it out and optimize?
> > > > > I don't see how it's possible with ops pointer in memory
> > > > > but maybe I'm wrong.
> > > > I think your concern is correct. I have to understand the compiler
> > > > optimization and redesign this approach if it is needed.
> > > > > Was any effort taken to test the effect of these patches on performance?
> > > > I just tested vringh_test and already saw a slight performance reduction.
> > > > I have to investigate that, as you said.
> > > I attempted to test with perf. I found that the performance of the patched
> > > code is almost the same as the upstream one. However, I have to investigate
> > > why this patch leads to this result, and the profiling should be run on
> > > more powerful machines too.
> > >
> > > environment:
> > > $ grep 'model name' /proc/cpuinfo
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > > model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
> > >
> > > results:
> > > * for patched code
> > >  Performance counter stats for 'nice -n -20 ./vringh_test_patched --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):
> > >
> > >           3,028.05 msec task-clock                #    0.995 CPUs utilized            ( +-  0.12% )
> > >             78,150      context-switches          #   25.691 K/sec                    ( +-  0.00% )
> > >                  5      cpu-migrations            #    1.644 /sec                     ( +-  3.33% )
> > >                190      page-faults               #   62.461 /sec                     ( +-  0.41% )
> > >      6,919,025,222      cycles                    #    2.275 GHz                      ( +-  0.13% )
> > >      8,990,220,160      instructions              #    1.29  insn per cycle           ( +-  0.04% )
> > >      1,788,326,786      branches                  #  587.899 M/sec                    ( +-  0.05% )
> > >          4,557,398      branch-misses             #    0.25% of all branches          ( +-  0.43% )
> > >
> > >            3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )
> > >
> > > * for upstream code
> > >  Performance counter stats for 'nice -n -20 ./vringh_test_base --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):
> > >
> > >           3,058.41 msec task-clock                #    0.999 CPUs utilized            ( +-  0.14% )
> > >             78,149      context-switches          #   25.545 K/sec                    ( +-  0.00% )
> > >                  5      cpu-migrations            #    1.634 /sec                     ( +-  2.67% )
> > >                194      page-faults               #   63.414 /sec                     ( +-  0.43% )
> > >      6,988,713,963      cycles                    #    2.284 GHz                      ( +-  0.14% )
> > >      8,512,533,269      instructions              #    1.22  insn per cycle           ( +-  0.04% )
> > >      1,638,375,371      branches                  #  535.549 M/sec                    ( +-  0.05% )
> > >          4,428,866      branch-misses             #    0.27% of all branches          ( +- 22.57% )
> > >
> > >            3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )
> >
> >
> > How you compiled it also matters. ATM we don't enable retpolines
> > and it did not matter since we didn't have indirect calls,
> > but we should. Didn't yet investigate how to do that for virtio tools.
> I think retpolines certainly affect performance. Thank you for pointing
> it out. I'd like to start investigating how to apply retpolines to the
> virtio tools.
> > > > Thank you for your comments.
> > > > > Thanks!
> > > > >
> > > > >
> > > > Best,
> > > > Shunsuke.

This isn't all that trivial if we want this at runtime.
But compile time is kind of easy.
See Documentation/admin-guide/hw-vuln/spectre.rst



-- 
MST


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2022-12-28  6:36         ` Jason Wang
@ 2023-01-11  3:26           ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2023-01-11  3:26 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Rusty Russell, kvm, virtualization, netdev,
	linux-kernel


On 2022/12/28 15:36, Jason Wang wrote:
> On Tue, Dec 27, 2022 at 3:06 PM Shunsuke Mie <mie@igel.co.jp> wrote:
>> 2022年12月27日(火) 15:04 Jason Wang <jasowang@redhat.com>:
>>> On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>>>> struct vringh_iov is defined to hold userland addresses. However, to use
>>>> the common function __vringh_iov, the vringh_iov is ultimately converted
>>>> to a vringh_kiov with a simple cast, guarded by compile-time checks that
>>>> make sure the cast is valid.
>>>>
>>>> To simplify the code, this patch removes struct vringh_iov and unifies
>>>> the APIs on struct vringh_kiov.
>>>>
>>>> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
>>> While at this, I wonder if we need to go further, that is, switch to
>>> using an iov iterator instead of a vringh customized one.
>> I didn't see the iov iterator yet, thank you for informing me.
>> Is that iov_iter? https://lwn.net/Articles/625077/
> Exactly.

I've investigated the iov_iter, vhost and related APIs. As a result, I
think it is not easy to switch to the iov_iter, because the designs of
vhost and vringh are different.

The iov_iter carries the vring descriptor info together with metadata
about the transfer method, and vhost provides a generic transfer
function for it. In contrast, vringh_iov holds only the descriptor
info, and vringh provides a separate transfer function for each method.

In the future it would be better to use a common data structure and
APIs between vhost and vringh (or merge them completely), but that
requires a lot of changes, so I'd like to just organize the data
structures in vringh as a first step in this patch.


Best

> Thanks
>
>>> Thanks
>>>
>>>> ---
>>>>   drivers/vhost/vringh.c | 32 ++++++------------------------
>>>>   include/linux/vringh.h | 45 ++++--------------------------------------
>>>>   2 files changed, 10 insertions(+), 67 deletions(-)
>>>>
>>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>>> index 828c29306565..aa3cd27d2384 100644
>>>> --- a/drivers/vhost/vringh.c
>>>> +++ b/drivers/vhost/vringh.c
>>>> @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
>>>>    * calling vringh_iov_cleanup() to release the memory, even on error!
>>>>    */
>>>>   int vringh_getdesc_user(struct vringh *vrh,
>>>> -                       struct vringh_iov *riov,
>>>> -                       struct vringh_iov *wiov,
>>>> +                       struct vringh_kiov *riov,
>>>> +                       struct vringh_kiov *wiov,
>>>>                          bool (*getrange)(struct vringh *vrh,
>>>>                                           u64 addr, struct vringh_range *r),
>>>>                          u16 *head)
>>>> @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
>>>>          if (err == vrh->vring.num)
>>>>                  return 0;
>>>>
>>>> -       /* We need the layouts to be the identical for this to work */
>>>> -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
>>>> -                    offsetof(struct vringh_iov, iov));
>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
>>>> -                    offsetof(struct vringh_iov, i));
>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
>>>> -                    offsetof(struct vringh_iov, used));
>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
>>>> -                    offsetof(struct vringh_iov, max_num));
>>>> -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
>>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
>>>> -                    offsetof(struct kvec, iov_base));
>>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
>>>> -                    offsetof(struct kvec, iov_len));
>>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
>>>> -                    != sizeof(((struct kvec *)NULL)->iov_base));
>>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
>>>> -                    != sizeof(((struct kvec *)NULL)->iov_len));
>>>> -
>>>>          *head = err;
>>>>          err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
>>>>                             (struct vringh_kiov *)wiov,
>>>> @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
>>>>   EXPORT_SYMBOL(vringh_getdesc_user);
>>>>
>>>>   /**
>>>> - * vringh_iov_pull_user - copy bytes from vring_iov.
>>>> + * vringh_iov_pull_user - copy bytes from vring_kiov.
>>>>    * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
>>>>    * @dst: the place to copy.
>>>>    * @len: the maximum length to copy.
>>>>    *
>>>>    * Returns the bytes copied <= len or a negative errno.
>>>>    */
>>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
>>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
>>>>   {
>>>>          return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
>>>>                                 dst, len, xfer_from_user);
>>>> @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
>>>>   EXPORT_SYMBOL(vringh_iov_pull_user);
>>>>
>>>>   /**
>>>> - * vringh_iov_push_user - copy bytes into vring_iov.
>>>> + * vringh_iov_push_user - copy bytes into vring_kiov.
>>>>    * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
>>>>    * @src: the place to copy from.
>>>>    * @len: the maximum length to copy.
>>>>    *
>>>>    * Returns the bytes copied <= len or a negative errno.
>>>>    */
>>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
>>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
>>>>                               const void *src, size_t len)
>>>>   {
>>>>          return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
>>>> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
>>>> index 1991a02c6431..733d948e8123 100644
>>>> --- a/include/linux/vringh.h
>>>> +++ b/include/linux/vringh.h
>>>> @@ -79,18 +79,6 @@ struct vringh_range {
>>>>          u64 offset;
>>>>   };
>>>>
>>>> -/**
>>>> - * struct vringh_iov - iovec mangler.
>>>> - *
>>>> - * Mangles iovec in place, and restores it.
>>>> - * Remaining data is iov + i, of used - i elements.
>>>> - */
>>>> -struct vringh_iov {
>>>> -       struct iovec *iov;
>>>> -       size_t consumed; /* Within iov[i] */
>>>> -       unsigned i, used, max_num;
>>>> -};
>>>> -
>>>>   /**
>>>>    * struct vringh_kiov - kvec mangler.
>>>>    *
>>>> @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
>>>>                       vring_avail_t __user *avail,
>>>>                       vring_used_t __user *used);
>>>>
>>>> -static inline void vringh_iov_init(struct vringh_iov *iov,
>>>> -                                  struct iovec *iovec, unsigned num)
>>>> -{
>>>> -       iov->used = iov->i = 0;
>>>> -       iov->consumed = 0;
>>>> -       iov->max_num = num;
>>>> -       iov->iov = iovec;
>>>> -}
>>>> -
>>>> -static inline void vringh_iov_reset(struct vringh_iov *iov)
>>>> -{
>>>> -       iov->iov[iov->i].iov_len += iov->consumed;
>>>> -       iov->iov[iov->i].iov_base -= iov->consumed;
>>>> -       iov->consumed = 0;
>>>> -       iov->i = 0;
>>>> -}
>>>> -
>>>> -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
>>>> -{
>>>> -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
>>>> -               kfree(iov->iov);
>>>> -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
>>>> -       iov->iov = NULL;
>>>> -}
>>>> -
>>>>   /* Convert a descriptor into iovecs. */
>>>>   int vringh_getdesc_user(struct vringh *vrh,
>>>> -                       struct vringh_iov *riov,
>>>> -                       struct vringh_iov *wiov,
>>>> +                       struct vringh_kiov *riov,
>>>> +                       struct vringh_kiov *wiov,
>>>>                          bool (*getrange)(struct vringh *vrh,
>>>>                                           u64 addr, struct vringh_range *r),
>>>>                          u16 *head);
>>>>
>>>>   /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
>>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
>>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
>>>>
>>>>   /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
>>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
>>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
>>>>                               const void *src, size_t len);
>>>>
>>>>   /* Mark a descriptor as used. */
>>>> --
>>>> 2.25.1
>>>>
>> Best,
>> Shunsuke
>>

^ permalink raw reply	[flat|nested] 48+ messages in thread


* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
  2022-12-28  7:20               ` Michael S. Tsirkin
@ 2023-01-11  4:10                 ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2023-01-11  4:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Rusty Russell, kvm, virtualization, netdev, linux-kernel


On 2022/12/28 16:20, Michael S. Tsirkin wrote:
> On Wed, Dec 28, 2022 at 11:24:10AM +0900, Shunsuke Mie wrote:
>> 2022年12月27日(火) 23:37 Michael S. Tsirkin <mst@redhat.com>:
>>> On Tue, Dec 27, 2022 at 07:22:36PM +0900, Shunsuke Mie wrote:
>>>> 2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
>>>>> 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
>>>>>> On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
>>>>>>> Each vringh memory accessor (user, kern and iotlb) has its own
>>>>>>> interface that calls into common code. But some of that code is
>>>>>>> duplicated, which hurts extensibility.
>>>>>>>
>>>>>>> Introduce a struct vringh_ops and provide a common API for all
>>>>>>> accessors. This makes it easy to extend the vringh code for new
>>>>>>> memory accessors and simplifies caller code.
>>>>>>>
>>>>>>> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
>>>>>>> ---
>>>>>>>   drivers/vhost/vringh.c | 667 +++++++++++------------------------------
>>>>>>>   include/linux/vringh.h | 100 +++---
>>>>>>>   2 files changed, 225 insertions(+), 542 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>>>>>> index aa3cd27d2384..ebfd3644a1a3 100644
>>>>>>> --- a/drivers/vhost/vringh.c
>>>>>>> +++ b/drivers/vhost/vringh.c
>>>>>>> @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
>>>>>>>   }
>>>>>>>
>>>>>>>   /* Returns vring->num if empty, -ve on error. */
>>>>>>> -static inline int __vringh_get_head(const struct vringh *vrh,
>>>>>>> -                                 int (*getu16)(const struct vringh *vrh,
>>>>>>> -                                               u16 *val, const __virtio16 *p),
>>>>>>> -                                 u16 *last_avail_idx)
>>>>>>> +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
>>>>>>>   {
>>>>>>>        u16 avail_idx, i, head;
>>>>>>>        int err;
>>>>>>>
>>>>>>> -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
>>>>>>> +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
>>>>>>>        if (err) {
>>>>>>>                vringh_bad("Failed to access avail idx at %p",
>>>>>>>                           &vrh->vring.avail->idx);
>>>>>> I like that this patch removes more lines of code than it adds.
>>>>>>
>>>>>> However one of the design points of vringh abstractions is that they were
>>>>>> carefully written to be very low overhead.
>>>>>> This is why we are passing function pointers to inline functions -
>>>>>> compiler can optimize that out.
>>>>>>
>>>>>> I think that introducing ops indirect functions calls here is going to break
>>>>>> these assumptions and hurt performance.
>>>>>> Unless compiler can somehow figure it out and optimize?
>>>>>> I don't see how it's possible with ops pointer in memory
>>>>>> but maybe I'm wrong.
>>>>> I think your concern is correct. I have to understand the compiler
>>>>> optimization and redesign this approach If it is needed.
>>>>>> Was any effort taken to test effect of these patches on performance?
>>>>> I just tested vringh_test and already faced little performance reduction.
>>>>> I have to investigate that, as you said.
>>>> I attempted to test with perf. I found that the performance of patched code
>>>> is almost the same as the upstream one. However, I have to investigate way
>>>> this patch leads to this result, also the profiling should be run on
>>>> more powerful
>>>> machines too.
>>>>
>>>> environment:
>>>> $ grep 'model name' /proc/cpuinfo
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>>
>>>> results:
>>>> * for patched code
>>>>   Performance counter stats for 'nice -n -20 ./vringh_test_patched
>>>> --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):
>>>>
>>>>            3,028.05 msec task-clock                #    0.995 CPUs
>>>> utilized            ( +-  0.12% )
>>>>              78,150      context-switches          #   25.691 K/sec
>>>>                 ( +-  0.00% )
>>>>                   5      cpu-migrations            #    1.644 /sec
>>>>                 ( +-  3.33% )
>>>>                 190      page-faults               #   62.461 /sec
>>>>                 ( +-  0.41% )
>>>>       6,919,025,222      cycles                    #    2.275 GHz
>>>>                 ( +-  0.13% )
>>>>       8,990,220,160      instructions              #    1.29  insn per
>>>> cycle           ( +-  0.04% )
>>>>       1,788,326,786      branches                  #  587.899 M/sec
>>>>                 ( +-  0.05% )
>>>>           4,557,398      branch-misses             #    0.25% of all
>>>> branches          ( +-  0.43% )
>>>>
>>>>             3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )
>>>>
>>>> * for upstream code
>>>>   Performance counter stats for 'nice -n -20 ./vringh_test_base
>>>> --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):
>>>>
>>>>            3,058.41 msec task-clock                #    0.999 CPUs
>>>> utilized            ( +-  0.14% )
>>>>              78,149      context-switches          #   25.545 K/sec
>>>>                 ( +-  0.00% )
>>>>                   5      cpu-migrations            #    1.634 /sec
>>>>                 ( +-  2.67% )
>>>>                 194      page-faults               #   63.414 /sec
>>>>                 ( +-  0.43% )
>>>>       6,988,713,963      cycles                    #    2.284 GHz
>>>>                 ( +-  0.14% )
>>>>       8,512,533,269      instructions              #    1.22  insn per
>>>> cycle           ( +-  0.04% )
>>>>       1,638,375,371      branches                  #  535.549 M/sec
>>>>                 ( +-  0.05% )
>>>>           4,428,866      branch-misses             #    0.27% of all
>>>> branches          ( +- 22.57% )
>>>>
>>>>             3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )
>>>
>>> How you compiled it also matters. ATM we don't enable retpolines
>>> and it did not matter since we didn't have indirect calls,
>>> but we should. Didn't yet investigate how to do that for virtio tools.
>> I think the retpolines certainly affect performance. Thank you for pointing
>> it out. I'd like to start the investigation that how to apply the
>> retpolines to the
>> virtio tools.
>>>>> Thank you for your comments.
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>> Best,
>>>>> Shunsuke.
> This isn't all that trivial if we want this at runtime.
> But compile time is kind of easy.
> See Documentation/admin-guide/hw-vuln/spectre.rst

Thank you for showing it.


I followed the document and added options to CFLAGS to the tools Makefile.

That is

---

diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
index 1b25cc7c64bb..7b7139d97d74 100644
--- a/tools/virtio/Makefile
+++ b/tools/virtio/Makefile
@@ -4,7 +4,7 @@ test: virtio_test vringh_test
  virtio_test: virtio_ring.o virtio_test.o
  vringh_test: vringh_test.o vringh.o virtio_ring.o

-CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. 
-I../include/ -I ../../usr/include/ -Wno-pointer-sign 
-fno-strict-overflow -fno-strict-aliasing -fno-common -MMD 
-U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h
+CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. 
-I../include/ -I ../../usr/include/ -Wno-pointer-sign 
-fno-strict-overflow -fno-strict-aliasing -fno-common -MMD 
-U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h 
-mfunction-return=thunk -fcf-protection=none -mindirect-branch-register
  CFLAGS += -pthread
  LDFLAGS += -pthread
  vpath %.c ../../drivers/virtio ../../drivers/vhost
---

And results of evaluation are following:

- base with retpoline

$ sudo perf stat --repeat 20 -- nice -n -20 ./vringh_test_retp_origin 
--parallel --eventidx --fast-vringh
Using CPUS 0 and 3
Guest: notified 0, pinged 98040
Host: notified 98040, pinged 0
...

  Performance counter stats for 'nice -n -20 ./vringh_test_retp_origin 
--parallel --eventidx --fast-vringh' (20 runs):

           6,228.33 msec task-clock                #    1.004 CPUs 
utilized            ( +-  0.05% )
            196,110      context-switches          #   31.616 
K/sec                    ( +-  0.00% )
                  6      cpu-migrations            #    0.967 
/sec                     ( +-  2.39% )
                205      page-faults               #   33.049 
/sec                     ( +-  0.46% )
     14,218,527,987      cycles                    #    2.292 
GHz                      ( +-  0.05% )
     10,342,897,254      instructions              #    0.73  insn per 
cycle           ( +-  0.02% )
      2,310,572,989      branches                  #  372.500 
M/sec                    ( +-  0.03% )
        178,273,068      branch-misses             #    7.72% of all 
branches          ( +-  0.04% )

            6.20406 +- 0.00308 seconds time elapsed  ( +-  0.05% )

- patched (unified APIs) with retpoline

$ sudo perf stat --repeat 20 -- nice -n -20 ./vringh_test_retp_patched 
--parallel --eventidx --fast-vringh
Using CPUS 0 and 3
Guest: notified 0, pinged 98040
Host: notified 98040, pinged 0
...

  Performance counter stats for 'nice -n -20 ./vringh_test_retp_patched 
--parallel --eventidx --fast-vringh' (20 runs):

           6,103.94 msec task-clock                #    1.001 CPUs 
utilized            ( +-  0.03% )
            196,125      context-switches          #   32.165 
K/sec                    ( +-  0.00% )
                  7      cpu-migrations            #    1.148 
/sec                     ( +-  1.56% )
                196      page-faults               #   32.144 
/sec                     ( +-  0.41% )
     13,933,055,778      cycles                    #    2.285 
GHz                      ( +-  0.03% )
     10,309,004,718      instructions              #    0.74  insn per 
cycle           ( +-  0.03% )
      2,368,447,519      branches                  #  388.425 
M/sec                    ( +-  0.04% )
        211,364,886      branch-misses             #    8.94% of all 
branches          ( +-  0.05% )

            6.09888 +- 0.00155 seconds time elapsed  ( +-  0.03% )

As a result, at the patched code, the branch-misses was increased but
elapsed time became faster than the based code. The number of 
page-faults was
a little different. I'm suspicious of that the page-fault penalty leads the
performance result.

I think that a pattern of memory access for data is same with those, but
for instruction is different. Actually a code size (.text segment) was a
little smaller. 0x6a65 and 0x63f5.

$ readelf -a ./vringh_test_retp_origin |grep .text -1
        0000000000000008  0000000000000008  AX       0     0     8
   [14] .text             PROGBITS         0000000000001230 00001230
        0000000000006a65  0000000000000000  AX       0     0     16
--
    02     .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym 
.dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
    03     .init .plt .plt.got .text .fini
    04     .rodata .eh_frame_hdr .eh_frame


$ readelf -a ./vringh_test_retp_patched |grep .text -1
        0000000000000008  0000000000000008  AX       0     0     8
   [14] .text             PROGBITS         0000000000001230 00001230
        00000000000063f5  0000000000000000  AX       0     0     16
--
    02     .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym 
.dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
    03     .init .plt .plt.got .text .fini
    04     .rodata .eh_frame_hdr .eh_frame

I'll keep this investigation. I was wondering if you could comment me.

Best

>

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 4/9] vringh: unify the APIs for all accessors
@ 2023-01-11  4:10                 ` Shunsuke Mie
  0 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2023-01-11  4:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, netdev, Rusty Russell, linux-kernel, virtualization


On 2022/12/28 16:20, Michael S. Tsirkin wrote:
> On Wed, Dec 28, 2022 at 11:24:10AM +0900, Shunsuke Mie wrote:
>> 2022年12月27日(火) 23:37 Michael S. Tsirkin <mst@redhat.com>:
>>> On Tue, Dec 27, 2022 at 07:22:36PM +0900, Shunsuke Mie wrote:
>>>> 2022年12月27日(火) 16:49 Shunsuke Mie <mie@igel.co.jp>:
>>>>> 2022年12月27日(火) 16:04 Michael S. Tsirkin <mst@redhat.com>:
>>>>>> On Tue, Dec 27, 2022 at 11:25:26AM +0900, Shunsuke Mie wrote:
>>>>>>> Each of the vringh memory accessors (user, kern and iotlb) has its own
>>>>>>> interfaces that call common code. But some code is duplicated, which
>>>>>>> reduces extensibility.
>>>>>>>
>>>>>>> Introduce a struct vringh_ops and provide a common API for all accessors.
>>>>>>> This makes it easy to extend the vringh code with new memory accessors
>>>>>>> and simplifies the caller code.
>>>>>>>
>>>>>>> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
>>>>>>> ---
>>>>>>>   drivers/vhost/vringh.c | 667 +++++++++++------------------------------
>>>>>>>   include/linux/vringh.h | 100 +++---
>>>>>>>   2 files changed, 225 insertions(+), 542 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>>>>>> index aa3cd27d2384..ebfd3644a1a3 100644
>>>>>>> --- a/drivers/vhost/vringh.c
>>>>>>> +++ b/drivers/vhost/vringh.c
>>>>>>> @@ -35,15 +35,12 @@ static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
>>>>>>>   }
>>>>>>>
>>>>>>>   /* Returns vring->num if empty, -ve on error. */
>>>>>>> -static inline int __vringh_get_head(const struct vringh *vrh,
>>>>>>> -                                 int (*getu16)(const struct vringh *vrh,
>>>>>>> -                                               u16 *val, const __virtio16 *p),
>>>>>>> -                                 u16 *last_avail_idx)
>>>>>>> +static inline int __vringh_get_head(const struct vringh *vrh, u16 *last_avail_idx)
>>>>>>>   {
>>>>>>>        u16 avail_idx, i, head;
>>>>>>>        int err;
>>>>>>>
>>>>>>> -     err = getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
>>>>>>> +     err = vrh->ops.getu16(vrh, &avail_idx, &vrh->vring.avail->idx);
>>>>>>>        if (err) {
>>>>>>>                vringh_bad("Failed to access avail idx at %p",
>>>>>>>                           &vrh->vring.avail->idx);
>>>>>> I like that this patch removes more lines of code than it adds.
>>>>>>
>>>>>> However, one of the design points of the vringh abstractions is that they
>>>>>> were carefully written to be very low overhead.
>>>>>> This is why we pass function pointers to inline functions -
>>>>>> the compiler can optimize that out.
>>>>>>
>>>>>> I think that introducing ops indirect function calls here is going to break
>>>>>> these assumptions and hurt performance.
>>>>>> Unless the compiler can somehow figure it out and optimize?
>>>>>> I don't see how that's possible with the ops pointer in memory,
>>>>>> but maybe I'm wrong.
>>>>> I think your concern is correct. I have to understand the compiler
>>>>> optimization and redesign this approach if it is needed.
>>>>>> Was any effort taken to test the effect of these patches on performance?
>>>>> I just tested vringh_test and already saw a small performance reduction.
>>>>> I have to investigate that, as you said.
>>>> I attempted to test with perf. I found that the performance of the patched
>>>> code is almost the same as the upstream one. However, I have to investigate
>>>> why this patch leads to this result, and the profiling should also be run
>>>> on more powerful machines.
>>>>
>>>> environment:
>>>> $ grep 'model name' /proc/cpuinfo
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>> model name      : Intel(R) Core(TM) i3-7020U CPU @ 2.30GHz
>>>>
>>>> results:
>>>> * for patched code
>>>>   Performance counter stats for 'nice -n -20 ./vringh_test_patched --parallel --eventidx --fast-vringh --indirect --virtio-1' (20 runs):
>>>>
>>>>            3,028.05 msec task-clock                #    0.995 CPUs utilized            ( +-  0.12% )
>>>>              78,150      context-switches          #   25.691 K/sec                   ( +-  0.00% )
>>>>                   5      cpu-migrations            #    1.644 /sec                    ( +-  3.33% )
>>>>                 190      page-faults               #   62.461 /sec                    ( +-  0.41% )
>>>>       6,919,025,222      cycles                    #    2.275 GHz                     ( +-  0.13% )
>>>>       8,990,220,160      instructions              #    1.29  insn per cycle          ( +-  0.04% )
>>>>       1,788,326,786      branches                  #  587.899 M/sec                   ( +-  0.05% )
>>>>           4,557,398      branch-misses             #    0.25% of all branches         ( +-  0.43% )
>>>>
>>>>             3.04359 +- 0.00378 seconds time elapsed  ( +-  0.12% )
>>>>
>>>> * for upstream code
>>>>   Performance counter stats for 'nice -n -20 ./vringh_test_base --parallel --eventidx --fast-vringh --indirect --virtio-1' (10 runs):
>>>>
>>>>            3,058.41 msec task-clock                #    0.999 CPUs utilized            ( +-  0.14% )
>>>>              78,149      context-switches          #   25.545 K/sec                   ( +-  0.00% )
>>>>                   5      cpu-migrations            #    1.634 /sec                    ( +-  2.67% )
>>>>                 194      page-faults               #   63.414 /sec                    ( +-  0.43% )
>>>>       6,988,713,963      cycles                    #    2.284 GHz                     ( +-  0.14% )
>>>>       8,512,533,269      instructions              #    1.22  insn per cycle          ( +-  0.04% )
>>>>       1,638,375,371      branches                  #  535.549 M/sec                   ( +-  0.05% )
>>>>           4,428,866      branch-misses             #    0.27% of all branches         ( +- 22.57% )
>>>>
>>>>             3.06085 +- 0.00420 seconds time elapsed  ( +-  0.14% )
>>>
>>> How you compiled it also matters. ATM we don't enable retpolines
>>> and it did not matter since we didn't have indirect calls,
>>> but we should. Didn't yet investigate how to do that for virtio tools.
>> I think the retpolines certainly affect performance. Thank you for pointing
>> it out. I'd like to start investigating how to apply retpolines to the
>> virtio tools.
>>>>> Thank you for your comments.
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>> Best,
>>>>> Shunsuke.
> This isn't all that trivial if we want this at runtime.
> But compile time is kind of easy.
> See Documentation/admin-guide/hw-vuln/spectre.rst

Thank you for showing it.


I followed the document and added the options to CFLAGS in the tools Makefile.

That is

---

diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
index 1b25cc7c64bb..7b7139d97d74 100644
--- a/tools/virtio/Makefile
+++ b/tools/virtio/Makefile
@@ -4,7 +4,7 @@ test: virtio_test vringh_test
  virtio_test: virtio_ring.o virtio_test.o
  vringh_test: vringh_test.o vringh.o virtio_ring.o

-CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h
+CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h -mfunction-return=thunk -fcf-protection=none -mindirect-branch-register
  CFLAGS += -pthread
  LDFLAGS += -pthread
  vpath %.c ../../drivers/virtio ../../drivers/vhost
---

And results of evaluation are following:

- base with retpoline

$ sudo perf stat --repeat 20 -- nice -n -20 ./vringh_test_retp_origin --parallel --eventidx --fast-vringh
Using CPUS 0 and 3
Guest: notified 0, pinged 98040
Host: notified 98040, pinged 0
...

  Performance counter stats for 'nice -n -20 ./vringh_test_retp_origin --parallel --eventidx --fast-vringh' (20 runs):

           6,228.33 msec task-clock                #    1.004 CPUs utilized            ( +-  0.05% )
            196,110      context-switches          #   31.616 K/sec                    ( +-  0.00% )
                  6      cpu-migrations            #    0.967 /sec                     ( +-  2.39% )
                205      page-faults               #   33.049 /sec                     ( +-  0.46% )
     14,218,527,987      cycles                    #    2.292 GHz                      ( +-  0.05% )
     10,342,897,254      instructions              #    0.73  insn per cycle           ( +-  0.02% )
      2,310,572,989      branches                  #  372.500 M/sec                    ( +-  0.03% )
        178,273,068      branch-misses             #    7.72% of all branches          ( +-  0.04% )

            6.20406 +- 0.00308 seconds time elapsed  ( +-  0.05% )

- patched (unified APIs) with retpoline

$ sudo perf stat --repeat 20 -- nice -n -20 ./vringh_test_retp_patched --parallel --eventidx --fast-vringh
Using CPUS 0 and 3
Guest: notified 0, pinged 98040
Host: notified 98040, pinged 0
...

  Performance counter stats for 'nice -n -20 ./vringh_test_retp_patched --parallel --eventidx --fast-vringh' (20 runs):

           6,103.94 msec task-clock                #    1.001 CPUs utilized            ( +-  0.03% )
            196,125      context-switches          #   32.165 K/sec                    ( +-  0.00% )
                  7      cpu-migrations            #    1.148 /sec                     ( +-  1.56% )
                196      page-faults               #   32.144 /sec                     ( +-  0.41% )
     13,933,055,778      cycles                    #    2.285 GHz                      ( +-  0.03% )
     10,309,004,718      instructions              #    0.74  insn per cycle           ( +-  0.03% )
      2,368,447,519      branches                  #  388.425 M/sec                    ( +-  0.04% )
        211,364,886      branch-misses             #    8.94% of all branches          ( +-  0.05% )

            6.09888 +- 0.00155 seconds time elapsed  ( +-  0.03% )

As a result, in the patched code the branch misses increased, but the
elapsed time was shorter than for the base code. The number of page faults
was also slightly different. I suspect the page-fault penalty drives this
performance result.

I think the data memory-access pattern is the same for both, but the
instruction access pattern is different. In fact, the code size (.text
segment) is a little smaller: 0x6a65 vs 0x63f5.

$ readelf -a ./vringh_test_retp_origin |grep .text -1
        0000000000000008  0000000000000008  AX       0     0     8
   [14] .text             PROGBITS         0000000000001230 00001230
        0000000000006a65  0000000000000000  AX       0     0     16
--
    02     .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
    03     .init .plt .plt.got .text .fini
    04     .rodata .eh_frame_hdr .eh_frame


$ readelf -a ./vringh_test_retp_patched |grep .text -1
        0000000000000008  0000000000000008  AX       0     0     8
   [14] .text             PROGBITS         0000000000001230 00001230
        00000000000063f5  0000000000000000  AX       0     0     16
--
    02     .interp .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
    03     .init .plt .plt.got .text .fini
    04     .rodata .eh_frame_hdr .eh_frame

I'll keep investigating. I would appreciate any comments.

Best

>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2023-01-11  3:26           ` Shunsuke Mie
@ 2023-01-11  5:54             ` Jason Wang
  -1 siblings, 0 replies; 48+ messages in thread
From: Jason Wang @ 2023-01-11  5:54 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: kvm, Michael S. Tsirkin, netdev, Rusty Russell, linux-kernel,
	virtualization

On Wed, Jan 11, 2023 at 11:27 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>
>
> On 2022/12/28 15:36, Jason Wang wrote:
> > On Tue, Dec 27, 2022 at 3:06 PM Shunsuke Mie <mie@igel.co.jp> wrote:
> >> 2022年12月27日(火) 15:04 Jason Wang <jasowang@redhat.com>:
> >>> On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> >>>> struct vringh_iov is defined to hold userland addresses. However, to use
> >>>> the common function __vringh_iov, the vringh_iov is ultimately cast to
> >>>> vringh_kiov. Compile-time checks make sure the cast is valid.
> >>>>
> >>>> To simplify the code, this patch removes struct vringh_iov and unifies
> >>>> the APIs around struct vringh_kiov.
> >>>>
> >>>> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> >>> While at this, I wonder if we need to go further, that is, switch to
> >>> using an iov iterator instead of a vringh customized one.
> >> I didn't see the iov iterator yet, thank you for informing me.
> >> Is that iov_iter? https://lwn.net/Articles/625077/
> > Exactly.
>
> I've investigated iov_iter, vhost and the related APIs. As a result, I
> think it is not easy to switch to iov_iter, because the designs of vhost
> and vringh are different.

Yes, but just to make sure we are on the same page: the reason I
suggest iov_iter for vringh is that vringh itself has a customized
iter equivalent, e.g. it has iters for kernel, user, and even iotlb. At
least the kernel and userspace parts could be switched to iov_iter.
Note that this has nothing to do with vhost.

>
> The iov_iter has the vring desc info plus metadata about the transfer
> method, and vhost provides a generic transfer function for the iov_iter.
> In contrast, vringh_iov has just the vring desc info, and vringh provides
> a transfer function for each method.
>
> In the future, it would be better to use a common data structure and APIs
> between vhost and vringh (or merge them completely), but that requires a
> lot of changes, so I'd like to just organize the data structures in vringh
> as a first step in this patch.

That's fine.

Thanks

>
>
> Best
>
> > Thanks
> >
> >>> Thanks
> >>>
> >>>> ---
> >>>>   drivers/vhost/vringh.c | 32 ++++++------------------------
> >>>>   include/linux/vringh.h | 45 ++++--------------------------------------
> >>>>   2 files changed, 10 insertions(+), 67 deletions(-)
> >>>>
> >>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> >>>> index 828c29306565..aa3cd27d2384 100644
> >>>> --- a/drivers/vhost/vringh.c
> >>>> +++ b/drivers/vhost/vringh.c
> >>>> @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
> >>>>    * calling vringh_iov_cleanup() to release the memory, even on error!
> >>>>    */
> >>>>   int vringh_getdesc_user(struct vringh *vrh,
> >>>> -                       struct vringh_iov *riov,
> >>>> -                       struct vringh_iov *wiov,
> >>>> +                       struct vringh_kiov *riov,
> >>>> +                       struct vringh_kiov *wiov,
> >>>>                          bool (*getrange)(struct vringh *vrh,
> >>>>                                           u64 addr, struct vringh_range *r),
> >>>>                          u16 *head)
> >>>> @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
> >>>>          if (err == vrh->vring.num)
> >>>>                  return 0;
> >>>>
> >>>> -       /* We need the layouts to be the identical for this to work */
> >>>> -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
> >>>> -                    offsetof(struct vringh_iov, iov));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
> >>>> -                    offsetof(struct vringh_iov, i));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
> >>>> -                    offsetof(struct vringh_iov, used));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
> >>>> -                    offsetof(struct vringh_iov, max_num));
> >>>> -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
> >>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
> >>>> -                    offsetof(struct kvec, iov_base));
> >>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
> >>>> -                    offsetof(struct kvec, iov_len));
> >>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
> >>>> -                    != sizeof(((struct kvec *)NULL)->iov_base));
> >>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
> >>>> -                    != sizeof(((struct kvec *)NULL)->iov_len));
> >>>> -
> >>>>          *head = err;
> >>>>          err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
> >>>>                             (struct vringh_kiov *)wiov,
> >>>> @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
> >>>>   EXPORT_SYMBOL(vringh_getdesc_user);
> >>>>
> >>>>   /**
> >>>> - * vringh_iov_pull_user - copy bytes from vring_iov.
> >>>> + * vringh_iov_pull_user - copy bytes from vring_kiov.
> >>>>    * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
> >>>>    * @dst: the place to copy.
> >>>>    * @len: the maximum length to copy.
> >>>>    *
> >>>>    * Returns the bytes copied <= len or a negative errno.
> >>>>    */
> >>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> >>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
> >>>>   {
> >>>>          return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
> >>>>                                 dst, len, xfer_from_user);
> >>>> @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> >>>>   EXPORT_SYMBOL(vringh_iov_pull_user);
> >>>>
> >>>>   /**
> >>>> - * vringh_iov_push_user - copy bytes into vring_iov.
> >>>> + * vringh_iov_push_user - copy bytes into vring_kiov.
> >>>>    * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
> >>>>    * @src: the place to copy from.
> >>>>    * @len: the maximum length to copy.
> >>>>    *
> >>>>    * Returns the bytes copied <= len or a negative errno.
> >>>>    */
> >>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> >>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> >>>>                               const void *src, size_t len)
> >>>>   {
> >>>>          return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
> >>>> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> >>>> index 1991a02c6431..733d948e8123 100644
> >>>> --- a/include/linux/vringh.h
> >>>> +++ b/include/linux/vringh.h
> >>>> @@ -79,18 +79,6 @@ struct vringh_range {
> >>>>          u64 offset;
> >>>>   };
> >>>>
> >>>> -/**
> >>>> - * struct vringh_iov - iovec mangler.
> >>>> - *
> >>>> - * Mangles iovec in place, and restores it.
> >>>> - * Remaining data is iov + i, of used - i elements.
> >>>> - */
> >>>> -struct vringh_iov {
> >>>> -       struct iovec *iov;
> >>>> -       size_t consumed; /* Within iov[i] */
> >>>> -       unsigned i, used, max_num;
> >>>> -};
> >>>> -
> >>>>   /**
> >>>>    * struct vringh_kiov - kvec mangler.
> >>>>    *
> >>>> @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
> >>>>                       vring_avail_t __user *avail,
> >>>>                       vring_used_t __user *used);
> >>>>
> >>>> -static inline void vringh_iov_init(struct vringh_iov *iov,
> >>>> -                                  struct iovec *iovec, unsigned num)
> >>>> -{
> >>>> -       iov->used = iov->i = 0;
> >>>> -       iov->consumed = 0;
> >>>> -       iov->max_num = num;
> >>>> -       iov->iov = iovec;
> >>>> -}
> >>>> -
> >>>> -static inline void vringh_iov_reset(struct vringh_iov *iov)
> >>>> -{
> >>>> -       iov->iov[iov->i].iov_len += iov->consumed;
> >>>> -       iov->iov[iov->i].iov_base -= iov->consumed;
> >>>> -       iov->consumed = 0;
> >>>> -       iov->i = 0;
> >>>> -}
> >>>> -
> >>>> -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
> >>>> -{
> >>>> -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
> >>>> -               kfree(iov->iov);
> >>>> -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
> >>>> -       iov->iov = NULL;
> >>>> -}
> >>>> -
> >>>>   /* Convert a descriptor into iovecs. */
> >>>>   int vringh_getdesc_user(struct vringh *vrh,
> >>>> -                       struct vringh_iov *riov,
> >>>> -                       struct vringh_iov *wiov,
> >>>> +                       struct vringh_kiov *riov,
> >>>> +                       struct vringh_kiov *wiov,
> >>>>                          bool (*getrange)(struct vringh *vrh,
> >>>>                                           u64 addr, struct vringh_range *r),
> >>>>                          u16 *head);
> >>>>
> >>>>   /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
> >>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
> >>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
> >>>>
> >>>>   /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
> >>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> >>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> >>>>                               const void *src, size_t len);
> >>>>
> >>>>   /* Mark a descriptor as used. */
> >>>> --
> >>>> 2.25.1
> >>>>
> >> Best,
> >> Shunsuke
> >>
>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
@ 2023-01-11  5:54             ` Jason Wang
  0 siblings, 0 replies; 48+ messages in thread
From: Jason Wang @ 2023-01-11  5:54 UTC (permalink / raw)
  To: Shunsuke Mie
  Cc: Michael S. Tsirkin, Rusty Russell, kvm, virtualization, netdev,
	linux-kernel

On Wed, Jan 11, 2023 at 11:27 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>
>
> On 2022/12/28 15:36, Jason Wang wrote:
> > On Tue, Dec 27, 2022 at 3:06 PM Shunsuke Mie <mie@igel.co.jp> wrote:
> >> 2022年12月27日(火) 15:04 Jason Wang <jasowang@redhat.com>:
> >>> On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
> >>>> struct vringh_iov is defined to hold userland addresses. However, to use
> >>>> the common function __vringh_iov(), the vringh_iov is finally converted to
> >>>> vringh_kiov with a simple cast. Compile-time checks make sure the cast
> >>>> is layout-safe.
> >>>>
> >>>> To simplify the code, this patch removes the struct vringh_iov and unifies
> >>>> APIs to struct vringh_kiov.
> >>>>
> >>>> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
> >>> While at this, I wonder if we need to go further, that is, switch to
> >>> using an iov iterator instead of a vringh customized one.
> >> I didn't see the iov iterator yet, thank you for informing me.
> >> Is that iov_iter? https://lwn.net/Articles/625077/
> > Exactly.
>
> I've investigated the iov_iter, vhost and the related APIs. As a result, I
> think that it is not easy to switch to the iov_iter, because the designs
> of vhost and vringh are different.

Yes, but just to make sure we are on the same page, the reason I
suggest iov_iter for vringh is that vringh itself has a customized
iter equivalent, e.g. it has iters for kernel, user, or even iotlb. At
least the kernel and userspace parts could be switched to iov_iter.
Note that this has nothing to do with vhost.

>
> The iov_iter has vring desc info and metadata describing the transfer
> method. The vhost provides a generic transfer function for the iov_iter.
> In contrast, vringh_iov just has vring desc info; vringh provides a
> transfer function for each method.
>
> In the future, it would be better to use a common data structure and APIs
> between vhost and vringh (or merge them completely), but that requires a
> lot of changes, so I'd like to just organize the data structures in
> vringh as a first step in this patch.

That's fine.

Thanks

>
>
> Best
>
> > Thanks
> >
> >>> Thanks
> >>>
> >>>> ---
> >>>>   drivers/vhost/vringh.c | 32 ++++++------------------------
> >>>>   include/linux/vringh.h | 45 ++++--------------------------------------
> >>>>   2 files changed, 10 insertions(+), 67 deletions(-)
> >>>>
> >>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> >>>> index 828c29306565..aa3cd27d2384 100644
> >>>> --- a/drivers/vhost/vringh.c
> >>>> +++ b/drivers/vhost/vringh.c
> >>>> @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
> >>>>    * calling vringh_iov_cleanup() to release the memory, even on error!
> >>>>    */
> >>>>   int vringh_getdesc_user(struct vringh *vrh,
> >>>> -                       struct vringh_iov *riov,
> >>>> -                       struct vringh_iov *wiov,
> >>>> +                       struct vringh_kiov *riov,
> >>>> +                       struct vringh_kiov *wiov,
> >>>>                          bool (*getrange)(struct vringh *vrh,
> >>>>                                           u64 addr, struct vringh_range *r),
> >>>>                          u16 *head)
> >>>> @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
> >>>>          if (err == vrh->vring.num)
> >>>>                  return 0;
> >>>>
> >>>> -       /* We need the layouts to be the identical for this to work */
> >>>> -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
> >>>> -                    offsetof(struct vringh_iov, iov));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
> >>>> -                    offsetof(struct vringh_iov, i));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
> >>>> -                    offsetof(struct vringh_iov, used));
> >>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
> >>>> -                    offsetof(struct vringh_iov, max_num));
> >>>> -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
> >>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
> >>>> -                    offsetof(struct kvec, iov_base));
> >>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
> >>>> -                    offsetof(struct kvec, iov_len));
> >>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
> >>>> -                    != sizeof(((struct kvec *)NULL)->iov_base));
> >>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
> >>>> -                    != sizeof(((struct kvec *)NULL)->iov_len));
> >>>> -
> >>>>          *head = err;
> >>>>          err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
> >>>>                             (struct vringh_kiov *)wiov,
> >>>> @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
> >>>>   EXPORT_SYMBOL(vringh_getdesc_user);
> >>>>
> >>>>   /**
> >>>> - * vringh_iov_pull_user - copy bytes from vring_iov.
> >>>> + * vringh_iov_pull_user - copy bytes from vring_kiov.
> >>>>    * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
> >>>>    * @dst: the place to copy.
> >>>>    * @len: the maximum length to copy.
> >>>>    *
> >>>>    * Returns the bytes copied <= len or a negative errno.
> >>>>    */
> >>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> >>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
> >>>>   {
> >>>>          return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
> >>>>                                 dst, len, xfer_from_user);
> >>>> @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
> >>>>   EXPORT_SYMBOL(vringh_iov_pull_user);
> >>>>
> >>>>   /**
> >>>> - * vringh_iov_push_user - copy bytes into vring_iov.
> >>>> + * vringh_iov_push_user - copy bytes into vring_kiov.
> >>>>    * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
> >>>>    * @src: the place to copy from.
> >>>>    * @len: the maximum length to copy.
> >>>>    *
> >>>>    * Returns the bytes copied <= len or a negative errno.
> >>>>    */
> >>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> >>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> >>>>                               const void *src, size_t len)
> >>>>   {
> >>>>          return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
> >>>> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> >>>> index 1991a02c6431..733d948e8123 100644
> >>>> --- a/include/linux/vringh.h
> >>>> +++ b/include/linux/vringh.h
> >>>> @@ -79,18 +79,6 @@ struct vringh_range {
> >>>>          u64 offset;
> >>>>   };
> >>>>
> >>>> -/**
> >>>> - * struct vringh_iov - iovec mangler.
> >>>> - *
> >>>> - * Mangles iovec in place, and restores it.
> >>>> - * Remaining data is iov + i, of used - i elements.
> >>>> - */
> >>>> -struct vringh_iov {
> >>>> -       struct iovec *iov;
> >>>> -       size_t consumed; /* Within iov[i] */
> >>>> -       unsigned i, used, max_num;
> >>>> -};
> >>>> -
> >>>>   /**
> >>>>    * struct vringh_kiov - kvec mangler.
> >>>>    *
> >>>> @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
> >>>>                       vring_avail_t __user *avail,
> >>>>                       vring_used_t __user *used);
> >>>>
> >>>> -static inline void vringh_iov_init(struct vringh_iov *iov,
> >>>> -                                  struct iovec *iovec, unsigned num)
> >>>> -{
> >>>> -       iov->used = iov->i = 0;
> >>>> -       iov->consumed = 0;
> >>>> -       iov->max_num = num;
> >>>> -       iov->iov = iovec;
> >>>> -}
> >>>> -
> >>>> -static inline void vringh_iov_reset(struct vringh_iov *iov)
> >>>> -{
> >>>> -       iov->iov[iov->i].iov_len += iov->consumed;
> >>>> -       iov->iov[iov->i].iov_base -= iov->consumed;
> >>>> -       iov->consumed = 0;
> >>>> -       iov->i = 0;
> >>>> -}
> >>>> -
> >>>> -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
> >>>> -{
> >>>> -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
> >>>> -               kfree(iov->iov);
> >>>> -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
> >>>> -       iov->iov = NULL;
> >>>> -}
> >>>> -
> >>>>   /* Convert a descriptor into iovecs. */
> >>>>   int vringh_getdesc_user(struct vringh *vrh,
> >>>> -                       struct vringh_iov *riov,
> >>>> -                       struct vringh_iov *wiov,
> >>>> +                       struct vringh_kiov *riov,
> >>>> +                       struct vringh_kiov *wiov,
> >>>>                          bool (*getrange)(struct vringh *vrh,
> >>>>                                           u64 addr, struct vringh_range *r),
> >>>>                          u16 *head);
> >>>>
> >>>>   /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
> >>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
> >>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
> >>>>
> >>>>   /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
> >>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
> >>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
> >>>>                               const void *src, size_t len);
> >>>>
> >>>>   /* Mark a descriptor as used. */
> >>>> --
> >>>> 2.25.1
> >>>>
> >> Best,
> >> Shunsuke
> >>
>


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov
  2023-01-11  5:54             ` Jason Wang
@ 2023-01-11  6:19               ` Shunsuke Mie
  -1 siblings, 0 replies; 48+ messages in thread
From: Shunsuke Mie @ 2023-01-11  6:19 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Rusty Russell, kvm, virtualization, netdev,
	linux-kernel


On 2023/01/11 14:54, Jason Wang wrote:
> On Wed, Jan 11, 2023 at 11:27 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>>
>> On 2022/12/28 15:36, Jason Wang wrote:
>>> On Tue, Dec 27, 2022 at 3:06 PM Shunsuke Mie <mie@igel.co.jp> wrote:
>>>> 2022年12月27日(火) 15:04 Jason Wang <jasowang@redhat.com>:
>>>>> On Tue, Dec 27, 2022 at 10:25 AM Shunsuke Mie <mie@igel.co.jp> wrote:
>>>>>> struct vringh_iov is defined to hold userland addresses. However, to use
>>>>>> the common function __vringh_iov(), the vringh_iov is finally converted to
>>>>>> vringh_kiov with a simple cast. Compile-time checks make sure the cast
>>>>>> is layout-safe.
>>>>>>
>>>>>> To simplify the code, this patch removes the struct vringh_iov and unifies
>>>>>> APIs to struct vringh_kiov.
>>>>>>
>>>>>> Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
>>>>> While at this, I wonder if we need to go further, that is, switch to
>>>>> using an iov iterator instead of a vringh customized one.
>>>> I didn't see the iov iterator yet, thank you for informing me.
>>>> Is that iov_iter? https://lwn.net/Articles/625077/
>>> Exactly.
>> I've investigated the iov_iter, vhost and the related APIs. As a result, I
>> think that it is not easy to switch to the iov_iter, because the designs
>> of vhost and vringh are different.
> Yes, but just to make sure we are on the same page, the reason I
> suggest iov_iter for vringh is that vringh itself has a customized
> iter equivalent, e.g. it has iters for kernel, user, or even iotlb. At
> least the kernel and userspace parts could be switched to iov_iter.
> Note that this has nothing to do with vhost.
I agree. It could be switched to use iov_iter, but I think we would need
fundamental changes. There is duplicated code in vhost and vringh to
access the vring, and some helper functions...

Anyway, I'd like to focus on vringh in this patchset. Thank you for your
comments and suggestions!


Best

>> The iov_iter has vring desc info and metadata describing the transfer
>> method. The vhost provides a generic transfer function for the iov_iter.
>> In contrast, vringh_iov just has vring desc info; vringh provides a
>> transfer function for each method.
>>
>> In the future, it would be better to use a common data structure and APIs
>> between vhost and vringh (or merge them completely), but that requires a
>> lot of changes, so I'd like to just organize the data structures in
>> vringh as a first step in this patch.
> That's fine.
>
> Thanks
>
>>
>> Best
>>
>>> Thanks
>>>
>>>>> Thanks
>>>>>
>>>>>> ---
>>>>>>    drivers/vhost/vringh.c | 32 ++++++------------------------
>>>>>>    include/linux/vringh.h | 45 ++++--------------------------------------
>>>>>>    2 files changed, 10 insertions(+), 67 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>>>>> index 828c29306565..aa3cd27d2384 100644
>>>>>> --- a/drivers/vhost/vringh.c
>>>>>> +++ b/drivers/vhost/vringh.c
>>>>>> @@ -691,8 +691,8 @@ EXPORT_SYMBOL(vringh_init_user);
>>>>>>     * calling vringh_iov_cleanup() to release the memory, even on error!
>>>>>>     */
>>>>>>    int vringh_getdesc_user(struct vringh *vrh,
>>>>>> -                       struct vringh_iov *riov,
>>>>>> -                       struct vringh_iov *wiov,
>>>>>> +                       struct vringh_kiov *riov,
>>>>>> +                       struct vringh_kiov *wiov,
>>>>>>                           bool (*getrange)(struct vringh *vrh,
>>>>>>                                            u64 addr, struct vringh_range *r),
>>>>>>                           u16 *head)
>>>>>> @@ -708,26 +708,6 @@ int vringh_getdesc_user(struct vringh *vrh,
>>>>>>           if (err == vrh->vring.num)
>>>>>>                   return 0;
>>>>>>
>>>>>> -       /* We need the layouts to be the identical for this to work */
>>>>>> -       BUILD_BUG_ON(sizeof(struct vringh_kiov) != sizeof(struct vringh_iov));
>>>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, iov) !=
>>>>>> -                    offsetof(struct vringh_iov, iov));
>>>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, i) !=
>>>>>> -                    offsetof(struct vringh_iov, i));
>>>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, used) !=
>>>>>> -                    offsetof(struct vringh_iov, used));
>>>>>> -       BUILD_BUG_ON(offsetof(struct vringh_kiov, max_num) !=
>>>>>> -                    offsetof(struct vringh_iov, max_num));
>>>>>> -       BUILD_BUG_ON(sizeof(struct iovec) != sizeof(struct kvec));
>>>>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_base) !=
>>>>>> -                    offsetof(struct kvec, iov_base));
>>>>>> -       BUILD_BUG_ON(offsetof(struct iovec, iov_len) !=
>>>>>> -                    offsetof(struct kvec, iov_len));
>>>>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_base)
>>>>>> -                    != sizeof(((struct kvec *)NULL)->iov_base));
>>>>>> -       BUILD_BUG_ON(sizeof(((struct iovec *)NULL)->iov_len)
>>>>>> -                    != sizeof(((struct kvec *)NULL)->iov_len));
>>>>>> -
>>>>>>           *head = err;
>>>>>>           err = __vringh_iov(vrh, *head, (struct vringh_kiov *)riov,
>>>>>>                              (struct vringh_kiov *)wiov,
>>>>>> @@ -740,14 +720,14 @@ int vringh_getdesc_user(struct vringh *vrh,
>>>>>>    EXPORT_SYMBOL(vringh_getdesc_user);
>>>>>>
>>>>>>    /**
>>>>>> - * vringh_iov_pull_user - copy bytes from vring_iov.
>>>>>> + * vringh_iov_pull_user - copy bytes from vring_kiov.
>>>>>>     * @riov: the riov as passed to vringh_getdesc_user() (updated as we consume)
>>>>>>     * @dst: the place to copy.
>>>>>>     * @len: the maximum length to copy.
>>>>>>     *
>>>>>>     * Returns the bytes copied <= len or a negative errno.
>>>>>>     */
>>>>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
>>>>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len)
>>>>>>    {
>>>>>>           return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov,
>>>>>>                                  dst, len, xfer_from_user);
>>>>>> @@ -755,14 +735,14 @@ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len)
>>>>>>    EXPORT_SYMBOL(vringh_iov_pull_user);
>>>>>>
>>>>>>    /**
>>>>>> - * vringh_iov_push_user - copy bytes into vring_iov.
>>>>>> + * vringh_iov_push_user - copy bytes into vring_kiov.
>>>>>>     * @wiov: the wiov as passed to vringh_getdesc_user() (updated as we consume)
>>>>>>     * @src: the place to copy from.
>>>>>>     * @len: the maximum length to copy.
>>>>>>     *
>>>>>>     * Returns the bytes copied <= len or a negative errno.
>>>>>>     */
>>>>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
>>>>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
>>>>>>                                const void *src, size_t len)
>>>>>>    {
>>>>>>           return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov,
>>>>>> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
>>>>>> index 1991a02c6431..733d948e8123 100644
>>>>>> --- a/include/linux/vringh.h
>>>>>> +++ b/include/linux/vringh.h
>>>>>> @@ -79,18 +79,6 @@ struct vringh_range {
>>>>>>           u64 offset;
>>>>>>    };
>>>>>>
>>>>>> -/**
>>>>>> - * struct vringh_iov - iovec mangler.
>>>>>> - *
>>>>>> - * Mangles iovec in place, and restores it.
>>>>>> - * Remaining data is iov + i, of used - i elements.
>>>>>> - */
>>>>>> -struct vringh_iov {
>>>>>> -       struct iovec *iov;
>>>>>> -       size_t consumed; /* Within iov[i] */
>>>>>> -       unsigned i, used, max_num;
>>>>>> -};
>>>>>> -
>>>>>>    /**
>>>>>>     * struct vringh_kiov - kvec mangler.
>>>>>>     *
>>>>>> @@ -113,44 +101,19 @@ int vringh_init_user(struct vringh *vrh, u64 features,
>>>>>>                        vring_avail_t __user *avail,
>>>>>>                        vring_used_t __user *used);
>>>>>>
>>>>>> -static inline void vringh_iov_init(struct vringh_iov *iov,
>>>>>> -                                  struct iovec *iovec, unsigned num)
>>>>>> -{
>>>>>> -       iov->used = iov->i = 0;
>>>>>> -       iov->consumed = 0;
>>>>>> -       iov->max_num = num;
>>>>>> -       iov->iov = iovec;
>>>>>> -}
>>>>>> -
>>>>>> -static inline void vringh_iov_reset(struct vringh_iov *iov)
>>>>>> -{
>>>>>> -       iov->iov[iov->i].iov_len += iov->consumed;
>>>>>> -       iov->iov[iov->i].iov_base -= iov->consumed;
>>>>>> -       iov->consumed = 0;
>>>>>> -       iov->i = 0;
>>>>>> -}
>>>>>> -
>>>>>> -static inline void vringh_iov_cleanup(struct vringh_iov *iov)
>>>>>> -{
>>>>>> -       if (iov->max_num & VRINGH_IOV_ALLOCATED)
>>>>>> -               kfree(iov->iov);
>>>>>> -       iov->max_num = iov->used = iov->i = iov->consumed = 0;
>>>>>> -       iov->iov = NULL;
>>>>>> -}
>>>>>> -
>>>>>>    /* Convert a descriptor into iovecs. */
>>>>>>    int vringh_getdesc_user(struct vringh *vrh,
>>>>>> -                       struct vringh_iov *riov,
>>>>>> -                       struct vringh_iov *wiov,
>>>>>> +                       struct vringh_kiov *riov,
>>>>>> +                       struct vringh_kiov *wiov,
>>>>>>                           bool (*getrange)(struct vringh *vrh,
>>>>>>                                            u64 addr, struct vringh_range *r),
>>>>>>                           u16 *head);
>>>>>>
>>>>>>    /* Copy bytes from readable vsg, consuming it (and incrementing wiov->i). */
>>>>>> -ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len);
>>>>>> +ssize_t vringh_iov_pull_user(struct vringh_kiov *riov, void *dst, size_t len);
>>>>>>
>>>>>>    /* Copy bytes into writable vsg, consuming it (and incrementing wiov->i). */
>>>>>> -ssize_t vringh_iov_push_user(struct vringh_iov *wiov,
>>>>>> +ssize_t vringh_iov_push_user(struct vringh_kiov *wiov,
>>>>>>                                const void *src, size_t len);
>>>>>>
>>>>>>    /* Mark a descriptor as used. */
>>>>>> --
>>>>>> 2.25.1
>>>>>>
>>>> Best,
>>>> Shunsuke
>>>>

^ permalink raw reply	[flat|nested] 48+ messages in thread


end of thread, other threads:[~2023-01-11  6:20 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-27  2:25 [RFC PATCH 0/6] Introduce a vringh accessor for IO memory Shunsuke Mie
2022-12-27  2:25 ` Shunsuke Mie
2022-12-27  2:25 ` [RFC PATCH 1/9] vringh: fix a typo in comments for vringh_kiov Shunsuke Mie
2022-12-27  2:25   ` Shunsuke Mie
2022-12-27  2:25 ` [RFC PATCH 2/9] vringh: remove vringh_iov and unite to vringh_kiov Shunsuke Mie
2022-12-27  2:25   ` Shunsuke Mie
2022-12-27  6:04   ` Jason Wang
2022-12-27  6:04     ` Jason Wang
2022-12-27  7:05     ` Michael S. Tsirkin
2022-12-27  7:05       ` Michael S. Tsirkin
2022-12-27  7:13       ` Shunsuke Mie
2022-12-27  7:13         ` Shunsuke Mie
2022-12-27  7:56         ` Michael S. Tsirkin
2022-12-27  7:56           ` Michael S. Tsirkin
2022-12-27  7:57           ` Shunsuke Mie
2022-12-27  7:57             ` Shunsuke Mie
2022-12-27  7:05     ` Shunsuke Mie
2022-12-27  7:05       ` Shunsuke Mie
2022-12-28  6:36       ` Jason Wang
2022-12-28  6:36         ` Jason Wang
2023-01-11  3:26         ` Shunsuke Mie
2023-01-11  3:26           ` Shunsuke Mie
2023-01-11  5:54           ` Jason Wang
2023-01-11  5:54             ` Jason Wang
2023-01-11  6:19             ` Shunsuke Mie
2023-01-11  6:19               ` Shunsuke Mie
2022-12-27  2:25 ` [RFC PATCH 3/9] tools/virtio: convert to new vringh user APIs Shunsuke Mie
2022-12-27  2:25   ` Shunsuke Mie
2022-12-27  2:25 ` [RFC PATCH 4/9] vringh: unify the APIs for all accessors Shunsuke Mie
2022-12-27  2:25   ` Shunsuke Mie
2022-12-27  7:04   ` Michael S. Tsirkin
2022-12-27  7:04     ` Michael S. Tsirkin
2022-12-27  7:49     ` Shunsuke Mie
2022-12-27  7:49       ` Shunsuke Mie
2022-12-27 10:22       ` Shunsuke Mie
2022-12-27 10:22         ` Shunsuke Mie
2022-12-27 14:37         ` Michael S. Tsirkin
2022-12-27 14:37           ` Michael S. Tsirkin
2022-12-28  2:24           ` Shunsuke Mie
2022-12-28  2:24             ` Shunsuke Mie
2022-12-28  7:20             ` Michael S. Tsirkin
2022-12-28  7:20               ` Michael S. Tsirkin
2023-01-11  4:10               ` Shunsuke Mie
2023-01-11  4:10                 ` Shunsuke Mie
2022-12-27  2:25 ` [RFC PATCH 5/9] tools/virtio: convert to use new unified vringh APIs Shunsuke Mie
2022-12-27  2:25   ` Shunsuke Mie
2022-12-27  2:25 ` [RFC PATCH 6/9] caif_virtio: convert to " Shunsuke Mie
2022-12-27  2:25   ` Shunsuke Mie
