linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device
@ 2021-01-28 14:41 Stefano Garzarella
  2021-01-28 14:41 ` [PATCH RFC v2 01/10] vdpa_sim: use iova module to allocate IOVA addresses Stefano Garzarella
                   ` (9 more replies)
  0 siblings, 10 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

v1: https://lore.kernel.org/lkml/93f207c0-61e6-3696-f218-e7d7ea9a7c93@redhat.com/

This series is the second part of the v1 linked above. The first part with
refactoring of vdpa_sim has already been merged.

The patches are based on Max Gurtovoy's work and extend the block simulator to
have a ramdisk behaviour.

As mentioned in the v1 there was 2 issues and I fixed them in this series:
1. The identical mapping in the IOMMU used until now in vdpa_sim created issues
   when mapping different virtual pages with the same physical address.
   Fixed by patch "vdpa_sim: use iova module to allocate IOVA addresses"

2. There was a race accessing the IOMMU between the vdpasim_blk_work() and the
   device driver that map/unmap DMA regions. Fixed by patch "vringh: add
   'iotlb_lock' to synchronize iotlb accesses"

Since this series is still a RFC, I used the Xie's patch as is to allow
vhost-vdpa to use block devices, but I'll rebase when he splits it into
multiple patches.

The series also includes small fixes for vdpa_sim that I discovered while
implementing the block simulator.

Thanks for your feedback,
Stefano

Max Gurtovoy (1):
  vdpa: add vdpa simulator for block device

Stefano Garzarella (8):
  vdpa_sim: use iova module to allocate IOVA addresses
  vringh: add 'iotlb_lock' to synchronize iotlb accesses
  vringh: reset kiov 'consumed' field in __vringh_iov()
  vringh: implement vringh_kiov_advance()
  vringh: add vringh_kiov_length() helper
  vdpa_sim: cleanup kiovs in vdpasim_free()
  vdpa_sim_blk: implement ramdisk behaviour
  vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID

Xie Yongji (1):
  vdpa: Remove the restriction that only supports virtio-net devices

 drivers/vdpa/vdpa_sim/vdpa_sim.h     |   2 +
 include/linux/vringh.h               |  19 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.c     | 118 +++++++----
 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 288 +++++++++++++++++++++++++++
 drivers/vhost/vdpa.c                 |  28 +--
 drivers/vhost/vringh.c               |  54 +++--
 drivers/vdpa/Kconfig                 |   8 +
 drivers/vdpa/vdpa_sim/Makefile       |   1 +
 8 files changed, 433 insertions(+), 85 deletions(-)
 create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c

-- 
2.29.2


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 01/10] vdpa_sim: use iova module to allocate IOVA addresses
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-01-28 14:41 ` [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses Stefano Garzarella
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

The identical mapping used until now created issues when mapping
different virtual pages with the same physical address.
To solve this issue, we can use the iova module, to handle the IOVA
allocation.
For simplicity we use an IOVA allocator with byte granularity.

We add two new functions, vdpasim_map_range() and vdpasim_unmap_range(),
to handle the IOVA allocation and the registration into the IOMMU/IOTLB.
These functions are used by dma_map_ops callbacks.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
v2:
- used ULONG_MAX instead of ~0UL [Jason]
- fixed typos in comment and patch description [Jason]
---
 drivers/vdpa/vdpa_sim/vdpa_sim.h |   2 +
 drivers/vdpa/vdpa_sim/vdpa_sim.c | 108 +++++++++++++++++++------------
 drivers/vdpa/Kconfig             |   1 +
 3 files changed, 69 insertions(+), 42 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.h b/drivers/vdpa/vdpa_sim/vdpa_sim.h
index 6d75444f9948..cd58e888bcf3 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.h
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.h
@@ -6,6 +6,7 @@
 #ifndef _VDPA_SIM_H
 #define _VDPA_SIM_H
 
+#include <linux/iova.h>
 #include <linux/vringh.h>
 #include <linux/vdpa.h>
 #include <linux/virtio_byteorder.h>
@@ -57,6 +58,7 @@ struct vdpasim {
 	/* virtio config according to device type */
 	void *config;
 	struct vhost_iotlb *iommu;
+	struct iova_domain iova;
 	void *buffer;
 	u32 status;
 	u32 generation;
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index d5942842432d..2183a833fcf4 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -17,6 +17,7 @@
 #include <linux/vringh.h>
 #include <linux/vdpa.h>
 #include <linux/vhost_iotlb.h>
+#include <linux/iova.h>
 
 #include "vdpa_sim.h"
 
@@ -128,30 +129,57 @@ static int dir_to_perm(enum dma_data_direction dir)
 	return perm;
 }
 
+static dma_addr_t vdpasim_map_range(struct vdpasim *vdpasim, phys_addr_t paddr,
+				    size_t size, unsigned int perm)
+{
+	struct iova *iova;
+	dma_addr_t dma_addr;
+	int ret;
+
+	/* We set the limit_pfn to the maximum (ULONG_MAX - 1) */
+	iova = alloc_iova(&vdpasim->iova, size, ULONG_MAX - 1, true);
+	if (!iova)
+		return DMA_MAPPING_ERROR;
+
+	dma_addr = iova_dma_addr(&vdpasim->iova, iova);
+
+	spin_lock(&vdpasim->iommu_lock);
+	ret = vhost_iotlb_add_range(vdpasim->iommu, (u64)dma_addr,
+				    (u64)dma_addr + size - 1, (u64)paddr, perm);
+	spin_unlock(&vdpasim->iommu_lock);
+
+	if (ret) {
+		__free_iova(&vdpasim->iova, iova);
+		return DMA_MAPPING_ERROR;
+	}
+
+	return dma_addr;
+}
+
+static void vdpasim_unmap_range(struct vdpasim *vdpasim, dma_addr_t dma_addr,
+				size_t size)
+{
+	spin_lock(&vdpasim->iommu_lock);
+	vhost_iotlb_del_range(vdpasim->iommu, (u64)dma_addr,
+			      (u64)dma_addr + size - 1);
+	spin_unlock(&vdpasim->iommu_lock);
+
+	free_iova(&vdpasim->iova, iova_pfn(&vdpasim->iova, dma_addr));
+}
+
 static dma_addr_t vdpasim_map_page(struct device *dev, struct page *page,
 				   unsigned long offset, size_t size,
 				   enum dma_data_direction dir,
 				   unsigned long attrs)
 {
 	struct vdpasim *vdpasim = dev_to_sim(dev);
-	struct vhost_iotlb *iommu = vdpasim->iommu;
-	u64 pa = (page_to_pfn(page) << PAGE_SHIFT) + offset;
-	int ret, perm = dir_to_perm(dir);
+	phys_addr_t paddr = page_to_phys(page) + offset;
+	int perm = dir_to_perm(dir);
 
 	if (perm < 0)
 		return DMA_MAPPING_ERROR;
 
-	/* For simplicity, use identical mapping to avoid e.g iova
-	 * allocator.
-	 */
-	spin_lock(&vdpasim->iommu_lock);
-	ret = vhost_iotlb_add_range(iommu, pa, pa + size - 1,
-				    pa, dir_to_perm(dir));
-	spin_unlock(&vdpasim->iommu_lock);
-	if (ret)
-		return DMA_MAPPING_ERROR;
-
-	return (dma_addr_t)(pa);
+	return vdpasim_map_range(vdpasim, paddr, size, perm);
 }
 
 static void vdpasim_unmap_page(struct device *dev, dma_addr_t dma_addr,
@@ -159,12 +187,8 @@ static void vdpasim_unmap_page(struct device *dev, dma_addr_t dma_addr,
 			       unsigned long attrs)
 {
 	struct vdpasim *vdpasim = dev_to_sim(dev);
-	struct vhost_iotlb *iommu = vdpasim->iommu;
 
-	spin_lock(&vdpasim->iommu_lock);
-	vhost_iotlb_del_range(iommu, (u64)dma_addr,
-			      (u64)dma_addr + size - 1);
-	spin_unlock(&vdpasim->iommu_lock);
+	vdpasim_unmap_range(vdpasim, dma_addr, size);
 }
 
 static void *vdpasim_alloc_coherent(struct device *dev, size_t size,
@@ -172,27 +196,22 @@ static void *vdpasim_alloc_coherent(struct device *dev, size_t size,
 				    unsigned long attrs)
 {
 	struct vdpasim *vdpasim = dev_to_sim(dev);
-	struct vhost_iotlb *iommu = vdpasim->iommu;
-	void *addr = kmalloc(size, flag);
-	int ret;
+	phys_addr_t paddr;
+	void *addr;
 
-	spin_lock(&vdpasim->iommu_lock);
+	addr = kmalloc(size, flag);
 	if (!addr) {
 		*dma_addr = DMA_MAPPING_ERROR;
-	} else {
-		u64 pa = virt_to_phys(addr);
-
-		ret = vhost_iotlb_add_range(iommu, (u64)pa,
-					    (u64)pa + size - 1,
-					    pa, VHOST_MAP_RW);
-		if (ret) {
-			*dma_addr = DMA_MAPPING_ERROR;
-			kfree(addr);
-			addr = NULL;
-		} else
-			*dma_addr = (dma_addr_t)pa;
+		return NULL;
+	}
+
+	paddr = virt_to_phys(addr);
+
+	*dma_addr = vdpasim_map_range(vdpasim, paddr, size, VHOST_MAP_RW);
+	if (*dma_addr == DMA_MAPPING_ERROR) {
+		kfree(addr);
+		return NULL;
 	}
-	spin_unlock(&vdpasim->iommu_lock);
 
 	return addr;
 }
@@ -202,14 +221,10 @@ static void vdpasim_free_coherent(struct device *dev, size_t size,
 				  unsigned long attrs)
 {
 	struct vdpasim *vdpasim = dev_to_sim(dev);
-	struct vhost_iotlb *iommu = vdpasim->iommu;
 
-	spin_lock(&vdpasim->iommu_lock);
-	vhost_iotlb_del_range(iommu, (u64)dma_addr,
-			      (u64)dma_addr + size - 1);
-	spin_unlock(&vdpasim->iommu_lock);
+	vdpasim_unmap_range(vdpasim, dma_addr, size);
 
-	kfree(phys_to_virt((uintptr_t)dma_addr));
+	kfree(vaddr);
 }
 
 static const struct dma_map_ops vdpasim_dma_ops = {
@@ -271,6 +286,13 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 	for (i = 0; i < dev_attr->nvqs; i++)
 		vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
 
+	ret = iova_cache_get();
+	if (ret)
+		goto err_iommu;
+
+	/* For simplicity we use an IOVA allocator with byte granularity */
+	init_iova_domain(&vdpasim->iova, 1, 0);
+
 	vdpasim->vdpa.dma_dev = dev;
 
 	return vdpasim;
@@ -541,6 +563,8 @@ static void vdpasim_free(struct vdpa_device *vdpa)
 	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
 
 	cancel_work_sync(&vdpasim->work);
+	put_iova_domain(&vdpasim->iova);
+	iova_cache_put();
 	kvfree(vdpasim->buffer);
 	if (vdpasim->iommu)
 		vhost_iotlb_free(vdpasim->iommu);
diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index ffd1e098bfd2..21a23500f430 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -14,6 +14,7 @@ config VDPA_SIM
 	depends on RUNTIME_TESTING_MENU && HAS_DMA
 	select DMA_OPS
 	select VHOST_RING
+	select IOMMU_IOVA
 	help
 	  Enable this module to support vDPA device simulators. These devices
 	  are used for testing, prototyping and development of vDPA.
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
  2021-01-28 14:41 ` [PATCH RFC v2 01/10] vdpa_sim: use iova module to allocate IOVA addresses Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-01-29  7:43   ` Jason Wang
  2021-01-28 14:41 ` [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov() Stefano Garzarella
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

Usually iotlb accesses are synchronized with a spinlock.
Let's request it as a new parameter in vringh_set_iotlb() and
hold it when we navigate the iotlb in iotlb_translate() to avoid
race conditions with any new additions/deletions of ranges from
the ioltb.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 include/linux/vringh.h           | 6 +++++-
 drivers/vdpa/vdpa_sim/vdpa_sim.c | 3 ++-
 drivers/vhost/vringh.c           | 9 ++++++++-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 59bd50f99291..9c077863c8f6 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -46,6 +46,9 @@ struct vringh {
 	/* IOTLB for this vring */
 	struct vhost_iotlb *iotlb;
 
+	/* spinlock to synchronize IOTLB accesses */
+	spinlock_t *iotlb_lock;
+
 	/* The function to call to notify the guest about added buffers */
 	void (*notify)(struct vringh *);
 };
@@ -258,7 +261,8 @@ static inline __virtio64 cpu_to_vringh64(const struct vringh *vrh, u64 val)
 
 #if IS_REACHABLE(CONFIG_VHOST_IOTLB)
 
-void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb);
+void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
+		      spinlock_t *iotlb_lock);
 
 int vringh_init_iotlb(struct vringh *vrh, u64 features,
 		      unsigned int num, bool weak_barriers,
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 2183a833fcf4..53238989713d 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -284,7 +284,8 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
 		goto err_iommu;
 
 	for (i = 0; i < dev_attr->nvqs; i++)
-		vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
+		vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
+				 &vdpasim->iommu_lock);
 
 	ret = iova_cache_get();
 	if (ret)
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index 85d85faba058..f68122705719 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -1074,6 +1074,8 @@ static int iotlb_translate(const struct vringh *vrh,
 	int ret = 0;
 	u64 s = 0;
 
+	spin_lock(vrh->iotlb_lock);
+
 	while (len > s) {
 		u64 size, pa, pfn;
 
@@ -1103,6 +1105,8 @@ static int iotlb_translate(const struct vringh *vrh,
 		++ret;
 	}
 
+	spin_unlock(vrh->iotlb_lock);
+
 	return ret;
 }
 
@@ -1262,10 +1266,13 @@ EXPORT_SYMBOL(vringh_init_iotlb);
  * vringh_set_iotlb - initialize a vringh for a ring with IOTLB.
  * @vrh: the vring
  * @iotlb: iotlb associated with this vring
+ * @iotlb_lock: spinlock to synchronize the iotlb accesses
  */
-void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb)
+void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
+		      spinlock_t *iotlb_lock)
 {
 	vrh->iotlb = iotlb;
+	vrh->iotlb_lock = iotlb_lock;
 }
 EXPORT_SYMBOL(vringh_set_iotlb);
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov()
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
  2021-01-28 14:41 ` [PATCH RFC v2 01/10] vdpa_sim: use iova module to allocate IOVA addresses Stefano Garzarella
  2021-01-28 14:41 ` [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-02-01  5:40   ` Jason Wang
  2021-01-28 14:41 ` [PATCH RFC v2 04/10] vringh: implement vringh_kiov_advance() Stefano Garzarella
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

__vringh_iov() overwrites the contents of riov and wiov, in fact it
resets the 'i' and 'used' fields, but also the consumed field should
be reset to avoid an inconsistent state.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 drivers/vhost/vringh.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index f68122705719..bee63d68201a 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -290,9 +290,9 @@ __vringh_iov(struct vringh *vrh, u16 i,
 		return -EINVAL;
 
 	if (riov)
-		riov->i = riov->used = 0;
+		riov->i = riov->used = riov->consumed = 0;
 	if (wiov)
-		wiov->i = wiov->used = 0;
+		wiov->i = wiov->used = wiov->consumed = 0;
 
 	for (;;) {
 		void *addr;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 04/10] vringh: implement vringh_kiov_advance()
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (2 preceding siblings ...)
  2021-01-28 14:41 ` [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov() Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-02-01  5:43   ` Jason Wang
  2021-01-28 14:41 ` [PATCH RFC v2 05/10] vringh: add vringh_kiov_length() helper Stefano Garzarella
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

In some cases, it may be useful to provide a way to skip a number
of bytes in a vringh_kiov.

Let's implement vringh_kiov_advance() for this purpose, reusing the
code from vringh_iov_xfer().
We replace that code calling the new vringh_kiov_advance().

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 include/linux/vringh.h |  2 ++
 drivers/vhost/vringh.c | 41 +++++++++++++++++++++++++++++------------
 2 files changed, 31 insertions(+), 12 deletions(-)

diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 9c077863c8f6..755211ebd195 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -199,6 +199,8 @@ static inline void vringh_kiov_cleanup(struct vringh_kiov *kiov)
 	kiov->iov = NULL;
 }
 
+void vringh_kiov_advance(struct vringh_kiov *kiov, size_t len);
+
 int vringh_getdesc_kern(struct vringh *vrh,
 			struct vringh_kiov *riov,
 			struct vringh_kiov *wiov,
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index bee63d68201a..4d800e4f31ca 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -75,6 +75,34 @@ static inline int __vringh_get_head(const struct vringh *vrh,
 	return head;
 }
 
+/**
+ * vringh_kiov_advance - skip bytes from vring_kiov
+ * @iov: an iov passed to vringh_getdesc_*() (updated as we consume)
+ * @len: the maximum length to advance
+ */
+void vringh_kiov_advance(struct vringh_kiov *iov, size_t len)
+{
+	while (len && iov->i < iov->used) {
+		size_t partlen = min(iov->iov[iov->i].iov_len, len);
+
+		iov->consumed += partlen;
+		iov->iov[iov->i].iov_len -= partlen;
+		iov->iov[iov->i].iov_base += partlen;
+
+		if (!iov->iov[iov->i].iov_len) {
+			/* Fix up old iov element then increment. */
+			iov->iov[iov->i].iov_len = iov->consumed;
+			iov->iov[iov->i].iov_base -= iov->consumed;
+
+			iov->consumed = 0;
+			iov->i++;
+		}
+
+		len -= partlen;
+	}
+}
+EXPORT_SYMBOL(vringh_kiov_advance);
+
 /* Copy some bytes to/from the iovec.  Returns num copied. */
 static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
 				      struct vringh_kiov *iov,
@@ -95,19 +123,8 @@ static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
 		done += partlen;
 		len -= partlen;
 		ptr += partlen;
-		iov->consumed += partlen;
-		iov->iov[iov->i].iov_len -= partlen;
-		iov->iov[iov->i].iov_base += partlen;
 
-		if (!iov->iov[iov->i].iov_len) {
-			/* Fix up old iov element then increment. */
-			iov->iov[iov->i].iov_len = iov->consumed;
-			iov->iov[iov->i].iov_base -= iov->consumed;
-
-			
-			iov->consumed = 0;
-			iov->i++;
-		}
+		vringh_kiov_advance(iov, partlen);
 	}
 	return done;
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 05/10] vringh: add vringh_kiov_length() helper
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (3 preceding siblings ...)
  2021-01-28 14:41 ` [PATCH RFC v2 04/10] vringh: implement vringh_kiov_advance() Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-02-01  5:46   ` Jason Wang
  2021-01-28 14:41 ` [PATCH RFC v2 06/10] vdpa_sim: cleanup kiovs in vdpasim_free() Stefano Garzarella
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

This new helper returns the total number of bytes covered by
a vringh_kiov.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 include/linux/vringh.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/vringh.h b/include/linux/vringh.h
index 755211ebd195..84db7b8f912f 100644
--- a/include/linux/vringh.h
+++ b/include/linux/vringh.h
@@ -199,6 +199,17 @@ static inline void vringh_kiov_cleanup(struct vringh_kiov *kiov)
 	kiov->iov = NULL;
 }
 
+static inline size_t vringh_kiov_length(struct vringh_kiov *kiov)
+{
+	size_t len = 0;
+	int i;
+
+	for (i = kiov->i; i < kiov->used; i++)
+		len += kiov->iov[i].iov_len;
+
+	return len;
+}
+
 void vringh_kiov_advance(struct vringh_kiov *kiov, size_t len);
 
 int vringh_getdesc_kern(struct vringh *vrh,
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 06/10] vdpa_sim: cleanup kiovs in vdpasim_free()
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (4 preceding siblings ...)
  2021-01-28 14:41 ` [PATCH RFC v2 05/10] vringh: add vringh_kiov_length() helper Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-02-01  5:47   ` Jason Wang
  2021-01-28 14:41 ` [PATCH RFC v2 07/10] vdpa: Remove the restriction that only supports virtio-net devices Stefano Garzarella
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

vringh_getdesc_iotlb() allocates memory to store the kvec, that
is freed with vringh_kiov_cleanup().

vringh_getdesc_iotlb() is able to reuse a kvec previously allocated,
so in order to avoid to allocate the kvec for each request, we are
not calling vringh_kiov_cleanup() when we finished to handle a
request, but we should call it when we free the entire device.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 drivers/vdpa/vdpa_sim/vdpa_sim.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 53238989713d..a7aeb5d01c3e 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -562,8 +562,15 @@ static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
 static void vdpasim_free(struct vdpa_device *vdpa)
 {
 	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+	int i;
 
 	cancel_work_sync(&vdpasim->work);
+
+	for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
+		vringh_kiov_cleanup(&vdpasim->vqs[i].out_iov);
+		vringh_kiov_cleanup(&vdpasim->vqs[i].in_iov);
+	}
+
 	put_iova_domain(&vdpasim->iova);
 	iova_cache_put();
 	kvfree(vdpasim->buffer);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 07/10] vdpa: Remove the restriction that only supports virtio-net devices
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (5 preceding siblings ...)
  2021-01-28 14:41 ` [PATCH RFC v2 06/10] vdpa_sim: cleanup kiovs in vdpasim_free() Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-01-28 14:41 ` [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

From: Xie Yongji <xieyongji@bytedance.com>

With VDUSE, we should be able to support all kinds of virtio devices.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
---
 drivers/vhost/vdpa.c | 28 ++--------------------------
 1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index ef688c8c0e0e..28624cbfe6dd 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -185,26 +185,6 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, u8 __user *statusp)
 	return 0;
 }
 
-static int vhost_vdpa_config_validate(struct vhost_vdpa *v,
-				      struct vhost_vdpa_config *c)
-{
-	long size = 0;
-
-	switch (v->virtio_id) {
-	case VIRTIO_ID_NET:
-		size = sizeof(struct virtio_net_config);
-		break;
-	}
-
-	if (c->len == 0)
-		return -EINVAL;
-
-	if (c->len > size - c->off)
-		return -E2BIG;
-
-	return 0;
-}
-
 static long vhost_vdpa_get_config(struct vhost_vdpa *v,
 				  struct vhost_vdpa_config __user *c)
 {
@@ -215,7 +195,7 @@ static long vhost_vdpa_get_config(struct vhost_vdpa *v,
 
 	if (copy_from_user(&config, c, size))
 		return -EFAULT;
-	if (vhost_vdpa_config_validate(v, &config))
+	if (config.len == 0)
 		return -EINVAL;
 	buf = kvzalloc(config.len, GFP_KERNEL);
 	if (!buf)
@@ -243,7 +223,7 @@ static long vhost_vdpa_set_config(struct vhost_vdpa *v,
 
 	if (copy_from_user(&config, c, size))
 		return -EFAULT;
-	if (vhost_vdpa_config_validate(v, &config))
+	if (config.len == 0)
 		return -EINVAL;
 
 	buf = vmemdup_user(c->buf, config.len);
@@ -1021,10 +1001,6 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
 	int minor;
 	int r;
 
-	/* Currently, we only accept the network devices. */
-	if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
-		return -ENOTSUPP;
-
 	v = kzalloc(sizeof(*v), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
 	if (!v)
 		return -ENOMEM;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (6 preceding siblings ...)
  2021-01-28 14:41 ` [PATCH RFC v2 07/10] vdpa: Remove the restriction that only supports virtio-net devices Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-01-31 15:31   ` Max Gurtovoy
                     ` (2 more replies)
  2021-01-28 14:41 ` [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour Stefano Garzarella
  2021-01-28 14:41 ` [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID Stefano Garzarella
  9 siblings, 3 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

From: Max Gurtovoy <mgurtovoy@nvidia.com>

This will allow running vDPA for virtio block protocol.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
[sgarzare: various cleanups/fixes]
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
v2:
- rebased on top of other changes (dev_attr, get_config(), notify(), etc.)
- memset to 0 the config structure in vdpasim_blk_get_config()
- used vdpasim pointer in vdpasim_blk_get_config()

v1:
- Removed unused headers
- Used cpu_to_vdpasim*() to store config fields
- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
  option can not depend on other [Jason]
- Start with a single queue for now [Jason]
- Add comments to memory barriers
---
 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 145 +++++++++++++++++++++++++++
 drivers/vdpa/Kconfig                 |   7 ++
 drivers/vdpa/vdpa_sim/Makefile       |   1 +
 3 files changed, 153 insertions(+)
 create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
new file mode 100644
index 000000000000..999f9ca0b628
--- /dev/null
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA simulator for block device.
+ *
+ * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/vringh.h>
+#include <linux/vdpa.h>
+#include <uapi/linux/virtio_blk.h>
+
+#include "vdpa_sim.h"
+
+#define DRV_VERSION  "0.1"
+#define DRV_AUTHOR   "Max Gurtovoy <mgurtovoy@nvidia.com>"
+#define DRV_DESC     "vDPA Device Simulator for block device"
+#define DRV_LICENSE  "GPL v2"
+
+#define VDPASIM_BLK_FEATURES	(VDPASIM_FEATURES | \
+				 (1ULL << VIRTIO_BLK_F_SIZE_MAX) | \
+				 (1ULL << VIRTIO_BLK_F_SEG_MAX)  | \
+				 (1ULL << VIRTIO_BLK_F_BLK_SIZE) | \
+				 (1ULL << VIRTIO_BLK_F_TOPOLOGY) | \
+				 (1ULL << VIRTIO_BLK_F_MQ))
+
+#define VDPASIM_BLK_CAPACITY	0x40000
+#define VDPASIM_BLK_SIZE_MAX	0x1000
+#define VDPASIM_BLK_SEG_MAX	32
+#define VDPASIM_BLK_VQ_NUM	1
+
+static struct vdpasim *vdpasim_blk_dev;
+
+static void vdpasim_blk_work(struct work_struct *work)
+{
+	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
+	u8 status = VIRTIO_BLK_S_OK;
+	int i;
+
+	spin_lock(&vdpasim->lock);
+
+	if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
+		goto out;
+
+	for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
+		struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
+
+		if (!vq->ready)
+			continue;
+
+		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
+					    &vq->in_iov, &vq->head,
+					    GFP_ATOMIC) > 0) {
+			int write;
+
+			vq->in_iov.i = vq->in_iov.used - 1;
+			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
+						      &status, 1);
+			if (write <= 0)
+				break;
+
+			/* Make sure data is wrote before advancing index */
+			smp_wmb();
+
+			vringh_complete_iotlb(&vq->vring, vq->head, write);
+
+			/* Make sure used is visible before rasing the interrupt. */
+			smp_wmb();
+
+			local_bh_disable();
+			if (vringh_need_notify_iotlb(&vq->vring) > 0)
+				vringh_notify(&vq->vring);
+			local_bh_enable();
+		}
+	}
+out:
+	spin_unlock(&vdpasim->lock);
+}
+
+static void vdpasim_blk_get_config(struct vdpasim *vdpasim, void *config)
+{
+	struct virtio_blk_config *blk_config =
+		(struct virtio_blk_config *)config;
+
+	memset(config, 0, sizeof(struct virtio_blk_config));
+
+	blk_config->capacity = cpu_to_vdpasim64(vdpasim, VDPASIM_BLK_CAPACITY);
+	blk_config->size_max = cpu_to_vdpasim32(vdpasim, VDPASIM_BLK_SIZE_MAX);
+	blk_config->seg_max = cpu_to_vdpasim32(vdpasim, VDPASIM_BLK_SEG_MAX);
+	blk_config->num_queues = cpu_to_vdpasim16(vdpasim, VDPASIM_BLK_VQ_NUM);
+	blk_config->min_io_size = cpu_to_vdpasim16(vdpasim, 1);
+	blk_config->opt_io_size = cpu_to_vdpasim32(vdpasim, 1);
+	blk_config->blk_size = cpu_to_vdpasim32(vdpasim, SECTOR_SIZE);
+}
+
+static int __init vdpasim_blk_init(void)
+{
+	struct vdpasim_dev_attr dev_attr = {};
+	int ret;
+
+	dev_attr.id = VIRTIO_ID_BLOCK;
+	dev_attr.supported_features = VDPASIM_BLK_FEATURES;
+	dev_attr.nvqs = VDPASIM_BLK_VQ_NUM;
+	dev_attr.config_size = sizeof(struct virtio_blk_config);
+	dev_attr.get_config = vdpasim_blk_get_config;
+	dev_attr.work_fn = vdpasim_blk_work;
+	dev_attr.buffer_size = PAGE_SIZE;
+
+	vdpasim_blk_dev = vdpasim_create(&dev_attr);
+	if (IS_ERR(vdpasim_blk_dev)) {
+		ret = PTR_ERR(vdpasim_blk_dev);
+		goto out;
+	}
+
+	ret = vdpa_register_device(&vdpasim_blk_dev->vdpa);
+	if (ret)
+		goto put_dev;
+
+	return 0;
+
+put_dev:
+	put_device(&vdpasim_blk_dev->vdpa.dev);
+out:
+	return ret;
+}
+
+static void __exit vdpasim_blk_exit(void)
+{
+	struct vdpa_device *vdpa = &vdpasim_blk_dev->vdpa;
+
+	vdpa_unregister_device(vdpa);
+}
+
+module_init(vdpasim_blk_init)
+module_exit(vdpasim_blk_exit)
+
+MODULE_VERSION(DRV_VERSION);
+MODULE_LICENSE(DRV_LICENSE);
+MODULE_AUTHOR(DRV_AUTHOR);
+MODULE_DESCRIPTION(DRV_DESC);
diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index 21a23500f430..b8bd92cf04f9 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -26,6 +26,13 @@ config VDPA_SIM_NET
 	help
 	  vDPA networking device simulator which loops TX traffic back to RX.
 
+config VDPA_SIM_BLOCK
+	tristate "vDPA simulator for block device"
+	depends on VDPA_SIM
+	help
+	  vDPA block device simulator which terminates IO request in a
+	  memory buffer.
+
 config IFCVF
 	tristate "Intel IFC VF vDPA driver"
 	depends on PCI_MSI
diff --git a/drivers/vdpa/vdpa_sim/Makefile b/drivers/vdpa/vdpa_sim/Makefile
index 79d4536d347e..d458103302f2 100644
--- a/drivers/vdpa/vdpa_sim/Makefile
+++ b/drivers/vdpa/vdpa_sim/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
 obj-$(CONFIG_VDPA_SIM_NET) += vdpa_sim_net.o
+obj-$(CONFIG_VDPA_SIM_BLOCK) += vdpa_sim_blk.o
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (7 preceding siblings ...)
  2021-01-28 14:41 ` [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-02-01  6:03   ` Jason Wang
  2021-02-02  9:41   ` Stefan Hajnoczi
  2021-01-28 14:41 ` [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID Stefano Garzarella
  9 siblings, 2 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

The previous implementation wrote only the status of each request.
This patch implements a more accurate block device simulator,
providing a ramdisk-like behavior.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
v2:
- used %zd %zx to print size_t and ssize_t variables in dev_err()
- removed unnecessary new line [Jason]
- moved VIRTIO_BLK_T_GET_ID in another patch [Jason]
- used push/pull instead of write/read terminology
- added vdpasim_blk_check_range() to avoid overflows [Stefan]
- use vdpasim*_to_cpu instead of le*_to_cpu
- used vringh_kiov_length() helper [Jason]
---
 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 164 ++++++++++++++++++++++++---
 1 file changed, 146 insertions(+), 18 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
index 999f9ca0b628..fc47e8320358 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -3,6 +3,7 @@
  * VDPA simulator for block device.
  *
  * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2021, Red Hat Inc. All rights reserved.
  *
  */
 
@@ -13,6 +14,7 @@
 #include <linux/sched.h>
 #include <linux/vringh.h>
 #include <linux/vdpa.h>
+#include <linux/blkdev.h>
 #include <uapi/linux/virtio_blk.h>
 
 #include "vdpa_sim.h"
@@ -36,10 +38,151 @@
 
 static struct vdpasim *vdpasim_blk_dev;
 
+static bool vdpasim_blk_check_range(u64 start_sector, size_t range_size)
+{
+	u64 range_sectors = range_size >> SECTOR_SHIFT;
+
+	if (range_size > VDPASIM_BLK_SIZE_MAX * VDPASIM_BLK_SEG_MAX)
+		return false;
+
+	if (start_sector > VDPASIM_BLK_CAPACITY)
+		return false;
+
+	if (range_sectors > VDPASIM_BLK_CAPACITY - start_sector)
+		return false;
+
+	return true;
+}
+
+/* Returns 'true' if the request is handled (with or without an I/O error)
+ * and the status is correctly written in the last byte of the 'in iov',
+ * 'false' otherwise.
+ */
+static bool vdpasim_blk_handle_req(struct vdpasim *vdpasim,
+				   struct vdpasim_virtqueue *vq)
+{
+	size_t pushed = 0, to_pull, to_push;
+	struct virtio_blk_outhdr hdr;
+	ssize_t bytes;
+	loff_t offset;
+	u64 sector;
+	u8 status;
+	u32 type;
+	int ret;
+
+	ret = vringh_getdesc_iotlb(&vq->vring, &vq->out_iov, &vq->in_iov,
+				   &vq->head, GFP_ATOMIC);
+	if (ret != 1)
+		return false;
+
+	if (vq->out_iov.used < 1 || vq->in_iov.used < 1) {
+		dev_err(&vdpasim->vdpa.dev, "missing headers - out_iov: %u in_iov %u\n",
+			vq->out_iov.used, vq->in_iov.used);
+		return false;
+	}
+
+	if (vq->in_iov.iov[vq->in_iov.used - 1].iov_len < 1) {
+		dev_err(&vdpasim->vdpa.dev, "request in header too short\n");
+		return false;
+	}
+
+	/* The last byte is the status and we checked if the last iov has
+	 * enough room for it.
+	 */
+	to_push = vringh_kiov_length(&vq->in_iov) - 1;
+
+	to_pull = vringh_kiov_length(&vq->out_iov);
+
+	bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->out_iov, &hdr,
+				      sizeof(hdr));
+	if (bytes != sizeof(hdr)) {
+		dev_err(&vdpasim->vdpa.dev, "request out header too short\n");
+		return false;
+	}
+
+	to_pull -= bytes;
+
+	type = vdpasim32_to_cpu(vdpasim, hdr.type);
+	sector = vdpasim64_to_cpu(vdpasim, hdr.sector);
+	offset = sector << SECTOR_SHIFT;
+	status = VIRTIO_BLK_S_OK;
+
+	switch (type) {
+	case VIRTIO_BLK_T_IN:
+		if (!vdpasim_blk_check_range(sector, to_push)) {
+			dev_err(&vdpasim->vdpa.dev,
+				"reading over the capacity - offset: 0x%llx len: 0x%zx\n",
+				offset, to_push);
+			status = VIRTIO_BLK_S_IOERR;
+			break;
+		}
+
+		bytes = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
+					      vdpasim->buffer + offset,
+					      to_push);
+		if (bytes < 0) {
+			dev_err(&vdpasim->vdpa.dev,
+				"vringh_iov_push_iotlb() error: %zd offset: 0x%llx len: 0x%zx\n",
+				bytes, offset, to_push);
+			status = VIRTIO_BLK_S_IOERR;
+			break;
+		}
+
+		pushed += bytes;
+		break;
+
+	case VIRTIO_BLK_T_OUT:
+		if (!vdpasim_blk_check_range(sector, to_pull)) {
+			dev_err(&vdpasim->vdpa.dev,
+				"writing over the capacity - offset: 0x%llx len: 0x%zx\n",
+				offset, to_pull);
+			status = VIRTIO_BLK_S_IOERR;
+			break;
+		}
+
+		bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->out_iov,
+					      vdpasim->buffer + offset,
+					      to_pull);
+		if (bytes < 0) {
+			dev_err(&vdpasim->vdpa.dev,
+				"vringh_iov_pull_iotlb() error: %zd offset: 0x%llx len: 0x%zx\n",
+				bytes, offset, to_pull);
+			status = VIRTIO_BLK_S_IOERR;
+			break;
+		}
+		break;
+
+	default:
+		dev_warn(&vdpasim->vdpa.dev,
+			 "Unsupported request type %d\n", type);
+		status = VIRTIO_BLK_S_IOERR;
+		break;
+	}
+
+	/* If some operations fail, we need to skip the remaining bytes
+	 * to put the status in the last byte
+	 */
+	if (to_push - pushed > 0)
+		vringh_kiov_advance(&vq->in_iov, to_push - pushed);
+
+	/* Last byte is the status */
+	bytes = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov, &status, 1);
+	if (bytes != 1)
+		return false;
+
+	pushed += bytes;
+
+	/* Make sure data is wrote before advancing index */
+	smp_wmb();
+
+	vringh_complete_iotlb(&vq->vring, vq->head, pushed);
+
+	return true;
+}
+
 static void vdpasim_blk_work(struct work_struct *work)
 {
 	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
-	u8 status = VIRTIO_BLK_S_OK;
 	int i;
 
 	spin_lock(&vdpasim->lock);
@@ -53,22 +196,7 @@ static void vdpasim_blk_work(struct work_struct *work)
 		if (!vq->ready)
 			continue;
 
-		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
-					    &vq->in_iov, &vq->head,
-					    GFP_ATOMIC) > 0) {
-			int write;
-
-			vq->in_iov.i = vq->in_iov.used - 1;
-			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
-						      &status, 1);
-			if (write <= 0)
-				break;
-
-			/* Make sure data is wrote before advancing index */
-			smp_wmb();
-
-			vringh_complete_iotlb(&vq->vring, vq->head, write);
-
+		while (vdpasim_blk_handle_req(vdpasim, vq)) {
 			/* Make sure used is visible before rasing the interrupt. */
 			smp_wmb();
 
@@ -109,7 +237,7 @@ static int __init vdpasim_blk_init(void)
 	dev_attr.config_size = sizeof(struct virtio_blk_config);
 	dev_attr.get_config = vdpasim_blk_get_config;
 	dev_attr.work_fn = vdpasim_blk_work;
-	dev_attr.buffer_size = PAGE_SIZE;
+	dev_attr.buffer_size = VDPASIM_BLK_CAPACITY << SECTOR_SHIFT;
 
 	vdpasim_blk_dev = vdpasim_create(&dev_attr);
 	if (IS_ERR(vdpasim_blk_dev)) {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID
  2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
                   ` (8 preceding siblings ...)
  2021-01-28 14:41 ` [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour Stefano Garzarella
@ 2021-01-28 14:41 ` Stefano Garzarella
  2021-02-01  6:04   ` Jason Wang
  2021-02-02  9:44   ` Stefan Hajnoczi
  9 siblings, 2 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-01-28 14:41 UTC (permalink / raw)
  To: virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Stefano Garzarella,
	Laurent Vivier, Stefan Hajnoczi, linux-kernel, Max Gurtovoy,
	Jason Wang, kvm

Handle VIRTIO_BLK_T_GET_ID request, always answering the
"vdpa_blk_sim" string.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
v2:
- made 'vdpasim_blk_id' static [Jason]
---
 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
index fc47e8320358..a3f8afad8d14 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
@@ -37,6 +37,7 @@
 #define VDPASIM_BLK_VQ_NUM	1
 
 static struct vdpasim *vdpasim_blk_dev;
+static char vdpasim_blk_id[VIRTIO_BLK_ID_BYTES] = "vdpa_blk_sim";
 
 static bool vdpasim_blk_check_range(u64 start_sector, size_t range_size)
 {
@@ -152,6 +153,20 @@ static bool vdpasim_blk_handle_req(struct vdpasim *vdpasim,
 		}
 		break;
 
+	case VIRTIO_BLK_T_GET_ID:
+		bytes = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
+					      vdpasim_blk_id,
+					      VIRTIO_BLK_ID_BYTES);
+		if (bytes < 0) {
+			dev_err(&vdpasim->vdpa.dev,
+				"vringh_iov_push_iotlb() error: %zd\n", bytes);
+			status = VIRTIO_BLK_S_IOERR;
+			break;
+		}
+
+		pushed += bytes;
+		break;
+
 	default:
 		dev_warn(&vdpasim->vdpa.dev,
 			 "Unsupported request type %d\n", type);
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses
  2021-01-28 14:41 ` [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses Stefano Garzarella
@ 2021-01-29  7:43   ` Jason Wang
       [not found]     ` <20210129091850.gatf3ih3knw2p4l4@steredhat>
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Wang @ 2021-01-29  7:43 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> Usually iotlb accesses are synchronized with a spinlock.
> Let's request it as a new parameter in vringh_set_iotlb() and
> hold it when we navigate the iotlb in iotlb_translate() to avoid
> race conditions with any new additions/deletions of ranges from
> the ioltb.


Patch looks fine but I wonder if this is the best approach comparing to 
do locking by the caller.

Thanks


>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
>   include/linux/vringh.h           | 6 +++++-
>   drivers/vdpa/vdpa_sim/vdpa_sim.c | 3 ++-
>   drivers/vhost/vringh.c           | 9 ++++++++-
>   3 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> index 59bd50f99291..9c077863c8f6 100644
> --- a/include/linux/vringh.h
> +++ b/include/linux/vringh.h
> @@ -46,6 +46,9 @@ struct vringh {
>   	/* IOTLB for this vring */
>   	struct vhost_iotlb *iotlb;
>   
> +	/* spinlock to synchronize IOTLB accesses */
> +	spinlock_t *iotlb_lock;
> +
>   	/* The function to call to notify the guest about added buffers */
>   	void (*notify)(struct vringh *);
>   };
> @@ -258,7 +261,8 @@ static inline __virtio64 cpu_to_vringh64(const struct vringh *vrh, u64 val)
>   
>   #if IS_REACHABLE(CONFIG_VHOST_IOTLB)
>   
> -void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb);
> +void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
> +		      spinlock_t *iotlb_lock);
>   
>   int vringh_init_iotlb(struct vringh *vrh, u64 features,
>   		      unsigned int num, bool weak_barriers,
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 2183a833fcf4..53238989713d 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -284,7 +284,8 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr)
>   		goto err_iommu;
>   
>   	for (i = 0; i < dev_attr->nvqs; i++)
> -		vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
> +		vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
> +				 &vdpasim->iommu_lock);
>   
>   	ret = iova_cache_get();
>   	if (ret)
> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> index 85d85faba058..f68122705719 100644
> --- a/drivers/vhost/vringh.c
> +++ b/drivers/vhost/vringh.c
> @@ -1074,6 +1074,8 @@ static int iotlb_translate(const struct vringh *vrh,
>   	int ret = 0;
>   	u64 s = 0;
>   
> +	spin_lock(vrh->iotlb_lock);
> +
>   	while (len > s) {
>   		u64 size, pa, pfn;
>   
> @@ -1103,6 +1105,8 @@ static int iotlb_translate(const struct vringh *vrh,
>   		++ret;
>   	}
>   
> +	spin_unlock(vrh->iotlb_lock);
> +
>   	return ret;
>   }
>   
> @@ -1262,10 +1266,13 @@ EXPORT_SYMBOL(vringh_init_iotlb);
>    * vringh_set_iotlb - initialize a vringh for a ring with IOTLB.
>    * @vrh: the vring
>    * @iotlb: iotlb associated with this vring
> + * @iotlb_lock: spinlock to synchronize the iotlb accesses
>    */
> -void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb)
> +void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
> +		      spinlock_t *iotlb_lock)
>   {
>   	vrh->iotlb = iotlb;
> +	vrh->iotlb_lock = iotlb_lock;
>   }
>   EXPORT_SYMBOL(vringh_set_iotlb);
>   


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-01-28 14:41 ` [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device Stefano Garzarella
@ 2021-01-31 15:31   ` Max Gurtovoy
  2021-02-01  8:29     ` Stefano Garzarella
  2021-02-01  5:50   ` Jason Wang
  2021-02-02  9:34   ` Stefan Hajnoczi
  2 siblings, 1 reply; 31+ messages in thread
From: Max Gurtovoy @ 2021-01-31 15:31 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Jason Wang, kvm


On 1/28/2021 4:41 PM, Stefano Garzarella wrote:
> From: Max Gurtovoy <mgurtovoy@nvidia.com>
>
> This will allow running vDPA for virtio block protocol.
>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> [sgarzare: various cleanups/fixes]
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> v2:
> - rebased on top of other changes (dev_attr, get_config(), notify(), etc.)
> - memset to 0 the config structure in vdpasim_blk_get_config()
> - used vdpasim pointer in vdpasim_blk_get_config()
>
> v1:
> - Removed unused headers
> - Used cpu_to_vdpasim*() to store config fields
> - Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
>    option can not depend on other [Jason]
> - Start with a single queue for now [Jason]
> - Add comments to memory barriers
> ---
>   drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 145 +++++++++++++++++++++++++++
>   drivers/vdpa/Kconfig                 |   7 ++
>   drivers/vdpa/vdpa_sim/Makefile       |   1 +
>   3 files changed, 153 insertions(+)
>   create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> new file mode 100644
> index 000000000000..999f9ca0b628
> --- /dev/null
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -0,0 +1,145 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * VDPA simulator for block device.
> + *
> + * Copyright (c) 2020, Mellanox Technologies. All rights reserved.

I guess we can change the copyright from Mellanox to:

Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.

Thanks.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov()
  2021-01-28 14:41 ` [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov() Stefano Garzarella
@ 2021-02-01  5:40   ` Jason Wang
  2021-02-01 10:21     ` Stefano Garzarella
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Wang @ 2021-02-01  5:40 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> __vringh_iov() overwrites the contents of riov and wiov, in fact it
> resets the 'i' and 'used' fields, but also the consumed field should
> be reset to avoid an inconsistent state.
>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>


I had a question(I remember we had some discussion like this but I 
forget the conclusion):

I see e.g in vringh_getdesc_kern() it has the following comment:

/*
  * Note that you may need to clean up riov and wiov, even on error!
  */

So it looks to me the correct way is to call vringh_kiov_cleanup() before?

Thanks


> ---
>   drivers/vhost/vringh.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> index f68122705719..bee63d68201a 100644
> --- a/drivers/vhost/vringh.c
> +++ b/drivers/vhost/vringh.c
> @@ -290,9 +290,9 @@ __vringh_iov(struct vringh *vrh, u16 i,
>   		return -EINVAL;
>   
>   	if (riov)
> -		riov->i = riov->used = 0;
> +		riov->i = riov->used = riov->consumed = 0;
>   	if (wiov)
> -		wiov->i = wiov->used = 0;
> +		wiov->i = wiov->used = wiov->consumed = 0;
>   
>   	for (;;) {
>   		void *addr;


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 04/10] vringh: implement vringh_kiov_advance()
  2021-01-28 14:41 ` [PATCH RFC v2 04/10] vringh: implement vringh_kiov_advance() Stefano Garzarella
@ 2021-02-01  5:43   ` Jason Wang
  2021-02-01 10:23     ` Stefano Garzarella
  0 siblings, 1 reply; 31+ messages in thread
From: Jason Wang @ 2021-02-01  5:43 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> In some cases, it may be useful to provide a way to skip a number
> of bytes in a vringh_kiov.
>
> Let's implement vringh_kiov_advance() for this purpose, reusing the
> code from vringh_iov_xfer().
> We replace that code calling the new vringh_kiov_advance().


Acked-by: Jason Wang <jasowang@redhat.com>

In the long run we need to switch to use iov iterator library instead.

Thanks


>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
>   include/linux/vringh.h |  2 ++
>   drivers/vhost/vringh.c | 41 +++++++++++++++++++++++++++++------------
>   2 files changed, 31 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> index 9c077863c8f6..755211ebd195 100644
> --- a/include/linux/vringh.h
> +++ b/include/linux/vringh.h
> @@ -199,6 +199,8 @@ static inline void vringh_kiov_cleanup(struct vringh_kiov *kiov)
>   	kiov->iov = NULL;
>   }
>   
> +void vringh_kiov_advance(struct vringh_kiov *kiov, size_t len);
> +
>   int vringh_getdesc_kern(struct vringh *vrh,
>   			struct vringh_kiov *riov,
>   			struct vringh_kiov *wiov,
> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
> index bee63d68201a..4d800e4f31ca 100644
> --- a/drivers/vhost/vringh.c
> +++ b/drivers/vhost/vringh.c
> @@ -75,6 +75,34 @@ static inline int __vringh_get_head(const struct vringh *vrh,
>   	return head;
>   }
>   
> +/**
> + * vringh_kiov_advance - skip bytes from vring_kiov
> + * @iov: an iov passed to vringh_getdesc_*() (updated as we consume)
> + * @len: the maximum length to advance
> + */
> +void vringh_kiov_advance(struct vringh_kiov *iov, size_t len)
> +{
> +	while (len && iov->i < iov->used) {
> +		size_t partlen = min(iov->iov[iov->i].iov_len, len);
> +
> +		iov->consumed += partlen;
> +		iov->iov[iov->i].iov_len -= partlen;
> +		iov->iov[iov->i].iov_base += partlen;
> +
> +		if (!iov->iov[iov->i].iov_len) {
> +			/* Fix up old iov element then increment. */
> +			iov->iov[iov->i].iov_len = iov->consumed;
> +			iov->iov[iov->i].iov_base -= iov->consumed;
> +
> +			iov->consumed = 0;
> +			iov->i++;
> +		}
> +
> +		len -= partlen;
> +	}
> +}
> +EXPORT_SYMBOL(vringh_kiov_advance);
> +
>   /* Copy some bytes to/from the iovec.  Returns num copied. */
>   static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
>   				      struct vringh_kiov *iov,
> @@ -95,19 +123,8 @@ static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
>   		done += partlen;
>   		len -= partlen;
>   		ptr += partlen;
> -		iov->consumed += partlen;
> -		iov->iov[iov->i].iov_len -= partlen;
> -		iov->iov[iov->i].iov_base += partlen;
>   
> -		if (!iov->iov[iov->i].iov_len) {
> -			/* Fix up old iov element then increment. */
> -			iov->iov[iov->i].iov_len = iov->consumed;
> -			iov->iov[iov->i].iov_base -= iov->consumed;
> -
> -			
> -			iov->consumed = 0;
> -			iov->i++;
> -		}
> +		vringh_kiov_advance(iov, partlen);
>   	}
>   	return done;
>   }


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 05/10] vringh: add vringh_kiov_length() helper
  2021-01-28 14:41 ` [PATCH RFC v2 05/10] vringh: add vringh_kiov_length() helper Stefano Garzarella
@ 2021-02-01  5:46   ` Jason Wang
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Wang @ 2021-02-01  5:46 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> This new helper returns the total number of bytes covered by
> a vringh_kiov.
>
> Suggested-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   include/linux/vringh.h | 11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
> index 755211ebd195..84db7b8f912f 100644
> --- a/include/linux/vringh.h
> +++ b/include/linux/vringh.h
> @@ -199,6 +199,17 @@ static inline void vringh_kiov_cleanup(struct vringh_kiov *kiov)
>   	kiov->iov = NULL;
>   }
>   
> +static inline size_t vringh_kiov_length(struct vringh_kiov *kiov)
> +{
> +	size_t len = 0;
> +	int i;
> +
> +	for (i = kiov->i; i < kiov->used; i++)
> +		len += kiov->iov[i].iov_len;
> +
> +	return len;
> +}
> +
>   void vringh_kiov_advance(struct vringh_kiov *kiov, size_t len);
>   
>   int vringh_getdesc_kern(struct vringh *vrh,


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 06/10] vdpa_sim: cleanup kiovs in vdpasim_free()
  2021-01-28 14:41 ` [PATCH RFC v2 06/10] vdpa_sim: cleanup kiovs in vdpasim_free() Stefano Garzarella
@ 2021-02-01  5:47   ` Jason Wang
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Wang @ 2021-02-01  5:47 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> vringh_getdesc_iotlb() allocates memory to store the kvec, that
> is freed with vringh_kiov_cleanup().
>
> vringh_getdesc_iotlb() is able to reuse a kvec previously allocated,
> so in order to avoid to allocate the kvec for each request, we are
> not calling vringh_kiov_cleanup() when we finished to handle a
> request, but we should call it when we free the entire device.
>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/vdpa_sim/vdpa_sim.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 53238989713d..a7aeb5d01c3e 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -562,8 +562,15 @@ static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
>   static void vdpasim_free(struct vdpa_device *vdpa)
>   {
>   	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> +	int i;
>   
>   	cancel_work_sync(&vdpasim->work);
> +
> +	for (i = 0; i < vdpasim->dev_attr.nvqs; i++) {
> +		vringh_kiov_cleanup(&vdpasim->vqs[i].out_iov);
> +		vringh_kiov_cleanup(&vdpasim->vqs[i].in_iov);
> +	}
> +
>   	put_iova_domain(&vdpasim->iova);
>   	iova_cache_put();
>   	kvfree(vdpasim->buffer);


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-01-28 14:41 ` [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device Stefano Garzarella
  2021-01-31 15:31   ` Max Gurtovoy
@ 2021-02-01  5:50   ` Jason Wang
  2021-02-02  9:34   ` Stefan Hajnoczi
  2 siblings, 0 replies; 31+ messages in thread
From: Jason Wang @ 2021-02-01  5:50 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> From: Max Gurtovoy <mgurtovoy@nvidia.com>
>
> This will allow running vDPA for virtio block protocol.
>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> [sgarzare: various cleanups/fixes]
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
> v2:
> - rebased on top of other changes (dev_attr, get_config(), notify(), etc.)
> - memset to 0 the config structure in vdpasim_blk_get_config()
> - used vdpasim pointer in vdpasim_blk_get_config()
>
> v1:
> - Removed unused headers
> - Used cpu_to_vdpasim*() to store config fields
> - Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
>    option can not depend on other [Jason]
> - Start with a single queue for now [Jason]
> - Add comments to memory barriers
> ---
>   drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 145 +++++++++++++++++++++++++++
>   drivers/vdpa/Kconfig                 |   7 ++
>   drivers/vdpa/vdpa_sim/Makefile       |   1 +
>   3 files changed, 153 insertions(+)
>   create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> new file mode 100644
> index 000000000000..999f9ca0b628
> --- /dev/null
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -0,0 +1,145 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * VDPA simulator for block device.
> + *
> + * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
> + *
> + */
> +
> +#include <linux/init.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/sched.h>
> +#include <linux/vringh.h>
> +#include <linux/vdpa.h>
> +#include <uapi/linux/virtio_blk.h>
> +
> +#include "vdpa_sim.h"
> +
> +#define DRV_VERSION  "0.1"
> +#define DRV_AUTHOR   "Max Gurtovoy <mgurtovoy@nvidia.com>"
> +#define DRV_DESC     "vDPA Device Simulator for block device"
> +#define DRV_LICENSE  "GPL v2"
> +
> +#define VDPASIM_BLK_FEATURES	(VDPASIM_FEATURES | \
> +				 (1ULL << VIRTIO_BLK_F_SIZE_MAX) | \
> +				 (1ULL << VIRTIO_BLK_F_SEG_MAX)  | \
> +				 (1ULL << VIRTIO_BLK_F_BLK_SIZE) | \
> +				 (1ULL << VIRTIO_BLK_F_TOPOLOGY) | \
> +				 (1ULL << VIRTIO_BLK_F_MQ))
> +
> +#define VDPASIM_BLK_CAPACITY	0x40000
> +#define VDPASIM_BLK_SIZE_MAX	0x1000
> +#define VDPASIM_BLK_SEG_MAX	32
> +#define VDPASIM_BLK_VQ_NUM	1
> +
> +static struct vdpasim *vdpasim_blk_dev;
> +
> +static void vdpasim_blk_work(struct work_struct *work)
> +{
> +	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> +	u8 status = VIRTIO_BLK_S_OK;
> +	int i;
> +
> +	spin_lock(&vdpasim->lock);
> +
> +	if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> +		goto out;
> +
> +	for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
> +		struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
> +
> +		if (!vq->ready)
> +			continue;
> +
> +		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
> +					    &vq->in_iov, &vq->head,
> +					    GFP_ATOMIC) > 0) {
> +			int write;
> +
> +			vq->in_iov.i = vq->in_iov.used - 1;
> +			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
> +						      &status, 1);
> +			if (write <= 0)
> +				break;
> +
> +			/* Make sure data is wrote before advancing index */
> +			smp_wmb();
> +
> +			vringh_complete_iotlb(&vq->vring, vq->head, write);
> +
> +			/* Make sure used is visible before rasing the interrupt. */
> +			smp_wmb();
> +
> +			local_bh_disable();
> +			if (vringh_need_notify_iotlb(&vq->vring) > 0)
> +				vringh_notify(&vq->vring);
> +			local_bh_enable();
> +		}
> +	}
> +out:
> +	spin_unlock(&vdpasim->lock);
> +}
> +
> +static void vdpasim_blk_get_config(struct vdpasim *vdpasim, void *config)
> +{
> +	struct virtio_blk_config *blk_config =
> +		(struct virtio_blk_config *)config;
> +
> +	memset(config, 0, sizeof(struct virtio_blk_config));
> +
> +	blk_config->capacity = cpu_to_vdpasim64(vdpasim, VDPASIM_BLK_CAPACITY);
> +	blk_config->size_max = cpu_to_vdpasim32(vdpasim, VDPASIM_BLK_SIZE_MAX);
> +	blk_config->seg_max = cpu_to_vdpasim32(vdpasim, VDPASIM_BLK_SEG_MAX);
> +	blk_config->num_queues = cpu_to_vdpasim16(vdpasim, VDPASIM_BLK_VQ_NUM);
> +	blk_config->min_io_size = cpu_to_vdpasim16(vdpasim, 1);
> +	blk_config->opt_io_size = cpu_to_vdpasim32(vdpasim, 1);
> +	blk_config->blk_size = cpu_to_vdpasim32(vdpasim, SECTOR_SIZE);
> +}
> +
> +static int __init vdpasim_blk_init(void)
> +{
> +	struct vdpasim_dev_attr dev_attr = {};
> +	int ret;
> +
> +	dev_attr.id = VIRTIO_ID_BLOCK;
> +	dev_attr.supported_features = VDPASIM_BLK_FEATURES;
> +	dev_attr.nvqs = VDPASIM_BLK_VQ_NUM;
> +	dev_attr.config_size = sizeof(struct virtio_blk_config);
> +	dev_attr.get_config = vdpasim_blk_get_config;
> +	dev_attr.work_fn = vdpasim_blk_work;
> +	dev_attr.buffer_size = PAGE_SIZE;
> +
> +	vdpasim_blk_dev = vdpasim_create(&dev_attr);
> +	if (IS_ERR(vdpasim_blk_dev)) {
> +		ret = PTR_ERR(vdpasim_blk_dev);
> +		goto out;
> +	}
> +
> +	ret = vdpa_register_device(&vdpasim_blk_dev->vdpa);
> +	if (ret)
> +		goto put_dev;
> +
> +	return 0;
> +
> +put_dev:
> +	put_device(&vdpasim_blk_dev->vdpa.dev);
> +out:
> +	return ret;
> +}
> +
> +static void __exit vdpasim_blk_exit(void)
> +{
> +	struct vdpa_device *vdpa = &vdpasim_blk_dev->vdpa;
> +
> +	vdpa_unregister_device(vdpa);
> +}
> +
> +module_init(vdpasim_blk_init)
> +module_exit(vdpasim_blk_exit)
> +
> +MODULE_VERSION(DRV_VERSION);
> +MODULE_LICENSE(DRV_LICENSE);
> +MODULE_AUTHOR(DRV_AUTHOR);
> +MODULE_DESCRIPTION(DRV_DESC);
> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
> index 21a23500f430..b8bd92cf04f9 100644
> --- a/drivers/vdpa/Kconfig
> +++ b/drivers/vdpa/Kconfig
> @@ -26,6 +26,13 @@ config VDPA_SIM_NET
>   	help
>   	  vDPA networking device simulator which loops TX traffic back to RX.
>   
> +config VDPA_SIM_BLOCK
> +	tristate "vDPA simulator for block device"
> +	depends on VDPA_SIM
> +	help
> +	  vDPA block device simulator which terminates IO request in a
> +	  memory buffer.
> +
>   config IFCVF
>   	tristate "Intel IFC VF vDPA driver"
>   	depends on PCI_MSI
> diff --git a/drivers/vdpa/vdpa_sim/Makefile b/drivers/vdpa/vdpa_sim/Makefile
> index 79d4536d347e..d458103302f2 100644
> --- a/drivers/vdpa/vdpa_sim/Makefile
> +++ b/drivers/vdpa/vdpa_sim/Makefile
> @@ -1,3 +1,4 @@
>   # SPDX-License-Identifier: GPL-2.0
>   obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
>   obj-$(CONFIG_VDPA_SIM_NET) += vdpa_sim_net.o
> +obj-$(CONFIG_VDPA_SIM_BLOCK) += vdpa_sim_blk.o


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour
  2021-01-28 14:41 ` [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour Stefano Garzarella
@ 2021-02-01  6:03   ` Jason Wang
  2021-02-02  9:41   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Jason Wang @ 2021-02-01  6:03 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> The previous implementation wrote only the status of each request.
> This patch implements a more accurate block device simulator,
> providing a ramdisk-like behavior.
>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> v2:
> - used %zd %zx to print size_t and ssize_t variables in dev_err()
> - removed unnecessary new line [Jason]
> - moved VIRTIO_BLK_T_GET_ID in another patch [Jason]
> - used push/pull instead of write/read terminology
> - added vdpasim_blk_check_range() to avoid overflows [Stefan]
> - use vdpasim*_to_cpu instead of le*_to_cpu
> - used vringh_kiov_length() helper [Jason]


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 164 ++++++++++++++++++++++++---
>   1 file changed, 146 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index 999f9ca0b628..fc47e8320358 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -3,6 +3,7 @@
>    * VDPA simulator for block device.
>    *
>    * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
> + * Copyright (c) 2021, Red Hat Inc. All rights reserved.
>    *
>    */
>   
> @@ -13,6 +14,7 @@
>   #include <linux/sched.h>
>   #include <linux/vringh.h>
>   #include <linux/vdpa.h>
> +#include <linux/blkdev.h>
>   #include <uapi/linux/virtio_blk.h>
>   
>   #include "vdpa_sim.h"
> @@ -36,10 +38,151 @@
>   
>   static struct vdpasim *vdpasim_blk_dev;
>   
> +static bool vdpasim_blk_check_range(u64 start_sector, size_t range_size)
> +{
> +	u64 range_sectors = range_size >> SECTOR_SHIFT;
> +
> +	if (range_size > VDPASIM_BLK_SIZE_MAX * VDPASIM_BLK_SEG_MAX)
> +		return false;
> +
> +	if (start_sector > VDPASIM_BLK_CAPACITY)
> +		return false;
> +
> +	if (range_sectors > VDPASIM_BLK_CAPACITY - start_sector)
> +		return false;
> +
> +	return true;
> +}
> +
> +/* Returns 'true' if the request is handled (with or without an I/O error)
> + * and the status is correctly written in the last byte of the 'in iov',
> + * 'false' otherwise.
> + */
> +static bool vdpasim_blk_handle_req(struct vdpasim *vdpasim,
> +				   struct vdpasim_virtqueue *vq)
> +{
> +	size_t pushed = 0, to_pull, to_push;
> +	struct virtio_blk_outhdr hdr;
> +	ssize_t bytes;
> +	loff_t offset;
> +	u64 sector;
> +	u8 status;
> +	u32 type;
> +	int ret;
> +
> +	ret = vringh_getdesc_iotlb(&vq->vring, &vq->out_iov, &vq->in_iov,
> +				   &vq->head, GFP_ATOMIC);
> +	if (ret != 1)
> +		return false;
> +
> +	if (vq->out_iov.used < 1 || vq->in_iov.used < 1) {
> +		dev_err(&vdpasim->vdpa.dev, "missing headers - out_iov: %u in_iov %u\n",
> +			vq->out_iov.used, vq->in_iov.used);
> +		return false;
> +	}
> +
> +	if (vq->in_iov.iov[vq->in_iov.used - 1].iov_len < 1) {
> +		dev_err(&vdpasim->vdpa.dev, "request in header too short\n");
> +		return false;
> +	}
> +
> +	/* The last byte is the status and we checked if the last iov has
> +	 * enough room for it.
> +	 */
> +	to_push = vringh_kiov_length(&vq->in_iov) - 1;
> +
> +	to_pull = vringh_kiov_length(&vq->out_iov);
> +
> +	bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->out_iov, &hdr,
> +				      sizeof(hdr));
> +	if (bytes != sizeof(hdr)) {
> +		dev_err(&vdpasim->vdpa.dev, "request out header too short\n");
> +		return false;
> +	}
> +
> +	to_pull -= bytes;
> +
> +	type = vdpasim32_to_cpu(vdpasim, hdr.type);
> +	sector = vdpasim64_to_cpu(vdpasim, hdr.sector);
> +	offset = sector << SECTOR_SHIFT;
> +	status = VIRTIO_BLK_S_OK;
> +
> +	switch (type) {
> +	case VIRTIO_BLK_T_IN:
> +		if (!vdpasim_blk_check_range(sector, to_push)) {
> +			dev_err(&vdpasim->vdpa.dev,
> +				"reading over the capacity - offset: 0x%llx len: 0x%zx\n",
> +				offset, to_push);
> +			status = VIRTIO_BLK_S_IOERR;
> +			break;
> +		}
> +
> +		bytes = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
> +					      vdpasim->buffer + offset,
> +					      to_push);
> +		if (bytes < 0) {
> +			dev_err(&vdpasim->vdpa.dev,
> +				"vringh_iov_push_iotlb() error: %zd offset: 0x%llx len: 0x%zx\n",
> +				bytes, offset, to_push);
> +			status = VIRTIO_BLK_S_IOERR;
> +			break;
> +		}
> +
> +		pushed += bytes;
> +		break;
> +
> +	case VIRTIO_BLK_T_OUT:
> +		if (!vdpasim_blk_check_range(sector, to_pull)) {
> +			dev_err(&vdpasim->vdpa.dev,
> +				"writing over the capacity - offset: 0x%llx len: 0x%zx\n",
> +				offset, to_pull);
> +			status = VIRTIO_BLK_S_IOERR;
> +			break;
> +		}
> +
> +		bytes = vringh_iov_pull_iotlb(&vq->vring, &vq->out_iov,
> +					      vdpasim->buffer + offset,
> +					      to_pull);
> +		if (bytes < 0) {
> +			dev_err(&vdpasim->vdpa.dev,
> +				"vringh_iov_pull_iotlb() error: %zd offset: 0x%llx len: 0x%zx\n",
> +				bytes, offset, to_pull);
> +			status = VIRTIO_BLK_S_IOERR;
> +			break;
> +		}
> +		break;
> +
> +	default:
> +		dev_warn(&vdpasim->vdpa.dev,
> +			 "Unsupported request type %d\n", type);
> +		status = VIRTIO_BLK_S_IOERR;
> +		break;
> +	}
> +
> +	/* If some operations fail, we need to skip the remaining bytes
> +	 * to put the status in the last byte
> +	 */
> +	if (to_push - pushed > 0)
> +		vringh_kiov_advance(&vq->in_iov, to_push - pushed);
> +
> +	/* Last byte is the status */
> +	bytes = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov, &status, 1);
> +	if (bytes != 1)
> +		return false;
> +
> +	pushed += bytes;
> +
> +	/* Make sure data is wrote before advancing index */
> +	smp_wmb();
> +
> +	vringh_complete_iotlb(&vq->vring, vq->head, pushed);
> +
> +	return true;
> +}
> +
>   static void vdpasim_blk_work(struct work_struct *work)
>   {
>   	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> -	u8 status = VIRTIO_BLK_S_OK;
>   	int i;
>   
>   	spin_lock(&vdpasim->lock);
> @@ -53,22 +196,7 @@ static void vdpasim_blk_work(struct work_struct *work)
>   		if (!vq->ready)
>   			continue;
>   
> -		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
> -					    &vq->in_iov, &vq->head,
> -					    GFP_ATOMIC) > 0) {
> -			int write;
> -
> -			vq->in_iov.i = vq->in_iov.used - 1;
> -			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
> -						      &status, 1);
> -			if (write <= 0)
> -				break;
> -
> -			/* Make sure data is wrote before advancing index */
> -			smp_wmb();
> -
> -			vringh_complete_iotlb(&vq->vring, vq->head, write);
> -
> +		while (vdpasim_blk_handle_req(vdpasim, vq)) {
>   			/* Make sure used is visible before rasing the interrupt. */
>   			smp_wmb();
>   
> @@ -109,7 +237,7 @@ static int __init vdpasim_blk_init(void)
>   	dev_attr.config_size = sizeof(struct virtio_blk_config);
>   	dev_attr.get_config = vdpasim_blk_get_config;
>   	dev_attr.work_fn = vdpasim_blk_work;
> -	dev_attr.buffer_size = PAGE_SIZE;
> +	dev_attr.buffer_size = VDPASIM_BLK_CAPACITY << SECTOR_SHIFT;
>   
>   	vdpasim_blk_dev = vdpasim_create(&dev_attr);
>   	if (IS_ERR(vdpasim_blk_dev)) {


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID
  2021-01-28 14:41 ` [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID Stefano Garzarella
@ 2021-02-01  6:04   ` Jason Wang
  2021-02-02  9:44   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Jason Wang @ 2021-02-01  6:04 UTC (permalink / raw)
  To: Stefano Garzarella, virtualization
  Cc: Xie Yongji, Michael S. Tsirkin, Laurent Vivier, Stefan Hajnoczi,
	linux-kernel, Max Gurtovoy, kvm


On 2021/1/28 下午10:41, Stefano Garzarella wrote:
> Handle VIRTIO_BLK_T_GET_ID request, always answering the
> "vdpa_blk_sim" string.
>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> v2:
> - made 'vdpasim_blk_id' static [Jason]


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>   drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> index fc47e8320358..a3f8afad8d14 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
> @@ -37,6 +37,7 @@
>   #define VDPASIM_BLK_VQ_NUM	1
>   
>   static struct vdpasim *vdpasim_blk_dev;
> +static char vdpasim_blk_id[VIRTIO_BLK_ID_BYTES] = "vdpa_blk_sim";
>   
>   static bool vdpasim_blk_check_range(u64 start_sector, size_t range_size)
>   {
> @@ -152,6 +153,20 @@ static bool vdpasim_blk_handle_req(struct vdpasim *vdpasim,
>   		}
>   		break;
>   
> +	case VIRTIO_BLK_T_GET_ID:
> +		bytes = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
> +					      vdpasim_blk_id,
> +					      VIRTIO_BLK_ID_BYTES);
> +		if (bytes < 0) {
> +			dev_err(&vdpasim->vdpa.dev,
> +				"vringh_iov_push_iotlb() error: %zd\n", bytes);
> +			status = VIRTIO_BLK_S_IOERR;
> +			break;
> +		}
> +
> +		pushed += bytes;
> +		break;
> +
>   	default:
>   		dev_warn(&vdpasim->vdpa.dev,
>   			 "Unsupported request type %d\n", type);


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses
       [not found]     ` <20210129091850.gatf3ih3knw2p4l4@steredhat>
@ 2021-02-01  6:31       ` Jason Wang
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Wang @ 2021-02-01  6:31 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	Stefan Hajnoczi, linux-kernel, Max Gurtovoy, kvm


On 2021/1/29 下午5:18, Stefano Garzarella wrote:
> On Fri, Jan 29, 2021 at 03:43:40PM +0800, Jason Wang wrote:
>>
>> On 2021/1/28 下午10:41, Stefano Garzarella wrote:
>>> Usually iotlb accesses are synchronized with a spinlock.
>>> Let's request it as a new parameter in vringh_set_iotlb() and
>>> hold it when we navigate the iotlb in iotlb_translate() to avoid
>>> race conditions with any new additions/deletions of ranges from
>>> the ioltb.
>>
>>
>> Patch looks fine but I wonder if this is the best approach comparing 
>> to do locking by the caller.
>
> Initially I tried to hold the lock in the vdpasim_blk_work(), but since
> we have a lot of different functions for vringh, I opted to take the 
> lock at the beginning and release it at the end.
> Also because several times I went to see if that call used 
> iotlb_translate or not.
>
> This could be a problem for example if we have multiple workers to 
> handle multiple queues.
>
> Also, some functions are quite long (e.g. vringh_getdesc_iotlb) and 
> holding the lock for that long could reduce parallelism.
>
> For these reasons I thought it was better to hide everything from the 
> caller who doesn't have to worry about which function calls 
> iotlb_translate() and thus hold the lock.


Fine with me.

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks


>
> Thanks,
> Stefano
>
>>
>> Thanks
>>
>>
>>>
>>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>> ---
>>>  include/linux/vringh.h           | 6 +++++-
>>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 3 ++-
>>>  drivers/vhost/vringh.c           | 9 ++++++++-
>>>  3 files changed, 15 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/include/linux/vringh.h b/include/linux/vringh.h
>>> index 59bd50f99291..9c077863c8f6 100644
>>> --- a/include/linux/vringh.h
>>> +++ b/include/linux/vringh.h
>>> @@ -46,6 +46,9 @@ struct vringh {
>>>      /* IOTLB for this vring */
>>>      struct vhost_iotlb *iotlb;
>>> +    /* spinlock to synchronize IOTLB accesses */
>>> +    spinlock_t *iotlb_lock;
>>> +
>>>      /* The function to call to notify the guest about added buffers */
>>>      void (*notify)(struct vringh *);
>>>  };
>>> @@ -258,7 +261,8 @@ static inline __virtio64 cpu_to_vringh64(const 
>>> struct vringh *vrh, u64 val)
>>>  #if IS_REACHABLE(CONFIG_VHOST_IOTLB)
>>> -void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb);
>>> +void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
>>> +              spinlock_t *iotlb_lock);
>>>  int vringh_init_iotlb(struct vringh *vrh, u64 features,
>>>                unsigned int num, bool weak_barriers,
>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c 
>>> b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>> index 2183a833fcf4..53238989713d 100644
>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>> @@ -284,7 +284,8 @@ struct vdpasim *vdpasim_create(struct 
>>> vdpasim_dev_attr *dev_attr)
>>>          goto err_iommu;
>>>      for (i = 0; i < dev_attr->nvqs; i++)
>>> -        vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu);
>>> +        vringh_set_iotlb(&vdpasim->vqs[i].vring, vdpasim->iommu,
>>> +                 &vdpasim->iommu_lock);
>>>      ret = iova_cache_get();
>>>      if (ret)
>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>> index 85d85faba058..f68122705719 100644
>>> --- a/drivers/vhost/vringh.c
>>> +++ b/drivers/vhost/vringh.c
>>> @@ -1074,6 +1074,8 @@ static int iotlb_translate(const struct vringh 
>>> *vrh,
>>>      int ret = 0;
>>>      u64 s = 0;
>>> +    spin_lock(vrh->iotlb_lock);
>>> +
>>>      while (len > s) {
>>>          u64 size, pa, pfn;
>>> @@ -1103,6 +1105,8 @@ static int iotlb_translate(const struct vringh 
>>> *vrh,
>>>          ++ret;
>>>      }
>>> +    spin_unlock(vrh->iotlb_lock);
>>> +
>>>      return ret;
>>>  }
>>> @@ -1262,10 +1266,13 @@ EXPORT_SYMBOL(vringh_init_iotlb);
>>>   * vringh_set_iotlb - initialize a vringh for a ring with IOTLB.
>>>   * @vrh: the vring
>>>   * @iotlb: iotlb associated with this vring
>>> + * @iotlb_lock: spinlock to synchronize the iotlb accesses
>>>   */
>>> -void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb)
>>> +void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb,
>>> +              spinlock_t *iotlb_lock)
>>>  {
>>>      vrh->iotlb = iotlb;
>>> +    vrh->iotlb_lock = iotlb_lock;
>>>  }
>>>  EXPORT_SYMBOL(vringh_set_iotlb);
>>
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-01-31 15:31   ` Max Gurtovoy
@ 2021-02-01  8:29     ` Stefano Garzarella
  0 siblings, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-02-01  8:29 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	Stefan Hajnoczi, linux-kernel, Jason Wang, kvm

On Sun, Jan 31, 2021 at 05:31:43PM +0200, Max Gurtovoy wrote:
>
>On 1/28/2021 4:41 PM, Stefano Garzarella wrote:
>>From: Max Gurtovoy <mgurtovoy@nvidia.com>
>>
>>This will allow running vDPA for virtio block protocol.
>>
>>Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
>>[sgarzare: various cleanups/fixes]
>>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>---
>>v2:
>>- rebased on top of other changes (dev_attr, get_config(), notify(), etc.)
>>- memset to 0 the config structure in vdpasim_blk_get_config()
>>- used vdpasim pointer in vdpasim_blk_get_config()
>>
>>v1:
>>- Removed unused headers
>>- Used cpu_to_vdpasim*() to store config fields
>>- Replaced 'select VDPA_SIM' with 'depends on VDPA_SIM' since selected
>>   option can not depend on other [Jason]
>>- Start with a single queue for now [Jason]
>>- Add comments to memory barriers
>>---
>>  drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 145 +++++++++++++++++++++++++++
>>  drivers/vdpa/Kconfig                 |   7 ++
>>  drivers/vdpa/vdpa_sim/Makefile       |   1 +
>>  3 files changed, 153 insertions(+)
>>  create mode 100644 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>
>>diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>new file mode 100644
>>index 000000000000..999f9ca0b628
>>--- /dev/null
>>+++ b/drivers/vdpa/vdpa_sim/vdpa_sim_blk.c
>>@@ -0,0 +1,145 @@
>>+// SPDX-License-Identifier: GPL-2.0-only
>>+/*
>>+ * VDPA simulator for block device.
>>+ *
>>+ * Copyright (c) 2020, Mellanox Technologies. All rights reserved.
>
>I guess we can change the copyright from Mellanox to:
>
>Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.

I'll update in the next version.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov()
  2021-02-01  5:40   ` Jason Wang
@ 2021-02-01 10:21     ` Stefano Garzarella
  2021-02-02  3:26       ` Jason Wang
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2021-02-01 10:21 UTC (permalink / raw)
  To: Jason Wang
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	Stefan Hajnoczi, linux-kernel, Max Gurtovoy, kvm

On Mon, Feb 01, 2021 at 01:40:01PM +0800, Jason Wang wrote:
>
>On 2021/1/28 下午10:41, Stefano Garzarella wrote:
>>__vringh_iov() overwrites the contents of riov and wiov, in fact it
>>resets the 'i' and 'used' fields, but also the consumed field should
>>be reset to avoid an inconsistent state.
>>
>>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>
>
>I had a question(I remember we had some discussion like this but I 
>forget the conclusion):

Sorry, I forgot to update you.

>
>I see e.g in vringh_getdesc_kern() it has the following comment:
>
>/*
> * Note that you may need to clean up riov and wiov, even on error!
> */
>
>So it looks to me the correct way is to call vringh_kiov_cleanup() 
>before?

Looking at the code the right pattern should be:

     vringh_getdesc_*(..., &out_iov, &in_iov, ...);

     // use out_iov and in_iov

     vringh_kiov_cleanup(&out_iov);
     vringh_kiov_cleanup(&in_iov);

This because vringh_getdesc_*() calls __vringh_iov() where 
resize_iovec() is called to allocate the iov wrapped by 'struct 
vringh_kiov' and vringh_kiov_cleanup() frees that memory.

Looking better, __vringh_iov() is able to extend a 'vringh_kiov' 
pre-allocated, so in order to avoid to allocate and free the iov for 
each request we can avoid to call vringh_kiov_cleanup(), but this patch 
is needed to avoid an inconsistent state.

And also patch "vdpa_sim: cleanup kiovs in vdpasim_free()" is required 
to free the iov when the device is going away.

Does that make sense to you?

Maybe I should add a comment in vringh.c to explain this better.

Thanks,
Stefano

>
>Thanks
>
>
>>---
>>  drivers/vhost/vringh.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>>diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>index f68122705719..bee63d68201a 100644
>>--- a/drivers/vhost/vringh.c
>>+++ b/drivers/vhost/vringh.c
>>@@ -290,9 +290,9 @@ __vringh_iov(struct vringh *vrh, u16 i,
>>  		return -EINVAL;
>>  	if (riov)
>>-		riov->i = riov->used = 0;
>>+		riov->i = riov->used = riov->consumed = 0;
>>  	if (wiov)
>>-		wiov->i = wiov->used = 0;
>>+		wiov->i = wiov->used = wiov->consumed = 0;
>>  	for (;;) {
>>  		void *addr;
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 04/10] vringh: implement vringh_kiov_advance()
  2021-02-01  5:43   ` Jason Wang
@ 2021-02-01 10:23     ` Stefano Garzarella
  0 siblings, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-02-01 10:23 UTC (permalink / raw)
  To: Jason Wang
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	Stefan Hajnoczi, linux-kernel, Max Gurtovoy, kvm

On Mon, Feb 01, 2021 at 01:43:23PM +0800, Jason Wang wrote:
>
>On 2021/1/28 下午10:41, Stefano Garzarella wrote:
>>In some cases, it may be useful to provide a way to skip a number
>>of bytes in a vringh_kiov.
>>
>>Let's implement vringh_kiov_advance() for this purpose, reusing the
>>code from vringh_iov_xfer().
>>We replace that code calling the new vringh_kiov_advance().
>
>
>Acked-by: Jason Wang <jasowang@redhat.com>
>
>In the long run we need to switch to use iov iterator library instead.

Yes I agree.
I've tried to do this, but it requires quite a bit of work to change 
vringh, I'll put it on my todo list.

Thanks,
Stefano

>
>Thanks
>
>
>>
>>Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>---
>>  include/linux/vringh.h |  2 ++
>>  drivers/vhost/vringh.c | 41 +++++++++++++++++++++++++++++------------
>>  2 files changed, 31 insertions(+), 12 deletions(-)
>>
>>diff --git a/include/linux/vringh.h b/include/linux/vringh.h
>>index 9c077863c8f6..755211ebd195 100644
>>--- a/include/linux/vringh.h
>>+++ b/include/linux/vringh.h
>>@@ -199,6 +199,8 @@ static inline void vringh_kiov_cleanup(struct vringh_kiov *kiov)
>>  	kiov->iov = NULL;
>>  }
>>+void vringh_kiov_advance(struct vringh_kiov *kiov, size_t len);
>>+
>>  int vringh_getdesc_kern(struct vringh *vrh,
>>  			struct vringh_kiov *riov,
>>  			struct vringh_kiov *wiov,
>>diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>index bee63d68201a..4d800e4f31ca 100644
>>--- a/drivers/vhost/vringh.c
>>+++ b/drivers/vhost/vringh.c
>>@@ -75,6 +75,34 @@ static inline int __vringh_get_head(const struct vringh *vrh,
>>  	return head;
>>  }
>>+/**
>>+ * vringh_kiov_advance - skip bytes from vring_kiov
>>+ * @iov: an iov passed to vringh_getdesc_*() (updated as we consume)
>>+ * @len: the maximum length to advance
>>+ */
>>+void vringh_kiov_advance(struct vringh_kiov *iov, size_t len)
>>+{
>>+	while (len && iov->i < iov->used) {
>>+		size_t partlen = min(iov->iov[iov->i].iov_len, len);
>>+
>>+		iov->consumed += partlen;
>>+		iov->iov[iov->i].iov_len -= partlen;
>>+		iov->iov[iov->i].iov_base += partlen;
>>+
>>+		if (!iov->iov[iov->i].iov_len) {
>>+			/* Fix up old iov element then increment. */
>>+			iov->iov[iov->i].iov_len = iov->consumed;
>>+			iov->iov[iov->i].iov_base -= iov->consumed;
>>+
>>+			iov->consumed = 0;
>>+			iov->i++;
>>+		}
>>+
>>+		len -= partlen;
>>+	}
>>+}
>>+EXPORT_SYMBOL(vringh_kiov_advance);
>>+
>>  /* Copy some bytes to/from the iovec.  Returns num copied. */
>>  static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
>>  				      struct vringh_kiov *iov,
>>@@ -95,19 +123,8 @@ static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
>>  		done += partlen;
>>  		len -= partlen;
>>  		ptr += partlen;
>>-		iov->consumed += partlen;
>>-		iov->iov[iov->i].iov_len -= partlen;
>>-		iov->iov[iov->i].iov_base += partlen;
>>-		if (!iov->iov[iov->i].iov_len) {
>>-			/* Fix up old iov element then increment. */
>>-			iov->iov[iov->i].iov_len = iov->consumed;
>>-			iov->iov[iov->i].iov_base -= iov->consumed;
>>-
>>-			
>>-			iov->consumed = 0;
>>-			iov->i++;
>>-		}
>>+		vringh_kiov_advance(iov, partlen);
>>  	}
>>  	return done;
>>  }
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov()
  2021-02-01 10:21     ` Stefano Garzarella
@ 2021-02-02  3:26       ` Jason Wang
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Wang @ 2021-02-02  3:26 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	Stefan Hajnoczi, linux-kernel, Max Gurtovoy, kvm


On 2021/2/1 下午6:21, Stefano Garzarella wrote:
> On Mon, Feb 01, 2021 at 01:40:01PM +0800, Jason Wang wrote:
>>
>> On 2021/1/28 下午10:41, Stefano Garzarella wrote:
>>> __vringh_iov() overwrites the contents of riov and wiov, in fact it
>>> resets the 'i' and 'used' fields, but also the consumed field should
>>> be reset to avoid an inconsistent state.
>>>
>>> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
>>
>>
>> I had a question(I remember we had some discussion like this but I 
>> forget the conclusion):
>
> Sorry, I forgot to update you.
>
>>
>> I see e.g in vringh_getdesc_kern() it has the following comment:
>>
>> /*
>>  * Note that you may need to clean up riov and wiov, even on error!
>>  */
>>
>> So it looks to me the correct way is to call vringh_kiov_cleanup() 
>> before?
>
> Looking at the code the right pattern should be:
>
>     vringh_getdesc_*(..., &out_iov, &in_iov, ...);
>
>     // use out_iov and in_iov
>
>     vringh_kiov_cleanup(&out_iov);
>     vringh_kiov_cleanup(&in_iov);
>
> This because vringh_getdesc_*() calls __vringh_iov() where 
> resize_iovec() is called to allocate the iov wrapped by 'struct 
> vringh_kiov' and vringh_kiov_cleanup() frees that memory.
>
> Looking better, __vringh_iov() is able to extend a 'vringh_kiov' 
> pre-allocated, so in order to avoid to allocate and free the iov for 
> each request we can avoid to call vringh_kiov_cleanup(), but this 
> patch is needed to avoid an inconsistent state.
>
> And also patch "vdpa_sim: cleanup kiovs in vdpasim_free()" is required 
> to free the iov when the device is going away.
>
> Does that make sense to you?


Make sense.


>
> Maybe I should add a comment in vringh.c to explain this better.


Yes, please.

Thanks


>
> Thanks,
> Stefano
>
>>
>> Thanks
>>
>>
>>> ---
>>>  drivers/vhost/vringh.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
>>> index f68122705719..bee63d68201a 100644
>>> --- a/drivers/vhost/vringh.c
>>> +++ b/drivers/vhost/vringh.c
>>> @@ -290,9 +290,9 @@ __vringh_iov(struct vringh *vrh, u16 i,
>>>          return -EINVAL;
>>>      if (riov)
>>> -        riov->i = riov->used = 0;
>>> +        riov->i = riov->used = riov->consumed = 0;
>>>      if (wiov)
>>> -        wiov->i = wiov->used = 0;
>>> +        wiov->i = wiov->used = wiov->consumed = 0;
>>>      for (;;) {
>>>          void *addr;
>>
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-01-28 14:41 ` [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device Stefano Garzarella
  2021-01-31 15:31   ` Max Gurtovoy
  2021-02-01  5:50   ` Jason Wang
@ 2021-02-02  9:34   ` Stefan Hajnoczi
  2021-02-02 15:49     ` Stefano Garzarella
  2 siblings, 1 reply; 31+ messages in thread
From: Stefan Hajnoczi @ 2021-02-02  9:34 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	linux-kernel, Max Gurtovoy, Jason Wang, kvm

[-- Attachment #1: Type: text/plain, Size: 1398 bytes --]

On Thu, Jan 28, 2021 at 03:41:25PM +0100, Stefano Garzarella wrote:
> +static void vdpasim_blk_work(struct work_struct *work)
> +{
> +	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> +	u8 status = VIRTIO_BLK_S_OK;
> +	int i;
> +
> +	spin_lock(&vdpasim->lock);
> +
> +	if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> +		goto out;
> +
> +	for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
> +		struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
> +
> +		if (!vq->ready)
> +			continue;
> +
> +		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
> +					    &vq->in_iov, &vq->head,
> +					    GFP_ATOMIC) > 0) {
> +			int write;
> +
> +			vq->in_iov.i = vq->in_iov.used - 1;
> +			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
> +						      &status, 1);
> +			if (write <= 0)
> +				break;

This code looks fragile:

1. Relying on unsigned underflow and the while loop in
   vringh_iov_push_iotlb() to handle the case where in_iov.used == 0 is
   risky and could break.

2. Does this assume that the last in_iov element has size 1? For
   example, the guest driver may send a single "in" iovec with size 513
   when reading 512 bytes (with an extra byte for the request status).

Please validate inputs fully, even in test/development code, because
it's likely to be copied by others when writing production code (or
deployed in production by unsuspecting users) :).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour
  2021-01-28 14:41 ` [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour Stefano Garzarella
  2021-02-01  6:03   ` Jason Wang
@ 2021-02-02  9:41   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2021-02-02  9:41 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	linux-kernel, Max Gurtovoy, Jason Wang, kvm

[-- Attachment #1: Type: text/plain, Size: 889 bytes --]

On Thu, Jan 28, 2021 at 03:41:26PM +0100, Stefano Garzarella wrote:
> The previous implementation wrote only the status of each request.
> This patch implements a more accurate block device simulator,
> providing a ramdisk-like behavior.
> 
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> v2:
> - used %zd %zx to print size_t and ssize_t variables in dev_err()
> - removed unnecessary new line [Jason]
> - moved VIRTIO_BLK_T_GET_ID in another patch [Jason]
> - used push/pull instead of write/read terminology
> - added vdpasim_blk_check_range() to avoid overflows [Stefan]
> - use vdpasim*_to_cpu instead of le*_to_cpu
> - used vringh_kiov_length() helper [Jason]
> ---
>  drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 164 ++++++++++++++++++++++++---
>  1 file changed, 146 insertions(+), 18 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID
  2021-01-28 14:41 ` [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID Stefano Garzarella
  2021-02-01  6:04   ` Jason Wang
@ 2021-02-02  9:44   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2021-02-02  9:44 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	linux-kernel, Max Gurtovoy, Jason Wang, kvm

[-- Attachment #1: Type: text/plain, Size: 434 bytes --]

On Thu, Jan 28, 2021 at 03:41:27PM +0100, Stefano Garzarella wrote:
> Handle VIRTIO_BLK_T_GET_ID request, always answering the
> "vdpa_blk_sim" string.
> 
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> v2:
> - made 'vdpasim_blk_id' static [Jason]
> ---
>  drivers/vdpa/vdpa_sim/vdpa_sim_blk.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-02-02  9:34   ` Stefan Hajnoczi
@ 2021-02-02 15:49     ` Stefano Garzarella
  2021-02-03 16:45       ` Stefan Hajnoczi
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2021-02-02 15:49 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	linux-kernel, Max Gurtovoy, Jason Wang, kvm

On Tue, Feb 02, 2021 at 09:34:12AM +0000, Stefan Hajnoczi wrote:
>On Thu, Jan 28, 2021 at 03:41:25PM +0100, Stefano Garzarella wrote:
>> +static void vdpasim_blk_work(struct work_struct *work)
>> +{
>> +	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
>> +	u8 status = VIRTIO_BLK_S_OK;
>> +	int i;
>> +
>> +	spin_lock(&vdpasim->lock);
>> +
>> +	if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
>> +		goto out;
>> +
>> +	for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
>> +		struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
>> +
>> +		if (!vq->ready)
>> +			continue;
>> +
>> +		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
>> +					    &vq->in_iov, &vq->head,
>> +					    GFP_ATOMIC) > 0) {
>> +			int write;
>> +
>> +			vq->in_iov.i = vq->in_iov.used - 1;
>> +			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
>> +						      &status, 1);
>> +			if (write <= 0)
>> +				break;
>
>This code looks fragile:
>
>1. Relying on unsigned underflow and the while loop in
>   vringh_iov_push_iotlb() to handle the case where in_iov.used == 0 is
>   risky and could break.
>
>2. Does this assume that the last in_iov element has size 1? For
>   example, the guest driver may send a single "in" iovec with size 513
>   when reading 512 bytes (with an extra byte for the request status).
>
>Please validate inputs fully, even in test/development code, because
>it's likely to be copied by others when writing production code (or
>deployed in production by unsuspecting users) :).

Perfectly agree on that, so I addressed these things, also following 
your review on the previous version, on the next patch of this series:
"vdpa_sim_blk: implement ramdisk behaviour".

Do you think should I move these checks in this patch?

I did this to leave Max credit for this patch and add more code to 
emulate a ramdisk in later patches.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-02-02 15:49     ` Stefano Garzarella
@ 2021-02-03 16:45       ` Stefan Hajnoczi
  2021-02-04  8:02         ` Stefano Garzarella
  0 siblings, 1 reply; 31+ messages in thread
From: Stefan Hajnoczi @ 2021-02-03 16:45 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	linux-kernel, Max Gurtovoy, Jason Wang, kvm

[-- Attachment #1: Type: text/plain, Size: 2253 bytes --]

On Tue, Feb 02, 2021 at 04:49:50PM +0100, Stefano Garzarella wrote:
> On Tue, Feb 02, 2021 at 09:34:12AM +0000, Stefan Hajnoczi wrote:
> > On Thu, Jan 28, 2021 at 03:41:25PM +0100, Stefano Garzarella wrote:
> > > +static void vdpasim_blk_work(struct work_struct *work)
> > > +{
> > > +	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
> > > +	u8 status = VIRTIO_BLK_S_OK;
> > > +	int i;
> > > +
> > > +	spin_lock(&vdpasim->lock);
> > > +
> > > +	if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
> > > +		goto out;
> > > +
> > > +	for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
> > > +		struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
> > > +
> > > +		if (!vq->ready)
> > > +			continue;
> > > +
> > > +		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
> > > +					    &vq->in_iov, &vq->head,
> > > +					    GFP_ATOMIC) > 0) {
> > > +			int write;
> > > +
> > > +			vq->in_iov.i = vq->in_iov.used - 1;
> > > +			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
> > > +						      &status, 1);
> > > +			if (write <= 0)
> > > +				break;
> > 
> > This code looks fragile:
> > 
> > 1. Relying on unsigned underflow and the while loop in
> >   vringh_iov_push_iotlb() to handle the case where in_iov.used == 0 is
> >   risky and could break.
> > 
> > 2. Does this assume that the last in_iov element has size 1? For
> >   example, the guest driver may send a single "in" iovec with size 513
> >   when reading 512 bytes (with an extra byte for the request status).
> > 
> > Please validate inputs fully, even in test/development code, because
> > it's likely to be copied by others when writing production code (or
> > deployed in production by unsuspecting users) :).
> 
> Perfectly agree on that, so I addressed these things, also following your
> review on the previous version, on the next patch of this series:
> "vdpa_sim_blk: implement ramdisk behaviour".
> 
> Do you think should I move these checks in this patch?
> 
> I did this to leave Max credit for this patch and add more code to emulate a
> ramdisk in later patches.

You could update the commit description so it's clear that input
validation is missing and will be added in the next commit.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device
  2021-02-03 16:45       ` Stefan Hajnoczi
@ 2021-02-04  8:02         ` Stefano Garzarella
  0 siblings, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2021-02-04  8:02 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: virtualization, Xie Yongji, Michael S. Tsirkin, Laurent Vivier,
	linux-kernel, Max Gurtovoy, Jason Wang, kvm

On Wed, Feb 03, 2021 at 04:45:51PM +0000, Stefan Hajnoczi wrote:
>On Tue, Feb 02, 2021 at 04:49:50PM +0100, Stefano Garzarella wrote:
>> On Tue, Feb 02, 2021 at 09:34:12AM +0000, Stefan Hajnoczi wrote:
>> > On Thu, Jan 28, 2021 at 03:41:25PM +0100, Stefano Garzarella wrote:
>> > > +static void vdpasim_blk_work(struct work_struct *work)
>> > > +{
>> > > +	struct vdpasim *vdpasim = container_of(work, struct vdpasim, work);
>> > > +	u8 status = VIRTIO_BLK_S_OK;
>> > > +	int i;
>> > > +
>> > > +	spin_lock(&vdpasim->lock);
>> > > +
>> > > +	if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK))
>> > > +		goto out;
>> > > +
>> > > +	for (i = 0; i < VDPASIM_BLK_VQ_NUM; i++) {
>> > > +		struct vdpasim_virtqueue *vq = &vdpasim->vqs[i];
>> > > +
>> > > +		if (!vq->ready)
>> > > +			continue;
>> > > +
>> > > +		while (vringh_getdesc_iotlb(&vq->vring, &vq->out_iov,
>> > > +					    &vq->in_iov, &vq->head,
>> > > +					    GFP_ATOMIC) > 0) {
>> > > +			int write;
>> > > +
>> > > +			vq->in_iov.i = vq->in_iov.used - 1;
>> > > +			write = vringh_iov_push_iotlb(&vq->vring, &vq->in_iov,
>> > > +						      &status, 1);
>> > > +			if (write <= 0)
>> > > +				break;
>> >
>> > This code looks fragile:
>> >
>> > 1. Relying on unsigned underflow and the while loop in
>> >   vringh_iov_push_iotlb() to handle the case where in_iov.used == 0 is
>> >   risky and could break.
>> >
>> > 2. Does this assume that the last in_iov element has size 1? For
>> >   example, the guest driver may send a single "in" iovec with size 513
>> >   when reading 512 bytes (with an extra byte for the request status).
>> >
>> > Please validate inputs fully, even in test/development code, because
>> > it's likely to be copied by others when writing production code (or
>> > deployed in production by unsuspecting users) :).
>>
>> Perfectly agree on that, so I addressed these things, also following your
>> review on the previous version, on the next patch of this series:
>> "vdpa_sim_blk: implement ramdisk behaviour".
>>
>> Do you think should I move these checks in this patch?
>>
>> I did this to leave Max credit for this patch and add more code to emulate a
>> ramdisk in later patches.
>
>You could update the commit description so it's clear that input
>validation is missing and will be added in the next commit.

I'll do it.

Thanks,
Stefano


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-02-04  8:04 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-28 14:41 [PATCH RFC v2 00/10] vdpa: add vdpa simulator for block device Stefano Garzarella
2021-01-28 14:41 ` [PATCH RFC v2 01/10] vdpa_sim: use iova module to allocate IOVA addresses Stefano Garzarella
2021-01-28 14:41 ` [PATCH RFC v2 02/10] vringh: add 'iotlb_lock' to synchronize iotlb accesses Stefano Garzarella
2021-01-29  7:43   ` Jason Wang
     [not found]     ` <20210129091850.gatf3ih3knw2p4l4@steredhat>
2021-02-01  6:31       ` Jason Wang
2021-01-28 14:41 ` [PATCH RFC v2 03/10] vringh: reset kiov 'consumed' field in __vringh_iov() Stefano Garzarella
2021-02-01  5:40   ` Jason Wang
2021-02-01 10:21     ` Stefano Garzarella
2021-02-02  3:26       ` Jason Wang
2021-01-28 14:41 ` [PATCH RFC v2 04/10] vringh: implement vringh_kiov_advance() Stefano Garzarella
2021-02-01  5:43   ` Jason Wang
2021-02-01 10:23     ` Stefano Garzarella
2021-01-28 14:41 ` [PATCH RFC v2 05/10] vringh: add vringh_kiov_length() helper Stefano Garzarella
2021-02-01  5:46   ` Jason Wang
2021-01-28 14:41 ` [PATCH RFC v2 06/10] vdpa_sim: cleanup kiovs in vdpasim_free() Stefano Garzarella
2021-02-01  5:47   ` Jason Wang
2021-01-28 14:41 ` [PATCH RFC v2 07/10] vdpa: Remove the restriction that only supports virtio-net devices Stefano Garzarella
2021-01-28 14:41 ` [PATCH RFC v2 08/10] vdpa: add vdpa simulator for block device Stefano Garzarella
2021-01-31 15:31   ` Max Gurtovoy
2021-02-01  8:29     ` Stefano Garzarella
2021-02-01  5:50   ` Jason Wang
2021-02-02  9:34   ` Stefan Hajnoczi
2021-02-02 15:49     ` Stefano Garzarella
2021-02-03 16:45       ` Stefan Hajnoczi
2021-02-04  8:02         ` Stefano Garzarella
2021-01-28 14:41 ` [PATCH RFC v2 09/10] vdpa_sim_blk: implement ramdisk behaviour Stefano Garzarella
2021-02-01  6:03   ` Jason Wang
2021-02-02  9:41   ` Stefan Hajnoczi
2021-01-28 14:41 ` [PATCH RFC v2 10/10] vdpa_sim_blk: handle VIRTIO_BLK_T_GET_ID Stefano Garzarella
2021-02-01  6:04   ` Jason Wang
2021-02-02  9:44   ` Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).