All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH kvmtool 00/24] Virtio v1 support
@ 2022-06-07 17:02 Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 01/24] virtio: Add NEEDS_RESET to the status mask Jean-Philippe Brucker
                   ` (25 more replies)
  0 siblings, 26 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Add support for version 1 of the virtio transport to kvmtool. Based on a
RFC by Sasha Levin [1], I've been trying to complete it here and there.
It's long overdue and is quite painful to rebase, so let's get it
merged.

Several reasons why the legacy transport needs to be replaced:

* Only 32 feature bits are supported. Most importantly
  VIRTIO_F_ACCESS_PLATFORM, which forces a Linux guest to use the DMA
  API, cannot be enabled. So we can't support private guests that
  decrypt or share only their DMA memory with the host.

* Legacy virtqueue address is a 32-bit pfn, aligned on 4kB. Since Linux
  guests bypass the DMA API they can't support large GPAs.

* New devices types (iommu, crypto, memory, etc) and new features cannot
  be supported.

* New guests won't implement the legacy transport. Existing guests will
  eventually drop legacy support.

Support for modern transport becomes the default and legacy is enabled
with --virtio-legacy.

I only tested what I could: vsock, scsi and vhost-net are currently
broken and can be fixed later (they have issues with mem regions and
feature mask, among other things). I also haven't tested big-endian.

Find the series at https://jpbrucker.net/git/kvmtool/ virtio/devel

[1] https://lore.kernel.org/all/1447823472-17047-1-git-send-email-sasha.levin@oracle.com/
    The SOB was kept in patch 21

Jean-Philippe Brucker (24):
  virtio: Add NEEDS_RESET to the status mask
  virtio: Remove redundant test
  virtio/vsock: Remove redundant state tracking
  virtio: Factor virtqueue initialization
  virtio: Support modern virtqueue addresses
  virtio: Add config access helpers
  virtio: Fix device-specific config endianness
  virtio/console: Remove unused callback
  virtio: Remove set_guest_features() device op
  Add memcpy_fromiovec_safe
  virtio/net: Offload vnet header endianness conversion to tap
  virtio/net: Prepare for modern virtio
  virtio/net: Implement VIRTIO_F_ANY_LAYOUT feature
  virtio/console: Add VIRTIO_F_ANY_LAYOUT feature
  virtio/blk: Implement VIRTIO_F_ANY_LAYOUT feature
  virtio/pci: Factor MSI route creation
  virtio/pci: Delete MSI routes
  virtio: Extract init_vq() for PCI and MMIO
  virtio/pci: Make doorbell offset dynamic
  virtio: Move PCI transport to pci-legacy
  virtio: Add support for modern virtio-pci
  virtio: Move MMIO transport to mmio-legacy
  virtio: Add support for modern virtio-mmio
  virtio/pci: Initialize all vectors to VIRTIO_MSI_NO_VECTOR

 Makefile                          |   4 +
 arm/include/arm-common/kvm-arch.h |   6 +-
 include/kvm/disk-image.h          |   3 +-
 include/kvm/iovec.h               |   2 +
 include/kvm/kvm-config.h          |   1 +
 include/kvm/kvm.h                 |   6 +
 include/kvm/pci.h                 |  11 +
 include/kvm/virtio-9p.h           |   2 +-
 include/kvm/virtio-mmio.h         |  29 ++-
 include/kvm/virtio-pci-dev.h      |   4 +
 include/kvm/virtio-pci.h          |  48 +++-
 include/kvm/virtio.h              |  52 ++--
 mips/include/kvm/kvm-arch.h       |   2 -
 powerpc/include/kvm/kvm-arch.h    |   2 -
 x86/include/kvm/kvm-arch.h        |   2 -
 builtin-run.c                     |   2 +
 disk/core.c                       |  26 +-
 net/uip/core.c                    |  71 ++++--
 util/iovec.c                      |  31 +++
 virtio/9p.c                       |  27 +--
 virtio/balloon.c                  |  46 ++--
 virtio/blk.c                      | 102 ++++----
 virtio/console.c                  |  33 +--
 virtio/core.c                     |  82 ++++++-
 virtio/mmio-legacy.c              | 150 ++++++++++++
 virtio/mmio-modern.c              | 157 ++++++++++++
 virtio/mmio.c                     | 202 ++--------------
 virtio/net.c                      | 122 +++++-----
 virtio/pci-legacy.c               | 205 ++++++++++++++++
 virtio/pci-modern.c               | 386 ++++++++++++++++++++++++++++++
 virtio/pci.c                      | 361 ++++++----------------------
 virtio/rng.c                      |  15 +-
 virtio/scsi.c                     |  44 ++--
 virtio/vsock.c                    |  39 ++-
 34 files changed, 1490 insertions(+), 785 deletions(-)
 create mode 100644 virtio/mmio-legacy.c
 create mode 100644 virtio/mmio-modern.c
 create mode 100644 virtio/pci-legacy.c
 create mode 100644 virtio/pci-modern.c

-- 
2.36.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 01/24] virtio: Add NEEDS_RESET to the status mask
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 02/24] virtio: Remove redundant test Jean-Philippe Brucker
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Not all toolchains used to know about VIRTIO_CONFIG_S_NEEDS_RESET, so we
left it out of the status mask. Now that we include our own version of
virtio_config.h and we'll need it for virtio 1.0, add it back.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index ad274acf..8a363632 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -35,6 +35,7 @@
 	 VIRTIO_CONFIG_S_DRIVER |	\
 	 VIRTIO_CONFIG_S_DRIVER_OK |	\
 	 VIRTIO_CONFIG_S_FEATURES_OK |	\
+	 VIRTIO_CONFIG_S_NEEDS_RESET |	\
 	 VIRTIO_CONFIG_S_FAILED)
 
 /* Kvmtool status bits */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 02/24] virtio: Remove redundant test
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 01/24] virtio: Add NEEDS_RESET to the status mask Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 03/24] virtio/vsock: Remove redundant state tracking Jean-Philippe Brucker
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Don't test for VIRTIO__STATUS_STOP right after setting it.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/virtio/core.c b/virtio/core.c
index 90a661d1..40532664 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -247,8 +247,7 @@ void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
 		 * Reset virtqueues and stop all traffic now, so that the device
 		 * can safely reset the backend in notify_status().
 		 */
-		if (ext_status & VIRTIO__STATUS_STOP)
-			vdev->ops->reset(kvm, vdev);
+		vdev->ops->reset(kvm, vdev);
 	}
 
 	if (vdev->ops->notify_status)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 03/24] virtio/vsock: Remove redundant state tracking
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 01/24] virtio: Add NEEDS_RESET to the status mask Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 02/24] virtio: Remove redundant test Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 04/24] virtio: Factor virtqueue initialization Jean-Philippe Brucker
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

The core already tells us whether a device is being started or stopped.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/vsock.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/virtio/vsock.c b/virtio/vsock.c
index 64b4e95a..780169b1 100644
--- a/virtio/vsock.c
+++ b/virtio/vsock.c
@@ -31,7 +31,6 @@ struct vsock_dev {
 	struct virtio_device		vdev;
 	struct list_head		list;
 	struct kvm			*kvm;
-	bool				started;
 };
 
 static u8 *get_config(struct kvm *kvm, void *dev)
@@ -140,15 +139,16 @@ static void notify_status(struct kvm *kvm, void *dev, u32 status)
 	struct vsock_dev *vdev = dev;
 	int r, start;
 
-	start = !!(status & VIRTIO_CONFIG_S_DRIVER_OK);
-	if (vdev->started == start)
+	if (status & VIRTIO__STATUS_START)
+		start = 1;
+	else if (status & VIRTIO__STATUS_STOP)
+		start = 0;
+	else
 		return;
 
 	r = ioctl(vdev->vhost_fd, VHOST_VSOCK_SET_RUNNING, &start);
 	if (r != 0)
 		die("VHOST_VSOCK_SET_RUNNING failed %d", errno);
-
-	vdev->started = start;
 }
 
 static int notify_vq(struct kvm *kvm, void *dev, u32 vq)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 04/24] virtio: Factor virtqueue initialization
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (2 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 03/24] virtio/vsock: Remove redundant state tracking Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 05/24] virtio: Support modern virtqueue addresses Jean-Philippe Brucker
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

All virtio devices perform the same set of operations when initializing
their virtqueues. Move it to virtio core.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio.h | 17 +++--------------
 virtio/9p.c          |  7 ++-----
 virtio/balloon.c     |  8 +++-----
 virtio/blk.c         | 10 ++--------
 virtio/console.c     |  7 ++-----
 virtio/core.c        | 14 ++++++++++++++
 virtio/net.c         |  8 ++------
 virtio/rng.c         |  7 ++-----
 virtio/scsi.c        |  7 ++-----
 virtio/vsock.c       |  8 ++------
 10 files changed, 34 insertions(+), 59 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 8a363632..f0b79334 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -212,20 +212,9 @@ int __must_check virtio_init(struct kvm *kvm, void *dev, struct virtio_device *v
 			     int device_id, int subsys_id, int class);
 int virtio_compat_add_message(const char *device, const char *config);
 const char* virtio_trans_name(enum virtio_trans trans);
-
-static inline void *virtio_get_vq(struct kvm *kvm, u32 pfn, u32 page_size)
-{
-	return guest_flat_to_host(kvm, (u64)pfn * page_size);
-}
-
-static inline void virtio_init_device_vq(struct virtio_device *vdev,
-					 struct virt_queue *vq)
-{
-	vq->endian = vdev->endian;
-	vq->use_event_idx = (vdev->features & VIRTIO_RING_F_EVENT_IDX);
-	vq->enabled = true;
-}
-
+void virtio_init_device_vq(struct kvm *kvm, struct virtio_device *vdev,
+			   struct virt_queue *vq, size_t nr_descs,
+			   u32 page_size, u32 align, u32 pfn);
 void virtio_exit_vq(struct kvm *kvm, struct virtio_device *vdev, void *dev,
 		    int num);
 void virtio_set_guest_features(struct kvm *kvm, struct virtio_device *vdev,
diff --git a/virtio/9p.c b/virtio/9p.c
index 7c9d7925..d9a77377 100644
--- a/virtio/9p.c
+++ b/virtio/9p.c
@@ -1414,17 +1414,14 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	struct p9_dev *p9dev = dev;
 	struct p9_dev_job *job;
 	struct virt_queue *queue;
-	void *p;
 
 	compat__remove_message(compat_id);
 
 	queue		= &p9dev->vqs[vq];
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
 	job		= &p9dev->jobs[vq];
 
-	vring_init(&queue->vring, VIRTQUEUE_NUM, p, align);
-	virtio_init_device_vq(&p9dev->vdev, queue);
+	virtio_init_device_vq(kvm, &p9dev->vdev, queue, VIRTQUEUE_NUM,
+			      page_size, align, pfn);
 
 	*job		= (struct p9_dev_job) {
 		.vq		= queue,
diff --git a/virtio/balloon.c b/virtio/balloon.c
index f398ce47..720073dc 100644
--- a/virtio/balloon.c
+++ b/virtio/balloon.c
@@ -214,17 +214,15 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 {
 	struct bln_dev *bdev = dev;
 	struct virt_queue *queue;
-	void *p;
 
 	compat__remove_message(compat_id);
 
 	queue		= &bdev->vqs[vq];
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
+
+	virtio_init_device_vq(kvm, &bdev->vdev, queue, VIRTIO_BLN_QUEUE_SIZE,
+			      page_size, align, pfn);
 
 	thread_pool__init_job(&bdev->jobs[vq], kvm, virtio_bln_do_io, queue);
-	vring_init(&queue->vring, VIRTIO_BLN_QUEUE_SIZE, p, align);
-	virtio_init_device_vq(&bdev->vdev, queue);
 
 	return 0;
 }
diff --git a/virtio/blk.c b/virtio/blk.c
index 46ee0281..af8c62f6 100644
--- a/virtio/blk.c
+++ b/virtio/blk.c
@@ -212,17 +212,11 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 {
 	unsigned int i;
 	struct blk_dev *bdev = dev;
-	struct virt_queue *queue;
-	void *p;
 
 	compat__remove_message(compat_id);
 
-	queue		= &bdev->vqs[vq];
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
-
-	vring_init(&queue->vring, VIRTIO_BLK_QUEUE_SIZE, p, align);
-	virtio_init_device_vq(&bdev->vdev, queue);
+	virtio_init_device_vq(kvm, &bdev->vdev, &bdev->vqs[vq],
+			      VIRTIO_BLK_QUEUE_SIZE, page_size, align, pfn);
 
 	if (vq != 0)
 		return 0;
diff --git a/virtio/console.c b/virtio/console.c
index 83158082..9fbd1016 100644
--- a/virtio/console.c
+++ b/virtio/console.c
@@ -151,18 +151,15 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 		   u32 pfn)
 {
 	struct virt_queue *queue;
-	void *p;
 
 	BUG_ON(vq >= VIRTIO_CONSOLE_NUM_QUEUES);
 
 	compat__remove_message(compat_id);
 
 	queue		= &cdev.vqs[vq];
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
 
-	vring_init(&queue->vring, VIRTIO_CONSOLE_QUEUE_SIZE, p, align);
-	virtio_init_device_vq(&cdev.vdev, queue);
+	virtio_init_device_vq(kvm, &cdev.vdev, queue, VIRTIO_CONSOLE_QUEUE_SIZE,
+			      page_size, align, pfn);
 
 	if (vq == VIRTIO_CONSOLE_TX_QUEUE) {
 		thread_pool__init_job(&cdev.jobs[vq], kvm, virtio_console_handle_callback, queue);
diff --git a/virtio/core.c b/virtio/core.c
index 40532664..a5125fe1 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -159,6 +159,20 @@ u16 virt_queue__get_inout_iov(struct kvm *kvm, struct virt_queue *queue,
 	return head;
 }
 
+void virtio_init_device_vq(struct kvm *kvm, struct virtio_device *vdev,
+			   struct virt_queue *vq, size_t nr_descs,
+			   u32 page_size, u32 align, u32 pfn)
+{
+	void *p = guest_flat_to_host(kvm, (u64)pfn * page_size);
+
+	vq->endian		= vdev->endian;
+	vq->pfn			= pfn;
+	vq->use_event_idx	= (vdev->features & VIRTIO_RING_F_EVENT_IDX);
+	vq->enabled		= true;
+
+	vring_init(&vq->vring, nr_descs, p, align);
+}
+
 void virtio_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
 			   void *dev, int num)
 {
diff --git a/virtio/net.c b/virtio/net.c
index 67070d65..de5ae7b4 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -590,7 +590,6 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	struct vhost_vring_addr addr;
 	struct net_dev *ndev = dev;
 	struct virt_queue *queue;
-	void *p;
 	int r;
 
 	compat__remove_message(compat_id);
@@ -599,11 +598,8 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	net_queue->id	= vq;
 	net_queue->ndev	= ndev;
 	queue		= &net_queue->vq;
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
-
-	vring_init(&queue->vring, VIRTIO_NET_QUEUE_SIZE, p, align);
-	virtio_init_device_vq(&ndev->vdev, queue);
+	virtio_init_device_vq(kvm, &ndev->vdev, queue, VIRTIO_NET_QUEUE_SIZE,
+			      page_size, align, pfn);
 
 	mutex_init(&net_queue->lock);
 	pthread_cond_init(&net_queue->cond, NULL);
diff --git a/virtio/rng.c b/virtio/rng.c
index 75b682e3..5bcd05a2 100644
--- a/virtio/rng.c
+++ b/virtio/rng.c
@@ -97,18 +97,15 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	struct rng_dev *rdev = dev;
 	struct virt_queue *queue;
 	struct rng_dev_job *job;
-	void *p;
 
 	compat__remove_message(compat_id);
 
 	queue		= &rdev->vqs[vq];
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
 
 	job = &rdev->jobs[vq];
 
-	vring_init(&queue->vring, VIRTIO_RNG_QUEUE_SIZE, p, align);
-	virtio_init_device_vq(&rdev->vdev, queue);
+	virtio_init_device_vq(kvm, &rdev->vdev, queue, VIRTIO_RNG_QUEUE_SIZE,
+			      page_size, align, pfn);
 
 	*job = (struct rng_dev_job) {
 		.vq	= queue,
diff --git a/virtio/scsi.c b/virtio/scsi.c
index 60432cc8..9dd9e9ac 100644
--- a/virtio/scsi.c
+++ b/virtio/scsi.c
@@ -69,17 +69,14 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	struct vhost_vring_addr addr;
 	struct scsi_dev *sdev = dev;
 	struct virt_queue *queue;
-	void *p;
 	int r;
 
 	compat__remove_message(compat_id);
 
 	queue		= &sdev->vqs[vq];
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
 
-	vring_init(&queue->vring, VIRTIO_SCSI_QUEUE_SIZE, p, align);
-	virtio_init_device_vq(&sdev->vdev, queue);
+	virtio_init_device_vq(kvm, &sdev->vdev, queue, VIRTIO_SCSI_QUEUE_SIZE,
+			      page_size, align, pfn);
 
 	if (sdev->vhost_fd == 0)
 		return 0;
diff --git a/virtio/vsock.c b/virtio/vsock.c
index 780169b1..79a672fe 100644
--- a/virtio/vsock.c
+++ b/virtio/vsock.c
@@ -72,17 +72,13 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	struct vhost_vring_addr addr;
 	struct vsock_dev *vdev = dev;
 	struct virt_queue *queue;
-	void *p;
 	int r;
 
 	compat__remove_message(compat_id);
 
 	queue		= &vdev->vqs[vq];
-	queue->pfn	= pfn;
-	p		= virtio_get_vq(kvm, queue->pfn, page_size);
-
-	vring_init(&queue->vring, VIRTIO_VSOCK_QUEUE_SIZE, p, align);
-	virtio_init_device_vq(&vdev->vdev, queue);
+	virtio_init_device_vq(kvm, &vdev->vdev, queue, VIRTIO_VSOCK_QUEUE_SIZE,
+			      page_size, align, pfn);
 
 	if (vdev->vhost_fd == -1)
 		return 0;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 05/24] virtio: Support modern virtqueue addresses
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (3 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 04/24] virtio: Factor virtqueue initialization Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 06/24] virtio: Add config access helpers Jean-Philippe Brucker
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Modern virtio devices can use separate buffer for descriptors, available
and used rings. They can also use 64-bit addresses instead of 44-bit.
Rework the virtqueue initialization function to support modern virtio.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio.h | 29 ++++++++++++++++++++++++-----
 virtio/9p.c          |  6 ++----
 virtio/balloon.c     |  8 +++-----
 virtio/blk.c         |  5 ++---
 virtio/console.c     |  6 ++----
 virtio/core.c        | 24 +++++++++++++++++++-----
 virtio/mmio.c        | 21 ++++++++++++++-------
 virtio/net.c         |  6 ++----
 virtio/pci.c         | 16 ++++++++++++----
 virtio/rng.c         |  6 ++----
 virtio/scsi.c        |  6 ++----
 virtio/vsock.c       |  6 ++----
 12 files changed, 86 insertions(+), 53 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index f0b79334..24179ecc 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -44,9 +44,30 @@
 /* Stop the device */
 #define VIRTIO__STATUS_STOP		(1 << 9)
 
+struct vring_addr {
+	bool			legacy;
+	union {
+		/* Legacy description */
+		struct {
+			u32	pfn;
+			u32	align;
+			u32	pgsize;
+		};
+		/* Modern description */
+		struct {
+			u32	desc_lo;
+			u32	desc_hi;
+			u32	avail_lo;
+			u32	avail_hi;
+			u32	used_lo;
+			u32	used_hi;
+		};
+	};
+};
+
 struct virt_queue {
 	struct vring	vring;
-	u32		pfn;
+	struct vring_addr vring_addr;
 	/* The last_avail_idx field is an index to ->ring of struct vring_avail.
 	   It's where we assume the next request index is at.  */
 	u16		last_avail_idx;
@@ -189,8 +210,7 @@ struct virtio_ops {
 	u32 (*get_host_features)(struct kvm *kvm, void *dev);
 	void (*set_guest_features)(struct kvm *kvm, void *dev, u32 features);
 	unsigned int (*get_vq_count)(struct kvm *kvm, void *dev);
-	int (*init_vq)(struct kvm *kvm, void *dev, u32 vq, u32 page_size,
-		       u32 align, u32 pfn);
+	int (*init_vq)(struct kvm *kvm, void *dev, u32 vq);
 	void (*exit_vq)(struct kvm *kvm, void *dev, u32 vq);
 	int (*notify_vq)(struct kvm *kvm, void *dev, u32 vq);
 	struct virt_queue *(*get_vq)(struct kvm *kvm, void *dev, u32 vq);
@@ -213,8 +233,7 @@ int __must_check virtio_init(struct kvm *kvm, void *dev, struct virtio_device *v
 int virtio_compat_add_message(const char *device, const char *config);
 const char* virtio_trans_name(enum virtio_trans trans);
 void virtio_init_device_vq(struct kvm *kvm, struct virtio_device *vdev,
-			   struct virt_queue *vq, size_t nr_descs,
-			   u32 page_size, u32 align, u32 pfn);
+			   struct virt_queue *vq, size_t nr_descs);
 void virtio_exit_vq(struct kvm *kvm, struct virtio_device *vdev, void *dev,
 		    int num);
 void virtio_set_guest_features(struct kvm *kvm, struct virtio_device *vdev,
diff --git a/virtio/9p.c b/virtio/9p.c
index d9a77377..a3f96669 100644
--- a/virtio/9p.c
+++ b/virtio/9p.c
@@ -1408,8 +1408,7 @@ static void notify_status(struct kvm *kvm, void *dev, u32 status)
 		close_fid(p9dev, pfid->fid);
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct p9_dev *p9dev = dev;
 	struct p9_dev_job *job;
@@ -1420,8 +1419,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	queue		= &p9dev->vqs[vq];
 	job		= &p9dev->jobs[vq];
 
-	virtio_init_device_vq(kvm, &p9dev->vdev, queue, VIRTQUEUE_NUM,
-			      page_size, align, pfn);
+	virtio_init_device_vq(kvm, &p9dev->vdev, queue, VIRTQUEUE_NUM);
 
 	*job		= (struct p9_dev_job) {
 		.vq		= queue,
diff --git a/virtio/balloon.c b/virtio/balloon.c
index 720073dc..ffeeb293 100644
--- a/virtio/balloon.c
+++ b/virtio/balloon.c
@@ -130,7 +130,7 @@ static int virtio_bln__collect_stats(struct kvm *kvm)
 	u64 tmp;
 
 	/* Exit if the queue is not set up. */
-	if (!vq->pfn)
+	if (!vq->enabled)
 		return -ENODEV;
 
 	virt_queue__set_used_elem(vq, bdev.cur_stat_head,
@@ -209,8 +209,7 @@ static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct bln_dev *bdev = dev;
 	struct virt_queue *queue;
@@ -219,8 +218,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 
 	queue		= &bdev->vqs[vq];
 
-	virtio_init_device_vq(kvm, &bdev->vdev, queue, VIRTIO_BLN_QUEUE_SIZE,
-			      page_size, align, pfn);
+	virtio_init_device_vq(kvm, &bdev->vdev, queue, VIRTIO_BLN_QUEUE_SIZE);
 
 	thread_pool__init_job(&bdev->jobs[vq], kvm, virtio_bln_do_io, queue);
 
diff --git a/virtio/blk.c b/virtio/blk.c
index af8c62f6..2479e006 100644
--- a/virtio/blk.c
+++ b/virtio/blk.c
@@ -207,8 +207,7 @@ static void *virtio_blk_thread(void *dev)
 	return NULL;
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	unsigned int i;
 	struct blk_dev *bdev = dev;
@@ -216,7 +215,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	compat__remove_message(compat_id);
 
 	virtio_init_device_vq(kvm, &bdev->vdev, &bdev->vqs[vq],
-			      VIRTIO_BLK_QUEUE_SIZE, page_size, align, pfn);
+			      VIRTIO_BLK_QUEUE_SIZE);
 
 	if (vq != 0)
 		return 0;
diff --git a/virtio/console.c b/virtio/console.c
index 9fbd1016..5263b8e7 100644
--- a/virtio/console.c
+++ b/virtio/console.c
@@ -147,8 +147,7 @@ static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct virt_queue *queue;
 
@@ -158,8 +157,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 
 	queue		= &cdev.vqs[vq];
 
-	virtio_init_device_vq(kvm, &cdev.vdev, queue, VIRTIO_CONSOLE_QUEUE_SIZE,
-			      page_size, align, pfn);
+	virtio_init_device_vq(kvm, &cdev.vdev, queue, VIRTIO_CONSOLE_QUEUE_SIZE);
 
 	if (vq == VIRTIO_CONSOLE_TX_QUEUE) {
 		thread_pool__init_job(&cdev.jobs[vq], kvm, virtio_console_handle_callback, queue);
diff --git a/virtio/core.c b/virtio/core.c
index a5125fe1..d6f2c689 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -160,17 +160,31 @@ u16 virt_queue__get_inout_iov(struct kvm *kvm, struct virt_queue *queue,
 }
 
 void virtio_init_device_vq(struct kvm *kvm, struct virtio_device *vdev,
-			   struct virt_queue *vq, size_t nr_descs,
-			   u32 page_size, u32 align, u32 pfn)
+			   struct virt_queue *vq, size_t nr_descs)
 {
-	void *p = guest_flat_to_host(kvm, (u64)pfn * page_size);
+	struct vring_addr *addr = &vq->vring_addr;
 
 	vq->endian		= vdev->endian;
-	vq->pfn			= pfn;
 	vq->use_event_idx	= (vdev->features & VIRTIO_RING_F_EVENT_IDX);
 	vq->enabled		= true;
 
-	vring_init(&vq->vring, nr_descs, p, align);
+	if (addr->legacy) {
+		unsigned long base = (u64)addr->pfn * addr->pgsize;
+		void *p = guest_flat_to_host(kvm, base);
+
+		vring_init(&vq->vring, nr_descs, p, addr->align);
+	} else {
+		u64 desc = (u64)addr->desc_hi << 32 | addr->desc_lo;
+		u64 avail = (u64)addr->avail_hi << 32 | addr->avail_lo;
+		u64 used = (u64)addr->used_hi << 32 | addr->used_lo;
+
+		vq->vring = (struct vring) {
+			.desc	= guest_flat_to_host(kvm, desc),
+			.used	= guest_flat_to_host(kvm, used),
+			.avail	= guest_flat_to_host(kvm, avail),
+			.num	= nr_descs,
+		};
+	}
 }
 
 void virtio_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
diff --git a/virtio/mmio.c b/virtio/mmio.c
index 3782d55a..77289e2b 100644
--- a/virtio/mmio.c
+++ b/virtio/mmio.c
@@ -125,6 +125,9 @@ static void virtio_mmio_device_specific(struct kvm_cpu *vcpu,
 	}
 }
 
+#define vmmio_selected_vq(vdev, vmmio) \
+	(vdev)->ops->get_vq((vmmio)->kvm, (vmmio)->dev, (vmmio)->hdr.queue_sel)
+
 static void virtio_mmio_config_in(struct kvm_cpu *vcpu,
 				  u64 addr, void *data, u32 len,
 				  struct virtio_device *vdev)
@@ -149,9 +152,8 @@ static void virtio_mmio_config_in(struct kvm_cpu *vcpu,
 		ioport__write32(data, val);
 		break;
 	case VIRTIO_MMIO_QUEUE_PFN:
-		vq = vdev->ops->get_vq(vmmio->kvm, vmmio->dev,
-				       vmmio->hdr.queue_sel);
-		ioport__write32(data, vq->pfn);
+		vq = vmmio_selected_vq(vdev, vmmio);
+		ioport__write32(data, vq->vring_addr.pfn);
 		break;
 	case VIRTIO_MMIO_QUEUE_NUM_MAX:
 		val = vdev->ops->get_size_vq(vmmio->kvm, vmmio->dev,
@@ -170,6 +172,7 @@ static void virtio_mmio_config_out(struct kvm_cpu *vcpu,
 	struct virtio_mmio *vmmio = vdev->virtio;
 	struct kvm *kvm = vmmio->kvm;
 	unsigned int vq_count = vdev->ops->get_vq_count(kvm, vmmio->dev);
+	struct virt_queue *vq;
 	u32 val = 0;
 
 	switch (addr) {
@@ -217,13 +220,17 @@ static void virtio_mmio_config_out(struct kvm_cpu *vcpu,
 	case VIRTIO_MMIO_QUEUE_PFN:
 		val = ioport__read32(data);
 		if (val) {
+			vq = vmmio_selected_vq(vdev, vmmio);
+			vq->vring_addr = (struct vring_addr) {
+				.legacy	= true,
+				.pfn	= val,
+				.align	= vmmio->hdr.queue_align,
+				.pgsize	= vmmio->hdr.guest_page_size,
+			};
 			virtio_mmio_init_ioeventfd(vmmio->kvm, vdev,
 						   vmmio->hdr.queue_sel);
 			vdev->ops->init_vq(vmmio->kvm, vmmio->dev,
-					   vmmio->hdr.queue_sel,
-					   vmmio->hdr.guest_page_size,
-					   vmmio->hdr.queue_align,
-					   val);
+					   vmmio->hdr.queue_sel);
 		} else {
 			virtio_mmio_exit_vq(kvm, vdev, vmmio->hdr.queue_sel);
 		}
diff --git a/virtio/net.c b/virtio/net.c
index de5ae7b4..7c7970a7 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -582,8 +582,7 @@ static bool is_ctrl_vq(struct net_dev *ndev, u32 vq)
 	return vq == (u32)(ndev->queue_pairs * 2);
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct vhost_vring_state state = { .index = vq };
 	struct net_dev_queue *net_queue;
@@ -598,8 +597,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	net_queue->id	= vq;
 	net_queue->ndev	= ndev;
 	queue		= &net_queue->vq;
-	virtio_init_device_vq(kvm, &ndev->vdev, queue, VIRTIO_NET_QUEUE_SIZE,
-			      page_size, align, pfn);
+	virtio_init_device_vq(kvm, &ndev->vdev, queue, VIRTIO_NET_QUEUE_SIZE);
 
 	mutex_init(&net_queue->lock);
 	pthread_cond_init(&net_queue->cond, NULL);
diff --git a/virtio/pci.c b/virtio/pci.c
index 23831d5a..20b16228 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -178,7 +178,7 @@ static bool virtio_pci__data_in(struct kvm_cpu *vcpu, struct virtio_device *vdev
 		break;
 	case VIRTIO_PCI_QUEUE_PFN:
 		vq = vdev->ops->get_vq(kvm, vpci->dev, vpci->queue_selector);
-		ioport__write32(data, vq->pfn);
+		ioport__write32(data, vq->vring_addr.pfn);
 		break;
 	case VIRTIO_PCI_QUEUE_NUM:
 		val = vdev->ops->get_size_vq(kvm, vpci->dev, vpci->queue_selector);
@@ -318,6 +318,7 @@ static bool virtio_pci__data_out(struct kvm_cpu *vcpu, struct virtio_device *vde
 {
 	bool ret = true;
 	struct virtio_pci *vpci;
+	struct virt_queue *vq;
 	struct kvm *kvm;
 	u32 val;
 	unsigned int vq_count;
@@ -334,11 +335,18 @@ static bool virtio_pci__data_out(struct kvm_cpu *vcpu, struct virtio_device *vde
 	case VIRTIO_PCI_QUEUE_PFN:
 		val = ioport__read32(data);
 		if (val) {
+			vq = vdev->ops->get_vq(kvm, vpci->dev,
+					       vpci->queue_selector);
+			vq->vring_addr = (struct vring_addr) {
+				.legacy	= true,
+				.pfn	= val,
+				.align	= VIRTIO_PCI_VRING_ALIGN,
+				.pgsize	= 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT,
+			};
 			virtio_pci__init_ioeventfd(kvm, vdev,
 						   vpci->queue_selector);
-			vdev->ops->init_vq(kvm, vpci->dev, vpci->queue_selector,
-					   1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT,
-					   VIRTIO_PCI_VRING_ALIGN, val);
+			vdev->ops->init_vq(kvm, vpci->dev,
+					   vpci->queue_selector);
 		} else {
 			virtio_pci_exit_vq(kvm, vdev, vpci->queue_selector);
 		}
diff --git a/virtio/rng.c b/virtio/rng.c
index 5bcd05a2..840da0ee 100644
--- a/virtio/rng.c
+++ b/virtio/rng.c
@@ -91,8 +91,7 @@ static void virtio_rng_do_io(struct kvm *kvm, void *param)
 	rdev->vdev.ops->signal_vq(kvm, &rdev->vdev, vq - rdev->vqs);
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct rng_dev *rdev = dev;
 	struct virt_queue *queue;
@@ -104,8 +103,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 
 	job = &rdev->jobs[vq];
 
-	virtio_init_device_vq(kvm, &rdev->vdev, queue, VIRTIO_RNG_QUEUE_SIZE,
-			      page_size, align, pfn);
+	virtio_init_device_vq(kvm, &rdev->vdev, queue, VIRTIO_RNG_QUEUE_SIZE);
 
 	*job = (struct rng_dev_job) {
 		.vq	= queue,
diff --git a/virtio/scsi.c b/virtio/scsi.c
index 9dd9e9ac..507cf3f1 100644
--- a/virtio/scsi.c
+++ b/virtio/scsi.c
@@ -62,8 +62,7 @@ static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct vhost_vring_state state = { .index = vq };
 	struct vhost_vring_addr addr;
@@ -75,8 +74,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 
 	queue		= &sdev->vqs[vq];
 
-	virtio_init_device_vq(kvm, &sdev->vdev, queue, VIRTIO_SCSI_QUEUE_SIZE,
-			      page_size, align, pfn);
+	virtio_init_device_vq(kvm, &sdev->vdev, queue, VIRTIO_SCSI_QUEUE_SIZE);
 
 	if (sdev->vhost_fd == 0)
 		return 0;
diff --git a/virtio/vsock.c b/virtio/vsock.c
index 79a672fe..dfd62112 100644
--- a/virtio/vsock.c
+++ b/virtio/vsock.c
@@ -65,8 +65,7 @@ static bool is_event_vq(u32 vq)
 	return vq == VSOCK_VQ_EVENT;
 }
 
-static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
-		   u32 pfn)
+static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct vhost_vring_state state = { .index = vq };
 	struct vhost_vring_addr addr;
@@ -77,8 +76,7 @@ static int init_vq(struct kvm *kvm, void *dev, u32 vq, u32 page_size, u32 align,
 	compat__remove_message(compat_id);
 
 	queue		= &vdev->vqs[vq];
-	virtio_init_device_vq(kvm, &vdev->vdev, queue, VIRTIO_VSOCK_QUEUE_SIZE,
-			      page_size, align, pfn);
+	virtio_init_device_vq(kvm, &vdev->vdev, queue, VIRTIO_VSOCK_QUEUE_SIZE);
 
 	if (vdev->vhost_fd == -1)
 		return 0;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 06/24] virtio: Add config access helpers
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (4 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 05/24] virtio: Support modern virtqueue addresses Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 07/24] virtio: Fix device-specific config endianness Jean-Philippe Brucker
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

At the moment device-specific config access is tailored for a Linux
guest, that performs any access in 8 bits. But config access can have
any size, and modern virtio drivers must use the size of the accessed
field. Add helpers that generalize config accesses.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio.h |  3 +++
 virtio/core.c        | 38 ++++++++++++++++++++++++++++++++++++++
 virtio/mmio.c        | 30 ++----------------------------
 virtio/pci.c         | 41 ++++-------------------------------------
 4 files changed, 47 insertions(+), 65 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 24179ecc..cc4ba1d6 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -236,6 +236,9 @@ void virtio_init_device_vq(struct kvm *kvm, struct virtio_device *vdev,
 			   struct virt_queue *vq, size_t nr_descs);
 void virtio_exit_vq(struct kvm *kvm, struct virtio_device *vdev, void *dev,
 		    int num);
+bool virtio_access_config(struct kvm *kvm, struct virtio_device *vdev, void *dev,
+			  unsigned long offset, void *data, size_t size,
+			  bool is_write);
 void virtio_set_guest_features(struct kvm *kvm, struct virtio_device *vdev,
 			       void *dev, u32 features);
 void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
diff --git a/virtio/core.c b/virtio/core.c
index d6f2c689..be0f6f8d 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -282,6 +282,44 @@ void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
 		vdev->ops->notify_status(kvm, dev, ext_status);
 }
 
+bool virtio_access_config(struct kvm *kvm, struct virtio_device *vdev,
+			  void *dev, unsigned long offset, void *data,
+			  size_t size, bool is_write)
+{
+	void *in, *out, *config;
+	size_t config_size = vdev->ops->get_config_size(kvm, dev);
+
+	if (WARN_ONCE(offset + size > config_size,
+		      "Config access offset (%lu) is beyond config size (%zu)\n",
+		      offset, config_size))
+		return false;
+
+	config = vdev->ops->get_config(kvm, dev) + offset;
+
+	in = is_write ? data : config;
+	out = is_write ? config : data;
+
+	switch (size) {
+	case 1:
+		*(u8 *)out = *(u8 *)in;
+		break;
+	case 2:
+		*(u16 *)out = *(u16 *)in;
+		break;
+	case 4:
+		*(u32 *)out = *(u32 *)in;
+		break;
+	case 8:
+		*(u64 *)out = *(u64 *)in;
+		break;
+	default:
+		WARN_ONCE(1, "%s: invalid access size\n", __func__);
+		return false;
+	}
+
+	return true;
+}
+
 int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		struct virtio_ops *ops, enum virtio_trans trans,
 		int device_id, int subsys_id, int class)
diff --git a/virtio/mmio.c b/virtio/mmio.c
index 77289e2b..268a4391 100644
--- a/virtio/mmio.c
+++ b/virtio/mmio.c
@@ -98,33 +98,6 @@ int virtio_mmio_signal_config(struct kvm *kvm, struct virtio_device *vdev)
 	return 0;
 }
 
-static void virtio_mmio_device_specific(struct kvm_cpu *vcpu,
-					u64 addr, u8 *data, u32 len,
-					u8 is_write, struct virtio_device *vdev)
-{
-	struct virtio_mmio *vmmio = vdev->virtio;
-	u8 *config;
-	size_t config_size;
-	u32 i;
-
-	config = vdev->ops->get_config(vmmio->kvm, vmmio->dev);
-	config_size = vdev->ops->get_config_size(vmmio->kvm, vmmio->dev);
-
-	/* Prevent invalid accesses which go beyond the config */
-	if (config_size < addr + len) {
-		WARN_ONCE(1, "Offset (%llu) Length (%u) goes beyond config size (%zu).\n",
-			addr, len, config_size);
-		return;
-	}
-
-	for (i = 0; i < len; i++) {
-		if (is_write)
-			config[addr + i] = *(u8 *)data + i;
-		else
-			data[i] = config[addr + i];
-	}
-}
-
 #define vmmio_selected_vq(vdev, vmmio) \
 	(vdev)->ops->get_vq((vmmio)->kvm, (vmmio)->dev, (vmmio)->hdr.queue_sel)
 
@@ -263,7 +236,8 @@ static void virtio_mmio_mmio_callback(struct kvm_cpu *vcpu,
 
 	if (offset >= VIRTIO_MMIO_CONFIG) {
 		offset -= VIRTIO_MMIO_CONFIG;
-		virtio_mmio_device_specific(vcpu, offset, data, len, is_write, ptr);
+		virtio_access_config(vmmio->kvm, vdev, vmmio->dev, offset, data,
+				     len, is_write);
 		return;
 	}
 
diff --git a/virtio/pci.c b/virtio/pci.c
index 20b16228..85018e79 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -135,25 +135,8 @@ static bool virtio_pci__specific_data_in(struct kvm *kvm, struct virtio_device *
 
 		return true;
 	} else if (type == VIRTIO_PCI_O_CONFIG) {
-		u8 cfg;
-		size_t config_size;
-
-		config_size = vdev->ops->get_config_size(kvm, vpci->dev);
-		if (config_offset + size > config_size) {
-			/* Access goes beyond the config size, so return failure. */
-			WARN_ONCE(1, "Config access offset (%u) is beyond config size (%zu)\n",
-				config_offset, config_size);
-			return false;
-		}
-
-		/* TODO: Handle access lengths beyond one byte */
-		if (size != 1) {
-			WARN_ONCE(1, "Size (%u) not supported\n", size);
-			return false;
-		}
-		cfg = vdev->ops->get_config(kvm, vpci->dev)[config_offset];
-		ioport__write8(data, cfg);
-		return true;
+		return virtio_access_config(kvm, vdev, vpci->dev, config_offset,
+					    data, size, false);
 	}
 
 	return false;
@@ -290,24 +273,8 @@ static bool virtio_pci__specific_data_out(struct kvm *kvm, struct virtio_device
 
 		return true;
 	} else if (type == VIRTIO_PCI_O_CONFIG) {
-		size_t config_size;
-
-		config_size = vdev->ops->get_config_size(kvm, vpci->dev);
-		if (config_offset + size > config_size) {
-			/* Access goes beyond the config size, so return failure. */
-			WARN_ONCE(1, "Config access offset (%u) is beyond config size (%zu)\n",
-				config_offset, config_size);
-			return false;
-		}
-
-		/* TODO: Handle access lengths beyond one byte */
-		if (size != 1) {
-			WARN_ONCE(1, "Size (%u) not supported\n", size);
-			return false;
-		}
-		vdev->ops->get_config(kvm, vpci->dev)[config_offset] = *(u8 *)data;
-
-		return true;
+		return virtio_access_config(kvm, vdev, vpci->dev, config_offset,
+					    data, size, true);
 	}
 
 	return false;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 07/24] virtio: Fix device-specific config endianness
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (5 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 06/24] virtio: Add config access helpers Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 08/24] virtio/console: Remove unused callback Jean-Philippe Brucker
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Some legacy virtio drivers expect to read the device-specific config in
guest endianness (2.5.3 "Legacy Interface: A Note on Device
Configuration Space endian-ness").

Kvmtool doesn't know the guest endianness until it can probe a VCPU. So
the config fields start in host endianness, and are swapped once the
guest is running. Currently this is done in set_guest_features(), but
that is too late because the driver is allowed to read config fields
before setting feature bits (2.5.2 "Device Requirements: Device
Configuration Space"). In addition some devices don't swap the fields,
and those that do swap the fields do it every time the guest writes the
feature register, which can't work if a device gets reset more than
once.

Initialize the config on device reset. Do it on every reset because in
theory multiple guests could run with different endianness during the
lifetime of the device.

Notes:

* the balloon device uses little-endian (5.5.4.0.0.1 "Legacy Interface:
  Device configuration layout").

* the vsock device was introduced after virtio 0.9.5, hence doesn't
  describe a legacy interface, but the Linux driver allows to use the
  legacy transport, and always reads the 64-bit guest_cid field as
  little-endian.

* the specification does not describe the 9p device, but the Linux
  driver uses guest-endian helpers.

* the specification does not explicitly forbid a driver from reading the
  configuration at any time, but a driver must follow the sequence from
  3.1.1 "Driver Requirements: Device Initialization", where the driver
  is allowed to read the config after setting the DRIVER status bit. It
  should therefore be safe to keep dealing with guest endianness only on
  device reset, and not on the first config access.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio-9p.h |  1 +
 include/kvm/virtio.h    |  2 ++
 virtio/9p.c             | 10 ++++++----
 virtio/balloon.c        | 19 ++++++++++++++-----
 virtio/blk.c            | 27 ++++++++++-----------------
 virtio/console.c        | 22 +++++++++-------------
 virtio/core.c           |  2 ++
 virtio/net.c            | 23 +++++++++++++++--------
 virtio/scsi.c           | 30 ++++++++++++++++++------------
 virtio/vsock.c          | 11 +++++++----
 10 files changed, 84 insertions(+), 63 deletions(-)

diff --git a/include/kvm/virtio-9p.h b/include/kvm/virtio-9p.h
index 77c5062a..46922781 100644
--- a/include/kvm/virtio-9p.h
+++ b/include/kvm/virtio-9p.h
@@ -47,6 +47,7 @@ struct p9_dev {
 	size_t config_size;
 	struct virtio_9p_config	*config;
 	u32			features;
+	u16			tag_len;
 
 	/* virtio queue */
 	struct virt_queue	vqs[NUM_VIRT_QUEUES];
diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index cc4ba1d6..9913ba92 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -43,6 +43,8 @@
 #define VIRTIO__STATUS_START		(1 << 8)
 /* Stop the device */
 #define VIRTIO__STATUS_STOP		(1 << 9)
+/* Initialize the config */
+#define VIRTIO__STATUS_CONFIG		(1 << 10)
 
 struct vring_addr {
 	bool			legacy;
diff --git a/virtio/9p.c b/virtio/9p.c
index a3f96669..e16c1edd 100644
--- a/virtio/9p.c
+++ b/virtio/9p.c
@@ -1390,10 +1390,8 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
 {
 	struct p9_dev *p9dev = dev;
-	struct virtio_9p_config *conf = p9dev->config;
 
 	p9dev->features = features;
-	conf->tag_len = virtio_host_to_guest_u16(&p9dev->vdev, conf->tag_len);
 }
 
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
@@ -1401,6 +1399,10 @@ static void notify_status(struct kvm *kvm, void *dev, u32 status)
 	struct p9_dev *p9dev = dev;
 	struct p9_fid *pfid, *next;
 
+	if (status & VIRTIO__STATUS_CONFIG)
+		p9dev->config->tag_len = virtio_host_to_guest_u16(&p9dev->vdev,
+								  p9dev->tag_len);
+
 	if (!(status & VIRTIO__STATUS_STOP))
 		return;
 
@@ -1596,8 +1598,8 @@ int virtio_9p__register(struct kvm *kvm, const char *root, const char *tag_name)
 	strncpy(p9dev->root_dir, root, sizeof(p9dev->root_dir));
 	p9dev->root_dir[sizeof(p9dev->root_dir)-1] = '\x00';
 
-	p9dev->config->tag_len = tag_length;
-	if (p9dev->config->tag_len > MAX_TAG_LEN) {
+	p9dev->tag_len = tag_length;
+	if (p9dev->tag_len > MAX_TAG_LEN) {
 		err = -EINVAL;
 		goto free_p9dev_config;
 	}
diff --git a/virtio/balloon.c b/virtio/balloon.c
index ffeeb293..753171d1 100644
--- a/virtio/balloon.c
+++ b/virtio/balloon.c
@@ -13,6 +13,7 @@
 #include <linux/virtio_ring.h>
 #include <linux/virtio_balloon.h>
 
+#include <linux/byteorder.h>
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <fcntl.h>
@@ -56,22 +57,25 @@ static bool virtio_bln_do_io_request(struct kvm *kvm, struct bln_dev *bdev, stru
 	unsigned int len = 0;
 	u16 out, in, head;
 	u32 *ptrs, i;
+	u32 actual;
 
 	head	= virt_queue__get_iov(queue, iov, &out, &in, kvm);
 	ptrs	= iov[0].iov_base;
 	len	= iov[0].iov_len / sizeof(u32);
 
+	actual = le32_to_cpu(bdev->config.actual);
 	for (i = 0 ; i < len ; i++) {
 		void *guest_ptr;
 
 		guest_ptr = guest_flat_to_host(kvm, (u64)ptrs[i] << VIRTIO_BALLOON_PFN_SHIFT);
 		if (queue == &bdev->vqs[VIRTIO_BLN_INFLATE]) {
 			madvise(guest_ptr, 1 << VIRTIO_BALLOON_PFN_SHIFT, MADV_DONTNEED);
-			bdev->config.actual++;
+			actual++;
 		} else if (queue == &bdev->vqs[VIRTIO_BLN_DEFLATE]) {
-			bdev->config.actual--;
+			actual--;
 		}
 	}
+	bdev->config.actual = cpu_to_le32(actual);
 
 	virt_queue__set_used_elem(queue, head, len);
 
@@ -161,20 +165,25 @@ static void virtio_bln__print_stats(struct kvm *kvm, int fd, u32 type, u32 len,
 static void handle_mem(struct kvm *kvm, int fd, u32 type, u32 len, u8 *msg)
 {
 	int mem;
+	u32 num_pages;
 
 	if (WARN_ON(type != KVM_IPC_BALLOON || len != sizeof(int)))
 		return;
 
 	mem = *(int *)msg;
+	num_pages = le32_to_cpu(bdev.config.num_pages);
+
 	if (mem > 0) {
-		bdev.config.num_pages += 256 * mem;
+		num_pages += 256 * mem;
 	} else if (mem < 0) {
-		if (bdev.config.num_pages < (u32)(256 * (-mem)))
+		if (num_pages < (u32)(256 * (-mem)))
 			return;
 
-		bdev.config.num_pages += 256 * mem;
+		num_pages += 256 * mem;
 	}
 
+	bdev.config.num_pages = cpu_to_le32(num_pages);
+
 	/* Notify that the configuration space has changed */
 	bdev.vdev.ops->signal_config(kvm, &bdev.vdev);
 }
diff --git a/virtio/blk.c b/virtio/blk.c
index 2479e006..b07234ff 100644
--- a/virtio/blk.c
+++ b/virtio/blk.c
@@ -43,6 +43,7 @@ struct blk_dev {
 
 	struct virtio_device		vdev;
 	struct virtio_blk_config	blk_config;
+	u64				capacity;
 	struct disk_image		*disk;
 	u32				features;
 
@@ -167,25 +168,20 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
 {
 	struct blk_dev *bdev = dev;
-	struct virtio_blk_config *conf = &bdev->blk_config;
 
 	bdev->features = features;
-
-	conf->capacity = virtio_host_to_guest_u64(&bdev->vdev, conf->capacity);
-	conf->size_max = virtio_host_to_guest_u32(&bdev->vdev, conf->size_max);
-	conf->seg_max = virtio_host_to_guest_u32(&bdev->vdev, conf->seg_max);
-
-	/* Geometry */
-	conf->geometry.cylinders = virtio_host_to_guest_u16(&bdev->vdev,
-						conf->geometry.cylinders);
-
-	conf->blk_size = virtio_host_to_guest_u32(&bdev->vdev, conf->blk_size);
-	conf->min_io_size = virtio_host_to_guest_u16(&bdev->vdev, conf->min_io_size);
-	conf->opt_io_size = virtio_host_to_guest_u32(&bdev->vdev, conf->opt_io_size);
 }
 
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
+	struct blk_dev *bdev = dev;
+	struct virtio_blk_config *conf = &bdev->blk_config;
+
+	if (!(status & VIRTIO__STATUS_CONFIG))
+		return;
+
+	conf->capacity = virtio_host_to_guest_u64(&bdev->vdev, bdev->capacity);
+	conf->seg_max = virtio_host_to_guest_u32(&bdev->vdev, DISK_SEG_MAX);
 }
 
 static void *virtio_blk_thread(void *dev)
@@ -318,10 +314,7 @@ static int virtio_blk__init_one(struct kvm *kvm, struct disk_image *disk)
 
 	*bdev = (struct blk_dev) {
 		.disk			= disk,
-		.blk_config		= (struct virtio_blk_config) {
-			.capacity	= disk->size / SECTOR_SIZE,
-			.seg_max	= DISK_SEG_MAX,
-		},
+		.capacity		= disk->size / SECTOR_SIZE,
 		.kvm			= kvm,
 	};
 
diff --git a/virtio/console.c b/virtio/console.c
index 5263b8e7..7ba948d3 100644
--- a/virtio/console.c
+++ b/virtio/console.c
@@ -42,14 +42,7 @@ struct con_dev {
 
 static struct con_dev cdev = {
 	.mutex				= MUTEX_INITIALIZER,
-
 	.vq_ready			= 0,
-
-	.config = {
-		.cols			= 80,
-		.rows			= 24,
-		.max_nr_ports		= 1,
-	},
 };
 
 static int compat_id = -1;
@@ -135,16 +128,19 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 
 static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
 {
-	struct con_dev *cdev = dev;
-	struct virtio_console_config *conf = &cdev->config;
-
-	conf->cols = virtio_host_to_guest_u16(&cdev->vdev, conf->cols);
-	conf->rows = virtio_host_to_guest_u16(&cdev->vdev, conf->rows);
-	conf->max_nr_ports = virtio_host_to_guest_u32(&cdev->vdev, conf->max_nr_ports);
 }
 
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
+	struct con_dev *cdev = dev;
+	struct virtio_console_config *conf = &cdev->config;
+
+	if (!(status & VIRTIO__STATUS_CONFIG))
+		return;
+
+	conf->cols = virtio_host_to_guest_u16(&cdev->vdev, 80);
+	conf->rows = virtio_host_to_guest_u16(&cdev->vdev, 24);
+	conf->max_nr_ports = virtio_host_to_guest_u32(&cdev->vdev, 1);
 }
 
 static int init_vq(struct kvm *kvm, void *dev, u32 vq)
diff --git a/virtio/core.c b/virtio/core.c
index be0f6f8d..fbf4b139 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -277,6 +277,8 @@ void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
 		 */
 		vdev->ops->reset(kvm, vdev);
 	}
+	if (!status)
+		ext_status |= VIRTIO__STATUS_CONFIG;
 
 	if (vdev->ops->notify_status)
 		vdev->ops->notify_status(kvm, dev, ext_status);
diff --git a/virtio/net.c b/virtio/net.c
index 7c7970a7..29b33e08 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -534,13 +534,8 @@ static int virtio_net__vhost_set_features(struct net_dev *ndev)
 static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
 {
 	struct net_dev *ndev = dev;
-	struct virtio_net_config *conf = &ndev->config;
 
 	ndev->features = features;
-
-	conf->status = virtio_host_to_guest_u16(&ndev->vdev, conf->status);
-	conf->max_virtqueue_pairs = virtio_host_to_guest_u16(&ndev->vdev,
-							     conf->max_virtqueue_pairs);
 }
 
 static void virtio_net_start(struct net_dev *ndev)
@@ -569,8 +564,23 @@ static void virtio_net_stop(struct net_dev *ndev)
 		uip_exit(&ndev->info);
 }
 
+static void virtio_net_update_endian(struct net_dev *ndev)
+{
+	struct virtio_net_config *conf = &ndev->config;
+
+	conf->status = virtio_host_to_guest_u16(&ndev->vdev,
+						VIRTIO_NET_S_LINK_UP);
+	conf->max_virtqueue_pairs = virtio_host_to_guest_u16(&ndev->vdev,
+							     ndev->queue_pairs);
+}
+
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
+	struct net_dev *ndev = dev;
+
+	if (status & VIRTIO__STATUS_CONFIG)
+		virtio_net_update_endian(ndev);
+
 	if (status & VIRTIO__STATUS_START)
 		virtio_net_start(dev);
 	else if (status & VIRTIO__STATUS_STOP)
@@ -932,9 +942,6 @@ static int virtio_net__init_one(struct virtio_net_params *params)
 
 	mutex_init(&ndev->mutex);
 	ndev->queue_pairs = max(1, min(VIRTIO_NET_NUM_QUEUES, params->mq));
-	ndev->config.status = VIRTIO_NET_S_LINK_UP;
-	if (ndev->queue_pairs > 1)
-		ndev->config.max_virtqueue_pairs = ndev->queue_pairs;
 
 	for (i = 0 ; i < 6 ; i++) {
 		ndev->config.mac[i]		= params->guest_mac[i];
diff --git a/virtio/scsi.c b/virtio/scsi.c
index 507cf3f1..ef75ef79 100644
--- a/virtio/scsi.c
+++ b/virtio/scsi.c
@@ -60,6 +60,24 @@ static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
 
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
+	struct scsi_dev *sdev = dev;
+	struct virtio_device *vdev = &sdev->vdev;
+	struct virtio_scsi_config *conf = &sdev->config;
+
+	if (!(status & VIRTIO__STATUS_CONFIG))
+		return;
+
+	/* Avoid warning when endianness helpers are compiled out */
+	vdev = vdev;
+
+	conf->num_queues = virtio_host_to_guest_u32(vdev, NUM_VIRT_QUEUES - 2);
+	conf->seg_max = virtio_host_to_guest_u32(vdev, VIRTIO_SCSI_CDB_SIZE - 2);
+	conf->max_sectors = virtio_host_to_guest_u32(vdev, 65535);
+	conf->cmd_per_lun = virtio_host_to_guest_u32(vdev, 128);
+	conf->sense_size = virtio_host_to_guest_u32(vdev, VIRTIO_SCSI_SENSE_SIZE);
+	conf->cdb_size = virtio_host_to_guest_u32(vdev, VIRTIO_SCSI_CDB_SIZE);
+	conf->max_lun = virtio_host_to_guest_u32(vdev, 16383);
+	conf->event_info_size = virtio_host_to_guest_u32(vdev, sizeof(struct virtio_scsi_event));
 }
 
 static int init_vq(struct kvm *kvm, void *dev, u32 vq)
@@ -247,18 +265,6 @@ static int virtio_scsi_init_one(struct kvm *kvm, struct disk_image *disk)
 		return -ENOMEM;
 
 	*sdev = (struct scsi_dev) {
-		.config	= (struct virtio_scsi_config) {
-			.num_queues	= NUM_VIRT_QUEUES - 2,
-			.seg_max	= VIRTIO_SCSI_CDB_SIZE - 2,
-			.max_sectors	= 65535,
-			.cmd_per_lun	= 128,
-			.sense_size	= VIRTIO_SCSI_SENSE_SIZE,
-			.cdb_size	= VIRTIO_SCSI_CDB_SIZE,
-			.max_channel	= 0,
-			.max_target	= 0,
-			.max_lun	= 16383,
-			.event_info_size = sizeof(struct virtio_scsi_event),
-		},
 		.kvm			= kvm,
 	};
 	strlcpy((char *)&sdev->target.vhost_wwpn, disk->wwpn, sizeof(sdev->target.vhost_wwpn));
diff --git a/virtio/vsock.c b/virtio/vsock.c
index dfd62112..594bab7a 100644
--- a/virtio/vsock.c
+++ b/virtio/vsock.c
@@ -7,6 +7,7 @@
 #include "kvm/virtio-pci.h"
 #include "kvm/virtio.h"
 
+#include <linux/byteorder.h>
 #include <linux/kernel.h>
 #include <linux/virtio_vsock.h>
 #include <linux/vhost.h>
@@ -26,6 +27,7 @@ enum {
 struct vsock_dev {
 	struct virt_queue		vqs[VSOCK_VQ_MAX];
 	struct virtio_vsock_config	config;
+	u64				guest_cid;
 	u32				features;
 	int				vhost_fd;
 	struct virtio_device		vdev;
@@ -133,6 +135,9 @@ static void notify_status(struct kvm *kvm, void *dev, u32 status)
 	struct vsock_dev *vdev = dev;
 	int r, start;
 
+	if (status & VIRTIO__STATUS_CONFIG)
+		vdev->config.guest_cid = cpu_to_le64(vdev->guest_cid);
+
 	if (status & VIRTIO__STATUS_START)
 		start = 1;
 	else if (status & VIRTIO__STATUS_STOP)
@@ -261,7 +266,7 @@ static void virtio_vhost_vsock_init(struct kvm *kvm, struct vsock_dev *vdev)
 	if (r != 0)
 		die_perror("VHOST_SET_FEATURES failed");
 
-	r = ioctl(vdev->vhost_fd, VHOST_VSOCK_SET_GUEST_CID, &vdev->config.guest_cid);
+	r = ioctl(vdev->vhost_fd, VHOST_VSOCK_SET_GUEST_CID, &vdev->guest_cid);
 	if (r != 0)
 		die_perror("VHOST_VSOCK_SET_GUEST_CID failed");
 
@@ -280,9 +285,7 @@ static int virtio_vsock_init_one(struct kvm *kvm, u64 guest_cid)
 		return -ENOMEM;
 
 	*vdev = (struct vsock_dev) {
-		.config	= (struct virtio_vsock_config) {
-			.guest_cid	= guest_cid,
-		},
+		.guest_cid		= guest_cid,
 		.vhost_fd		= -1,
 		.kvm			= kvm,
 	};
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 08/24] virtio/console: Remove unused callback
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (6 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 07/24] virtio: Fix device-specific config endianness Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 09/24] virtio: Remove set_guest_features() device op Jean-Philippe Brucker
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Remove unused set_status() callback

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/balloon.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/virtio/balloon.c b/virtio/balloon.c
index 753171d1..f06955d2 100644
--- a/virtio/balloon.c
+++ b/virtio/balloon.c
@@ -214,10 +214,6 @@ static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
 	bdev->features = features;
 }
 
-static void notify_status(struct kvm *kvm, void *dev, u32 status)
-{
-}
-
 static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct bln_dev *bdev = dev;
@@ -272,7 +268,6 @@ struct virtio_ops bln_dev_virtio_ops = {
 	.get_host_features	= get_host_features,
 	.set_guest_features	= set_guest_features,
 	.init_vq		= init_vq,
-	.notify_status		= notify_status,
 	.notify_vq		= notify_vq,
 	.get_vq			= get_vq,
 	.get_size_vq		= get_size_vq,
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 09/24] virtio: Remove set_guest_features() device op
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (7 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 08/24] virtio/console: Remove unused callback Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 10/24] Add memcpy_fromiovec_safe Jean-Philippe Brucker
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Now that devices have a status callback, they don't use
set_guest_features() anymore. The negotiated feature set is available in
struct virtio_device. Remove the callback from all devices.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio-9p.h |  1 -
 include/kvm/virtio.h    |  1 -
 virtio/9p.c             |  8 --------
 virtio/balloon.c        | 10 ----------
 virtio/blk.c            |  9 ---------
 virtio/console.c        |  6 ------
 virtio/core.c           |  1 -
 virtio/net.c            | 12 ++----------
 virtio/rng.c            |  6 ------
 virtio/scsi.c           |  9 ---------
 virtio/vsock.c          |  8 --------
 11 files changed, 2 insertions(+), 69 deletions(-)

diff --git a/include/kvm/virtio-9p.h b/include/kvm/virtio-9p.h
index 46922781..1dffc955 100644
--- a/include/kvm/virtio-9p.h
+++ b/include/kvm/virtio-9p.h
@@ -46,7 +46,6 @@ struct p9_dev {
 
 	size_t config_size;
 	struct virtio_9p_config	*config;
-	u32			features;
 	u16			tag_len;
 
 	/* virtio queue */
diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 9913ba92..2da5e4f6 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -210,7 +210,6 @@ struct virtio_ops {
 	u8 *(*get_config)(struct kvm *kvm, void *dev);
 	size_t (*get_config_size)(struct kvm *kvm, void *dev);
 	u32 (*get_host_features)(struct kvm *kvm, void *dev);
-	void (*set_guest_features)(struct kvm *kvm, void *dev, u32 features);
 	unsigned int (*get_vq_count)(struct kvm *kvm, void *dev);
 	int (*init_vq)(struct kvm *kvm, void *dev, u32 vq);
 	void (*exit_vq)(struct kvm *kvm, void *dev, u32 vq);
diff --git a/virtio/9p.c b/virtio/9p.c
index e16c1edd..cb3a42a4 100644
--- a/virtio/9p.c
+++ b/virtio/9p.c
@@ -1387,13 +1387,6 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 	return 1 << VIRTIO_9P_MOUNT_TAG;
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-	struct p9_dev *p9dev = dev;
-
-	p9dev->features = features;
-}
-
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
 	struct p9_dev *p9dev = dev;
@@ -1475,7 +1468,6 @@ struct virtio_ops p9_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.init_vq		= init_vq,
 	.exit_vq		= exit_vq,
 	.notify_status		= notify_status,
diff --git a/virtio/balloon.c b/virtio/balloon.c
index f06955d2..6f10219e 100644
--- a/virtio/balloon.c
+++ b/virtio/balloon.c
@@ -33,8 +33,6 @@ struct bln_dev {
 	struct list_head	list;
 	struct virtio_device	vdev;
 
-	u32			features;
-
 	/* virtio queue */
 	struct virt_queue	vqs[NUM_VIRT_QUEUES];
 	struct thread_pool__job	jobs[NUM_VIRT_QUEUES];
@@ -207,13 +205,6 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 	return 1 << VIRTIO_BALLOON_F_STATS_VQ;
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-	struct bln_dev *bdev = dev;
-
-	bdev->features = features;
-}
-
 static int init_vq(struct kvm *kvm, void *dev, u32 vq)
 {
 	struct bln_dev *bdev = dev;
@@ -266,7 +257,6 @@ struct virtio_ops bln_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.init_vq		= init_vq,
 	.notify_vq		= notify_vq,
 	.get_vq			= get_vq,
diff --git a/virtio/blk.c b/virtio/blk.c
index b07234ff..b56d45bd 100644
--- a/virtio/blk.c
+++ b/virtio/blk.c
@@ -45,7 +45,6 @@ struct blk_dev {
 	struct virtio_blk_config	blk_config;
 	u64				capacity;
 	struct disk_image		*disk;
-	u32				features;
 
 	struct virt_queue		vqs[NUM_VIRT_QUEUES];
 	struct blk_dev_req		reqs[VIRTIO_BLK_QUEUE_SIZE];
@@ -165,13 +164,6 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 		| (bdev->disk->readonly ? 1UL << VIRTIO_BLK_F_RO : 0);
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-	struct blk_dev *bdev = dev;
-
-	bdev->features = features;
-}
-
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
 	struct blk_dev *bdev = dev;
@@ -289,7 +281,6 @@ static struct virtio_ops blk_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.get_vq_count		= get_vq_count,
 	.init_vq		= init_vq,
 	.exit_vq		= exit_vq,
diff --git a/virtio/console.c b/virtio/console.c
index 7ba948d3..610995de 100644
--- a/virtio/console.c
+++ b/virtio/console.c
@@ -34,7 +34,6 @@ struct con_dev {
 	struct virtio_device		vdev;
 	struct virt_queue		vqs[VIRTIO_CONSOLE_NUM_QUEUES];
 	struct virtio_console_config	config;
-	u32				features;
 	int				vq_ready;
 
 	struct thread_pool__job		jobs[VIRTIO_CONSOLE_NUM_QUEUES];
@@ -126,10 +125,6 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 	return 0;
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-}
-
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
 	struct con_dev *cdev = dev;
@@ -216,7 +211,6 @@ static struct virtio_ops con_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.get_vq_count		= get_vq_count,
 	.init_vq		= init_vq,
 	.exit_vq		= exit_vq,
diff --git a/virtio/core.c b/virtio/core.c
index fbf4b139..568667f2 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -250,7 +250,6 @@ void virtio_set_guest_features(struct kvm *kvm, struct virtio_device *vdev,
 	/* TODO: fail negotiation if features & ~host_features */
 
 	vdev->features = features;
-	vdev->ops->set_guest_features(kvm, dev, features);
 }
 
 void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
diff --git a/virtio/net.c b/virtio/net.c
index 29b33e08..844612ac 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -55,7 +55,7 @@ struct net_dev {
 
 	struct net_dev_queue		queues[VIRTIO_NET_NUM_QUEUES * 2 + 1];
 	struct virtio_net_config	config;
-	u32				features, queue_pairs;
+	u32				queue_pairs;
 
 	int				vhost_fd;
 	int				tap_fd;
@@ -78,7 +78,7 @@ static int compat_id = -1;
 
 static bool has_virtio_feature(struct net_dev *ndev, u32 feature)
 {
-	return ndev->features & (1 << feature);
+	return ndev->vdev.features & (1 << feature);
 }
 
 static void virtio_net_fix_tx_hdr(struct virtio_net_hdr *hdr, struct net_dev *ndev)
@@ -531,13 +531,6 @@ static int virtio_net__vhost_set_features(struct net_dev *ndev)
 	return ioctl(ndev->vhost_fd, VHOST_SET_FEATURES, &features);
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-	struct net_dev *ndev = dev;
-
-	ndev->features = features;
-}
-
 static void virtio_net_start(struct net_dev *ndev)
 {
 	if (ndev->mode == NET_MODE_TAP) {
@@ -770,7 +763,6 @@ static struct virtio_ops net_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.get_vq_count		= get_vq_count,
 	.init_vq		= init_vq,
 	.exit_vq		= exit_vq,
diff --git a/virtio/rng.c b/virtio/rng.c
index 840da0ee..8fda9dd6 100644
--- a/virtio/rng.c
+++ b/virtio/rng.c
@@ -58,11 +58,6 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 	return 0;
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-	/* Unused */
-}
-
 static bool virtio_rng_do_io_request(struct kvm *kvm, struct rng_dev *rdev, struct virt_queue *queue)
 {
 	struct iovec iov[VIRTIO_RNG_QUEUE_SIZE];
@@ -151,7 +146,6 @@ static struct virtio_ops rng_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.init_vq		= init_vq,
 	.notify_vq		= notify_vq,
 	.get_vq			= get_vq,
diff --git a/virtio/scsi.c b/virtio/scsi.c
index ef75ef79..d69183b7 100644
--- a/virtio/scsi.c
+++ b/virtio/scsi.c
@@ -24,7 +24,6 @@ struct scsi_dev {
 	struct virt_queue		vqs[NUM_VIRT_QUEUES];
 	struct virtio_scsi_config	config;
 	struct vhost_scsi_target	target;
-	u32				features;
 	int				vhost_fd;
 	struct virtio_device		vdev;
 	struct list_head		list;
@@ -51,13 +50,6 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 		1UL << VIRTIO_RING_F_INDIRECT_DESC;
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-	struct scsi_dev *sdev = dev;
-
-	sdev->features = features;
-}
-
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
 {
 	struct scsi_dev *sdev = dev;
@@ -198,7 +190,6 @@ static struct virtio_ops scsi_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.init_vq		= init_vq,
 	.get_vq			= get_vq,
 	.get_size_vq		= get_size_vq,
diff --git a/virtio/vsock.c b/virtio/vsock.c
index 594bab7a..02cee683 100644
--- a/virtio/vsock.c
+++ b/virtio/vsock.c
@@ -55,13 +55,6 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 		| 1UL << VIRTIO_RING_F_INDIRECT_DESC;
 }
 
-static void set_guest_features(struct kvm *kvm, void *dev, u32 features)
-{
-	struct vsock_dev *vdev = dev;
-
-	vdev->features = features;
-}
-
 static bool is_event_vq(u32 vq)
 {
 	return vq == VSOCK_VQ_EVENT;
@@ -212,7 +205,6 @@ static struct virtio_ops vsock_dev_virtio_ops = {
 	.get_config		= get_config,
 	.get_config_size	= get_config_size,
 	.get_host_features	= get_host_features,
-	.set_guest_features	= set_guest_features,
 	.init_vq		= init_vq,
 	.get_vq			= get_vq,
 	.get_size_vq		= get_size_vq,
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 10/24] Add memcpy_fromiovec_safe
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (8 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 09/24] virtio: Remove set_guest_features() device op Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 11/24] virtio/net: Offload vnet header endianness conversion to tap Jean-Philippe Brucker
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Existing IOV functions don't take the iovec size as parameter. This is
unfortunate because when parsing buffers split into header and body,
callers may want to know where the body starts in the iovec, after copying
the header. Add a function that does the same as memcpy_fromiovec, but
also allows to iterate over the iovec.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/iovec.h |  2 ++
 util/iovec.c        | 31 +++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/include/kvm/iovec.h b/include/kvm/iovec.h
index fe79dd48..55a03913 100644
--- a/include/kvm/iovec.h
+++ b/include/kvm/iovec.h
@@ -7,6 +7,8 @@ extern int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov,
 extern int memcpy_toiovec(struct iovec *v, unsigned char *kdata, int len);
 extern int memcpy_toiovecend(const struct iovec *v, unsigned char *kdata,
 				size_t offset, int len);
+ssize_t memcpy_fromiovec_safe(void *buf, struct iovec **iov, size_t len,
+			      size_t *iovcount);
 
 static inline size_t iov_size(const struct iovec *iovec, size_t len)
 {
diff --git a/util/iovec.c b/util/iovec.c
index 089f1051..c0159011 100644
--- a/util/iovec.c
+++ b/util/iovec.c
@@ -93,6 +93,37 @@ int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len)
 	return 0;
 }
 
+/*
+ *	Copy at most @len bytes from iovec to buffer.
+ *	Returns the remaining len.
+ *
+ *	Note: this modifies the original iovec, the iov pointer, and the
+ *	iovcount to describe the remaining buffer.
+ */
+ssize_t memcpy_fromiovec_safe(void *buf, struct iovec **iov, size_t len,
+			      size_t *iovcount)
+{
+	size_t copy;
+
+	while (len && *iovcount) {
+		copy = min(len, (*iov)->iov_len);
+		memcpy(buf, (*iov)->iov_base, copy);
+		buf += copy;
+		len -= copy;
+
+		/* Move iov cursor */
+		(*iov)->iov_base += copy;
+		(*iov)->iov_len -= copy;
+
+		if (!(*iov)->iov_len) {
+			(*iov)++;
+			(*iovcount)--;
+		}
+	}
+
+	return len;
+}
+
 /*
  *	Copy iovec from kernel. Returns -EFAULT on error.
  */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 11/24] virtio/net: Offload vnet header endianness conversion to tap
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (9 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 10/24] Add memcpy_fromiovec_safe Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 12/24] virtio/net: Prepare for modern virtio Jean-Philippe Brucker
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

The conversion of vnet header fields will be more difficult when
supporting the virtio ANY_LAYOUT feature. Since the uip backend doesn't
use the vnet header, and since tap can handle that conversion itself,
offload it to tap.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/net.c | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/virtio/net.c b/virtio/net.c
index 844612ac..70002f72 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -81,22 +81,6 @@ static bool has_virtio_feature(struct net_dev *ndev, u32 feature)
 	return ndev->vdev.features & (1 << feature);
 }
 
-static void virtio_net_fix_tx_hdr(struct virtio_net_hdr *hdr, struct net_dev *ndev)
-{
-	hdr->hdr_len		= virtio_guest_to_host_u16(&ndev->vdev, hdr->hdr_len);
-	hdr->gso_size		= virtio_guest_to_host_u16(&ndev->vdev, hdr->gso_size);
-	hdr->csum_start		= virtio_guest_to_host_u16(&ndev->vdev, hdr->csum_start);
-	hdr->csum_offset	= virtio_guest_to_host_u16(&ndev->vdev, hdr->csum_offset);
-}
-
-static void virtio_net_fix_rx_hdr(struct virtio_net_hdr *hdr, struct net_dev *ndev)
-{
-	hdr->hdr_len		= virtio_host_to_guest_u16(&ndev->vdev, hdr->hdr_len);
-	hdr->gso_size		= virtio_host_to_guest_u16(&ndev->vdev, hdr->gso_size);
-	hdr->csum_start		= virtio_host_to_guest_u16(&ndev->vdev, hdr->csum_start);
-	hdr->csum_offset	= virtio_host_to_guest_u16(&ndev->vdev, hdr->csum_offset);
-}
-
 static void *virtio_net_rx_thread(void *p)
 {
 	struct iovec iov[VIRTIO_NET_QUEUE_SIZE];
@@ -149,7 +133,6 @@ static void *virtio_net_rx_thread(void *p)
 				head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
 			}
 
-			virtio_net_fix_rx_hdr(&hdr->hdr, ndev);
 			if (has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF))
 				hdr->num_buffers = virtio_host_to_guest_u16(vq, num_buffers);
 
@@ -189,10 +172,7 @@ static void *virtio_net_tx_thread(void *p)
 		mutex_unlock(&queue->lock);
 
 		while (virt_queue__available(vq)) {
-			struct virtio_net_hdr *hdr;
 			head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
-			hdr = iov[0].iov_base;
-			virtio_net_fix_tx_hdr(hdr, ndev);
 			len = ndev->ops->tx(iov, out, ndev);
 			if (len < 0) {
 				pr_warning("%s: tx on vq %u failed (%d)\n",
@@ -565,6 +545,25 @@ static void virtio_net_update_endian(struct net_dev *ndev)
 						VIRTIO_NET_S_LINK_UP);
 	conf->max_virtqueue_pairs = virtio_host_to_guest_u16(&ndev->vdev,
 							     ndev->queue_pairs);
+
+	/* Let TAP know about vnet header endianness */
+	if (ndev->mode == NET_MODE_TAP &&
+	    ndev->vdev.endian != VIRTIO_ENDIAN_HOST) {
+		int enable_val = 1, disable_val = 0;
+		int enable_req, disable_req;
+
+		if (ndev->vdev.endian == VIRTIO_ENDIAN_LE) {
+			enable_req = TUNSETVNETLE;
+			disable_req = TUNSETVNETBE;
+		} else {
+			enable_req = TUNSETVNETBE;
+			disable_req = TUNSETVNETLE;
+		}
+
+		ioctl(ndev->tap_fd, disable_req, &disable_val);
+		if (ioctl(ndev->tap_fd, enable_req, &enable_val) < 0)
+			pr_err("Config tap device TUNSETVNETLE/BE error");
+	}
 }
 
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 12/24] virtio/net: Prepare for modern virtio
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (10 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 11/24] virtio/net: Offload vnet header endianness conversion to tap Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 13/24] virtio/net: Implement VIRTIO_F_ANY_LAYOUT feature Jean-Philippe Brucker
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

The virtio_net header contains a 'num_buffers' field, used when the
VIRTIO_NET_F_MRG_RXBUF feature is negotiated. The legacy driver does not
present this field when the feature is not negotiated. In that case the
header is 2 bytes smaller.

When using the modern virtio transport, the header always contains the
field and in addition the device MUST set it to 1 when the
VIRTIO_NET_F_MRG_RXBUF is not negotiated. Prepare for modern virtio
support by enabling this case once the 'legacy' flag is switched off.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio.h |  1 +
 virtio/core.c        |  2 ++
 virtio/net.c         | 25 ++++++++++++++++++-------
 3 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 2da5e4f6..8c05bae2 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -198,6 +198,7 @@ enum virtio_trans {
 };
 
 struct virtio_device {
+	bool			legacy;
 	bool			use_vhost;
 	void			*virtio;
 	struct virtio_ops	*ops;
diff --git a/virtio/core.c b/virtio/core.c
index 568667f2..09abbf40 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -330,6 +330,7 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 
 	switch (trans) {
 	case VIRTIO_PCI:
+		vdev->legacy			= true;
 		virtio = calloc(sizeof(struct virtio_pci), 1);
 		if (!virtio)
 			return -ENOMEM;
@@ -343,6 +344,7 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		r = vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class);
 		break;
 	case VIRTIO_MMIO:
+		vdev->legacy			= true;
 		virtio = calloc(sizeof(struct virtio_mmio), 1);
 		if (!virtio)
 			return -ENOMEM;
diff --git a/virtio/net.c b/virtio/net.c
index 70002f72..985642f6 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -81,6 +81,15 @@ static bool has_virtio_feature(struct net_dev *ndev, u32 feature)
 	return ndev->vdev.features & (1 << feature);
 }
 
+static int virtio_net_hdr_len(struct net_dev *ndev)
+{
+	if (has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF) ||
+	    !ndev->vdev.legacy)
+		return sizeof(struct virtio_net_hdr_mrg_rxbuf);
+
+	return sizeof(struct virtio_net_hdr);
+}
+
 static void *virtio_net_rx_thread(void *p)
 {
 	struct iovec iov[VIRTIO_NET_QUEUE_SIZE];
@@ -133,7 +142,13 @@ static void *virtio_net_rx_thread(void *p)
 				head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
 			}
 
-			if (has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF))
+			/*
+			 * The device MUST set num_buffers, except in the case
+			 * where the legacy driver did not negotiate
+			 * VIRTIO_NET_F_MRG_RXBUF and the field does not exist.
+			 */
+			if (has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF) ||
+			    !ndev->vdev.legacy)
 				hdr->num_buffers = virtio_host_to_guest_u16(vq, num_buffers);
 
 			virt_queue__used_idx_advance(vq, num_buffers);
@@ -301,9 +316,7 @@ static bool virtio_net__tap_init(struct net_dev *ndev)
 	const struct virtio_net_params *params = ndev->params;
 	bool skipconf = !!params->tapif;
 
-	hdr_len = has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF) ?
-			sizeof(struct virtio_net_hdr_mrg_rxbuf) :
-			sizeof(struct virtio_net_hdr);
+	hdr_len = virtio_net_hdr_len(ndev);
 	if (ioctl(ndev->tap_fd, TUNSETVNETHDRSZ, &hdr_len) < 0)
 		pr_warning("Config tap device TUNSETVNETHDRSZ error");
 
@@ -521,9 +534,7 @@ static void virtio_net_start(struct net_dev *ndev)
 				virtio_net__vhost_set_features(ndev) != 0)
 			die_perror("VHOST_SET_FEATURES failed");
 	} else {
-		ndev->info.vnet_hdr_len = has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF) ?
-						sizeof(struct virtio_net_hdr_mrg_rxbuf) :
-						sizeof(struct virtio_net_hdr);
+		ndev->info.vnet_hdr_len = virtio_net_hdr_len(ndev);
 		uip_init(&ndev->info);
 	}
 }
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 13/24] virtio/net: Implement VIRTIO_F_ANY_LAYOUT feature
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (11 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 12/24] virtio/net: Prepare for modern virtio Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 14/24] virtio/console: Add " Jean-Philippe Brucker
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Modern virtio demands that devices do not make assumptions about the
buffer layouts. Currently the user network backend assumes that TX
packets are neatly split between virtio-net header and ethernet frame.
Modern virtio-net usually puts everything into one descriptor, but could
also split the buffer arbitrarily. Handle arbitrary buffer layouts and
advertise the VIRTIO_F_ANY_LAYOUT feature.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 net/uip/core.c | 71 ++++++++++++++++++++++++++++++++------------------
 virtio/net.c   | 21 ++++++++-------
 2 files changed, 57 insertions(+), 35 deletions(-)

diff --git a/net/uip/core.c b/net/uip/core.c
index 977b9b0c..e409512e 100644
--- a/net/uip/core.c
+++ b/net/uip/core.c
@@ -9,40 +9,56 @@
 int uip_tx(struct iovec *iov, u16 out, struct uip_info *info)
 {
 	void *vnet;
+	ssize_t len;
 	struct uip_tx_arg arg;
-	int eth_len, vnet_len;
+	size_t eth_len, vnet_len;
 	struct uip_eth *eth;
-	u8 *buf = NULL;
+	void *vnet_buf = NULL;
+	void *eth_buf = NULL;
+	size_t iovcount = out;
+
 	u16 proto;
-	int i;
 
 	/*
 	 * Buffer from guest to device
 	 */
-	vnet_len = iov[0].iov_len;
+	vnet_len = info->vnet_hdr_len;
 	vnet	 = iov[0].iov_base;
 
-	eth_len	 = iov[1].iov_len;
-	eth	 = iov[1].iov_base;
+	len = iov_size(iov, iovcount);
+	if (len <= (ssize_t)vnet_len)
+		return -EINVAL;
+
+	/* Try to avoid memcpy if possible */
+	if (iov[0].iov_len == vnet_len && out == 2) {
+		/* Legacy layout: first descriptor for vnet header */
+		eth	= iov[1].iov_base;
+		eth_len	= iov[1].iov_len;
+
+	} else if (out == 1) {
+		/* Single descriptor */
+		eth	= (void *)vnet + vnet_len;
+		eth_len	= iov[0].iov_len - vnet_len;
+
+	} else {
+		/* Any layout */
+		len = vnet_len;
+		vnet = vnet_buf = malloc(len);
+		if (!vnet)
+			return -ENOMEM;
 
-	/*
-	 * In case, ethernet frame is in more than one iov entry.
-	 * Copy iov buffer into one linear buffer.
-	 */
-	if (out > 2) {
-		eth_len = 0;
-		for (i = 1; i < out; i++)
-			eth_len += iov[i].iov_len;
+		len = memcpy_fromiovec_safe(vnet_buf, &iov, len, &iovcount);
+		if (len)
+			goto out_free_buf;
 
-		buf = malloc(eth_len);
-		if (!buf)
-			return -ENOMEM;
+		len = eth_len = iov_size(iov, iovcount);
+		eth = eth_buf = malloc(len);
+		if (!eth)
+			goto out_free_buf;
 
-		eth = (struct uip_eth *)buf;
-		for (i = 1; i < out; i++) {
-			memcpy(buf, iov[i].iov_base, iov[i].iov_len);
-			buf += iov[i].iov_len;
-		}
+		len = memcpy_fromiovec_safe(eth_buf, &iov, len, &iovcount);
+		if (len)
+			goto out_free_buf;
 	}
 
 	memset(&arg, 0, sizeof(arg));
@@ -65,14 +81,17 @@ int uip_tx(struct iovec *iov, u16 out, struct uip_info *info)
 	case UIP_ETH_P_IP:
 		uip_tx_do_ipv4(&arg);
 		break;
-	default:
-		break;
 	}
 
-	if (out > 2 && buf)
-		free(eth);
+	free(vnet_buf);
+	free(eth_buf);
 
 	return vnet_len + eth_len;
+
+out_free_buf:
+	free(vnet_buf);
+	free(eth_buf);
+	return -EINVAL;
 }
 
 int uip_rx(struct iovec *iov, u16 in, struct uip_info *info)
diff --git a/virtio/net.c b/virtio/net.c
index 985642f6..f8c40d40 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -221,8 +221,9 @@ static void *virtio_net_ctrl_thread(void *p)
 	struct net_dev *ndev = queue->ndev;
 	u16 out, in, head;
 	struct kvm *kvm = ndev->kvm;
-	struct virtio_net_ctrl_hdr *ctrl;
-	virtio_net_ctrl_ack *ack;
+	struct virtio_net_ctrl_hdr ctrl;
+	virtio_net_ctrl_ack ack;
+	size_t len;
 
 	kvm__set_thread_name("virtio-net-ctrl");
 
@@ -234,18 +235,19 @@ static void *virtio_net_ctrl_thread(void *p)
 
 		while (virt_queue__available(vq)) {
 			head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
-			ctrl = iov[0].iov_base;
-			ack = iov[out].iov_base;
+			len = min(iov_size(iov, in), sizeof(ctrl));
+			memcpy_fromiovec((void *)&ctrl, iov, len);
 
-			switch (ctrl->class) {
+			switch (ctrl.class) {
 			case VIRTIO_NET_CTRL_MQ:
-				*ack = virtio_net_handle_mq(kvm, ndev, ctrl);
+				ack = virtio_net_handle_mq(kvm, ndev, &ctrl);
 				break;
 			default:
-				*ack = VIRTIO_NET_ERR;
+				ack = VIRTIO_NET_ERR;
 				break;
 			}
-			virt_queue__set_used_elem(vq, head, iov[out].iov_len);
+			memcpy_toiovec(iov + in, &ack, sizeof(ack));
+			virt_queue__set_used_elem(vq, head, sizeof(ack));
 		}
 
 		if (virtio_queue__should_signal(vq))
@@ -495,7 +497,8 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 		| 1UL << VIRTIO_RING_F_INDIRECT_DESC
 		| 1UL << VIRTIO_NET_F_CTRL_VQ
 		| 1UL << VIRTIO_NET_F_MRG_RXBUF
-		| 1UL << (ndev->queue_pairs > 1 ? VIRTIO_NET_F_MQ : 0);
+		| 1UL << (ndev->queue_pairs > 1 ? VIRTIO_NET_F_MQ : 0)
+		| 1UL << VIRTIO_F_ANY_LAYOUT;
 
 	/*
 	 * The UFO feature for host and guest only can be enabled when the
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 14/24] virtio/console: Add VIRTIO_F_ANY_LAYOUT feature
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (12 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 13/24] virtio/net: Implement VIRTIO_F_ANY_LAYOUT feature Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 15/24] virtio/blk: Implement " Jean-Philippe Brucker
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Our virtio-console implementation already supports ANY_LAYOUT, because
buffers are accessed with scatter-gather operations. Advertise the
VIRTIO_F_ANY_LAYOUT feature.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/console.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virtio/console.c b/virtio/console.c
index 610995de..c42c8b9f 100644
--- a/virtio/console.c
+++ b/virtio/console.c
@@ -122,7 +122,7 @@ static size_t get_config_size(struct kvm *kvm, void *dev)
 
 static u32 get_host_features(struct kvm *kvm, void *dev)
 {
-	return 0;
+	return 1 << VIRTIO_F_ANY_LAYOUT;
 }
 
 static void notify_status(struct kvm *kvm, void *dev, u32 status)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 15/24] virtio/blk: Implement VIRTIO_F_ANY_LAYOUT feature
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (13 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 14/24] virtio/console: Add " Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 16/24] virtio/pci: Factor MSI route creation Jean-Philippe Brucker
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

The current virtio-block implementation assumes that buffers have a
specific layout (5.2.6.4 "Legacy Interface: Framing Requirements").
Modern virtio removes this layout constraint, so we have to be careful
when reading buffers. Note that since the Linux driver uses the same
layout as the legacy transport, arbitrary layouts were not actually
tested.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/disk-image.h |  3 +-
 disk/core.c              | 26 ++++++++++++++----
 virtio/blk.c             | 59 ++++++++++++++++++++++++++--------------
 3 files changed, 60 insertions(+), 28 deletions(-)

diff --git a/include/kvm/disk-image.h b/include/kvm/disk-image.h
index 27d4f7da..b2123838 100644
--- a/include/kvm/disk-image.h
+++ b/include/kvm/disk-image.h
@@ -88,7 +88,8 @@ ssize_t disk_image__read(struct disk_image *disk, u64 sector, const struct iovec
 				int iovcount, void *param);
 ssize_t disk_image__write(struct disk_image *disk, u64 sector, const struct iovec *iov,
 				int iovcount, void *param);
-ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *len);
+ssize_t disk_image__get_serial(struct disk_image *disk, struct iovec *iov,
+			       int iovcount, ssize_t len);
 
 struct disk_image *raw_image__probe(int fd, struct stat *st, bool readonly);
 struct disk_image *blkdev__probe(const char *filename, int flags, struct stat *st);
diff --git a/disk/core.c b/disk/core.c
index d8d04cb0..f69095d9 100644
--- a/disk/core.c
+++ b/disk/core.c
@@ -2,6 +2,7 @@
 #include "kvm/qcow.h"
 #include "kvm/virtio-blk.h"
 #include "kvm/kvm.h"
+#include "kvm/iovec.h"
 
 #include <linux/err.h>
 #include <poll.h>
@@ -304,20 +305,33 @@ ssize_t disk_image__write(struct disk_image *disk, u64 sector,
 	return total;
 }
 
-ssize_t disk_image__get_serial(struct disk_image *disk, void *buffer, ssize_t *len)
+ssize_t disk_image__get_serial(struct disk_image *disk, struct iovec *iov,
+			       int iovcount, ssize_t len)
 {
 	struct stat st;
+	void *buf;
 	int r;
 
 	r = fstat(disk->fd, &st);
 	if (r)
 		return r;
 
-	*len = snprintf(buffer, *len, "%llu%llu%llu",
-			(unsigned long long)st.st_dev,
-			(unsigned long long)st.st_rdev,
-			(unsigned long long)st.st_ino);
-	return *len;
+	buf = malloc(len);
+	if (!buf)
+		return -ENOMEM;
+
+	len = snprintf(buf, len, "%llu%llu%llu",
+		       (unsigned long long)st.st_dev,
+		       (unsigned long long)st.st_rdev,
+		       (unsigned long long)st.st_ino);
+	if (len < 0 || (size_t)len > iov_size(iov, iovcount)) {
+		free(buf);
+		return -ENOMEM;
+	}
+
+	memcpy_toiovec(iov, buf, len);
+	free(buf);
+	return len;
 }
 
 void disk_image__set_callback(struct disk_image *disk,
diff --git a/virtio/blk.c b/virtio/blk.c
index b56d45bd..54035af4 100644
--- a/virtio/blk.c
+++ b/virtio/blk.c
@@ -2,6 +2,7 @@
 
 #include "kvm/virtio-pci-dev.h"
 #include "kvm/disk-image.h"
+#include "kvm/iovec.h"
 #include "kvm/mutex.h"
 #include "kvm/util.h"
 #include "kvm/kvm.h"
@@ -33,6 +34,7 @@ struct blk_dev_req {
 	struct blk_dev			*bdev;
 	struct iovec			iov[VIRTIO_BLK_QUEUE_SIZE];
 	u16				out, in, head;
+	u8				*status;
 	struct kvm			*kvm;
 };
 
@@ -66,7 +68,7 @@ void virtio_blk_complete(void *param, long len)
 	u8 *status;
 
 	/* status */
-	status	= req->iov[req->out + req->in - 1].iov_base;
+	status = req->status;
 	*status	= (len < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
 
 	mutex_lock(&bdev->mutex);
@@ -79,46 +81,60 @@ void virtio_blk_complete(void *param, long len)
 
 static void virtio_blk_do_io_request(struct kvm *kvm, struct virt_queue *vq, struct blk_dev_req *req)
 {
-	struct virtio_blk_outhdr *req_hdr;
-	ssize_t block_cnt;
+	struct virtio_blk_outhdr req_hdr;
+	size_t iovcount, last_iov;
 	struct blk_dev *bdev;
 	struct iovec *iov;
-	u16 out, in;
+	ssize_t len;
 	u32 type;
 	u64 sector;
 
-	block_cnt	= -1;
 	bdev		= req->bdev;
 	iov		= req->iov;
-	out		= req->out;
-	in		= req->in;
-	req_hdr		= iov[0].iov_base;
 
-	type = virtio_guest_to_host_u32(vq, req_hdr->type);
-	sector = virtio_guest_to_host_u64(vq, req_hdr->sector);
+	iovcount = req->out;
+	len = memcpy_fromiovec_safe(&req_hdr, &iov, sizeof(req_hdr), &iovcount);
+	if (len) {
+		pr_warning("Failed to get header");
+		return;
+	}
+
+	type = virtio_guest_to_host_u32(vq, req_hdr.type);
+	sector = virtio_guest_to_host_u64(vq, req_hdr.sector);
+
+	iovcount += req->in;
+	if (!iov_size(iov, iovcount)) {
+		pr_warning("Invalid IOV");
+		return;
+	}
+
+	/* Extract status byte from iovec */
+	last_iov = iovcount - 1;
+	while (!iov[last_iov].iov_len)
+		last_iov--;
+	iov[last_iov].iov_len--;
+	req->status = iov[last_iov].iov_base + iov[last_iov].iov_len;
+	if (!iov[last_iov].iov_len)
+		iovcount--;
 
 	switch (type) {
 	case VIRTIO_BLK_T_IN:
-		block_cnt = disk_image__read(bdev->disk, sector,
-				iov + 1, in + out - 2, req);
+		disk_image__read(bdev->disk, sector, iov, iovcount, req);
 		break;
 	case VIRTIO_BLK_T_OUT:
-		block_cnt = disk_image__write(bdev->disk, sector,
-				iov + 1, in + out - 2, req);
+		disk_image__write(bdev->disk, sector, iov, iovcount, req);
 		break;
 	case VIRTIO_BLK_T_FLUSH:
-		block_cnt = disk_image__flush(bdev->disk);
-		virtio_blk_complete(req, block_cnt);
+		len = disk_image__flush(bdev->disk);
+		virtio_blk_complete(req, len);
 		break;
 	case VIRTIO_BLK_T_GET_ID:
-		block_cnt = VIRTIO_BLK_ID_BYTES;
-		disk_image__get_serial(bdev->disk,
-				(iov + 1)->iov_base, &block_cnt);
-		virtio_blk_complete(req, block_cnt);
+		len = disk_image__get_serial(bdev->disk, iov, iovcount,
+					     VIRTIO_BLK_ID_BYTES);
+		virtio_blk_complete(req, len);
 		break;
 	default:
 		pr_warning("request type %d", type);
-		block_cnt	= -1;
 		break;
 	}
 }
@@ -161,6 +177,7 @@ static u32 get_host_features(struct kvm *kvm, void *dev)
 		| 1UL << VIRTIO_BLK_F_FLUSH
 		| 1UL << VIRTIO_RING_F_EVENT_IDX
 		| 1UL << VIRTIO_RING_F_INDIRECT_DESC
+		| 1UL << VIRTIO_F_ANY_LAYOUT
 		| (bdev->disk->readonly ? 1UL << VIRTIO_BLK_F_RO : 0);
 }
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 16/24] virtio/pci: Factor MSI route creation
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (14 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 15/24] virtio/blk: Implement " Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 17/24] virtio/pci: Delete MSI routes Jean-Philippe Brucker
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

The code for creating an MSI route is already duplicated between config
and virtqueue MSI. Modern virtio will need it as well, so move it to a
separate function.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/pci.c | 60 +++++++++++++++++++++++-----------------------------
 1 file changed, 27 insertions(+), 33 deletions(-)

diff --git a/virtio/pci.c b/virtio/pci.c
index 85018e79..1a549314 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -37,6 +37,29 @@ static u32 virtio_pci__msix_io_addr(struct virtio_pci *vpci)
 	return pci__bar_address(&vpci->pci_hdr, 2);
 }
 
+static int virtio_pci__add_msix_route(struct virtio_pci *vpci, u32 vec)
+{
+	int gsi;
+	struct msi_msg *msg;
+
+	if (vec == VIRTIO_MSI_NO_VECTOR)
+		return -EINVAL;
+
+	msg = &vpci->msix_table[vec].msg;
+	gsi = irq__add_msix_route(vpci->kvm, msg, vpci->dev_hdr.dev_num << 3);
+	/*
+	 * We don't need IRQ routing if we can use
+	 * MSI injection via the KVM_SIGNAL_MSI ioctl.
+	 */
+	if (gsi == -ENXIO && vpci->features & VIRTIO_PCI_F_SIGNAL_MSI)
+		return gsi;
+
+	if (gsi < 0)
+		die("failed to configure MSIs");
+
+	return gsi;
+}
+
 static void virtio_pci__ioevent_callback(struct kvm *kvm, void *param)
 {
 	struct virtio_pci_ioevent_param *ioeventfd = param;
@@ -219,24 +242,10 @@ static bool virtio_pci__specific_data_out(struct kvm *kvm, struct virtio_device
 		switch (offset) {
 		case VIRTIO_MSI_CONFIG_VECTOR:
 			vec = vpci->config_vector = ioport__read16(data);
-			if (vec == VIRTIO_MSI_NO_VECTOR)
-				break;
-
-			gsi = irq__add_msix_route(kvm,
-						  &vpci->msix_table[vec].msg,
-						  vpci->dev_hdr.dev_num << 3);
-			/*
-			 * We don't need IRQ routing if we can use
-			 * MSI injection via the KVM_SIGNAL_MSI ioctl.
-			 */
-			if (gsi == -ENXIO &&
-			    vpci->features & VIRTIO_PCI_F_SIGNAL_MSI)
-				break;
 
-			if (gsi < 0) {
-				die("failed to configure MSIs");
+			gsi = virtio_pci__add_msix_route(vpci, vec);
+			if (gsi < 0)
 				break;
-			}
 
 			vpci->config_gsi = gsi;
 			break;
@@ -244,24 +253,9 @@ static bool virtio_pci__specific_data_out(struct kvm *kvm, struct virtio_device
 			vec = ioport__read16(data);
 			vpci->vq_vector[vpci->queue_selector] = vec;
 
-			if (vec == VIRTIO_MSI_NO_VECTOR)
-				break;
-
-			gsi = irq__add_msix_route(kvm,
-						  &vpci->msix_table[vec].msg,
-						  vpci->dev_hdr.dev_num << 3);
-			/*
-			 * We don't need IRQ routing if we can use
-			 * MSI injection via the KVM_SIGNAL_MSI ioctl.
-			 */
-			if (gsi == -ENXIO &&
-			    vpci->features & VIRTIO_PCI_F_SIGNAL_MSI)
-				break;
-
-			if (gsi < 0) {
-				die("failed to configure MSIs");
+			gsi = virtio_pci__add_msix_route(vpci, vec);
+			if (gsi < 0)
 				break;
-			}
 
 			vpci->gsis[vpci->queue_selector] = gsi;
 			if (vdev->ops->notify_vq_gsi)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 17/24] virtio/pci: Delete MSI routes
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (15 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 16/24] virtio/pci: Factor MSI route creation Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-09 12:40   ` Will Deacon
  2022-06-07 17:02 ` [PATCH kvmtool 18/24] virtio: Extract init_vq() for PCI and MMIO Jean-Philippe Brucker
                   ` (8 subsequent siblings)
  25 siblings, 1 reply; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

On exit_vq() and device reset, remove the MSI routes that were set up at
runtime.

TODO: make irq__add_msix_route reuse those deleted routes. Currently, new
ones will be created after reset.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/pci.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/virtio/pci.c b/virtio/pci.c
index 1a549314..9b710852 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -60,6 +60,13 @@ static int virtio_pci__add_msix_route(struct virtio_pci *vpci, u32 vec)
 	return gsi;
 }
 
+static void virtio_pci__del_msix_route(struct virtio_pci *vpci, u32 gsi)
+{
+	struct msi_msg msg = { 0 };
+
+	irq__update_msix_route(vpci->kvm, gsi, &msg);
+}
+
 static void virtio_pci__ioevent_callback(struct kvm *kvm, void *param)
 {
 	struct virtio_pci_ioevent_param *ioeventfd = param;
@@ -128,6 +135,9 @@ static void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
 	u32 mmio_addr = virtio_pci__mmio_addr(vpci);
 	u16 port_addr = virtio_pci__port_addr(vpci);
 
+	virtio_pci__del_msix_route(vpci, vpci->gsis[vq]);
+	vpci->gsis[vq] = 0;
+	vpci->vq_vector[vq] = VIRTIO_MSI_NO_VECTOR;
 	ioeventfd__del_event(mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq);
 	ioeventfd__del_event(port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq);
 	virtio_exit_vq(kvm, vdev, vpci->dev, vq);
@@ -623,6 +633,10 @@ int virtio_pci__reset(struct kvm *kvm, struct virtio_device *vdev)
 	unsigned int vq;
 	struct virtio_pci *vpci = vdev->virtio;
 
+	virtio_pci__del_msix_route(vpci, vpci->config_gsi);
+	vpci->config_gsi = 0;
+	vpci->config_vector = VIRTIO_MSI_NO_VECTOR;
+
 	for (vq = 0; vq < vdev->ops->get_vq_count(kvm, vpci->dev); vq++)
 		virtio_pci_exit_vq(kvm, vdev, vq);
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 18/24] virtio: Extract init_vq() for PCI and MMIO
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (16 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 17/24] virtio/pci: Delete MSI routes Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 19/24] virtio/pci: Make doorbell offset dynamic Jean-Philippe Brucker
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Modern virtio will need to reuse this code when initializing a
virtqueue. It's not much, but still nicer to have next to exit_vq().

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 virtio/mmio.c | 19 +++++++++++++++----
 virtio/pci.c  | 19 +++++++++++++++----
 2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/virtio/mmio.c b/virtio/mmio.c
index 268a4391..2a96e0e3 100644
--- a/virtio/mmio.c
+++ b/virtio/mmio.c
@@ -79,6 +79,20 @@ int virtio_mmio_signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
 	return 0;
 }
 
+static int virtio_mmio_init_vq(struct kvm *kvm, struct virtio_device *vdev,
+			       int vq)
+{
+	int ret;
+	struct virtio_mmio *vmmio = vdev->virtio;
+
+	ret = virtio_mmio_init_ioeventfd(vmmio->kvm, vdev, vq);
+	if (ret) {
+		pr_err("couldn't add ioeventfd for vq %d: %d", vq, ret);
+		return ret;
+	}
+	return vdev->ops->init_vq(vmmio->kvm, vmmio->dev, vq);
+}
+
 static void virtio_mmio_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
 				int vq)
 {
@@ -200,10 +214,7 @@ static void virtio_mmio_config_out(struct kvm_cpu *vcpu,
 				.align	= vmmio->hdr.queue_align,
 				.pgsize	= vmmio->hdr.guest_page_size,
 			};
-			virtio_mmio_init_ioeventfd(vmmio->kvm, vdev,
-						   vmmio->hdr.queue_sel);
-			vdev->ops->init_vq(vmmio->kvm, vmmio->dev,
-					   vmmio->hdr.queue_sel);
+			virtio_mmio_init_vq(kvm, vdev, vmmio->hdr.queue_sel);
 		} else {
 			virtio_mmio_exit_vq(kvm, vdev, vmmio->hdr.queue_sel);
 		}
diff --git a/virtio/pci.c b/virtio/pci.c
index 9b710852..f0459925 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -128,6 +128,20 @@ free_ioport_evt:
 	return r;
 }
 
+static int virtio_pci_init_vq(struct kvm *kvm, struct virtio_device *vdev,
+			      int vq)
+{
+	int ret;
+	struct virtio_pci *vpci = vdev->virtio;
+
+	ret = virtio_pci__init_ioeventfd(kvm, vdev, vq);
+	if (ret) {
+		pr_err("couldn't add ioeventfd for vq %d: %d", vq, ret);
+		return ret;
+	}
+	return vdev->ops->init_vq(kvm, vpci->dev, vq);
+}
+
 static void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
 			       int vq)
 {
@@ -314,10 +328,7 @@ static bool virtio_pci__data_out(struct kvm_cpu *vcpu, struct virtio_device *vde
 				.align	= VIRTIO_PCI_VRING_ALIGN,
 				.pgsize	= 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT,
 			};
-			virtio_pci__init_ioeventfd(kvm, vdev,
-						   vpci->queue_selector);
-			vdev->ops->init_vq(kvm, vpci->dev,
-					   vpci->queue_selector);
+			virtio_pci_init_vq(kvm, vdev, vpci->queue_selector);
 		} else {
 			virtio_pci_exit_vq(kvm, vdev, vpci->queue_selector);
 		}
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 19/24] virtio/pci: Make doorbell offset dynamic
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (17 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 18/24] virtio: Extract init_vq() for PCI and MMIO Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 20/24] virtio: Move PCI transport to pci-legacy Jean-Philippe Brucker
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

The doorbell offset depends on the transport - virtio-legacy uses a
fixed offset, but modern virtio can have per-vq offsets. Add an offset
field to the virtio_pci structure.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio-pci.h |  1 +
 virtio/pci.c             | 14 +++++++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
index 959b4b81..d64e5c99 100644
--- a/include/kvm/virtio-pci.h
+++ b/include/kvm/virtio-pci.h
@@ -24,6 +24,7 @@ struct virtio_pci {
 	void			*dev;
 	struct kvm		*kvm;
 
+	u32			doorbell_offset;
 	u8			status;
 	u8			isr;
 	u32			features;
diff --git a/virtio/pci.c b/virtio/pci.c
index f0459925..c02534a6 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -81,6 +81,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde
 	struct virtio_pci *vpci = vdev->virtio;
 	u32 mmio_addr = virtio_pci__mmio_addr(vpci);
 	u16 port_addr = virtio_pci__port_addr(vpci);
+	off_t offset = vpci->doorbell_offset;
 	int r, flags = 0;
 	int fd;
 
@@ -104,7 +105,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde
 		flags |= IOEVENTFD_FLAG_USER_POLL;
 
 	/* ioport */
-	ioevent.io_addr	= port_addr + VIRTIO_PCI_QUEUE_NOTIFY;
+	ioevent.io_addr	= port_addr + offset;
 	ioevent.io_len	= sizeof(u16);
 	ioevent.fd	= fd = eventfd(0, 0);
 	r = ioeventfd__add_event(&ioevent, flags | IOEVENTFD_FLAG_PIO);
@@ -112,7 +113,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde
 		return r;
 
 	/* mmio */
-	ioevent.io_addr	= mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY;
+	ioevent.io_addr	= mmio_addr + offset;
 	ioevent.io_len	= sizeof(u16);
 	ioevent.fd	= eventfd(0, 0);
 	r = ioeventfd__add_event(&ioevent, flags);
@@ -124,7 +125,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde
 	return 0;
 
 free_ioport_evt:
-	ioeventfd__del_event(port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq);
+	ioeventfd__del_event(port_addr + offset, vq);
 	return r;
 }
 
@@ -148,12 +149,13 @@ static void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
 	struct virtio_pci *vpci = vdev->virtio;
 	u32 mmio_addr = virtio_pci__mmio_addr(vpci);
 	u16 port_addr = virtio_pci__port_addr(vpci);
+	off_t offset = vpci->doorbell_offset;
 
 	virtio_pci__del_msix_route(vpci, vpci->gsis[vq]);
 	vpci->gsis[vq] = 0;
 	vpci->vq_vector[vq] = VIRTIO_MSI_NO_VECTOR;
-	ioeventfd__del_event(mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq);
-	ioeventfd__del_event(port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq);
+	ioeventfd__del_event(mmio_addr + offset, vq);
+	ioeventfd__del_event(port_addr + offset, vq);
 	virtio_exit_vq(kvm, vdev, vpci->dev, vq);
 }
 
@@ -571,6 +573,8 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	mmio_addr = pci_get_mmio_block(PCI_IO_SIZE);
 	msix_io_block = pci_get_mmio_block(VIRTIO_MSIX_BAR_SIZE);
 
+	vpci->doorbell_offset = VIRTIO_PCI_QUEUE_NOTIFY;
+
 	vpci->pci_hdr = (struct pci_device_header) {
 		.vendor_id		= cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET),
 		.device_id		= cpu_to_le16(device_id),
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 20/24] virtio: Move PCI transport to pci-legacy
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (18 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 19/24] virtio/pci: Make doorbell offset dynamic Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 21/24] virtio: Add support for modern virtio-pci Jean-Philippe Brucker
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

To make space for the more recent virtio version, move the legacy bits of
virtio-pci to a different file.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 Makefile                 |   1 +
 include/kvm/virtio-pci.h |  39 +++++++
 virtio/pci-legacy.c      | 205 ++++++++++++++++++++++++++++++++
 virtio/pci.c             | 245 ++-------------------------------------
 4 files changed, 254 insertions(+), 236 deletions(-)
 create mode 100644 virtio/pci-legacy.c

diff --git a/Makefile b/Makefile
index 729f9276..5cb0544c 100644
--- a/Makefile
+++ b/Makefile
@@ -74,6 +74,7 @@ OBJS	+= virtio/rng.o
 OBJS    += virtio/balloon.o
 OBJS	+= virtio/pci.o
 OBJS	+= virtio/vsock.o
+OBJS	+= virtio/pci-legacy.o
 OBJS	+= disk/blk.o
 OBJS	+= disk/qcow.o
 OBJS	+= disk/raw.o
diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
index d64e5c99..8cce5528 100644
--- a/include/kvm/virtio-pci.h
+++ b/include/kvm/virtio-pci.h
@@ -4,12 +4,15 @@
 #include "kvm/devices.h"
 #include "kvm/pci.h"
 
+#include <stdbool.h>
+#include <linux/byteorder.h>
 #include <linux/types.h>
 
 #define VIRTIO_PCI_MAX_VQ	32
 #define VIRTIO_PCI_MAX_CONFIG	1
 
 struct kvm;
+struct kvm_cpu;
 
 struct virtio_pci_ioevent_param {
 	struct virtio_device	*vdev;
@@ -18,6 +21,13 @@ struct virtio_pci_ioevent_param {
 
 #define VIRTIO_PCI_F_SIGNAL_MSI (1 << 0)
 
+#define ALIGN_UP(x, s)		ALIGN((x) + (s) - 1, (s))
+#define VIRTIO_NR_MSIX		(VIRTIO_PCI_MAX_VQ + VIRTIO_PCI_MAX_CONFIG)
+#define VIRTIO_MSIX_TABLE_SIZE	(VIRTIO_NR_MSIX * 16)
+#define VIRTIO_MSIX_PBA_SIZE	(ALIGN_UP(VIRTIO_MSIX_TABLE_SIZE, 64) / 8)
+#define VIRTIO_MSIX_BAR_SIZE	(1UL << fls_long(VIRTIO_MSIX_TABLE_SIZE + \
+						 VIRTIO_MSIX_PBA_SIZE))
+
 struct virtio_pci {
 	struct pci_device_header pci_hdr;
 	struct device_header	dev_hdr;
@@ -57,4 +67,33 @@ int virtio_pci__reset(struct kvm *kvm, struct virtio_device *vdev);
 int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		     int device_id, int subsys_id, int class);
 
+static inline bool virtio_pci__msix_enabled(struct virtio_pci *vpci)
+{
+	return vpci->pci_hdr.msix.ctrl & cpu_to_le16(PCI_MSIX_FLAGS_ENABLE);
+}
+
+static inline u16 virtio_pci__port_addr(struct virtio_pci *vpci)
+{
+	return pci__bar_address(&vpci->pci_hdr, 0);
+}
+
+static inline u32 virtio_pci__mmio_addr(struct virtio_pci *vpci)
+{
+	return pci__bar_address(&vpci->pci_hdr, 1);
+}
+
+static inline u32 virtio_pci__msix_io_addr(struct virtio_pci *vpci)
+{
+	return pci__bar_address(&vpci->pci_hdr, 2);
+}
+
+int virtio_pci__add_msix_route(struct virtio_pci *vpci, u32 vec);
+int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vdev,
+			       u32 vq);
+int virtio_pci_init_vq(struct kvm *kvm, struct virtio_device *vdev, int vq);
+void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq);
+
+void virtio_pci_legacy__io_mmio_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+				  u32 len, u8 is_write, void *ptr);
+
 #endif
diff --git a/virtio/pci-legacy.c b/virtio/pci-legacy.c
new file mode 100644
index 00000000..58047967
--- /dev/null
+++ b/virtio/pci-legacy.c
@@ -0,0 +1,205 @@
+#include "kvm/virtio-pci.h"
+
+#include "kvm/ioport.h"
+#include "kvm/virtio.h"
+
+static bool virtio_pci__specific_data_in(struct kvm *kvm, struct virtio_device *vdev,
+					 void *data, u32 size, unsigned long offset)
+{
+	u32 config_offset;
+	struct virtio_pci *vpci = vdev->virtio;
+	int type = virtio__get_dev_specific_field(offset - 20,
+							virtio_pci__msix_enabled(vpci),
+							&config_offset);
+	if (type == VIRTIO_PCI_O_MSIX) {
+		switch (offset) {
+		case VIRTIO_MSI_CONFIG_VECTOR:
+			ioport__write16(data, vpci->config_vector);
+			break;
+		case VIRTIO_MSI_QUEUE_VECTOR:
+			ioport__write16(data, vpci->vq_vector[vpci->queue_selector]);
+			break;
+		};
+
+		return true;
+	} else if (type == VIRTIO_PCI_O_CONFIG) {
+		return virtio_access_config(kvm, vdev, vpci->dev, config_offset,
+					    data, size, false);
+	}
+
+	return false;
+}
+
+static bool virtio_pci__data_in(struct kvm_cpu *vcpu, struct virtio_device *vdev,
+				unsigned long offset, void *data, u32 size)
+{
+	bool ret = true;
+	struct virtio_pci *vpci;
+	struct virt_queue *vq;
+	struct kvm *kvm;
+	u32 val;
+
+	kvm = vcpu->kvm;
+	vpci = vdev->virtio;
+
+	switch (offset) {
+	case VIRTIO_PCI_HOST_FEATURES:
+		val = vdev->ops->get_host_features(kvm, vpci->dev);
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_QUEUE_PFN:
+		vq = vdev->ops->get_vq(kvm, vpci->dev, vpci->queue_selector);
+		ioport__write32(data, vq->vring_addr.pfn);
+		break;
+	case VIRTIO_PCI_QUEUE_NUM:
+		val = vdev->ops->get_size_vq(kvm, vpci->dev, vpci->queue_selector);
+		ioport__write16(data, val);
+		break;
+	case VIRTIO_PCI_STATUS:
+		ioport__write8(data, vpci->status);
+		break;
+	case VIRTIO_PCI_ISR:
+		ioport__write8(data, vpci->isr);
+		kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
+		vpci->isr = VIRTIO_IRQ_LOW;
+		break;
+	default:
+		ret = virtio_pci__specific_data_in(kvm, vdev, data, size, offset);
+		break;
+	};
+
+	return ret;
+}
+
+static bool virtio_pci__specific_data_out(struct kvm *kvm, struct virtio_device *vdev,
+					  void *data, u32 size, unsigned long offset)
+{
+	struct virtio_pci *vpci = vdev->virtio;
+	u32 config_offset, vec;
+	int gsi;
+	int type = virtio__get_dev_specific_field(offset - 20, virtio_pci__msix_enabled(vpci),
+							&config_offset);
+	if (type == VIRTIO_PCI_O_MSIX) {
+		switch (offset) {
+		case VIRTIO_MSI_CONFIG_VECTOR:
+			vec = vpci->config_vector = ioport__read16(data);
+
+			gsi = virtio_pci__add_msix_route(vpci, vec);
+			if (gsi < 0)
+				break;
+
+			vpci->config_gsi = gsi;
+			break;
+		case VIRTIO_MSI_QUEUE_VECTOR:
+			vec = ioport__read16(data);
+			vpci->vq_vector[vpci->queue_selector] = vec;
+
+			gsi = virtio_pci__add_msix_route(vpci, vec);
+			if (gsi < 0)
+				break;
+
+			vpci->gsis[vpci->queue_selector] = gsi;
+			if (vdev->ops->notify_vq_gsi)
+				vdev->ops->notify_vq_gsi(kvm, vpci->dev,
+							 vpci->queue_selector,
+							 gsi);
+			break;
+		};
+
+		return true;
+	} else if (type == VIRTIO_PCI_O_CONFIG) {
+		return virtio_access_config(kvm, vdev, vpci->dev, config_offset,
+					    data, size, true);
+	}
+
+	return false;
+}
+
+static bool virtio_pci__data_out(struct kvm_cpu *vcpu, struct virtio_device *vdev,
+				 unsigned long offset, void *data, u32 size)
+{
+	bool ret = true;
+	struct virtio_pci *vpci;
+	struct virt_queue *vq;
+	struct kvm *kvm;
+	u32 val;
+	unsigned int vq_count;
+
+	kvm = vcpu->kvm;
+	vpci = vdev->virtio;
+	vq_count = vdev->ops->get_vq_count(kvm, vpci->dev);
+
+	switch (offset) {
+	case VIRTIO_PCI_GUEST_FEATURES:
+		val = ioport__read32(data);
+		virtio_set_guest_features(kvm, vdev, vpci->dev, val);
+		break;
+	case VIRTIO_PCI_QUEUE_PFN:
+		val = ioport__read32(data);
+		if (val) {
+			vq = vdev->ops->get_vq(kvm, vpci->dev,
+					       vpci->queue_selector);
+			vq->vring_addr = (struct vring_addr) {
+				.legacy	= true,
+				.pfn	= val,
+				.align	= VIRTIO_PCI_VRING_ALIGN,
+				.pgsize	= 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT,
+			};
+			virtio_pci_init_vq(kvm, vdev, vpci->queue_selector);
+		} else {
+			virtio_pci_exit_vq(kvm, vdev, vpci->queue_selector);
+		}
+		break;
+	case VIRTIO_PCI_QUEUE_SEL:
+		val = ioport__read16(data);
+		if (val >= vq_count) {
+			WARN_ONCE(1, "QUEUE_SEL value (%u) is larger than VQ count (%u)\n",
+				val, vq_count);
+			return false;
+		}
+		vpci->queue_selector = val;
+		break;
+	case VIRTIO_PCI_QUEUE_NOTIFY:
+		val = ioport__read16(data);
+		if (val >= vq_count) {
+			WARN_ONCE(1, "QUEUE_SEL value (%u) is larger than VQ count (%u)\n",
+				val, vq_count);
+			return false;
+		}
+		vdev->ops->notify_vq(kvm, vpci->dev, val);
+		break;
+	case VIRTIO_PCI_STATUS:
+		vpci->status = ioport__read8(data);
+		if (!vpci->status) /* Sample endianness on reset */
+			vdev->endian = kvm_cpu__get_endianness(vcpu);
+		virtio_notify_status(kvm, vdev, vpci->dev, vpci->status);
+		break;
+	default:
+		ret = virtio_pci__specific_data_out(kvm, vdev, data, size, offset);
+		break;
+	};
+
+	return ret;
+}
+
+void virtio_pci_legacy__io_mmio_callback(struct kvm_cpu *vcpu, u64 addr,
+					 u8 *data, u32 len, u8 is_write,
+					 void *ptr)
+{
+	struct virtio_device *vdev = ptr;
+	struct virtio_pci *vpci = vdev->virtio;
+	u32 ioport_addr = virtio_pci__port_addr(vpci);
+	u32 base_addr;
+
+	if (addr >= ioport_addr &&
+	    addr < ioport_addr + pci__bar_size(&vpci->pci_hdr, 0))
+		base_addr = ioport_addr;
+	else
+		base_addr = virtio_pci__mmio_addr(vpci);
+
+	if (!is_write)
+		virtio_pci__data_in(vcpu, vdev, addr - base_addr, data, len);
+	else
+		virtio_pci__data_out(vcpu, vdev, addr - base_addr, data, len);
+}
+
diff --git a/virtio/pci.c b/virtio/pci.c
index c02534a6..2e070211 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -11,33 +11,10 @@
 
 #include <sys/ioctl.h>
 #include <linux/virtio_pci.h>
-#include <linux/byteorder.h>
 #include <assert.h>
 #include <string.h>
 
-#define ALIGN_UP(x, s)		ALIGN((x) + (s) - 1, (s))
-#define VIRTIO_NR_MSIX		(VIRTIO_PCI_MAX_VQ + VIRTIO_PCI_MAX_CONFIG)
-#define VIRTIO_MSIX_TABLE_SIZE	(VIRTIO_NR_MSIX * 16)
-#define VIRTIO_MSIX_PBA_SIZE	(ALIGN_UP(VIRTIO_MSIX_TABLE_SIZE, 64) / 8)
-#define VIRTIO_MSIX_BAR_SIZE	(1UL << fls_long(VIRTIO_MSIX_TABLE_SIZE + \
-						 VIRTIO_MSIX_PBA_SIZE))
-
-static u16 virtio_pci__port_addr(struct virtio_pci *vpci)
-{
-	return pci__bar_address(&vpci->pci_hdr, 0);
-}
-
-static u32 virtio_pci__mmio_addr(struct virtio_pci *vpci)
-{
-	return pci__bar_address(&vpci->pci_hdr, 1);
-}
-
-static u32 virtio_pci__msix_io_addr(struct virtio_pci *vpci)
-{
-	return pci__bar_address(&vpci->pci_hdr, 2);
-}
-
-static int virtio_pci__add_msix_route(struct virtio_pci *vpci, u32 vec)
+int virtio_pci__add_msix_route(struct virtio_pci *vpci, u32 vec)
 {
 	int gsi;
 	struct msi_msg *msg;
@@ -75,7 +52,8 @@ static void virtio_pci__ioevent_callback(struct kvm *kvm, void *param)
 	ioeventfd->vdev->ops->notify_vq(kvm, vpci->dev, ioeventfd->vq);
 }
 
-static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
+int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vdev,
+			       u32 vq)
 {
 	struct ioevent ioevent;
 	struct virtio_pci *vpci = vdev->virtio;
@@ -129,8 +107,7 @@ free_ioport_evt:
 	return r;
 }
 
-static int virtio_pci_init_vq(struct kvm *kvm, struct virtio_device *vdev,
-			      int vq)
+int virtio_pci_init_vq(struct kvm *kvm, struct virtio_device *vdev, int vq)
 {
 	int ret;
 	struct virtio_pci *vpci = vdev->virtio;
@@ -143,8 +120,7 @@ static int virtio_pci_init_vq(struct kvm *kvm, struct virtio_device *vdev,
 	return vdev->ops->init_vq(kvm, vpci->dev, vq);
 }
 
-static void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
-			       int vq)
+void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq)
 {
 	struct virtio_pci *vpci = vdev->virtio;
 	u32 mmio_addr = virtio_pci__mmio_addr(vpci);
@@ -159,79 +135,6 @@ static void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
 	virtio_exit_vq(kvm, vdev, vpci->dev, vq);
 }
 
-static inline bool virtio_pci__msix_enabled(struct virtio_pci *vpci)
-{
-	return vpci->pci_hdr.msix.ctrl & cpu_to_le16(PCI_MSIX_FLAGS_ENABLE);
-}
-
-static bool virtio_pci__specific_data_in(struct kvm *kvm, struct virtio_device *vdev,
-					 void *data, u32 size, unsigned long offset)
-{
-	u32 config_offset;
-	struct virtio_pci *vpci = vdev->virtio;
-	int type = virtio__get_dev_specific_field(offset - 20,
-							virtio_pci__msix_enabled(vpci),
-							&config_offset);
-	if (type == VIRTIO_PCI_O_MSIX) {
-		switch (offset) {
-		case VIRTIO_MSI_CONFIG_VECTOR:
-			ioport__write16(data, vpci->config_vector);
-			break;
-		case VIRTIO_MSI_QUEUE_VECTOR:
-			ioport__write16(data, vpci->vq_vector[vpci->queue_selector]);
-			break;
-		};
-
-		return true;
-	} else if (type == VIRTIO_PCI_O_CONFIG) {
-		return virtio_access_config(kvm, vdev, vpci->dev, config_offset,
-					    data, size, false);
-	}
-
-	return false;
-}
-
-static bool virtio_pci__data_in(struct kvm_cpu *vcpu, struct virtio_device *vdev,
-				unsigned long offset, void *data, u32 size)
-{
-	bool ret = true;
-	struct virtio_pci *vpci;
-	struct virt_queue *vq;
-	struct kvm *kvm;
-	u32 val;
-
-	kvm = vcpu->kvm;
-	vpci = vdev->virtio;
-
-	switch (offset) {
-	case VIRTIO_PCI_HOST_FEATURES:
-		val = vdev->ops->get_host_features(kvm, vpci->dev);
-		ioport__write32(data, val);
-		break;
-	case VIRTIO_PCI_QUEUE_PFN:
-		vq = vdev->ops->get_vq(kvm, vpci->dev, vpci->queue_selector);
-		ioport__write32(data, vq->vring_addr.pfn);
-		break;
-	case VIRTIO_PCI_QUEUE_NUM:
-		val = vdev->ops->get_size_vq(kvm, vpci->dev, vpci->queue_selector);
-		ioport__write16(data, val);
-		break;
-	case VIRTIO_PCI_STATUS:
-		ioport__write8(data, vpci->status);
-		break;
-	case VIRTIO_PCI_ISR:
-		ioport__write8(data, vpci->isr);
-		kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
-		vpci->isr = VIRTIO_IRQ_LOW;
-		break;
-	default:
-		ret = virtio_pci__specific_data_in(kvm, vdev, data, size, offset);
-		break;
-	};
-
-	return ret;
-}
-
 static void update_msix_map(struct virtio_pci *vpci,
 			    struct msix_table *msix_entry, u32 vecnum)
 {
@@ -256,117 +159,6 @@ static void update_msix_map(struct virtio_pci *vpci,
 	irq__update_msix_route(vpci->kvm, gsi, &msix_entry->msg);
 }
 
-static bool virtio_pci__specific_data_out(struct kvm *kvm, struct virtio_device *vdev,
-					  void *data, u32 size, unsigned long offset)
-{
-	struct virtio_pci *vpci = vdev->virtio;
-	u32 config_offset, vec;
-	int gsi;
-	int type = virtio__get_dev_specific_field(offset - 20, virtio_pci__msix_enabled(vpci),
-							&config_offset);
-	if (type == VIRTIO_PCI_O_MSIX) {
-		switch (offset) {
-		case VIRTIO_MSI_CONFIG_VECTOR:
-			vec = vpci->config_vector = ioport__read16(data);
-
-			gsi = virtio_pci__add_msix_route(vpci, vec);
-			if (gsi < 0)
-				break;
-
-			vpci->config_gsi = gsi;
-			break;
-		case VIRTIO_MSI_QUEUE_VECTOR:
-			vec = ioport__read16(data);
-			vpci->vq_vector[vpci->queue_selector] = vec;
-
-			gsi = virtio_pci__add_msix_route(vpci, vec);
-			if (gsi < 0)
-				break;
-
-			vpci->gsis[vpci->queue_selector] = gsi;
-			if (vdev->ops->notify_vq_gsi)
-				vdev->ops->notify_vq_gsi(kvm, vpci->dev,
-							 vpci->queue_selector,
-							 gsi);
-			break;
-		};
-
-		return true;
-	} else if (type == VIRTIO_PCI_O_CONFIG) {
-		return virtio_access_config(kvm, vdev, vpci->dev, config_offset,
-					    data, size, true);
-	}
-
-	return false;
-}
-
-static bool virtio_pci__data_out(struct kvm_cpu *vcpu, struct virtio_device *vdev,
-				 unsigned long offset, void *data, u32 size)
-{
-	bool ret = true;
-	struct virtio_pci *vpci;
-	struct virt_queue *vq;
-	struct kvm *kvm;
-	u32 val;
-	unsigned int vq_count;
-
-	kvm = vcpu->kvm;
-	vpci = vdev->virtio;
-	vq_count = vdev->ops->get_vq_count(kvm, vpci->dev);
-
-	switch (offset) {
-	case VIRTIO_PCI_GUEST_FEATURES:
-		val = ioport__read32(data);
-		virtio_set_guest_features(kvm, vdev, vpci->dev, val);
-		break;
-	case VIRTIO_PCI_QUEUE_PFN:
-		val = ioport__read32(data);
-		if (val) {
-			vq = vdev->ops->get_vq(kvm, vpci->dev,
-					       vpci->queue_selector);
-			vq->vring_addr = (struct vring_addr) {
-				.legacy	= true,
-				.pfn	= val,
-				.align	= VIRTIO_PCI_VRING_ALIGN,
-				.pgsize	= 1 << VIRTIO_PCI_QUEUE_ADDR_SHIFT,
-			};
-			virtio_pci_init_vq(kvm, vdev, vpci->queue_selector);
-		} else {
-			virtio_pci_exit_vq(kvm, vdev, vpci->queue_selector);
-		}
-		break;
-	case VIRTIO_PCI_QUEUE_SEL:
-		val = ioport__read16(data);
-		if (val >= vq_count) {
-			WARN_ONCE(1, "QUEUE_SEL value (%u) is larger than VQ count (%u)\n",
-				val, vq_count);
-			return false;
-		}
-		vpci->queue_selector = val;
-		break;
-	case VIRTIO_PCI_QUEUE_NOTIFY:
-		val = ioport__read16(data);
-		if (val >= vq_count) {
-			WARN_ONCE(1, "QUEUE_SEL value (%u) is larger than VQ count (%u)\n",
-				val, vq_count);
-			return false;
-		}
-		vdev->ops->notify_vq(kvm, vpci->dev, val);
-		break;
-	case VIRTIO_PCI_STATUS:
-		vpci->status = ioport__read8(data);
-		if (!vpci->status) /* Sample endianness on reset */
-			vdev->endian = kvm_cpu__get_endianness(vcpu);
-		virtio_notify_status(kvm, vdev, vpci->dev, vpci->status);
-		break;
-	default:
-		ret = virtio_pci__specific_data_out(kvm, vdev, data, size, offset);
-		break;
-	};
-
-	return ret;
-}
-
 static void virtio_pci__msix_mmio_callback(struct kvm_cpu *vcpu,
 					   u64 addr, u8 *data, u32 len,
 					   u8 is_write, void *ptr)
@@ -477,27 +269,6 @@ int virtio_pci__signal_config(struct kvm *kvm, struct virtio_device *vdev)
 	return 0;
 }
 
-static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu,
-					 u64 addr, u8 *data, u32 len,
-					 u8 is_write, void *ptr)
-{
-	struct virtio_device *vdev = ptr;
-	struct virtio_pci *vpci = vdev->virtio;
-	u32 ioport_addr = virtio_pci__port_addr(vpci);
-	u32 base_addr;
-
-	if (addr >= ioport_addr &&
-	    addr < ioport_addr + pci__bar_size(&vpci->pci_hdr, 0))
-		base_addr = ioport_addr;
-	else
-		base_addr = virtio_pci__mmio_addr(vpci);
-
-	if (!is_write)
-		virtio_pci__data_in(vcpu, vdev, addr - base_addr, data, len);
-	else
-		virtio_pci__data_out(vcpu, vdev, addr - base_addr, data, len);
-}
-
 static int virtio_pci__bar_activate(struct kvm *kvm,
 				    struct pci_device_header *pci_hdr,
 				    int bar_num, void *data)
@@ -514,11 +285,13 @@ static int virtio_pci__bar_activate(struct kvm *kvm,
 	switch (bar_num) {
 	case 0:
 		r = kvm__register_pio(kvm, bar_addr, bar_size,
-				      virtio_pci__io_mmio_callback, vdev);
+				      virtio_pci_legacy__io_mmio_callback,
+				      vdev);
 		break;
 	case 1:
 		r =  kvm__register_mmio(kvm, bar_addr, bar_size, false,
-					virtio_pci__io_mmio_callback, vdev);
+					virtio_pci_legacy__io_mmio_callback,
+					vdev);
 		break;
 	case 2:
 		r =  kvm__register_mmio(kvm, bar_addr, bar_size, false,
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 21/24] virtio: Add support for modern virtio-pci
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (19 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 20/24] virtio: Move PCI transport to pci-legacy Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 22/24] virtio: Move MMIO transport to mmio-legacy Jean-Philippe Brucker
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Add support for modern virtio-pci implementation (based on the 1.0 virtio
spec). We add a new transport, alongside MMIO and PCI-legacy. This is now
the default when selecting PCI, but users can still select the legacy
transport for all virtio devices by passing "--virtio-legacy" on the
command-line.

The main change in modern PCI is the way we address virtqueues, using
64-bit values instead of PFNs. To keep the queue configuration atomic the
device also gets a "queue enable" register. Configuration is also made
extensible by more feature bits and PCI capabilities. Scalability is
improved as well, as devices can have notification registers for each
virtqueue on separate pages. However this implementation keeps a single
notification register.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 Makefile                          |   1 +
 arm/include/arm-common/kvm-arch.h |   6 +-
 include/kvm/kvm-config.h          |   1 +
 include/kvm/kvm.h                 |   6 +
 include/kvm/pci.h                 |  11 +
 include/kvm/virtio-pci-dev.h      |   4 +
 include/kvm/virtio-pci.h          |   6 +
 include/kvm/virtio.h              |   1 +
 mips/include/kvm/kvm-arch.h       |   2 -
 powerpc/include/kvm/kvm-arch.h    |   2 -
 x86/include/kvm/kvm-arch.h        |   2 -
 builtin-run.c                     |   2 +
 virtio/core.c                     |   6 +-
 virtio/pci-modern.c               | 386 ++++++++++++++++++++++++++++++
 virtio/pci.c                      |  24 +-
 15 files changed, 441 insertions(+), 19 deletions(-)
 create mode 100644 virtio/pci-modern.c

diff --git a/Makefile b/Makefile
index 5cb0544c..cb76b26d 100644
--- a/Makefile
+++ b/Makefile
@@ -75,6 +75,7 @@ OBJS    += virtio/balloon.o
 OBJS	+= virtio/pci.o
 OBJS	+= virtio/vsock.o
 OBJS	+= virtio/pci-legacy.o
+OBJS	+= virtio/pci-modern.o
 OBJS	+= disk/blk.o
 OBJS	+= disk/qcow.o
 OBJS	+= disk/raw.o
diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index fc55360d..0a960f4f 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -81,8 +81,10 @@
 
 #define KVM_VM_TYPE		0
 
-#define VIRTIO_DEFAULT_TRANS(kvm)	\
-	((kvm)->cfg.arch.virtio_trans_pci ? VIRTIO_PCI : VIRTIO_MMIO)
+#define VIRTIO_DEFAULT_TRANS(kvm)					\
+	((kvm)->cfg.arch.virtio_trans_pci ?				\
+	 ((kvm)->cfg.virtio_legacy ? VIRTIO_PCI_LEGACY : VIRTIO_PCI) :	\
+	 VIRTIO_MMIO)
 
 #define VIRTIO_RING_ENDIAN	(VIRTIO_ENDIAN_LE | VIRTIO_ENDIAN_BE)
 
diff --git a/include/kvm/kvm-config.h b/include/kvm/kvm-config.h
index 6a5720c4..0e72d84c 100644
--- a/include/kvm/kvm-config.h
+++ b/include/kvm/kvm-config.h
@@ -62,6 +62,7 @@ struct kvm_config {
 	bool no_dhcp;
 	bool ioport_debug;
 	bool mmio_debug;
+	bool virtio_legacy;
 };
 
 #endif
diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index ad732e56..36a688d6 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -45,6 +45,12 @@ struct kvm_cpu;
 typedef void (*mmio_handler_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data,
 				u32 len, u8 is_write, void *ptr);
 
+/* Archs can override this in kvm-arch.h */
+#ifndef VIRTIO_DEFAULT_TRANS
+#define VIRTIO_DEFAULT_TRANS(kvm) \
+	((kvm)->cfg.virtio_legacy ? VIRTIO_PCI_LEGACY : VIRTIO_PCI)
+#endif
+
 enum {
 	KVM_VMSTATE_RUNNING,
 	KVM_VMSTATE_PAUSED,
diff --git a/include/kvm/pci.h b/include/kvm/pci.h
index d6eb3986..25113f60 100644
--- a/include/kvm/pci.h
+++ b/include/kvm/pci.h
@@ -4,6 +4,7 @@
 #include <linux/types.h>
 #include <linux/kvm.h>
 #include <linux/pci_regs.h>
+#include <linux/virtio_pci.h>
 #include <endian.h>
 #include <stdbool.h>
 
@@ -142,6 +143,14 @@ struct msi_cap_32 {
 	u32 pend_bits;
 };
 
+struct virtio_caps {
+	struct virtio_pci_cap		common;
+	struct virtio_pci_notify_cap	notify;
+	struct virtio_pci_cap		isr;
+	struct virtio_pci_cap		device;
+	struct virtio_pci_cfg_cap	pci;
+};
+
 struct pci_cap_hdr {
 	u8	type;
 	u8	next;
@@ -212,6 +221,7 @@ struct pci_device_header {
 			struct msix_cap msix;
 			/* Used only by architectures which support PCIE */
 			struct pci_exp_cap pci_exp;
+			struct virtio_caps virtio;
 		} __attribute__((packed));
 		/* Pad to PCI config space size */
 		u8	__pad[PCI_DEV_CFG_SIZE];
@@ -232,6 +242,7 @@ struct pci_device_header {
 };
 
 #define PCI_CAP(pci_hdr, pos) ((void *)(pci_hdr) + (pos))
+#define PCI_CAP_OFF(pci_hdr, cap) ((void *)&(pci_hdr)->cap - (void *)(pci_hdr))
 
 #define pci_for_each_cap(pos, cap, hdr)				\
 	for ((pos) = (hdr)->capabilities & ~3;			\
diff --git a/include/kvm/virtio-pci-dev.h b/include/kvm/virtio-pci-dev.h
index 7bf35cdb..9d59e677 100644
--- a/include/kvm/virtio-pci-dev.h
+++ b/include/kvm/virtio-pci-dev.h
@@ -19,6 +19,10 @@
 #define PCI_DEVICE_ID_VESA			0x2000
 #define PCI_DEVICE_ID_PCI_SHMEM			0x0001
 
+/* Modern virtio device IDs start at 1040 */
+#define PCI_DEVICE_ID_VIRTIO_BASE		0x1040
+#define PCI_SUBSYS_ID_VIRTIO_BASE		0x0040
+
 #define PCI_VENDOR_ID_REDHAT_QUMRANET		0x1af4
 #define PCI_VENDOR_ID_PCI_SHMEM			0x0001
 #define PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET	0x1af4
diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
index 8cce5528..47075334 100644
--- a/include/kvm/virtio-pci.h
+++ b/include/kvm/virtio-pci.h
@@ -3,6 +3,7 @@
 
 #include "kvm/devices.h"
 #include "kvm/pci.h"
+#include "kvm/virtio.h"
 
 #include <stdbool.h>
 #include <linux/byteorder.h>
@@ -37,6 +38,8 @@ struct virtio_pci {
 	u32			doorbell_offset;
 	u8			status;
 	u8			isr;
+	u32			device_features_sel;
+	u32			driver_features_sel;
 	u32			features;
 
 	/*
@@ -66,6 +69,7 @@ int virtio_pci__exit(struct kvm *kvm, struct virtio_device *vdev);
 int virtio_pci__reset(struct kvm *kvm, struct virtio_device *vdev);
 int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		     int device_id, int subsys_id, int class);
+int virtio_pci_modern_init(struct virtio_device *vdev);
 
 static inline bool virtio_pci__msix_enabled(struct virtio_pci *vpci)
 {
@@ -95,5 +99,7 @@ void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq);
 
 void virtio_pci_legacy__io_mmio_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
 				  u32 len, u8 is_write, void *ptr);
+void virtio_pci_modern__io_mmio_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+					 u32 len, u8 is_write, void *ptr);
 
 #endif
diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 8c05bae2..9d77666c 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -194,6 +194,7 @@ int virtio__get_dev_specific_field(int offset, bool msix, u32 *config_off);
 
 enum virtio_trans {
 	VIRTIO_PCI,
+	VIRTIO_PCI_LEGACY,
 	VIRTIO_MMIO,
 };
 
diff --git a/mips/include/kvm/kvm-arch.h b/mips/include/kvm/kvm-arch.h
index e2f048a0..f5b9044f 100644
--- a/mips/include/kvm/kvm-arch.h
+++ b/mips/include/kvm/kvm-arch.h
@@ -35,8 +35,6 @@
 
 #define KVM_IOEVENTFD_HAS_PIO	0
 
-#define VIRTIO_DEFAULT_TRANS(kvm)	VIRTIO_PCI
-
 #define MAX_PAGE_SIZE		SZ_64K
 
 #include <stdbool.h>
diff --git a/powerpc/include/kvm/kvm-arch.h b/powerpc/include/kvm/kvm-arch.h
index 8eeca507..ff42c132 100644
--- a/powerpc/include/kvm/kvm-arch.h
+++ b/powerpc/include/kvm/kvm-arch.h
@@ -51,8 +51,6 @@
 
 #define MAX_PAGE_SIZE			SZ_256K
 
-#define VIRTIO_DEFAULT_TRANS(kvm)	VIRTIO_PCI
-
 struct spapr_phb;
 
 struct kvm_arch {
diff --git a/x86/include/kvm/kvm-arch.h b/x86/include/kvm/kvm-arch.h
index d8a7312d..73a957bb 100644
--- a/x86/include/kvm/kvm-arch.h
+++ b/x86/include/kvm/kvm-arch.h
@@ -33,8 +33,6 @@
 
 #define MAX_PAGE_SIZE		SZ_4K
 
-#define VIRTIO_DEFAULT_TRANS(kvm)	VIRTIO_PCI
-
 struct kvm_arch {
 	u16			boot_selector;
 	u16			boot_ip;
diff --git a/builtin-run.c b/builtin-run.c
index 9a1a0c1f..5b9d8f0e 100644
--- a/builtin-run.c
+++ b/builtin-run.c
@@ -128,6 +128,8 @@ void kvm_run_set_wrapper_sandbox(void)
 			" rootfs"),					\
 	OPT_STRING('\0', "hugetlbfs", &(cfg)->hugetlbfs_path, "path",	\
 			"Hugetlbfs path"),				\
+	OPT_BOOLEAN('\0', "virtio-legacy", &(cfg)->virtio_legacy,	\
+		    "Use legacy virtio transport"),			\
 									\
 	OPT_GROUP("Kernel options:"),					\
 	OPT_STRING('k', "kernel", &(cfg)->kernel_filename, "kernel",	\
diff --git a/virtio/core.c b/virtio/core.c
index 09abbf40..39a282e9 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -14,7 +14,7 @@
 
 const char* virtio_trans_name(enum virtio_trans trans)
 {
-	if (trans == VIRTIO_PCI)
+	if (trans == VIRTIO_PCI || trans == VIRTIO_PCI_LEGACY)
 		return "pci";
 	else if (trans == VIRTIO_MMIO)
 		return "mmio";
@@ -329,8 +329,10 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	int r;
 
 	switch (trans) {
-	case VIRTIO_PCI:
+	case VIRTIO_PCI_LEGACY:
 		vdev->legacy			= true;
+		/* fall through */
+	case VIRTIO_PCI:
 		virtio = calloc(sizeof(struct virtio_pci), 1);
 		if (!virtio)
 			return -ENOMEM;
diff --git a/virtio/pci-modern.c b/virtio/pci-modern.c
new file mode 100644
index 00000000..584d1cab
--- /dev/null
+++ b/virtio/pci-modern.c
@@ -0,0 +1,386 @@
+#include "kvm/virtio-pci.h"
+
+#include "kvm/ioport.h"
+#include "kvm/virtio.h"
+#include "kvm/virtio-pci-dev.h"
+
+#include <linux/virtio_config.h>
+
+#define VPCI_CFG_COMMON_SIZE	sizeof(struct virtio_pci_common_cfg)
+#define VPCI_CFG_COMMON_START	0
+#define VPCI_CFG_COMMON_END	(VPCI_CFG_COMMON_SIZE - 1)
+/*
+ * Use a naturally aligned 4-byte doorbell, in case we ever want to
+ * implement VIRTIO_F_NOTIFICATION_DATA
+ */
+#define VPCI_CFG_NOTIFY_SIZE	4
+#define VPCI_CFG_NOTIFY_START	(VPCI_CFG_COMMON_END + 1)
+#define VPCI_CFG_NOTIFY_END	(VPCI_CFG_COMMON_END + VPCI_CFG_NOTIFY_SIZE)
+#define VPCI_CFG_ISR_SIZE	4
+#define VPCI_CFG_ISR_START	(VPCI_CFG_NOTIFY_END + 1)
+#define VPCI_CFG_ISR_END	(VPCI_CFG_NOTIFY_END + VPCI_CFG_ISR_SIZE)
+/*
+ * We're at 64 bytes. Use the remaining 192 bytes in PCI_IO_SIZE for the
+ * device-specific config space. It's sufficient for the devices we
+ * currently implement (virtio_blk_config is 60 bytes) and, I think, all
+ * existing virtio 1.2 devices.
+ */
+#define VPCI_CFG_DEV_START	(VPCI_CFG_ISR_END + 1)
+#define VPCI_CFG_DEV_END	((PCI_IO_SIZE) - 1)
+#define VPCI_CFG_DEV_SIZE	(VPCI_CFG_DEV_END - VPCI_CFG_DEV_START + 1)
+
+#define vpci_selected_vq(vpci) \
+	vdev->ops->get_vq((vpci)->kvm, (vpci)->dev, (vpci)->queue_selector)
+
+typedef bool (*access_handler_t)(struct virtio_device *, unsigned long, void *, int);
+
+static bool virtio_pci__common_write(struct virtio_device *vdev,
+				     unsigned long offset, void *data, int size)
+{
+	u32 val, gsi, vec;
+	struct virtio_pci *vpci = vdev->virtio;
+
+	switch (offset - VPCI_CFG_COMMON_START) {
+	case VIRTIO_PCI_COMMON_DFSELECT:
+		vpci->device_features_sel = ioport__read32(data);
+		break;
+	case VIRTIO_PCI_COMMON_GFSELECT:
+		vpci->driver_features_sel = ioport__read32(data);
+		break;
+	case VIRTIO_PCI_COMMON_GF:
+		val = ioport__read32(data);
+		if (vpci->driver_features_sel == 0)
+			virtio_set_guest_features(vpci->kvm, vdev, vpci->dev, val);
+		break;
+	case VIRTIO_PCI_COMMON_MSIX:
+		vec = vpci->config_vector = ioport__read16(data);
+		gsi = virtio_pci__add_msix_route(vpci, vec);
+		if (gsi < 0)
+			break;
+
+		vpci->config_gsi = gsi;
+		break;
+	case VIRTIO_PCI_COMMON_STATUS:
+		vpci->status = ioport__read8(data);
+		virtio_notify_status(vpci->kvm, vdev, vpci->dev, vpci->status);
+		break;
+	case VIRTIO_PCI_COMMON_Q_SELECT:
+		val = ioport__read16(data);
+		if (val >= (u32)vdev->ops->get_vq_count(vpci->kvm, vpci->dev))
+			pr_warning("invalid vq number %u", val);
+		else
+			vpci->queue_selector = val;
+		break;
+	case VIRTIO_PCI_COMMON_Q_SIZE:
+		vdev->ops->set_size_vq(vpci->kvm, vpci->dev,
+				       vpci->queue_selector,
+				       ioport__read16(data));
+		break;
+	case VIRTIO_PCI_COMMON_Q_MSIX:
+		vec = vpci->vq_vector[vpci->queue_selector] = ioport__read16(data);
+
+		gsi = virtio_pci__add_msix_route(vpci, vec);
+		if (gsi < 0)
+			break;
+
+		vpci->gsis[vpci->queue_selector] = gsi;
+		if (vdev->ops->notify_vq_gsi)
+			vdev->ops->notify_vq_gsi(vpci->kvm, vpci->dev,
+						 vpci->queue_selector, gsi);
+		break;
+	case VIRTIO_PCI_COMMON_Q_ENABLE:
+		val = ioport__read16(data);
+		if (val)
+			virtio_pci_init_vq(vpci->kvm, vdev, vpci->queue_selector);
+		else
+			virtio_pci_exit_vq(vpci->kvm, vdev, vpci->queue_selector);
+		break;
+	case VIRTIO_PCI_COMMON_Q_DESCLO:
+		vpci_selected_vq(vpci)->vring_addr.desc_lo = ioport__read32(data);
+		break;
+	case VIRTIO_PCI_COMMON_Q_DESCHI:
+		vpci_selected_vq(vpci)->vring_addr.desc_hi = ioport__read32(data);
+		break;
+	case VIRTIO_PCI_COMMON_Q_AVAILLO:
+		vpci_selected_vq(vpci)->vring_addr.avail_lo = ioport__read32(data);
+		break;
+	case VIRTIO_PCI_COMMON_Q_AVAILHI:
+		vpci_selected_vq(vpci)->vring_addr.avail_hi = ioport__read32(data);
+		break;
+	case VIRTIO_PCI_COMMON_Q_USEDLO:
+		vpci_selected_vq(vpci)->vring_addr.used_lo = ioport__read32(data);
+		break;
+	case VIRTIO_PCI_COMMON_Q_USEDHI:
+		vpci_selected_vq(vpci)->vring_addr.used_hi = ioport__read32(data);
+		break;
+	}
+
+	return true;
+}
+
+static bool virtio_pci__notify_write(struct virtio_device *vdev,
+				     unsigned long offset, void *data, int size)
+{
+	u16 vq = ioport__read16(data);
+	struct virtio_pci *vpci = vdev->virtio;
+
+	vdev->ops->notify_vq(vpci->kvm, vpci->dev, vq);
+
+	return true;
+}
+
+static bool virtio_pci__config_write(struct virtio_device *vdev,
+				     unsigned long offset, void *data, int size)
+{
+	struct virtio_pci *vpci = vdev->virtio;
+
+	return virtio_access_config(vpci->kvm, vdev, vpci->dev,
+				    offset - VPCI_CFG_DEV_START, data, size,
+				    true);
+}
+
+static bool virtio_pci__common_read(struct virtio_device *vdev,
+				    unsigned long offset, void *data, int size)
+{
+	u32 val;
+	struct virtio_pci *vpci = vdev->virtio;
+	u64 features = 1ULL << VIRTIO_F_VERSION_1;
+
+	switch (offset - VPCI_CFG_COMMON_START) {
+	case VIRTIO_PCI_COMMON_DFSELECT:
+		val = vpci->device_features_sel;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_DF:
+		if (vpci->device_features_sel > 1)
+			break;
+		features |= vdev->ops->get_host_features(vpci->kvm, vpci->dev);
+		val = features >> (32 * vpci->device_features_sel);
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_GFSELECT:
+		val = vpci->driver_features_sel;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_MSIX:
+		val = vpci->config_vector;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_NUMQ:
+		val = vdev->ops->get_vq_count(vpci->kvm, vpci->dev);
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_STATUS:
+		ioport__write8(data, vpci->status);
+		break;
+	case VIRTIO_PCI_COMMON_CFGGENERATION:
+		/*
+		 * The config generation changes when the device updates a
+		 * config field larger than 32 bits, that the driver may read
+		 * using multiple accesses. Since kvmtool doesn't use any
+		 * mutable config field larger than 32 bits, the generation is
+		 * constant.
+		 */
+		ioport__write8(data, 0);
+		break;
+	case VIRTIO_PCI_COMMON_Q_SELECT:
+		ioport__write16(data, vpci->queue_selector);
+		break;
+	case VIRTIO_PCI_COMMON_Q_SIZE:
+		val = vdev->ops->get_size_vq(vpci->kvm, vpci->dev,
+					     vpci->queue_selector);
+		ioport__write16(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_MSIX:
+		val = vpci->vq_vector[vpci->queue_selector];
+		ioport__write16(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_ENABLE:
+		val = vpci_selected_vq(vpci)->enabled;
+		ioport__write16(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_NOFF:
+		val = vpci->queue_selector;
+		ioport__write16(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_DESCLO:
+		val = vpci_selected_vq(vpci)->vring_addr.desc_lo;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_DESCHI:
+		val = vpci_selected_vq(vpci)->vring_addr.desc_hi;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_AVAILLO:
+		val = vpci_selected_vq(vpci)->vring_addr.avail_lo;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_AVAILHI:
+		val = vpci_selected_vq(vpci)->vring_addr.avail_hi;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_USEDLO:
+		val = vpci_selected_vq(vpci)->vring_addr.used_lo;
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_PCI_COMMON_Q_USEDHI:
+		val = vpci_selected_vq(vpci)->vring_addr.used_hi;
+		ioport__write32(data, val);
+		break;
+	};
+
+	return true;
+}
+
+static bool virtio_pci__isr_read(struct virtio_device *vdev,
+				 unsigned long offset, void *data, int size)
+{
+	struct virtio_pci *vpci = vdev->virtio;
+
+	if (WARN_ON(offset - VPCI_CFG_ISR_START != 0))
+		return false;
+
+	ioport__write8(data, vpci->isr);
+	/*
+	 * Interrupts are edge triggered (yes, going against the PCI and virtio
+	 * specs), so no need to deassert the IRQ line.
+	 */
+	vpci->isr = 0;
+
+	return 0;
+}
+
+static bool virtio_pci__config_read(struct virtio_device *vdev,
+				    unsigned long offset, void *data, int size)
+{
+	struct virtio_pci *vpci = vdev->virtio;
+
+	return virtio_access_config(vpci->kvm, vdev, vpci->dev,
+				    offset - VPCI_CFG_DEV_START, data, size,
+				    false);
+}
+
+static bool virtio_pci_access(struct kvm_cpu *vcpu, struct virtio_device *vdev,
+			      unsigned long offset, void *data, int size,
+			      bool write)
+{
+	access_handler_t handler = NULL;
+
+	switch (offset) {
+	case VPCI_CFG_COMMON_START...VPCI_CFG_COMMON_END:
+		if (write)
+			handler = virtio_pci__common_write;
+		else
+			handler = virtio_pci__common_read;
+		break;
+	case VPCI_CFG_NOTIFY_START...VPCI_CFG_NOTIFY_END:
+		if (write)
+			handler = virtio_pci__notify_write;
+		break;
+	case VPCI_CFG_ISR_START...VPCI_CFG_ISR_END:
+		if (!write)
+			handler = virtio_pci__isr_read;
+		break;
+	case VPCI_CFG_DEV_START...VPCI_CFG_DEV_END:
+		if (write)
+			handler = virtio_pci__config_write;
+		else
+			handler = virtio_pci__config_read;
+		break;
+	}
+
+	if (!handler)
+		return false;
+
+	return handler(vdev, offset, data, size);
+}
+
+void virtio_pci_modern__io_mmio_callback(struct kvm_cpu *vcpu, u64 addr,
+					 u8 *data, u32 len, u8 is_write,
+					 void *ptr)
+{
+	struct virtio_device *vdev = ptr;
+	struct virtio_pci *vpci = vdev->virtio;
+	u32 mmio_addr = virtio_pci__mmio_addr(vpci);
+
+	virtio_pci_access(vcpu, vdev, addr - mmio_addr, data, len, is_write);
+}
+
+int virtio_pci_modern_init(struct virtio_device *vdev)
+{
+	int subsys_id;
+	struct virtio_pci *vpci = vdev->virtio;
+	struct pci_device_header *hdr = &vpci->pci_hdr;
+
+	subsys_id = le16_to_cpu(hdr->subsys_id);
+
+	hdr->device_id = cpu_to_le16(PCI_DEVICE_ID_VIRTIO_BASE + subsys_id);
+	hdr->subsys_id = cpu_to_le16(PCI_SUBSYS_ID_VIRTIO_BASE + subsys_id);
+
+	vpci->doorbell_offset = VPCI_CFG_NOTIFY_START;
+	vdev->endian = VIRTIO_ENDIAN_LE;
+
+	hdr->msix.next = PCI_CAP_OFF(hdr, virtio);
+
+	hdr->virtio.common = (struct virtio_pci_cap) {
+		.cap_vndr		= PCI_CAP_ID_VNDR,
+		.cap_next		= PCI_CAP_OFF(hdr, virtio.notify),
+		.cap_len		= sizeof(hdr->virtio.common),
+		.cfg_type		= VIRTIO_PCI_CAP_COMMON_CFG,
+		.bar			= 1,
+		.offset			= cpu_to_le32(VPCI_CFG_COMMON_START),
+		.length			= cpu_to_le32(VPCI_CFG_COMMON_SIZE),
+	};
+	BUILD_BUG_ON(VPCI_CFG_COMMON_START & 0x3);
+
+	hdr->virtio.notify = (struct virtio_pci_notify_cap) {
+		.cap.cap_vndr		= PCI_CAP_ID_VNDR,
+		.cap.cap_next		= PCI_CAP_OFF(hdr, virtio.isr),
+		.cap.cap_len		= sizeof(hdr->virtio.notify),
+		.cap.cfg_type		= VIRTIO_PCI_CAP_NOTIFY_CFG,
+		.cap.bar		= 1,
+		.cap.offset		= cpu_to_le32(VPCI_CFG_NOTIFY_START),
+		.cap.length		= cpu_to_le32(VPCI_CFG_NOTIFY_SIZE),
+		/*
+		 * Notify multiplier is 0, meaning that notifications are all on
+		 * the same register
+		 */
+	};
+	BUILD_BUG_ON(VPCI_CFG_NOTIFY_START & 0x3);
+
+	hdr->virtio.isr = (struct virtio_pci_cap) {
+		.cap_vndr		= PCI_CAP_ID_VNDR,
+		.cap_next		= PCI_CAP_OFF(hdr, virtio.device),
+		.cap_len		= sizeof(hdr->virtio.isr),
+		.cfg_type		= VIRTIO_PCI_CAP_ISR_CFG,
+		.bar			= 1,
+		.offset			= cpu_to_le32(VPCI_CFG_ISR_START),
+		.length			= cpu_to_le32(VPCI_CFG_ISR_SIZE),
+	};
+
+	hdr->virtio.device = (struct virtio_pci_cap) {
+		.cap_vndr		= PCI_CAP_ID_VNDR,
+		.cap_next		= PCI_CAP_OFF(hdr, virtio.pci),
+		.cap_len		= sizeof(hdr->virtio.device),
+		.cfg_type		= VIRTIO_PCI_CAP_DEVICE_CFG,
+		.bar			= 1,
+		.offset			= cpu_to_le32(VPCI_CFG_DEV_START),
+		.length			= cpu_to_le32(VPCI_CFG_DEV_SIZE),
+	};
+	BUILD_BUG_ON(VPCI_CFG_DEV_START & 0x3);
+
+	/*
+	 * TODO: implement this weird proxy capability (it is a "MUST" in the
+	 * spec, but I don't know if anyone actually uses it).
+	 * It doesn't use any BAR space. Instead the driver writes .cap.offset
+	 * and .cap.length to access a register in a BAR.
+	 */
+	hdr->virtio.pci = (struct virtio_pci_cfg_cap) {
+		.cap.cap_vndr		= PCI_CAP_ID_VNDR,
+		.cap.cap_next		= 0,
+		.cap.cap_len		= sizeof(hdr->virtio.pci),
+		.cap.cfg_type		= VIRTIO_PCI_CAP_PCI_CFG,
+	};
+
+	return 0;
+}
diff --git a/virtio/pci.c b/virtio/pci.c
index 2e070211..8bea8c73 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -274,9 +274,15 @@ static int virtio_pci__bar_activate(struct kvm *kvm,
 				    int bar_num, void *data)
 {
 	struct virtio_device *vdev = data;
+	mmio_handler_fn mmio_fn;
 	u32 bar_addr, bar_size;
 	int r = -EINVAL;
 
+	if (vdev->legacy)
+		mmio_fn = &virtio_pci_legacy__io_mmio_callback;
+	else
+		mmio_fn = &virtio_pci_modern__io_mmio_callback;
+
 	assert(bar_num <= 2);
 
 	bar_addr = pci__bar_address(pci_hdr, bar_num);
@@ -284,13 +290,10 @@ static int virtio_pci__bar_activate(struct kvm *kvm,
 
 	switch (bar_num) {
 	case 0:
-		r = kvm__register_pio(kvm, bar_addr, bar_size,
-				      virtio_pci_legacy__io_mmio_callback,
-				      vdev);
+		r = kvm__register_pio(kvm, bar_addr, bar_size, mmio_fn, vdev);
 		break;
 	case 1:
-		r =  kvm__register_mmio(kvm, bar_addr, bar_size, false,
-					virtio_pci_legacy__io_mmio_callback,
+		r =  kvm__register_mmio(kvm, bar_addr, bar_size, false, mmio_fn,
 					vdev);
 		break;
 	case 2:
@@ -346,14 +349,12 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	mmio_addr = pci_get_mmio_block(PCI_IO_SIZE);
 	msix_io_block = pci_get_mmio_block(VIRTIO_MSIX_BAR_SIZE);
 
-	vpci->doorbell_offset = VIRTIO_PCI_QUEUE_NOTIFY;
-
 	vpci->pci_hdr = (struct pci_device_header) {
 		.vendor_id		= cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET),
 		.device_id		= cpu_to_le16(device_id),
 		.command		= PCI_COMMAND_IO | PCI_COMMAND_MEMORY,
 		.header_type		= PCI_HEADER_TYPE_NORMAL,
-		.revision_id		= 0,
+		.revision_id		= vdev->legacy ? 0 : 1,
 		.class[0]		= class & 0xff,
 		.class[1]		= (class >> 8) & 0xff,
 		.class[2]		= (class >> 16) & 0xff,
@@ -366,7 +367,7 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		.bar[2]			= cpu_to_le32(msix_io_block
 							| PCI_BASE_ADDRESS_SPACE_MEMORY),
 		.status			= cpu_to_le16(PCI_STATUS_CAP_LIST),
-		.capabilities		= (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr,
+		.capabilities		= PCI_CAP_OFF(&vpci->pci_hdr, msix),
 		.bar_size[0]		= cpu_to_le32(PCI_IO_SIZE),
 		.bar_size[1]		= cpu_to_le32(PCI_IO_SIZE),
 		.bar_size[2]		= cpu_to_le32(VIRTIO_MSIX_BAR_SIZE),
@@ -413,6 +414,11 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	if (r < 0)
 		return r;
 
+	if (vdev->legacy)
+		vpci->doorbell_offset = VIRTIO_PCI_QUEUE_NOTIFY;
+	else
+		return virtio_pci_modern_init(vdev);
+
 	return 0;
 }
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 22/24] virtio: Move MMIO transport to mmio-legacy
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (20 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 21/24] virtio: Add support for modern virtio-pci Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 23/24] virtio: Add support for modern virtio-mmio Jean-Philippe Brucker
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

To make space for the modern register layout, move the current code to
mmio-legacy.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 Makefile                  |   1 +
 include/kvm/virtio-mmio.h |   9 +++
 virtio/mmio-legacy.c      | 150 +++++++++++++++++++++++++++++++++++
 virtio/mmio.c             | 160 ++------------------------------------
 4 files changed, 165 insertions(+), 155 deletions(-)
 create mode 100644 virtio/mmio-legacy.c

diff --git a/Makefile b/Makefile
index cb76b26d..186dc541 100644
--- a/Makefile
+++ b/Makefile
@@ -106,6 +106,7 @@ OBJS	+= virtio/9p-pdu.o
 OBJS	+= kvm-ipc.o
 OBJS	+= builtin-sandbox.o
 OBJS	+= virtio/mmio.o
+OBJS	+= virtio/mmio-legacy.o
 
 # Translate uname -m into ARCH string
 ARCH ?= $(shell uname -m | sed -e s/i.86/i386/ -e s/ppc.*/powerpc/ \
diff --git a/include/kvm/virtio-mmio.h b/include/kvm/virtio-mmio.h
index 6bc50bd1..e7ef8386 100644
--- a/include/kvm/virtio-mmio.h
+++ b/include/kvm/virtio-mmio.h
@@ -4,6 +4,8 @@
 #include <linux/types.h>
 #include <linux/virtio_mmio.h>
 
+#include <kvm/kvm-cpu.h>
+
 #define VIRTIO_MMIO_MAX_VQ	32
 #define VIRTIO_MMIO_MAX_CONFIG	1
 #define VIRTIO_MMIO_IO_SIZE	0x200
@@ -57,4 +59,11 @@ int virtio_mmio_exit(struct kvm *kvm, struct virtio_device *vdev);
 int virtio_mmio_reset(struct kvm *kvm, struct virtio_device *vdev);
 int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		      int device_id, int subsys_id, int class);
+int virtio_mmio_init_ioeventfd(struct kvm *kvm, struct virtio_device *vdev,
+			       u32 vq);
+
+void virtio_mmio_legacy_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+				 u32 len, u8 is_write, void *ptr);
+int virtio_mmio_init_vq(struct kvm *kvm, struct virtio_device *vdev, int vq);
+void virtio_mmio_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq);
 #endif
diff --git a/virtio/mmio-legacy.c b/virtio/mmio-legacy.c
new file mode 100644
index 00000000..7ca7e69f
--- /dev/null
+++ b/virtio/mmio-legacy.c
@@ -0,0 +1,150 @@
+#include "kvm/ioport.h"
+#include "kvm/virtio.h"
+#include "kvm/virtio-mmio.h"
+
+#include <linux/virtio_mmio.h>
+
+#define vmmio_selected_vq(vdev, vmmio) \
+	(vdev)->ops->get_vq((vmmio)->kvm, (vmmio)->dev, (vmmio)->hdr.queue_sel)
+
+static void virtio_mmio_config_in(struct kvm_cpu *vcpu,
+				  u64 addr, void *data, u32 len,
+				  struct virtio_device *vdev)
+{
+	struct virtio_mmio *vmmio = vdev->virtio;
+	struct virt_queue *vq;
+	u32 val = 0;
+
+	switch (addr) {
+	case VIRTIO_MMIO_MAGIC_VALUE:
+	case VIRTIO_MMIO_VERSION:
+	case VIRTIO_MMIO_DEVICE_ID:
+	case VIRTIO_MMIO_VENDOR_ID:
+	case VIRTIO_MMIO_STATUS:
+	case VIRTIO_MMIO_INTERRUPT_STATUS:
+		ioport__write32(data, *(u32 *)(((void *)&vmmio->hdr) + addr));
+		break;
+	case VIRTIO_MMIO_DEVICE_FEATURES:
+		if (vmmio->hdr.host_features_sel == 0)
+			val = vdev->ops->get_host_features(vmmio->kvm,
+							   vmmio->dev);
+		ioport__write32(data, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_PFN:
+		vq = vmmio_selected_vq(vdev, vmmio);
+		ioport__write32(data, vq->vring_addr.pfn);
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM_MAX:
+		val = vdev->ops->get_size_vq(vmmio->kvm, vmmio->dev,
+					     vmmio->hdr.queue_sel);
+		ioport__write32(data, val);
+		break;
+	default:
+		break;
+	}
+}
+
+static void virtio_mmio_config_out(struct kvm_cpu *vcpu,
+				   u64 addr, void *data, u32 len,
+				   struct virtio_device *vdev)
+{
+	struct virtio_mmio *vmmio = vdev->virtio;
+	struct kvm *kvm = vmmio->kvm;
+	unsigned int vq_count = vdev->ops->get_vq_count(kvm, vmmio->dev);
+	struct virt_queue *vq;
+	u32 val = 0;
+
+	switch (addr) {
+	case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
+	case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
+		val = ioport__read32(data);
+		*(u32 *)(((void *)&vmmio->hdr) + addr) = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_SEL:
+		val = ioport__read32(data);
+		if (val >= vq_count) {
+			WARN_ONCE(1, "QUEUE_SEL value (%u) is larger than VQ count (%u)\n",
+				val, vq_count);
+			break;
+		}
+		*(u32 *)(((void *)&vmmio->hdr) + addr) = val;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		vmmio->hdr.status = ioport__read32(data);
+		if (!vmmio->hdr.status) /* Sample endianness on reset */
+			vdev->endian = kvm_cpu__get_endianness(vcpu);
+		virtio_notify_status(kvm, vdev, vmmio->dev, vmmio->hdr.status);
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES:
+		if (vmmio->hdr.guest_features_sel == 0) {
+			val = ioport__read32(data);
+			virtio_set_guest_features(vmmio->kvm, vdev,
+						  vmmio->dev, val);
+		}
+		break;
+	case VIRTIO_MMIO_GUEST_PAGE_SIZE:
+		val = ioport__read32(data);
+		vmmio->hdr.guest_page_size = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM:
+		val = ioport__read32(data);
+		vmmio->hdr.queue_num = val;
+		vdev->ops->set_size_vq(vmmio->kvm, vmmio->dev,
+				       vmmio->hdr.queue_sel, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_ALIGN:
+		val = ioport__read32(data);
+		vmmio->hdr.queue_align = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_PFN:
+		val = ioport__read32(data);
+		if (val) {
+			vq = vmmio_selected_vq(vdev, vmmio);
+			vq->vring_addr = (struct vring_addr) {
+				.legacy	= true,
+				.pfn	= val,
+				.align	= vmmio->hdr.queue_align,
+				.pgsize	= vmmio->hdr.guest_page_size,
+			};
+			virtio_mmio_init_vq(kvm, vdev, vmmio->hdr.queue_sel);
+		} else {
+			virtio_mmio_exit_vq(kvm, vdev, vmmio->hdr.queue_sel);
+		}
+		break;
+	case VIRTIO_MMIO_QUEUE_NOTIFY:
+		val = ioport__read32(data);
+		if (val >= vq_count) {
+			WARN_ONCE(1, "QUEUE_NOTIFY value (%u) is larger than VQ count (%u)\n",
+				val, vq_count);
+			break;
+		}
+		vdev->ops->notify_vq(vmmio->kvm, vmmio->dev, val);
+		break;
+	case VIRTIO_MMIO_INTERRUPT_ACK:
+		val = ioport__read32(data);
+		vmmio->hdr.interrupt_state &= ~val;
+		break;
+	default:
+		break;
+	};
+}
+
+void virtio_mmio_legacy_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+				 u32 len, u8 is_write, void *ptr)
+{
+	struct virtio_device *vdev = ptr;
+	struct virtio_mmio *vmmio = vdev->virtio;
+	u32 offset = addr - vmmio->addr;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+		virtio_access_config(vmmio->kvm, vdev, vmmio->dev, offset, data,
+				     len, is_write);
+		return;
+	}
+
+	if (is_write)
+		virtio_mmio_config_out(vcpu, offset, data, len, ptr);
+	else
+		virtio_mmio_config_in(vcpu, offset, data, len, ptr);
+}
diff --git a/virtio/mmio.c b/virtio/mmio.c
index 2a96e0e3..fab45733 100644
--- a/virtio/mmio.c
+++ b/virtio/mmio.c
@@ -1,10 +1,8 @@
 #include "kvm/devices.h"
 #include "kvm/virtio-mmio.h"
 #include "kvm/ioeventfd.h"
-#include "kvm/ioport.h"
 #include "kvm/virtio.h"
 #include "kvm/kvm.h"
-#include "kvm/kvm-cpu.h"
 #include "kvm/irq.h"
 #include "kvm/fdt.h"
 
@@ -29,8 +27,8 @@ static void virtio_mmio_ioevent_callback(struct kvm *kvm, void *param)
 	ioeventfd->vdev->ops->notify_vq(kvm, vmmio->dev, ioeventfd->vq);
 }
 
-static int virtio_mmio_init_ioeventfd(struct kvm *kvm,
-				      struct virtio_device *vdev, u32 vq)
+int virtio_mmio_init_ioeventfd(struct kvm *kvm, struct virtio_device *vdev,
+			       u32 vq)
 {
 	struct virtio_mmio *vmmio = vdev->virtio;
 	struct ioevent ioevent;
@@ -79,8 +77,7 @@ int virtio_mmio_signal_vq(struct kvm *kvm, struct virtio_device *vdev, u32 vq)
 	return 0;
 }
 
-static int virtio_mmio_init_vq(struct kvm *kvm, struct virtio_device *vdev,
-			       int vq)
+int virtio_mmio_init_vq(struct kvm *kvm, struct virtio_device *vdev, int vq)
 {
 	int ret;
 	struct virtio_mmio *vmmio = vdev->virtio;
@@ -93,8 +90,7 @@ static int virtio_mmio_init_vq(struct kvm *kvm, struct virtio_device *vdev,
 	return vdev->ops->init_vq(vmmio->kvm, vmmio->dev, vq);
 }
 
-static void virtio_mmio_exit_vq(struct kvm *kvm, struct virtio_device *vdev,
-				int vq)
+void virtio_mmio_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq)
 {
 	struct virtio_mmio *vmmio = vdev->virtio;
 
@@ -112,152 +108,6 @@ int virtio_mmio_signal_config(struct kvm *kvm, struct virtio_device *vdev)
 	return 0;
 }
 
-#define vmmio_selected_vq(vdev, vmmio) \
-	(vdev)->ops->get_vq((vmmio)->kvm, (vmmio)->dev, (vmmio)->hdr.queue_sel)
-
-static void virtio_mmio_config_in(struct kvm_cpu *vcpu,
-				  u64 addr, void *data, u32 len,
-				  struct virtio_device *vdev)
-{
-	struct virtio_mmio *vmmio = vdev->virtio;
-	struct virt_queue *vq;
-	u32 val = 0;
-
-	switch (addr) {
-	case VIRTIO_MMIO_MAGIC_VALUE:
-	case VIRTIO_MMIO_VERSION:
-	case VIRTIO_MMIO_DEVICE_ID:
-	case VIRTIO_MMIO_VENDOR_ID:
-	case VIRTIO_MMIO_STATUS:
-	case VIRTIO_MMIO_INTERRUPT_STATUS:
-		ioport__write32(data, *(u32 *)(((void *)&vmmio->hdr) + addr));
-		break;
-	case VIRTIO_MMIO_DEVICE_FEATURES:
-		if (vmmio->hdr.host_features_sel == 0)
-			val = vdev->ops->get_host_features(vmmio->kvm,
-							   vmmio->dev);
-		ioport__write32(data, val);
-		break;
-	case VIRTIO_MMIO_QUEUE_PFN:
-		vq = vmmio_selected_vq(vdev, vmmio);
-		ioport__write32(data, vq->vring_addr.pfn);
-		break;
-	case VIRTIO_MMIO_QUEUE_NUM_MAX:
-		val = vdev->ops->get_size_vq(vmmio->kvm, vmmio->dev,
-					     vmmio->hdr.queue_sel);
-		ioport__write32(data, val);
-		break;
-	default:
-		break;
-	}
-}
-
-static void virtio_mmio_config_out(struct kvm_cpu *vcpu,
-				   u64 addr, void *data, u32 len,
-				   struct virtio_device *vdev)
-{
-	struct virtio_mmio *vmmio = vdev->virtio;
-	struct kvm *kvm = vmmio->kvm;
-	unsigned int vq_count = vdev->ops->get_vq_count(kvm, vmmio->dev);
-	struct virt_queue *vq;
-	u32 val = 0;
-
-	switch (addr) {
-	case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
-	case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
-		val = ioport__read32(data);
-		*(u32 *)(((void *)&vmmio->hdr) + addr) = val;
-		break;
-	case VIRTIO_MMIO_QUEUE_SEL:
-		val = ioport__read32(data);
-		if (val >= vq_count) {
-			WARN_ONCE(1, "QUEUE_SEL value (%u) is larger than VQ count (%u)\n",
-				val, vq_count);
-			break;
-		}
-		*(u32 *)(((void *)&vmmio->hdr) + addr) = val;
-		break;
-	case VIRTIO_MMIO_STATUS:
-		vmmio->hdr.status = ioport__read32(data);
-		if (!vmmio->hdr.status) /* Sample endianness on reset */
-			vdev->endian = kvm_cpu__get_endianness(vcpu);
-		virtio_notify_status(kvm, vdev, vmmio->dev, vmmio->hdr.status);
-		break;
-	case VIRTIO_MMIO_DRIVER_FEATURES:
-		if (vmmio->hdr.guest_features_sel == 0) {
-			val = ioport__read32(data);
-			virtio_set_guest_features(vmmio->kvm, vdev,
-						  vmmio->dev, val);
-		}
-		break;
-	case VIRTIO_MMIO_GUEST_PAGE_SIZE:
-		val = ioport__read32(data);
-		vmmio->hdr.guest_page_size = val;
-		break;
-	case VIRTIO_MMIO_QUEUE_NUM:
-		val = ioport__read32(data);
-		vmmio->hdr.queue_num = val;
-		vdev->ops->set_size_vq(vmmio->kvm, vmmio->dev,
-				       vmmio->hdr.queue_sel, val);
-		break;
-	case VIRTIO_MMIO_QUEUE_ALIGN:
-		val = ioport__read32(data);
-		vmmio->hdr.queue_align = val;
-		break;
-	case VIRTIO_MMIO_QUEUE_PFN:
-		val = ioport__read32(data);
-		if (val) {
-			vq = vmmio_selected_vq(vdev, vmmio);
-			vq->vring_addr = (struct vring_addr) {
-				.legacy	= true,
-				.pfn	= val,
-				.align	= vmmio->hdr.queue_align,
-				.pgsize	= vmmio->hdr.guest_page_size,
-			};
-			virtio_mmio_init_vq(kvm, vdev, vmmio->hdr.queue_sel);
-		} else {
-			virtio_mmio_exit_vq(kvm, vdev, vmmio->hdr.queue_sel);
-		}
-		break;
-	case VIRTIO_MMIO_QUEUE_NOTIFY:
-		val = ioport__read32(data);
-		if (val >= vq_count) {
-			WARN_ONCE(1, "QUEUE_NOTIFY value (%u) is larger than VQ count (%u)\n",
-				val, vq_count);
-			break;
-		}
-		vdev->ops->notify_vq(vmmio->kvm, vmmio->dev, val);
-		break;
-	case VIRTIO_MMIO_INTERRUPT_ACK:
-		val = ioport__read32(data);
-		vmmio->hdr.interrupt_state &= ~val;
-		break;
-	default:
-		break;
-	};
-}
-
-static void virtio_mmio_mmio_callback(struct kvm_cpu *vcpu,
-				      u64 addr, u8 *data, u32 len,
-				      u8 is_write, void *ptr)
-{
-	struct virtio_device *vdev = ptr;
-	struct virtio_mmio *vmmio = vdev->virtio;
-	u32 offset = addr - vmmio->addr;
-
-	if (offset >= VIRTIO_MMIO_CONFIG) {
-		offset -= VIRTIO_MMIO_CONFIG;
-		virtio_access_config(vmmio->kvm, vdev, vmmio->dev, offset, data,
-				     len, is_write);
-		return;
-	}
-
-	if (is_write)
-		virtio_mmio_config_out(vcpu, offset, data, len, ptr);
-	else
-		virtio_mmio_config_in(vcpu, offset, data, len, ptr);
-}
-
 #ifdef CONFIG_HAS_LIBFDT
 #define DEVICE_NAME_MAX_LEN 32
 static
@@ -307,7 +157,7 @@ int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	vmmio->dev	= dev;
 
 	r = kvm__register_mmio(kvm, vmmio->addr, VIRTIO_MMIO_IO_SIZE,
-			       false, virtio_mmio_mmio_callback, vdev);
+			       false, virtio_mmio_legacy_callback, vdev);
 	if (r < 0)
 		return r;
 
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 23/24] virtio: Add support for modern virtio-mmio
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (21 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 22/24] virtio: Move MMIO transport to mmio-legacy Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-07 17:02 ` [PATCH kvmtool 24/24] virtio/pci: Initialize all vectors to VIRTIO_MSI_NO_VECTOR Jean-Philippe Brucker
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

Add modern MMIO transport to virtio, make it the default. Legacy transport
can be enabled with --virtio-legacy. The main change for MMIO is the queue
addresses. They are now 64-bit addresses instead of 32-bit PFNs. Apart
from that all changes for supporting modern devices are already
implemented.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 Makefile                          |   1 +
 arm/include/arm-common/kvm-arch.h |   2 +-
 include/kvm/virtio-mmio.h         |  20 +++-
 include/kvm/virtio.h              |   1 +
 virtio/core.c                     |   6 +-
 virtio/mmio-modern.c              | 157 ++++++++++++++++++++++++++++++
 virtio/mmio.c                     |  12 ++-
 7 files changed, 189 insertions(+), 10 deletions(-)
 create mode 100644 virtio/mmio-modern.c

diff --git a/Makefile b/Makefile
index 186dc541..df15c783 100644
--- a/Makefile
+++ b/Makefile
@@ -107,6 +107,7 @@ OBJS	+= kvm-ipc.o
 OBJS	+= builtin-sandbox.o
 OBJS	+= virtio/mmio.o
 OBJS	+= virtio/mmio-legacy.o
+OBJS	+= virtio/mmio-modern.o
 
 # Translate uname -m into ARCH string
 ARCH ?= $(shell uname -m | sed -e s/i.86/i386/ -e s/ppc.*/powerpc/ \
diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index 0a960f4f..a4c7604a 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -84,7 +84,7 @@
 #define VIRTIO_DEFAULT_TRANS(kvm)					\
 	((kvm)->cfg.arch.virtio_trans_pci ?				\
 	 ((kvm)->cfg.virtio_legacy ? VIRTIO_PCI_LEGACY : VIRTIO_PCI) :	\
-	 VIRTIO_MMIO)
+	 ((kvm)->cfg.virtio_legacy ? VIRTIO_MMIO_LEGACY : VIRTIO_MMIO))
 
 #define VIRTIO_RING_ENDIAN	(VIRTIO_ENDIAN_LE | VIRTIO_ENDIAN_BE)
 
diff --git a/include/kvm/virtio-mmio.h b/include/kvm/virtio-mmio.h
index e7ef8386..b428b8d3 100644
--- a/include/kvm/virtio-mmio.h
+++ b/include/kvm/virtio-mmio.h
@@ -27,20 +27,30 @@ struct virtio_mmio_hdr {
 	u32	reserved_1[2];
 	u32	guest_features;
 	u32	guest_features_sel;
-	u32	guest_page_size;
+	u32	guest_page_size;	/* legacy */
 	u32	reserved_2;
 	u32	queue_sel;
 	u32	queue_num_max;
 	u32	queue_num;
-	u32	queue_align;
-	u32	queue_pfn;
-	u32	reserved_3[3];
+	u32	queue_align;		/* legacy */
+	u32	queue_pfn;		/* legacy */
+	u32	queue_ready;		/* modern */
+	u32	reserved_3[2];
 	u32	queue_notify;
 	u32	reserved_4[3];
 	u32	interrupt_state;
 	u32	interrupt_ack;
 	u32	reserved_5[2];
 	u32	status;
+	u32	reserved_7[3];
+	u32	queue_desc_low;		/* modern */
+	u32	queue_desc_high;
+	u32	reserved_8[2];
+	u32	queue_avail_low;
+	u32	queue_avail_high;
+	u32	reserved_9[2];
+	u32	queue_used_low;
+	u32	queue_used_high;
 } __attribute__((packed));
 
 struct virtio_mmio {
@@ -64,6 +74,8 @@ int virtio_mmio_init_ioeventfd(struct kvm *kvm, struct virtio_device *vdev,
 
 void virtio_mmio_legacy_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
 				 u32 len, u8 is_write, void *ptr);
+void virtio_mmio_modern_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+				 u32 len, u8 is_write, void *ptr);
 int virtio_mmio_init_vq(struct kvm *kvm, struct virtio_device *vdev, int vq);
 void virtio_mmio_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq);
 #endif
diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 9d77666c..ced2701c 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -196,6 +196,7 @@ enum virtio_trans {
 	VIRTIO_PCI,
 	VIRTIO_PCI_LEGACY,
 	VIRTIO_MMIO,
+	VIRTIO_MMIO_LEGACY,
 };
 
 struct virtio_device {
diff --git a/virtio/core.c b/virtio/core.c
index 39a282e9..069b82d2 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -16,7 +16,7 @@ const char* virtio_trans_name(enum virtio_trans trans)
 {
 	if (trans == VIRTIO_PCI || trans == VIRTIO_PCI_LEGACY)
 		return "pci";
-	else if (trans == VIRTIO_MMIO)
+	else if (trans == VIRTIO_MMIO || trans == VIRTIO_MMIO_LEGACY)
 		return "mmio";
 	return "unknown";
 }
@@ -345,8 +345,10 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		vdev->ops->reset		= virtio_pci__reset;
 		r = vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class);
 		break;
-	case VIRTIO_MMIO:
+	case VIRTIO_MMIO_LEGACY:
 		vdev->legacy			= true;
+		/* fall through */
+	case VIRTIO_MMIO:
 		virtio = calloc(sizeof(struct virtio_mmio), 1);
 		if (!virtio)
 			return -ENOMEM;
diff --git a/virtio/mmio-modern.c b/virtio/mmio-modern.c
new file mode 100644
index 00000000..6a112d34
--- /dev/null
+++ b/virtio/mmio-modern.c
@@ -0,0 +1,157 @@
+#include "kvm/virtio.h"
+#include "kvm/virtio-mmio.h"
+
+#include <linux/byteorder.h>
+
+#define vmmio_selected_vq(vmmio) \
+	vdev->ops->get_vq((vmmio)->kvm, (vmmio)->dev, (vmmio)->hdr.queue_sel)
+
+static void virtio_mmio_config_in(struct kvm_cpu *vcpu,
+				  u64 addr, u32 *data, u32 len,
+				  struct virtio_device *vdev)
+{
+	struct virtio_mmio *vmmio = vdev->virtio;
+	u64 features = 1ULL << VIRTIO_F_VERSION_1;
+	u32 val = 0;
+
+	switch (addr) {
+	case VIRTIO_MMIO_MAGIC_VALUE:
+	case VIRTIO_MMIO_VERSION:
+	case VIRTIO_MMIO_DEVICE_ID:
+	case VIRTIO_MMIO_VENDOR_ID:
+	case VIRTIO_MMIO_STATUS:
+	case VIRTIO_MMIO_INTERRUPT_STATUS:
+		val = *(u32 *)(((void *)&vmmio->hdr) + addr);
+		break;
+	case VIRTIO_MMIO_DEVICE_FEATURES:
+		if (vmmio->hdr.host_features_sel > 1)
+			break;
+		features |= vdev->ops->get_host_features(vmmio->kvm, vmmio->dev);
+		val = features >> (32 * vmmio->hdr.host_features_sel);
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM_MAX:
+		val = vdev->ops->get_size_vq(vmmio->kvm, vmmio->dev,
+					     vmmio->hdr.queue_sel);
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		val = vmmio_selected_vq(vmmio)->enabled;
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_LOW:
+		val = vmmio_selected_vq(vmmio)->vring_addr.desc_lo;
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_HIGH:
+		val = vmmio_selected_vq(vmmio)->vring_addr.desc_hi;
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_LOW:
+		val = vmmio_selected_vq(vmmio)->vring_addr.used_lo;
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_HIGH:
+		val = vmmio_selected_vq(vmmio)->vring_addr.used_hi;
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_LOW:
+		val = vmmio_selected_vq(vmmio)->vring_addr.avail_lo;
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_HIGH:
+		val = vmmio_selected_vq(vmmio)->vring_addr.avail_hi;
+		break;
+	case VIRTIO_MMIO_CONFIG_GENERATION:
+		/*
+		 * The config generation changes when the device updates a
+		 * config field larger than 32 bits, that the driver reads using
+		 * multiple accesses. Since kvmtool doesn't use any mutable
+		 * config field larger than 32 bits, the generation is constant.
+		 */
+		break;
+	default:
+		return;
+	}
+
+	*data = cpu_to_le32(val);
+}
+
+static void virtio_mmio_config_out(struct kvm_cpu *vcpu,
+				   u64 addr, u32 *data, u32 len,
+				   struct virtio_device *vdev)
+{
+	struct virtio_mmio *vmmio = vdev->virtio;
+	struct kvm *kvm = vmmio->kvm;
+	u32 val = le32_to_cpu(*data);
+
+	switch (addr) {
+	case VIRTIO_MMIO_DEVICE_FEATURES_SEL:
+	case VIRTIO_MMIO_DRIVER_FEATURES_SEL:
+	case VIRTIO_MMIO_QUEUE_SEL:
+		*(u32 *)(((void *)&vmmio->hdr) + addr) = val;
+		break;
+	case VIRTIO_MMIO_STATUS:
+		vmmio->hdr.status = val;
+		virtio_notify_status(kvm, vdev, vmmio->dev, val);
+		break;
+	case VIRTIO_MMIO_DRIVER_FEATURES:
+		if (vmmio->hdr.guest_features_sel == 0)
+			virtio_set_guest_features(vmmio->kvm, vdev, vmmio->dev,
+						  val);
+		break;
+	case VIRTIO_MMIO_QUEUE_NUM:
+		vmmio->hdr.queue_num = val;
+		vdev->ops->set_size_vq(vmmio->kvm, vmmio->dev,
+				       vmmio->hdr.queue_sel, val);
+		break;
+	case VIRTIO_MMIO_QUEUE_READY:
+		if (val)
+			virtio_mmio_init_vq(kvm, vdev, vmmio->hdr.queue_sel);
+		else
+			virtio_mmio_exit_vq(kvm, vdev, vmmio->hdr.queue_sel);
+		break;
+	case VIRTIO_MMIO_QUEUE_NOTIFY:
+		vdev->ops->notify_vq(vmmio->kvm, vmmio->dev, val);
+		break;
+	case VIRTIO_MMIO_INTERRUPT_ACK:
+		vmmio->hdr.interrupt_state &= ~val;
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_LOW:
+		vmmio_selected_vq(vmmio)->vring_addr.desc_lo = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_DESC_HIGH:
+		vmmio_selected_vq(vmmio)->vring_addr.desc_hi = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_LOW:
+		vmmio_selected_vq(vmmio)->vring_addr.used_lo = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_USED_HIGH:
+		vmmio_selected_vq(vmmio)->vring_addr.used_hi = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_LOW:
+		vmmio_selected_vq(vmmio)->vring_addr.avail_lo = val;
+		break;
+	case VIRTIO_MMIO_QUEUE_AVAIL_HIGH:
+		vmmio_selected_vq(vmmio)->vring_addr.avail_hi = val;
+		break;
+	};
+}
+
+void virtio_mmio_modern_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data,
+				 u32 len, u8 is_write, void *ptr)
+{
+	struct virtio_device *vdev = ptr;
+	struct virtio_mmio *vmmio = vdev->virtio;
+	u32 offset = addr - vmmio->addr;
+
+	if (offset >= VIRTIO_MMIO_CONFIG) {
+		offset -= VIRTIO_MMIO_CONFIG;
+		virtio_access_config(vmmio->kvm, vdev, vmmio->dev, offset, data,
+				     len, is_write);
+		return;
+	}
+
+	if (len != 4) {
+		pr_debug("Invalid %s size %d at 0x%llx", is_write ? "write" :
+			 "read", len, addr);
+		return;
+	}
+
+	if (is_write)
+		virtio_mmio_config_out(vcpu, offset, (void *)data, len, ptr);
+	else
+		virtio_mmio_config_in(vcpu, offset, (void *)data, len, ptr);
+}
diff --git a/virtio/mmio.c b/virtio/mmio.c
index fab45733..fae73b52 100644
--- a/virtio/mmio.c
+++ b/virtio/mmio.c
@@ -149,6 +149,7 @@ static void generate_virtio_mmio_fdt_node(void *fdt,
 int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 		     int device_id, int subsys_id, int class)
 {
+	bool legacy = vdev->legacy;
 	struct virtio_mmio *vmmio = vdev->virtio;
 	int r;
 
@@ -156,14 +157,19 @@ int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	vmmio->kvm	= kvm;
 	vmmio->dev	= dev;
 
-	r = kvm__register_mmio(kvm, vmmio->addr, VIRTIO_MMIO_IO_SIZE,
-			       false, virtio_mmio_legacy_callback, vdev);
+	if (!legacy)
+		vdev->endian = VIRTIO_ENDIAN_LE;
+
+	r = kvm__register_mmio(kvm, vmmio->addr, VIRTIO_MMIO_IO_SIZE, false,
+			       legacy ? virtio_mmio_legacy_callback :
+					virtio_mmio_modern_callback,
+			       vdev);
 	if (r < 0)
 		return r;
 
 	vmmio->hdr = (struct virtio_mmio_hdr) {
 		.magic		= {'v', 'i', 'r', 't'},
-		.version	= 1,
+		.version	= legacy ? 1 : 2,
 		.device_id	= subsys_id,
 		.vendor_id	= 0x4d564b4c , /* 'LKVM' */
 		.queue_num_max	= 256,
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH kvmtool 24/24] virtio/pci: Initialize all vectors to VIRTIO_MSI_NO_VECTOR
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (22 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 23/24] virtio: Add support for modern virtio-mmio Jean-Philippe Brucker
@ 2022-06-07 17:02 ` Jean-Philippe Brucker
  2022-06-09 12:39 ` [PATCH kvmtool 00/24] Virtio v1 support Will Deacon
  2022-06-09 12:50 ` Will Deacon
  25 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-07 17:02 UTC (permalink / raw)
  To: will
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

According to the virtio spec, all vectors must be initialized to
VIRTIO_MSI_NO_VECTOR (0xffff). In 4.1.5.1.2.1 "Device Requirements:
MSI-X Vector Configuration":

    The device MUST return vector mapped to a given event, (NO_VECTOR if
    unmapped) on read of config_msix_vector/queue_msix_vector.

Currently we return 0, which is a valid MSI vector. Return NO_VECTOR
instead.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 include/kvm/virtio-pci.h | 2 +-
 virtio/pci.c             | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
index 47075334..4590d1b3 100644
--- a/include/kvm/virtio-pci.h
+++ b/include/kvm/virtio-pci.h
@@ -53,7 +53,7 @@ struct virtio_pci {
 	/* MSI-X */
 	u16			config_vector;
 	u32			config_gsi;
-	u32			vq_vector[VIRTIO_PCI_MAX_VQ];
+	u16			vq_vector[VIRTIO_PCI_MAX_VQ];
 	u32			gsis[VIRTIO_PCI_MAX_VQ];
 	u64			msix_pba;
 	struct msix_table	msix_table[VIRTIO_PCI_MAX_VQ + VIRTIO_PCI_MAX_CONFIG];
diff --git a/virtio/pci.c b/virtio/pci.c
index 8bea8c73..03432d76 100644
--- a/virtio/pci.c
+++ b/virtio/pci.c
@@ -403,7 +403,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev,
 	/* Both table and PBA are mapped to the same BAR (2) */
 	vpci->pci_hdr.msix.table_offset = cpu_to_le32(2);
 	vpci->pci_hdr.msix.pba_offset = cpu_to_le32(2 | VIRTIO_MSIX_TABLE_SIZE);
-	vpci->config_vector = 0;
+	vpci->config_vector = VIRTIO_MSI_NO_VECTOR;
+	/* Initialize all vq vectors to NO_VECTOR */
+	memset(vpci->vq_vector, 0xff, sizeof(vpci->vq_vector));
 
 	if (irq__can_signal_msi(kvm))
 		vpci->features |= VIRTIO_PCI_F_SIGNAL_MSI;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH kvmtool 00/24] Virtio v1 support
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (23 preceding siblings ...)
  2022-06-07 17:02 ` [PATCH kvmtool 24/24] virtio/pci: Initialize all vectors to VIRTIO_MSI_NO_VECTOR Jean-Philippe Brucker
@ 2022-06-09 12:39 ` Will Deacon
  2022-06-10 16:31   ` Jean-Philippe Brucker
  2022-06-09 12:50 ` Will Deacon
  25 siblings, 1 reply; 29+ messages in thread
From: Will Deacon @ 2022-06-09 12:39 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

On Tue, Jun 07, 2022 at 06:02:15PM +0100, Jean-Philippe Brucker wrote:
> Add support for version 1 of the virtio transport to kvmtool. Based on a
> RFC by Sasha Levin [1], I've been trying to complete it here and there.
> It's long overdue and is quite painful to rebase, so let's get it
> merged.
> 
> Several reasons why the legacy transport needs to be replaced:
> 
> * Only 32 feature bits are supported. Most importantly
>   VIRTIO_F_ACCESS_PLATFORM, which forces a Linux guest to use the DMA
>   API, cannot be enabled. So we can't support private guests that
>   decrypt or share only their DMA memory with the host.

Woohoo!

> * Legacy virtqueue address is a 32-bit pfn, aligned on 4kB. Since Linux
>   guests bypass the DMA API they can't support large GPAs.
> 
> * New devices types (iommu, crypto, memory, etc) and new features cannot
>   be supported.
> 
> * New guests won't implement the legacy transport. Existing guests will
>   eventually drop legacy support.
> 
> Support for modern transport becomes the default and legacy is enabled
> with --virtio-legacy.
> 
> I only tested what I could: vsock, scsi and vhost-net are currently
> broken and can be fixed later (they have issues with mem regions and
> feature mask, among other things). I also haven't tested big-endian.

If these are broken, then shall we default to legacy mode and have the
modern transport be opt-in? Otherwise we're regressing people in a
confusing way.

Will

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH kvmtool 17/24] virtio/pci: Delete MSI routes
  2022-06-07 17:02 ` [PATCH kvmtool 17/24] virtio/pci: Delete MSI routes Jean-Philippe Brucker
@ 2022-06-09 12:40   ` Will Deacon
  0 siblings, 0 replies; 29+ messages in thread
From: Will Deacon @ 2022-06-09 12:40 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: andre.przywara, alexandru.elisei, kvm, suzuki.poulose,
	sasha.levin, jean-philippe

On Tue, Jun 07, 2022 at 06:02:32PM +0100, Jean-Philippe Brucker wrote:
> On exit_vq() and device reset, remove the MSI routes that were set up at
> runtime.
> 
> TODO: make irq__add_msix_route reuse those deleted routes. Currently, new
> ones will be created after reset.

Please don't leave the TODO in here

Will

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH kvmtool 00/24] Virtio v1 support
  2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
                   ` (24 preceding siblings ...)
  2022-06-09 12:39 ` [PATCH kvmtool 00/24] Virtio v1 support Will Deacon
@ 2022-06-09 12:50 ` Will Deacon
  25 siblings, 0 replies; 29+ messages in thread
From: Will Deacon @ 2022-06-09 12:50 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: catalin.marinas, kernel-team, Will Deacon, sasha.levin,
	suzuki.poulose, kvm, jean-philippe, alexandru.elisei,
	andre.przywara

On Tue, 7 Jun 2022 18:02:15 +0100, Jean-Philippe Brucker wrote:
> Add support for version 1 of the virtio transport to kvmtool. Based on a
> RFC by Sasha Levin [1], I've been trying to complete it here and there.
> It's long overdue and is quite painful to rebase, so let's get it
> merged.
> 
> Several reasons why the legacy transport needs to be replaced:
> 
> [...]

Applied patches 1-16 to kvmtool (master), thanks!

[01/24] virtio: Add NEEDS_RESET to the status mask
        https://git.kernel.org/will/kvmtool/c/7efc2622d5ee
[02/24] virtio: Remove redundant test
        https://git.kernel.org/will/kvmtool/c/3a1e36e4bf49
[03/24] virtio/vsock: Remove redundant state tracking
        https://git.kernel.org/will/kvmtool/c/a8e397bb9dd9
[04/24] virtio: Factor virtqueue initialization
        https://git.kernel.org/will/kvmtool/c/fd41cde06617
[05/24] virtio: Support modern virtqueue addresses
        https://git.kernel.org/will/kvmtool/c/609ee9066879
[06/24] virtio: Add config access helpers
        https://git.kernel.org/will/kvmtool/c/15e6c4e74d06
[07/24] virtio: Fix device-specific config endianness
        https://git.kernel.org/will/kvmtool/c/867b15ccd7da
[08/24] virtio/console: Remove unused callback
        https://git.kernel.org/will/kvmtool/c/17ad9fd6ce37
[09/24] virtio: Remove set_guest_features() device op
        https://git.kernel.org/will/kvmtool/c/902a8ecb3877
[10/24] Add memcpy_fromiovec_safe
        https://git.kernel.org/will/kvmtool/c/c492534f3ac9
[11/24] virtio/net: Offload vnet header endianness conversion to tap
        https://git.kernel.org/will/kvmtool/c/8b27bcff44fd
[12/24] virtio/net: Prepare for modern virtio
        https://git.kernel.org/will/kvmtool/c/b231683c3361
[13/24] virtio/net: Implement VIRTIO_F_ANY_LAYOUT feature
        https://git.kernel.org/will/kvmtool/c/6daffe57762c
[14/24] virtio/console: Add VIRTIO_F_ANY_LAYOUT feature
        https://git.kernel.org/will/kvmtool/c/e74b56e1495c
[15/24] virtio/blk: Implement VIRTIO_F_ANY_LAYOUT feature
        https://git.kernel.org/will/kvmtool/c/484278913807
[16/24] virtio/pci: Factor MSI route creation
        https://git.kernel.org/will/kvmtool/c/f44af23e3a62

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH kvmtool 00/24] Virtio v1 support
  2022-06-09 12:39 ` [PATCH kvmtool 00/24] Virtio v1 support Will Deacon
@ 2022-06-10 16:31   ` Jean-Philippe Brucker
  0 siblings, 0 replies; 29+ messages in thread
From: Jean-Philippe Brucker @ 2022-06-10 16:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: Andre Przywara, Alexandru Elisei, kvm, Suzuki Poulose, jean-philippe

On Thu, Jun 09, 2022 at 01:39:48PM +0100, Will Deacon wrote:
> On Tue, Jun 07, 2022 at 06:02:15PM +0100, Jean-Philippe Brucker wrote:
> > Add support for version 1 of the virtio transport to kvmtool. Based on a
> > RFC by Sasha Levin [1], I've been trying to complete it here and there.
> > It's long overdue and is quite painful to rebase, so let's get it
> > merged.
> >
> > Several reasons why the legacy transport needs to be replaced:
> >
> > * Only 32 feature bits are supported. Most importantly
> >   VIRTIO_F_ACCESS_PLATFORM, which forces a Linux guest to use the DMA
> >   API, cannot be enabled. So we can't support private guests that
> >   decrypt or share only their DMA memory with the host.
> 
> Woohoo!
> 
> > * Legacy virtqueue address is a 32-bit pfn, aligned on 4kB. Since Linux
> >   guests bypass the DMA API they can't support large GPAs.
> >
> > * New devices types (iommu, crypto, memory, etc) and new features cannot
> >   be supported.
> >
> > * New guests won't implement the legacy transport. Existing guests will
> >   eventually drop legacy support.
> >
> > Support for modern transport becomes the default and legacy is enabled
> > with --virtio-legacy.
> >
> > I only tested what I could: vsock, scsi and vhost-net are currently
> > broken and can be fixed later (they have issues with mem regions and
> > feature mask, among other things). I also haven't tested big-endian.
> 
> If these are broken, then shall we default to legacy mode and have the
> modern transport be opt-in? Otherwise we're regressing people in a
> confusing way.

What I meant was that even without these patches, I wasn't able to use any
of the vhost devices, they already had several bugs. But now that I spent
a little more time trying to fix them, it looks like the modern transport
does add one regression (vhost ioeventfd is set on I/O port but modern
transport uses MMIO).

I'll sort this out and resend. Thanks for picking up the base patches!

Thanks,
Jean

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-06-10 16:32 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-07 17:02 [PATCH kvmtool 00/24] Virtio v1 support Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 01/24] virtio: Add NEEDS_RESET to the status mask Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 02/24] virtio: Remove redundant test Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 03/24] virtio/vsock: Remove redundant state tracking Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 04/24] virtio: Factor virtqueue initialization Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 05/24] virtio: Support modern virtqueue addresses Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 06/24] virtio: Add config access helpers Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 07/24] virtio: Fix device-specific config endianness Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 08/24] virtio/console: Remove unused callback Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 09/24] virtio: Remove set_guest_features() device op Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 10/24] Add memcpy_fromiovec_safe Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 11/24] virtio/net: Offload vnet header endianness conversion to tap Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 12/24] virtio/net: Prepare for modern virtio Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 13/24] virtio/net: Implement VIRTIO_F_ANY_LAYOUT feature Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 14/24] virtio/console: Add " Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 15/24] virtio/blk: Implement " Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 16/24] virtio/pci: Factor MSI route creation Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 17/24] virtio/pci: Delete MSI routes Jean-Philippe Brucker
2022-06-09 12:40   ` Will Deacon
2022-06-07 17:02 ` [PATCH kvmtool 18/24] virtio: Extract init_vq() for PCI and MMIO Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 19/24] virtio/pci: Make doorbell offset dynamic Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 20/24] virtio: Move PCI transport to pci-legacy Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 21/24] virtio: Add support for modern virtio-pci Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 22/24] virtio: Move MMIO transport to mmio-legacy Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 23/24] virtio: Add support for modern virtio-mmio Jean-Philippe Brucker
2022-06-07 17:02 ` [PATCH kvmtool 24/24] virtio/pci: Initialize all vectors to VIRTIO_MSI_NO_VECTOR Jean-Philippe Brucker
2022-06-09 12:39 ` [PATCH kvmtool 00/24] Virtio v1 support Will Deacon
2022-06-10 16:31   ` Jean-Philippe Brucker
2022-06-09 12:50 ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.