All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/2] support virtio mmio specification Version 3
@ 2019-12-25  2:50 Zha Bin
  2019-12-25  2:50 ` [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi Zha Bin
                   ` (2 more replies)
  0 siblings, 3 replies; 60+ messages in thread
From: Zha Bin @ 2019-12-25  2:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: mst, jasowang, slp, virtio-dev, gerry, zhabin, jing2.liu, chao.p.peng

In cloud native environment, we need a lightweight and secure system. It
should benefit from the speed of containers and the security of VM, which
is classified as secure containers. The traditional solution of cloud VM
is Qemu. In fact we don't need to pay for the legacy devices. Currently,
more userspace VMMs, e.g. Qemu, Firecracker, Cloud Hypervisor and Alibaba
Cloud VMM which is called Dragonball, began to pay attention to a
lightweight solution.

The lightweight VMM is suitable to cloud native infrastructure which is
designed for creating secure sandbox to address the requirements of
multi-tenant. Meanwhile, with faster startup time and lower memory
overhead, it makes possible to launch thousands of microVMs on the same
machine. This VMM minimizes the emulation devices and uses virtio-mmio to
get a more lightweight transport layer. The virtio-mmio devices have less
code than virtio-pci, which can decrease boot time and increase deploy
density by customizing kernel such as setting pci=off. From another point
of view, the minimal device can reduce the attack surface.

We have compared the number of files and the lines of code between
virtio-mmio and virio-pci.

				Virtio-PCI	    Virtio-MMIO	
	number of files(Linux)	    161			1
	lines of code(Linux)	    78237		538
	number of files(Qemu)	    24			1
	lines of code(Qemu)	    8952		421

But the current standard virtio-mmio spec has some limitations which is
only support legacy interrupt and will cause performance penalties.

To address such limitation, we proposed to update virtio-mmio spec to
version 3 to add the MSI interrupt and doorbell notify features[1], which
can achieve the same performance as virtio-pci devices with only around 400
lines of code.

Here are the performance gain of MSI interrupt in virtio-mmio. Each case is
repeated three times.

        netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024

                Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
        trans/s     9536        6939            9500
        trans/s     9734        7029            9749
        trans/s     9894        7095            9318

With the virtio spec proposal[1], other VMMs (e.g. Qemu) can also make use
of the new features to get a enhancing performance.

[1] https://lkml.org/lkml/2019/12/20/113

Liu Jiang (2):
  x86/msi: Enhance x86 to support platform_msi
  virtio-mmio: add features for virtio-mmio specification version 3

 arch/x86/Kconfig                 |   6 ++
 arch/x86/include/asm/hw_irq.h    |   5 +
 arch/x86/include/asm/irqdomain.h |   9 ++
 arch/x86/kernel/apic/msi.c       |  74 +++++++++++++
 drivers/base/platform-msi.c      |   4 +-
 drivers/virtio/virtio_mmio.c     | 226 ++++++++++++++++++++++++++++++++++++---
 include/linux/msi.h              |   1 +
 include/uapi/linux/virtio_mmio.h |  28 +++++
 8 files changed, 337 insertions(+), 16 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
  2019-12-25  2:50 [PATCH v1 0/2] support virtio mmio specification Version 3 Zha Bin
@ 2019-12-25  2:50 ` Zha Bin
  2019-12-26  8:21     ` [virtio-dev] " Jason Wang
  2020-01-17 13:58   ` Thomas Gleixner
  2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
  2020-01-02  6:55   ` [virtio-dev] " Jason Wang
  2 siblings, 2 replies; 60+ messages in thread
From: Zha Bin @ 2019-12-25  2:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: mst, jasowang, slp, virtio-dev, gerry, zhabin, jing2.liu, chao.p.peng

From: Liu Jiang <gerry@linux.alibaba.com>

The x86 platform currently only supports specific MSI/MSI-x for PCI
devices. To enable MSI to the platform devices such as virtio-mmio
device, this patch enhances x86 with platform MSI by leveraging the
already built-in platform-msi driver (drivers/base/platform-msi.c)
and makes it as a configurable option.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
---
 arch/x86/Kconfig                 |  6 ++++
 arch/x86/include/asm/hw_irq.h    |  5 +++
 arch/x86/include/asm/irqdomain.h |  9 +++++
 arch/x86/kernel/apic/msi.c       | 74 ++++++++++++++++++++++++++++++++++++++++
 drivers/base/platform-msi.c      |  4 +--
 include/linux/msi.h              |  1 +
 6 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5e89499..c1178f2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1074,6 +1074,12 @@ config X86_IO_APIC
 	def_bool y
 	depends on X86_LOCAL_APIC || X86_UP_IOAPIC
 
+config X86_PLATFORM_MSI
+	def_bool y
+	depends on X86_LOCAL_APIC && X86_64
+	select GENERIC_MSI_IRQ
+	select GENERIC_MSI_IRQ_DOMAIN
+
 config X86_REROUTE_FOR_BROKEN_BOOT_IRQS
 	bool "Reroute for broken boot IRQs"
 	depends on X86_IO_APIC
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 4154bc5..5c96b49 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -113,6 +113,11 @@ struct irq_alloc_info {
 			struct msi_desc *desc;
 		};
 #endif
+#ifdef CONFIG_X86_PLATFORM_MSI
+		struct {
+			irq_hw_number_t	platform_msi_hwirq;
+		};
+#endif
 	};
 };
 
diff --git a/arch/x86/include/asm/irqdomain.h b/arch/x86/include/asm/irqdomain.h
index c066ffa..b53f51f 100644
--- a/arch/x86/include/asm/irqdomain.h
+++ b/arch/x86/include/asm/irqdomain.h
@@ -56,4 +56,13 @@ extern void mp_irqdomain_deactivate(struct irq_domain *domain,
 static inline void arch_init_msi_domain(struct irq_domain *domain) { }
 #endif
 
+#ifdef CONFIG_X86_PLATFORM_MSI
+extern struct irq_domain *platform_msi_get_def_irq_domain(void);
+#else
+static inline struct irq_domain *platform_msi_get_def_irq_domain(void)
+{
+	return NULL;
+}
+#endif
+
 #endif
diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 7f75334..ef558c7 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -22,6 +22,10 @@
 #include <asm/irq_remapping.h>
 
 static struct irq_domain *msi_default_domain;
+#ifdef CONFIG_X86_PLATFORM_MSI
+static struct irq_domain *platform_msi_default_domain;
+static struct msi_domain_info platform_msi_domain_info;
+#endif
 
 static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
 {
@@ -146,6 +150,17 @@ void __init arch_init_msi_domain(struct irq_domain *parent)
 	}
 	if (!msi_default_domain)
 		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
+#ifdef CONFIG_X86_PLATFORM_MSI
+	fn = irq_domain_alloc_named_fwnode("PLATFORM-MSI");
+	if (fn) {
+		platform_msi_default_domain =
+			platform_msi_create_irq_domain(fn,
+				&platform_msi_domain_info, parent);
+		irq_domain_free_fwnode(fn);
+	}
+	if (!platform_msi_default_domain)
+		pr_warn("failed to initialize irqdomain for PLATFORM-MSI.\n");
+#endif
 }
 
 #ifdef CONFIG_IRQ_REMAP
@@ -384,3 +399,62 @@ int hpet_assign_irq(struct irq_domain *domain, struct hpet_channel *hc,
 	return irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, &info);
 }
 #endif
+
+#ifdef CONFIG_X86_PLATFORM_MSI
+static void platform_msi_mask_irq(struct irq_data *data)
+{
+}
+
+static void platform_msi_unmask_irq(struct irq_data *data)
+{
+}
+
+static struct irq_chip platform_msi_controller = {
+	.name			= "PLATFORM-MSI",
+	.irq_mask		= platform_msi_mask_irq,
+	.irq_unmask		= platform_msi_unmask_irq,
+	.irq_ack		= irq_chip_ack_parent,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_compose_msi_msg	= irq_msi_compose_msg,
+	.flags			= IRQCHIP_SKIP_SET_WAKE,
+};
+
+static int platform_msi_prepare(struct irq_domain *domain, struct device *dev,
+				int nvec, msi_alloc_info_t *arg)
+{
+	memset(arg, 0, sizeof(*arg));
+	return 0;
+}
+
+static void platform_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
+{
+	arg->platform_msi_hwirq = platform_msi_calc_hwirq(desc);
+}
+
+static irq_hw_number_t platform_msi_get_hwirq(struct msi_domain_info *info,
+					      msi_alloc_info_t *arg)
+{
+	return arg->platform_msi_hwirq;
+}
+
+static struct msi_domain_ops platform_msi_domain_ops = {
+	.msi_prepare	= platform_msi_prepare,
+	.set_desc	= platform_msi_set_desc,
+	.get_hwirq	= platform_msi_get_hwirq,
+};
+
+static struct msi_domain_info platform_msi_domain_info = {
+	.flags          = MSI_FLAG_USE_DEF_DOM_OPS |
+			  MSI_FLAG_USE_DEF_CHIP_OPS |
+			  MSI_FLAG_ACTIVATE_EARLY,
+	.ops            = &platform_msi_domain_ops,
+	.chip           = &platform_msi_controller,
+	.handler        = handle_edge_irq,
+	.handler_name   = "edge",
+};
+
+struct irq_domain *platform_msi_get_def_irq_domain(void)
+{
+	return platform_msi_default_domain;
+}
+#endif
diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c
index 8da314b..45752f1 100644
--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -31,12 +31,11 @@ struct platform_msi_priv_data {
 /* The devid allocator */
 static DEFINE_IDA(platform_msi_devid_ida);
 
-#ifdef GENERIC_MSI_DOMAIN_OPS
 /*
  * Convert an msi_desc to a globaly unique identifier (per-device
  * devid + msi_desc position in the msi_list).
  */
-static irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
+irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
 {
 	u32 devid;
 
@@ -45,6 +44,7 @@ static irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
 	return (devid << (32 - DEV_ID_SHIFT)) | desc->platform.msi_index;
 }
 
+#ifdef GENERIC_MSI_DOMAIN_OPS
 static void platform_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
 {
 	arg->desc = desc;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 8ad679e..ee5f566 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -362,6 +362,7 @@ int platform_msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
 void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
 			      unsigned int nvec);
 void *platform_msi_get_host_data(struct irq_domain *domain);
+irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc);
 #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
 
 #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25  2:50 [PATCH v1 0/2] support virtio mmio specification Version 3 Zha Bin
  2019-12-25  2:50 ` [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi Zha Bin
@ 2019-12-25  2:50 ` Zha Bin
  2019-12-25 10:20     ` [virtio-dev] " Jason Wang
                     ` (4 more replies)
  2020-01-02  6:55   ` [virtio-dev] " Jason Wang
  2 siblings, 5 replies; 60+ messages in thread
From: Zha Bin @ 2019-12-25  2:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: mst, jasowang, slp, virtio-dev, gerry, zhabin, jing2.liu, chao.p.peng

From: Liu Jiang <gerry@linux.alibaba.com>

Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
virtio over mmio devices as a lightweight machine model for modern
cloud. The standard virtio over MMIO transport layer only supports one
legacy interrupt, which is much heavier than virtio over PCI transport
layer using MSI. Legacy interrupt has long work path and causes specific
VMExits in following cases, which would considerably slow down the
performance:

1) read interrupt status register
2) update interrupt status register
3) write IOAPIC EOI register

We proposed to update virtio over MMIO to version 3[1] to add the
following new features and enhance the performance.

1) Support Message Signaled Interrupt(MSI), which increases the
   interrupt performance for virtio multi-queue devices
2) Support per-queue doorbell, so the guest kernel may directly write
   to the doorbells provided by virtio devices.

The following is the network tcp_rr performance testing report, tested
with virtio-pci device, vanilla virtio-mmio device and patched
virtio-mmio device (run test 3 times for each case):

	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024

		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
	trans/s	    9536	6939		9500
	trans/s	    9734	7029		9749
	trans/s	    9894	7095		9318

[1] https://lkml.org/lkml/2019/12/20/113

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
---
 drivers/virtio/virtio_mmio.c     | 226 ++++++++++++++++++++++++++++++++++++---
 include/uapi/linux/virtio_mmio.h |  28 +++++
 2 files changed, 240 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index e09edb5..065c083 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -68,8 +68,8 @@
 #include <linux/virtio_config.h>
 #include <uapi/linux/virtio_mmio.h>
 #include <linux/virtio_ring.h>
-
-
+#include <linux/msi.h>
+#include <asm/irqdomain.h>
 
 /* The alignment to use between consumer and producer parts of vring.
  * Currently hardcoded to the page size. */
@@ -90,6 +90,12 @@ struct virtio_mmio_device {
 	/* a list of queues so we can dispatch IRQs */
 	spinlock_t lock;
 	struct list_head virtqueues;
+
+	int doorbell_base;
+	int doorbell_scale;
+	bool msi_enabled;
+	/* Name strings for interrupts. */
+	char (*vm_vq_names)[256];
 };
 
 struct virtio_mmio_vq_info {
@@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
 };
 
 
+static void vm_free_msi_irqs(struct virtio_device *vdev);
+static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs);
 
 /* Configuration interface */
 
@@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
 {
 	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
 
+	if (vm_dev->version == 3) {
+		int offset = vm_dev->doorbell_base +
+			     vm_dev->doorbell_scale * vq->index;
+		writel(vq->index, vm_dev->base + offset);
+	} else
 	/* We write the queue's selector into the notification register to
 	 * signal the other end */
-	writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
+		writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
+
 	return true;
 }
 
+/* Handle a configuration change */
+static irqreturn_t vm_config_changed(int irq, void *opaque)
+{
+	struct virtio_mmio_device *vm_dev = opaque;
+
+	virtio_config_changed(&vm_dev->vdev);
+
+	return IRQ_HANDLED;
+}
+
 /* Notify all virtqueues on an interrupt. */
 static irqreturn_t vm_interrupt(int irq, void *opaque)
 {
@@ -307,7 +331,12 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
 	return ret;
 }
 
+static int vm_msi_irq_vector(struct device *dev, unsigned int nr)
+{
+	struct msi_desc *entry = first_msi_entry(dev);
 
+	return entry->irq + nr;
+}
 
 static void vm_del_vq(struct virtqueue *vq)
 {
@@ -316,6 +345,12 @@ static void vm_del_vq(struct virtqueue *vq)
 	unsigned long flags;
 	unsigned int index = vq->index;
 
+	if (vm_dev->version == 3 && vm_dev->msi_enabled) {
+		int irq = vm_msi_irq_vector(&vq->vdev->dev, index + 1);
+
+		free_irq(irq, vq);
+	}
+
 	spin_lock_irqsave(&vm_dev->lock, flags);
 	list_del(&info->node);
 	spin_unlock_irqrestore(&vm_dev->lock, flags);
@@ -334,15 +369,41 @@ static void vm_del_vq(struct virtqueue *vq)
 	kfree(info);
 }
 
-static void vm_del_vqs(struct virtio_device *vdev)
+static void vm_free_irqs(struct virtio_device *vdev)
 {
 	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+
+	if (vm_dev->version == 3)
+		vm_free_msi_irqs(vdev);
+	else
+		free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
+}
+
+static void vm_del_vqs(struct virtio_device *vdev)
+{
 	struct virtqueue *vq, *n;
 
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
 		vm_del_vq(vq);
 
-	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
+	vm_free_irqs(vdev);
+}
+
+static int vm_request_single_irq(struct virtio_device *vdev)
+{
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+	int irq = platform_get_irq(vm_dev->pdev, 0);
+	int err;
+
+	if (irq < 0) {
+		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
+		return irq;
+	}
+
+	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
+			dev_name(&vdev->dev), vm_dev);
+
+	return err;
 }
 
 static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
@@ -455,6 +516,72 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
 	return ERR_PTR(err);
 }
 
+static inline void mmio_msi_set_enable(struct virtio_mmio_device *vm_dev,
+		int enable)
+{
+	u16 status;
+
+	status = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
+	if (enable)
+		status |= VIRTIO_MMIO_MSI_ENABLE_MASK;
+	else
+		status &= ~VIRTIO_MMIO_MSI_ENABLE_MASK;
+	writew(status, vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
+}
+
+static int vm_find_vqs_msi(struct virtio_device *vdev, unsigned int nvqs,
+			struct virtqueue *vqs[], vq_callback_t *callbacks[],
+			const char * const names[], const bool *ctx,
+			struct irq_affinity *desc)
+{
+	int i, err, irq;
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+
+	/* Allocate nvqs irqs for queues and one irq for configuration */
+	err = vm_request_msi_vectors(vdev, nvqs + 1);
+	if (err != 0)
+		return err;
+
+	for (i = 0; i < nvqs; i++) {
+		int index = i + 1;
+
+		if (!names[i]) {
+			vqs[i] = NULL;
+			continue;
+		}
+		vqs[i] = vm_setup_vq(vdev, i, callbacks[i], names[i],
+				ctx ? ctx[i] : false);
+		if (IS_ERR(vqs[i])) {
+			err = PTR_ERR(vqs[i]);
+			goto error_find;
+		}
+
+		if (!callbacks[i])
+			continue;
+
+		irq = vm_msi_irq_vector(&vqs[i]->vdev->dev, index);
+		/* allocate per-vq irq if available and necessary */
+		snprintf(vm_dev->vm_vq_names[index],
+			sizeof(*vm_dev->vm_vq_names),
+			"%s-%s",
+			dev_name(&vm_dev->vdev.dev), names[i]);
+		err = request_irq(irq, vring_interrupt, 0,
+				vm_dev->vm_vq_names[index], vqs[i]);
+
+		if (err)
+			goto error_find;
+	}
+
+	mmio_msi_set_enable(vm_dev, 1);
+	vm_dev->msi_enabled = true;
+
+	return 0;
+
+error_find:
+	vm_del_vqs(vdev);
+	return err;
+}
+
 static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 		       struct virtqueue *vqs[],
 		       vq_callback_t *callbacks[],
@@ -463,16 +590,16 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 		       struct irq_affinity *desc)
 {
 	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
-	int irq = platform_get_irq(vm_dev->pdev, 0);
 	int i, err, queue_idx = 0;
 
-	if (irq < 0) {
-		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
-		return irq;
+	if (vm_dev->version == 3) {
+		err = vm_find_vqs_msi(vdev, nvqs, vqs, callbacks,
+				names, ctx, desc);
+		if (!err)
+			return 0;
 	}
 
-	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
-			dev_name(&vdev->dev), vm_dev);
+	err = vm_request_single_irq(vdev);
 	if (err)
 		return err;
 
@@ -493,6 +620,73 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	return 0;
 }
 
+static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
+{
+	struct device *dev = desc->dev;
+	struct virtio_device *vdev = dev_to_virtio(dev);
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+	void __iomem *pos = vm_dev->base;
+	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
+			desc->platform.msi_index);
+
+	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
+	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
+	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
+	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
+}
+
+static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs)
+{
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+	int irq, err;
+	u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
+
+	if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) == 0)
+		return -EINVAL;
+
+	vm_dev->vm_vq_names = kmalloc_array(nirqs, sizeof(*vm_dev->vm_vq_names),
+			GFP_KERNEL);
+	if (!vm_dev->vm_vq_names)
+		return -ENOMEM;
+
+	if (!vdev->dev.msi_domain)
+		vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
+	err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
+			mmio_write_msi_msg);
+	if (err)
+		goto error_free_mem;
+
+	/* The first MSI vector is used for configuration change events. */
+	snprintf(vm_dev->vm_vq_names[0], sizeof(*vm_dev->vm_vq_names),
+			"%s-config", dev_name(&vdev->dev));
+	irq = vm_msi_irq_vector(&vdev->dev, 0);
+	err = request_irq(irq, vm_config_changed, 0, vm_dev->vm_vq_names[0],
+			vm_dev);
+	if (err)
+		goto error_free_mem;
+
+	return 0;
+
+error_free_mem:
+	kfree(vm_dev->vm_vq_names);
+	vm_dev->vm_vq_names = NULL;
+	return err;
+}
+
+static void vm_free_msi_irqs(struct virtio_device *vdev)
+{
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+
+	if (vm_dev->msi_enabled) {
+		mmio_msi_set_enable(vm_dev, 0);
+		free_irq(vm_msi_irq_vector(&vdev->dev, 0), vm_dev);
+		platform_msi_domain_free_irqs(&vdev->dev);
+		kfree(vm_dev->vm_vq_names);
+		vm_dev->vm_vq_names = NULL;
+		vm_dev->msi_enabled = false;
+	}
+}
+
 static const char *vm_bus_name(struct virtio_device *vdev)
 {
 	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
@@ -567,7 +761,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
 
 	/* Check device version */
 	vm_dev->version = readl(vm_dev->base + VIRTIO_MMIO_VERSION);
-	if (vm_dev->version < 1 || vm_dev->version > 2) {
+	if (vm_dev->version < 1 || vm_dev->version > 3) {
 		dev_err(&pdev->dev, "Version %ld not supported!\n",
 				vm_dev->version);
 		return -ENXIO;
@@ -603,6 +797,12 @@ static int virtio_mmio_probe(struct platform_device *pdev)
 		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");
 
 	platform_set_drvdata(pdev, vm_dev);
+	if (vm_dev->version == 3) {
+		u32 db = readl(vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
+
+		vm_dev->doorbell_base = db & 0xffff;
+		vm_dev->doorbell_scale = (db >> 16) & 0xffff;
+	}
 
 	rc = register_virtio_device(&vm_dev->vdev);
 	if (rc)
@@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct platform_device *pdev)
 	return 0;
 }
 
-
-
 /* Devices list parameter */
 
 #if defined(CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES)
diff --git a/include/uapi/linux/virtio_mmio.h b/include/uapi/linux/virtio_mmio.h
index c4b0968..148895e 100644
--- a/include/uapi/linux/virtio_mmio.h
+++ b/include/uapi/linux/virtio_mmio.h
@@ -122,6 +122,34 @@
 #define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
 #define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
 
+/* MSI related registers */
+
+/* MSI status register */
+#define VIRTIO_MMIO_MSI_STATUS		0x0c0
+/* MSI command register */
+#define VIRTIO_MMIO_MSI_COMMAND		0x0c2
+/* MSI low 32 bit address, 64 bits in two halves */
+#define VIRTIO_MMIO_MSI_ADDRESS_LOW	0x0c4
+/* MSI high 32 bit address, 64 bits in two halves */
+#define VIRTIO_MMIO_MSI_ADDRESS_HIGH	0x0c8
+/* MSI data */
+#define VIRTIO_MMIO_MSI_DATA		0x0cc
+
+/* RO: MSI feature enabled mask */
+#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000
+/* RO: Maximum queue size available */
+#define VIRTIO_MMIO_MSI_STATUS_QMASK	0x07ff
+/* Reserved */
+#define VIRTIO_MMIO_MSI_STATUS_RESERVED	0x7800
+
+#define VIRTIO_MMIO_MSI_CMD_UPDATE	0x1
+#define VIRTIO_MMIO_MSI_CMD_OP_MASK	0xf000
+#define VIRTIO_MMIO_MSI_CMD_ARG_MASK	0x0fff
+#define VIRTIO_MMIO_MSI_CMD(op, arg)	((op) << 12 | (arg))
+#define VIRITO_MMIO_MSI_CMD_ARG(cmd)	((cmd) & VIRTIO_MMIO_MSI_CMD_ARG_MASK)
+#define VIRTIO_MMIO_MSI_CMD_OP(cmd)	\
+		(((cmd) & VIRTIO_MMIO_MSI_CMD_OP_MASK) >> 12)
+
 /* Configuration atomicity value */
 #define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
@ 2019-12-25 10:20     ` Jason Wang
  2019-12-25 22:28     ` kbuild test robot
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-25 10:20 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
>
> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> virtio over mmio devices as a lightweight machine model for modern
> cloud. The standard virtio over MMIO transport layer only supports one
> legacy interrupt, which is much heavier than virtio over PCI transport
> layer using MSI. Legacy interrupt has long work path and causes specific
> VMExits in following cases, which would considerably slow down the
> performance:
>
> 1) read interrupt status register
> 2) update interrupt status register
> 3) write IOAPIC EOI register
>
> We proposed to update virtio over MMIO to version 3[1] to add the
> following new features and enhance the performance.
>
> 1) Support Message Signaled Interrupt(MSI), which increases the
>     interrupt performance for virtio multi-queue devices
> 2) Support per-queue doorbell, so the guest kernel may directly write
>     to the doorbells provided by virtio devices.
>
> The following is the network tcp_rr performance testing report, tested
> with virtio-pci device, vanilla virtio-mmio device and patched
> virtio-mmio device (run test 3 times for each case):
>
> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>
> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> 	trans/s	    9536	6939		9500
> 	trans/s	    9734	7029		9749
> 	trans/s	    9894	7095		9318
>
> [1] https://lkml.org/lkml/2019/12/20/113


Thanks for the patch. Two questions after a quick glance:

1) In PCI we choose to support MSI-X instead of MSI for having extra 
flexibility like alias, independent data and address (e.g for affinity) 
. Any reason for not start from MSI-X? E.g having MSI-X table and PBA 
(both of which looks pretty independent).
2) It's better to split notify_multiplexer out of MSI support to ease 
the reviewers (apply to spec patch as well)

Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2019-12-25 10:20     ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-25 10:20 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
>
> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> virtio over mmio devices as a lightweight machine model for modern
> cloud. The standard virtio over MMIO transport layer only supports one
> legacy interrupt, which is much heavier than virtio over PCI transport
> layer using MSI. Legacy interrupt has long work path and causes specific
> VMExits in following cases, which would considerably slow down the
> performance:
>
> 1) read interrupt status register
> 2) update interrupt status register
> 3) write IOAPIC EOI register
>
> We proposed to update virtio over MMIO to version 3[1] to add the
> following new features and enhance the performance.
>
> 1) Support Message Signaled Interrupt(MSI), which increases the
>     interrupt performance for virtio multi-queue devices
> 2) Support per-queue doorbell, so the guest kernel may directly write
>     to the doorbells provided by virtio devices.
>
> The following is the network tcp_rr performance testing report, tested
> with virtio-pci device, vanilla virtio-mmio device and patched
> virtio-mmio device (run test 3 times for each case):
>
> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>
> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> 	trans/s	    9536	6939		9500
> 	trans/s	    9734	7029		9749
> 	trans/s	    9894	7095		9318
>
> [1] https://lkml.org/lkml/2019/12/20/113


Thanks for the patch. Two questions after a quick glance:

1) In PCI we choose to support MSI-X instead of MSI for having extra 
flexibility like alias, independent data and address (e.g for affinity) 
. Any reason for not start from MSI-X? E.g having MSI-X table and PBA 
(both of which looks pretty independent).
2) It's better to split notify_multiplexer out of MSI support to ease 
the reviewers (apply to spec patch as well)

Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25 10:20     ` [virtio-dev] " Jason Wang
  (?)
@ 2019-12-25 15:20     ` Liu, Jiang
  2019-12-26  8:09         ` [virtio-dev] " Jason Wang
  -1 siblings, 1 reply; 60+ messages in thread
From: Liu, Jiang @ 2019-12-25 15:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zha Bin, linux-kernel, mst, slp, virtio-dev, jing2.liu, chao.p.peng



> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
> 
> 
> On 2019/12/25 上午10:50, Zha Bin wrote:
>> From: Liu Jiang <gerry@linux.alibaba.com>
>> 
>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>> virtio over mmio devices as a lightweight machine model for modern
>> cloud. The standard virtio over MMIO transport layer only supports one
>> legacy interrupt, which is much heavier than virtio over PCI transport
>> layer using MSI. Legacy interrupt has long work path and causes specific
>> VMExits in following cases, which would considerably slow down the
>> performance:
>> 
>> 1) read interrupt status register
>> 2) update interrupt status register
>> 3) write IOAPIC EOI register
>> 
>> We proposed to update virtio over MMIO to version 3[1] to add the
>> following new features and enhance the performance.
>> 
>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>    interrupt performance for virtio multi-queue devices
>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>    to the doorbells provided by virtio devices.
>> 
>> The following is the network tcp_rr performance testing report, tested
>> with virtio-pci device, vanilla virtio-mmio device and patched
>> virtio-mmio device (run test 3 times for each case):
>> 
>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>> 
>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>> 	trans/s	    9536	6939		9500
>> 	trans/s	    9734	7029		9749
>> 	trans/s	    9894	7095		9318
>> 
>> [1] https://lkml.org/lkml/2019/12/20/113
> 
> 
> Thanks for the patch. Two questions after a quick glance:
> 
> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
Hi Jason,
	Thanks for reviewing patches on Christmas Day:)
	The PCI MSI-x has several advantages over PCI MSI, mainly 
1) support 2048 vectors, much more than 32 vectors supported by MSI.
2) dedicated address/data for each vector,
3) per vector mask/pending bits.
The proposed MMIO MSI extension supports both 1) and 2), but the extension doesn’t support 3) because
we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.

On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
will cause some complexity to VMM implementations.

> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
Great suggestion, we will try to split the patch.

Thanks,
Gerry

> 
> Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
@ 2019-12-25 22:28     ` kbuild test robot
  2019-12-25 22:28     ` kbuild test robot
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 60+ messages in thread
From: kbuild test robot @ 2019-12-25 22:28 UTC (permalink / raw)
  To: Zha Bin
  Cc: kbuild-all, linux-kernel, mst, jasowang, slp, virtio-dev, gerry,
	zhabin, jing2.liu, chao.p.peng

[-- Attachment #1: Type: text/plain, Size: 2101 bytes --]

Hi Zha,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[cannot apply to tip/auto-latest linux/master linus/master v5.5-rc3 next-20191220]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Zha-Bin/support-virtio-mmio-specification-Version-3/20191226-041757
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 004e8dce9c5595697951f7cd0e9f66b35c92265e
config: alpha-randconfig-a001-20191225 (attached as .config)
compiler: alpha-linux-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=alpha 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers//virtio/virtio_mmio.c:72:10: fatal error: asm/irqdomain.h: No such file or directory
    #include <asm/irqdomain.h>
             ^~~~~~~~~~~~~~~~~
   compilation terminated.

vim +72 drivers//virtio/virtio_mmio.c

    56	
    57	#include <linux/acpi.h>
    58	#include <linux/dma-mapping.h>
    59	#include <linux/highmem.h>
    60	#include <linux/interrupt.h>
    61	#include <linux/io.h>
    62	#include <linux/list.h>
    63	#include <linux/module.h>
    64	#include <linux/platform_device.h>
    65	#include <linux/slab.h>
    66	#include <linux/spinlock.h>
    67	#include <linux/virtio.h>
    68	#include <linux/virtio_config.h>
    69	#include <uapi/linux/virtio_mmio.h>
    70	#include <linux/virtio_ring.h>
    71	#include <linux/msi.h>
  > 72	#include <asm/irqdomain.h>
    73	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33206 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2019-12-25 22:28     ` kbuild test robot
  0 siblings, 0 replies; 60+ messages in thread
From: kbuild test robot @ 2019-12-25 22:28 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2157 bytes --]

Hi Zha,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[cannot apply to tip/auto-latest linux/master linus/master v5.5-rc3 next-20191220]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Zha-Bin/support-virtio-mmio-specification-Version-3/20191226-041757
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 004e8dce9c5595697951f7cd0e9f66b35c92265e
config: alpha-randconfig-a001-20191225 (attached as .config)
compiler: alpha-linux-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=alpha 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> drivers//virtio/virtio_mmio.c:72:10: fatal error: asm/irqdomain.h: No such file or directory
    #include <asm/irqdomain.h>
             ^~~~~~~~~~~~~~~~~
   compilation terminated.

vim +72 drivers//virtio/virtio_mmio.c

    56	
    57	#include <linux/acpi.h>
    58	#include <linux/dma-mapping.h>
    59	#include <linux/highmem.h>
    60	#include <linux/interrupt.h>
    61	#include <linux/io.h>
    62	#include <linux/list.h>
    63	#include <linux/module.h>
    64	#include <linux/platform_device.h>
    65	#include <linux/slab.h>
    66	#include <linux/spinlock.h>
    67	#include <linux/virtio.h>
    68	#include <linux/virtio_config.h>
    69	#include <uapi/linux/virtio_mmio.h>
    70	#include <linux/virtio_ring.h>
    71	#include <linux/msi.h>
  > 72	#include <asm/irqdomain.h>
    73	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org Intel Corporation

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 33206 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
@ 2019-12-26  1:44     ` kbuild test robot
  2019-12-25 22:28     ` kbuild test robot
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 60+ messages in thread
From: kbuild test robot @ 2019-12-26  1:44 UTC (permalink / raw)
  To: Zha Bin
  Cc: kbuild-all, linux-kernel, mst, jasowang, slp, virtio-dev, gerry,
	zhabin, jing2.liu, chao.p.peng

[-- Attachment #1: Type: text/plain, Size: 15486 bytes --]

Hi Zha,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[cannot apply to tip/auto-latest linux/master linus/master v5.5-rc3 next-20191219]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Zha-Bin/support-virtio-mmio-specification-Version-3/20191226-041757
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 004e8dce9c5595697951f7cd0e9f66b35c92265e
config: i386-randconfig-c001-20191225 (attached as .config)
compiler: gcc-7 (Debian 7.5.0-3) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/list.h:9:0,
                    from include/linux/kobject.h:19,
                    from include/linux/of.h:17,
                    from include/linux/irqdomain.h:35,
                    from include/linux/acpi.h:13,
                    from drivers//virtio/virtio_mmio.c:57:
   drivers//virtio/virtio_mmio.c: In function 'vm_msi_irq_vector':
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/kernel.h:993:26: note: in definition of macro 'container_of'
     void *__mptr = (void *)(ptr);     \
                             ^~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
>> drivers//virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   In file included from include/linux/ioport.h:13:0,
                    from include/linux/acpi.h:12,
                    from drivers//virtio/virtio_mmio.c:57:
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:20: note: in expansion of macro '__same_type'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
                       ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
>> drivers//virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:995:6: note: in expansion of macro '__same_type'
        !__same_type(*(ptr), void),   \
         ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
>> drivers//virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   drivers//virtio/virtio_mmio.c: In function 'vm_request_msi_vectors':
>> drivers//virtio/virtio_mmio.c:652:17: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
     if (!vdev->dev.msi_domain)
                    ^~~~~~~~~~
                    pm_domain
   drivers//virtio/virtio_mmio.c:653:13: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
      vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
                ^~~~~~~~~~
                pm_domain
>> drivers//virtio/virtio_mmio.c:654:8: error: implicit declaration of function 'platform_msi_domain_alloc_irqs'; did you mean 'irq_domain_alloc_irqs'? [-Werror=implicit-function-declaration]
     err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           irq_domain_alloc_irqs
   drivers//virtio/virtio_mmio.c: In function 'vm_free_msi_irqs':
>> drivers//virtio/virtio_mmio.c:683:3: error: implicit declaration of function 'platform_msi_domain_free_irqs'; did you mean 'irq_domain_free_irqs'? [-Werror=implicit-function-declaration]
      platform_msi_domain_free_irqs(&vdev->dev);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      irq_domain_free_irqs
   cc1: some warnings being treated as errors
--
   In file included from include/linux/list.h:9:0,
                    from include/linux/kobject.h:19,
                    from include/linux/of.h:17,
                    from include/linux/irqdomain.h:35,
                    from include/linux/acpi.h:13,
                    from drivers/virtio/virtio_mmio.c:57:
   drivers/virtio/virtio_mmio.c: In function 'vm_msi_irq_vector':
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/kernel.h:993:26: note: in definition of macro 'container_of'
     void *__mptr = (void *)(ptr);     \
                             ^~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   In file included from include/linux/ioport.h:13:0,
                    from include/linux/acpi.h:12,
                    from drivers/virtio/virtio_mmio.c:57:
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:20: note: in expansion of macro '__same_type'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
                       ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:995:6: note: in expansion of macro '__same_type'
        !__same_type(*(ptr), void),   \
         ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c: In function 'vm_request_msi_vectors':
   drivers/virtio/virtio_mmio.c:652:17: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
     if (!vdev->dev.msi_domain)
                    ^~~~~~~~~~
                    pm_domain
   drivers/virtio/virtio_mmio.c:653:13: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
      vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
                ^~~~~~~~~~
                pm_domain
   drivers/virtio/virtio_mmio.c:654:8: error: implicit declaration of function 'platform_msi_domain_alloc_irqs'; did you mean 'irq_domain_alloc_irqs'? [-Werror=implicit-function-declaration]
     err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           irq_domain_alloc_irqs
   drivers/virtio/virtio_mmio.c: In function 'vm_free_msi_irqs':
   drivers/virtio/virtio_mmio.c:683:3: error: implicit declaration of function 'platform_msi_domain_free_irqs'; did you mean 'irq_domain_free_irqs'? [-Werror=implicit-function-declaration]
      platform_msi_domain_free_irqs(&vdev->dev);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      irq_domain_free_irqs
   cc1: some warnings being treated as errors

vim +135 include/linux/msi.h

3b7d1921f4cdd6 Eric W. Biederman 2006-10-04  132  
d31eb342409b24 Jiang Liu         2014-11-15  133  /* Helpers to hide struct msi_desc implementation details */
25a98bd4ff9355 Jiang Liu         2015-07-09  134  #define msi_desc_to_dev(desc)		((desc)->dev)
4a7cc831670550 Jiang Liu         2015-07-09 @135  #define dev_to_msi_list(dev)		(&(dev)->msi_list)
d31eb342409b24 Jiang Liu         2014-11-15  136  #define first_msi_entry(dev)		\
d31eb342409b24 Jiang Liu         2014-11-15 @137  	list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
d31eb342409b24 Jiang Liu         2014-11-15  138  #define for_each_msi_entry(desc, dev)	\
d31eb342409b24 Jiang Liu         2014-11-15  139  	list_for_each_entry((desc), dev_to_msi_list((dev)), list)
81b1e6e6a8590a Miquel Raynal     2018-10-11  140  #define for_each_msi_entry_safe(desc, tmp, dev)	\
81b1e6e6a8590a Miquel Raynal     2018-10-11  141  	list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
d31eb342409b24 Jiang Liu         2014-11-15  142  

:::::: The code at line 135 was first introduced by commit
:::::: 4a7cc831670550e6b48ef5760e7213f89935ff0d genirq/MSI: Move msi_list from struct pci_dev to struct device

:::::: TO: Jiang Liu <jiang.liu@linux.intel.com>
:::::: CC: Thomas Gleixner <tglx@linutronix.de>

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34784 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2019-12-26  1:44     ` kbuild test robot
  0 siblings, 0 replies; 60+ messages in thread
From: kbuild test robot @ 2019-12-26  1:44 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 15762 bytes --]

Hi Zha,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[cannot apply to tip/auto-latest linux/master linus/master v5.5-rc3 next-20191219]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Zha-Bin/support-virtio-mmio-specification-Version-3/20191226-041757
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 004e8dce9c5595697951f7cd0e9f66b35c92265e
config: i386-randconfig-c001-20191225 (attached as .config)
compiler: gcc-7 (Debian 7.5.0-3) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/list.h:9:0,
                    from include/linux/kobject.h:19,
                    from include/linux/of.h:17,
                    from include/linux/irqdomain.h:35,
                    from include/linux/acpi.h:13,
                    from drivers//virtio/virtio_mmio.c:57:
   drivers//virtio/virtio_mmio.c: In function 'vm_msi_irq_vector':
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/kernel.h:993:26: note: in definition of macro 'container_of'
     void *__mptr = (void *)(ptr);     \
                             ^~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
>> drivers//virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   In file included from include/linux/ioport.h:13:0,
                    from include/linux/acpi.h:12,
                    from drivers//virtio/virtio_mmio.c:57:
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:20: note: in expansion of macro '__same_type'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
                       ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
>> drivers//virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:995:6: note: in expansion of macro '__same_type'
        !__same_type(*(ptr), void),   \
         ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
>> drivers//virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   drivers//virtio/virtio_mmio.c: In function 'vm_request_msi_vectors':
>> drivers//virtio/virtio_mmio.c:652:17: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
     if (!vdev->dev.msi_domain)
                    ^~~~~~~~~~
                    pm_domain
   drivers//virtio/virtio_mmio.c:653:13: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
      vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
                ^~~~~~~~~~
                pm_domain
>> drivers//virtio/virtio_mmio.c:654:8: error: implicit declaration of function 'platform_msi_domain_alloc_irqs'; did you mean 'irq_domain_alloc_irqs'? [-Werror=implicit-function-declaration]
     err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           irq_domain_alloc_irqs
   drivers//virtio/virtio_mmio.c: In function 'vm_free_msi_irqs':
>> drivers//virtio/virtio_mmio.c:683:3: error: implicit declaration of function 'platform_msi_domain_free_irqs'; did you mean 'irq_domain_free_irqs'? [-Werror=implicit-function-declaration]
      platform_msi_domain_free_irqs(&vdev->dev);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      irq_domain_free_irqs
   cc1: some warnings being treated as errors
--
   In file included from include/linux/list.h:9:0,
                    from include/linux/kobject.h:19,
                    from include/linux/of.h:17,
                    from include/linux/irqdomain.h:35,
                    from include/linux/acpi.h:13,
                    from drivers/virtio/virtio_mmio.c:57:
   drivers/virtio/virtio_mmio.c: In function 'vm_msi_irq_vector':
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/kernel.h:993:26: note: in definition of macro 'container_of'
     void *__mptr = (void *)(ptr);     \
                             ^~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   In file included from include/linux/ioport.h:13:0,
                    from include/linux/acpi.h:12,
                    from drivers/virtio/virtio_mmio.c:57:
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:20: note: in expansion of macro '__same_type'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
                       ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
>> include/linux/msi.h:135:38: error: 'struct device' has no member named 'msi_list'
    #define dev_to_msi_list(dev)  (&(dev)->msi_list)
                                         ^
   include/linux/compiler.h:330:9: note: in definition of macro '__compiletime_assert'
      if (!(condition))     \
            ^~~~~~~~~
   include/linux/compiler.h:350:2: note: in expansion of macro '_compiletime_assert'
     _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
     ^~~~~~~~~~~~~~~~~~~
   include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                        ^~~~~~~~~~~~~~~~~~
   include/linux/kernel.h:994:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
     BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
     ^~~~~~~~~~~~~~~~
   include/linux/kernel.h:995:6: note: in expansion of macro '__same_type'
        !__same_type(*(ptr), void),   \
         ^~~~~~~~~~~
>> include/linux/list.h:479:2: note: in expansion of macro 'container_of'
     container_of(ptr, type, member)
     ^~~~~~~~~~~~
>> include/linux/list.h:490:2: note: in expansion of macro 'list_entry'
     list_entry((ptr)->next, type, member)
     ^~~~~~~~~~
>> include/linux/msi.h:137:2: note: in expansion of macro 'list_first_entry'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
     ^~~~~~~~~~~~~~~~
>> include/linux/msi.h:137:19: note: in expansion of macro 'dev_to_msi_list'
     list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
                      ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c:336:27: note: in expansion of macro 'first_msi_entry'
     struct msi_desc *entry = first_msi_entry(dev);
                              ^~~~~~~~~~~~~~~
   drivers/virtio/virtio_mmio.c: In function 'vm_request_msi_vectors':
   drivers/virtio/virtio_mmio.c:652:17: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
     if (!vdev->dev.msi_domain)
                    ^~~~~~~~~~
                    pm_domain
   drivers/virtio/virtio_mmio.c:653:13: error: 'struct device' has no member named 'msi_domain'; did you mean 'pm_domain'?
      vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
                ^~~~~~~~~~
                pm_domain
   drivers/virtio/virtio_mmio.c:654:8: error: implicit declaration of function 'platform_msi_domain_alloc_irqs'; did you mean 'irq_domain_alloc_irqs'? [-Werror=implicit-function-declaration]
     err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           irq_domain_alloc_irqs
   drivers/virtio/virtio_mmio.c: In function 'vm_free_msi_irqs':
   drivers/virtio/virtio_mmio.c:683:3: error: implicit declaration of function 'platform_msi_domain_free_irqs'; did you mean 'irq_domain_free_irqs'? [-Werror=implicit-function-declaration]
      platform_msi_domain_free_irqs(&vdev->dev);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      irq_domain_free_irqs
   cc1: some warnings being treated as errors

vim +135 include/linux/msi.h

3b7d1921f4cdd6 Eric W. Biederman 2006-10-04  132  
d31eb342409b24 Jiang Liu         2014-11-15  133  /* Helpers to hide struct msi_desc implementation details */
25a98bd4ff9355 Jiang Liu         2015-07-09  134  #define msi_desc_to_dev(desc)		((desc)->dev)
4a7cc831670550 Jiang Liu         2015-07-09 @135  #define dev_to_msi_list(dev)		(&(dev)->msi_list)
d31eb342409b24 Jiang Liu         2014-11-15  136  #define first_msi_entry(dev)		\
d31eb342409b24 Jiang Liu         2014-11-15 @137  	list_first_entry(dev_to_msi_list((dev)), struct msi_desc, list)
d31eb342409b24 Jiang Liu         2014-11-15  138  #define for_each_msi_entry(desc, dev)	\
d31eb342409b24 Jiang Liu         2014-11-15  139  	list_for_each_entry((desc), dev_to_msi_list((dev)), list)
81b1e6e6a8590a Miquel Raynal     2018-10-11  140  #define for_each_msi_entry_safe(desc, tmp, dev)	\
81b1e6e6a8590a Miquel Raynal     2018-10-11  141  	list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
d31eb342409b24 Jiang Liu         2014-11-15  142  

:::::: The code at line 135 was first introduced by commit
:::::: 4a7cc831670550e6b48ef5760e7213f89935ff0d genirq/MSI: Move msi_list from struct pci_dev to struct device

:::::: TO: Jiang Liu <jiang.liu@linux.intel.com>
:::::: CC: Thomas Gleixner <tglx@linutronix.de>

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org Intel Corporation

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 34784 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25 15:20     ` Liu, Jiang
@ 2019-12-26  8:09         ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-26  8:09 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Zha Bin, linux-kernel, mst, slp, virtio-dev, jing2.liu, chao.p.peng


On 2019/12/25 下午11:20, Liu, Jiang wrote:
>
>> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
>>
>>
>> On 2019/12/25 上午10:50, Zha Bin wrote:
>>> From: Liu Jiang <gerry@linux.alibaba.com>
>>>
>>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>>> virtio over mmio devices as a lightweight machine model for modern
>>> cloud. The standard virtio over MMIO transport layer only supports one
>>> legacy interrupt, which is much heavier than virtio over PCI transport
>>> layer using MSI. Legacy interrupt has long work path and causes specific
>>> VMExits in following cases, which would considerably slow down the
>>> performance:
>>>
>>> 1) read interrupt status register
>>> 2) update interrupt status register
>>> 3) write IOAPIC EOI register
>>>
>>> We proposed to update virtio over MMIO to version 3[1] to add the
>>> following new features and enhance the performance.
>>>
>>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>>     interrupt performance for virtio multi-queue devices
>>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>>     to the doorbells provided by virtio devices.
>>>
>>> The following is the network tcp_rr performance testing report, tested
>>> with virtio-pci device, vanilla virtio-mmio device and patched
>>> virtio-mmio device (run test 3 times for each case):
>>>
>>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>>
>>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>>> 	trans/s	    9536	6939		9500
>>> 	trans/s	    9734	7029		9749
>>> 	trans/s	    9894	7095		9318
>>>
>>> [1] https://lkml.org/lkml/2019/12/20/113
>>
>> Thanks for the patch. Two questions after a quick glance:
>>
>> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
> Hi Jason,
> 	Thanks for reviewing patches on Christmas Day:)
> 	The PCI MSI-x has several advantages over PCI MSI, mainly
> 1) support 2048 vectors, much more than 32 vectors supported by MSI.
> 2) dedicated address/data for each vector,
> 3) per vector mask/pending bits.
> The proposed MMIO MSI extension supports both 1) and 2),


Aha right, I mis-read the patch. But more questions comes:

1) The association between vq and MSI-X vector is fixed. This means it 
can't work for a device that have more than 2047 queues. We probably 
need something similar to virtio-pci to allow a dynamic association.
2) The mask and unmask control is missed


>   but the extension doesn’t support 3) because
> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.


Not directly used but masking/unmasking is widely used in irq subsystem 
which allows lots of optimizations.


>
> On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
> will cause some complexity to VMM implementations.


I agree to simplify VMM implementation, but it looks to me introducing 
masking/pending won't cost too much code in the VMM implementation. Just 
new type of command for VIRTIO_MMIO_MSI_COMMAND.

Thanks


>
>> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
> Great suggestion, we will try to split the patch.
>
> Thanks,
> Gerry
>
>> Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2019-12-26  8:09         ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-26  8:09 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Zha Bin, linux-kernel, mst, slp, virtio-dev, jing2.liu, chao.p.peng


On 2019/12/25 下午11:20, Liu, Jiang wrote:
>
>> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
>>
>>
>> On 2019/12/25 上午10:50, Zha Bin wrote:
>>> From: Liu Jiang <gerry@linux.alibaba.com>
>>>
>>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>>> virtio over mmio devices as a lightweight machine model for modern
>>> cloud. The standard virtio over MMIO transport layer only supports one
>>> legacy interrupt, which is much heavier than virtio over PCI transport
>>> layer using MSI. Legacy interrupt has long work path and causes specific
>>> VMExits in following cases, which would considerably slow down the
>>> performance:
>>>
>>> 1) read interrupt status register
>>> 2) update interrupt status register
>>> 3) write IOAPIC EOI register
>>>
>>> We proposed to update virtio over MMIO to version 3[1] to add the
>>> following new features and enhance the performance.
>>>
>>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>>     interrupt performance for virtio multi-queue devices
>>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>>     to the doorbells provided by virtio devices.
>>>
>>> The following is the network tcp_rr performance testing report, tested
>>> with virtio-pci device, vanilla virtio-mmio device and patched
>>> virtio-mmio device (run test 3 times for each case):
>>>
>>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>>
>>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>>> 	trans/s	    9536	6939		9500
>>> 	trans/s	    9734	7029		9749
>>> 	trans/s	    9894	7095		9318
>>>
>>> [1] https://lkml.org/lkml/2019/12/20/113
>>
>> Thanks for the patch. Two questions after a quick glance:
>>
>> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
> Hi Jason,
> 	Thanks for reviewing patches on Christmas Day:)
> 	The PCI MSI-x has several advantages over PCI MSI, mainly
> 1) support 2048 vectors, much more than 32 vectors supported by MSI.
> 2) dedicated address/data for each vector,
> 3) per vector mask/pending bits.
> The proposed MMIO MSI extension supports both 1) and 2),


Aha right, I mis-read the patch. But more questions comes:

1) The association between vq and MSI-X vector is fixed. This means it 
can't work for a device that have more than 2047 queues. We probably 
need something similar to virtio-pci to allow a dynamic association.
2) The mask and unmask control is missed


>   but the extension doesn’t support 3) because
> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.


Not directly used but masking/unmasking is widely used in irq subsystem 
which allows lots of optimizations.


>
> On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
> will cause some complexity to VMM implementations.


I agree to simplify VMM implementation, but it looks to me introducing 
masking/pending won't cost too much code in the VMM implementation. Just 
new type of command for VIRTIO_MMIO_MSI_COMMAND.

Thanks


>
>> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
> Great suggestion, we will try to split the patch.
>
> Thanks,
> Gerry
>
>> Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
  2019-12-25  2:50 ` [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi Zha Bin
@ 2019-12-26  8:21     ` Jason Wang
  2020-01-17 13:58   ` Thomas Gleixner
  1 sibling, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-26  8:21 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
>
> The x86 platform currently only supports specific MSI/MSI-x for PCI
> devices. To enable MSI to the platform devices such as virtio-mmio
> device, this patch enhances x86 with platform MSI by leveraging the
> already built-in platform-msi driver (drivers/base/platform-msi.c)
> and makes it as a configurable option.
>
> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
> ---
>   arch/x86/Kconfig                 |  6 ++++
>   arch/x86/include/asm/hw_irq.h    |  5 +++
>   arch/x86/include/asm/irqdomain.h |  9 +++++
>   arch/x86/kernel/apic/msi.c       | 74 ++++++++++++++++++++++++++++++++++++++++
>   drivers/base/platform-msi.c      |  4 +--
>   include/linux/msi.h              |  1 +
>   6 files changed, 97 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 5e89499..c1178f2 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1074,6 +1074,12 @@ config X86_IO_APIC
>   	def_bool y
>   	depends on X86_LOCAL_APIC || X86_UP_IOAPIC
>   
> +config X86_PLATFORM_MSI
> +	def_bool y
> +	depends on X86_LOCAL_APIC && X86_64
> +	select GENERIC_MSI_IRQ
> +	select GENERIC_MSI_IRQ_DOMAIN
> +
>   config X86_REROUTE_FOR_BROKEN_BOOT_IRQS
>   	bool "Reroute for broken boot IRQs"
>   	depends on X86_IO_APIC
> diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
> index 4154bc5..5c96b49 100644
> --- a/arch/x86/include/asm/hw_irq.h
> +++ b/arch/x86/include/asm/hw_irq.h
> @@ -113,6 +113,11 @@ struct irq_alloc_info {
>   			struct msi_desc *desc;
>   		};
>   #endif
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +		struct {
> +			irq_hw_number_t	platform_msi_hwirq;
> +		};
> +#endif
>   	};
>   };
>   
> diff --git a/arch/x86/include/asm/irqdomain.h b/arch/x86/include/asm/irqdomain.h
> index c066ffa..b53f51f 100644
> --- a/arch/x86/include/asm/irqdomain.h
> +++ b/arch/x86/include/asm/irqdomain.h
> @@ -56,4 +56,13 @@ extern void mp_irqdomain_deactivate(struct irq_domain *domain,
>   static inline void arch_init_msi_domain(struct irq_domain *domain) { }
>   #endif
>   
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +extern struct irq_domain *platform_msi_get_def_irq_domain(void);
> +#else
> +static inline struct irq_domain *platform_msi_get_def_irq_domain(void)
> +{
> +	return NULL;
> +}
> +#endif
> +
>   #endif
> diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
> index 7f75334..ef558c7 100644
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -22,6 +22,10 @@
>   #include <asm/irq_remapping.h>
>   
>   static struct irq_domain *msi_default_domain;
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +static struct irq_domain *platform_msi_default_domain;
> +static struct msi_domain_info platform_msi_domain_info;
> +#endif
>   
>   static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
>   {
> @@ -146,6 +150,17 @@ void __init arch_init_msi_domain(struct irq_domain *parent)
>   	}
>   	if (!msi_default_domain)
>   		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +	fn = irq_domain_alloc_named_fwnode("PLATFORM-MSI");
> +	if (fn) {
> +		platform_msi_default_domain =
> +			platform_msi_create_irq_domain(fn,
> +				&platform_msi_domain_info, parent);
> +		irq_domain_free_fwnode(fn);
> +	}
> +	if (!platform_msi_default_domain)
> +		pr_warn("failed to initialize irqdomain for PLATFORM-MSI.\n");
> +#endif
>   }
>   
>   #ifdef CONFIG_IRQ_REMAP
> @@ -384,3 +399,62 @@ int hpet_assign_irq(struct irq_domain *domain, struct hpet_channel *hc,
>   	return irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, &info);
>   }
>   #endif
> +
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +static void platform_msi_mask_irq(struct irq_data *data)
> +{
> +}
> +
> +static void platform_msi_unmask_irq(struct irq_data *data)
> +{
> +}
> +
> +static struct irq_chip platform_msi_controller = {
> +	.name			= "PLATFORM-MSI",
> +	.irq_mask		= platform_msi_mask_irq,
> +	.irq_unmask		= platform_msi_unmask_irq,
> +	.irq_ack		= irq_chip_ack_parent,
> +	.irq_retrigger		= irq_chip_retrigger_hierarchy,
> +	.irq_compose_msi_msg	= irq_msi_compose_msg,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE,


Except for irq_msi_compose_msg all the rest were arch independent. Can 
we have allow arch to specify irq_compose_msi_msg then we can unfiy the 
rest for all archs?


> +};
> +
> +static int platform_msi_prepare(struct irq_domain *domain, struct device *dev,
> +				int nvec, msi_alloc_info_t *arg)
> +{
> +	memset(arg, 0, sizeof(*arg));
> +	return 0;
> +}
> +
> +static void platform_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
> +{
> +	arg->platform_msi_hwirq = platform_msi_calc_hwirq(desc);
> +}
> +
> +static irq_hw_number_t platform_msi_get_hwirq(struct msi_domain_info *info,
> +					      msi_alloc_info_t *arg)
> +{
> +	return arg->platform_msi_hwirq;
> +}
> +
> +static struct msi_domain_ops platform_msi_domain_ops = {
> +	.msi_prepare	= platform_msi_prepare,
> +	.set_desc	= platform_msi_set_desc,
> +	.get_hwirq	= platform_msi_get_hwirq,
> +};
> +
> +static struct msi_domain_info platform_msi_domain_info = {
> +	.flags          = MSI_FLAG_USE_DEF_DOM_OPS |
> +			  MSI_FLAG_USE_DEF_CHIP_OPS |
> +			  MSI_FLAG_ACTIVATE_EARLY,
> +	.ops            = &platform_msi_domain_ops,
> +	.chip           = &platform_msi_controller,
> +	.handler        = handle_edge_irq,
> +	.handler_name   = "edge",
> +};


Is this better to have a virtio-mmio specific domain and irq_chip? It 
looks to me the advantages are:

- works for all archs
- allow virtio-mmio specific method to do e.g mask/unmask.

Thanks


> +
> +struct irq_domain *platform_msi_get_def_irq_domain(void)
> +{
> +	return platform_msi_default_domain;
> +}
> +#endif
> diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c
> index 8da314b..45752f1 100644
> --- a/drivers/base/platform-msi.c
> +++ b/drivers/base/platform-msi.c
> @@ -31,12 +31,11 @@ struct platform_msi_priv_data {
>   /* The devid allocator */
>   static DEFINE_IDA(platform_msi_devid_ida);
>   
> -#ifdef GENERIC_MSI_DOMAIN_OPS
>   /*
>    * Convert an msi_desc to a globaly unique identifier (per-device
>    * devid + msi_desc position in the msi_list).
>    */
> -static irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
> +irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
>   {
>   	u32 devid;
>   
> @@ -45,6 +44,7 @@ static irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
>   	return (devid << (32 - DEV_ID_SHIFT)) | desc->platform.msi_index;
>   }
>   
> +#ifdef GENERIC_MSI_DOMAIN_OPS
>   static void platform_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>   {
>   	arg->desc = desc;
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 8ad679e..ee5f566 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -362,6 +362,7 @@ int platform_msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
>   void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
>   			      unsigned int nvec);
>   void *platform_msi_get_host_data(struct irq_domain *domain);
> +irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc);
>   #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
>   
>   #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
@ 2019-12-26  8:21     ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-26  8:21 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
>
> The x86 platform currently only supports specific MSI/MSI-x for PCI
> devices. To enable MSI to the platform devices such as virtio-mmio
> device, this patch enhances x86 with platform MSI by leveraging the
> already built-in platform-msi driver (drivers/base/platform-msi.c)
> and makes it as a configurable option.
>
> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
> ---
>   arch/x86/Kconfig                 |  6 ++++
>   arch/x86/include/asm/hw_irq.h    |  5 +++
>   arch/x86/include/asm/irqdomain.h |  9 +++++
>   arch/x86/kernel/apic/msi.c       | 74 ++++++++++++++++++++++++++++++++++++++++
>   drivers/base/platform-msi.c      |  4 +--
>   include/linux/msi.h              |  1 +
>   6 files changed, 97 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 5e89499..c1178f2 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1074,6 +1074,12 @@ config X86_IO_APIC
>   	def_bool y
>   	depends on X86_LOCAL_APIC || X86_UP_IOAPIC
>   
> +config X86_PLATFORM_MSI
> +	def_bool y
> +	depends on X86_LOCAL_APIC && X86_64
> +	select GENERIC_MSI_IRQ
> +	select GENERIC_MSI_IRQ_DOMAIN
> +
>   config X86_REROUTE_FOR_BROKEN_BOOT_IRQS
>   	bool "Reroute for broken boot IRQs"
>   	depends on X86_IO_APIC
> diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
> index 4154bc5..5c96b49 100644
> --- a/arch/x86/include/asm/hw_irq.h
> +++ b/arch/x86/include/asm/hw_irq.h
> @@ -113,6 +113,11 @@ struct irq_alloc_info {
>   			struct msi_desc *desc;
>   		};
>   #endif
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +		struct {
> +			irq_hw_number_t	platform_msi_hwirq;
> +		};
> +#endif
>   	};
>   };
>   
> diff --git a/arch/x86/include/asm/irqdomain.h b/arch/x86/include/asm/irqdomain.h
> index c066ffa..b53f51f 100644
> --- a/arch/x86/include/asm/irqdomain.h
> +++ b/arch/x86/include/asm/irqdomain.h
> @@ -56,4 +56,13 @@ extern void mp_irqdomain_deactivate(struct irq_domain *domain,
>   static inline void arch_init_msi_domain(struct irq_domain *domain) { }
>   #endif
>   
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +extern struct irq_domain *platform_msi_get_def_irq_domain(void);
> +#else
> +static inline struct irq_domain *platform_msi_get_def_irq_domain(void)
> +{
> +	return NULL;
> +}
> +#endif
> +
>   #endif
> diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
> index 7f75334..ef558c7 100644
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -22,6 +22,10 @@
>   #include <asm/irq_remapping.h>
>   
>   static struct irq_domain *msi_default_domain;
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +static struct irq_domain *platform_msi_default_domain;
> +static struct msi_domain_info platform_msi_domain_info;
> +#endif
>   
>   static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
>   {
> @@ -146,6 +150,17 @@ void __init arch_init_msi_domain(struct irq_domain *parent)
>   	}
>   	if (!msi_default_domain)
>   		pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n");
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +	fn = irq_domain_alloc_named_fwnode("PLATFORM-MSI");
> +	if (fn) {
> +		platform_msi_default_domain =
> +			platform_msi_create_irq_domain(fn,
> +				&platform_msi_domain_info, parent);
> +		irq_domain_free_fwnode(fn);
> +	}
> +	if (!platform_msi_default_domain)
> +		pr_warn("failed to initialize irqdomain for PLATFORM-MSI.\n");
> +#endif
>   }
>   
>   #ifdef CONFIG_IRQ_REMAP
> @@ -384,3 +399,62 @@ int hpet_assign_irq(struct irq_domain *domain, struct hpet_channel *hc,
>   	return irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, &info);
>   }
>   #endif
> +
> +#ifdef CONFIG_X86_PLATFORM_MSI
> +static void platform_msi_mask_irq(struct irq_data *data)
> +{
> +}
> +
> +static void platform_msi_unmask_irq(struct irq_data *data)
> +{
> +}
> +
> +static struct irq_chip platform_msi_controller = {
> +	.name			= "PLATFORM-MSI",
> +	.irq_mask		= platform_msi_mask_irq,
> +	.irq_unmask		= platform_msi_unmask_irq,
> +	.irq_ack		= irq_chip_ack_parent,
> +	.irq_retrigger		= irq_chip_retrigger_hierarchy,
> +	.irq_compose_msi_msg	= irq_msi_compose_msg,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE,


Except for irq_msi_compose_msg all the rest were arch independent. Can 
we have allow arch to specify irq_compose_msi_msg then we can unfiy the 
rest for all archs?


> +};
> +
> +static int platform_msi_prepare(struct irq_domain *domain, struct device *dev,
> +				int nvec, msi_alloc_info_t *arg)
> +{
> +	memset(arg, 0, sizeof(*arg));
> +	return 0;
> +}
> +
> +static void platform_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
> +{
> +	arg->platform_msi_hwirq = platform_msi_calc_hwirq(desc);
> +}
> +
> +static irq_hw_number_t platform_msi_get_hwirq(struct msi_domain_info *info,
> +					      msi_alloc_info_t *arg)
> +{
> +	return arg->platform_msi_hwirq;
> +}
> +
> +static struct msi_domain_ops platform_msi_domain_ops = {
> +	.msi_prepare	= platform_msi_prepare,
> +	.set_desc	= platform_msi_set_desc,
> +	.get_hwirq	= platform_msi_get_hwirq,
> +};
> +
> +static struct msi_domain_info platform_msi_domain_info = {
> +	.flags          = MSI_FLAG_USE_DEF_DOM_OPS |
> +			  MSI_FLAG_USE_DEF_CHIP_OPS |
> +			  MSI_FLAG_ACTIVATE_EARLY,
> +	.ops            = &platform_msi_domain_ops,
> +	.chip           = &platform_msi_controller,
> +	.handler        = handle_edge_irq,
> +	.handler_name   = "edge",
> +};


Is this better to have a virtio-mmio specific domain and irq_chip? It 
looks to me the advantages are:

- works for all archs
- allow virtio-mmio specific method to do e.g mask/unmask.

Thanks


> +
> +struct irq_domain *platform_msi_get_def_irq_domain(void)
> +{
> +	return platform_msi_default_domain;
> +}
> +#endif
> diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c
> index 8da314b..45752f1 100644
> --- a/drivers/base/platform-msi.c
> +++ b/drivers/base/platform-msi.c
> @@ -31,12 +31,11 @@ struct platform_msi_priv_data {
>   /* The devid allocator */
>   static DEFINE_IDA(platform_msi_devid_ida);
>   
> -#ifdef GENERIC_MSI_DOMAIN_OPS
>   /*
>    * Convert an msi_desc to a globaly unique identifier (per-device
>    * devid + msi_desc position in the msi_list).
>    */
> -static irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
> +irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
>   {
>   	u32 devid;
>   
> @@ -45,6 +44,7 @@ static irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc)
>   	return (devid << (32 - DEV_ID_SHIFT)) | desc->platform.msi_index;
>   }
>   
> +#ifdef GENERIC_MSI_DOMAIN_OPS
>   static void platform_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>   {
>   	arg->desc = desc;
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 8ad679e..ee5f566 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -362,6 +362,7 @@ int platform_msi_domain_alloc(struct irq_domain *domain, unsigned int virq,
>   void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
>   			      unsigned int nvec);
>   void *platform_msi_get_host_data(struct irq_domain *domain);
> +irq_hw_number_t platform_msi_calc_hwirq(struct msi_desc *desc);
>   #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
>   
>   #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
@ 2019-12-26  8:40     ` Jason Wang
  2019-12-25 22:28     ` kbuild test robot
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-26  8:40 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
>
> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> virtio over mmio devices as a lightweight machine model for modern
> cloud. The standard virtio over MMIO transport layer only supports one
> legacy interrupt, which is much heavier than virtio over PCI transport
> layer using MSI. Legacy interrupt has long work path and causes specific
> VMExits in following cases, which would considerably slow down the
> performance:
>
> 1) read interrupt status register
> 2) update interrupt status register
> 3) write IOAPIC EOI register
>
> We proposed to update virtio over MMIO to version 3[1] to add the
> following new features and enhance the performance.
>
> 1) Support Message Signaled Interrupt(MSI), which increases the
>     interrupt performance for virtio multi-queue devices
> 2) Support per-queue doorbell, so the guest kernel may directly write
>     to the doorbells provided by virtio devices.
>
> The following is the network tcp_rr performance testing report, tested
> with virtio-pci device, vanilla virtio-mmio device and patched
> virtio-mmio device (run test 3 times for each case):
>
> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>
> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> 	trans/s	    9536	6939		9500
> 	trans/s	    9734	7029		9749
> 	trans/s	    9894	7095		9318
>
> [1] https://lkml.org/lkml/2019/12/20/113
>
> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
> ---
>   drivers/virtio/virtio_mmio.c     | 226 ++++++++++++++++++++++++++++++++++++---
>   include/uapi/linux/virtio_mmio.h |  28 +++++
>   2 files changed, 240 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index e09edb5..065c083 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -68,8 +68,8 @@
>   #include <linux/virtio_config.h>
>   #include <uapi/linux/virtio_mmio.h>
>   #include <linux/virtio_ring.h>
> -
> -
> +#include <linux/msi.h>
> +#include <asm/irqdomain.h>
>   
>   /* The alignment to use between consumer and producer parts of vring.
>    * Currently hardcoded to the page size. */
> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>   	/* a list of queues so we can dispatch IRQs */
>   	spinlock_t lock;
>   	struct list_head virtqueues;
> +
> +	int doorbell_base;
> +	int doorbell_scale;


It's better to use the terminology defined by spec, e.g 
notify_base/notify_multiplexer etc.

And we usually use unsigned type for offset.


> +	bool msi_enabled;
> +	/* Name strings for interrupts. */
> +	char (*vm_vq_names)[256];
>   };
>   
>   struct virtio_mmio_vq_info {
> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>   };
>   
>   
> +static void vm_free_msi_irqs(struct virtio_device *vdev);
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs);
>   
>   /* Configuration interface */
>   
> @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
>   
> +	if (vm_dev->version == 3) {
> +		int offset = vm_dev->doorbell_base +
> +			     vm_dev->doorbell_scale * vq->index;
> +		writel(vq->index, vm_dev->base + offset);


In virtio-pci we store the doorbell address in vq->priv to avoid doing 
multiplication in fast path.


> +	} else
>   	/* We write the queue's selector into the notification register to
>   	 * signal the other end */
> -	writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +		writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
>   	return true;
>   }
>   
> +/* Handle a configuration change */
> +static irqreturn_t vm_config_changed(int irq, void *opaque)
> +{
> +	struct virtio_mmio_device *vm_dev = opaque;
> +
> +	virtio_config_changed(&vm_dev->vdev);
> +
> +	return IRQ_HANDLED;
> +}
> +
>   /* Notify all virtqueues on an interrupt. */
>   static irqreturn_t vm_interrupt(int irq, void *opaque)
>   {
> @@ -307,7 +331,12 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
>   	return ret;
>   }
>   
> +static int vm_msi_irq_vector(struct device *dev, unsigned int nr)
> +{
> +	struct msi_desc *entry = first_msi_entry(dev);
>   
> +	return entry->irq + nr;
> +}
>   
>   static void vm_del_vq(struct virtqueue *vq)
>   {
> @@ -316,6 +345,12 @@ static void vm_del_vq(struct virtqueue *vq)
>   	unsigned long flags;
>   	unsigned int index = vq->index;
>   
> +	if (vm_dev->version == 3 && vm_dev->msi_enabled) {
> +		int irq = vm_msi_irq_vector(&vq->vdev->dev, index + 1);
> +
> +		free_irq(irq, vq);
> +	}
> +
>   	spin_lock_irqsave(&vm_dev->lock, flags);
>   	list_del(&info->node);
>   	spin_unlock_irqrestore(&vm_dev->lock, flags);
> @@ -334,15 +369,41 @@ static void vm_del_vq(struct virtqueue *vq)
>   	kfree(info);
>   }
>   
> -static void vm_del_vqs(struct virtio_device *vdev)
> +static void vm_free_irqs(struct virtio_device *vdev)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->version == 3)
> +		vm_free_msi_irqs(vdev);
> +	else
> +		free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +}
> +
> +static void vm_del_vqs(struct virtio_device *vdev)
> +{
>   	struct virtqueue *vq, *n;
>   
>   	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
>   		vm_del_vq(vq);
>   
> -	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +	vm_free_irqs(vdev);
> +}
> +
> +static int vm_request_single_irq(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq = platform_get_irq(vm_dev->pdev, 0);
> +	int err;
> +
> +	if (irq < 0) {
> +		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> +		return irq;
> +	}
> +
> +	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> +			dev_name(&vdev->dev), vm_dev);
> +
> +	return err;
>   }
>   
>   static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
> @@ -455,6 +516,72 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
>   	return ERR_PTR(err);
>   }
>   
> +static inline void mmio_msi_set_enable(struct virtio_mmio_device *vm_dev,
> +		int enable)
> +{
> +	u16 status;
> +
> +	status = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +	if (enable)
> +		status |= VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	else
> +		status &= ~VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	writew(status, vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +}
> +
> +static int vm_find_vqs_msi(struct virtio_device *vdev, unsigned int nvqs,
> +			struct virtqueue *vqs[], vq_callback_t *callbacks[],
> +			const char * const names[], const bool *ctx,
> +			struct irq_affinity *desc)
> +{
> +	int i, err, irq;
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	/* Allocate nvqs irqs for queues and one irq for configuration */
> +	err = vm_request_msi_vectors(vdev, nvqs + 1);
> +	if (err != 0)
> +		return err;
> +
> +	for (i = 0; i < nvqs; i++) {
> +		int index = i + 1;
> +
> +		if (!names[i]) {
> +			vqs[i] = NULL;
> +			continue;
> +		}
> +		vqs[i] = vm_setup_vq(vdev, i, callbacks[i], names[i],
> +				ctx ? ctx[i] : false);
> +		if (IS_ERR(vqs[i])) {
> +			err = PTR_ERR(vqs[i]);
> +			goto error_find;
> +		}
> +
> +		if (!callbacks[i])
> +			continue;
> +
> +		irq = vm_msi_irq_vector(&vqs[i]->vdev->dev, index);
> +		/* allocate per-vq irq if available and necessary */
> +		snprintf(vm_dev->vm_vq_names[index],
> +			sizeof(*vm_dev->vm_vq_names),
> +			"%s-%s",
> +			dev_name(&vm_dev->vdev.dev), names[i]);
> +		err = request_irq(irq, vring_interrupt, 0,
> +				vm_dev->vm_vq_names[index], vqs[i]);
> +
> +		if (err)
> +			goto error_find;
> +	}
> +
> +	mmio_msi_set_enable(vm_dev, 1);
> +	vm_dev->msi_enabled = true;
> +
> +	return 0;
> +
> +error_find:
> +	vm_del_vqs(vdev);
> +	return err;
> +}
> +
>   static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   		       struct virtqueue *vqs[],
>   		       vq_callback_t *callbacks[],
> @@ -463,16 +590,16 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   		       struct irq_affinity *desc)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> -	int irq = platform_get_irq(vm_dev->pdev, 0);
>   	int i, err, queue_idx = 0;
>   
> -	if (irq < 0) {
> -		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> -		return irq;
> +	if (vm_dev->version == 3) {
> +		err = vm_find_vqs_msi(vdev, nvqs, vqs, callbacks,
> +				names, ctx, desc);
> +		if (!err)
> +			return 0;
>   	}
>   
> -	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> -			dev_name(&vdev->dev), vm_dev);
> +	err = vm_request_single_irq(vdev);
>   	if (err)
>   		return err;
>   
> @@ -493,6 +620,73 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   	return 0;
>   }
>   
> +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +	struct device *dev = desc->dev;
> +	struct virtio_device *vdev = dev_to_virtio(dev);
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	void __iomem *pos = vm_dev->base;
> +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
> +			desc->platform.msi_index);
> +
> +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
> +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
> +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
> +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
> +}
> +
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq, err;
> +	u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +
> +	if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) == 0)
> +		return -EINVAL;
> +
> +	vm_dev->vm_vq_names = kmalloc_array(nirqs, sizeof(*vm_dev->vm_vq_names),
> +			GFP_KERNEL);
> +	if (!vm_dev->vm_vq_names)
> +		return -ENOMEM;
> +
> +	if (!vdev->dev.msi_domain)
> +		vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
> +	err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
> +			mmio_write_msi_msg);


Can platform_msi_domain_alloc_irqs check msi_domain vs NULL?


> +	if (err)
> +		goto error_free_mem;
> +
> +	/* The first MSI vector is used for configuration change events. */
> +	snprintf(vm_dev->vm_vq_names[0], sizeof(*vm_dev->vm_vq_names),
> +			"%s-config", dev_name(&vdev->dev));
> +	irq = vm_msi_irq_vector(&vdev->dev, 0);
> +	err = request_irq(irq, vm_config_changed, 0, vm_dev->vm_vq_names[0],
> +			vm_dev);
> +	if (err)
> +		goto error_free_mem;
> +
> +	return 0;
> +
> +error_free_mem:
> +	kfree(vm_dev->vm_vq_names);
> +	vm_dev->vm_vq_names = NULL;
> +	return err;
> +}
> +
> +static void vm_free_msi_irqs(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->msi_enabled) {
> +		mmio_msi_set_enable(vm_dev, 0);
> +		free_irq(vm_msi_irq_vector(&vdev->dev, 0), vm_dev);
> +		platform_msi_domain_free_irqs(&vdev->dev);
> +		kfree(vm_dev->vm_vq_names);
> +		vm_dev->vm_vq_names = NULL;
> +		vm_dev->msi_enabled = false;
> +	}
> +}
> +
>   static const char *vm_bus_name(struct virtio_device *vdev)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> @@ -567,7 +761,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>   
>   	/* Check device version */
>   	vm_dev->version = readl(vm_dev->base + VIRTIO_MMIO_VERSION);
> -	if (vm_dev->version < 1 || vm_dev->version > 2) {
> +	if (vm_dev->version < 1 || vm_dev->version > 3) {
>   		dev_err(&pdev->dev, "Version %ld not supported!\n",
>   				vm_dev->version);
>   		return -ENXIO;
> @@ -603,6 +797,12 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>   		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");
>   
>   	platform_set_drvdata(pdev, vm_dev);
> +	if (vm_dev->version == 3) {
> +		u32 db = readl(vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
> +		vm_dev->doorbell_base = db & 0xffff;
> +		vm_dev->doorbell_scale = (db >> 16) & 0xffff;
> +	}
>   
>   	rc = register_virtio_device(&vm_dev->vdev);
>   	if (rc)
> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct platform_device *pdev)
>   	return 0;
>   }
>   
> -
> -


Unnecessary changes.


>   /* Devices list parameter */
>   
>   #if defined(CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES)
> diff --git a/include/uapi/linux/virtio_mmio.h b/include/uapi/linux/virtio_mmio.h
> index c4b0968..148895e 100644
> --- a/include/uapi/linux/virtio_mmio.h
> +++ b/include/uapi/linux/virtio_mmio.h
> @@ -122,6 +122,34 @@
>   #define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
>   #define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
>   
> +/* MSI related registers */
> +
> +/* MSI status register */
> +#define VIRTIO_MMIO_MSI_STATUS		0x0c0
> +/* MSI command register */
> +#define VIRTIO_MMIO_MSI_COMMAND		0x0c2
> +/* MSI low 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW	0x0c4
> +/* MSI high 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH	0x0c8
> +/* MSI data */
> +#define VIRTIO_MMIO_MSI_DATA		0x0cc


I wonder what's the advantage of using registers instead of memory 
mapped tables as PCI did. Is this because MMIO doesn't have capability 
list which makes it hard to be extended if we have variable size of tables?


> +
> +/* RO: MSI feature enabled mask */
> +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000
> +/* RO: Maximum queue size available */
> +#define VIRTIO_MMIO_MSI_STATUS_QMASK	0x07ff
> +/* Reserved */
> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED	0x7800
> +
> +#define VIRTIO_MMIO_MSI_CMD_UPDATE	0x1


I believe we need a command to read the number of vectors supported by 
the device, or 2048 is assumed to be a fixed size here?

Thanks


> +#define VIRTIO_MMIO_MSI_CMD_OP_MASK	0xf000
> +#define VIRTIO_MMIO_MSI_CMD_ARG_MASK	0x0fff
> +#define VIRTIO_MMIO_MSI_CMD(op, arg)	((op) << 12 | (arg))
> +#define VIRITO_MMIO_MSI_CMD_ARG(cmd)	((cmd) & VIRTIO_MMIO_MSI_CMD_ARG_MASK)
> +#define VIRTIO_MMIO_MSI_CMD_OP(cmd)	\
> +		(((cmd) & VIRTIO_MMIO_MSI_CMD_OP_MASK) >> 12)
> +
>   /* Configuration atomicity value */
>   #define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
>   


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2019-12-26  8:40     ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2019-12-26  8:40 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
>
> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> virtio over mmio devices as a lightweight machine model for modern
> cloud. The standard virtio over MMIO transport layer only supports one
> legacy interrupt, which is much heavier than virtio over PCI transport
> layer using MSI. Legacy interrupt has long work path and causes specific
> VMExits in following cases, which would considerably slow down the
> performance:
>
> 1) read interrupt status register
> 2) update interrupt status register
> 3) write IOAPIC EOI register
>
> We proposed to update virtio over MMIO to version 3[1] to add the
> following new features and enhance the performance.
>
> 1) Support Message Signaled Interrupt(MSI), which increases the
>     interrupt performance for virtio multi-queue devices
> 2) Support per-queue doorbell, so the guest kernel may directly write
>     to the doorbells provided by virtio devices.
>
> The following is the network tcp_rr performance testing report, tested
> with virtio-pci device, vanilla virtio-mmio device and patched
> virtio-mmio device (run test 3 times for each case):
>
> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>
> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> 	trans/s	    9536	6939		9500
> 	trans/s	    9734	7029		9749
> 	trans/s	    9894	7095		9318
>
> [1] https://lkml.org/lkml/2019/12/20/113
>
> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
> ---
>   drivers/virtio/virtio_mmio.c     | 226 ++++++++++++++++++++++++++++++++++++---
>   include/uapi/linux/virtio_mmio.h |  28 +++++
>   2 files changed, 240 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index e09edb5..065c083 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -68,8 +68,8 @@
>   #include <linux/virtio_config.h>
>   #include <uapi/linux/virtio_mmio.h>
>   #include <linux/virtio_ring.h>
> -
> -
> +#include <linux/msi.h>
> +#include <asm/irqdomain.h>
>   
>   /* The alignment to use between consumer and producer parts of vring.
>    * Currently hardcoded to the page size. */
> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>   	/* a list of queues so we can dispatch IRQs */
>   	spinlock_t lock;
>   	struct list_head virtqueues;
> +
> +	int doorbell_base;
> +	int doorbell_scale;


It's better to use the terminology defined by spec, e.g 
notify_base/notify_multiplexer etc.

And we usually use unsigned type for offset.


> +	bool msi_enabled;
> +	/* Name strings for interrupts. */
> +	char (*vm_vq_names)[256];
>   };
>   
>   struct virtio_mmio_vq_info {
> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>   };
>   
>   
> +static void vm_free_msi_irqs(struct virtio_device *vdev);
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs);
>   
>   /* Configuration interface */
>   
> @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
>   
> +	if (vm_dev->version == 3) {
> +		int offset = vm_dev->doorbell_base +
> +			     vm_dev->doorbell_scale * vq->index;
> +		writel(vq->index, vm_dev->base + offset);


In virtio-pci we store the doorbell address in vq->priv to avoid doing 
multiplication in fast path.


> +	} else
>   	/* We write the queue's selector into the notification register to
>   	 * signal the other end */
> -	writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +		writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
>   	return true;
>   }
>   
> +/* Handle a configuration change */
> +static irqreturn_t vm_config_changed(int irq, void *opaque)
> +{
> +	struct virtio_mmio_device *vm_dev = opaque;
> +
> +	virtio_config_changed(&vm_dev->vdev);
> +
> +	return IRQ_HANDLED;
> +}
> +
>   /* Notify all virtqueues on an interrupt. */
>   static irqreturn_t vm_interrupt(int irq, void *opaque)
>   {
> @@ -307,7 +331,12 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
>   	return ret;
>   }
>   
> +static int vm_msi_irq_vector(struct device *dev, unsigned int nr)
> +{
> +	struct msi_desc *entry = first_msi_entry(dev);
>   
> +	return entry->irq + nr;
> +}
>   
>   static void vm_del_vq(struct virtqueue *vq)
>   {
> @@ -316,6 +345,12 @@ static void vm_del_vq(struct virtqueue *vq)
>   	unsigned long flags;
>   	unsigned int index = vq->index;
>   
> +	if (vm_dev->version == 3 && vm_dev->msi_enabled) {
> +		int irq = vm_msi_irq_vector(&vq->vdev->dev, index + 1);
> +
> +		free_irq(irq, vq);
> +	}
> +
>   	spin_lock_irqsave(&vm_dev->lock, flags);
>   	list_del(&info->node);
>   	spin_unlock_irqrestore(&vm_dev->lock, flags);
> @@ -334,15 +369,41 @@ static void vm_del_vq(struct virtqueue *vq)
>   	kfree(info);
>   }
>   
> -static void vm_del_vqs(struct virtio_device *vdev)
> +static void vm_free_irqs(struct virtio_device *vdev)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->version == 3)
> +		vm_free_msi_irqs(vdev);
> +	else
> +		free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +}
> +
> +static void vm_del_vqs(struct virtio_device *vdev)
> +{
>   	struct virtqueue *vq, *n;
>   
>   	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
>   		vm_del_vq(vq);
>   
> -	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +	vm_free_irqs(vdev);
> +}
> +
> +static int vm_request_single_irq(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq = platform_get_irq(vm_dev->pdev, 0);
> +	int err;
> +
> +	if (irq < 0) {
> +		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> +		return irq;
> +	}
> +
> +	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> +			dev_name(&vdev->dev), vm_dev);
> +
> +	return err;
>   }
>   
>   static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
> @@ -455,6 +516,72 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
>   	return ERR_PTR(err);
>   }
>   
> +static inline void mmio_msi_set_enable(struct virtio_mmio_device *vm_dev,
> +		int enable)
> +{
> +	u16 status;
> +
> +	status = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +	if (enable)
> +		status |= VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	else
> +		status &= ~VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	writew(status, vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +}
> +
> +static int vm_find_vqs_msi(struct virtio_device *vdev, unsigned int nvqs,
> +			struct virtqueue *vqs[], vq_callback_t *callbacks[],
> +			const char * const names[], const bool *ctx,
> +			struct irq_affinity *desc)
> +{
> +	int i, err, irq;
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	/* Allocate nvqs irqs for queues and one irq for configuration */
> +	err = vm_request_msi_vectors(vdev, nvqs + 1);
> +	if (err != 0)
> +		return err;
> +
> +	for (i = 0; i < nvqs; i++) {
> +		int index = i + 1;
> +
> +		if (!names[i]) {
> +			vqs[i] = NULL;
> +			continue;
> +		}
> +		vqs[i] = vm_setup_vq(vdev, i, callbacks[i], names[i],
> +				ctx ? ctx[i] : false);
> +		if (IS_ERR(vqs[i])) {
> +			err = PTR_ERR(vqs[i]);
> +			goto error_find;
> +		}
> +
> +		if (!callbacks[i])
> +			continue;
> +
> +		irq = vm_msi_irq_vector(&vqs[i]->vdev->dev, index);
> +		/* allocate per-vq irq if available and necessary */
> +		snprintf(vm_dev->vm_vq_names[index],
> +			sizeof(*vm_dev->vm_vq_names),
> +			"%s-%s",
> +			dev_name(&vm_dev->vdev.dev), names[i]);
> +		err = request_irq(irq, vring_interrupt, 0,
> +				vm_dev->vm_vq_names[index], vqs[i]);
> +
> +		if (err)
> +			goto error_find;
> +	}
> +
> +	mmio_msi_set_enable(vm_dev, 1);
> +	vm_dev->msi_enabled = true;
> +
> +	return 0;
> +
> +error_find:
> +	vm_del_vqs(vdev);
> +	return err;
> +}
> +
>   static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   		       struct virtqueue *vqs[],
>   		       vq_callback_t *callbacks[],
> @@ -463,16 +590,16 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   		       struct irq_affinity *desc)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> -	int irq = platform_get_irq(vm_dev->pdev, 0);
>   	int i, err, queue_idx = 0;
>   
> -	if (irq < 0) {
> -		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> -		return irq;
> +	if (vm_dev->version == 3) {
> +		err = vm_find_vqs_msi(vdev, nvqs, vqs, callbacks,
> +				names, ctx, desc);
> +		if (!err)
> +			return 0;
>   	}
>   
> -	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> -			dev_name(&vdev->dev), vm_dev);
> +	err = vm_request_single_irq(vdev);
>   	if (err)
>   		return err;
>   
> @@ -493,6 +620,73 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>   	return 0;
>   }
>   
> +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +	struct device *dev = desc->dev;
> +	struct virtio_device *vdev = dev_to_virtio(dev);
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	void __iomem *pos = vm_dev->base;
> +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
> +			desc->platform.msi_index);
> +
> +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
> +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
> +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
> +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
> +}
> +
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq, err;
> +	u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +
> +	if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) == 0)
> +		return -EINVAL;
> +
> +	vm_dev->vm_vq_names = kmalloc_array(nirqs, sizeof(*vm_dev->vm_vq_names),
> +			GFP_KERNEL);
> +	if (!vm_dev->vm_vq_names)
> +		return -ENOMEM;
> +
> +	if (!vdev->dev.msi_domain)
> +		vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
> +	err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
> +			mmio_write_msi_msg);


Can platform_msi_domain_alloc_irqs check msi_domain vs NULL?


> +	if (err)
> +		goto error_free_mem;
> +
> +	/* The first MSI vector is used for configuration change events. */
> +	snprintf(vm_dev->vm_vq_names[0], sizeof(*vm_dev->vm_vq_names),
> +			"%s-config", dev_name(&vdev->dev));
> +	irq = vm_msi_irq_vector(&vdev->dev, 0);
> +	err = request_irq(irq, vm_config_changed, 0, vm_dev->vm_vq_names[0],
> +			vm_dev);
> +	if (err)
> +		goto error_free_mem;
> +
> +	return 0;
> +
> +error_free_mem:
> +	kfree(vm_dev->vm_vq_names);
> +	vm_dev->vm_vq_names = NULL;
> +	return err;
> +}
> +
> +static void vm_free_msi_irqs(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->msi_enabled) {
> +		mmio_msi_set_enable(vm_dev, 0);
> +		free_irq(vm_msi_irq_vector(&vdev->dev, 0), vm_dev);
> +		platform_msi_domain_free_irqs(&vdev->dev);
> +		kfree(vm_dev->vm_vq_names);
> +		vm_dev->vm_vq_names = NULL;
> +		vm_dev->msi_enabled = false;
> +	}
> +}
> +
>   static const char *vm_bus_name(struct virtio_device *vdev)
>   {
>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> @@ -567,7 +761,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>   
>   	/* Check device version */
>   	vm_dev->version = readl(vm_dev->base + VIRTIO_MMIO_VERSION);
> -	if (vm_dev->version < 1 || vm_dev->version > 2) {
> +	if (vm_dev->version < 1 || vm_dev->version > 3) {
>   		dev_err(&pdev->dev, "Version %ld not supported!\n",
>   				vm_dev->version);
>   		return -ENXIO;
> @@ -603,6 +797,12 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>   		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");
>   
>   	platform_set_drvdata(pdev, vm_dev);
> +	if (vm_dev->version == 3) {
> +		u32 db = readl(vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
> +		vm_dev->doorbell_base = db & 0xffff;
> +		vm_dev->doorbell_scale = (db >> 16) & 0xffff;
> +	}
>   
>   	rc = register_virtio_device(&vm_dev->vdev);
>   	if (rc)
> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct platform_device *pdev)
>   	return 0;
>   }
>   
> -
> -


Unnecessary changes.


>   /* Devices list parameter */
>   
>   #if defined(CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES)
> diff --git a/include/uapi/linux/virtio_mmio.h b/include/uapi/linux/virtio_mmio.h
> index c4b0968..148895e 100644
> --- a/include/uapi/linux/virtio_mmio.h
> +++ b/include/uapi/linux/virtio_mmio.h
> @@ -122,6 +122,34 @@
>   #define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
>   #define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
>   
> +/* MSI related registers */
> +
> +/* MSI status register */
> +#define VIRTIO_MMIO_MSI_STATUS		0x0c0
> +/* MSI command register */
> +#define VIRTIO_MMIO_MSI_COMMAND		0x0c2
> +/* MSI low 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW	0x0c4
> +/* MSI high 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH	0x0c8
> +/* MSI data */
> +#define VIRTIO_MMIO_MSI_DATA		0x0cc


I wonder what's the advantage of using registers instead of memory 
mapped tables as PCI did. Is this because MMIO doesn't have capability 
list which makes it hard to be extended if we have variable size of tables?


> +
> +/* RO: MSI feature enabled mask */
> +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000
> +/* RO: Maximum queue size available */
> +#define VIRTIO_MMIO_MSI_STATUS_QMASK	0x07ff
> +/* Reserved */
> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED	0x7800
> +
> +#define VIRTIO_MMIO_MSI_CMD_UPDATE	0x1


I believe we need a command to read the number of vectors supported by 
the device, or 2048 is assumed to be a fixed size here?

Thanks


> +#define VIRTIO_MMIO_MSI_CMD_OP_MASK	0xf000
> +#define VIRTIO_MMIO_MSI_CMD_ARG_MASK	0x0fff
> +#define VIRTIO_MMIO_MSI_CMD(op, arg)	((op) << 12 | (arg))
> +#define VIRITO_MMIO_MSI_CMD_ARG(cmd)	((cmd) & VIRTIO_MMIO_MSI_CMD_ARG_MASK)
> +#define VIRTIO_MMIO_MSI_CMD_OP(cmd)	\
> +		(((cmd) & VIRTIO_MMIO_MSI_CMD_OP_MASK) >> 12)
> +
>   /* Configuration atomicity value */
>   #define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-26  8:09         ` [virtio-dev] " Jason Wang
  (?)
@ 2019-12-26 12:35         ` Liu, Jiang
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jiang @ 2019-12-26 12:35 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zha Bin, linux-kernel, mst, Sergio Lopez, virtio-dev, jing2.liu,
	chao.p.peng



> On Dec 26, 2019, at 4:09 PM, Jason Wang <jasowang@redhat.com> wrote:
> 
> 
> On 2019/12/25 下午11:20, Liu, Jiang wrote:
>> 
>>> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
>>> 
>>> 
>>> On 2019/12/25 上午10:50, Zha Bin wrote:
>>>> From: Liu Jiang <gerry@linux.alibaba.com>
>>>> 
>>>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>>>> virtio over mmio devices as a lightweight machine model for modern
>>>> cloud. The standard virtio over MMIO transport layer only supports one
>>>> legacy interrupt, which is much heavier than virtio over PCI transport
>>>> layer using MSI. Legacy interrupt has long work path and causes specific
>>>> VMExits in following cases, which would considerably slow down the
>>>> performance:
>>>> 
>>>> 1) read interrupt status register
>>>> 2) update interrupt status register
>>>> 3) write IOAPIC EOI register
>>>> 
>>>> We proposed to update virtio over MMIO to version 3[1] to add the
>>>> following new features and enhance the performance.
>>>> 
>>>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>>>    interrupt performance for virtio multi-queue devices
>>>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>>>    to the doorbells provided by virtio devices.
>>>> 
>>>> The following is the network tcp_rr performance testing report, tested
>>>> with virtio-pci device, vanilla virtio-mmio device and patched
>>>> virtio-mmio device (run test 3 times for each case):
>>>> 
>>>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>>> 
>>>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>>>> 	trans/s	    9536	6939		9500
>>>> 	trans/s	    9734	7029		9749
>>>> 	trans/s	    9894	7095		9318
>>>> 
>>>> [1] https://lkml.org/lkml/2019/12/20/113
>>> 
>>> Thanks for the patch. Two questions after a quick glance:
>>> 
>>> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
>> Hi Jason,
>> 	Thanks for reviewing patches on Christmas Day:)
>> 	The PCI MSI-x has several advantages over PCI MSI, mainly
>> 1) support 2048 vectors, much more than 32 vectors supported by MSI.
>> 2) dedicated address/data for each vector,
>> 3) per vector mask/pending bits.
>> The proposed MMIO MSI extension supports both 1) and 2),
> 
> 
> Aha right, I mis-read the patch. But more questions comes:
> 
> 1) The association between vq and MSI-X vector is fixed. This means it can't work for a device that have more than 2047 queues. We probably need something similar to virtio-pci to allow a dynamic association.
> 2) The mask and unmask control is missed
> 
> 
>>  but the extension doesn’t support 3) because
>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
> 
> 
> Not directly used but masking/unmasking is widely used in irq subsystem which allows lots of optimizations.
> 
> 
>> 
>> On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
>> will cause some complexity to VMM implementations.
> 
> 
> I agree to simplify VMM implementation, but it looks to me introducing masking/pending won't cost too much code in the VMM implementation. Just new type of command for VIRTIO_MMIO_MSI_COMMAND.
> 
> Thanks
> 
> 
>> 
>>> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
>> Great suggestion, we will try to split the patch.
>> 
>> Thanks,
>> Gerry
>> 
>>> Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-26  8:09         ` [virtio-dev] " Jason Wang
  (?)
  (?)
@ 2019-12-26 13:16         ` Liu, Jiang
  2020-01-02  6:28             ` [virtio-dev] " Jason Wang
  2020-01-05 10:42             ` [virtio-dev] " Michael S. Tsirkin
  -1 siblings, 2 replies; 60+ messages in thread
From: Liu, Jiang @ 2019-12-26 13:16 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zha Bin, linux-kernel, mst, slp, virtio-dev, jing2.liu, chao.p.peng



> On Dec 26, 2019, at 4:09 PM, Jason Wang <jasowang@redhat.com> wrote:
> 
> 
> On 2019/12/25 下午11:20, Liu, Jiang wrote:
>> 
>>> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
>>> 
>>> 
>>> On 2019/12/25 上午10:50, Zha Bin wrote:
>>>> From: Liu Jiang <gerry@linux.alibaba.com>
>>>> 
>>>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>>>> virtio over mmio devices as a lightweight machine model for modern
>>>> cloud. The standard virtio over MMIO transport layer only supports one
>>>> legacy interrupt, which is much heavier than virtio over PCI transport
>>>> layer using MSI. Legacy interrupt has long work path and causes specific
>>>> VMExits in following cases, which would considerably slow down the
>>>> performance:
>>>> 
>>>> 1) read interrupt status register
>>>> 2) update interrupt status register
>>>> 3) write IOAPIC EOI register
>>>> 
>>>> We proposed to update virtio over MMIO to version 3[1] to add the
>>>> following new features and enhance the performance.
>>>> 
>>>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>>>    interrupt performance for virtio multi-queue devices
>>>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>>>    to the doorbells provided by virtio devices.
>>>> 
>>>> The following is the network tcp_rr performance testing report, tested
>>>> with virtio-pci device, vanilla virtio-mmio device and patched
>>>> virtio-mmio device (run test 3 times for each case):
>>>> 
>>>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>>> 
>>>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>>>> 	trans/s	    9536	6939		9500
>>>> 	trans/s	    9734	7029		9749
>>>> 	trans/s	    9894	7095		9318
>>>> 
>>>> [1] https://lkml.org/lkml/2019/12/20/113
>>> 
>>> Thanks for the patch. Two questions after a quick glance:
>>> 
>>> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
>> Hi Jason,
>> 	Thanks for reviewing patches on Christmas Day:)
>> 	The PCI MSI-x has several advantages over PCI MSI, mainly
>> 1) support 2048 vectors, much more than 32 vectors supported by MSI.
>> 2) dedicated address/data for each vector,
>> 3) per vector mask/pending bits.
>> The proposed MMIO MSI extension supports both 1) and 2),
> 
> 
> Aha right, I mis-read the patch. But more questions comes:
> 
> 1) The association between vq and MSI-X vector is fixed. This means it can't work for a device that have more than 2047 queues. We probably need something similar to virtio-pci to allow a dynamic association.
We have considered both the PCI MSI-x like dynamic association design and fix mapping design.
The fix mapping design simplifies both the interrupt configuration process and VMM implementations. 
And the virtio mmio transport layer is mainly used by light VMMs to support small scale virtual machines,  2048 vectors should be enough for these usage cases.
So the fix mapping design has been used.

> 2) The mask and unmask control is missed
> 
> 
>>  but the extension doesn’t support 3) because
>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
> 
> 
> Not directly used but masking/unmasking is widely used in irq subsystem which allows lots of optimizations.
> 
> 
>> 
>> On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
>> will cause some complexity to VMM implementations.
> 
> 
> I agree to simplify VMM implementation, but it looks to me introducing masking/pending won't cost too much code in the VMM implementation. Just new type of command for VIRTIO_MMIO_MSI_COMMAND.
We want to make VMM implementations as simple as possible:) 
And based on following observations, we have disabled support of mask/unmask,
1) MSI is edge triggering, which means it won’t be shared with other interrupt sources, so masking/unmasking won’t be used for normal interrupt management logic.
2) Linux virtio mmio transport layer doesn’t support  suspend/resume yet, so there’s no need to quiesce the device by masking interrupts.
3) The legacy PCI 2.2 devices doesn’t support irq masking/unmasking, so irq masking/unmasking may be optional operations.
So we skipped support of irq masking/unmasking. We will recheck whether irq masking/unmasking is mandatory for MMIO devices.
On the other hand, we may enhance the spec to define command codes for masking/unmasking, and VMM may optionally support masking/unmasking.

Thanks,
Gerry

> 
> Thanks
> 
> 
>> 
>>> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
>> Great suggestion, we will try to split the patch.
>> 
>> Thanks,
>> Gerry
>> 
>>> Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-26  8:40     ` [virtio-dev] " Jason Wang
@ 2019-12-27  9:37       ` Liu, Jing2
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2019-12-27  9:37 UTC (permalink / raw)
  To: Jason Wang, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng

Hi Jason,

Thanks for reviewing the patches!

  [...]
>> -
>> +#include <linux/msi.h>
>> +#include <asm/irqdomain.h>
>>     /* The alignment to use between consumer and producer parts of 
>> vring.
>>    * Currently hardcoded to the page size. */
>> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>>       /* a list of queues so we can dispatch IRQs */
>>       spinlock_t lock;
>>       struct list_head virtqueues;
>> +
>> +    int doorbell_base;
>> +    int doorbell_scale;
>
>
> It's better to use the terminology defined by spec, e.g 
> notify_base/notify_multiplexer etc.
>
> And we usually use unsigned type for offset.

We will fix this in next version.


>
>
>> +    bool msi_enabled;
>> +    /* Name strings for interrupts. */
>> +    char (*vm_vq_names)[256];
>>   };
>>     struct virtio_mmio_vq_info {
>> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>>   };
>>     +static void vm_free_msi_irqs(struct virtio_device *vdev);
>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>> nirqs);
>>     /* Configuration interface */
>>   @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>>   {
>>       struct virtio_mmio_device *vm_dev = 
>> to_virtio_mmio_device(vq->vdev);
>>   +    if (vm_dev->version == 3) {
>> +        int offset = vm_dev->doorbell_base +
>> +                 vm_dev->doorbell_scale * vq->index;
>> +        writel(vq->index, vm_dev->base + offset);
>
>
> In virtio-pci we store the doorbell address in vq->priv to avoid doing 
> multiplication in fast path.

Good suggestion. We will fix this in next version.

[...]

>> +
>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>> nirqs)
>> +{
>> +    struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> +    int irq, err;
>> +    u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
>> +
>> +    if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) 
>> == 0)
>> +        return -EINVAL;
>> +
>> +    vm_dev->vm_vq_names = kmalloc_array(nirqs, 
>> sizeof(*vm_dev->vm_vq_names),
>> +            GFP_KERNEL);
>> +    if (!vm_dev->vm_vq_names)
>> +        return -ENOMEM;
>> +
>> +    if (!vdev->dev.msi_domain)
>> +        vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
>> +    err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
>> +            mmio_write_msi_msg);
>
>
> Can platform_msi_domain_alloc_irqs check msi_domain vs NULL?
>
Actually here, we need to firstly consider the cases that @dev doesn't 
have @msi_domain,

according to the report by lkp.

For the platform_msi_domain_alloc_irqs, it can help check inside.


> [...]
>>         rc = register_virtio_device(&vm_dev->vdev);
>>       if (rc)
>> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct 
>> platform_device *pdev)
>>       return 0;
>>   }
>>   -
>> -
>
>
> Unnecessary changes.

Got it. Will remove it later.


> [...]
>>   +/* MSI related registers */
>> +
>> +/* MSI status register */
>> +#define VIRTIO_MMIO_MSI_STATUS        0x0c0
>> +/* MSI command register */
>> +#define VIRTIO_MMIO_MSI_COMMAND        0x0c2
>> +/* MSI low 32 bit address, 64 bits in two halves */
>> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW    0x0c4
>> +/* MSI high 32 bit address, 64 bits in two halves */
>> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH    0x0c8
>> +/* MSI data */
>> +#define VIRTIO_MMIO_MSI_DATA        0x0cc
>
>
> I wonder what's the advantage of using registers instead of memory 
> mapped tables as PCI did. Is this because MMIO doesn't have capability 
> list which makes it hard to be extended if we have variable size of 
> tables?
>
Yes, mmio doesn't have capability which limits the extension.

It need some registers to specify the msi table/mask table/pending table 
offsets if using what pci did.

As comments previously, mask/pending can be easily extended by MSI command.

>
>> +
>> +/* RO: MSI feature enabled mask */
>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>> +/* RO: Maximum queue size available */
>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>> +/* Reserved */
>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>> +
>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>
>
> I believe we need a command to read the number of vectors supported by 
> the device, or 2048 is assumed to be a fixed size here?

For not bringing much complexity, we proposed vector per queue and fixed 
relationship between events and vectors.

So the number of vectors supported by device is equal to the total 
number of vqs and config.

We will try to explicitly highlight this point in spec for later version.


Thanks!

Jing

>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2019-12-27  9:37       ` Liu, Jing2
  0 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2019-12-27  9:37 UTC (permalink / raw)
  To: Jason Wang, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng

Hi Jason,

Thanks for reviewing the patches!

  [...]
>> -
>> +#include <linux/msi.h>
>> +#include <asm/irqdomain.h>
>>     /* The alignment to use between consumer and producer parts of 
>> vring.
>>    * Currently hardcoded to the page size. */
>> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>>       /* a list of queues so we can dispatch IRQs */
>>       spinlock_t lock;
>>       struct list_head virtqueues;
>> +
>> +    int doorbell_base;
>> +    int doorbell_scale;
>
>
> It's better to use the terminology defined by spec, e.g 
> notify_base/notify_multiplexer etc.
>
> And we usually use unsigned type for offset.

We will fix this in next version.


>
>
>> +    bool msi_enabled;
>> +    /* Name strings for interrupts. */
>> +    char (*vm_vq_names)[256];
>>   };
>>     struct virtio_mmio_vq_info {
>> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>>   };
>>     +static void vm_free_msi_irqs(struct virtio_device *vdev);
>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>> nirqs);
>>     /* Configuration interface */
>>   @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>>   {
>>       struct virtio_mmio_device *vm_dev = 
>> to_virtio_mmio_device(vq->vdev);
>>   +    if (vm_dev->version == 3) {
>> +        int offset = vm_dev->doorbell_base +
>> +                 vm_dev->doorbell_scale * vq->index;
>> +        writel(vq->index, vm_dev->base + offset);
>
>
> In virtio-pci we store the doorbell address in vq->priv to avoid doing 
> multiplication in fast path.

Good suggestion. We will fix this in next version.

[...]

>> +
>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>> nirqs)
>> +{
>> +    struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> +    int irq, err;
>> +    u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
>> +
>> +    if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) 
>> == 0)
>> +        return -EINVAL;
>> +
>> +    vm_dev->vm_vq_names = kmalloc_array(nirqs, 
>> sizeof(*vm_dev->vm_vq_names),
>> +            GFP_KERNEL);
>> +    if (!vm_dev->vm_vq_names)
>> +        return -ENOMEM;
>> +
>> +    if (!vdev->dev.msi_domain)
>> +        vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
>> +    err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
>> +            mmio_write_msi_msg);
>
>
> Can platform_msi_domain_alloc_irqs check msi_domain vs NULL?
>
Actually here, we need to firstly consider the cases that @dev doesn't 
have @msi_domain,

according to the report by lkp.

For the platform_msi_domain_alloc_irqs, it can help check inside.


> [...]
>>         rc = register_virtio_device(&vm_dev->vdev);
>>       if (rc)
>> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct 
>> platform_device *pdev)
>>       return 0;
>>   }
>>   -
>> -
>
>
> Unnecessary changes.

Got it. Will remove it later.


> [...]
>>   +/* MSI related registers */
>> +
>> +/* MSI status register */
>> +#define VIRTIO_MMIO_MSI_STATUS        0x0c0
>> +/* MSI command register */
>> +#define VIRTIO_MMIO_MSI_COMMAND        0x0c2
>> +/* MSI low 32 bit address, 64 bits in two halves */
>> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW    0x0c4
>> +/* MSI high 32 bit address, 64 bits in two halves */
>> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH    0x0c8
>> +/* MSI data */
>> +#define VIRTIO_MMIO_MSI_DATA        0x0cc
>
>
> I wonder what's the advantage of using registers instead of memory 
> mapped tables as PCI did. Is this because MMIO doesn't have capability 
> list which makes it hard to be extended if we have variable size of 
> tables?
>
Yes, mmio doesn't have capability which limits the extension.

It need some registers to specify the msi table/mask table/pending table 
offsets if using what pci did.

As comments previously, mask/pending can be easily extended by MSI command.

>
>> +
>> +/* RO: MSI feature enabled mask */
>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>> +/* RO: Maximum queue size available */
>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>> +/* Reserved */
>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>> +
>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>
>
> I believe we need a command to read the number of vectors supported by 
> the device, or 2048 is assumed to be a fixed size here?

For not bringing much complexity, we proposed vector per queue and fixed 
relationship between events and vectors.

So the number of vectors supported by device is equal to the total 
number of vqs and config.

We will try to explicitly highlight this point in spec for later version.


Thanks!

Jing

>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-26 13:16         ` Liu, Jiang
@ 2020-01-02  6:28             ` Jason Wang
  2020-01-05 10:42             ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-02  6:28 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Zha Bin, linux-kernel, mst, slp, virtio-dev, jing2.liu, chao.p.peng


On 2019/12/26 下午9:16, Liu, Jiang wrote:
>
>> On Dec 26, 2019, at 4:09 PM, Jason Wang <jasowang@redhat.com> wrote:
>>
>>
>> On 2019/12/25 下午11:20, Liu, Jiang wrote:
>>>> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
>>>>
>>>>
>>>> On 2019/12/25 上午10:50, Zha Bin wrote:
>>>>> From: Liu Jiang <gerry@linux.alibaba.com>
>>>>>
>>>>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>>>>> virtio over mmio devices as a lightweight machine model for modern
>>>>> cloud. The standard virtio over MMIO transport layer only supports one
>>>>> legacy interrupt, which is much heavier than virtio over PCI transport
>>>>> layer using MSI. Legacy interrupt has long work path and causes specific
>>>>> VMExits in following cases, which would considerably slow down the
>>>>> performance:
>>>>>
>>>>> 1) read interrupt status register
>>>>> 2) update interrupt status register
>>>>> 3) write IOAPIC EOI register
>>>>>
>>>>> We proposed to update virtio over MMIO to version 3[1] to add the
>>>>> following new features and enhance the performance.
>>>>>
>>>>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>>>>     interrupt performance for virtio multi-queue devices
>>>>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>>>>     to the doorbells provided by virtio devices.
>>>>>
>>>>> The following is the network tcp_rr performance testing report, tested
>>>>> with virtio-pci device, vanilla virtio-mmio device and patched
>>>>> virtio-mmio device (run test 3 times for each case):
>>>>>
>>>>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>>>>
>>>>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>>>>> 	trans/s	    9536	6939		9500
>>>>> 	trans/s	    9734	7029		9749
>>>>> 	trans/s	    9894	7095		9318
>>>>>
>>>>> [1] https://lkml.org/lkml/2019/12/20/113
>>>> Thanks for the patch. Two questions after a quick glance:
>>>>
>>>> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
>>> Hi Jason,
>>> 	Thanks for reviewing patches on Christmas Day:)
>>> 	The PCI MSI-x has several advantages over PCI MSI, mainly
>>> 1) support 2048 vectors, much more than 32 vectors supported by MSI.
>>> 2) dedicated address/data for each vector,
>>> 3) per vector mask/pending bits.
>>> The proposed MMIO MSI extension supports both 1) and 2),
>>
>> Aha right, I mis-read the patch. But more questions comes:
>>
>> 1) The association between vq and MSI-X vector is fixed. This means it can't work for a device that have more than 2047 queues. We probably need something similar to virtio-pci to allow a dynamic association.
> We have considered both the PCI MSI-x like dynamic association design and fix mapping design.
> The fix mapping design simplifies both the interrupt configuration process and VMM implementations.


Well, for VMM just an indirection and for guest, it can choose to use 
fixed mapping, just need to program once during probe.


> And the virtio mmio transport layer is mainly used by light VMMs to support small scale virtual machines,


Let's not limit the interface to be used by a specific case :). 
Eliminating PCIE would be appealing for other scenarios.


>    2048 vectors should be enough for these usage cases.
> So the fix mapping design has been used.
>
>> 2) The mask and unmask control is missed
>>
>>
>>>   but the extension doesn’t support 3) because
>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
>>
>> Not directly used but masking/unmasking is widely used in irq subsystem which allows lots of optimizations.
>>
>>
>>> On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
>>> will cause some complexity to VMM implementations.
>>
>> I agree to simplify VMM implementation, but it looks to me introducing masking/pending won't cost too much code in the VMM implementation. Just new type of command for VIRTIO_MMIO_MSI_COMMAND.
> We want to make VMM implementations as simple as possible:)
> And based on following observations, we have disabled support of mask/unmask,
> 1) MSI is edge triggering, which means it won’t be shared with other interrupt sources,


Is this true? I think the spec does not forbid such usages, e.g using 
the same MSI address/command for different queues or devices?


> so masking/unmasking won’t be used for normal interrupt management logic.
> 2) Linux virtio mmio transport layer doesn’t support  suspend/resume yet, so there’s no need to quiesce the device by masking interrupts.


Yes, but it's a limitation only for virtio mmio transport. We can add it.


> 3) The legacy PCI 2.2 devices doesn’t support irq masking/unmasking, so irq masking/unmasking may be optional operations.


Yes, but as you said, it helps for performance and some other cases. I 
still prefer to implement that consider it is not complex. If we do MSI 
without masking/unmasking, I suspect we will implement MSI-X finally 
somedays then maintaining MSI will become a burden... (still takes 
virtio-pci as an example, it choose to implement MSI-X not MSI).


> So we skipped support of irq masking/unmasking. We will recheck whether irq masking/unmasking is mandatory for MMIO devices.
> On the other hand, we may enhance the spec to define command codes for masking/unmasking, and VMM may optionally support masking/unmasking.


Yes, thanks.


>
> Thanks,
> Gerry
>
>> Thanks
>>
>>
>>>> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
>>> Great suggestion, we will try to split the patch.
>>>
>>> Thanks,
>>> Gerry
>>>
>>>> Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-02  6:28             ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-02  6:28 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Zha Bin, linux-kernel, mst, slp, virtio-dev, jing2.liu, chao.p.peng


On 2019/12/26 下午9:16, Liu, Jiang wrote:
>
>> On Dec 26, 2019, at 4:09 PM, Jason Wang <jasowang@redhat.com> wrote:
>>
>>
>> On 2019/12/25 下午11:20, Liu, Jiang wrote:
>>>> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
>>>>
>>>>
>>>> On 2019/12/25 上午10:50, Zha Bin wrote:
>>>>> From: Liu Jiang <gerry@linux.alibaba.com>
>>>>>
>>>>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>>>>> virtio over mmio devices as a lightweight machine model for modern
>>>>> cloud. The standard virtio over MMIO transport layer only supports one
>>>>> legacy interrupt, which is much heavier than virtio over PCI transport
>>>>> layer using MSI. Legacy interrupt has long work path and causes specific
>>>>> VMExits in following cases, which would considerably slow down the
>>>>> performance:
>>>>>
>>>>> 1) read interrupt status register
>>>>> 2) update interrupt status register
>>>>> 3) write IOAPIC EOI register
>>>>>
>>>>> We proposed to update virtio over MMIO to version 3[1] to add the
>>>>> following new features and enhance the performance.
>>>>>
>>>>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>>>>     interrupt performance for virtio multi-queue devices
>>>>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>>>>     to the doorbells provided by virtio devices.
>>>>>
>>>>> The following is the network tcp_rr performance testing report, tested
>>>>> with virtio-pci device, vanilla virtio-mmio device and patched
>>>>> virtio-mmio device (run test 3 times for each case):
>>>>>
>>>>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>>>>
>>>>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>>>>> 	trans/s	    9536	6939		9500
>>>>> 	trans/s	    9734	7029		9749
>>>>> 	trans/s	    9894	7095		9318
>>>>>
>>>>> [1] https://lkml.org/lkml/2019/12/20/113
>>>> Thanks for the patch. Two questions after a quick glance:
>>>>
>>>> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
>>> Hi Jason,
>>> 	Thanks for reviewing patches on Christmas Day:)
>>> 	The PCI MSI-x has several advantages over PCI MSI, mainly
>>> 1) support 2048 vectors, much more than 32 vectors supported by MSI.
>>> 2) dedicated address/data for each vector,
>>> 3) per vector mask/pending bits.
>>> The proposed MMIO MSI extension supports both 1) and 2),
>>
>> Aha right, I mis-read the patch. But more questions comes:
>>
>> 1) The association between vq and MSI-X vector is fixed. This means it can't work for a device that have more than 2047 queues. We probably need something similar to virtio-pci to allow a dynamic association.
> We have considered both the PCI MSI-x like dynamic association design and fix mapping design.
> The fix mapping design simplifies both the interrupt configuration process and VMM implementations.


Well, for VMM just an indirection and for guest, it can choose to use 
fixed mapping, just need to program once during probe.


> And the virtio mmio transport layer is mainly used by light VMMs to support small scale virtual machines,


Let's not limit the interface to be used by a specific case :). 
Eliminating PCIE would be appealing for other scenarios.


>    2048 vectors should be enough for these usage cases.
> So the fix mapping design has been used.
>
>> 2) The mask and unmask control is missed
>>
>>
>>>   but the extension doesn’t support 3) because
>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
>>
>> Not directly used but masking/unmasking is widely used in irq subsystem which allows lots of optimizations.
>>
>>
>>> On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
>>> will cause some complexity to VMM implementations.
>>
>> I agree to simplify VMM implementation, but it looks to me introducing masking/pending won't cost too much code in the VMM implementation. Just new type of command for VIRTIO_MMIO_MSI_COMMAND.
> We want to make VMM implementations as simple as possible:)
> And based on following observations, we have disabled support of mask/unmask,
> 1) MSI is edge triggering, which means it won’t be shared with other interrupt sources,


Is this true? I think the spec does not forbid such usages, e.g using 
the same MSI address/command for different queues or devices?


> so masking/unmasking won’t be used for normal interrupt management logic.
> 2) Linux virtio mmio transport layer doesn’t support  suspend/resume yet, so there’s no need to quiesce the device by masking interrupts.


Yes, but it's a limitation only for virtio mmio transport. We can add it.


> 3) The legacy PCI 2.2 devices doesn’t support irq masking/unmasking, so irq masking/unmasking may be optional operations.


Yes, but as you said, it helps for performance and some other cases. I 
still prefer to implement that consider it is not complex. If we do MSI 
without masking/unmasking, I suspect we will implement MSI-X finally 
somedays then maintaining MSI will become a burden... (still takes 
virtio-pci as an example, it choose to implement MSI-X not MSI).


> So we skipped support of irq masking/unmasking. We will recheck whether irq masking/unmasking is mandatory for MMIO devices.
> On the other hand, we may enhance the spec to define command codes for masking/unmasking, and VMM may optionally support masking/unmasking.


Yes, thanks.


>
> Thanks,
> Gerry
>
>> Thanks
>>
>>
>>>> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
>>> Great suggestion, we will try to split the patch.
>>>
>>> Thanks,
>>> Gerry
>>>
>>>> Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-27  9:37       ` Liu, Jing2
@ 2020-01-02  6:33         ` Jason Wang
  -1 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-02  6:33 UTC (permalink / raw)
  To: Liu, Jing2, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/27 下午5:37, Liu, Jing2 wrote:
> Hi Jason,
>
> Thanks for reviewing the patches!
>
>  [...]
>>> -
>>> +#include <linux/msi.h>
>>> +#include <asm/irqdomain.h>
>>>     /* The alignment to use between consumer and producer parts of 
>>> vring.
>>>    * Currently hardcoded to the page size. */
>>> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>>>       /* a list of queues so we can dispatch IRQs */
>>>       spinlock_t lock;
>>>       struct list_head virtqueues;
>>> +
>>> +    int doorbell_base;
>>> +    int doorbell_scale;
>>
>>
>> It's better to use the terminology defined by spec, e.g 
>> notify_base/notify_multiplexer etc.
>>
>> And we usually use unsigned type for offset.
>
> We will fix this in next version.
>
>
>>
>>
>>> +    bool msi_enabled;
>>> +    /* Name strings for interrupts. */
>>> +    char (*vm_vq_names)[256];
>>>   };
>>>     struct virtio_mmio_vq_info {
>>> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>>>   };
>>>     +static void vm_free_msi_irqs(struct virtio_device *vdev);
>>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>>> nirqs);
>>>     /* Configuration interface */
>>>   @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>>>   {
>>>       struct virtio_mmio_device *vm_dev = 
>>> to_virtio_mmio_device(vq->vdev);
>>>   +    if (vm_dev->version == 3) {
>>> +        int offset = vm_dev->doorbell_base +
>>> +                 vm_dev->doorbell_scale * vq->index;
>>> +        writel(vq->index, vm_dev->base + offset);
>>
>>
>> In virtio-pci we store the doorbell address in vq->priv to avoid 
>> doing multiplication in fast path.
>
> Good suggestion. We will fix this in next version.
>
> [...]
>
>>> +
>>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>>> nirqs)
>>> +{
>>> +    struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>>> +    int irq, err;
>>> +    u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
>>> +
>>> +    if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) 
>>> == 0)
>>> +        return -EINVAL;
>>> +
>>> +    vm_dev->vm_vq_names = kmalloc_array(nirqs, 
>>> sizeof(*vm_dev->vm_vq_names),
>>> +            GFP_KERNEL);
>>> +    if (!vm_dev->vm_vq_names)
>>> +        return -ENOMEM;
>>> +
>>> +    if (!vdev->dev.msi_domain)
>>> +        vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
>>> +    err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
>>> +            mmio_write_msi_msg);
>>
>>
>> Can platform_msi_domain_alloc_irqs check msi_domain vs NULL?
>>
> Actually here, we need to firstly consider the cases that @dev doesn't 
> have @msi_domain,
>
> according to the report by lkp.
>
> For the platform_msi_domain_alloc_irqs, it can help check inside.
>
>
>> [...]
>>>         rc = register_virtio_device(&vm_dev->vdev);
>>>       if (rc)
>>> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct 
>>> platform_device *pdev)
>>>       return 0;
>>>   }
>>>   -
>>> -
>>
>>
>> Unnecessary changes.
>
> Got it. Will remove it later.
>
>
>> [...]
>>>   +/* MSI related registers */
>>> +
>>> +/* MSI status register */
>>> +#define VIRTIO_MMIO_MSI_STATUS        0x0c0
>>> +/* MSI command register */
>>> +#define VIRTIO_MMIO_MSI_COMMAND        0x0c2
>>> +/* MSI low 32 bit address, 64 bits in two halves */
>>> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW    0x0c4
>>> +/* MSI high 32 bit address, 64 bits in two halves */
>>> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH    0x0c8
>>> +/* MSI data */
>>> +#define VIRTIO_MMIO_MSI_DATA        0x0cc
>>
>>
>> I wonder what's the advantage of using registers instead of memory 
>> mapped tables as PCI did. Is this because MMIO doesn't have 
>> capability list which makes it hard to be extended if we have 
>> variable size of tables?
>>
> Yes, mmio doesn't have capability which limits the extension.


We may want to revisit and add something like this for future virtio 
MMIO versions.


>
> It need some registers to specify the msi table/mask table/pending 
> table offsets if using what pci did.
>
> As comments previously, mask/pending can be easily extended by MSI 
> command.
>
>>
>>> +
>>> +/* RO: MSI feature enabled mask */
>>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>>> +/* RO: Maximum queue size available */
>>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>>> +/* Reserved */
>>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>>> +
>>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>>
>>
>> I believe we need a command to read the number of vectors supported 
>> by the device, or 2048 is assumed to be a fixed size here?
>
> For not bringing much complexity, we proposed vector per queue and 
> fixed relationship between events and vectors.


It's a about the number of MSIs not the mapping between queues to MSIs. 
And it looks to me it won't bring obvious complexity, just need a 
register to read the #MSIs. Device implementation may stick to a fixed size.

Having few pages for a device that only have one queue is kind of a waste.

Thanks


>
>
> So the number of vectors supported by device is equal to the total 
> number of vqs and config.
>
> We will try to explicitly highlight this point in spec for later version.
>
>
> Thanks!
>
> Jing
>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>
>


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-02  6:33         ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-02  6:33 UTC (permalink / raw)
  To: Liu, Jing2, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/27 下午5:37, Liu, Jing2 wrote:
> Hi Jason,
>
> Thanks for reviewing the patches!
>
>  [...]
>>> -
>>> +#include <linux/msi.h>
>>> +#include <asm/irqdomain.h>
>>>     /* The alignment to use between consumer and producer parts of 
>>> vring.
>>>    * Currently hardcoded to the page size. */
>>> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>>>       /* a list of queues so we can dispatch IRQs */
>>>       spinlock_t lock;
>>>       struct list_head virtqueues;
>>> +
>>> +    int doorbell_base;
>>> +    int doorbell_scale;
>>
>>
>> It's better to use the terminology defined by spec, e.g 
>> notify_base/notify_multiplexer etc.
>>
>> And we usually use unsigned type for offset.
>
> We will fix this in next version.
>
>
>>
>>
>>> +    bool msi_enabled;
>>> +    /* Name strings for interrupts. */
>>> +    char (*vm_vq_names)[256];
>>>   };
>>>     struct virtio_mmio_vq_info {
>>> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>>>   };
>>>     +static void vm_free_msi_irqs(struct virtio_device *vdev);
>>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>>> nirqs);
>>>     /* Configuration interface */
>>>   @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>>>   {
>>>       struct virtio_mmio_device *vm_dev = 
>>> to_virtio_mmio_device(vq->vdev);
>>>   +    if (vm_dev->version == 3) {
>>> +        int offset = vm_dev->doorbell_base +
>>> +                 vm_dev->doorbell_scale * vq->index;
>>> +        writel(vq->index, vm_dev->base + offset);
>>
>>
>> In virtio-pci we store the doorbell address in vq->priv to avoid 
>> doing multiplication in fast path.
>
> Good suggestion. We will fix this in next version.
>
> [...]
>
>>> +
>>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int 
>>> nirqs)
>>> +{
>>> +    struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>>> +    int irq, err;
>>> +    u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
>>> +
>>> +    if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) 
>>> == 0)
>>> +        return -EINVAL;
>>> +
>>> +    vm_dev->vm_vq_names = kmalloc_array(nirqs, 
>>> sizeof(*vm_dev->vm_vq_names),
>>> +            GFP_KERNEL);
>>> +    if (!vm_dev->vm_vq_names)
>>> +        return -ENOMEM;
>>> +
>>> +    if (!vdev->dev.msi_domain)
>>> +        vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
>>> +    err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
>>> +            mmio_write_msi_msg);
>>
>>
>> Can platform_msi_domain_alloc_irqs check msi_domain vs NULL?
>>
> Actually here, we need to firstly consider the cases that @dev doesn't 
> have @msi_domain,
>
> according to the report by lkp.
>
> For the platform_msi_domain_alloc_irqs, it can help check inside.
>
>
>> [...]
>>>         rc = register_virtio_device(&vm_dev->vdev);
>>>       if (rc)
>>> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct 
>>> platform_device *pdev)
>>>       return 0;
>>>   }
>>>   -
>>> -
>>
>>
>> Unnecessary changes.
>
> Got it. Will remove it later.
>
>
>> [...]
>>>   +/* MSI related registers */
>>> +
>>> +/* MSI status register */
>>> +#define VIRTIO_MMIO_MSI_STATUS        0x0c0
>>> +/* MSI command register */
>>> +#define VIRTIO_MMIO_MSI_COMMAND        0x0c2
>>> +/* MSI low 32 bit address, 64 bits in two halves */
>>> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW    0x0c4
>>> +/* MSI high 32 bit address, 64 bits in two halves */
>>> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH    0x0c8
>>> +/* MSI data */
>>> +#define VIRTIO_MMIO_MSI_DATA        0x0cc
>>
>>
>> I wonder what's the advantage of using registers instead of memory 
>> mapped tables as PCI did. Is this because MMIO doesn't have 
>> capability list which makes it hard to be extended if we have 
>> variable size of tables?
>>
> Yes, mmio doesn't have capability which limits the extension.


We may want to revisit and add something like this for future virtio 
MMIO versions.


>
> It need some registers to specify the msi table/mask table/pending 
> table offsets if using what pci did.
>
> As comments previously, mask/pending can be easily extended by MSI 
> command.
>
>>
>>> +
>>> +/* RO: MSI feature enabled mask */
>>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>>> +/* RO: Maximum queue size available */
>>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>>> +/* Reserved */
>>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>>> +
>>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>>
>>
>> I believe we need a command to read the number of vectors supported 
>> by the device, or 2048 is assumed to be a fixed size here?
>
> For not bringing much complexity, we proposed vector per queue and 
> fixed relationship between events and vectors.


It's a about the number of MSIs not the mapping between queues to MSIs. 
And it looks to me it won't bring obvious complexity, just need a 
register to read the #MSIs. Device implementation may stick to a fixed size.

Having few pages for a device that only have one queue is kind of a waste.

Thanks


>
>
> So the number of vectors supported by device is equal to the total 
> number of vqs and config.
>
> We will try to explicitly highlight this point in spec for later version.
>
>
> Thanks!
>
> Jing
>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 0/2] support virtio mmio specification Version 3
  2019-12-25  2:50 [PATCH v1 0/2] support virtio mmio specification Version 3 Zha Bin
@ 2020-01-02  6:55   ` Jason Wang
  2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
  2020-01-02  6:55   ` [virtio-dev] " Jason Wang
  2 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-02  6:55 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> With the virtio spec proposal[1], other VMMs (e.g. Qemu) can also make use
> of the new features to get a enhancing performance.
>
> [1]https://lkml.org/lkml/2019/12/20/113
>
> Liu Jiang (2):
>    x86/msi: Enhance x86 to support platform_msi
>    virtio-mmio: add features for virtio-mmio specification version 3


Btw, for next version I suggest to copy both kvm-devel list and 
qemu-devel list.

Thanks



^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 0/2] support virtio mmio specification Version 3
@ 2020-01-02  6:55   ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-02  6:55 UTC (permalink / raw)
  To: Zha Bin, linux-kernel; +Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2019/12/25 上午10:50, Zha Bin wrote:
> With the virtio spec proposal[1], other VMMs (e.g. Qemu) can also make use
> of the new features to get a enhancing performance.
>
> [1]https://lkml.org/lkml/2019/12/20/113
>
> Liu Jiang (2):
>    x86/msi: Enhance x86 to support platform_msi
>    virtio-mmio: add features for virtio-mmio specification version 3


Btw, for next version I suggest to copy both kvm-devel list and 
qemu-devel list.

Thanks



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-02  6:33         ` Jason Wang
@ 2020-01-02  9:13           ` Liu, Jing2
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-02  9:13 UTC (permalink / raw)
  To: Jason Wang, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng

[...]

>>>
>>>> +
>>>> +/* RO: MSI feature enabled mask */
>>>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>>>> +/* RO: Maximum queue size available */
>>>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>>>> +/* Reserved */
>>>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>>>> +
>>>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>>>
>>>
>>> I believe we need a command to read the number of vectors supported 
>>> by the device, or 2048 is assumed to be a fixed size here?
>>
>> For not bringing much complexity, we proposed vector per queue and 
>> fixed relationship between events and vectors.
>
>
> It's a about the number of MSIs not the mapping between queues to 
> MSIs.And it looks to me it won't bring obvious complexity, just need a 
> register to read the #MSIs. Device implementation may stick to a fixed 
> size.

Based on that assumption, the device supports #MSIs = #queues + #config. 
Then driver need not read the register.

We're trying to make such kind of agreement on spec level.

>
> Having few pages for a device that only have one queue is kind of a 
> waste.

Could I ask what's the meaning of few pages here? BTW, we didn't define 
MSIx-like tables for virtio-mmio.

Thanks,

Jing

>
> Thanks
>
>
>>
>>
>> So the number of vectors supported by device is equal to the total 
>> number of vqs and config.
>>
>> We will try to explicitly highlight this point in spec for later 
>> version.
>>
>>
>> Thanks!
>>
>> Jing
>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>>
>>
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-02  9:13           ` Liu, Jing2
  0 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-02  9:13 UTC (permalink / raw)
  To: Jason Wang, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng

[...]

>>>
>>>> +
>>>> +/* RO: MSI feature enabled mask */
>>>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>>>> +/* RO: Maximum queue size available */
>>>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>>>> +/* Reserved */
>>>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>>>> +
>>>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>>>
>>>
>>> I believe we need a command to read the number of vectors supported 
>>> by the device, or 2048 is assumed to be a fixed size here?
>>
>> For not bringing much complexity, we proposed vector per queue and 
>> fixed relationship between events and vectors.
>
>
> It's a about the number of MSIs not the mapping between queues to 
> MSIs.And it looks to me it won't bring obvious complexity, just need a 
> register to read the #MSIs. Device implementation may stick to a fixed 
> size.

Based on that assumption, the device supports #MSIs = #queues + #config. 
Then driver need not read the register.

We're trying to make such kind of agreement on spec level.

>
> Having few pages for a device that only have one queue is kind of a 
> waste.

Could I ask what's the meaning of few pages here? BTW, we didn't define 
MSIx-like tables for virtio-mmio.

Thanks,

Jing

>
> Thanks
>
>
>>
>>
>> So the number of vectors supported by device is equal to the total 
>> number of vqs and config.
>>
>> We will try to explicitly highlight this point in spec for later 
>> version.
>>
>>
>> Thanks!
>>
>> Jing
>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-02  9:13           ` Liu, Jing2
@ 2020-01-03  3:24             ` Jason Wang
  -1 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-03  3:24 UTC (permalink / raw)
  To: Liu, Jing2, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2020/1/2 下午5:13, Liu, Jing2 wrote:
> [...]
>
>>>>
>>>>> +
>>>>> +/* RO: MSI feature enabled mask */
>>>>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>>>>> +/* RO: Maximum queue size available */
>>>>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>>>>> +/* Reserved */
>>>>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>>>>> +
>>>>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>>>>
>>>>
>>>> I believe we need a command to read the number of vectors supported 
>>>> by the device, or 2048 is assumed to be a fixed size here?
>>>
>>> For not bringing much complexity, we proposed vector per queue and 
>>> fixed relationship between events and vectors.
>>
>>
>> It's a about the number of MSIs not the mapping between queues to 
>> MSIs.And it looks to me it won't bring obvious complexity, just need 
>> a register to read the #MSIs. Device implementation may stick to a 
>> fixed size.
>
> Based on that assumption, the device supports #MSIs = #queues + 
> #config. Then driver need not read the register.
>
> We're trying to make such kind of agreement on spec level.


Ok, I get you now.

But still, having fixed number of MSIs is less flexible. E.g:

- for x86, processor can only deal with about 250 interrupt vectors.
- driver may choose to share MSI vectors [1] (which is not merged but we 
will for sure need it)

[1] https://lkml.org/lkml/2014/12/25/169


>
>>
>> Having few pages for a device that only have one queue is kind of a 
>> waste.
>
> Could I ask what's the meaning of few pages here? BTW, we didn't 
> define MSIx-like tables for virtio-mmio.


I thought you're using a fixed size (2048) for each device. But looks not :)

Thanks


>
> Thanks,
>
> Jing
>
>>
>> Thanks
>>
>>
>>>
>>>
>>> So the number of vectors supported by device is equal to the total 
>>> number of vqs and config.
>>>
>>> We will try to explicitly highlight this point in spec for later 
>>> version.
>>>
>>>
>>> Thanks!
>>>
>>> Jing
>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>>>
>>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-03  3:24             ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-03  3:24 UTC (permalink / raw)
  To: Liu, Jing2, Zha Bin, linux-kernel
  Cc: mst, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 2020/1/2 下午5:13, Liu, Jing2 wrote:
> [...]
>
>>>>
>>>>> +
>>>>> +/* RO: MSI feature enabled mask */
>>>>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>>>>> +/* RO: Maximum queue size available */
>>>>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>>>>> +/* Reserved */
>>>>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>>>>> +
>>>>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>>>>
>>>>
>>>> I believe we need a command to read the number of vectors supported 
>>>> by the device, or 2048 is assumed to be a fixed size here?
>>>
>>> For not bringing much complexity, we proposed vector per queue and 
>>> fixed relationship between events and vectors.
>>
>>
>> It's a about the number of MSIs not the mapping between queues to 
>> MSIs.And it looks to me it won't bring obvious complexity, just need 
>> a register to read the #MSIs. Device implementation may stick to a 
>> fixed size.
>
> Based on that assumption, the device supports #MSIs = #queues + 
> #config. Then driver need not read the register.
>
> We're trying to make such kind of agreement on spec level.


Ok, I get you now.

But still, having fixed number of MSIs is less flexible. E.g:

- for x86, processor can only deal with about 250 interrupt vectors.
- driver may choose to share MSI vectors [1] (which is not merged but we 
will for sure need it)

[1] https://lkml.org/lkml/2014/12/25/169


>
>>
>> Having few pages for a device that only have one queue is kind of a 
>> waste.
>
> Could I ask what's the meaning of few pages here? BTW, we didn't 
> define MSIx-like tables for virtio-mmio.


I thought you're using a fixed size (2048) for each device. But looks not :)

Thanks


>
> Thanks,
>
> Jing
>
>>
>> Thanks
>>
>>
>>>
>>>
>>> So the number of vectors supported by device is equal to the total 
>>> number of vqs and config.
>>>
>>> We will try to explicitly highlight this point in spec for later 
>>> version.
>>>
>>>
>>> Thanks!
>>>
>>> Jing
>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>>>
>>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-03  3:24             ` Jason Wang
  (?)
@ 2020-01-03  6:14             ` Liu, Jiang
  2020-01-03  9:12                 ` Jason Wang
  -1 siblings, 1 reply; 60+ messages in thread
From: Liu, Jiang @ 2020-01-03  6:14 UTC (permalink / raw)
  To: Jason Wang
  Cc: Liu, Jing2, Zha Bin, linux-kernel, mst, slp, virtio-dev,
	jing2.liu, chao.p.peng



> On Jan 3, 2020, at 11:24 AM, Jason Wang <jasowang@redhat.com> wrote:
> 
> 
> On 2020/1/2 下午5:13, Liu, Jing2 wrote:
>> [...]
>> 
>>>>> 
>>>>>> +
>>>>>> +/* RO: MSI feature enabled mask */
>>>>>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK    0x8000
>>>>>> +/* RO: Maximum queue size available */
>>>>>> +#define VIRTIO_MMIO_MSI_STATUS_QMASK    0x07ff
>>>>>> +/* Reserved */
>>>>>> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED    0x7800
>>>>>> +
>>>>>> +#define VIRTIO_MMIO_MSI_CMD_UPDATE    0x1
>>>>> 
>>>>> 
>>>>> I believe we need a command to read the number of vectors supported by the device, or 2048 is assumed to be a fixed size here?
>>>> 
>>>> For not bringing much complexity, we proposed vector per queue and fixed relationship between events and vectors.
>>> 
>>> 
>>> It's a about the number of MSIs not the mapping between queues to MSIs.And it looks to me it won't bring obvious complexity, just need a register to read the #MSIs. Device implementation may stick to a fixed size.
>> 
>> Based on that assumption, the device supports #MSIs = #queues + #config. Then driver need not read the register.
>> 
>> We're trying to make such kind of agreement on spec level.
> 
> 
> Ok, I get you now.
> 
> But still, having fixed number of MSIs is less flexible. E.g:
> 
> - for x86, processor can only deal with about 250 interrupt vectors.
> - driver may choose to share MSI vectors [1] (which is not merged but we will for sure need it)
Thanks for the info:)
X86 systems roughly have NCPU * 200 vectors available for device interrupts.
The proposed patch tries to map multiple event sources to an interrupt vector, to avoid running out of x86 CPU vectors.
Many virtio mmio devices may have several or tens of event sources, and it’s rare to have hundreds of event sources.
So could we treat the dynamic mapping between event sources and interrupt vectors as an advanced optional feature?

> 
> [1] https://lkml.org/lkml/2014/12/25/169
> 
> 
>> 
>>> 
>>> Having few pages for a device that only have one queue is kind of a waste.
>> 
>> Could I ask what's the meaning of few pages here? BTW, we didn't define MSIx-like tables for virtio-mmio.
> 
> 
> I thought you're using a fixed size (2048) for each device. But looks not :)
> 
> Thanks
> 
> 
>> 
>> Thanks,
>> 
>> Jing
>> 
>>> 
>>> Thanks
>>> 
>>> 
>>>> 
>>>> 
>>>> So the number of vectors supported by device is equal to the total number of vqs and config.
>>>> 
>>>> We will try to explicitly highlight this point in spec for later version.
>>>> 
>>>> 
>>>> Thanks!
>>>> 
>>>> Jing
>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>>>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>>>> 
>>>> 
>>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-02  6:28             ` [virtio-dev] " Jason Wang
  (?)
@ 2020-01-03  6:50             ` Liu, Jiang
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jiang @ 2020-01-03  6:50 UTC (permalink / raw)
  To: Jason Wang
  Cc: Zha Bin, linux-kernel, mst, slp, virtio-dev, jing2.liu, chao.p.peng

On Jan 2, 2020, at 2:28 PM, Jason Wang <jasowang@redhat.com> wrote:
> 
> 
> On 2019/12/26 下午9:16, Liu, Jiang wrote:
>> 
>>> On Dec 26, 2019, at 4:09 PM, Jason Wang <jasowang@redhat.com> wrote:
>>> 
>>> 
>>> On 2019/12/25 下午11:20, Liu, Jiang wrote:
>>>>> On Dec 25, 2019, at 6:20 PM, Jason Wang <jasowang@redhat.com> wrote:
>>>>> 
>>>>> 
>>>>> On 2019/12/25 上午10:50, Zha Bin wrote:
>>>>>> From: Liu Jiang <gerry@linux.alibaba.com>
>>>>>> 
>>>>>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>>>>>> virtio over mmio devices as a lightweight machine model for modern
>>>>>> cloud. The standard virtio over MMIO transport layer only supports one
>>>>>> legacy interrupt, which is much heavier than virtio over PCI transport
>>>>>> layer using MSI. Legacy interrupt has long work path and causes specific
>>>>>> VMExits in following cases, which would considerably slow down the
>>>>>> performance:
>>>>>> 
>>>>>> 1) read interrupt status register
>>>>>> 2) update interrupt status register
>>>>>> 3) write IOAPIC EOI register
>>>>>> 
>>>>>> We proposed to update virtio over MMIO to version 3[1] to add the
>>>>>> following new features and enhance the performance.
>>>>>> 
>>>>>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>>>>>    interrupt performance for virtio multi-queue devices
>>>>>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>>>>>    to the doorbells provided by virtio devices.
>>>>>> 
>>>>>> The following is the network tcp_rr performance testing report, tested
>>>>>> with virtio-pci device, vanilla virtio-mmio device and patched
>>>>>> virtio-mmio device (run test 3 times for each case):
>>>>>> 
>>>>>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>>>>> 
>>>>>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>>>>>> 	trans/s	    9536	6939		9500
>>>>>> 	trans/s	    9734	7029		9749
>>>>>> 	trans/s	    9894	7095		9318
>>>>>> 
>>>>>> [1] https://lkml.org/lkml/2019/12/20/113
>>>>> Thanks for the patch. Two questions after a quick glance:
>>>>> 
>>>>> 1) In PCI we choose to support MSI-X instead of MSI for having extra flexibility like alias, independent data and address (e.g for affinity) . Any reason for not start from MSI-X? E.g having MSI-X table and PBA (both of which looks pretty independent).
>>>> Hi Jason,
>>>> 	Thanks for reviewing patches on Christmas Day:)
>>>> 	The PCI MSI-x has several advantages over PCI MSI, mainly
>>>> 1) support 2048 vectors, much more than 32 vectors supported by MSI.
>>>> 2) dedicated address/data for each vector,
>>>> 3) per vector mask/pending bits.
>>>> The proposed MMIO MSI extension supports both 1) and 2),
>>> 
>>> Aha right, I mis-read the patch. But more questions comes:
>>> 
>>> 1) The association between vq and MSI-X vector is fixed. This means it can't work for a device that have more than 2047 queues. We probably need something similar to virtio-pci to allow a dynamic association.
>> We have considered both the PCI MSI-x like dynamic association design and fix mapping design.
>> The fix mapping design simplifies both the interrupt configuration process and VMM implementations.
> 
> 
> Well, for VMM just an indirection and for guest, it can choose to use fixed mapping, just need to program once during probe.
> 
> 
>> And the virtio mmio transport layer is mainly used by light VMMs to support small scale virtual machines,
> 
> 
> Let's not limit the interface to be used by a specific case :). Eliminating PCIE would be appealing for other scenarios.
> 
> 
>>   2048 vectors should be enough for these usage cases.
>> So the fix mapping design has been used.
>> 
>>> 2) The mask and unmask control is missed
>>> 
>>> 
>>>>  but the extension doesn’t support 3) because
>>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
>>> 
>>> Not directly used but masking/unmasking is widely used in irq subsystem which allows lots of optimizations.
>>> 
>>> 
>>>> On the other hand, we want to simplify VMM implementations as simple as possible, and mimicking the PCI MSI-x
>>>> will cause some complexity to VMM implementations.
>>> 
>>> I agree to simplify VMM implementation, but it looks to me introducing masking/pending won't cost too much code in the VMM implementation. Just new type of command for VIRTIO_MMIO_MSI_COMMAND.
>> We want to make VMM implementations as simple as possible:)
>> And based on following observations, we have disabled support of mask/unmask,
>> 1) MSI is edge triggering, which means it won’t be shared with other interrupt sources,
> 
> 
> Is this true? I think the spec does not forbid such usages, e.g using the same MSI address/command for different queues or devices?
Yes, the spec doesn’t forbid this.
We could share the same MSIx vector for multiple queues, rx/tax etc.
But we can’t share a Linux MSI interrupt for different devices/MSIx vectors, this is an implementation constraint of the Linux interrupt subsystem.

> 
> 
>> so masking/unmasking won’t be used for normal interrupt management logic.
>> 2) Linux virtio mmio transport layer doesn’t support  suspend/resume yet, so there’s no need to quiesce the device by masking interrupts.
> 
> 
> Yes, but it's a limitation only for virtio mmio transport. We can add it.
> 
> 
>> 3) The legacy PCI 2.2 devices doesn’t support irq masking/unmasking, so irq masking/unmasking may be optional operations.
> 
> 
> Yes, but as you said, it helps for performance and some other cases. I still prefer to implement that consider it is not complex. If we do MSI without masking/unmasking, I suspect we will implement MSI-X finally somedays then maintaining MSI will become a burden... (still takes virtio-pci as an example, it choose to implement MSI-X not MSI).
> 
> 
>> So we skipped support of irq masking/unmasking. We will recheck whether irq masking/unmasking is mandatory for MMIO devices.
>> On the other hand, we may enhance the spec to define command codes for masking/unmasking, and VMM may optionally support masking/unmasking.
> 
> 
> Yes, thanks.
> 
> 
>> 
>> Thanks,
>> Gerry
>> 
>>> Thanks
>>> 
>>> 
>>>>> 2) It's better to split notify_multiplexer out of MSI support to ease the reviewers (apply to spec patch as well)
>>>> Great suggestion, we will try to split the patch.
>>>> 
>>>> Thanks,
>>>> Gerry
>>>> 
>>>>> Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-03  6:14             ` Liu, Jiang
@ 2020-01-03  9:12                 ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-03  9:12 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Liu, Jing2, Zha Bin, linux-kernel, mst, slp, virtio-dev,
	jing2.liu, chao.p.peng


On 2020/1/3 下午2:14, Liu, Jiang wrote:
>> Ok, I get you now.
>>
>> But still, having fixed number of MSIs is less flexible. E.g:
>>
>> - for x86, processor can only deal with about 250 interrupt vectors.
>> - driver may choose to share MSI vectors [1] (which is not merged but we will for sure need it)
> Thanks for the info:)
> X86 systems roughly have NCPU * 200 vectors available for device interrupts.
> The proposed patch tries to map multiple event sources to an interrupt vector, to avoid running out of x86 CPU vectors.
> Many virtio mmio devices may have several or tens of event sources, and it’s rare to have hundreds of event sources.
> So could we treat the dynamic mapping between event sources and interrupt vectors as an advanced optional feature?
>

Maybe, but I still prefer to implement it if it is not too complex. 
Let's see Michael's opinion on this.

Thanks


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-03  9:12                 ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-03  9:12 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Liu, Jing2, Zha Bin, linux-kernel, mst, slp, virtio-dev,
	jing2.liu, chao.p.peng


On 2020/1/3 下午2:14, Liu, Jiang wrote:
>> Ok, I get you now.
>>
>> But still, having fixed number of MSIs is less flexible. E.g:
>>
>> - for x86, processor can only deal with about 250 interrupt vectors.
>> - driver may choose to share MSI vectors [1] (which is not merged but we will for sure need it)
> Thanks for the info:)
> X86 systems roughly have NCPU * 200 vectors available for device interrupts.
> The proposed patch tries to map multiple event sources to an interrupt vector, to avoid running out of x86 CPU vectors.
> Many virtio mmio devices may have several or tens of event sources, and it’s rare to have hundreds of event sources.
> So could we treat the dynamic mapping between event sources and interrupt vectors as an advanced optional feature?
>

Maybe, but I still prefer to implement it if it is not too complex. 
Let's see Michael's opinion on this.

Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-26 13:16         ` Liu, Jiang
@ 2020-01-05 10:42             ` Michael S. Tsirkin
  2020-01-05 10:42             ` [virtio-dev] " Michael S. Tsirkin
  1 sibling, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-05 10:42 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Jason Wang, Zha Bin, linux-kernel, slp, virtio-dev, jing2.liu,
	chao.p.peng

On Thu, Dec 26, 2019 at 09:16:19PM +0800, Liu, Jiang wrote:
> > 2) The mask and unmask control is missed
> > 
> > 
> >>  but the extension doesn’t support 3) because
> >> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.

Linux uses masking/unmasking in order to migrate interrupts between
CPUs.

-- 
MST


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-05 10:42             ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-05 10:42 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Jason Wang, Zha Bin, linux-kernel, slp, virtio-dev, jing2.liu,
	chao.p.peng

On Thu, Dec 26, 2019 at 09:16:19PM +0800, Liu, Jiang wrote:
> > 2) The mask and unmask control is missed
> > 
> > 
> >>  but the extension doesn’t support 3) because
> >> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.

Linux uses masking/unmasking in order to migrate interrupts between
CPUs.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
@ 2020-01-05 11:04     ` Michael S. Tsirkin
  2019-12-25 22:28     ` kbuild test robot
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-05 11:04 UTC (permalink / raw)
  To: Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng

On Wed, Dec 25, 2019 at 10:50:23AM +0800, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
> 
> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> virtio over mmio devices as a lightweight machine model for modern
> cloud. The standard virtio over MMIO transport layer only supports one
> legacy interrupt, which is much heavier than virtio over PCI transport
> layer using MSI. Legacy interrupt has long work path and causes specific
> VMExits in following cases, which would considerably slow down the
> performance:
> 
> 1) read interrupt status register
> 2) update interrupt status register
> 3) write IOAPIC EOI register
> 
> We proposed to update virtio over MMIO to version 3[1] to add the
> following new features and enhance the performance.
> 
> 1) Support Message Signaled Interrupt(MSI), which increases the
>    interrupt performance for virtio multi-queue devices
> 2) Support per-queue doorbell, so the guest kernel may directly write
>    to the doorbells provided by virtio devices.

Do we need to come up with new "doorbell" terminology?
virtio spec calls these available event notifications,
let's stick to this.


> 
> The following is the network tcp_rr performance testing report, tested
> with virtio-pci device, vanilla virtio-mmio device and patched
> virtio-mmio device (run test 3 times for each case):
> 
> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
> 
> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> 	trans/s	    9536	6939		9500
> 	trans/s	    9734	7029		9749
> 	trans/s	    9894	7095		9318
> 
> [1] https://lkml.org/lkml/2019/12/20/113
> 
> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>


Do we need a new version though? What is wrong with
a feature bit? This way we can create compatible devices
and drivers.

> ---
>  drivers/virtio/virtio_mmio.c     | 226 ++++++++++++++++++++++++++++++++++++---
>  include/uapi/linux/virtio_mmio.h |  28 +++++
>  2 files changed, 240 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index e09edb5..065c083 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -68,8 +68,8 @@
>  #include <linux/virtio_config.h>
>  #include <uapi/linux/virtio_mmio.h>
>  #include <linux/virtio_ring.h>
> -
> -
> +#include <linux/msi.h>
> +#include <asm/irqdomain.h>
>  
>  /* The alignment to use between consumer and producer parts of vring.
>   * Currently hardcoded to the page size. */
> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>  	/* a list of queues so we can dispatch IRQs */
>  	spinlock_t lock;
>  	struct list_head virtqueues;
> +
> +	int doorbell_base;
> +	int doorbell_scale;
> +	bool msi_enabled;
> +	/* Name strings for interrupts. */
> +	char (*vm_vq_names)[256];

Add a comment explaining 256 pls.

>  };
>  
>  struct virtio_mmio_vq_info {
> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>  };
>  
>  
> +static void vm_free_msi_irqs(struct virtio_device *vdev);
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs);
>  
>  /* Configuration interface */
>  
> @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
>  
> +	if (vm_dev->version == 3) {
> +		int offset = vm_dev->doorbell_base +
> +			     vm_dev->doorbell_scale * vq->index;
> +		writel(vq->index, vm_dev->base + offset);
> +	} else
>  	/* We write the queue's selector into the notification register to
>  	 * signal the other end */
> -	writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +		writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
>  	return true;
>  }

You might want to support VIRTIO_F_NOTIFICATION_DATA too.


> +/* Handle a configuration change */
> +static irqreturn_t vm_config_changed(int irq, void *opaque)
> +{
> +	struct virtio_mmio_device *vm_dev = opaque;
> +
> +	virtio_config_changed(&vm_dev->vdev);
> +
> +	return IRQ_HANDLED;
> +}
> +
>  /* Notify all virtqueues on an interrupt. */
>  static irqreturn_t vm_interrupt(int irq, void *opaque)
>  {
> @@ -307,7 +331,12 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
>  	return ret;
>  }
>  
> +static int vm_msi_irq_vector(struct device *dev, unsigned int nr)
> +{
> +	struct msi_desc *entry = first_msi_entry(dev);
>  
> +	return entry->irq + nr;
> +}
>  
>  static void vm_del_vq(struct virtqueue *vq)
>  {
> @@ -316,6 +345,12 @@ static void vm_del_vq(struct virtqueue *vq)
>  	unsigned long flags;
>  	unsigned int index = vq->index;
>  
> +	if (vm_dev->version == 3 && vm_dev->msi_enabled) {
> +		int irq = vm_msi_irq_vector(&vq->vdev->dev, index + 1);
> +
> +		free_irq(irq, vq);
> +	}
> +
>  	spin_lock_irqsave(&vm_dev->lock, flags);
>  	list_del(&info->node);
>  	spin_unlock_irqrestore(&vm_dev->lock, flags);


I think tying MSI to versions is a mistake.



> @@ -334,15 +369,41 @@ static void vm_del_vq(struct virtqueue *vq)
>  	kfree(info);
>  }
>  
> -static void vm_del_vqs(struct virtio_device *vdev)
> +static void vm_free_irqs(struct virtio_device *vdev)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->version == 3)
> +		vm_free_msi_irqs(vdev);
> +	else
> +		free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +}
> +
> +static void vm_del_vqs(struct virtio_device *vdev)
> +{
>  	struct virtqueue *vq, *n;
>  
>  	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
>  		vm_del_vq(vq);
>  
> -	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +	vm_free_irqs(vdev);
> +}
> +
> +static int vm_request_single_irq(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq = platform_get_irq(vm_dev->pdev, 0);
> +	int err;
> +
> +	if (irq < 0) {
> +		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> +		return irq;
> +	}
> +
> +	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> +			dev_name(&vdev->dev), vm_dev);
> +
> +	return err;
>  }
>  
>  static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
> @@ -455,6 +516,72 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
>  	return ERR_PTR(err);
>  }
>  
> +static inline void mmio_msi_set_enable(struct virtio_mmio_device *vm_dev,
> +		int enable)
> +{
> +	u16 status;
> +
> +	status = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +	if (enable)
> +		status |= VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	else
> +		status &= ~VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	writew(status, vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +}
> +
> +static int vm_find_vqs_msi(struct virtio_device *vdev, unsigned int nvqs,
> +			struct virtqueue *vqs[], vq_callback_t *callbacks[],
> +			const char * const names[], const bool *ctx,
> +			struct irq_affinity *desc)
> +{
> +	int i, err, irq;
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	/* Allocate nvqs irqs for queues and one irq for configuration */
> +	err = vm_request_msi_vectors(vdev, nvqs + 1);
> +	if (err != 0)
> +		return err;

Not all devices need high speed.  Some might want to share
irqs between VQs, or even with config change callback.
Balloon is a case in point.
A hint about max # of MSI necessary would be a good
idea for this case.


Sharing MSI doesn't necessarily require dedicated registers like PCI has,
you can just program same vector in multiple VQs.

> +
> +	for (i = 0; i < nvqs; i++) {
> +		int index = i + 1;
> +
> +		if (!names[i]) {
> +			vqs[i] = NULL;
> +			continue;
> +		}
> +		vqs[i] = vm_setup_vq(vdev, i, callbacks[i], names[i],
> +				ctx ? ctx[i] : false);
> +		if (IS_ERR(vqs[i])) {
> +			err = PTR_ERR(vqs[i]);
> +			goto error_find;
> +		}
> +
> +		if (!callbacks[i])
> +			continue;
> +
> +		irq = vm_msi_irq_vector(&vqs[i]->vdev->dev, index);
> +		/* allocate per-vq irq if available and necessary */
> +		snprintf(vm_dev->vm_vq_names[index],
> +			sizeof(*vm_dev->vm_vq_names),
> +			"%s-%s",
> +			dev_name(&vm_dev->vdev.dev), names[i]);
> +		err = request_irq(irq, vring_interrupt, 0,
> +				vm_dev->vm_vq_names[index], vqs[i]);
> +
> +		if (err)
> +			goto error_find;
> +	}
> +
> +	mmio_msi_set_enable(vm_dev, 1);
> +	vm_dev->msi_enabled = true;
> +
> +	return 0;
> +
> +error_find:
> +	vm_del_vqs(vdev);
> +	return err;
> +}
> +
>  static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>  		       struct virtqueue *vqs[],
>  		       vq_callback_t *callbacks[],
> @@ -463,16 +590,16 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>  		       struct irq_affinity *desc)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> -	int irq = platform_get_irq(vm_dev->pdev, 0);
>  	int i, err, queue_idx = 0;
>  
> -	if (irq < 0) {
> -		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> -		return irq;
> +	if (vm_dev->version == 3) {
> +		err = vm_find_vqs_msi(vdev, nvqs, vqs, callbacks,
> +				names, ctx, desc);
> +		if (!err)
> +			return 0;
>  	}
>  
> -	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> -			dev_name(&vdev->dev), vm_dev);
> +	err = vm_request_single_irq(vdev);
>  	if (err)
>  		return err;
>  
> @@ -493,6 +620,73 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>  	return 0;
>  }
>  
> +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +	struct device *dev = desc->dev;
> +	struct virtio_device *vdev = dev_to_virtio(dev);
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	void __iomem *pos = vm_dev->base;
> +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
> +			desc->platform.msi_index);
> +
> +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
> +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
> +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
> +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
> +}

All this can happen when IRQ affinity changes while device
is sending interrupts. An interrupt sent between the writel
operations will then be directed incorrectly.



> +
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq, err;
> +	u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +
> +	if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) == 0)
> +		return -EINVAL;
> +
> +	vm_dev->vm_vq_names = kmalloc_array(nirqs, sizeof(*vm_dev->vm_vq_names),
> +			GFP_KERNEL);
> +	if (!vm_dev->vm_vq_names)
> +		return -ENOMEM;
> +
> +	if (!vdev->dev.msi_domain)
> +		vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
> +	err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
> +			mmio_write_msi_msg);
> +	if (err)
> +		goto error_free_mem;
> +
> +	/* The first MSI vector is used for configuration change events. */
> +	snprintf(vm_dev->vm_vq_names[0], sizeof(*vm_dev->vm_vq_names),
> +			"%s-config", dev_name(&vdev->dev));
> +	irq = vm_msi_irq_vector(&vdev->dev, 0);
> +	err = request_irq(irq, vm_config_changed, 0, vm_dev->vm_vq_names[0],
> +			vm_dev);
> +	if (err)
> +		goto error_free_mem;
> +
> +	return 0;
> +
> +error_free_mem:
> +	kfree(vm_dev->vm_vq_names);
> +	vm_dev->vm_vq_names = NULL;
> +	return err;
> +}
> +
> +static void vm_free_msi_irqs(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->msi_enabled) {
> +		mmio_msi_set_enable(vm_dev, 0);
> +		free_irq(vm_msi_irq_vector(&vdev->dev, 0), vm_dev);
> +		platform_msi_domain_free_irqs(&vdev->dev);
> +		kfree(vm_dev->vm_vq_names);
> +		vm_dev->vm_vq_names = NULL;
> +		vm_dev->msi_enabled = false;
> +	}
> +}
> +
>  static const char *vm_bus_name(struct virtio_device *vdev)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> @@ -567,7 +761,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>  
>  	/* Check device version */
>  	vm_dev->version = readl(vm_dev->base + VIRTIO_MMIO_VERSION);
> -	if (vm_dev->version < 1 || vm_dev->version > 2) {
> +	if (vm_dev->version < 1 || vm_dev->version > 3) {
>  		dev_err(&pdev->dev, "Version %ld not supported!\n",
>  				vm_dev->version);
>  		return -ENXIO;
> @@ -603,6 +797,12 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>  		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");
>  
>  	platform_set_drvdata(pdev, vm_dev);
> +	if (vm_dev->version == 3) {
> +		u32 db = readl(vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
> +		vm_dev->doorbell_base = db & 0xffff;
> +		vm_dev->doorbell_scale = (db >> 16) & 0xffff;
> +	}
>  
>  	rc = register_virtio_device(&vm_dev->vdev);
>  	if (rc)
> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct platform_device *pdev)
>  	return 0;
>  }
>  
> -
> -
>  /* Devices list parameter */
>  
>  #if defined(CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES)
> diff --git a/include/uapi/linux/virtio_mmio.h b/include/uapi/linux/virtio_mmio.h
> index c4b0968..148895e 100644
> --- a/include/uapi/linux/virtio_mmio.h
> +++ b/include/uapi/linux/virtio_mmio.h
> @@ -122,6 +122,34 @@
>  #define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
>  #define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
>  
> +/* MSI related registers */
> +
> +/* MSI status register */
> +#define VIRTIO_MMIO_MSI_STATUS		0x0c0
> +/* MSI command register */
> +#define VIRTIO_MMIO_MSI_COMMAND		0x0c2
> +/* MSI low 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW	0x0c4
> +/* MSI high 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH	0x0c8
> +/* MSI data */
> +#define VIRTIO_MMIO_MSI_DATA		0x0cc
> +
> +/* RO: MSI feature enabled mask */
> +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000

I don't understand the comment. Is this a way for
a version 3 device to say "I want/do not want MSI"?
Why not just use a feature bit? We are not short on these.

> +/* RO: Maximum queue size available */
> +#define VIRTIO_MMIO_MSI_STATUS_QMASK	0x07ff


I don't know what above is. Seems unused.

> +/* Reserved */
> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED	0x7800
> +
> +#define VIRTIO_MMIO_MSI_CMD_UPDATE	0x1
> +#define VIRTIO_MMIO_MSI_CMD_OP_MASK	0xf000
> +#define VIRTIO_MMIO_MSI_CMD_ARG_MASK	0x0fff
> +#define VIRTIO_MMIO_MSI_CMD(op, arg)	((op) << 12 | (arg))
> +#define VIRITO_MMIO_MSI_CMD_ARG(cmd)	((cmd) & VIRTIO_MMIO_MSI_CMD_ARG_MASK)
> +#define VIRTIO_MMIO_MSI_CMD_OP(cmd)	\
> +		(((cmd) & VIRTIO_MMIO_MSI_CMD_OP_MASK) >> 12)
> +
>  /* Configuration atomicity value */
>  #define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
>  
> -- 
> 1.8.3.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-05 11:04     ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-05 11:04 UTC (permalink / raw)
  To: Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng

On Wed, Dec 25, 2019 at 10:50:23AM +0800, Zha Bin wrote:
> From: Liu Jiang <gerry@linux.alibaba.com>
> 
> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> virtio over mmio devices as a lightweight machine model for modern
> cloud. The standard virtio over MMIO transport layer only supports one
> legacy interrupt, which is much heavier than virtio over PCI transport
> layer using MSI. Legacy interrupt has long work path and causes specific
> VMExits in following cases, which would considerably slow down the
> performance:
> 
> 1) read interrupt status register
> 2) update interrupt status register
> 3) write IOAPIC EOI register
> 
> We proposed to update virtio over MMIO to version 3[1] to add the
> following new features and enhance the performance.
> 
> 1) Support Message Signaled Interrupt(MSI), which increases the
>    interrupt performance for virtio multi-queue devices
> 2) Support per-queue doorbell, so the guest kernel may directly write
>    to the doorbells provided by virtio devices.

Do we need to come up with new "doorbell" terminology?
virtio spec calls these available event notifications,
let's stick to this.


> 
> The following is the network tcp_rr performance testing report, tested
> with virtio-pci device, vanilla virtio-mmio device and patched
> virtio-mmio device (run test 3 times for each case):
> 
> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
> 
> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> 	trans/s	    9536	6939		9500
> 	trans/s	    9734	7029		9749
> 	trans/s	    9894	7095		9318
> 
> [1] https://lkml.org/lkml/2019/12/20/113
> 
> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>


Do we need a new version though? What is wrong with
a feature bit? This way we can create compatible devices
and drivers.

> ---
>  drivers/virtio/virtio_mmio.c     | 226 ++++++++++++++++++++++++++++++++++++---
>  include/uapi/linux/virtio_mmio.h |  28 +++++
>  2 files changed, 240 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index e09edb5..065c083 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -68,8 +68,8 @@
>  #include <linux/virtio_config.h>
>  #include <uapi/linux/virtio_mmio.h>
>  #include <linux/virtio_ring.h>
> -
> -
> +#include <linux/msi.h>
> +#include <asm/irqdomain.h>
>  
>  /* The alignment to use between consumer and producer parts of vring.
>   * Currently hardcoded to the page size. */
> @@ -90,6 +90,12 @@ struct virtio_mmio_device {
>  	/* a list of queues so we can dispatch IRQs */
>  	spinlock_t lock;
>  	struct list_head virtqueues;
> +
> +	int doorbell_base;
> +	int doorbell_scale;
> +	bool msi_enabled;
> +	/* Name strings for interrupts. */
> +	char (*vm_vq_names)[256];

Add a comment explaining 256 pls.

>  };
>  
>  struct virtio_mmio_vq_info {
> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>  };
>  
>  
> +static void vm_free_msi_irqs(struct virtio_device *vdev);
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs);
>  
>  /* Configuration interface */
>  
> @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
>  
> +	if (vm_dev->version == 3) {
> +		int offset = vm_dev->doorbell_base +
> +			     vm_dev->doorbell_scale * vq->index;
> +		writel(vq->index, vm_dev->base + offset);
> +	} else
>  	/* We write the queue's selector into the notification register to
>  	 * signal the other end */
> -	writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +		writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
>  	return true;
>  }

You might want to support VIRTIO_F_NOTIFICATION_DATA too.


> +/* Handle a configuration change */
> +static irqreturn_t vm_config_changed(int irq, void *opaque)
> +{
> +	struct virtio_mmio_device *vm_dev = opaque;
> +
> +	virtio_config_changed(&vm_dev->vdev);
> +
> +	return IRQ_HANDLED;
> +}
> +
>  /* Notify all virtqueues on an interrupt. */
>  static irqreturn_t vm_interrupt(int irq, void *opaque)
>  {
> @@ -307,7 +331,12 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
>  	return ret;
>  }
>  
> +static int vm_msi_irq_vector(struct device *dev, unsigned int nr)
> +{
> +	struct msi_desc *entry = first_msi_entry(dev);
>  
> +	return entry->irq + nr;
> +}
>  
>  static void vm_del_vq(struct virtqueue *vq)
>  {
> @@ -316,6 +345,12 @@ static void vm_del_vq(struct virtqueue *vq)
>  	unsigned long flags;
>  	unsigned int index = vq->index;
>  
> +	if (vm_dev->version == 3 && vm_dev->msi_enabled) {
> +		int irq = vm_msi_irq_vector(&vq->vdev->dev, index + 1);
> +
> +		free_irq(irq, vq);
> +	}
> +
>  	spin_lock_irqsave(&vm_dev->lock, flags);
>  	list_del(&info->node);
>  	spin_unlock_irqrestore(&vm_dev->lock, flags);


I think tying MSI to versions is a mistake.



> @@ -334,15 +369,41 @@ static void vm_del_vq(struct virtqueue *vq)
>  	kfree(info);
>  }
>  
> -static void vm_del_vqs(struct virtio_device *vdev)
> +static void vm_free_irqs(struct virtio_device *vdev)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->version == 3)
> +		vm_free_msi_irqs(vdev);
> +	else
> +		free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +}
> +
> +static void vm_del_vqs(struct virtio_device *vdev)
> +{
>  	struct virtqueue *vq, *n;
>  
>  	list_for_each_entry_safe(vq, n, &vdev->vqs, list)
>  		vm_del_vq(vq);
>  
> -	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
> +	vm_free_irqs(vdev);
> +}
> +
> +static int vm_request_single_irq(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq = platform_get_irq(vm_dev->pdev, 0);
> +	int err;
> +
> +	if (irq < 0) {
> +		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> +		return irq;
> +	}
> +
> +	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> +			dev_name(&vdev->dev), vm_dev);
> +
> +	return err;
>  }
>  
>  static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
> @@ -455,6 +516,72 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
>  	return ERR_PTR(err);
>  }
>  
> +static inline void mmio_msi_set_enable(struct virtio_mmio_device *vm_dev,
> +		int enable)
> +{
> +	u16 status;
> +
> +	status = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +	if (enable)
> +		status |= VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	else
> +		status &= ~VIRTIO_MMIO_MSI_ENABLE_MASK;
> +	writew(status, vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +}
> +
> +static int vm_find_vqs_msi(struct virtio_device *vdev, unsigned int nvqs,
> +			struct virtqueue *vqs[], vq_callback_t *callbacks[],
> +			const char * const names[], const bool *ctx,
> +			struct irq_affinity *desc)
> +{
> +	int i, err, irq;
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	/* Allocate nvqs irqs for queues and one irq for configuration */
> +	err = vm_request_msi_vectors(vdev, nvqs + 1);
> +	if (err != 0)
> +		return err;

Not all devices need high speed.  Some might want to share
irqs between VQs, or even with config change callback.
Balloon is a case in point.
A hint about max # of MSI necessary would be a good
idea for this case.


Sharing MSI doesn't necessarily require dedicated registers like PCI has,
you can just program same vector in multiple VQs.

> +
> +	for (i = 0; i < nvqs; i++) {
> +		int index = i + 1;
> +
> +		if (!names[i]) {
> +			vqs[i] = NULL;
> +			continue;
> +		}
> +		vqs[i] = vm_setup_vq(vdev, i, callbacks[i], names[i],
> +				ctx ? ctx[i] : false);
> +		if (IS_ERR(vqs[i])) {
> +			err = PTR_ERR(vqs[i]);
> +			goto error_find;
> +		}
> +
> +		if (!callbacks[i])
> +			continue;
> +
> +		irq = vm_msi_irq_vector(&vqs[i]->vdev->dev, index);
> +		/* allocate per-vq irq if available and necessary */
> +		snprintf(vm_dev->vm_vq_names[index],
> +			sizeof(*vm_dev->vm_vq_names),
> +			"%s-%s",
> +			dev_name(&vm_dev->vdev.dev), names[i]);
> +		err = request_irq(irq, vring_interrupt, 0,
> +				vm_dev->vm_vq_names[index], vqs[i]);
> +
> +		if (err)
> +			goto error_find;
> +	}
> +
> +	mmio_msi_set_enable(vm_dev, 1);
> +	vm_dev->msi_enabled = true;
> +
> +	return 0;
> +
> +error_find:
> +	vm_del_vqs(vdev);
> +	return err;
> +}
> +
>  static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>  		       struct virtqueue *vqs[],
>  		       vq_callback_t *callbacks[],
> @@ -463,16 +590,16 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>  		       struct irq_affinity *desc)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> -	int irq = platform_get_irq(vm_dev->pdev, 0);
>  	int i, err, queue_idx = 0;
>  
> -	if (irq < 0) {
> -		dev_err(&vdev->dev, "Cannot get IRQ resource\n");
> -		return irq;
> +	if (vm_dev->version == 3) {
> +		err = vm_find_vqs_msi(vdev, nvqs, vqs, callbacks,
> +				names, ctx, desc);
> +		if (!err)
> +			return 0;
>  	}
>  
> -	err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> -			dev_name(&vdev->dev), vm_dev);
> +	err = vm_request_single_irq(vdev);
>  	if (err)
>  		return err;
>  
> @@ -493,6 +620,73 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>  	return 0;
>  }
>  
> +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> +{
> +	struct device *dev = desc->dev;
> +	struct virtio_device *vdev = dev_to_virtio(dev);
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	void __iomem *pos = vm_dev->base;
> +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
> +			desc->platform.msi_index);
> +
> +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
> +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
> +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
> +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
> +}

All this can happen when IRQ affinity changes while device
is sending interrupts. An interrupt sent between the writel
operations will then be directed incorrectly.



> +
> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +	int irq, err;
> +	u16 csr = readw(vm_dev->base + VIRTIO_MMIO_MSI_STATUS);
> +
> +	if (vm_dev->msi_enabled || (csr & VIRTIO_MMIO_MSI_ENABLE_MASK) == 0)
> +		return -EINVAL;
> +
> +	vm_dev->vm_vq_names = kmalloc_array(nirqs, sizeof(*vm_dev->vm_vq_names),
> +			GFP_KERNEL);
> +	if (!vm_dev->vm_vq_names)
> +		return -ENOMEM;
> +
> +	if (!vdev->dev.msi_domain)
> +		vdev->dev.msi_domain = platform_msi_get_def_irq_domain();
> +	err = platform_msi_domain_alloc_irqs(&vdev->dev, nirqs,
> +			mmio_write_msi_msg);
> +	if (err)
> +		goto error_free_mem;
> +
> +	/* The first MSI vector is used for configuration change events. */
> +	snprintf(vm_dev->vm_vq_names[0], sizeof(*vm_dev->vm_vq_names),
> +			"%s-config", dev_name(&vdev->dev));
> +	irq = vm_msi_irq_vector(&vdev->dev, 0);
> +	err = request_irq(irq, vm_config_changed, 0, vm_dev->vm_vq_names[0],
> +			vm_dev);
> +	if (err)
> +		goto error_free_mem;
> +
> +	return 0;
> +
> +error_free_mem:
> +	kfree(vm_dev->vm_vq_names);
> +	vm_dev->vm_vq_names = NULL;
> +	return err;
> +}
> +
> +static void vm_free_msi_irqs(struct virtio_device *vdev)
> +{
> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> +
> +	if (vm_dev->msi_enabled) {
> +		mmio_msi_set_enable(vm_dev, 0);
> +		free_irq(vm_msi_irq_vector(&vdev->dev, 0), vm_dev);
> +		platform_msi_domain_free_irqs(&vdev->dev);
> +		kfree(vm_dev->vm_vq_names);
> +		vm_dev->vm_vq_names = NULL;
> +		vm_dev->msi_enabled = false;
> +	}
> +}
> +
>  static const char *vm_bus_name(struct virtio_device *vdev)
>  {
>  	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> @@ -567,7 +761,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>  
>  	/* Check device version */
>  	vm_dev->version = readl(vm_dev->base + VIRTIO_MMIO_VERSION);
> -	if (vm_dev->version < 1 || vm_dev->version > 2) {
> +	if (vm_dev->version < 1 || vm_dev->version > 3) {
>  		dev_err(&pdev->dev, "Version %ld not supported!\n",
>  				vm_dev->version);
>  		return -ENXIO;
> @@ -603,6 +797,12 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>  		dev_warn(&pdev->dev, "Failed to enable 64-bit or 32-bit DMA.  Trying to continue, but this might not work.\n");
>  
>  	platform_set_drvdata(pdev, vm_dev);
> +	if (vm_dev->version == 3) {
> +		u32 db = readl(vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
> +
> +		vm_dev->doorbell_base = db & 0xffff;
> +		vm_dev->doorbell_scale = (db >> 16) & 0xffff;
> +	}
>  
>  	rc = register_virtio_device(&vm_dev->vdev);
>  	if (rc)
> @@ -619,8 +819,6 @@ static int virtio_mmio_remove(struct platform_device *pdev)
>  	return 0;
>  }
>  
> -
> -
>  /* Devices list parameter */
>  
>  #if defined(CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES)
> diff --git a/include/uapi/linux/virtio_mmio.h b/include/uapi/linux/virtio_mmio.h
> index c4b0968..148895e 100644
> --- a/include/uapi/linux/virtio_mmio.h
> +++ b/include/uapi/linux/virtio_mmio.h
> @@ -122,6 +122,34 @@
>  #define VIRTIO_MMIO_QUEUE_USED_LOW	0x0a0
>  #define VIRTIO_MMIO_QUEUE_USED_HIGH	0x0a4
>  
> +/* MSI related registers */
> +
> +/* MSI status register */
> +#define VIRTIO_MMIO_MSI_STATUS		0x0c0
> +/* MSI command register */
> +#define VIRTIO_MMIO_MSI_COMMAND		0x0c2
> +/* MSI low 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_LOW	0x0c4
> +/* MSI high 32 bit address, 64 bits in two halves */
> +#define VIRTIO_MMIO_MSI_ADDRESS_HIGH	0x0c8
> +/* MSI data */
> +#define VIRTIO_MMIO_MSI_DATA		0x0cc
> +
> +/* RO: MSI feature enabled mask */
> +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000

I don't understand the comment. Is this a way for
a version 3 device to say "I want/do not want MSI"?
Why not just use a feature bit? We are not short on these.

> +/* RO: Maximum queue size available */
> +#define VIRTIO_MMIO_MSI_STATUS_QMASK	0x07ff


I don't know what above is. Seems unused.

> +/* Reserved */
> +#define VIRTIO_MMIO_MSI_STATUS_RESERVED	0x7800
> +
> +#define VIRTIO_MMIO_MSI_CMD_UPDATE	0x1
> +#define VIRTIO_MMIO_MSI_CMD_OP_MASK	0xf000
> +#define VIRTIO_MMIO_MSI_CMD_ARG_MASK	0x0fff
> +#define VIRTIO_MMIO_MSI_CMD(op, arg)	((op) << 12 | (arg))
> +#define VIRITO_MMIO_MSI_CMD_ARG(cmd)	((cmd) & VIRTIO_MMIO_MSI_CMD_ARG_MASK)
> +#define VIRTIO_MMIO_MSI_CMD_OP(cmd)	\
> +		(((cmd) & VIRTIO_MMIO_MSI_CMD_OP_MASK) >> 12)
> +
>  /* Configuration atomicity value */
>  #define VIRTIO_MMIO_CONFIG_GENERATION	0x0fc
>  
> -- 
> 1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-03  9:12                 ` Jason Wang
@ 2020-01-05 11:25                   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-05 11:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: Liu, Jiang, Liu, Jing2, Zha Bin, linux-kernel, slp, virtio-dev,
	jing2.liu, chao.p.peng

On Fri, Jan 03, 2020 at 05:12:38PM +0800, Jason Wang wrote:
> 
> On 2020/1/3 下午2:14, Liu, Jiang wrote:
> > > Ok, I get you now.
> > > 
> > > But still, having fixed number of MSIs is less flexible. E.g:
> > > 
> > > - for x86, processor can only deal with about 250 interrupt vectors.
> > > - driver may choose to share MSI vectors [1] (which is not merged but we will for sure need it)
> > Thanks for the info:)
> > X86 systems roughly have NCPU * 200 vectors available for device interrupts.
> > The proposed patch tries to map multiple event sources to an interrupt vector, to avoid running out of x86 CPU vectors.
> > Many virtio mmio devices may have several or tens of event sources, and it’s rare to have hundreds of event sources.
> > So could we treat the dynamic mapping between event sources and interrupt vectors as an advanced optional feature?
> > 
> 
> Maybe, but I still prefer to implement it if it is not too complex. Let's
> see Michael's opinion on this.
> 
> Thanks

I think a way for the device to limit # of vectors in use by driver is
useful. But sharing of vectors doesn't really need any special
registers, just program the same vector for multiple Qs/interrupts.

-- 
MST


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-05 11:25                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-05 11:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: Liu, Jiang, Liu, Jing2, Zha Bin, linux-kernel, slp, virtio-dev,
	jing2.liu, chao.p.peng

On Fri, Jan 03, 2020 at 05:12:38PM +0800, Jason Wang wrote:
> 
> On 2020/1/3 下午2:14, Liu, Jiang wrote:
> > > Ok, I get you now.
> > > 
> > > But still, having fixed number of MSIs is less flexible. E.g:
> > > 
> > > - for x86, processor can only deal with about 250 interrupt vectors.
> > > - driver may choose to share MSI vectors [1] (which is not merged but we will for sure need it)
> > Thanks for the info:)
> > X86 systems roughly have NCPU * 200 vectors available for device interrupts.
> > The proposed patch tries to map multiple event sources to an interrupt vector, to avoid running out of x86 CPU vectors.
> > Many virtio mmio devices may have several or tens of event sources, and it’s rare to have hundreds of event sources.
> > So could we treat the dynamic mapping between event sources and interrupt vectors as an advanced optional feature?
> > 
> 
> Maybe, but I still prefer to implement it if it is not too complex. Let's
> see Michael's opinion on this.
> 
> Thanks

I think a way for the device to limit # of vectors in use by driver is
useful. But sharing of vectors doesn't really need any special
registers, just program the same vector for multiple Qs/interrupts.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-05 11:25                   ` Michael S. Tsirkin
@ 2020-01-06  2:51                     ` Jason Wang
  -1 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-06  2:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Liu, Jiang, Liu, Jing2, Zha Bin, linux-kernel, slp, virtio-dev,
	jing2.liu, chao.p.peng


On 2020/1/5 下午7:25, Michael S. Tsirkin wrote:
> On Fri, Jan 03, 2020 at 05:12:38PM +0800, Jason Wang wrote:
>> On 2020/1/3 下午2:14, Liu, Jiang wrote:
>>>> Ok, I get you now.
>>>>
>>>> But still, having fixed number of MSIs is less flexible. E.g:
>>>>
>>>> - for x86, processor can only deal with about 250 interrupt vectors.
>>>> - driver may choose to share MSI vectors [1] (which is not merged but we will for sure need it)
>>> Thanks for the info:)
>>> X86 systems roughly have NCPU * 200 vectors available for device interrupts.
>>> The proposed patch tries to map multiple event sources to an interrupt vector, to avoid running out of x86 CPU vectors.
>>> Many virtio mmio devices may have several or tens of event sources, and it’s rare to have hundreds of event sources.
>>> So could we treat the dynamic mapping between event sources and interrupt vectors as an advanced optional feature?
>>>
>> Maybe, but I still prefer to implement it if it is not too complex. Let's
>> see Michael's opinion on this.
>>
>> Thanks
> I think a way for the device to limit # of vectors in use by driver is
> useful. But sharing of vectors doesn't really need any special
> registers, just program the same vector for multiple Qs/interrupts.


Right, but sine the #vectors is limited, we still need dynamic mapping 
like what is done in PCI.

Thanks


>


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-06  2:51                     ` Jason Wang
  0 siblings, 0 replies; 60+ messages in thread
From: Jason Wang @ 2020-01-06  2:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Liu, Jiang, Liu, Jing2, Zha Bin, linux-kernel, slp, virtio-dev,
	jing2.liu, chao.p.peng


On 2020/1/5 下午7:25, Michael S. Tsirkin wrote:
> On Fri, Jan 03, 2020 at 05:12:38PM +0800, Jason Wang wrote:
>> On 2020/1/3 下午2:14, Liu, Jiang wrote:
>>>> Ok, I get you now.
>>>>
>>>> But still, having fixed number of MSIs is less flexible. E.g:
>>>>
>>>> - for x86, processor can only deal with about 250 interrupt vectors.
>>>> - driver may choose to share MSI vectors [1] (which is not merged but we will for sure need it)
>>> Thanks for the info:)
>>> X86 systems roughly have NCPU * 200 vectors available for device interrupts.
>>> The proposed patch tries to map multiple event sources to an interrupt vector, to avoid running out of x86 CPU vectors.
>>> Many virtio mmio devices may have several or tens of event sources, and it’s rare to have hundreds of event sources.
>>> So could we treat the dynamic mapping between event sources and interrupt vectors as an advanced optional feature?
>>>
>> Maybe, but I still prefer to implement it if it is not too complex. Let's
>> see Michael's opinion on this.
>>
>> Thanks
> I think a way for the device to limit # of vectors in use by driver is
> useful. But sharing of vectors doesn't really need any special
> registers, just program the same vector for multiple Qs/interrupts.


Right, but sine the #vectors is limited, we still need dynamic mapping 
like what is done in PCI.

Thanks


>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-05 10:42             ` [virtio-dev] " Michael S. Tsirkin
@ 2020-01-06  7:24               ` Liu, Jing2
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-06  7:24 UTC (permalink / raw)
  To: Michael S. Tsirkin, Liu, Jiang
  Cc: Jason Wang, Zha Bin, linux-kernel, slp, virtio-dev, jing2.liu,
	chao.p.peng


On 1/5/2020 6:42 PM, Michael S. Tsirkin wrote:
> On Thu, Dec 26, 2019 at 09:16:19PM +0800, Liu, Jiang wrote:
>>> 2) The mask and unmask control is missed
>>>
>>>
>>>>   but the extension doesn’t support 3) because
>>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
> Linux uses masking/unmasking in order to migrate interrupts between
> CPUs.

Hi Michael,

Thanks for reviewing the patches!

When trying to study the mask/unmask use case during migrating irq, it 
seems being used e.g.

1) migrate irq(s) away from offline cpu

2) irq affinity is changing, while an interrupt comes so it sets 
SETAFFINITY_PENDING and

the lapic (e.g. x86) does the mask and unmask to finish the pending 
during ack.

Is this right? So we should have mask/unmask for each vector.

Thanks,

Jing


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-06  7:24               ` Liu, Jing2
  0 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-06  7:24 UTC (permalink / raw)
  To: Michael S. Tsirkin, Liu, Jiang
  Cc: Jason Wang, Zha Bin, linux-kernel, slp, virtio-dev, jing2.liu,
	chao.p.peng


On 1/5/2020 6:42 PM, Michael S. Tsirkin wrote:
> On Thu, Dec 26, 2019 at 09:16:19PM +0800, Liu, Jiang wrote:
>>> 2) The mask and unmask control is missed
>>>
>>>
>>>>   but the extension doesn’t support 3) because
>>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
> Linux uses masking/unmasking in order to migrate interrupts between
> CPUs.

Hi Michael,

Thanks for reviewing the patches!

When trying to study the mask/unmask use case during migrating irq, it 
seems being used e.g.

1) migrate irq(s) away from offline cpu

2) irq affinity is changing, while an interrupt comes so it sets 
SETAFFINITY_PENDING and

the lapic (e.g. x86) does the mask and unmask to finish the pending 
during ack.

Is this right? So we should have mask/unmask for each vector.

Thanks,

Jing


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-05 11:04     ` [virtio-dev] " Michael S. Tsirkin
@ 2020-01-09  6:15       ` Liu, Jing2
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-09  6:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 25, 2019 at 10:50:23AM +0800, Zha Bin wrote:
>> From: Liu Jiang<gerry@linux.alibaba.com>
>>
>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>> virtio over mmio devices as a lightweight machine model for modern
>> cloud. The standard virtio over MMIO transport layer only supports one
>> legacy interrupt, which is much heavier than virtio over PCI transport
>> layer using MSI. Legacy interrupt has long work path and causes specific
>> VMExits in following cases, which would considerably slow down the
>> performance:
>>
>> 1) read interrupt status register
>> 2) update interrupt status register
>> 3) write IOAPIC EOI register
>>
>> We proposed to update virtio over MMIO to version 3[1] to add the
>> following new features and enhance the performance.
>>
>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>     interrupt performance for virtio multi-queue devices
>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>     to the doorbells provided by virtio devices.
> Do we need to come up with new "doorbell" terminology?
> virtio spec calls these available event notifications,
> let's stick to this.

Yes, let's keep virtio words, which just calls notifications.

>> The following is the network tcp_rr performance testing report, tested
>> with virtio-pci device, vanilla virtio-mmio device and patched
>> virtio-mmio device (run test 3 times for each case):
>>
>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>
>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>> 	trans/s	    9536	6939		9500
>> 	trans/s	    9734	7029		9749
>> 	trans/s	    9894	7095		9318
>>
>> [1]https://lkml.org/lkml/2019/12/20/113
>>
>> Signed-off-by: Liu Jiang<gerry@linux.alibaba.com>
>> Signed-off-by: Zha Bin<zhabin@linux.alibaba.com>
>> Signed-off-by: Chao Peng<chao.p.peng@linux.intel.com>
>> Signed-off-by: Jing Liu<jing2.liu@linux.intel.com>
> Do we need a new version though? What is wrong with
> a feature bit? This way we can create compatible devices
> and drivers.

We considered using 1 feature bit of 24~37 to specify MSI capability, but

this feature bit only means for mmio transport layer, but not representing

comment feature negotiation of the virtio device. So we're not sure if 
this is a good choice.

>> [...]
>>   
>> +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
>> +{
>> +	struct device *dev = desc->dev;
>> +	struct virtio_device *vdev = dev_to_virtio(dev);
>> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> +	void __iomem *pos = vm_dev->base;
>> +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
>> +			desc->platform.msi_index);
>> +
>> +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
>> +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
>> +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
>> +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
>> +}
> All this can happen when IRQ affinity changes while device
> is sending interrupts. An interrupt sent between the writel
> operations will then be directed incorrectly.

When investigating kernel MSI behavior, I found in most case there's no 
action during IRQ affinity changes to avoid the interrupt coming.

For example, when migrate_one_irq, it masks the irq before 
irq_do_set_affinity. But for others, like user setting any irq affinity

via /proc/, it only holds desc->lock instead of disable/mask irq. In 
such case, how can it ensure the interrupt sending between writel ops?


>> [...]
>> +
>> +/* RO: MSI feature enabled mask */
>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000
> I don't understand the comment. Is this a way for
> a version 3 device to say "I want/do not want MSI"?
> Why not just use a feature bit? We are not short on these.

This is just used for current MSI enabled/disabled status, after all MSI 
configurations setup finished.

Not for showing MSI capability.

In other words, since the concern of feature bit, we choose to update 
the virtio mmio

version that devices with v3 have MSI capability and notifications.


Thanks,

Jing


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-09  6:15       ` Liu, Jing2
  0 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-09  6:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 25, 2019 at 10:50:23AM +0800, Zha Bin wrote:
>> From: Liu Jiang<gerry@linux.alibaba.com>
>>
>> Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
>> virtio over mmio devices as a lightweight machine model for modern
>> cloud. The standard virtio over MMIO transport layer only supports one
>> legacy interrupt, which is much heavier than virtio over PCI transport
>> layer using MSI. Legacy interrupt has long work path and causes specific
>> VMExits in following cases, which would considerably slow down the
>> performance:
>>
>> 1) read interrupt status register
>> 2) update interrupt status register
>> 3) write IOAPIC EOI register
>>
>> We proposed to update virtio over MMIO to version 3[1] to add the
>> following new features and enhance the performance.
>>
>> 1) Support Message Signaled Interrupt(MSI), which increases the
>>     interrupt performance for virtio multi-queue devices
>> 2) Support per-queue doorbell, so the guest kernel may directly write
>>     to the doorbells provided by virtio devices.
> Do we need to come up with new "doorbell" terminology?
> virtio spec calls these available event notifications,
> let's stick to this.

Yes, let's keep virtio words, which just calls notifications.

>> The following is the network tcp_rr performance testing report, tested
>> with virtio-pci device, vanilla virtio-mmio device and patched
>> virtio-mmio device (run test 3 times for each case):
>>
>> 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
>>
>> 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
>> 	trans/s	    9536	6939		9500
>> 	trans/s	    9734	7029		9749
>> 	trans/s	    9894	7095		9318
>>
>> [1]https://lkml.org/lkml/2019/12/20/113
>>
>> Signed-off-by: Liu Jiang<gerry@linux.alibaba.com>
>> Signed-off-by: Zha Bin<zhabin@linux.alibaba.com>
>> Signed-off-by: Chao Peng<chao.p.peng@linux.intel.com>
>> Signed-off-by: Jing Liu<jing2.liu@linux.intel.com>
> Do we need a new version though? What is wrong with
> a feature bit? This way we can create compatible devices
> and drivers.

We considered using 1 feature bit of 24~37 to specify MSI capability, but

this feature bit only means for mmio transport layer, but not representing

comment feature negotiation of the virtio device. So we're not sure if 
this is a good choice.

>> [...]
>>   
>> +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
>> +{
>> +	struct device *dev = desc->dev;
>> +	struct virtio_device *vdev = dev_to_virtio(dev);
>> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> +	void __iomem *pos = vm_dev->base;
>> +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
>> +			desc->platform.msi_index);
>> +
>> +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
>> +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
>> +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
>> +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
>> +}
> All this can happen when IRQ affinity changes while device
> is sending interrupts. An interrupt sent between the writel
> operations will then be directed incorrectly.

When investigating kernel MSI behavior, I found in most case there's no 
action during IRQ affinity changes to avoid the interrupt coming.

For example, when migrate_one_irq, it masks the irq before 
irq_do_set_affinity. But for others, like user setting any irq affinity

via /proc/, it only holds desc->lock instead of disable/mask irq. In 
such case, how can it ensure the interrupt sending between writel ops?


>> [...]
>> +
>> +/* RO: MSI feature enabled mask */
>> +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000
> I don't understand the comment. Is this a way for
> a version 3 device to say "I want/do not want MSI"?
> Why not just use a feature bit? We are not short on these.

This is just used for current MSI enabled/disabled status, after all MSI 
configurations setup finished.

Not for showing MSI capability.

In other words, since the concern of feature bit, we choose to update 
the virtio mmio

version that devices with v3 have MSI capability and notifications.


Thanks,

Jing


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-09  6:15       ` [virtio-dev] " Liu, Jing2
@ 2020-01-09 13:26         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-09 13:26 UTC (permalink / raw)
  To: Liu, Jing2
  Cc: Zha Bin, linux-kernel, jasowang, slp, virtio-dev, gerry,
	jing2.liu, chao.p.peng

On Thu, Jan 09, 2020 at 02:15:51PM +0800, Liu, Jing2 wrote:
> 
> On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
> > On Wed, Dec 25, 2019 at 10:50:23AM +0800, Zha Bin wrote:
> > > From: Liu Jiang<gerry@linux.alibaba.com>
> > > 
> > > Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> > > virtio over mmio devices as a lightweight machine model for modern
> > > cloud. The standard virtio over MMIO transport layer only supports one
> > > legacy interrupt, which is much heavier than virtio over PCI transport
> > > layer using MSI. Legacy interrupt has long work path and causes specific
> > > VMExits in following cases, which would considerably slow down the
> > > performance:
> > > 
> > > 1) read interrupt status register
> > > 2) update interrupt status register
> > > 3) write IOAPIC EOI register
> > > 
> > > We proposed to update virtio over MMIO to version 3[1] to add the
> > > following new features and enhance the performance.
> > > 
> > > 1) Support Message Signaled Interrupt(MSI), which increases the
> > >     interrupt performance for virtio multi-queue devices
> > > 2) Support per-queue doorbell, so the guest kernel may directly write
> > >     to the doorbells provided by virtio devices.
> > Do we need to come up with new "doorbell" terminology?
> > virtio spec calls these available event notifications,
> > let's stick to this.
> 
> Yes, let's keep virtio words, which just calls notifications.
> 
> > > The following is the network tcp_rr performance testing report, tested
> > > with virtio-pci device, vanilla virtio-mmio device and patched
> > > virtio-mmio device (run test 3 times for each case):
> > > 
> > > 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
> > > 
> > > 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> > > 	trans/s	    9536	6939		9500
> > > 	trans/s	    9734	7029		9749
> > > 	trans/s	    9894	7095		9318
> > > 
> > > [1]https://lkml.org/lkml/2019/12/20/113
> > > 
> > > Signed-off-by: Liu Jiang<gerry@linux.alibaba.com>
> > > Signed-off-by: Zha Bin<zhabin@linux.alibaba.com>
> > > Signed-off-by: Chao Peng<chao.p.peng@linux.intel.com>
> > > Signed-off-by: Jing Liu<jing2.liu@linux.intel.com>
> > Do we need a new version though? What is wrong with
> > a feature bit? This way we can create compatible devices
> > and drivers.
> 
> We considered using 1 feature bit of 24~37 to specify MSI capability, but
> 
> this feature bit only means for mmio transport layer, but not representing
> 
> comment feature negotiation of the virtio device. So we're not sure if this
> is a good choice.

We are not short on bits, just don't use bits below 32
since these are for legacy devices.


> > > [...]
> > > +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > +{
> > > +	struct device *dev = desc->dev;
> > > +	struct virtio_device *vdev = dev_to_virtio(dev);
> > > +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> > > +	void __iomem *pos = vm_dev->base;
> > > +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
> > > +			desc->platform.msi_index);
> > > +
> > > +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
> > > +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
> > > +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
> > > +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
> > > +}
> > All this can happen when IRQ affinity changes while device
> > is sending interrupts. An interrupt sent between the writel
> > operations will then be directed incorrectly.
> 
> When investigating kernel MSI behavior, I found in most case there's no
> action during IRQ affinity changes to avoid the interrupt coming.
> 
> For example, when migrate_one_irq, it masks the irq before
> irq_do_set_affinity. But for others, like user setting any irq affinity
> 
> via /proc/, it only holds desc->lock instead of disable/mask irq. In such
> case, how can it ensure the interrupt sending between writel ops?

Could be a bug too. E.g. PCI spec explicitly says it's illegal to
change non-masked interrupts exactly for this reason.



> 
> > > [...]
> > > +
> > > +/* RO: MSI feature enabled mask */
> > > +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000
> > I don't understand the comment. Is this a way for
> > a version 3 device to say "I want/do not want MSI"?
> > Why not just use a feature bit? We are not short on these.
> 
> This is just used for current MSI enabled/disabled status, after all MSI
> configurations setup finished.
> 
> Not for showing MSI capability.
> 
> In other words, since the concern of feature bit, we choose to update the
> virtio mmio
> 
> version that devices with v3 have MSI capability and notifications.
> 
> 
> Thanks,
> 
> Jing

MSI looks like an optimization. I don't see how that
justifies incrementing a major version and breaking
compat with all existing guests.

-- 
MST


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-09 13:26         ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-09 13:26 UTC (permalink / raw)
  To: Liu, Jing2
  Cc: Zha Bin, linux-kernel, jasowang, slp, virtio-dev, gerry,
	jing2.liu, chao.p.peng

On Thu, Jan 09, 2020 at 02:15:51PM +0800, Liu, Jing2 wrote:
> 
> On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
> > On Wed, Dec 25, 2019 at 10:50:23AM +0800, Zha Bin wrote:
> > > From: Liu Jiang<gerry@linux.alibaba.com>
> > > 
> > > Userspace VMMs (e.g. Qemu microvm, Firecracker) take advantage of using
> > > virtio over mmio devices as a lightweight machine model for modern
> > > cloud. The standard virtio over MMIO transport layer only supports one
> > > legacy interrupt, which is much heavier than virtio over PCI transport
> > > layer using MSI. Legacy interrupt has long work path and causes specific
> > > VMExits in following cases, which would considerably slow down the
> > > performance:
> > > 
> > > 1) read interrupt status register
> > > 2) update interrupt status register
> > > 3) write IOAPIC EOI register
> > > 
> > > We proposed to update virtio over MMIO to version 3[1] to add the
> > > following new features and enhance the performance.
> > > 
> > > 1) Support Message Signaled Interrupt(MSI), which increases the
> > >     interrupt performance for virtio multi-queue devices
> > > 2) Support per-queue doorbell, so the guest kernel may directly write
> > >     to the doorbells provided by virtio devices.
> > Do we need to come up with new "doorbell" terminology?
> > virtio spec calls these available event notifications,
> > let's stick to this.
> 
> Yes, let's keep virtio words, which just calls notifications.
> 
> > > The following is the network tcp_rr performance testing report, tested
> > > with virtio-pci device, vanilla virtio-mmio device and patched
> > > virtio-mmio device (run test 3 times for each case):
> > > 
> > > 	netperf -t TCP_RR -H 192.168.1.36 -l 30 -- -r 32,1024
> > > 
> > > 		Virtio-PCI    Virtio-MMIO   Virtio-MMIO(MSI)
> > > 	trans/s	    9536	6939		9500
> > > 	trans/s	    9734	7029		9749
> > > 	trans/s	    9894	7095		9318
> > > 
> > > [1]https://lkml.org/lkml/2019/12/20/113
> > > 
> > > Signed-off-by: Liu Jiang<gerry@linux.alibaba.com>
> > > Signed-off-by: Zha Bin<zhabin@linux.alibaba.com>
> > > Signed-off-by: Chao Peng<chao.p.peng@linux.intel.com>
> > > Signed-off-by: Jing Liu<jing2.liu@linux.intel.com>
> > Do we need a new version though? What is wrong with
> > a feature bit? This way we can create compatible devices
> > and drivers.
> 
> We considered using 1 feature bit of 24~37 to specify MSI capability, but
> 
> this feature bit only means for mmio transport layer, but not representing
> 
> comment feature negotiation of the virtio device. So we're not sure if this
> is a good choice.

We are not short on bits, just don't use bits below 32
since these are for legacy devices.


> > > [...]
> > > +static void mmio_write_msi_msg(struct msi_desc *desc, struct msi_msg *msg)
> > > +{
> > > +	struct device *dev = desc->dev;
> > > +	struct virtio_device *vdev = dev_to_virtio(dev);
> > > +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
> > > +	void __iomem *pos = vm_dev->base;
> > > +	uint16_t cmd = VIRTIO_MMIO_MSI_CMD(VIRTIO_MMIO_MSI_CMD_UPDATE,
> > > +			desc->platform.msi_index);
> > > +
> > > +	writel(msg->address_lo, pos + VIRTIO_MMIO_MSI_ADDRESS_LOW);
> > > +	writel(msg->address_hi, pos + VIRTIO_MMIO_MSI_ADDRESS_HIGH);
> > > +	writel(msg->data, pos + VIRTIO_MMIO_MSI_DATA);
> > > +	writew(cmd, pos + VIRTIO_MMIO_MSI_COMMAND);
> > > +}
> > All this can happen when IRQ affinity changes while device
> > is sending interrupts. An interrupt sent between the writel
> > operations will then be directed incorrectly.
> 
> When investigating kernel MSI behavior, I found in most case there's no
> action during IRQ affinity changes to avoid the interrupt coming.
> 
> For example, when migrate_one_irq, it masks the irq before
> irq_do_set_affinity. But for others, like user setting any irq affinity
> 
> via /proc/, it only holds desc->lock instead of disable/mask irq. In such
> case, how can it ensure the interrupt sending between writel ops?

Could be a bug too. E.g. PCI spec explicitly says it's illegal to
change non-masked interrupts exactly for this reason.



> 
> > > [...]
> > > +
> > > +/* RO: MSI feature enabled mask */
> > > +#define VIRTIO_MMIO_MSI_ENABLE_MASK	0x8000
> > I don't understand the comment. Is this a way for
> > a version 3 device to say "I want/do not want MSI"?
> > Why not just use a feature bit? We are not short on these.
> 
> This is just used for current MSI enabled/disabled status, after all MSI
> configurations setup finished.
> 
> Not for showing MSI capability.
> 
> In other words, since the concern of feature bit, we choose to update the
> virtio mmio
> 
> version that devices with v3 have MSI capability and notifications.
> 
> 
> Thanks,
> 
> Jing

MSI looks like an optimization. I don't see how that
justifies incrementing a major version and breaking
compat with all existing guests.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-05 10:42             ` [virtio-dev] " Michael S. Tsirkin
  (?)
  (?)
@ 2020-01-09 16:06             ` Liu, Jiang
  2020-01-09 16:47                 ` [virtio-dev] " Michael S. Tsirkin
  -1 siblings, 1 reply; 60+ messages in thread
From: Liu, Jiang @ 2020-01-09 16:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, Zha Bin, linux-kernel, slp, virtio-dev, jing2.liu,
	chao.p.peng

On Jan 5, 2020, at 6:42 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> 
> On Thu, Dec 26, 2019 at 09:16:19PM +0800, Liu, Jiang wrote:
>>> 2) The mask and unmask control is missed
>>> 
>>> 
>>>> but the extension doesn’t support 3) because
>>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
> 
> Linux uses masking/unmasking in order to migrate interrupts between
> CPUs.
This is a limitation of the PCI MSI/MSIx spec.
To update the MSI/MSIx vector configuration, we need to write to msg_high/msg_low/msg_data registers.
Because write to three 32-bit registers is not an atomic operation on PCI bus, so it may cause incorrect interrupt delivery if interrupt happens after writing 1 or 2 registers.
When Intel remapping is enabled on x86 platforms, we don’t need to mask/unmask PCI MSI/MSIx interrupts when setting affinity.
For MMIO MSI extension, we have special design to solve this race window. The flow to update MMIO MSI vector configuration is:
1) write msg_high
2) write msg_low
3) write msg_data
4) write the command register to update the vector configuration.
During step 1-3, the hardware/device backend driver only caches the value written. And update the vector configuration in step 4, so it’s an atomic operation now.
So mask/unmask becomes optional for MMIO MSI interrupts.

> 
> -- 
> MST


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-09 16:06             ` Liu, Jiang
@ 2020-01-09 16:47                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-09 16:47 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Jason Wang, Zha Bin, linux-kernel, slp, virtio-dev, jing2.liu,
	chao.p.peng

On Fri, Jan 10, 2020 at 12:06:06AM +0800, Liu, Jiang wrote:
> On Jan 5, 2020, at 6:42 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > 
> > On Thu, Dec 26, 2019 at 09:16:19PM +0800, Liu, Jiang wrote:
> >>> 2) The mask and unmask control is missed
> >>> 
> >>> 
> >>>> but the extension doesn’t support 3) because
> >>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
> > 
> > Linux uses masking/unmasking in order to migrate interrupts between
> > CPUs.
> This is a limitation of the PCI MSI/MSIx spec.
> To update the MSI/MSIx vector configuration, we need to write to msg_high/msg_low/msg_data registers.
> Because write to three 32-bit registers is not an atomic operation on PCI bus, so it may cause incorrect interrupt delivery if interrupt happens after writing 1 or 2 registers.
> When Intel remapping is enabled on x86 platforms, we don’t need to mask/unmask PCI MSI/MSIx interrupts when setting affinity.
> For MMIO MSI extension, we have special design to solve this race window. The flow to update MMIO MSI vector configuration is:
> 1) write msg_high
> 2) write msg_low
> 3) write msg_data
> 4) write the command register to update the vector configuration.
> During step 1-3, the hardware/device backend driver only caches the value written. And update the vector configuration in step 4, so it’s an atomic operation now.
> So mask/unmask becomes optional for MMIO MSI interrupts.

Oh I see. That needs some documentation I guess.

One question though: how do you block VQ from generating interrupts?

There's value in doing that for performance since then device can avoid
re-checking interrupt enable/event idx values in memory.



> > 
> > -- 
> > MST


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-09 16:47                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 60+ messages in thread
From: Michael S. Tsirkin @ 2020-01-09 16:47 UTC (permalink / raw)
  To: Liu, Jiang
  Cc: Jason Wang, Zha Bin, linux-kernel, slp, virtio-dev, jing2.liu,
	chao.p.peng

On Fri, Jan 10, 2020 at 12:06:06AM +0800, Liu, Jiang wrote:
> On Jan 5, 2020, at 6:42 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > 
> > On Thu, Dec 26, 2019 at 09:16:19PM +0800, Liu, Jiang wrote:
> >>> 2) The mask and unmask control is missed
> >>> 
> >>> 
> >>>> but the extension doesn’t support 3) because
> >>>> we noticed that the Linux virtio subsystem doesn’t really make use of interrupt masking/unmasking.
> > 
> > Linux uses masking/unmasking in order to migrate interrupts between
> > CPUs.
> This is a limitation of the PCI MSI/MSIx spec.
> To update the MSI/MSIx vector configuration, we need to write to msg_high/msg_low/msg_data registers.
> Because write to three 32-bit registers is not an atomic operation on PCI bus, so it may cause incorrect interrupt delivery if interrupt happens after writing 1 or 2 registers.
> When Intel remapping is enabled on x86 platforms, we don’t need to mask/unmask PCI MSI/MSIx interrupts when setting affinity.
> For MMIO MSI extension, we have special design to solve this race window. The flow to update MMIO MSI vector configuration is:
> 1) write msg_high
> 2) write msg_low
> 3) write msg_data
> 4) write the command register to update the vector configuration.
> During step 1-3, the hardware/device backend driver only caches the value written. And update the vector configuration in step 4, so it’s an atomic operation now.
> So mask/unmask becomes optional for MMIO MSI interrupts.

Oh I see. That needs some documentation I guess.

One question though: how do you block VQ from generating interrupts?

There's value in doing that for performance since then device can avoid
re-checking interrupt enable/event idx values in memory.



> > 
> > -- 
> > MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-05 11:04     ` [virtio-dev] " Michael S. Tsirkin
@ 2020-01-15  7:06       ` Liu, Jing2
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-15  7:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
> [...]
>> +static int vm_find_vqs_msi(struct virtio_device *vdev, unsigned int nvqs,
>> +			struct virtqueue *vqs[], vq_callback_t *callbacks[],
>> +			const char * const names[], const bool *ctx,
>> +			struct irq_affinity *desc)
>> +{
>> +	int i, err, irq;
>> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> +
>> +	/* Allocate nvqs irqs for queues and one irq for configuration */
>> +	err = vm_request_msi_vectors(vdev, nvqs + 1);
>> +	if (err != 0)
>> +		return err;
> Not all devices need high speed.  Some might want to share
> irqs between VQs, or even with config change callback.
> Balloon is a case in point.
> A hint about max # of MSI necessary would be a good
> idea for this case.

This seems being a hint about telling MSI number device supported and 
whether it wants MSI sharing.

For devices with tens of queues at most or need high speed, they choose 
vector per queue, and can simply use fixed mapping.

For others, it can ask for advanced mode, which means MSI sharing and 
dynamic mapping.

What about let device decide the mode it would use, as follows.

MaxVecNum 32bit - The max msi vector number that device supports.

MsiState 32bit

- bit[x]=0 implies vec per queue/config and fixed mapping. In this case, 
MsiVecNum>=num_queue+1

- bit [x]=1 implies the hint of msi sharing and dynamic mapping. In this 
case, MsiVecNum<num_queue+1


Thanks,

Jing

>
> Sharing MSI doesn't necessarily require dedicated registers like PCI has,
> you can just program same vector in multiple VQs.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-15  7:06       ` Liu, Jing2
  0 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-15  7:06 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
> [...]
>> +static int vm_find_vqs_msi(struct virtio_device *vdev, unsigned int nvqs,
>> +			struct virtqueue *vqs[], vq_callback_t *callbacks[],
>> +			const char * const names[], const bool *ctx,
>> +			struct irq_affinity *desc)
>> +{
>> +	int i, err, irq;
>> +	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
>> +
>> +	/* Allocate nvqs irqs for queues and one irq for configuration */
>> +	err = vm_request_msi_vectors(vdev, nvqs + 1);
>> +	if (err != 0)
>> +		return err;
> Not all devices need high speed.  Some might want to share
> irqs between VQs, or even with config change callback.
> Balloon is a case in point.
> A hint about max # of MSI necessary would be a good
> idea for this case.

This seems being a hint about telling MSI number device supported and 
whether it wants MSI sharing.

For devices with tens of queues at most or need high speed, they choose 
vector per queue, and can simply use fixed mapping.

For others, it can ask for advanced mode, which means MSI sharing and 
dynamic mapping.

What about let device decide the mode it would use, as follows.

MaxVecNum 32bit - The max msi vector number that device supports.

MsiState 32bit

- bit[x]=0 implies vec per queue/config and fixed mapping. In this case, 
MsiVecNum>=num_queue+1

- bit [x]=1 implies the hint of msi sharing and dynamic mapping. In this 
case, MsiVecNum<num_queue+1


Thanks,

Jing

>
> Sharing MSI doesn't necessarily require dedicated registers like PCI has,
> you can just program same vector in multiple VQs.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
  2019-12-25  2:50 ` [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi Zha Bin
  2019-12-26  8:21     ` [virtio-dev] " Jason Wang
@ 2020-01-17 13:58   ` Thomas Gleixner
  2020-01-17 14:06     ` Liu, Jiang
  2020-01-19  2:27       ` [virtio-dev] " Liu, Jing2
  1 sibling, 2 replies; 60+ messages in thread
From: Thomas Gleixner @ 2020-01-17 13:58 UTC (permalink / raw)
  To: Zha Bin, linux-kernel
  Cc: mst, jasowang, slp, virtio-dev, gerry, zhabin, jing2.liu, chao.p.peng

Zha Bin <zhabin@linux.alibaba.com> writes:

> From: Liu Jiang <gerry@linux.alibaba.com>
>
> The x86 platform currently only supports specific MSI/MSI-x for PCI
> devices. To enable MSI to the platform devices such as virtio-mmio
> device, this patch enhances x86 with platform MSI by leveraging the
> already built-in platform-msi driver (drivers/base/platform-msi.c)
> and makes it as a configurable option.

Why are you trying to make this an architecture feature instead of
having a generic implementation which can work on all architectures
which support virtio-mmio?

> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>

This Signed-off-by chain is invalid.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
  2020-01-17 13:58   ` Thomas Gleixner
@ 2020-01-17 14:06     ` Liu, Jiang
  2020-01-19  2:27       ` [virtio-dev] " Liu, Jing2
  1 sibling, 0 replies; 60+ messages in thread
From: Liu, Jiang @ 2020-01-17 14:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Zha Bin, linux-kernel, Michael S. Tsirkin, jasowang, slp,
	virtio-dev, jing2.liu, chao.p.peng



> On Jan 17, 2020, at 9:58 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> Zha Bin <zhabin@linux.alibaba.com> writes:
> 
>> From: Liu Jiang <gerry@linux.alibaba.com>
>> 
>> The x86 platform currently only supports specific MSI/MSI-x for PCI
>> devices. To enable MSI to the platform devices such as virtio-mmio
>> device, this patch enhances x86 with platform MSI by leveraging the
>> already built-in platform-msi driver (drivers/base/platform-msi.c)
>> and makes it as a configurable option.
> 
> Why are you trying to make this an architecture feature instead of
> having a generic implementation which can work on all architectures
> which support virtio-mmio?
Thanks, Thomas!
We are reworking to implement it as a generic irqdomain for all virtio MMIO devices.

> 
>> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
>> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
>> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
>> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
> 
> This Signed-off-by chain is invalid.
> 
> Thanks,
> 
>        tglx


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
  2020-01-17 13:58   ` Thomas Gleixner
@ 2020-01-19  2:27       ` Liu, Jing2
  2020-01-19  2:27       ` [virtio-dev] " Liu, Jing2
  1 sibling, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-19  2:27 UTC (permalink / raw)
  To: Thomas Gleixner, Zha Bin, linux-kernel
  Cc: mst, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/17/2020 9:58 PM, Thomas Gleixner wrote:
>> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
>> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
>> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
>> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
> This Signed-off-by chain is invalid.
>
> Thanks,
>
>          tglx

Could I ask how can we handle such co-author situation?

Thanks!

Jing


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [virtio-dev] Re: [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
@ 2020-01-19  2:27       ` Liu, Jing2
  0 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-19  2:27 UTC (permalink / raw)
  To: Thomas Gleixner, Zha Bin, linux-kernel
  Cc: mst, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/17/2020 9:58 PM, Thomas Gleixner wrote:
>> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
>> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
>> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
>> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
> This Signed-off-by chain is invalid.
>
> Thanks,
>
>          tglx

Could I ask how can we handle such co-author situation?

Thanks!

Jing


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi
  2020-01-19  2:27       ` [virtio-dev] " Liu, Jing2
  (?)
@ 2020-01-19 14:57       ` Thomas Gleixner
  -1 siblings, 0 replies; 60+ messages in thread
From: Thomas Gleixner @ 2020-01-19 14:57 UTC (permalink / raw)
  To: Liu, Jing2, Zha Bin, linux-kernel
  Cc: mst, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng

"Liu, Jing2" <jing2.liu@linux.intel.com> writes:
> On 1/17/2020 9:58 PM, Thomas Gleixner wrote:
>>> Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
>>> Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
>>> Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
>>> Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
>> This Signed-off-by chain is invalid.
>>
>
> Could I ask how can we handle such co-author situation?

Search for Co-developed-by in Documentaiton,

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
  2020-01-05 11:04     ` [virtio-dev] " Michael S. Tsirkin
@ 2020-01-21  5:03       ` Liu, Jing2
  -1 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-21  5:03 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
>
>>   
>>   struct virtio_mmio_vq_info {
>> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>>   };
>>   
>>   
>> +static void vm_free_msi_irqs(struct virtio_device *vdev);
>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs);
>>   
>>   /* Configuration interface */
>>   
>> @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>>   {
>>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
>>   
>> +	if (vm_dev->version == 3) {
>> +		int offset = vm_dev->doorbell_base +
>> +			     vm_dev->doorbell_scale * vq->index;
>> +		writel(vq->index, vm_dev->base + offset);
>> +	} else
>>   	/* We write the queue's selector into the notification register to
>>   	 * signal the other end */
>> -	writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
>> +		writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
>> +
>>   	return true;
>>   }
> You might want to support VIRTIO_F_NOTIFICATION_DATA too.
>
Yes, the feature is new in virtio1.1 and the kernel has not defined and 
implemented it yet.

To implement it for both PCI and MMIO, maybe it would be good to work a 
separate patch set later?

BTW, we are working on v2 and will send out later.

Thanks,

Jing


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [virtio-dev] Re: [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3
@ 2020-01-21  5:03       ` Liu, Jing2
  0 siblings, 0 replies; 60+ messages in thread
From: Liu, Jing2 @ 2020-01-21  5:03 UTC (permalink / raw)
  To: Michael S. Tsirkin, Zha Bin
  Cc: linux-kernel, jasowang, slp, virtio-dev, gerry, jing2.liu, chao.p.peng


On 1/5/2020 7:04 PM, Michael S. Tsirkin wrote:
>
>>   
>>   struct virtio_mmio_vq_info {
>> @@ -101,6 +107,8 @@ struct virtio_mmio_vq_info {
>>   };
>>   
>>   
>> +static void vm_free_msi_irqs(struct virtio_device *vdev);
>> +static int vm_request_msi_vectors(struct virtio_device *vdev, int nirqs);
>>   
>>   /* Configuration interface */
>>   
>> @@ -273,12 +281,28 @@ static bool vm_notify(struct virtqueue *vq)
>>   {
>>   	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vq->vdev);
>>   
>> +	if (vm_dev->version == 3) {
>> +		int offset = vm_dev->doorbell_base +
>> +			     vm_dev->doorbell_scale * vq->index;
>> +		writel(vq->index, vm_dev->base + offset);
>> +	} else
>>   	/* We write the queue's selector into the notification register to
>>   	 * signal the other end */
>> -	writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
>> +		writel(vq->index, vm_dev->base + VIRTIO_MMIO_QUEUE_NOTIFY);
>> +
>>   	return true;
>>   }
> You might want to support VIRTIO_F_NOTIFICATION_DATA too.
>
Yes, the feature is new in virtio1.1 and the kernel has not defined and 
implemented it yet.

To implement it for both PCI and MMIO, maybe it would be good to work a 
separate patch set later?

BTW, we are working on v2 and will send out later.

Thanks,

Jing


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2020-01-21  5:03 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-25  2:50 [PATCH v1 0/2] support virtio mmio specification Version 3 Zha Bin
2019-12-25  2:50 ` [PATCH v1 1/2] x86/msi: Enhance x86 to support platform_msi Zha Bin
2019-12-26  8:21   ` Jason Wang
2019-12-26  8:21     ` [virtio-dev] " Jason Wang
2020-01-17 13:58   ` Thomas Gleixner
2020-01-17 14:06     ` Liu, Jiang
2020-01-19  2:27     ` Liu, Jing2
2020-01-19  2:27       ` [virtio-dev] " Liu, Jing2
2020-01-19 14:57       ` Thomas Gleixner
2019-12-25  2:50 ` [PATCH v1 2/2] virtio-mmio: add features for virtio-mmio specification version 3 Zha Bin
2019-12-25 10:20   ` Jason Wang
2019-12-25 10:20     ` [virtio-dev] " Jason Wang
2019-12-25 15:20     ` Liu, Jiang
2019-12-26  8:09       ` Jason Wang
2019-12-26  8:09         ` [virtio-dev] " Jason Wang
2019-12-26 12:35         ` Liu, Jiang
2019-12-26 13:16         ` Liu, Jiang
2020-01-02  6:28           ` Jason Wang
2020-01-02  6:28             ` [virtio-dev] " Jason Wang
2020-01-03  6:50             ` Liu, Jiang
2020-01-05 10:42           ` Michael S. Tsirkin
2020-01-05 10:42             ` [virtio-dev] " Michael S. Tsirkin
2020-01-06  7:24             ` Liu, Jing2
2020-01-06  7:24               ` [virtio-dev] " Liu, Jing2
2020-01-09 16:06             ` Liu, Jiang
2020-01-09 16:47               ` Michael S. Tsirkin
2020-01-09 16:47                 ` [virtio-dev] " Michael S. Tsirkin
2019-12-25 22:28   ` kbuild test robot
2019-12-25 22:28     ` kbuild test robot
2019-12-26  1:44   ` kbuild test robot
2019-12-26  1:44     ` kbuild test robot
2019-12-26  8:40   ` Jason Wang
2019-12-26  8:40     ` [virtio-dev] " Jason Wang
2019-12-27  9:37     ` Liu, Jing2
2019-12-27  9:37       ` Liu, Jing2
2020-01-02  6:33       ` Jason Wang
2020-01-02  6:33         ` Jason Wang
2020-01-02  9:13         ` Liu, Jing2
2020-01-02  9:13           ` Liu, Jing2
2020-01-03  3:24           ` Jason Wang
2020-01-03  3:24             ` Jason Wang
2020-01-03  6:14             ` Liu, Jiang
2020-01-03  9:12               ` Jason Wang
2020-01-03  9:12                 ` Jason Wang
2020-01-05 11:25                 ` Michael S. Tsirkin
2020-01-05 11:25                   ` Michael S. Tsirkin
2020-01-06  2:51                   ` Jason Wang
2020-01-06  2:51                     ` Jason Wang
2020-01-05 11:04   ` Michael S. Tsirkin
2020-01-05 11:04     ` [virtio-dev] " Michael S. Tsirkin
2020-01-09  6:15     ` Liu, Jing2
2020-01-09  6:15       ` [virtio-dev] " Liu, Jing2
2020-01-09 13:26       ` Michael S. Tsirkin
2020-01-09 13:26         ` [virtio-dev] " Michael S. Tsirkin
2020-01-15  7:06     ` Liu, Jing2
2020-01-15  7:06       ` [virtio-dev] " Liu, Jing2
2020-01-21  5:03     ` Liu, Jing2
2020-01-21  5:03       ` Liu, Jing2
2020-01-02  6:55 ` [PATCH v1 0/2] support virtio mmio specification Version 3 Jason Wang
2020-01-02  6:55   ` [virtio-dev] " Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.