linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
@ 2015-10-04 20:43 Vlad Zolotarov
  2015-10-04 20:43 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
                   ` (4 more replies)
  0 siblings, 5 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:43 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, corbet, gregkh
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
 
Currently uio_pci_generic supports only legacy INT#x interrupts source. However
there are situations when this is not enough, for instance SR-IOV VF devices that
simply don't have INT#x capability. For such devices uio_pci_generic will simply
fail (more specifically probe() will fail).
 
When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
overhead and thus VFIO is not an option users that develop user-space drivers are left
without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
DPDK) just to be able to use UIO infrastructure.
 
This series provides a generic solution for this problem while preserving the original behaviour
for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).

New in v3:
   - Add __iomem qualifier to temp buffer receiving ioremap value.  

New in v2:
   - Added #include <linux/uaccess.h> to uio_pci_generic.c

Vlad Zolotarov (3):
  uio: add ioctl support
  uio_pci_generic: add MSI/MSI-X support
  Documentation: update uio-howto

 Documentation/DocBook/uio-howto.tmpl |  29 ++-
 drivers/uio/uio.c                    |  15 ++
 drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
 include/linux/uio_driver.h           |   3 +
 include/linux/uio_pci_generic.h      |  36 +++
 5 files changed, 467 insertions(+), 26 deletions(-)
 create mode 100644 include/linux/uio_pci_generic.h

-- 
2.1.0


^ permalink raw reply	[flat|nested] 96+ messages in thread

* [PATCH v3 1/3] uio: add ioctl support
  2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
@ 2015-10-04 20:43 ` Vlad Zolotarov
  2015-10-05  3:03   ` Greg KH
  2015-10-04 20:43 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:43 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, corbet, gregkh
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
 drivers/uio/uio.c          | 15 +++++++++++++++
 include/linux/uio_driver.h |  3 +++
 2 files changed, 18 insertions(+)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 8196581..714b0e5 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -704,6 +704,20 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
 	}
 }
 
+static long uio_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+	struct uio_listener *listener = filep->private_data;
+	struct uio_device *idev = listener->dev;
+
+	if (!idev->info)
+		return -EIO;
+
+	if (!idev->info->ioctl)
+		return -ENOTTY;
+
+	return idev->info->ioctl(idev->info, cmd, arg);
+}
+
 static const struct file_operations uio_fops = {
 	.owner		= THIS_MODULE,
 	.open		= uio_open,
@@ -712,6 +726,7 @@ static const struct file_operations uio_fops = {
 	.write		= uio_write,
 	.mmap		= uio_mmap,
 	.poll		= uio_poll,
+	.unlocked_ioctl	= uio_ioctl,
 	.fasync		= uio_fasync,
 	.llseek		= noop_llseek,
 };
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 32c0e83..10d7833 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -89,6 +89,7 @@ struct uio_device {
  * @mmap:		mmap operation for this uio device
  * @open:		open operation for this uio device
  * @release:		release operation for this uio device
+ * @ioctl:		ioctl handler
  * @irqcontrol:		disable/enable irqs when 0/1 is written to /dev/uioX
  */
 struct uio_info {
@@ -105,6 +106,8 @@ struct uio_info {
 	int (*open)(struct uio_info *info, struct inode *inode);
 	int (*release)(struct uio_info *info, struct inode *inode);
 	int (*irqcontrol)(struct uio_info *info, s32 irq_on);
+	int (*ioctl)(struct uio_info *info, unsigned int cmd,
+		     unsigned long arg);
 };
 
 extern int __must_check
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
  2015-10-04 20:43 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
@ 2015-10-04 20:43 ` Vlad Zolotarov
  2015-10-05  3:11   ` Greg KH
                     ` (2 more replies)
  2015-10-04 20:43 ` [PATCH v3 3/3] Documentation: update uio-howto Vlad Zolotarov
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:43 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, corbet, gregkh
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

Add support for MSI and MSI-X interrupt modes:
   - Interrupt mode selection order is:
        INT#X (for backward compatibility) -> MSI-X -> MSI.
   - Add ioctl() commands:
      - UIO_PCI_GENERIC_INT_MODE_GET: query the current interrupt mode.
      - UIO_PCI_GENERIC_IRQ_NUM_GET: query the maximum number of IRQs.
      - UIO_PCI_GENERIC_IRQ_SET: bind the IRQ to eventfd (similar to vfio).
   - Add mappings to all bars (memory and portio): some devices have
     registers related to MSI/MSI-X handling outside BAR0.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
New in v3:
   - Add __iomem qualifier to temp buffer receiving ioremap value.

New in v2:
   - Added #include <linux/uaccess.h> to uio_pci_generic.c

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
 drivers/uio/uio_pci_generic.c   | 410 +++++++++++++++++++++++++++++++++++++---
 include/linux/uio_pci_generic.h |  36 ++++
 2 files changed, 423 insertions(+), 23 deletions(-)
 create mode 100644 include/linux/uio_pci_generic.h

diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index d0b508b..6b8b1789 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -22,16 +22,32 @@
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/msi.h>
 #include <linux/slab.h>
 #include <linux/uio_driver.h>
+#include <linux/uio_pci_generic.h>
+#include <linux/eventfd.h>
+#include <linux/uaccess.h>
 
 #define DRIVER_VERSION	"0.01.0"
 #define DRIVER_AUTHOR	"Michael S. Tsirkin <mst@redhat.com>"
 #define DRIVER_DESC	"Generic UIO driver for PCI 2.3 devices"
 
+struct msix_info {
+	int num_irqs;
+	struct msix_entry *table;
+	struct uio_msix_irq_ctx {
+		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
+		char *name;			/* name in /proc/interrupts */
+	} *ctx;
+};
+
 struct uio_pci_generic_dev {
 	struct uio_info info;
 	struct pci_dev *pdev;
+	struct mutex msix_state_lock;		/* ioctl mutex */
+	enum uio_int_mode int_mode;
+	struct msix_info msix;
 };
 
 static inline struct uio_pci_generic_dev *
@@ -40,9 +56,177 @@ to_uio_pci_generic_dev(struct uio_info *info)
 	return container_of(info, struct uio_pci_generic_dev, info);
 }
 
-/* Interrupt handler. Read/modify/write the command register to disable
- * the interrupt. */
-static irqreturn_t irqhandler(int irq, struct uio_info *info)
+/* Unmap previously ioremap'd resources */
+static void release_iomaps(struct uio_pci_generic_dev *gdev)
+{
+	int i;
+	struct uio_mem *mem = gdev->info.mem;
+
+	for (i = 0; i < MAX_UIO_MAPS; i++, mem++) {
+		if (mem->internal_addr) {
+			iounmap(mem->internal_addr);
+			mem->internal_addr = NULL;
+		}
+	}
+}
+
+static int setup_maps(struct pci_dev *pdev, struct uio_info *info)
+{
+	int i, m = 0, p = 0, err;
+	static const char * const bar_names[] = {
+		"BAR0",	"BAR1",	"BAR2",	"BAR3",	"BAR4",	"BAR5",
+	};
+
+	for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
+		unsigned long start = pci_resource_start(pdev, i);
+		unsigned long flags = pci_resource_flags(pdev, i);
+		unsigned long len = pci_resource_len(pdev, i);
+
+		if (start == 0 || len == 0)
+			continue;
+
+		if (flags & IORESOURCE_MEM) {
+			void __iomem *addr;
+
+			if (m >= MAX_UIO_MAPS)
+				continue;
+
+			addr = ioremap(start, len);
+			if (addr == NULL) {
+				err = -EINVAL;
+				goto fail;
+			}
+
+			info->mem[m].name = bar_names[i];
+			info->mem[m].addr = start;
+			info->mem[m].internal_addr = addr;
+			info->mem[m].size = len;
+			info->mem[m].memtype = UIO_MEM_PHYS;
+			++m;
+		} else if (flags & IORESOURCE_IO) {
+			if (p >= MAX_UIO_PORT_REGIONS)
+				continue;
+
+			info->port[p].name = bar_names[i];
+			info->port[p].start = start;
+			info->port[p].size = len;
+			info->port[p].porttype = UIO_PORT_X86;
+			++p;
+		}
+	}
+
+	return 0;
+fail:
+	for (i = 0; i < m; i++) {
+		iounmap(info->mem[i].internal_addr);
+		info->mem[i].internal_addr = NULL;
+	}
+
+	return err;
+}
+
+static irqreturn_t msix_irqhandler(int irq, void *arg);
+
+/* set the mapping between vector # and existing eventfd. */
+static int set_irq_eventfd(struct uio_pci_generic_dev *gdev, int vec, int fd)
+{
+	struct uio_msix_irq_ctx *ctx;
+	struct eventfd_ctx *trigger;
+	struct pci_dev *pdev = gdev->pdev;
+	int irq, err;
+
+	if (vec >= gdev->msix.num_irqs) {
+		dev_notice(&gdev->pdev->dev, "vec %u >= num_vec %u\n",
+			   vec, gdev->msix.num_irqs);
+		return -ERANGE;
+	}
+
+	irq = gdev->msix.table[vec].vector;
+
+	/* Cleanup existing irq mapping */
+	ctx = &gdev->msix.ctx[vec];
+	if (ctx->trigger) {
+		free_irq(irq, ctx->trigger);
+		eventfd_ctx_put(ctx->trigger);
+		ctx->trigger = NULL;
+	}
+
+	/* Passing -1 is used to disable interrupt */
+	if (fd < 0)
+		return 0;
+
+
+	trigger = eventfd_ctx_fdget(fd);
+	if (IS_ERR(trigger)) {
+		err = PTR_ERR(trigger);
+		dev_notice(&gdev->pdev->dev,
+			   "eventfd ctx get failed: %d\n", err);
+		return err;
+	}
+
+	err = request_irq(irq, msix_irqhandler, 0, ctx->name, trigger);
+	if (err) {
+		dev_notice(&pdev->dev, "request irq failed: %d\n", err);
+		eventfd_ctx_put(trigger);
+		return err;
+	}
+
+	dev_dbg(&pdev->dev, "map vector %u to fd %d trigger %p\n",
+		vec, fd, trigger);
+	ctx->trigger = trigger;
+
+	return 0;
+}
+
+static int uio_pci_generic_ioctl(struct uio_info *info, unsigned int cmd,
+				 unsigned long arg)
+{
+	struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
+	struct uio_pci_generic_irq_set hdr;
+	int err;
+
+	switch (cmd) {
+	case UIO_PCI_GENERIC_IRQ_SET:
+		if (copy_from_user(&hdr, (void __user *)arg, sizeof(hdr)))
+			return -EFAULT;
+
+		/* Locking is needed to ensure two things:
+		 *  1) Two IRQ_SET ioctl()'s are not running in parallel.
+		 *  2) IRQ_SET ioctl() is not running in parallel with remove().
+		 */
+		mutex_lock(&gdev->msix_state_lock);
+		if (gdev->int_mode != UIO_INT_MODE_MSIX) {
+			mutex_unlock(&gdev->msix_state_lock);
+			return -EOPNOTSUPP;
+		}
+
+		err = set_irq_eventfd(gdev, hdr.vec, hdr.fd);
+		mutex_unlock(&gdev->msix_state_lock);
+
+		break;
+	case UIO_PCI_GENERIC_IRQ_NUM_GET:
+		if (gdev->int_mode == UIO_INT_MODE_NONE)
+			err = put_user(0, (u32 __user *)arg);
+		else if (gdev->int_mode != UIO_INT_MODE_MSIX)
+			err = put_user(1, (u32 __user *)arg);
+		else
+			err = put_user(gdev->msix.num_irqs,
+				       (u32 __user *)arg);
+
+		break;
+	case UIO_PCI_GENERIC_INT_MODE_GET:
+		err = put_user(gdev->int_mode, (u32 __user *)arg);
+
+		break;
+	default:
+		err = -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+/* INT#X interrupt handler. */
+static irqreturn_t intx_irqhandler(int irq, struct uio_info *info)
 {
 	struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
 
@@ -53,8 +237,162 @@ static irqreturn_t irqhandler(int irq, struct uio_info *info)
 	return IRQ_HANDLED;
 }
 
-static int probe(struct pci_dev *pdev,
-			   const struct pci_device_id *id)
+/* MSI interrupt handler. */
+static irqreturn_t msi_irqhandler(int irq, struct uio_info *info)
+{
+	/* UIO core will signal the user process. */
+	return IRQ_HANDLED;
+}
+
+/* MSI-X interrupt handler. */
+static irqreturn_t msix_irqhandler(int irq, void *arg)
+{
+	struct eventfd_ctx *trigger = arg;
+
+	pr_devel("irq %u trigger %p\n", irq, trigger);
+
+	eventfd_signal(trigger, 1);
+	return IRQ_HANDLED;
+}
+
+static bool enable_intx(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+
+	if (!pdev->irq || !pci_intx_mask_supported(pdev))
+		return false;
+
+	gdev->int_mode = UIO_INT_MODE_INTX;
+	gdev->info.irq = pdev->irq;
+	gdev->info.irq_flags = IRQF_SHARED;
+	gdev->info.handler = intx_irqhandler;
+
+	return true;
+}
+
+static void set_pci_master(struct pci_dev *pdev)
+{
+	pci_set_master(pdev);
+	dev_warn(&pdev->dev, "Enabling PCI bus mastering. Bogus userspace application is able to trash kernel memory using DMA");
+	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+}
+
+static bool enable_msi(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+
+	set_pci_master(pdev);
+
+	if (pci_enable_msi(pdev))
+		return false;
+
+	gdev->int_mode = UIO_INT_MODE_MSI;
+	gdev->info.irq = pdev->irq;
+	gdev->info.irq_flags = 0;
+	gdev->info.handler = msi_irqhandler;
+
+	return true;
+}
+
+static bool enable_msix(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+	int i, vectors = pci_msix_vec_count(pdev);
+
+	if (vectors <= 0)
+		return false;
+
+	gdev->msix.table = kcalloc(vectors, sizeof(struct msix_entry),
+				   GFP_KERNEL);
+	if (!gdev->msix.table) {
+		dev_err(&pdev->dev, "Failed to allocate memory for MSI-X table");
+		return false;
+	}
+
+	gdev->msix.ctx = kcalloc(vectors, sizeof(struct uio_msix_irq_ctx),
+				 GFP_KERNEL);
+	if (!gdev->msix.ctx) {
+		dev_err(&pdev->dev, "Failed to allocate memory for MSI-X contexts");
+		goto err_ctx_alloc;
+	}
+
+	for (i = 0; i < vectors; i++) {
+		gdev->msix.table[i].entry = i;
+		gdev->msix.ctx[i].name = kasprintf(GFP_KERNEL,
+						   KBUILD_MODNAME "[%d](%s)",
+						   i, pci_name(pdev));
+		if (!gdev->msix.ctx[i].name)
+			goto err_name_alloc;
+	}
+
+	set_pci_master(pdev);
+
+	if (pci_enable_msix(pdev, gdev->msix.table, vectors))
+		goto err_msix_enable;
+
+	gdev->int_mode = UIO_INT_MODE_MSIX;
+	gdev->info.irq = UIO_IRQ_CUSTOM;
+	gdev->msix.num_irqs = vectors;
+
+	return true;
+
+err_msix_enable:
+	pci_clear_master(pdev);
+err_name_alloc:
+	for (i = 0; i < vectors; i++)
+		kfree(gdev->msix.ctx[i].name);
+
+	kfree(gdev->msix.ctx);
+err_ctx_alloc:
+	kfree(gdev->msix.table);
+
+	return false;
+}
+
+/**
+ * Disable interrupts and free related resources.
+ *
+ * @gdev device handle
+ *
+ * This function should be called after the corresponding UIO device has been
+ * unregistered. This will ensure that there are no currently running ioctl()s
+ * and there won't be any new ones until next probe() call.
+ */
+static void disable_intr(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+	int i;
+
+	switch (gdev->int_mode) {
+	case UIO_INT_MODE_MSI:
+		pci_disable_msi(pdev);
+		pci_clear_master(pdev);
+
+		break;
+	case UIO_INT_MODE_MSIX:
+		/* No need for locking here since there shouldn't be any
+		 * ioctl()s running by now.
+		 */
+		for (i = 0; i < gdev->msix.num_irqs; i++) {
+			if (gdev->msix.ctx[i].trigger)
+				set_irq_eventfd(gdev, i, -1);
+
+			kfree(gdev->msix.ctx[i].name);
+		}
+
+		pci_disable_msix(pdev);
+		pci_clear_master(pdev);
+		kfree(gdev->msix.ctx);
+		kfree(gdev->msix.table);
+
+		break;
+	default:
+		break;
+	}
+}
+
+
+static int probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct uio_pci_generic_dev *gdev;
 	int err;
@@ -66,42 +404,64 @@ static int probe(struct pci_dev *pdev,
 		return err;
 	}
 
-	if (!pdev->irq) {
-		dev_warn(&pdev->dev, "No IRQ assigned to device: "
-			 "no support for interrupts?\n");
-		pci_disable_device(pdev);
-		return -ENODEV;
-	}
-
-	if (!pci_intx_mask_supported(pdev)) {
-		err = -ENODEV;
-		goto err_verify;
-	}
-
 	gdev = kzalloc(sizeof(struct uio_pci_generic_dev), GFP_KERNEL);
 	if (!gdev) {
 		err = -ENOMEM;
 		goto err_alloc;
 	}
 
+	gdev->pdev = pdev;
 	gdev->info.name = "uio_pci_generic";
 	gdev->info.version = DRIVER_VERSION;
-	gdev->info.irq = pdev->irq;
-	gdev->info.irq_flags = IRQF_SHARED;
-	gdev->info.handler = irqhandler;
-	gdev->pdev = pdev;
+	gdev->info.ioctl = uio_pci_generic_ioctl;
+	mutex_init(&gdev->msix_state_lock);
+
+	err = pci_request_regions(pdev, "uio_pci_generic");
+	if (err != 0) {
+		dev_err(&pdev->dev, "Cannot request regions\n");
+		goto err_request_regions;
+	}
+
+	/* Enable the corresponding interrupt mode. Try to enable INT#X first
+	 * for backward compatibility.
+	 */
+	if (enable_intx(gdev))
+		dev_info(&pdev->dev, "Using INT#x mode: IRQ %ld",
+			 gdev->info.irq);
+	else if (enable_msix(gdev))
+		dev_info(&pdev->dev, "Using MSI-X mode: number of IRQs %d",
+			 gdev->msix.num_irqs);
+	else if (enable_msi(gdev))
+		dev_info(&pdev->dev, "Using MSI mode: IRQ %ld", gdev->info.irq);
+	else {
+		err = -ENODEV;
+		goto err_verify;
+	}
+
+	/* remap resources */
+	err = setup_maps(pdev, &gdev->info);
+	if (err)
+		goto err_maps;
 
 	err = uio_register_device(&pdev->dev, &gdev->info);
 	if (err)
 		goto err_register;
+
 	pci_set_drvdata(pdev, gdev);
 
 	return 0;
+
 err_register:
+	release_iomaps(gdev);
+err_maps:
+	disable_intr(gdev);
+err_verify:
+	pci_release_regions(pdev);
+err_request_regions:
 	kfree(gdev);
 err_alloc:
-err_verify:
 	pci_disable_device(pdev);
+
 	return err;
 }
 
@@ -110,8 +470,12 @@ static void remove(struct pci_dev *pdev)
 	struct uio_pci_generic_dev *gdev = pci_get_drvdata(pdev);
 
 	uio_unregister_device(&gdev->info);
-	pci_disable_device(pdev);
+	disable_intr(gdev);
+	release_iomaps(gdev);
+	pci_release_regions(pdev);
 	kfree(gdev);
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
 }
 
 static struct pci_driver uio_pci_driver = {
diff --git a/include/linux/uio_pci_generic.h b/include/linux/uio_pci_generic.h
new file mode 100644
index 0000000..10716fc
--- /dev/null
+++ b/include/linux/uio_pci_generic.h
@@ -0,0 +1,36 @@
+/*
+ * include/linux/uio_pci_generic.h
+ *
+ * Userspace generic PCI IO driver.
+ *
+ * Licensed under the GPLv2 only.
+ */
+
+#ifndef _UIO_PCI_GENERIC_H_
+#define _UIO_PCI_GENERIC_H_
+
+#include <linux/ioctl.h>
+
+enum uio_int_mode {
+	UIO_INT_MODE_NONE,
+	UIO_INT_MODE_INTX,
+	UIO_INT_MODE_MSI,
+	UIO_INT_MODE_MSIX
+};
+
+/* bind the requested IRQ to the given eventfd */
+struct uio_pci_generic_irq_set {
+	int vec; /* index of the IRQ to connect to starting from 0 */
+	int fd;
+};
+
+#define UIO_PCI_GENERIC_BASE		0x86
+
+#define UIO_PCI_GENERIC_IRQ_SET		_IOW('I', UIO_PCI_GENERIC_BASE + 1, \
+		struct uio_pci_generic_irq_set)
+#define UIO_PCI_GENERIC_IRQ_NUM_GET	_IOW('I', UIO_PCI_GENERIC_BASE + 2, \
+		uint32_t)
+#define UIO_PCI_GENERIC_INT_MODE_GET	_IOW('I', UIO_PCI_GENERIC_BASE + 3, \
+		uint32_t)
+
+#endif /* _UIO_PCI_GENERIC_H_ */
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [PATCH v3 3/3] Documentation: update uio-howto
  2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
  2015-10-04 20:43 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
  2015-10-04 20:43 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
@ 2015-10-04 20:43 ` Vlad Zolotarov
  2015-10-04 20:45 ` [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
  2015-10-05 19:50 ` Michael S. Tsirkin
  4 siblings, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:43 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, corbet, gregkh
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

Change the chapters related to uio_pci_generic that refer interrupt mode.
Add the relevant explanation regarding MSI and MSI-X interrupt modes
support.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
 Documentation/DocBook/uio-howto.tmpl | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl
index cd0e452..a176129 100644
--- a/Documentation/DocBook/uio-howto.tmpl
+++ b/Documentation/DocBook/uio-howto.tmpl
@@ -46,6 +46,12 @@ GPL version 2.
 
 <revhistory>
 	<revision>
+	<revnumber>0.10</revnumber>
+	<date>2015-10-04</date>
+	<authorinitials>vz</authorinitials>
+	<revremark>Added MSI and MSI-X support to uio_pci_generic.</revremark>
+	</revision>
+	<revision>
 	<revnumber>0.9</revnumber>
 	<date>2009-07-16</date>
 	<authorinitials>mst</authorinitials>
@@ -935,15 +941,32 @@ and look in the output for failure reasons
 <sect1 id="uio_pci_generic_internals">
 <title>Things to know about uio_pci_generic</title>
 	<para>
-Interrupts are handled using the Interrupt Disable bit in the PCI command
+Interrupts are handled either as MSI-X or MSI interrupts (if the device supports it) or
+as legacy INTx interrupts. By default INTx interrupts are used.
+	</para>
+	<para>
+uio_pci_generic automatically configures a device to use INTx interrupt for backward
+compatibility. If INTx are not available MSI-X interrupts will be used if the device
+supports it and if not MSI interrupts are going to be used. If none of the interrupts
+modes is supported probe() will fail.
+	</para>
+	<para>
+To get the used interrupt mode application has to use UIO_PCI_GENERIC_INT_MODE_GET ioctl
+command.
+UIO_PCI_GENERIC_IRQ_NUM_GET ioctl command may be used to get the total number of IRQs.
+Then UIO_PCI_GENERIC_IRQ_SET ioctl command may be used to bind a specific eventfd to a specific
+IRQ vector.
+	</para>
+	<para>
+Legacy interrupts are handled using the Interrupt Disable bit in the PCI command
 register and Interrupt Status bit in the PCI status register.  All devices
 compliant to PCI 2.3 (circa 2002) and all compliant PCI Express devices should
 support these bits.  uio_pci_generic detects this support, and won't bind to
 devices which do not support the Interrupt Disable Bit in the command register.
 	</para>
 	<para>
-On each interrupt, uio_pci_generic sets the Interrupt Disable bit.
-This prevents the device from generating further interrupts
+If legacy interrupts are used, uio_pci_generic sets the Interrupt Disable bit on
+each interrupt. This prevents the device from generating further interrupts
 until the bit is cleared. The userspace driver should clear this
 bit before blocking and waiting for more interrupts.
 	</para>
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
                   ` (2 preceding siblings ...)
  2015-10-04 20:43 ` [PATCH v3 3/3] Documentation: update uio-howto Vlad Zolotarov
@ 2015-10-04 20:45 ` Vlad Zolotarov
  2015-10-05 19:50 ` Michael S. Tsirkin
  4 siblings, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:45 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, corbet, gregkh
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck

This is the same v3 but with the correct email address of Greg. In the 
first iteration the first letter of the email was missing... ;)

On 10/04/15 23:43, Vlad Zolotarov wrote:
> This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
>   
> Currently uio_pci_generic supports only legacy INT#x interrupts source. However
> there are situations when this is not enough, for instance SR-IOV VF devices that
> simply don't have INT#x capability. For such devices uio_pci_generic will simply
> fail (more specifically probe() will fail).
>   
> When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
> overhead and thus VFIO is not an option users that develop user-space drivers are left
> without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
> DPDK) just to be able to use UIO infrastructure.
>   
> This series provides a generic solution for this problem while preserving the original behaviour
> for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).
>
> New in v3:
>     - Add __iomem qualifier to temp buffer receiving ioremap value.
>
> New in v2:
>     - Added #include <linux/uaccess.h> to uio_pci_generic.c
>
> Vlad Zolotarov (3):
>    uio: add ioctl support
>    uio_pci_generic: add MSI/MSI-X support
>    Documentation: update uio-howto
>
>   Documentation/DocBook/uio-howto.tmpl |  29 ++-
>   drivers/uio/uio.c                    |  15 ++
>   drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
>   include/linux/uio_driver.h           |   3 +
>   include/linux/uio_pci_generic.h      |  36 +++
>   5 files changed, 467 insertions(+), 26 deletions(-)
>   create mode 100644 include/linux/uio_pci_generic.h
>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-04 20:43 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
@ 2015-10-05  3:03   ` Greg KH
  2015-10-05  7:33     ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Greg KH @ 2015-10-05  3:03 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Sun, Oct 04, 2015 at 11:43:16PM +0300, Vlad Zolotarov wrote:
> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
> ---
>  drivers/uio/uio.c          | 15 +++++++++++++++
>  include/linux/uio_driver.h |  3 +++
>  2 files changed, 18 insertions(+)

You add an ioctl yet fail to justify _why_ you need/want that ioctl, and
you don't document it at all?  Come on, you know better than that, no
one can take a patch that has no changelog comments at all like this :(

Also, I _REALLY_ don't want to add any ioctls to the UIO interface, so
you had better have a really compelling argument as to why this is the
_ONLY_ way you can solve this unknown problem by using such a horrid
thing...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-04 20:43 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
@ 2015-10-05  3:11   ` Greg KH
  2015-10-05  7:41     ` Vlad Zolotarov
  2015-10-05  8:28     ` Avi Kivity
  2015-10-05  8:41   ` Stephen Hemminger
  2015-10-05 19:16   ` Michael S. Tsirkin
  2 siblings, 2 replies; 96+ messages in thread
From: Greg KH @ 2015-10-05  3:11 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Sun, Oct 04, 2015 at 11:43:17PM +0300, Vlad Zolotarov wrote:
> Add support for MSI and MSI-X interrupt modes:
>    - Interrupt mode selection order is:
>         INT#X (for backward compatibility) -> MSI-X -> MSI.
>    - Add ioctl() commands:
>       - UIO_PCI_GENERIC_INT_MODE_GET: query the current interrupt mode.
>       - UIO_PCI_GENERIC_IRQ_NUM_GET: query the maximum number of IRQs.
>       - UIO_PCI_GENERIC_IRQ_SET: bind the IRQ to eventfd (similar to vfio).
>    - Add mappings to all bars (memory and portio): some devices have
>      registers related to MSI/MSI-X handling outside BAR0.
> 
> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
> ---
> New in v3:
>    - Add __iomem qualifier to temp buffer receiving ioremap value.
> 
> New in v2:
>    - Added #include <linux/uaccess.h> to uio_pci_generic.c
> 
> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
> ---
>  drivers/uio/uio_pci_generic.c   | 410 +++++++++++++++++++++++++++++++++++++---
>  include/linux/uio_pci_generic.h |  36 ++++
>  2 files changed, 423 insertions(+), 23 deletions(-)
>  create mode 100644 include/linux/uio_pci_generic.h
> 
> diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
> index d0b508b..6b8b1789 100644
> --- a/drivers/uio/uio_pci_generic.c
> +++ b/drivers/uio/uio_pci_generic.c
> @@ -22,16 +22,32 @@
>  #include <linux/device.h>
>  #include <linux/module.h>
>  #include <linux/pci.h>
> +#include <linux/msi.h>
>  #include <linux/slab.h>
>  #include <linux/uio_driver.h>
> +#include <linux/uio_pci_generic.h>
> +#include <linux/eventfd.h>
> +#include <linux/uaccess.h>
>  
>  #define DRIVER_VERSION	"0.01.0"
>  #define DRIVER_AUTHOR	"Michael S. Tsirkin <mst@redhat.com>"
>  #define DRIVER_DESC	"Generic UIO driver for PCI 2.3 devices"
>  
> +struct msix_info {
> +	int num_irqs;
> +	struct msix_entry *table;
> +	struct uio_msix_irq_ctx {
> +		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */

Why are you using eventfd for msi vectors?  What's the reason for
needing this?

You haven't documented how this api works at all, you are going to have
to a lot more work to justify this, as this greatly increases the
complexity of the user/kernel api in unknown ways.

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-05  3:03   ` Greg KH
@ 2015-10-05  7:33     ` Vlad Zolotarov
  2015-10-05  8:01       ` Greg KH
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05  7:33 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck




On 10/05/15 06:03, Greg KH wrote:
> On Sun, Oct 04, 2015 at 11:43:16PM +0300, Vlad Zolotarov wrote:
>> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
>> ---
>>   drivers/uio/uio.c          | 15 +++++++++++++++
>>   include/linux/uio_driver.h |  3 +++
>>   2 files changed, 18 insertions(+)
> You add an ioctl yet fail to justify _why_ you need/want that ioctl, and
> you don't document it at all?  Come on, you know better than that, no
> one can take a patch that has no changelog comments at all like this :(

My bad. U are absolutely right here - it was late and I was tired that I 
missed that to someone it may not be so "crystal clear" like it is to 
me... :)
Again, my bad - let me clarify it here and if we agree I'll respin the 
series with all relevant updates including the changelog.

>
> Also, I _REALLY_ don't want to add any ioctls to the UIO interface, so
> you had better have a really compelling argument as to why this is the
> _ONLY_ way you can solve this unknown problem by using such a horrid
> thing...

Pls., note that this doesn't _ADD_ any ioctls directly to UIO driver, 
but only lets the underlying PCI drivers to have them. UIO in this case 
is only a proxy.

The main idea of this series is, as mentioned in PATCH0, to add the MSI 
and MSI-X support for uio_pci_generic driver.
While with MSI the things are quite simple and we may just ride the 
existing infrastructure, with the MSI-X the things get a bit more 
complicated since we may have more than one interrupt vector. Therefore 
we have to decide which interface we want to give to the user.

One option could be to make all existing interrupts trigger the same 
objects in UIO as the current single interrupt does, however this would 
create an awkward, quite not-flexible semantics. For instance a regular 
(kernel) driver has a separate state machine for each interrupt line, 
which sometimes runs on a separate CPU, etc. This way we get to the 
second option - allow indication for each separate interrupt vector. And 
for obvious reasons (mentioned above) we (Stephen has sent a similar 
series on a dpdk-dev list) chose a second approach.

In order not to invent the wheel we mimicked the VFIO approach, which 
allows to bind the pre-allocated  eventfd descriptor to the specific 
interrupt vector using the ioctl().

The interface is simple. The UIO_PCI_GENERIC_IRQ_SET ioctl() data is:

struct uio_pci_generic_irq_set {
	int vec; /* index of the IRQ to connect to starting from 0 */
	int fd;
};


where "vec" is an index of the IRQ starting from 0 and "fd" is an 
eventfd file descriptor a user wants to poll() for in order to get the 
interrupt indications. If "fd" is less than 0, ioctl() will unbind the 
interrupt from the previously bound eventfd descriptor.

This way a user may poll() for any IRQ it wants separately, or epoll() 
for any subset of them, or do whatever he/she wants to do.

That's why we needed the ioctl(). I admit that it may not be the _ONLY_ 
way to achieve the goal described above but again we took VFIO approach 
as a template for a solution and just followed it. If u think there is 
more elegant/robust/better way to do so, pls., share. :)

thanks,
vlad


>
> thanks,
>
> greg k-h


On 10/05/15 06:03, Greg KH wrote:
> On Sun, Oct 04, 2015 at 11:43:16PM +0300, Vlad Zolotarov wrote:
>> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
>> ---
>>   drivers/uio/uio.c          | 15 +++++++++++++++
>>   include/linux/uio_driver.h |  3 +++
>>   2 files changed, 18 insertions(+)
> You add an ioctl yet fail to justify _why_ you need/want that ioctl, and
> you don't document it at all?  Come on, you know better than that, no
> one can take a patch that has no changelog comments at all like this :(
>
> Also, I _REALLY_ don't want to add any ioctls to the UIO interface, so
> you had better have a really compelling argument as to why this is the
> _ONLY_ way you can solve this unknown problem by using such a horrid
> thing...
>
> thanks,
>
> greg k-h


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  3:11   ` Greg KH
@ 2015-10-05  7:41     ` Vlad Zolotarov
  2015-10-05  7:56       ` Greg KH
  2015-10-05  8:28     ` Avi Kivity
  1 sibling, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05  7:41 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/05/15 06:11, Greg KH wrote:
> On Sun, Oct 04, 2015 at 11:43:17PM +0300, Vlad Zolotarov wrote:
>> Add support for MSI and MSI-X interrupt modes:
>>     - Interrupt mode selection order is:
>>          INT#X (for backward compatibility) -> MSI-X -> MSI.
>>     - Add ioctl() commands:
>>        - UIO_PCI_GENERIC_INT_MODE_GET: query the current interrupt mode.
>>        - UIO_PCI_GENERIC_IRQ_NUM_GET: query the maximum number of IRQs.
>>        - UIO_PCI_GENERIC_IRQ_SET: bind the IRQ to eventfd (similar to vfio).
>>     - Add mappings to all bars (memory and portio): some devices have
>>       registers related to MSI/MSI-X handling outside BAR0.
>>
>> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
>> ---
>> New in v3:
>>     - Add __iomem qualifier to temp buffer receiving ioremap value.
>>
>> New in v2:
>>     - Added #include <linux/uaccess.h> to uio_pci_generic.c
>>
>> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
>> ---
>>   drivers/uio/uio_pci_generic.c   | 410 +++++++++++++++++++++++++++++++++++++---
>>   include/linux/uio_pci_generic.h |  36 ++++
>>   2 files changed, 423 insertions(+), 23 deletions(-)
>>   create mode 100644 include/linux/uio_pci_generic.h
>>
>> diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
>> index d0b508b..6b8b1789 100644
>> --- a/drivers/uio/uio_pci_generic.c
>> +++ b/drivers/uio/uio_pci_generic.c
>> @@ -22,16 +22,32 @@
>>   #include <linux/device.h>
>>   #include <linux/module.h>
>>   #include <linux/pci.h>
>> +#include <linux/msi.h>
>>   #include <linux/slab.h>
>>   #include <linux/uio_driver.h>
>> +#include <linux/uio_pci_generic.h>
>> +#include <linux/eventfd.h>
>> +#include <linux/uaccess.h>
>>   
>>   #define DRIVER_VERSION	"0.01.0"
>>   #define DRIVER_AUTHOR	"Michael S. Tsirkin <mst@redhat.com>"
>>   #define DRIVER_DESC	"Generic UIO driver for PCI 2.3 devices"
>>   
>> +struct msix_info {
>> +	int num_irqs;
>> +	struct msix_entry *table;
>> +	struct uio_msix_irq_ctx {
>> +		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
> Why are you using eventfd for msi vectors?  What's the reason for
> needing this?

A small correction - for MSI-X vectors. There may be only one MSI vector 
per PCI function and if it's used it would use the same interface as a 
legacy INT#x interrupt uses at the moment.
So, for MSI-X case the reason is that there may be (in most cases there 
will be) more than one interrupt vector. Thus, as I've explained in a 
PATCH1 thread we need a way to indicated each of them separately. 
eventfd seems like a good way of doing so. If u have better ideas, pls., 
share.

>
> You haven't documented how this api works at all, you are going to have
> to a lot more work to justify this, as this greatly increases the
> complexity of the user/kernel api in unknown ways.

I actually do documented it a bit. Pls., check PATCH3 out. I admit that 
I could do a better job by for instance providing a code example. I'll 
improve this in v4 once we agree on all other details.

thanks,
vlad

>
> greg k-h


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  7:41     ` Vlad Zolotarov
@ 2015-10-05  7:56       ` Greg KH
  2015-10-05 10:48         ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Greg KH @ 2015-10-05  7:56 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
> >>+struct msix_info {
> >>+	int num_irqs;
> >>+	struct msix_entry *table;
> >>+	struct uio_msix_irq_ctx {
> >>+		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
> >Why are you using eventfd for msi vectors?  What's the reason for
> >needing this?
> 
> A small correction - for MSI-X vectors. There may be only one MSI vector per
> PCI function and if it's used it would use the same interface as a legacy
> INT#x interrupt uses at the moment.
> So, for MSI-X case the reason is that there may be (in most cases there will
> be) more than one interrupt vector. Thus, as I've explained in a PATCH1
> thread we need a way to indicated each of them separately. eventfd seems
> like a good way of doing so. If u have better ideas, pls., share.

You need to document what you are doing here, I don't see any
explaination for using eventfd at all.

And no, I don't know of any other solution as I don't know what you are
trying to do here (hint, the changelog didn't document it...)

> >You haven't documented how this api works at all, you are going to have
> >to a lot more work to justify this, as this greatly increases the
> >complexity of the user/kernel api in unknown ways.
> 
> I actually do documented it a bit. Pls., check PATCH3 out.

That provided no information at all about how to use the api.

If it did, you would see that your api is broken for 32/64bit kernels
and will fall over into nasty pieces the first time you try to use it
there, which means it hasn't been tested at all :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-05  7:33     ` Vlad Zolotarov
@ 2015-10-05  8:01       ` Greg KH
  2015-10-05 10:36         ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Greg KH @ 2015-10-05  8:01 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Mon, Oct 05, 2015 at 10:33:20AM +0300, Vlad Zolotarov wrote:
> On 10/05/15 06:03, Greg KH wrote:
> >On Sun, Oct 04, 2015 at 11:43:16PM +0300, Vlad Zolotarov wrote:
> >>Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
> >>---
> >>  drivers/uio/uio.c          | 15 +++++++++++++++
> >>  include/linux/uio_driver.h |  3 +++
> >>  2 files changed, 18 insertions(+)
> >You add an ioctl yet fail to justify _why_ you need/want that ioctl, and
> >you don't document it at all?  Come on, you know better than that, no
> >one can take a patch that has no changelog comments at all like this :(
> 
> My bad. U are absolutely right here - it was late and I was tired that I
> missed that to someone it may not be so "crystal clear" like it is to me...
> :)
> Again, my bad - let me clarify it here and if we agree I'll respin the
> series with all relevant updates including the changelog.
> 
> >
> >Also, I _REALLY_ don't want to add any ioctls to the UIO interface, so
> >you had better have a really compelling argument as to why this is the
> >_ONLY_ way you can solve this unknown problem by using such a horrid
> >thing...
> 
> Pls., note that this doesn't _ADD_ any ioctls directly to UIO driver, but
> only lets the underlying PCI drivers to have them. UIO in this case is only
> a proxy.

Exactly, and I don't want to provide an ioctl "proxy" for UIO drivers.
That way lies madness and horrid code, and other nasty things (hint,
each ioctl is a custom syscall, so you are opening up the box for all
sorts of bad things to happen in drivers...)

For example, your ioctl you use here is incorrect, and will fail
horribly on a large majority of systems.  I don't want to open up the
requirements that more people have to know how to "do it right" in order
to use the UIO interface for their drivers, as people will get it wrong
(as this patch series shows...)

> The main idea of this series is, as mentioned in PATCH0, to add the MSI and
> MSI-X support for uio_pci_generic driver.

Yes, I know that, but I don't see anything that shows _how_ to use this
api.  And then there's the issue of why we even need this, why not just
write a whole new driver for this, like the previous driver did (which
also used ioctls, yes, I didn't have the chance to object to that before
everyone else did...)

> While with MSI the things are quite simple and we may just ride the existing
> infrastructure, with the MSI-X the things get a bit more complicated since
> we may have more than one interrupt vector. Therefore we have to decide
> which interface we want to give to the user.
> 
> One option could be to make all existing interrupts trigger the same objects
> in UIO as the current single interrupt does, however this would create an
> awkward, quite not-flexible semantics. For instance a regular (kernel)
> driver has a separate state machine for each interrupt line, which sometimes
> runs on a separate CPU, etc. This way we get to the second option - allow
> indication for each separate interrupt vector. And for obvious reasons
> (mentioned above) we (Stephen has sent a similar series on a dpdk-dev list)
> chose a second approach.
> 
> In order not to invent the wheel we mimicked the VFIO approach, which allows
> to bind the pre-allocated  eventfd descriptor to the specific interrupt
> vector using the ioctl().
> 
> The interface is simple. The UIO_PCI_GENERIC_IRQ_SET ioctl() data is:
> 
> struct uio_pci_generic_irq_set {
> 	int vec; /* index of the IRQ to connect to starting from 0 */
> 	int fd;
> };

And that's broken :(

NEVER use an "int" for an ioctl, it is wrong and will cause horrible
issues on a large number of systems.  That is what the __u16 and friends
variable types are for.  You know better than this :)

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  3:11   ` Greg KH
  2015-10-05  7:41     ` Vlad Zolotarov
@ 2015-10-05  8:28     ` Avi Kivity
  2015-10-05  9:49       ` Greg KH
  2015-10-06 14:46       ` Michael S. Tsirkin
  1 sibling, 2 replies; 96+ messages in thread
From: Avi Kivity @ 2015-10-05  8:28 UTC (permalink / raw)
  To: Greg KH, Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On 10/05/2015 06:11 AM, Greg KH wrote:
> On Sun, Oct 04, 2015 at 11:43:17PM +0300, Vlad Zolotarov wrote:
>> Add support for MSI and MSI-X interrupt modes:
>>     - Interrupt mode selection order is:
>>          INT#X (for backward compatibility) -> MSI-X -> MSI.
>>     - Add ioctl() commands:
>>        - UIO_PCI_GENERIC_INT_MODE_GET: query the current interrupt mode.
>>        - UIO_PCI_GENERIC_IRQ_NUM_GET: query the maximum number of IRQs.
>>        - UIO_PCI_GENERIC_IRQ_SET: bind the IRQ to eventfd (similar to vfio).
>>     - Add mappings to all bars (memory and portio): some devices have
>>       registers related to MSI/MSI-X handling outside BAR0.
>>
>> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
>> ---
>> New in v3:
>>     - Add __iomem qualifier to temp buffer receiving ioremap value.
>>
>> New in v2:
>>     - Added #include <linux/uaccess.h> to uio_pci_generic.c
>>
>> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
>> ---
>>   drivers/uio/uio_pci_generic.c   | 410 +++++++++++++++++++++++++++++++++++++---
>>   include/linux/uio_pci_generic.h |  36 ++++
>>   2 files changed, 423 insertions(+), 23 deletions(-)
>>   create mode 100644 include/linux/uio_pci_generic.h
>>
>> diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
>> index d0b508b..6b8b1789 100644
>> --- a/drivers/uio/uio_pci_generic.c
>> +++ b/drivers/uio/uio_pci_generic.c
>> @@ -22,16 +22,32 @@
>>   #include <linux/device.h>
>>   #include <linux/module.h>
>>   #include <linux/pci.h>
>> +#include <linux/msi.h>
>>   #include <linux/slab.h>
>>   #include <linux/uio_driver.h>
>> +#include <linux/uio_pci_generic.h>
>> +#include <linux/eventfd.h>
>> +#include <linux/uaccess.h>
>>   
>>   #define DRIVER_VERSION	"0.01.0"
>>   #define DRIVER_AUTHOR	"Michael S. Tsirkin <mst@redhat.com>"
>>   #define DRIVER_DESC	"Generic UIO driver for PCI 2.3 devices"
>>   
>> +struct msix_info {
>> +	int num_irqs;
>> +	struct msix_entry *table;
>> +	struct uio_msix_irq_ctx {
>> +		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
> Why are you using eventfd for msi vectors?  What's the reason for
> needing this?
>
> You haven't documented how this api works at all, you are going to have
> to a lot more work to justify this, as this greatly increases the
> complexity of the user/kernel api in unknown ways.
>
>

Of course it has to be documented, but this just follows vfio.

Eventfd is a natural enough representation of an interrupt; both kvm and 
vfio use it, and are also able to share the eventfd, allowing a vfio 
interrupt to generate a kvm interrupt, without userspace intervention, 
and one day without even kernel intervention.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-04 20:43 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
  2015-10-05  3:11   ` Greg KH
@ 2015-10-05  8:41   ` Stephen Hemminger
  2015-10-05  9:08     ` Vlad Zolotarov
  2015-10-05  9:11     ` Vlad Zolotarov
  2015-10-05 19:16   ` Michael S. Tsirkin
  2 siblings, 2 replies; 96+ messages in thread
From: Stephen Hemminger @ 2015-10-05  8:41 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, gregkh, bruce.richardson, avi,
	gleb, alexander.duyck

On Sun,  4 Oct 2015 23:43:17 +0300
Vlad Zolotarov <vladz@cloudius-systems.com> wrote:

> +static int setup_maps(struct pci_dev *pdev, struct uio_info *info)
> +{
> +	int i, m = 0, p = 0, err;
> +	static const char * const bar_names[] = {
> +		"BAR0",	"BAR1",	"BAR2",	"BAR3",	"BAR4",	"BAR5",
> +	};
> +
> +	for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
> +		unsigned long start = pci_resource_start(pdev, i);
> +		unsigned long flags = pci_resource_flags(pdev, i);
> +		unsigned long len = pci_resource_len(pdev, i);
> +
> +		if (start == 0 || len == 0)
> +			continue;
> +
> +		if (flags & IORESOURCE_MEM) {
> +			void __iomem *addr;
> +
> +			if (m >= MAX_UIO_MAPS)
> +				continue;
> +
> +			addr = ioremap(start, len);
> +			if (addr == NULL) {
> +				err = -EINVAL;
> +				goto fail;
> +			}
> +
> +			info->mem[m].name = bar_names[i];
> +			info->mem[m].addr = start;
> +			info->mem[m].internal_addr = addr;
> +			info->mem[m].size = len;
> +			info->mem[m].memtype = UIO_MEM_PHYS;
> +			++m;
> +		} else if (flags & IORESOURCE_IO) {
> +			if (p >= MAX_UIO_PORT_REGIONS)
> +				continue;
> +
> +			info->port[p].name = bar_names[i];
> +			info->port[p].start = start;
> +			info->port[p].size = len;
> +			info->port[p].porttype = UIO_PORT_X86;
> +			++p;
> +		}
> +	}
> +
> +	return 0;
> +fail:
> +	for (i = 0; i < m; i++) {
> +		iounmap(info->mem[i].internal_addr);
> +		info->mem[i].internal_addr = NULL;
> +	}
> +
> +	return err;
> +

I wonder do we really have to setup all the BAR's in uio_pci_generic?
The DPDK code works with uio_pci_generic already, and it didn't setup the BAR's.
One possible issue is that without that maybe kernel would not know about the
region used for MSI-X vectors table.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  8:41   ` Stephen Hemminger
@ 2015-10-05  9:08     ` Vlad Zolotarov
  2015-10-05 10:06       ` Vlad Zolotarov
  2015-10-05  9:11     ` Vlad Zolotarov
  1 sibling, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05  9:08 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: linux-kernel, mst, hjk, corbet, gregkh, bruce.richardson, avi,
	gleb, alexander.duyck



On 10/05/15 11:41, Stephen Hemminger wrote:
> On Sun,  4 Oct 2015 23:43:17 +0300
> Vlad Zolotarov <vladz@cloudius-systems.com> wrote:
>
>> +static int setup_maps(struct pci_dev *pdev, struct uio_info *info)
>> +{
>> +	int i, m = 0, p = 0, err;
>> +	static const char * const bar_names[] = {
>> +		"BAR0",	"BAR1",	"BAR2",	"BAR3",	"BAR4",	"BAR5",
>> +	};
>> +
>> +	for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
>> +		unsigned long start = pci_resource_start(pdev, i);
>> +		unsigned long flags = pci_resource_flags(pdev, i);
>> +		unsigned long len = pci_resource_len(pdev, i);
>> +
>> +		if (start == 0 || len == 0)
>> +			continue;
>> +
>> +		if (flags & IORESOURCE_MEM) {
>> +			void __iomem *addr;
>> +
>> +			if (m >= MAX_UIO_MAPS)
>> +				continue;
>> +
>> +			addr = ioremap(start, len);
>> +			if (addr == NULL) {
>> +				err = -EINVAL;
>> +				goto fail;
>> +			}
>> +
>> +			info->mem[m].name = bar_names[i];
>> +			info->mem[m].addr = start;
>> +			info->mem[m].internal_addr = addr;
>> +			info->mem[m].size = len;
>> +			info->mem[m].memtype = UIO_MEM_PHYS;
>> +			++m;
>> +		} else if (flags & IORESOURCE_IO) {
>> +			if (p >= MAX_UIO_PORT_REGIONS)
>> +				continue;
>> +
>> +			info->port[p].name = bar_names[i];
>> +			info->port[p].start = start;
>> +			info->port[p].size = len;
>> +			info->port[p].porttype = UIO_PORT_X86;
>> +			++p;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +fail:
>> +	for (i = 0; i < m; i++) {
>> +		iounmap(info->mem[i].internal_addr);
>> +		info->mem[i].internal_addr = NULL;
>> +	}
>> +
>> +	return err;
>> +
> I wonder do we really have to setup all the BAR's in uio_pci_generic?
> The DPDK code works with uio_pci_generic already, and it didn't setup the BAR's.

DPDK never used uio_pci_generic with MSI-X support so far and all MSI-X 
capable UIO DPDK drivers like igb_uio and your newly proposed uio_msi do 
map them all.
U also mentioned in the other thread that virtio requires portio bars too.
The thing is that generally bars are needed for programming the device 
therefore it's logical to have them exposed. In general different 
devices have different registers layout therefore a general driver may 
not know which bar exactly is going to be required for a specific device 
and for a specific usage. That's why I think that "map them all" 
approach is rather generic and appropriate. However if there are other 
motives not to do so that I'm missing, pls., let me know.

thanks,
vlad

> One possible issue is that without that maybe kernel would not know about the
> region used for MSI-X vectors table.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  8:41   ` Stephen Hemminger
  2015-10-05  9:08     ` Vlad Zolotarov
@ 2015-10-05  9:11     ` Vlad Zolotarov
  1 sibling, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05  9:11 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: linux-kernel, mst, hjk, corbet, gregkh, bruce.richardson, avi,
	gleb, alexander.duyck



On 10/05/15 11:41, Stephen Hemminger wrote:
> On Sun,  4 Oct 2015 23:43:17 +0300
> Vlad Zolotarov <vladz@cloudius-systems.com> wrote:
>
>> +static int setup_maps(struct pci_dev *pdev, struct uio_info *info)
>> +{
>> +	int i, m = 0, p = 0, err;
>> +	static const char * const bar_names[] = {
>> +		"BAR0",	"BAR1",	"BAR2",	"BAR3",	"BAR4",	"BAR5",
>> +	};
>> +
>> +	for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
>> +		unsigned long start = pci_resource_start(pdev, i);
>> +		unsigned long flags = pci_resource_flags(pdev, i);
>> +		unsigned long len = pci_resource_len(pdev, i);
>> +
>> +		if (start == 0 || len == 0)
>> +			continue;
>> +
>> +		if (flags & IORESOURCE_MEM) {
>> +			void __iomem *addr;
>> +
>> +			if (m >= MAX_UIO_MAPS)
>> +				continue;
>> +
>> +			addr = ioremap(start, len);
>> +			if (addr == NULL) {
>> +				err = -EINVAL;
>> +				goto fail;
>> +			}
>> +
>> +			info->mem[m].name = bar_names[i];
>> +			info->mem[m].addr = start;
>> +			info->mem[m].internal_addr = addr;
>> +			info->mem[m].size = len;
>> +			info->mem[m].memtype = UIO_MEM_PHYS;
>> +			++m;
>> +		} else if (flags & IORESOURCE_IO) {
>> +			if (p >= MAX_UIO_PORT_REGIONS)
>> +				continue;
>> +
>> +			info->port[p].name = bar_names[i];
>> +			info->port[p].start = start;
>> +			info->port[p].size = len;
>> +			info->port[p].porttype = UIO_PORT_X86;
>> +			++p;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +fail:
>> +	for (i = 0; i < m; i++) {
>> +		iounmap(info->mem[i].internal_addr);
>> +		info->mem[i].internal_addr = NULL;
>> +	}
>> +
>> +	return err;
>> +
> I wonder do we really have to setup all the BAR's in uio_pci_generic?
> The DPDK code works with uio_pci_generic already, and it didn't setup the BAR's.
> One possible issue is that without that maybe kernel would not know about the
> region used for MSI-X vectors table.

So, what's your point? It sounds like u are for setting the mappings in 
the uio_pci_generic after all, aren't u? ;)



^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  8:28     ` Avi Kivity
@ 2015-10-05  9:49       ` Greg KH
  2015-10-05 10:20         ` Avi Kivity
  2015-10-06 14:46       ` Michael S. Tsirkin
  1 sibling, 1 reply; 96+ messages in thread
From: Greg KH @ 2015-10-05  9:49 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Vlad Zolotarov, linux-kernel, mst, hjk, corbet, bruce.richardson,
	avi, gleb, stephen, alexander.duyck

On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote:
> Of course it has to be documented, but this just follows vfio.
> 
> Eventfd is a natural enough representation of an interrupt; both kvm and
> vfio use it, and are also able to share the eventfd, allowing a vfio
> interrupt to generate a kvm interrupt, without userspace intervention, and
> one day without even kernel intervention.

That's nice and wonderful, but it's not how UIO works today, so this is
now going to be a mix and match type interface, with no justification so
far as to why to create this new api and exactly how this is all going
to be used from userspace.

Example code would be even better...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  9:08     ` Vlad Zolotarov
@ 2015-10-05 10:06       ` Vlad Zolotarov
  2015-10-05 20:09         ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05 10:06 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: linux-kernel, mst, hjk, corbet, gregkh, bruce.richardson, avi,
	gleb, alexander.duyck



On 10/05/15 12:08, Vlad Zolotarov wrote:
>
>
> On 10/05/15 11:41, Stephen Hemminger wrote:
>> On Sun,  4 Oct 2015 23:43:17 +0300
>> Vlad Zolotarov <vladz@cloudius-systems.com> wrote:
>>
>>> +static int setup_maps(struct pci_dev *pdev, struct uio_info *info)
>>> +{
>>> +    int i, m = 0, p = 0, err;
>>> +    static const char * const bar_names[] = {
>>> +        "BAR0",    "BAR1",    "BAR2",    "BAR3", "BAR4",    "BAR5",
>>> +    };
>>> +
>>> +    for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
>>> +        unsigned long start = pci_resource_start(pdev, i);
>>> +        unsigned long flags = pci_resource_flags(pdev, i);
>>> +        unsigned long len = pci_resource_len(pdev, i);
>>> +
>>> +        if (start == 0 || len == 0)
>>> +            continue;
>>> +
>>> +        if (flags & IORESOURCE_MEM) {
>>> +            void __iomem *addr;
>>> +
>>> +            if (m >= MAX_UIO_MAPS)
>>> +                continue;
>>> +
>>> +            addr = ioremap(start, len);
>>> +            if (addr == NULL) {
>>> +                err = -EINVAL;
>>> +                goto fail;
>>> +            }
>>> +
>>> +            info->mem[m].name = bar_names[i];
>>> +            info->mem[m].addr = start;
>>> +            info->mem[m].internal_addr = addr;
>>> +            info->mem[m].size = len;
>>> +            info->mem[m].memtype = UIO_MEM_PHYS;
>>> +            ++m;
>>> +        } else if (flags & IORESOURCE_IO) {
>>> +            if (p >= MAX_UIO_PORT_REGIONS)
>>> +                continue;
>>> +
>>> +            info->port[p].name = bar_names[i];
>>> +            info->port[p].start = start;
>>> +            info->port[p].size = len;
>>> +            info->port[p].porttype = UIO_PORT_X86;
>>> +            ++p;
>>> +        }
>>> +    }
>>> +
>>> +    return 0;
>>> +fail:
>>> +    for (i = 0; i < m; i++) {
>>> +        iounmap(info->mem[i].internal_addr);
>>> +        info->mem[i].internal_addr = NULL;
>>> +    }
>>> +
>>> +    return err;
>>> +
>> I wonder do we really have to setup all the BAR's in uio_pci_generic?
>> The DPDK code works with uio_pci_generic already, and it didn't setup 
>> the BAR's.
>
> DPDK never used uio_pci_generic with MSI-X support so far and all 
> MSI-X capable UIO DPDK drivers like igb_uio and your newly proposed 
> uio_msi do map them all.
> U also mentioned in the other thread that virtio requires portio bars 
> too.
> The thing is that generally bars are needed for programming the device 
> therefore it's logical to have them exposed. In general different 
> devices have different registers layout therefore a general driver may 
> not know which bar exactly is going to be required for a specific 
> device and for a specific usage. That's why I think that "map them 
> all" approach is rather generic and appropriate. However if there are 
> other motives not to do so that I'm missing, pls., let me know.

Having said all that however I'd agree if someone would say that 
mappings setting would rather come as a separate patch in this series... ;)
it will in v4...


>
> thanks,
> vlad
>
>> One possible issue is that without that maybe kernel would not know 
>> about the
>> region used for MSI-X vectors table.
>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  9:49       ` Greg KH
@ 2015-10-05 10:20         ` Avi Kivity
  2015-10-06 14:38           ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-05 10:20 UTC (permalink / raw)
  To: Greg KH
  Cc: Vlad Zolotarov, linux-kernel, mst, hjk, corbet, bruce.richardson,
	avi, gleb, stephen, alexander.duyck

On 10/05/2015 12:49 PM, Greg KH wrote:
> On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote:
>> Of course it has to be documented, but this just follows vfio.
>>
>> Eventfd is a natural enough representation of an interrupt; both kvm and
>> vfio use it, and are also able to share the eventfd, allowing a vfio
>> interrupt to generate a kvm interrupt, without userspace intervention, and
>> one day without even kernel intervention.
> That's nice and wonderful, but it's not how UIO works today, so this is
> now going to be a mix and match type interface, with no justification so
> far as to why to create this new api and exactly how this is all going
> to be used from userspace.

The intended user is dpdk (http://dpdk.org), which is a family of 
userspace networking drivers for high performance networking applications.

The natural device driver for dpdk is vfio, which both provides memory 
protection and exposes msi/msix interrupts.  However, in many cases vfio 
cannot be used, either due to the lack of an iommu (for example, in 
virtualized environments) or out of a desire to avoid the iommus 
performance impact.

The challenge in exposing msix interrupts to user space is that there 
are many of them, so you can't simply poll the device fd.  If you do, 
how do you know which interrupt was triggered?  The solution that vfio 
adopted was to associate each interrupt with an eventfd, allowing it to 
be individually polled.  Since you can pass an eventfd with SCM_RIGHTS, 
and since kvm can trigger guest interrupts using an eventfd, the 
solution is very flexible.

> Example code would be even better...
>
>


This is the vfio dpdk interface code:

http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c

basically, the equivalent uio msix code would be very similar if uio 
adopts a similar interface:

http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_pci_uio.c

(current code lacks msi/msix support, of course).

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-05  8:01       ` Greg KH
@ 2015-10-05 10:36         ` Vlad Zolotarov
  2015-10-05 20:02           ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05 10:36 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/05/15 11:01, Greg KH wrote:
> On Mon, Oct 05, 2015 at 10:33:20AM +0300, Vlad Zolotarov wrote:
>> On 10/05/15 06:03, Greg KH wrote:
>>> On Sun, Oct 04, 2015 at 11:43:16PM +0300, Vlad Zolotarov wrote:
>>>> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
>>>> ---
>>>>   drivers/uio/uio.c          | 15 +++++++++++++++
>>>>   include/linux/uio_driver.h |  3 +++
>>>>   2 files changed, 18 insertions(+)
>>> You add an ioctl yet fail to justify _why_ you need/want that ioctl, and
>>> you don't document it at all?  Come on, you know better than that, no
>>> one can take a patch that has no changelog comments at all like this :(
>> My bad. U are absolutely right here - it was late and I was tired that I
>> missed that to someone it may not be so "crystal clear" like it is to me...
>> :)
>> Again, my bad - let me clarify it here and if we agree I'll respin the
>> series with all relevant updates including the changelog.
>>
>>> Also, I _REALLY_ don't want to add any ioctls to the UIO interface, so
>>> you had better have a really compelling argument as to why this is the
>>> _ONLY_ way you can solve this unknown problem by using such a horrid
>>> thing...
>> Pls., note that this doesn't _ADD_ any ioctls directly to UIO driver, but
>> only lets the underlying PCI drivers to have them. UIO in this case is only
>> a proxy.
> Exactly, and I don't want to provide an ioctl "proxy" for UIO drivers.
> That way lies madness and horrid code, and other nasty things (hint,
> each ioctl is a custom syscall, so you are opening up the box for all
> sorts of bad things to happen in drivers...)
>
> For example, your ioctl you use here is incorrect, and will fail
> horribly on a large majority of systems.  I don't want to open up the
> requirements that more people have to know how to "do it right" in order
> to use the UIO interface for their drivers, as people will get it wrong
> (as this patch series shows...)

Sometimes there is no other (better) way to get things done. And bugs - 
isn't it what code review is for? ;)
I'll fix the "int" issue.

>
>> The main idea of this series is, as mentioned in PATCH0, to add the MSI and
>> MSI-X support for uio_pci_generic driver.
> Yes, I know that, but I don't see anything that shows _how_ to use this
> api.

I get that, i'll extend PATCH3 of this series with a detailed 
description in v4.

U use it as follows:

 1. Bind the PCI function to uio_pci_generic.
 2. Query for its interrupt mode with UIO_PCI_GENERIC_INT_MODE_GET ioctl.
 3. If interrupt mode is INT#x or MSI - use the current UIO interface
    for polling, namely use the UIO file descriptor.
 4. Else
     1. Query for the number of MSI-X vectors with
        UIO_PCI_GENERIC_IRQ_NUM_GET ioctl.
     2. Allocate the required number of eventfd descriptors using
        eventfd() from sys/eventfd.h.
     3. Bind them to the required IRQs with UIO_PCI_GENERIC_IRQ_SET ioctl.
 5. When done, just unbind the PCI function from the uio_pci_generic.


> And then there's the issue of why we even need this, why not just
> write a whole new driver for this, like the previous driver did (which
> also used ioctls, yes, I didn't have the chance to object to that before
> everyone else did...)

Which "previous driver" do u refer here?
IMHO writing something instead of UIO (not just uio_pci_generic) seems 
like an overkill for solving this issue. Supporting MSI-X interrupts 
seem like a very beneficial feature for uio_pci_generic and it's really 
not _THAT_ complicated API - just look at VFIO for a comparison... ;)
uio_pci_generic is clearly missing this important feature. And creating 
another user space driver infrastructure just to add it seems extremely 
unjustified.

>
>> While with MSI the things are quite simple and we may just ride the existing
>> infrastructure, with the MSI-X the things get a bit more complicated since
>> we may have more than one interrupt vector. Therefore we have to decide
>> which interface we want to give to the user.
>>
>> One option could be to make all existing interrupts trigger the same objects
>> in UIO as the current single interrupt does, however this would create an
>> awkward, quite not-flexible semantics. For instance a regular (kernel)
>> driver has a separate state machine for each interrupt line, which sometimes
>> runs on a separate CPU, etc. This way we get to the second option - allow
>> indication for each separate interrupt vector. And for obvious reasons
>> (mentioned above) we (Stephen has sent a similar series on a dpdk-dev list)
>> chose a second approach.
>>
>> In order not to invent the wheel we mimicked the VFIO approach, which allows
>> to bind the pre-allocated  eventfd descriptor to the specific interrupt
>> vector using the ioctl().
>>
>> The interface is simple. The UIO_PCI_GENERIC_IRQ_SET ioctl() data is:
>>
>> struct uio_pci_generic_irq_set {
>> 	int vec; /* index of the IRQ to connect to starting from 0 */
>> 	int fd;
>> };
> And that's broken :(

Good catch. Thanks. Will fix.
I'm not a big ioctl fan myself but unfortunately I don't see a good 
alternative here. proc? Would it make it cleaner?

>
> NEVER use an "int" for an ioctl, it is wrong and will cause horrible
> issues on a large number of systems.  That is what the __u16 and friends
> variable types are for.  You know better than this :)
>
> greg k-h


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  7:56       ` Greg KH
@ 2015-10-05 10:48         ` Vlad Zolotarov
  2015-10-05 10:57           ` Greg KH
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05 10:48 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/05/15 10:56, Greg KH wrote:
> On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
>>>> +struct msix_info {
>>>> +	int num_irqs;
>>>> +	struct msix_entry *table;
>>>> +	struct uio_msix_irq_ctx {
>>>> +		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
>>> Why are you using eventfd for msi vectors?  What's the reason for
>>> needing this?
>> A small correction - for MSI-X vectors. There may be only one MSI vector per
>> PCI function and if it's used it would use the same interface as a legacy
>> INT#x interrupt uses at the moment.
>> So, for MSI-X case the reason is that there may be (in most cases there will
>> be) more than one interrupt vector. Thus, as I've explained in a PATCH1
>> thread we need a way to indicated each of them separately. eventfd seems
>> like a good way of doing so. If u have better ideas, pls., share.
> You need to document what you are doing here, I don't see any
> explaination for using eventfd at all.
>
> And no, I don't know of any other solution as I don't know what you are
> trying to do here (hint, the changelog didn't document it...)
>
>>> You haven't documented how this api works at all, you are going to have
>>> to a lot more work to justify this, as this greatly increases the
>>> complexity of the user/kernel api in unknown ways.
>> I actually do documented it a bit. Pls., check PATCH3 out.
> That provided no information at all about how to use the api.
>
> If it did, you would see that your api is broken for 32/64bit kernels
> and will fall over into nasty pieces the first time you try to use it
> there, which means it hasn't been tested at all :(

It has been tested of course ;)
I tested it only in 64 bit environment however where both kernel and 
user space applications were compiled on the same machine with the same 
compiler and it could be that "int" had the same number of bytes both in 
kernel and in user space application. Therefore it worked perfectly - I 
patched DPDK to use the new uio_pci_generic MSI-X API to test this and I 
have verified that all 3 interrupt modes work: MSI-X with SR-IOV VF 
device in Amazon EC2 guest and INT#x and MSI with a PF device on bare 
metal server.

However I agree using uint32_t for "vec" and "fd" would be much more 
correct.

>
> thanks,
>
> greg k-h


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 10:48         ` Vlad Zolotarov
@ 2015-10-05 10:57           ` Greg KH
  2015-10-05 11:09             ` Avi Kivity
  2015-10-05 11:41             ` Vlad Zolotarov
  0 siblings, 2 replies; 96+ messages in thread
From: Greg KH @ 2015-10-05 10:57 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote:
> 
> 
> On 10/05/15 10:56, Greg KH wrote:
> >On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
> >>>>+struct msix_info {
> >>>>+	int num_irqs;
> >>>>+	struct msix_entry *table;
> >>>>+	struct uio_msix_irq_ctx {
> >>>>+		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
> >>>Why are you using eventfd for msi vectors?  What's the reason for
> >>>needing this?
> >>A small correction - for MSI-X vectors. There may be only one MSI vector per
> >>PCI function and if it's used it would use the same interface as a legacy
> >>INT#x interrupt uses at the moment.
> >>So, for MSI-X case the reason is that there may be (in most cases there will
> >>be) more than one interrupt vector. Thus, as I've explained in a PATCH1
> >>thread we need a way to indicated each of them separately. eventfd seems
> >>like a good way of doing so. If u have better ideas, pls., share.
> >You need to document what you are doing here, I don't see any
> >explaination for using eventfd at all.
> >
> >And no, I don't know of any other solution as I don't know what you are
> >trying to do here (hint, the changelog didn't document it...)
> >
> >>>You haven't documented how this api works at all, you are going to have
> >>>to a lot more work to justify this, as this greatly increases the
> >>>complexity of the user/kernel api in unknown ways.
> >>I actually do documented it a bit. Pls., check PATCH3 out.
> >That provided no information at all about how to use the api.
> >
> >If it did, you would see that your api is broken for 32/64bit kernels
> >and will fall over into nasty pieces the first time you try to use it
> >there, which means it hasn't been tested at all :(
> 
> It has been tested of course ;)
> I tested it only in 64 bit environment however where both kernel and user
> space applications were compiled on the same machine with the same compiler
> and it could be that "int" had the same number of bytes both in kernel and
> in user space application. Therefore it worked perfectly - I patched DPDK to
> use the new uio_pci_generic MSI-X API to test this and I have verified that
> all 3 interrupt modes work: MSI-X with SR-IOV VF device in Amazon EC2 guest
> and INT#x and MSI with a PF device on bare metal server.
> 
> However I agree using uint32_t for "vec" and "fd" would be much more
> correct.

I don't think file descriptors are __u32 on a 64bit arch, are they?

And NEVER use the _t types in kernel code, the namespaces is all wrong
and it is not applicable for us, sorry.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 10:57           ` Greg KH
@ 2015-10-05 11:09             ` Avi Kivity
  2015-10-05 13:08               ` Greg KH
  2015-10-05 11:41             ` Vlad Zolotarov
  1 sibling, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-05 11:09 UTC (permalink / raw)
  To: Greg KH, Vlad Zolotarov
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On 10/05/2015 01:57 PM, Greg KH wrote:
> On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote:
>>
>> On 10/05/15 10:56, Greg KH wrote:
>>> On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
>>>>>> +struct msix_info {
>>>>>> +	int num_irqs;
>>>>>> +	struct msix_entry *table;
>>>>>> +	struct uio_msix_irq_ctx {
>>>>>> +		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
>>>>> Why are you using eventfd for msi vectors?  What's the reason for
>>>>> needing this?
>>>> A small correction - for MSI-X vectors. There may be only one MSI vector per
>>>> PCI function and if it's used it would use the same interface as a legacy
>>>> INT#x interrupt uses at the moment.
>>>> So, for MSI-X case the reason is that there may be (in most cases there will
>>>> be) more than one interrupt vector. Thus, as I've explained in a PATCH1
>>>> thread we need a way to indicated each of them separately. eventfd seems
>>>> like a good way of doing so. If u have better ideas, pls., share.
>>> You need to document what you are doing here, I don't see any
>>> explaination for using eventfd at all.
>>>
>>> And no, I don't know of any other solution as I don't know what you are
>>> trying to do here (hint, the changelog didn't document it...)
>>>
>>>>> You haven't documented how this api works at all, you are going to have
>>>>> to a lot more work to justify this, as this greatly increases the
>>>>> complexity of the user/kernel api in unknown ways.
>>>> I actually do documented it a bit. Pls., check PATCH3 out.
>>> That provided no information at all about how to use the api.
>>>
>>> If it did, you would see that your api is broken for 32/64bit kernels
>>> and will fall over into nasty pieces the first time you try to use it
>>> there, which means it hasn't been tested at all :(
>> It has been tested of course ;)
>> I tested it only in 64 bit environment however where both kernel and user
>> space applications were compiled on the same machine with the same compiler
>> and it could be that "int" had the same number of bytes both in kernel and
>> in user space application. Therefore it worked perfectly - I patched DPDK to
>> use the new uio_pci_generic MSI-X API to test this and I have verified that
>> all 3 interrupt modes work: MSI-X with SR-IOV VF device in Amazon EC2 guest
>> and INT#x and MSI with a PF device on bare metal server.
>>
>> However I agree using uint32_t for "vec" and "fd" would be much more
>> correct.
> I don't think file descriptors are __u32 on a 64bit arch, are they?
>
> And NEVER use the _t types in kernel code, the namespaces is all wrong
> and it is not applicable for us, sorry.

Wasn't the real reason that they aren't defined (or reserved) by C89, 
and therefore could clash with a user identifier, rather than some 
inherent wrongness?


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 10:57           ` Greg KH
  2015-10-05 11:09             ` Avi Kivity
@ 2015-10-05 11:41             ` Vlad Zolotarov
  2015-10-05 11:47               ` Avi Kivity
  1 sibling, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05 11:41 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/05/15 13:57, Greg KH wrote:
> On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote:
>>
>> On 10/05/15 10:56, Greg KH wrote:
>>> On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
>>>>>> +struct msix_info {
>>>>>> +	int num_irqs;
>>>>>> +	struct msix_entry *table;
>>>>>> +	struct uio_msix_irq_ctx {
>>>>>> +		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
>>>>> Why are you using eventfd for msi vectors?  What's the reason for
>>>>> needing this?
>>>> A small correction - for MSI-X vectors. There may be only one MSI vector per
>>>> PCI function and if it's used it would use the same interface as a legacy
>>>> INT#x interrupt uses at the moment.
>>>> So, for MSI-X case the reason is that there may be (in most cases there will
>>>> be) more than one interrupt vector. Thus, as I've explained in a PATCH1
>>>> thread we need a way to indicated each of them separately. eventfd seems
>>>> like a good way of doing so. If u have better ideas, pls., share.
>>> You need to document what you are doing here, I don't see any
>>> explaination for using eventfd at all.
>>>
>>> And no, I don't know of any other solution as I don't know what you are
>>> trying to do here (hint, the changelog didn't document it...)
>>>
>>>>> You haven't documented how this api works at all, you are going to have
>>>>> to a lot more work to justify this, as this greatly increases the
>>>>> complexity of the user/kernel api in unknown ways.
>>>> I actually do documented it a bit. Pls., check PATCH3 out.
>>> That provided no information at all about how to use the api.
>>>
>>> If it did, you would see that your api is broken for 32/64bit kernels
>>> and will fall over into nasty pieces the first time you try to use it
>>> there, which means it hasn't been tested at all :(
>> It has been tested of course ;)
>> I tested it only in 64 bit environment however where both kernel and user
>> space applications were compiled on the same machine with the same compiler
>> and it could be that "int" had the same number of bytes both in kernel and
>> in user space application. Therefore it worked perfectly - I patched DPDK to
>> use the new uio_pci_generic MSI-X API to test this and I have verified that
>> all 3 interrupt modes work: MSI-X with SR-IOV VF device in Amazon EC2 guest
>> and INT#x and MSI with a PF device on bare metal server.
>>
>> However I agree using uint32_t for "vec" and "fd" would be much more
>> correct.
> I don't think file descriptors are __u32 on a 64bit arch, are they?

I think they are "int" on all platforms and as far as I know u32 should 
be enough to contain int on any platform.

>
> And NEVER use the _t types in kernel code,

Never meant it - it was for a user space interface. For a kernel it's 
u32 of course.

>   the namespaces is all wrong
> and it is not applicable for us, sorry.
>
> thanks,
>
> greg k-h


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 11:41             ` Vlad Zolotarov
@ 2015-10-05 11:47               ` Avi Kivity
  2015-10-05 11:53                 ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-05 11:47 UTC (permalink / raw)
  To: Vlad Zolotarov, Greg KH
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On 10/05/2015 02:41 PM, Vlad Zolotarov wrote:
>
>
> On 10/05/15 13:57, Greg KH wrote:
>> On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote:
>>>
>>> On 10/05/15 10:56, Greg KH wrote:
>>>> On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
>>>>>>> +struct msix_info {
>>>>>>> +    int num_irqs;
>>>>>>> +    struct msix_entry *table;
>>>>>>> +    struct uio_msix_irq_ctx {
>>>>>>> +        struct eventfd_ctx *trigger;    /* MSI-x vector to 
>>>>>>> eventfd */
>>>>>> Why are you using eventfd for msi vectors?  What's the reason for
>>>>>> needing this?
>>>>> A small correction - for MSI-X vectors. There may be only one MSI 
>>>>> vector per
>>>>> PCI function and if it's used it would use the same interface as a 
>>>>> legacy
>>>>> INT#x interrupt uses at the moment.
>>>>> So, for MSI-X case the reason is that there may be (in most cases 
>>>>> there will
>>>>> be) more than one interrupt vector. Thus, as I've explained in a 
>>>>> PATCH1
>>>>> thread we need a way to indicated each of them separately. eventfd 
>>>>> seems
>>>>> like a good way of doing so. If u have better ideas, pls., share.
>>>> You need to document what you are doing here, I don't see any
>>>> explaination for using eventfd at all.
>>>>
>>>> And no, I don't know of any other solution as I don't know what you 
>>>> are
>>>> trying to do here (hint, the changelog didn't document it...)
>>>>
>>>>>> You haven't documented how this api works at all, you are going 
>>>>>> to have
>>>>>> to a lot more work to justify this, as this greatly increases the
>>>>>> complexity of the user/kernel api in unknown ways.
>>>>> I actually do documented it a bit. Pls., check PATCH3 out.
>>>> That provided no information at all about how to use the api.
>>>>
>>>> If it did, you would see that your api is broken for 32/64bit kernels
>>>> and will fall over into nasty pieces the first time you try to use it
>>>> there, which means it hasn't been tested at all :(
>>> It has been tested of course ;)
>>> I tested it only in 64 bit environment however where both kernel and 
>>> user
>>> space applications were compiled on the same machine with the same 
>>> compiler
>>> and it could be that "int" had the same number of bytes both in 
>>> kernel and
>>> in user space application. Therefore it worked perfectly - I patched 
>>> DPDK to
>>> use the new uio_pci_generic MSI-X API to test this and I have 
>>> verified that
>>> all 3 interrupt modes work: MSI-X with SR-IOV VF device in Amazon 
>>> EC2 guest
>>> and INT#x and MSI with a PF device on bare metal server.
>>>
>>> However I agree using uint32_t for "vec" and "fd" would be much more
>>> correct.
>> I don't think file descriptors are __u32 on a 64bit arch, are they?
>
> I think they are "int" on all platforms and as far as I know u32 
> should be enough to contain int on any platform.
>

You need to make sure structures have the same layout on both 32-bit and 
64-bit systems, or you'll have to code compat ioctl translations for 
them.  The best way to do that is to use __u32 so the sizes are obvious, 
even for int, and to pad everything to 64 bit:

> +struct msix_info { 

+    __u32 num_irqs;
+    __u32 pad; // so pointer below is aligned to 64-bit on both 32-bit 
and 64-bit userspace
>
> +    struct msix_entry *table;
> +    struct uio_msix_irq_ctx {
> +        struct eventfd_ctx *trigger;    /* MSI-x vector to eventfd */


>>
>> And NEVER use the _t types in kernel code,
>
> Never meant it - it was for a user space interface. For a kernel it's 
> u32 of course.
>

For interfaces, use __u32.  You can't use uint32_t because if someone 
uses C89 in 2015, they may not have <cstdint.h>.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 11:47               ` Avi Kivity
@ 2015-10-05 11:53                 ` Vlad Zolotarov
  0 siblings, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-05 11:53 UTC (permalink / raw)
  To: Avi Kivity, Greg KH
  Cc: linux-kernel, mst, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/05/15 14:47, Avi Kivity wrote:
> On 10/05/2015 02:41 PM, Vlad Zolotarov wrote:
>>
>>
>> On 10/05/15 13:57, Greg KH wrote:
>>> On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote:
>>>>
>>>> On 10/05/15 10:56, Greg KH wrote:
>>>>> On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
>>>>>>>> +struct msix_info {
>>>>>>>> +    int num_irqs;
>>>>>>>> +    struct msix_entry *table;
>>>>>>>> +    struct uio_msix_irq_ctx {
>>>>>>>> +        struct eventfd_ctx *trigger;    /* MSI-x vector to 
>>>>>>>> eventfd */
>>>>>>> Why are you using eventfd for msi vectors?  What's the reason for
>>>>>>> needing this?
>>>>>> A small correction - for MSI-X vectors. There may be only one MSI 
>>>>>> vector per
>>>>>> PCI function and if it's used it would use the same interface as 
>>>>>> a legacy
>>>>>> INT#x interrupt uses at the moment.
>>>>>> So, for MSI-X case the reason is that there may be (in most cases 
>>>>>> there will
>>>>>> be) more than one interrupt vector. Thus, as I've explained in a 
>>>>>> PATCH1
>>>>>> thread we need a way to indicated each of them separately. 
>>>>>> eventfd seems
>>>>>> like a good way of doing so. If u have better ideas, pls., share.
>>>>> You need to document what you are doing here, I don't see any
>>>>> explaination for using eventfd at all.
>>>>>
>>>>> And no, I don't know of any other solution as I don't know what 
>>>>> you are
>>>>> trying to do here (hint, the changelog didn't document it...)
>>>>>
>>>>>>> You haven't documented how this api works at all, you are going 
>>>>>>> to have
>>>>>>> to a lot more work to justify this, as this greatly increases the
>>>>>>> complexity of the user/kernel api in unknown ways.
>>>>>> I actually do documented it a bit. Pls., check PATCH3 out.
>>>>> That provided no information at all about how to use the api.
>>>>>
>>>>> If it did, you would see that your api is broken for 32/64bit kernels
>>>>> and will fall over into nasty pieces the first time you try to use it
>>>>> there, which means it hasn't been tested at all :(
>>>> It has been tested of course ;)
>>>> I tested it only in 64 bit environment however where both kernel 
>>>> and user
>>>> space applications were compiled on the same machine with the same 
>>>> compiler
>>>> and it could be that "int" had the same number of bytes both in 
>>>> kernel and
>>>> in user space application. Therefore it worked perfectly - I 
>>>> patched DPDK to
>>>> use the new uio_pci_generic MSI-X API to test this and I have 
>>>> verified that
>>>> all 3 interrupt modes work: MSI-X with SR-IOV VF device in Amazon 
>>>> EC2 guest
>>>> and INT#x and MSI with a PF device on bare metal server.
>>>>
>>>> However I agree using uint32_t for "vec" and "fd" would be much more
>>>> correct.
>>> I don't think file descriptors are __u32 on a 64bit arch, are they?
>>
>> I think they are "int" on all platforms and as far as I know u32 
>> should be enough to contain int on any platform.
>>
>
> You need to make sure structures have the same layout on both 32-bit 
> and 64-bit systems, or you'll have to code compat ioctl translations 
> for them.  The best way to do that is to use __u32 so the sizes are 
> obvious, even for int, and to pad everything to 64 bit:

Sure, but the structure below is not the one that is passed in ioctl() - 
it's an internal uio_pci_generic state and there is nothing to worry about.
The one in question is struct uio_pci_generic_irq_set from 
uio_pci_generic.h:

struct uio_pci_generic_irq_set {
     int vec; /* index of the IRQ to connect to starting from 0 */
     int fd;
};

It should be
struct uio_pci_generic_irq_set {
     __u32 vec; /* index of the IRQ to connect to starting from 0 */
     __u32 fd;
};

instead.


>
>> +struct msix_info { 
>
> +    __u32 num_irqs;
> +    __u32 pad; // so pointer below is aligned to 64-bit on both 
> 32-bit and 64-bit userspace
>>
>> +    struct msix_entry *table;
>> +    struct uio_msix_irq_ctx {
>> +        struct eventfd_ctx *trigger;    /* MSI-x vector to eventfd */
>
>
>>>
>>> And NEVER use the _t types in kernel code,
>>
>> Never meant it - it was for a user space interface. For a kernel it's 
>> u32 of course.
>>
>
> For interfaces, use __u32.  You can't use uint32_t because if someone 
> uses C89 in 2015, they may not have <cstdint.h>.
>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 11:09             ` Avi Kivity
@ 2015-10-05 13:08               ` Greg KH
  0 siblings, 0 replies; 96+ messages in thread
From: Greg KH @ 2015-10-05 13:08 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Vlad Zolotarov, linux-kernel, mst, hjk, corbet, bruce.richardson,
	avi, gleb, stephen, alexander.duyck

On Mon, Oct 05, 2015 at 02:09:32PM +0300, Avi Kivity wrote:
> On 10/05/2015 01:57 PM, Greg KH wrote:
> >On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote:
> >>
> >>On 10/05/15 10:56, Greg KH wrote:
> >>>On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote:
> >>>>>>+struct msix_info {
> >>>>>>+	int num_irqs;
> >>>>>>+	struct msix_entry *table;
> >>>>>>+	struct uio_msix_irq_ctx {
> >>>>>>+		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
> >>>>>Why are you using eventfd for msi vectors?  What's the reason for
> >>>>>needing this?
> >>>>A small correction - for MSI-X vectors. There may be only one MSI vector per
> >>>>PCI function and if it's used it would use the same interface as a legacy
> >>>>INT#x interrupt uses at the moment.
> >>>>So, for MSI-X case the reason is that there may be (in most cases there will
> >>>>be) more than one interrupt vector. Thus, as I've explained in a PATCH1
> >>>>thread we need a way to indicated each of them separately. eventfd seems
> >>>>like a good way of doing so. If u have better ideas, pls., share.
> >>>You need to document what you are doing here, I don't see any
> >>>explaination for using eventfd at all.
> >>>
> >>>And no, I don't know of any other solution as I don't know what you are
> >>>trying to do here (hint, the changelog didn't document it...)
> >>>
> >>>>>You haven't documented how this api works at all, you are going to have
> >>>>>to a lot more work to justify this, as this greatly increases the
> >>>>>complexity of the user/kernel api in unknown ways.
> >>>>I actually do documented it a bit. Pls., check PATCH3 out.
> >>>That provided no information at all about how to use the api.
> >>>
> >>>If it did, you would see that your api is broken for 32/64bit kernels
> >>>and will fall over into nasty pieces the first time you try to use it
> >>>there, which means it hasn't been tested at all :(
> >>It has been tested of course ;)
> >>I tested it only in 64 bit environment however where both kernel and user
> >>space applications were compiled on the same machine with the same compiler
> >>and it could be that "int" had the same number of bytes both in kernel and
> >>in user space application. Therefore it worked perfectly - I patched DPDK to
> >>use the new uio_pci_generic MSI-X API to test this and I have verified that
> >>all 3 interrupt modes work: MSI-X with SR-IOV VF device in Amazon EC2 guest
> >>and INT#x and MSI with a PF device on bare metal server.
> >>
> >>However I agree using uint32_t for "vec" and "fd" would be much more
> >>correct.
> >I don't think file descriptors are __u32 on a 64bit arch, are they?
> >
> >And NEVER use the _t types in kernel code, the namespaces is all wrong
> >and it is not applicable for us, sorry.
> 
> Wasn't the real reason that they aren't defined (or reserved) by C89, and
> therefore could clash with a user identifier, rather than some inherent
> wrongness?

Kind of, my memory is vague.  There's a great rant from Linus about why
they don't work in the kernel somewhere in the lkml archives...

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-04 20:43 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
  2015-10-05  3:11   ` Greg KH
  2015-10-05  8:41   ` Stephen Hemminger
@ 2015-10-05 19:16   ` Michael S. Tsirkin
  2 siblings, 0 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-05 19:16 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Sun, Oct 04, 2015 at 11:43:17PM +0300, Vlad Zolotarov wrote:
> Add support for MSI and MSI-X interrupt modes:
>    - Interrupt mode selection order is:
>         INT#X (for backward compatibility) -> MSI-X -> MSI.
>    - Add ioctl() commands:
>       - UIO_PCI_GENERIC_INT_MODE_GET: query the current interrupt mode.
>       - UIO_PCI_GENERIC_IRQ_NUM_GET: query the maximum number of IRQs.

This might be something that humans might want to
read or configure, as well.
In fact, # of interrupts is a limited resource
so it seems very reasonable to have a system
admin configure it rather than the application.
So why not use sysfs attributes for this?

>       - UIO_PCI_GENERIC_IRQ_SET: bind the IRQ to eventfd (similar to vfio).

I think that as a first step, you should just use the regular
uio sysfs interface to report interrupts.
Adding eventfd support to uio sounds like a reasonable
extension but I don't see why it needs to be done
as part of the same patch.

>    - Add mappings to all bars (memory and portio):

I don't think it's a good idea.
People already can poke at BARs through device sysfs
(and already abuse this interface).
We don't want to add another API to do exactly the same thing.
In any case, this should be a separate patch if necessary.

> 	some devices have
>      registers related to MSI/MSI-X handling outside BAR0.

This is very vague. Definitely not enough to justify a new
kernel/userspace API.


> 
> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
> ---
> New in v3:
>    - Add __iomem qualifier to temp buffer receiving ioremap value.
> 
> New in v2:
>    - Added #include <linux/uaccess.h> to uio_pci_generic.c
> 
> Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
> ---
>  drivers/uio/uio_pci_generic.c   | 410 +++++++++++++++++++++++++++++++++++++---
>  include/linux/uio_pci_generic.h |  36 ++++
>  2 files changed, 423 insertions(+), 23 deletions(-)
>  create mode 100644 include/linux/uio_pci_generic.h
> 
> diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
> index d0b508b..6b8b1789 100644
> --- a/drivers/uio/uio_pci_generic.c
> +++ b/drivers/uio/uio_pci_generic.c
> @@ -22,16 +22,32 @@
>  #include <linux/device.h>
>  #include <linux/module.h>
>  #include <linux/pci.h>
> +#include <linux/msi.h>
>  #include <linux/slab.h>
>  #include <linux/uio_driver.h>
> +#include <linux/uio_pci_generic.h>
> +#include <linux/eventfd.h>
> +#include <linux/uaccess.h>
>  
>  #define DRIVER_VERSION	"0.01.0"
>  #define DRIVER_AUTHOR	"Michael S. Tsirkin <mst@redhat.com>"
>  #define DRIVER_DESC	"Generic UIO driver for PCI 2.3 devices"
>  
> +struct msix_info {
> +	int num_irqs;
> +	struct msix_entry *table;
> +	struct uio_msix_irq_ctx {
> +		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
> +		char *name;			/* name in /proc/interrupts */
> +	} *ctx;
> +};
> +
>  struct uio_pci_generic_dev {
>  	struct uio_info info;
>  	struct pci_dev *pdev;
> +	struct mutex msix_state_lock;		/* ioctl mutex */
> +	enum uio_int_mode int_mode;
> +	struct msix_info msix;
>  };
>  
>  static inline struct uio_pci_generic_dev *
> @@ -40,9 +56,177 @@ to_uio_pci_generic_dev(struct uio_info *info)
>  	return container_of(info, struct uio_pci_generic_dev, info);
>  }
>  
> -/* Interrupt handler. Read/modify/write the command register to disable
> - * the interrupt. */
> -static irqreturn_t irqhandler(int irq, struct uio_info *info)
> +/* Unmap previously ioremap'd resources */
> +static void release_iomaps(struct uio_pci_generic_dev *gdev)
> +{
> +	int i;
> +	struct uio_mem *mem = gdev->info.mem;
> +
> +	for (i = 0; i < MAX_UIO_MAPS; i++, mem++) {
> +		if (mem->internal_addr) {
> +			iounmap(mem->internal_addr);
> +			mem->internal_addr = NULL;
> +		}
> +	}
> +}
> +
> +static int setup_maps(struct pci_dev *pdev, struct uio_info *info)
> +{
> +	int i, m = 0, p = 0, err;
> +	static const char * const bar_names[] = {
> +		"BAR0",	"BAR1",	"BAR2",	"BAR3",	"BAR4",	"BAR5",
> +	};
> +
> +	for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
> +		unsigned long start = pci_resource_start(pdev, i);
> +		unsigned long flags = pci_resource_flags(pdev, i);
> +		unsigned long len = pci_resource_len(pdev, i);
> +
> +		if (start == 0 || len == 0)
> +			continue;
> +
> +		if (flags & IORESOURCE_MEM) {
> +			void __iomem *addr;
> +
> +			if (m >= MAX_UIO_MAPS)
> +				continue;
> +
> +			addr = ioremap(start, len);
> +			if (addr == NULL) {
> +				err = -EINVAL;
> +				goto fail;
> +			}
> +
> +			info->mem[m].name = bar_names[i];
> +			info->mem[m].addr = start;
> +			info->mem[m].internal_addr = addr;
> +			info->mem[m].size = len;
> +			info->mem[m].memtype = UIO_MEM_PHYS;
> +			++m;
> +		} else if (flags & IORESOURCE_IO) {
> +			if (p >= MAX_UIO_PORT_REGIONS)
> +				continue;
> +
> +			info->port[p].name = bar_names[i];
> +			info->port[p].start = start;
> +			info->port[p].size = len;
> +			info->port[p].porttype = UIO_PORT_X86;
> +			++p;
> +		}
> +	}
> +
> +	return 0;
> +fail:
> +	for (i = 0; i < m; i++) {
> +		iounmap(info->mem[i].internal_addr);
> +		info->mem[i].internal_addr = NULL;
> +	}
> +
> +	return err;
> +}
> +
> +static irqreturn_t msix_irqhandler(int irq, void *arg);
> +
> +/* set the mapping between vector # and existing eventfd. */
> +static int set_irq_eventfd(struct uio_pci_generic_dev *gdev, int vec, int fd)
> +{
> +	struct uio_msix_irq_ctx *ctx;
> +	struct eventfd_ctx *trigger;
> +	struct pci_dev *pdev = gdev->pdev;
> +	int irq, err;
> +
> +	if (vec >= gdev->msix.num_irqs) {
> +		dev_notice(&gdev->pdev->dev, "vec %u >= num_vec %u\n",
> +			   vec, gdev->msix.num_irqs);
> +		return -ERANGE;
> +	}
> +
> +	irq = gdev->msix.table[vec].vector;
> +
> +	/* Cleanup existing irq mapping */
> +	ctx = &gdev->msix.ctx[vec];
> +	if (ctx->trigger) {
> +		free_irq(irq, ctx->trigger);
> +		eventfd_ctx_put(ctx->trigger);
> +		ctx->trigger = NULL;
> +	}
> +
> +	/* Passing -1 is used to disable interrupt */
> +	if (fd < 0)
> +		return 0;
> +
> +
> +	trigger = eventfd_ctx_fdget(fd);
> +	if (IS_ERR(trigger)) {
> +		err = PTR_ERR(trigger);
> +		dev_notice(&gdev->pdev->dev,
> +			   "eventfd ctx get failed: %d\n", err);
> +		return err;
> +	}
> +
> +	err = request_irq(irq, msix_irqhandler, 0, ctx->name, trigger);
> +	if (err) {
> +		dev_notice(&pdev->dev, "request irq failed: %d\n", err);
> +		eventfd_ctx_put(trigger);
> +		return err;
> +	}
> +
> +	dev_dbg(&pdev->dev, "map vector %u to fd %d trigger %p\n",
> +		vec, fd, trigger);
> +	ctx->trigger = trigger;
> +
> +	return 0;
> +}
> +
> +static int uio_pci_generic_ioctl(struct uio_info *info, unsigned int cmd,
> +				 unsigned long arg)
> +{
> +	struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
> +	struct uio_pci_generic_irq_set hdr;
> +	int err;
> +
> +	switch (cmd) {
> +	case UIO_PCI_GENERIC_IRQ_SET:
> +		if (copy_from_user(&hdr, (void __user *)arg, sizeof(hdr)))
> +			return -EFAULT;
> +
> +		/* Locking is needed to ensure two things:
> +		 *  1) Two IRQ_SET ioctl()'s are not running in parallel.
> +		 *  2) IRQ_SET ioctl() is not running in parallel with remove().
> +		 */
> +		mutex_lock(&gdev->msix_state_lock);
> +		if (gdev->int_mode != UIO_INT_MODE_MSIX) {
> +			mutex_unlock(&gdev->msix_state_lock);
> +			return -EOPNOTSUPP;
> +		}
> +
> +		err = set_irq_eventfd(gdev, hdr.vec, hdr.fd);
> +		mutex_unlock(&gdev->msix_state_lock);
> +
> +		break;
> +	case UIO_PCI_GENERIC_IRQ_NUM_GET:
> +		if (gdev->int_mode == UIO_INT_MODE_NONE)
> +			err = put_user(0, (u32 __user *)arg);
> +		else if (gdev->int_mode != UIO_INT_MODE_MSIX)
> +			err = put_user(1, (u32 __user *)arg);
> +		else
> +			err = put_user(gdev->msix.num_irqs,
> +				       (u32 __user *)arg);
> +
> +		break;
> +	case UIO_PCI_GENERIC_INT_MODE_GET:
> +		err = put_user(gdev->int_mode, (u32 __user *)arg);
> +
> +		break;
> +	default:
> +		err = -EOPNOTSUPP;
> +	}
> +
> +	return err;
> +}
> +
> +/* INT#X interrupt handler. */
> +static irqreturn_t intx_irqhandler(int irq, struct uio_info *info)
>  {
>  	struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
>  
> @@ -53,8 +237,162 @@ static irqreturn_t irqhandler(int irq, struct uio_info *info)
>  	return IRQ_HANDLED;
>  }
>  
> -static int probe(struct pci_dev *pdev,
> -			   const struct pci_device_id *id)
> +/* MSI interrupt handler. */
> +static irqreturn_t msi_irqhandler(int irq, struct uio_info *info)
> +{
> +	/* UIO core will signal the user process. */
> +	return IRQ_HANDLED;
> +}
> +
> +/* MSI-X interrupt handler. */
> +static irqreturn_t msix_irqhandler(int irq, void *arg)
> +{
> +	struct eventfd_ctx *trigger = arg;
> +
> +	pr_devel("irq %u trigger %p\n", irq, trigger);
> +
> +	eventfd_signal(trigger, 1);
> +	return IRQ_HANDLED;
> +}
> +
> +static bool enable_intx(struct uio_pci_generic_dev *gdev)
> +{
> +	struct pci_dev *pdev = gdev->pdev;
> +
> +	if (!pdev->irq || !pci_intx_mask_supported(pdev))
> +		return false;
> +
> +	gdev->int_mode = UIO_INT_MODE_INTX;
> +	gdev->info.irq = pdev->irq;
> +	gdev->info.irq_flags = IRQF_SHARED;
> +	gdev->info.handler = intx_irqhandler;
> +
> +	return true;
> +}
> +
> +static void set_pci_master(struct pci_dev *pdev)
> +{
> +	pci_set_master(pdev);
> +	dev_warn(&pdev->dev, "Enabling PCI bus mastering. Bogus userspace application is able to trash kernel memory using DMA");
> +	add_taint(TAINT_USER, LOCKDEP_STILL_OK);

Basically, users won't notice this until their kernel
crashes, and then it's too late.

Further, this happens silently: it's not an admin doing
something insecure intentionally.
It's the driver doing something insecure.

This doesn't make sense.

IIUC, the issue is that userspace can reprogram MSI/MSI-X
configuration, causing it to write into kernel memory.

So IMHO the right thing to do is to prevent userspace from
poking at MSI/MSI-X configuration when this is enabled.

This includes the MSI/MSI-X capability in config space
and the MSI-X table in the relevant BAR.

For MSI-X it's especially tricky.
Ideally, we'd map a zero page over that region, because
existing applications expect to be able to map the whole
BAR.

This will need some infrastructure work.

> +}
> +
> +static bool enable_msi(struct uio_pci_generic_dev *gdev)
> +{
> +	struct pci_dev *pdev = gdev->pdev;
> +
> +	set_pci_master(pdev);
> +
> +	if (pci_enable_msi(pdev))
> +		return false;
> +
> +	gdev->int_mode = UIO_INT_MODE_MSI;
> +	gdev->info.irq = pdev->irq;
> +	gdev->info.irq_flags = 0;
> +	gdev->info.handler = msi_irqhandler;
> +
> +	return true;
> +}
> +
> +static bool enable_msix(struct uio_pci_generic_dev *gdev)
> +{
> +	struct pci_dev *pdev = gdev->pdev;
> +	int i, vectors = pci_msix_vec_count(pdev);
> +
> +	if (vectors <= 0)
> +		return false;
> +
> +	gdev->msix.table = kcalloc(vectors, sizeof(struct msix_entry),
> +				   GFP_KERNEL);
> +	if (!gdev->msix.table) {
> +		dev_err(&pdev->dev, "Failed to allocate memory for MSI-X table");
> +		return false;
> +	}
> +
> +	gdev->msix.ctx = kcalloc(vectors, sizeof(struct uio_msix_irq_ctx),
> +				 GFP_KERNEL);
> +	if (!gdev->msix.ctx) {
> +		dev_err(&pdev->dev, "Failed to allocate memory for MSI-X contexts");
> +		goto err_ctx_alloc;
> +	}
> +
> +	for (i = 0; i < vectors; i++) {
> +		gdev->msix.table[i].entry = i;
> +		gdev->msix.ctx[i].name = kasprintf(GFP_KERNEL,
> +						   KBUILD_MODNAME "[%d](%s)",
> +						   i, pci_name(pdev));
> +		if (!gdev->msix.ctx[i].name)
> +			goto err_name_alloc;
> +	}
> +
> +	set_pci_master(pdev);
> +
> +	if (pci_enable_msix(pdev, gdev->msix.table, vectors))
> +		goto err_msix_enable;
> +
> +	gdev->int_mode = UIO_INT_MODE_MSIX;
> +	gdev->info.irq = UIO_IRQ_CUSTOM;
> +	gdev->msix.num_irqs = vectors;
> +
> +	return true;
> +
> +err_msix_enable:
> +	pci_clear_master(pdev);
> +err_name_alloc:
> +	for (i = 0; i < vectors; i++)
> +		kfree(gdev->msix.ctx[i].name);
> +
> +	kfree(gdev->msix.ctx);
> +err_ctx_alloc:
> +	kfree(gdev->msix.table);
> +
> +	return false;
> +}
> +
> +/**
> + * Disable interrupts and free related resources.
> + *
> + * @gdev device handle
> + *
> + * This function should be called after the corresponding UIO device has been
> + * unregistered. This will ensure that there are no currently running ioctl()s
> + * and there won't be any new ones until next probe() call.
> + */
> +static void disable_intr(struct uio_pci_generic_dev *gdev)
> +{
> +	struct pci_dev *pdev = gdev->pdev;
> +	int i;
> +
> +	switch (gdev->int_mode) {
> +	case UIO_INT_MODE_MSI:
> +		pci_disable_msi(pdev);
> +		pci_clear_master(pdev);
> +
> +		break;
> +	case UIO_INT_MODE_MSIX:
> +		/* No need for locking here since there shouldn't be any
> +		 * ioctl()s running by now.
> +		 */
> +		for (i = 0; i < gdev->msix.num_irqs; i++) {
> +			if (gdev->msix.ctx[i].trigger)
> +				set_irq_eventfd(gdev, i, -1);
> +
> +			kfree(gdev->msix.ctx[i].name);
> +		}
> +
> +		pci_disable_msix(pdev);
> +		pci_clear_master(pdev);
> +		kfree(gdev->msix.ctx);
> +		kfree(gdev->msix.table);
> +
> +		break;
> +	default:
> +		break;
> +	}
> +}
> +
> +
> +static int probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct uio_pci_generic_dev *gdev;
>  	int err;
> @@ -66,42 +404,64 @@ static int probe(struct pci_dev *pdev,
>  		return err;
>  	}
>  
> -	if (!pdev->irq) {
> -		dev_warn(&pdev->dev, "No IRQ assigned to device: "
> -			 "no support for interrupts?\n");
> -		pci_disable_device(pdev);
> -		return -ENODEV;
> -	}
> -
> -	if (!pci_intx_mask_supported(pdev)) {
> -		err = -ENODEV;
> -		goto err_verify;
> -	}
> -
>  	gdev = kzalloc(sizeof(struct uio_pci_generic_dev), GFP_KERNEL);
>  	if (!gdev) {
>  		err = -ENOMEM;
>  		goto err_alloc;
>  	}
>  
> +	gdev->pdev = pdev;
>  	gdev->info.name = "uio_pci_generic";
>  	gdev->info.version = DRIVER_VERSION;
> -	gdev->info.irq = pdev->irq;
> -	gdev->info.irq_flags = IRQF_SHARED;
> -	gdev->info.handler = irqhandler;
> -	gdev->pdev = pdev;
> +	gdev->info.ioctl = uio_pci_generic_ioctl;
> +	mutex_init(&gdev->msix_state_lock);
> +
> +	err = pci_request_regions(pdev, "uio_pci_generic");
> +	if (err != 0) {
> +		dev_err(&pdev->dev, "Cannot request regions\n");
> +		goto err_request_regions;
> +	}
> +
> +	/* Enable the corresponding interrupt mode. Try to enable INT#X first
> +	 * for backward compatibility.
> +	 */
> +	if (enable_intx(gdev))
> +		dev_info(&pdev->dev, "Using INT#x mode: IRQ %ld",
> +			 gdev->info.irq);
> +	else if (enable_msix(gdev))
> +		dev_info(&pdev->dev, "Using MSI-X mode: number of IRQs %d",
> +			 gdev->msix.num_irqs);
> +	else if (enable_msi(gdev))
> +		dev_info(&pdev->dev, "Using MSI mode: IRQ %ld", gdev->info.irq);
> +	else {
> +		err = -ENODEV;
> +		goto err_verify;
> +	}
> +
> +	/* remap resources */
> +	err = setup_maps(pdev, &gdev->info);
> +	if (err)
> +		goto err_maps;
>  
>  	err = uio_register_device(&pdev->dev, &gdev->info);
>  	if (err)
>  		goto err_register;
> +
>  	pci_set_drvdata(pdev, gdev);
>  
>  	return 0;
> +
>  err_register:
> +	release_iomaps(gdev);
> +err_maps:
> +	disable_intr(gdev);
> +err_verify:
> +	pci_release_regions(pdev);
> +err_request_regions:
>  	kfree(gdev);
>  err_alloc:
> -err_verify:
>  	pci_disable_device(pdev);
> +
>  	return err;
>  }
>  
> @@ -110,8 +470,12 @@ static void remove(struct pci_dev *pdev)
>  	struct uio_pci_generic_dev *gdev = pci_get_drvdata(pdev);
>  
>  	uio_unregister_device(&gdev->info);
> -	pci_disable_device(pdev);
> +	disable_intr(gdev);
> +	release_iomaps(gdev);
> +	pci_release_regions(pdev);
>  	kfree(gdev);
> +	pci_disable_device(pdev);
> +	pci_set_drvdata(pdev, NULL);
>  }
>  
>  static struct pci_driver uio_pci_driver = {
> diff --git a/include/linux/uio_pci_generic.h b/include/linux/uio_pci_generic.h
> new file mode 100644
> index 0000000..10716fc
> --- /dev/null
> +++ b/include/linux/uio_pci_generic.h
> @@ -0,0 +1,36 @@
> +/*
> + * include/linux/uio_pci_generic.h
> + *
> + * Userspace generic PCI IO driver.
> + *
> + * Licensed under the GPLv2 only.
> + */
> +
> +#ifndef _UIO_PCI_GENERIC_H_
> +#define _UIO_PCI_GENERIC_H_
> +
> +#include <linux/ioctl.h>
> +
> +enum uio_int_mode {
> +	UIO_INT_MODE_NONE,
> +	UIO_INT_MODE_INTX,
> +	UIO_INT_MODE_MSI,
> +	UIO_INT_MODE_MSIX
> +};
> +
> +/* bind the requested IRQ to the given eventfd */
> +struct uio_pci_generic_irq_set {
> +	int vec; /* index of the IRQ to connect to starting from 0 */
> +	int fd;
> +};
> +
> +#define UIO_PCI_GENERIC_BASE		0x86
> +
> +#define UIO_PCI_GENERIC_IRQ_SET		_IOW('I', UIO_PCI_GENERIC_BASE + 1, \
> +		struct uio_pci_generic_irq_set)
> +#define UIO_PCI_GENERIC_IRQ_NUM_GET	_IOW('I', UIO_PCI_GENERIC_BASE + 2, \
> +		uint32_t)
> +#define UIO_PCI_GENERIC_INT_MODE_GET	_IOW('I', UIO_PCI_GENERIC_BASE + 3, \
> +		uint32_t)
> +
> +#endif /* _UIO_PCI_GENERIC_H_ */
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
                   ` (3 preceding siblings ...)
  2015-10-04 20:45 ` [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
@ 2015-10-05 19:50 ` Michael S. Tsirkin
  2015-10-06  8:37   ` Vlad Zolotarov
  4 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-05 19:50 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Sun, Oct 04, 2015 at 11:43:15PM +0300, Vlad Zolotarov wrote:
> This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
>  
> Currently uio_pci_generic supports only legacy INT#x interrupts source. However
> there are situations when this is not enough, for instance SR-IOV VF devices that
> simply don't have INT#x capability. For such devices uio_pci_generic will simply
> fail (more specifically probe() will fail).
>  
> When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
> overhead and thus VFIO is not an option
> users that develop user-space drivers are left
> without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
> DPDK) just to be able to use UIO infrastructure.
>  
> This series provides a generic solution for this problem while preserving the original behaviour
> for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).

What is missing here is that drivers using uio_pci_generic generally
poke at config and BAR sysfs files of the device.

We can not stop them without breaking existing users, but this means
that we can't enable bus mastering and MSI/MSI-X blindly: userspace
bugs will corrupt the MSI-X table and/or MSi/MSI-X capability,
and cause device to overwrite random addresses, corrupting kernel
memory.

Your solution seems to be a warning in dmesg and tainting the
kernel, but that's not enough.

You need to add infrastructure to prevent this.

VFIO has some code to do this, but it's not bound by existing UIO API so it
simply fails the mmap.  We want I think existing applications to work,
so I suspect we need to make a hole there (probably map a zero page in
case apps want to read it, and maybe even set it up for COW in case they
tweak the PBA which sometimes happens to be in the same page).

Your patches also seem to add in eventfd and mmap capabilities which
seems to be orthogonal. They are there in VFIO which I'm guessing is the
real reason you do it.

So, what you are trying to do might be closer to extending VFIO which
already has a bunch of checks like that.  Yes, it also wants to program
the IOMMU.  So maybe do it with a separate device that can be root-only,
so unpriveledged users can't abuse it.

You should Cc, and talk to the VFIO maintainer.



> New in v3:
>    - Add __iomem qualifier to temp buffer receiving ioremap value.  
> 
> New in v2:
>    - Added #include <linux/uaccess.h> to uio_pci_generic.c
> 
> Vlad Zolotarov (3):
>   uio: add ioctl support
>   uio_pci_generic: add MSI/MSI-X support
>   Documentation: update uio-howto
> 
>  Documentation/DocBook/uio-howto.tmpl |  29 ++-
>  drivers/uio/uio.c                    |  15 ++
>  drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
>  include/linux/uio_driver.h           |   3 +
>  include/linux/uio_pci_generic.h      |  36 +++
>  5 files changed, 467 insertions(+), 26 deletions(-)
>  create mode 100644 include/linux/uio_pci_generic.h
> 
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-05 10:36         ` Vlad Zolotarov
@ 2015-10-05 20:02           ` Michael S. Tsirkin
       [not found]             ` <CAOYyTHZ2=UCYxuJKvd5S6qxp=84DBq5bMadg5wL0rFLZBh2-8Q@mail.gmail.com>
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-05 20:02 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: Greg KH, linux-kernel, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Mon, Oct 05, 2015 at 01:36:35PM +0300, Vlad Zolotarov wrote:
> >And then there's the issue of why we even need this, why not just
> >write a whole new driver for this, like the previous driver did (which
> >also used ioctls, yes, I didn't have the chance to object to that before
> >everyone else did...)
> 
> Which "previous driver" do u refer here?
> IMHO writing something instead of UIO (not just uio_pci_generic) seems like
> an overkill for solving this issue. Supporting MSI-X interrupts seem like a
> very beneficial feature for uio_pci_generic and it's really not _THAT_
> complicated API - just look at VFIO for a comparison... ;)

Except most things VFIO does is actually there for security.
Which, for a device that can do DMA and isn't even behind an IOMMU,
sounds like a pretty big deal actually.

> uio_pci_generic is clearly missing this important feature. And creating
> another user space driver infrastructure just to add it seems extremely
> unjustified.

uio_pci_generic was always intended to be used with extremely simple
devices which don't do DMA, or where someone else has set up the IOMMU
(like kvm does with CONFIG_KVM_DEVICE_ASSIGNMENT).

We need to be much more careful with MSI if there's no IOMMU.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 10:06       ` Vlad Zolotarov
@ 2015-10-05 20:09         ` Michael S. Tsirkin
  0 siblings, 0 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-05 20:09 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: Stephen Hemminger, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, avi, gleb, alexander.duyck

On Mon, Oct 05, 2015 at 01:06:09PM +0300, Vlad Zolotarov wrote:
> Having said all that however I'd agree if someone would say that mappings
> setting would rather come as a separate patch in this series... ;)
> it will in v4...

Just drop this is my advice. There are enough controversial things here
as it is.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
       [not found]             ` <CAOYyTHZ2=UCYxuJKvd5S6qxp=84DBq5bMadg5wL0rFLZBh2-8Q@mail.gmail.com>
@ 2015-10-05 22:29               ` Michael S. Tsirkin
  2015-10-06  8:33                 ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-05 22:29 UTC (permalink / raw)
  To: Vladislav Zolotarov
  Cc: Greg KH, Bruce Richardson, linux-kernel, hjk, avi, corbet,
	alexander.duyck, gleb, stephen

On Tue, Oct 06, 2015 at 12:43:45AM +0300, Vladislav Zolotarov wrote:
> So, like it has already been asked in a different thread I'm going to
> ask a rhetorical question: what adding an MSI and MSI-X interrupts support to
> uio_pci_generic has to do with security?

memory protection is a better term than security.

It's very simple: you enable bus mastering and you ask userspace to map
all device BARs. One of these BARs holds the address to which device
writes to trigger MSI-X interrupt.

This is how MSI-X works, internally: from the point of view of
PCI it's a memory write. It just so happens that the destination
address is in the interrupt controller, that triggers an interrupt.

But a bug in this userspace application can corrupt the MSI-X table,
which in turn can easily corrupt kernel memory, or unrelated processes's
memory.  This is in my opinion unacceptable.

So you need to be very careful
- probably need to reset device before you even enable bus master
- prevent userspace from touching msi config
- prevent userspace from moving BARs since msi-x config is within a BAR
- detect reset and prevent linux from touching device while it's under
  reset

The list goes on and on.

This is pretty much what VFIO spent the last 3 years doing, except VFIO
also can do IOMMU groups.

> What "security threat" does it add
> that u don't already have today?

Yes, userspace can create this today if it tweaks PCI config space to
enable MSI-X, then corrupts the MSI-X table.  It's unfortunate that we
don't yet prevent this, but at least you need two things to go wrong for
this to trigger.

The reason, as I tried to point out, is simply that I didn't think
uio_pci_generic will be used for these configurations.
But there's nothing fundamental here that makes them secure
and that therefore makes your patches secure as well.

Fixing this to make uio_pci_generic write-protect MSI/MSI-X enable
registers sounds kind of reasonable, this shouldn't be too hard.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-05 22:29               ` Michael S. Tsirkin
@ 2015-10-06  8:33                 ` Vlad Zolotarov
  2015-10-06 14:19                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-06  8:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Greg KH, Bruce Richardson, linux-kernel, hjk, avi, corbet,
	alexander.duyck, gleb, stephen



On 10/06/15 01:29, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 12:43:45AM +0300, Vladislav Zolotarov wrote:
>> So, like it has already been asked in a different thread I'm going to
>> ask a rhetorical question: what adding an MSI and MSI-X interrupts support to
>> uio_pci_generic has to do with security?
> memory protection is a better term than security.
>
> It's very simple: you enable bus mastering and you ask userspace to map
> all device BARs. One of these BARs holds the address to which device
> writes to trigger MSI-X interrupt.
>
> This is how MSI-X works, internally: from the point of view of
> PCI it's a memory write. It just so happens that the destination
> address is in the interrupt controller, that triggers an interrupt.
>
> But a bug in this userspace application can corrupt the MSI-X table,
> which in turn can easily corrupt kernel memory, or unrelated processes's
> memory.  This is in my opinion unacceptable.
>
> So you need to be very careful
> - probably need to reset device before you even enable bus master
> - prevent userspace from touching msi config
> - prevent userspace from moving BARs since msi-x config is within a BAR
> - detect reset and prevent linux from touching device while it's under
>    reset
>
> The list goes on and on.
>
> This is pretty much what VFIO spent the last 3 years doing, except VFIO
> also can do IOMMU groups.
>
>> What "security threat" does it add
>> that u don't already have today?
> Yes, userspace can create this today if it tweaks PCI config space to
> enable MSI-X, then corrupts the MSI-X table.  It's unfortunate that we
> don't yet prevent this, but at least you need two things to go wrong for
> this to trigger.
>
> The reason, as I tried to point out, is simply that I didn't think
> uio_pci_generic will be used for these configurations.
> But there's nothing fundamental here that makes them secure
> and that therefore makes your patches secure as well.
>
> Fixing this to make uio_pci_generic write-protect MSI/MSI-X enable
> registers sounds kind of reasonable, this shouldn't be too hard.

Sure. But like u've just pointed out yourself - this is a general issue 
and it has nothing to do with the ability to get notifications per 
MSI-X/MSI interrupts, which this series adds (bus mastering may and is 
easily enabled from the user space - look for pci_uio_set_bus_master() 
function in the DPDK).

So, while I absolutely agree with u in regard to the fact that we have a 
security/memory corruption threat in the current in-tree uio_pci_generic 
- the solution u propose should be a matter of a separate patch and is 
obviously orthogonal to this series.

thanks,
vlad

>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-05 19:50 ` Michael S. Tsirkin
@ 2015-10-06  8:37   ` Vlad Zolotarov
  2015-10-06 14:30     ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-06  8:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/05/15 22:50, Michael S. Tsirkin wrote:
> On Sun, Oct 04, 2015 at 11:43:15PM +0300, Vlad Zolotarov wrote:
>> This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
>>   
>> Currently uio_pci_generic supports only legacy INT#x interrupts source. However
>> there are situations when this is not enough, for instance SR-IOV VF devices that
>> simply don't have INT#x capability. For such devices uio_pci_generic will simply
>> fail (more specifically probe() will fail).
>>   
>> When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
>> overhead and thus VFIO is not an option
>> users that develop user-space drivers are left
>> without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
>> DPDK) just to be able to use UIO infrastructure.
>>   
>> This series provides a generic solution for this problem while preserving the original behaviour
>> for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).
> What is missing here is that drivers using uio_pci_generic generally
> poke at config and BAR sysfs files of the device.
>
> We can not stop them without breaking existing users, but this means
> that we can't enable bus mastering and MSI/MSI-X blindly: userspace
> bugs will corrupt the MSI-X table and/or MSi/MSI-X capability,
> and cause device to overwrite random addresses, corrupting kernel
> memory.
>
> Your solution seems to be a warning in dmesg and tainting the
> kernel, but that's not enough.
>
> You need to add infrastructure to prevent this.

Bus mastering is easily enabled from the user space (taken from DPDK code):

static int
pci_uio_set_bus_master(int dev_fd)
{
	uint16_t reg;
	int ret;

	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
	if (ret != sizeof(reg)) {
		RTE_LOG(ERR, EAL,
			"Cannot read command from PCI config space!\n");
		return -1;
	}

	/* return if bus mastering is already on */
	if (reg & PCI_COMMAND_MASTER)
		return 0;

	reg |= PCI_COMMAND_MASTER;

	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
	if (ret != sizeof(reg)) {
		RTE_LOG(ERR, EAL,
			"Cannot write command to PCI config space!\n");
		return -1;
	}

	return 0;
}

So, this is a non-issue. ;)

>
> VFIO has some code to do this, but it's not bound by existing UIO API so it
> simply fails the mmap.  We want I think existing applications to work,
> so I suspect we need to make a hole there (probably map a zero page in
> case apps want to read it, and maybe even set it up for COW in case they
> tweak the PBA which sometimes happens to be in the same page).
>
> Your patches also seem to add in eventfd and mmap capabilities which
> seems to be orthogonal. They are there in VFIO which I'm guessing is the
> real reason you do it.
>
> So, what you are trying to do might be closer to extending VFIO which
> already has a bunch of checks like that.  Yes, it also wants to program
> the IOMMU.  So maybe do it with a separate device that can be root-only,
> so unpriveledged users can't abuse it.
>
> You should Cc, and talk to the VFIO maintainer.
>
>
>
>> New in v3:
>>     - Add __iomem qualifier to temp buffer receiving ioremap value.
>>
>> New in v2:
>>     - Added #include <linux/uaccess.h> to uio_pci_generic.c
>>
>> Vlad Zolotarov (3):
>>    uio: add ioctl support
>>    uio_pci_generic: add MSI/MSI-X support
>>    Documentation: update uio-howto
>>
>>   Documentation/DocBook/uio-howto.tmpl |  29 ++-
>>   drivers/uio/uio.c                    |  15 ++
>>   drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
>>   include/linux/uio_driver.h           |   3 +
>>   include/linux/uio_pci_generic.h      |  36 +++
>>   5 files changed, 467 insertions(+), 26 deletions(-)
>>   create mode 100644 include/linux/uio_pci_generic.h
>>
>> -- 
>> 2.1.0


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-06  8:33                 ` Vlad Zolotarov
@ 2015-10-06 14:19                   ` Michael S. Tsirkin
  2015-10-06 14:30                     ` Gleb Natapov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 14:19 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: Greg KH, Bruce Richardson, linux-kernel, hjk, avi, corbet,
	alexander.duyck, gleb, stephen

On Tue, Oct 06, 2015 at 11:33:56AM +0300, Vlad Zolotarov wrote:
> the solution u propose should be a matter of a separate patch and is
> obviously orthogonal to this series.

Doesn't work this way, sorry. You want a patch enabling MSI merged,
you need to secure the MSI configuration.

And it's going to be a lot of work, duplicating a bunch of code from
VFIO.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-06 14:19                   ` Michael S. Tsirkin
@ 2015-10-06 14:30                     ` Gleb Natapov
  2015-10-06 15:19                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Gleb Natapov @ 2015-10-06 14:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Vlad Zolotarov, Greg KH, Bruce Richardson, linux-kernel, hjk,
	avi, corbet, alexander.duyck, gleb, stephen

On Tue, Oct 06, 2015 at 05:19:22PM +0300, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 11:33:56AM +0300, Vlad Zolotarov wrote:
> > the solution u propose should be a matter of a separate patch and is
> > obviously orthogonal to this series.
> 
> Doesn't work this way, sorry. You want a patch enabling MSI merged,
> you need to secure the MSI configuration.
> 
MSI can be enabled right now without the patch by writing directly into
PCI bar. The only thing this patch adds is forwarding the interrupt to
an eventfd.
 
--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06  8:37   ` Vlad Zolotarov
@ 2015-10-06 14:30     ` Michael S. Tsirkin
  2015-10-06 14:40       ` Vlad Zolotarov
  2015-10-06 15:11       ` Avi Kivity
  0 siblings, 2 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 14:30 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Tue, Oct 06, 2015 at 11:37:59AM +0300, Vlad Zolotarov wrote:
> Bus mastering is easily enabled from the user space (taken from DPDK code):
> 
> static int
> pci_uio_set_bus_master(int dev_fd)
> {
> 	uint16_t reg;
> 	int ret;
> 
> 	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> 	if (ret != sizeof(reg)) {
> 		RTE_LOG(ERR, EAL,
> 			"Cannot read command from PCI config space!\n");
> 		return -1;
> 	}
> 
> 	/* return if bus mastering is already on */
> 	if (reg & PCI_COMMAND_MASTER)
> 		return 0;
> 
> 	reg |= PCI_COMMAND_MASTER;
> 
> 	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> 	if (ret != sizeof(reg)) {
> 		RTE_LOG(ERR, EAL,
> 			"Cannot write command to PCI config space!\n");
> 		return -1;
> 	}
> 
> 	return 0;
> }
> 
> So, this is a non-issue. ;)

There might be valid reasons for DPDK to do this, e.g. if using VFIO.

I'm guessing it doesn't enable MSI though, does it?

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05 10:20         ` Avi Kivity
@ 2015-10-06 14:38           ` Michael S. Tsirkin
  2015-10-06 14:43             ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 14:38 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Greg KH, Vlad Zolotarov, linux-kernel, hjk, corbet,
	bruce.richardson, avi, gleb, stephen, alexander.duyck

On Mon, Oct 05, 2015 at 01:20:11PM +0300, Avi Kivity wrote:
> On 10/05/2015 12:49 PM, Greg KH wrote:
> >On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote:
> >>Of course it has to be documented, but this just follows vfio.
> >>
> >>Eventfd is a natural enough representation of an interrupt; both kvm and
> >>vfio use it, and are also able to share the eventfd, allowing a vfio
> >>interrupt to generate a kvm interrupt, without userspace intervention, and
> >>one day without even kernel intervention.
> >That's nice and wonderful, but it's not how UIO works today, so this is
> >now going to be a mix and match type interface, with no justification so
> >far as to why to create this new api and exactly how this is all going
> >to be used from userspace.
> 
> The intended user is dpdk (http://dpdk.org), which is a family of userspace
> networking drivers for high performance networking applications.
> 
> The natural device driver for dpdk is vfio, which both provides memory
> protection and exposes msi/msix interrupts.  However, in many cases vfio
> cannot be used, either due to the lack of an iommu (for example, in
> virtualized environments) or out of a desire to avoid the iommus performance
> impact.
> 
> The challenge in exposing msix interrupts to user space is that there are
> many of them, so you can't simply poll the device fd.  If you do, how do you
> know which interrupt was triggered?  The solution that vfio adopted was to
> associate each interrupt with an eventfd, allowing it to be individually
> polled.  Since you can pass an eventfd with SCM_RIGHTS, and since kvm can
> trigger guest interrupts using an eventfd, the solution is very flexible.
> 
> >Example code would be even better...
> >
> >
> 
> 
> This is the vfio dpdk interface code:
> 
> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
> 
> basically, the equivalent uio msix code would be very similar if uio adopts
> a similar interface:
> 
> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> 
> (current code lacks msi/msix support, of course).

So you really want a driver that behaves exactly like vfio.
Which immediately begs a question: why not extend vfio
to cover your usecase.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 14:30     ` Michael S. Tsirkin
@ 2015-10-06 14:40       ` Vlad Zolotarov
  2015-10-06 15:13         ` Michael S. Tsirkin
  2015-10-06 15:11       ` Avi Kivity
  1 sibling, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-06 14:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/06/15 17:30, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 11:37:59AM +0300, Vlad Zolotarov wrote:
>> Bus mastering is easily enabled from the user space (taken from DPDK code):
>>
>> static int
>> pci_uio_set_bus_master(int dev_fd)
>> {
>> 	uint16_t reg;
>> 	int ret;
>>
>> 	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot read command from PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	/* return if bus mastering is already on */
>> 	if (reg & PCI_COMMAND_MASTER)
>> 		return 0;
>>
>> 	reg |= PCI_COMMAND_MASTER;
>>
>> 	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot write command to PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	return 0;
>> }
>>
>> So, this is a non-issue. ;)
> There might be valid reasons for DPDK to do this, e.g. if using VFIO.

Michael, I'm afraid u are missing the main point here - the code above 
destroys all your long arguments. U can't possibly prevent the root-user 
from enabling the device bus mastering. And as me and other people on 
this thread have already mentioned MSI and MSI-X device configuration is 
controlled from the device BAR, thus may not be prevented too.


>
> I'm guessing it doesn't enable MSI though, does it?

Again, enabling MSI is a matter of a trivial patch configuring device 
registers on the device BAR.

>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 14:38           ` Michael S. Tsirkin
@ 2015-10-06 14:43             ` Vlad Zolotarov
  2015-10-06 14:56               ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-06 14:43 UTC (permalink / raw)
  To: Michael S. Tsirkin, Avi Kivity
  Cc: Greg KH, linux-kernel, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/06/15 17:38, Michael S. Tsirkin wrote:
> On Mon, Oct 05, 2015 at 01:20:11PM +0300, Avi Kivity wrote:
>> On 10/05/2015 12:49 PM, Greg KH wrote:
>>> On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote:
>>>> Of course it has to be documented, but this just follows vfio.
>>>>
>>>> Eventfd is a natural enough representation of an interrupt; both kvm and
>>>> vfio use it, and are also able to share the eventfd, allowing a vfio
>>>> interrupt to generate a kvm interrupt, without userspace intervention, and
>>>> one day without even kernel intervention.
>>> That's nice and wonderful, but it's not how UIO works today, so this is
>>> now going to be a mix and match type interface, with no justification so
>>> far as to why to create this new api and exactly how this is all going
>>> to be used from userspace.
>> The intended user is dpdk (http://dpdk.org), which is a family of userspace
>> networking drivers for high performance networking applications.
>>
>> The natural device driver for dpdk is vfio, which both provides memory
>> protection and exposes msi/msix interrupts.  However, in many cases vfio
>> cannot be used, either due to the lack of an iommu (for example, in
>> virtualized environments) or out of a desire to avoid the iommus performance
>> impact.
>>
>> The challenge in exposing msix interrupts to user space is that there are
>> many of them, so you can't simply poll the device fd.  If you do, how do you
>> know which interrupt was triggered?  The solution that vfio adopted was to
>> associate each interrupt with an eventfd, allowing it to be individually
>> polled.  Since you can pass an eventfd with SCM_RIGHTS, and since kvm can
>> trigger guest interrupts using an eventfd, the solution is very flexible.
>>
>>> Example code would be even better...
>>>
>>>
>>
>> This is the vfio dpdk interface code:
>>
>> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_pci_vfio.c
>>
>> basically, the equivalent uio msix code would be very similar if uio adopts
>> a similar interface:
>>
>> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
>>
>> (current code lacks msi/msix support, of course).
> So you really want a driver that behaves exactly like vfio.
> Which immediately begs a question: why not extend vfio
> to cover your usecase.

The only "like VFIO" behavior we implement here is binding the MSI-X 
interrupt notification to eventfd descriptor. This doesn't justifies the 
hassle of implementing IOMMU-less VFIO mode.



>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-05  8:28     ` Avi Kivity
  2015-10-05  9:49       ` Greg KH
@ 2015-10-06 14:46       ` Michael S. Tsirkin
  2015-10-06 15:27         ` Avi Kivity
  1 sibling, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 14:46 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Greg KH, Vlad Zolotarov, linux-kernel, hjk, corbet,
	bruce.richardson, avi, gleb, stephen, alexander.duyck

On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote:
> Eventfd is a natural enough representation of an interrupt; both kvm and
> vfio use it, and are also able to share the eventfd, allowing a vfio
> interrupt to generate a kvm interrupt, without userspace intervention, and
> one day without even kernel intervention.

eventfd without kernel intervention sounds unlikely.

kvm might configure the cpu such that an interrupt will not trigger a
vmexit.  eventfd seems like an unlikely interface to do that: with the
eventfd, device triggering it has no info about the interrupt so it
can't send it to the correct VM.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 14:43             ` Vlad Zolotarov
@ 2015-10-06 14:56               ` Michael S. Tsirkin
  2015-10-06 15:23                 ` Avi Kivity
  2015-10-06 15:28                 ` Vlad Zolotarov
  0 siblings, 2 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 14:56 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: Avi Kivity, Greg KH, linux-kernel, hjk, corbet, bruce.richardson,
	avi, gleb, stephen, alexander.duyck

On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
> The only "like VFIO" behavior we implement here is binding the MSI-X
> interrupt notification to eventfd descriptor.

There will be more if you add some basic memory protections.

Besides, that's not true.
Your patch queries MSI capability, sets # of vectors.
You even hinted you want to add BAR mapping down the road.

VFIO does all of that.

> This doesn't justifies the
> hassle of implementing IOMMU-less VFIO mode.

This applies to both VFIO and UIO really.  I'm not sure the hassle of
maintaining this functionality in tree is justified.  It remains to be
seen whether there are any users that won't taint the kernel.
Apparently not in the current form of the patch, but who knows.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 14:30     ` Michael S. Tsirkin
  2015-10-06 14:40       ` Vlad Zolotarov
@ 2015-10-06 15:11       ` Avi Kivity
  2015-10-06 15:15         ` Michael S. Tsirkin
  1 sibling, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-06 15:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, gleb,
	stephen, alexander.duyck



On 10/06/2015 05:30 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 11:37:59AM +0300, Vlad Zolotarov wrote:
>> Bus mastering is easily enabled from the user space (taken from DPDK code):
>>
>> static int
>> pci_uio_set_bus_master(int dev_fd)
>> {
>> 	uint16_t reg;
>> 	int ret;
>>
>> 	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot read command from PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	/* return if bus mastering is already on */
>> 	if (reg & PCI_COMMAND_MASTER)
>> 		return 0;
>>
>> 	reg |= PCI_COMMAND_MASTER;
>>
>> 	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot write command to PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	return 0;
>> }
>>
>> So, this is a non-issue. ;)
> There might be valid reasons for DPDK to do this, e.g. if using VFIO.

DPDK does this when using vfio, and when using uio_pci_generic. All of 
the network cards that DPDK supports require DMA.

> I'm guessing it doesn't enable MSI though, does it?

It does not enable MSI, because the main kernel driver used for 
interacting with the device, pci_uio_generic, does not support MSI. In 
some configurations, PCI INTA is not available, while MSI(X) is, hence 
the desire that pci_uio_generic support MSI.

While it is possible that userspace malfunctions and accidentally 
programs MSI incorrectly, the risk is dwarfed by the ability of 
userspace to program DMA incorrectly.  Under normal operation userspace 
programs tens of millions of DMA operations per second, while it never 
touches the MSI BARs (it is the kernel that programs them).

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 14:40       ` Vlad Zolotarov
@ 2015-10-06 15:13         ` Michael S. Tsirkin
  2015-10-06 16:35           ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 15:13 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Tue, Oct 06, 2015 at 05:40:23PM +0300, Vlad Zolotarov wrote:
> >I'm guessing it doesn't enable MSI though, does it?
> 
> Again, enabling MSI is a matter of a trivial patch configuring device
> registers on the device BAR.

No, not really.

-- 
mST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:11       ` Avi Kivity
@ 2015-10-06 15:15         ` Michael S. Tsirkin
  2015-10-06 16:00           ` Gleb Natapov
  2015-10-06 16:09           ` Avi Kivity
  0 siblings, 2 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 15:15 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck

> While it is possible that userspace malfunctions and accidentally programs
> MSI incorrectly, the risk is dwarfed by the ability of userspace to program
> DMA incorrectly.

That seems to imply that for the upstream kernel this is not a valid usecase at all.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-06 14:30                     ` Gleb Natapov
@ 2015-10-06 15:19                       ` Michael S. Tsirkin
  2015-10-06 15:31                         ` Vlad Zolotarov
  2015-10-06 15:57                         ` Gleb Natapov
  0 siblings, 2 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 15:19 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Vlad Zolotarov, Greg KH, Bruce Richardson, linux-kernel, hjk,
	avi, corbet, alexander.duyck, gleb, stephen

On Tue, Oct 06, 2015 at 05:30:31PM +0300, Gleb Natapov wrote:
> On Tue, Oct 06, 2015 at 05:19:22PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Oct 06, 2015 at 11:33:56AM +0300, Vlad Zolotarov wrote:
> > > the solution u propose should be a matter of a separate patch and is
> > > obviously orthogonal to this series.
> > 
> > Doesn't work this way, sorry. You want a patch enabling MSI merged,
> > you need to secure the MSI configuration.
> > 
> MSI can be enabled right now without the patch by writing directly into
> PCI bar.

By poking at config registers in sysfs? We can block this, or we
can log this, pretty easily. We don't ATM but it's not hard to do.

> The only thing this patch adds is forwarding the interrupt to
> an eventfd.

This one just adds a bunch of ioctls. The next ones do
more than you describe.

> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 14:56               ` Michael S. Tsirkin
@ 2015-10-06 15:23                 ` Avi Kivity
  2015-10-06 18:51                   ` Alex Williamson
  2015-10-06 15:28                 ` Vlad Zolotarov
  1 sibling, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-06 15:23 UTC (permalink / raw)
  To: Michael S. Tsirkin, Vlad Zolotarov
  Cc: Greg KH, linux-kernel, hjk, corbet, bruce.richardson, avi, gleb,
	stephen, alexander.duyck, Alex Williamson



On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
>> The only "like VFIO" behavior we implement here is binding the MSI-X
>> interrupt notification to eventfd descriptor.
> There will be more if you add some basic memory protections.
>
> Besides, that's not true.
> Your patch queries MSI capability, sets # of vectors.
> You even hinted you want to add BAR mapping down the road.

BAR mapping is already available from sysfs; it is not mandatory.

> VFIO does all of that.
>

Copying vfio maintainer Alex (hi!).

vfio's charter is modern iommu-capable configurations. It is designed to 
be secure enough to be usable by an unprivileged user.

For performance and hardware reasons, many dpdk deployments use 
uio_pci_generic.  They are willing to trade off the security provided by 
vfio for the performance and deployment flexibility of pci_uio_generic.  
Forcing these features into vfio will compromise its security and 
needlessly complicate its code (I guess it can be done with a "null" 
iommu, but then vfio will have to decide whether it is secure or not).

>> This doesn't justifies the
>> hassle of implementing IOMMU-less VFIO mode.
> This applies to both VFIO and UIO really.  I'm not sure the hassle of
> maintaining this functionality in tree is justified.  It remains to be
> seen whether there are any users that won't taint the kernel.
> Apparently not in the current form of the patch, but who knows.

It is not msix that taints the kernel, it's uio_pci_generic.  Msix is a 
tiny feature addition that doesn't change the security situation one bit.

btw, currently you can map BARs and dd to /dev/mem to your heart's 
content without tainting the kernel.  I don't see how you can claim that 
msix support makes the situation worse, when root can access every bit 
of physical memory, either directly or via DMA.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 14:46       ` Michael S. Tsirkin
@ 2015-10-06 15:27         ` Avi Kivity
  0 siblings, 0 replies; 96+ messages in thread
From: Avi Kivity @ 2015-10-06 15:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Greg KH, Vlad Zolotarov, linux-kernel, hjk, corbet,
	bruce.richardson, avi, gleb, stephen, alexander.duyck



On 10/06/2015 05:46 PM, Michael S. Tsirkin wrote:
> On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote:
>> Eventfd is a natural enough representation of an interrupt; both kvm and
>> vfio use it, and are also able to share the eventfd, allowing a vfio
>> interrupt to generate a kvm interrupt, without userspace intervention, and
>> one day without even kernel intervention.
> eventfd without kernel intervention sounds unlikely.
>
> kvm might configure the cpu such that an interrupt will not trigger a
> vmexit.  eventfd seems like an unlikely interface to do that: with the
> eventfd, device triggering it has no info about the interrupt so it
> can't send it to the correct VM.

https://lwn.net/Articles/650863/

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 14:56               ` Michael S. Tsirkin
  2015-10-06 15:23                 ` Avi Kivity
@ 2015-10-06 15:28                 ` Vlad Zolotarov
  1 sibling, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-06 15:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Greg KH, linux-kernel, hjk, corbet, bruce.richardson,
	avi, gleb, stephen, alexander.duyck



On 10/06/15 17:56, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
>> The only "like VFIO" behavior we implement here is binding the MSI-X
>> interrupt notification to eventfd descriptor.
> There will be more if you add some basic memory protections.

I've already explained that there is no need for any additional memory 
protections since it won't be able to protect anything.

>
> Besides, that's not true.
> Your patch queries MSI capability, sets # of vectors.

My patch doesn't set # of vectors.

> You even hinted you want to add BAR mapping down the road.
>
> VFIO does all of that.
>
>> This doesn't justifies the
>> hassle of implementing IOMMU-less VFIO mode.
> This applies to both VFIO and UIO really.  I'm not sure the hassle of
> maintaining this functionality in tree is justified.  It remains to be
> seen whether there are any users that won't taint the kernel.
> Apparently not in the current form of the patch, but who knows.

Again, uio_pci_generic with my patch simply follows the UIO design and 
in addition allows mapping MSI-X interrupts to eventfd's and it does it 
much more laconically compared to VFIO. Therefore these two modules 
won't be more related than they are today.

thanks,
vlad

>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-06 15:19                       ` Michael S. Tsirkin
@ 2015-10-06 15:31                         ` Vlad Zolotarov
  2015-10-06 15:57                         ` Gleb Natapov
  1 sibling, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-06 15:31 UTC (permalink / raw)
  To: Michael S. Tsirkin, Gleb Natapov
  Cc: Greg KH, Bruce Richardson, linux-kernel, hjk, avi, corbet,
	alexander.duyck, gleb, stephen



On 10/06/15 18:19, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 05:30:31PM +0300, Gleb Natapov wrote:
>> On Tue, Oct 06, 2015 at 05:19:22PM +0300, Michael S. Tsirkin wrote:
>>> On Tue, Oct 06, 2015 at 11:33:56AM +0300, Vlad Zolotarov wrote:
>>>> the solution u propose should be a matter of a separate patch and is
>>>> obviously orthogonal to this series.
>>> Doesn't work this way, sorry. You want a patch enabling MSI merged,
>>> you need to secure the MSI configuration.
>>>
>> MSI can be enabled right now without the patch by writing directly into
>> PCI bar.
> By poking at config registers in sysfs? We can block this, or we
> can log this, pretty easily. We don't ATM but it's not hard to do.
>
>> The only thing this patch adds is forwarding the interrupt to
>> an eventfd.
> This one just adds a bunch of ioctls. The next ones do
> more than you describe.

This one adds zero ioctls and the next one does exactly what Gleb 
describes.

>
>> --
>> 			Gleb.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 1/3] uio: add ioctl support
  2015-10-06 15:19                       ` Michael S. Tsirkin
  2015-10-06 15:31                         ` Vlad Zolotarov
@ 2015-10-06 15:57                         ` Gleb Natapov
  1 sibling, 0 replies; 96+ messages in thread
From: Gleb Natapov @ 2015-10-06 15:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Vlad Zolotarov, Greg KH, Bruce Richardson, linux-kernel, hjk,
	avi, corbet, alexander.duyck, gleb, stephen

On Tue, Oct 06, 2015 at 06:19:34PM +0300, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 05:30:31PM +0300, Gleb Natapov wrote:
> > On Tue, Oct 06, 2015 at 05:19:22PM +0300, Michael S. Tsirkin wrote:
> > > On Tue, Oct 06, 2015 at 11:33:56AM +0300, Vlad Zolotarov wrote:
> > > > the solution u propose should be a matter of a separate patch and is
> > > > obviously orthogonal to this series.
> > > 
> > > Doesn't work this way, sorry. You want a patch enabling MSI merged,
> > > you need to secure the MSI configuration.
> > > 
> > MSI can be enabled right now without the patch by writing directly into
> > PCI bar.
> 
> By poking at config registers in sysfs? We can block this, or we
> can log this, pretty easily. We don't ATM but it's not hard to do.
> 
Blocking this will break userspace API. As a maintainer you should know
that we do not break userspace APIs. Logging this is fine, but how
exactly it helps you with "security"? The patch in question already
taints the kernel which is much stronger than logging.

> > The only thing this patch adds is forwarding the interrupt to
> > an eventfd.
> 
> This one just adds a bunch of ioctls. The next ones do
> more than you describe.
> 
Yes, it adds bunch of ioctls to do exactly what I wrote above. What
point have you tried to make by this statement? It eluded me.

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:15         ` Michael S. Tsirkin
@ 2015-10-06 16:00           ` Gleb Natapov
  2015-10-06 16:09           ` Avi Kivity
  1 sibling, 0 replies; 96+ messages in thread
From: Gleb Natapov @ 2015-10-06 16:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck

On Tue, Oct 06, 2015 at 06:15:54PM +0300, Michael S. Tsirkin wrote:
> > While it is possible that userspace malfunctions and accidentally programs
> > MSI incorrectly, the risk is dwarfed by the ability of userspace to program
> > DMA incorrectly.
> 
> That seems to imply that for the upstream kernel this is not a valid usecase at all.
> 
Are you implying that uio_pci_generic should be removed from upstream
kernel because Avi did not describe anything that cannot be done with
upstream kernel right now.

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:15         ` Michael S. Tsirkin
  2015-10-06 16:00           ` Gleb Natapov
@ 2015-10-06 16:09           ` Avi Kivity
  2015-10-07 10:25             ` Michael S. Tsirkin
  1 sibling, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-06 16:09 UTC (permalink / raw)
  To: Michael S. Tsirkin, Avi Kivity
  Cc: Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck



On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote:
>> While it is possible that userspace malfunctions and accidentally programs
>> MSI incorrectly, the risk is dwarfed by the ability of userspace to program
>> DMA incorrectly.
> That seems to imply that for the upstream kernel this is not a valid usecase at all.
>

That is trivially incorrect, upstream pci_uio_generic is used with dpdk 
for years.  Are dpdk applications an invalid use case?

Again:

- security is not compromised.  you need to be root to (ab)use this.
- all of the potentially compromising functionality has been there from 
day 1
- uio_pci_generic is the only way to provide the required performance on 
some configurations (where kernel drivers, or userspace drivers + iommu 
are too slow)
- uio_pci_generic + msix is the only way to enable userspace drivers on 
some configurations (SRIOV)

The proposed functionality does not increase the attack surface.
The proposed functionality marginally increases the bug surface.
The proposed functionality is a natural evolution of uio_pci_generic.

There is a new class of applications (network function virtualization) 
which require this.  They can't use the kernel drivers because they are 
too slow.  They can't use the iommu because it is either too slow, or 
taken over by the hypervisor.  They are willing to live with less kernel 
protection, because they are a single user application anyway (and since 
they use a kernel bypass, they don't really care that much about the 
kernel).

The kernel serves more use-cases than a desktop or a multi-user 
servers.  Some of these users are willing to trade off protection for 
performance or functionality (an extreme, yet similar, example is 
linux-nommu, which allows any application to access any bit of memory, 
due to the lack of protection hardware).

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:13         ` Michael S. Tsirkin
@ 2015-10-06 16:35           ` Vlad Zolotarov
  0 siblings, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-06 16:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/06/15 18:13, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 05:40:23PM +0300, Vlad Zolotarov wrote:
>>> I'm guessing it doesn't enable MSI though, does it?
>> Again, enabling MSI is a matter of a trivial patch configuring device
>> registers on the device BAR.
> No, not really.

Sure that is!
Look at pci_msi_set_enable(): it's a single read and then write, just 
like the code for bus master enabling. The msi capability address may be 
retrieved from lspci and then mapped for instance using sysfs.

Configuring the MSI table is a few more reads and writes too but at the 
bottom line - it's a trivial code too...

>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 15:23                 ` Avi Kivity
@ 2015-10-06 18:51                   ` Alex Williamson
  2015-10-06 21:32                     ` Stephen Hemminger
                                       ` (2 more replies)
  0 siblings, 3 replies; 96+ messages in thread
From: Alex Williamson @ 2015-10-06 18:51 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Michael S. Tsirkin, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Tue, 2015-10-06 at 18:23 +0300, Avi Kivity wrote:
> 
> On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote:
> > On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
> >> The only "like VFIO" behavior we implement here is binding the MSI-X
> >> interrupt notification to eventfd descriptor.
> > There will be more if you add some basic memory protections.
> >
> > Besides, that's not true.
> > Your patch queries MSI capability, sets # of vectors.
> > You even hinted you want to add BAR mapping down the road.
> 
> BAR mapping is already available from sysfs; it is not mandatory.
> 
> > VFIO does all of that.
> >
> 
> Copying vfio maintainer Alex (hi!).
> 
> vfio's charter is modern iommu-capable configurations. It is designed to 
> be secure enough to be usable by an unprivileged user.
> 
> For performance and hardware reasons, many dpdk deployments use 
> uio_pci_generic.  They are willing to trade off the security provided by 
> vfio for the performance and deployment flexibility of pci_uio_generic.  
> Forcing these features into vfio will compromise its security and 
> needlessly complicate its code (I guess it can be done with a "null" 
> iommu, but then vfio will have to decide whether it is secure or not).

It's not just the iommu model vfio uses, it's that vfio is built around
iommu groups.  For instance to use a device in vfio, the user opens the
vfio group file and asks for the device within that group.  That's a
fairly fundamental part of the mechanics to sidestep.

However, is there an opportunity at a lower level?  Systems without an
iommu typically have dma ops handled via a software iotlb (ie. bounce
buffers), but I think they simply don't have iommu ops registered.
Could a no-iommu, iommu subsystem provide enough dummy iommu ops to fake
out vfio?  It would need to iterate the devices on the bus and come up
with dummy iommu groups and dummy versions of iommu_map and unmap.  The
grouping is easy, one device per group, there's no isolation anyway.
The vfio type1 iommu backend will do pinning, which seems like an
improvement over the mlock that uio users probably try to do now.  I
guess the no-iommu map would error if the IOVA isn't simply the bus
address of the page mapped.

Of course this is entirely unsafe and this no-iommu driver should taint
the kernel, but it at least standardizes on one userspace API and you're
already doing completely unsafe things with uio.  vfio should be
enlightened at least to the point that it allows only privileged users
access to devices under such a (lack of) iommu.

> >> This doesn't justifies the
> >> hassle of implementing IOMMU-less VFIO mode.
> > This applies to both VFIO and UIO really.  I'm not sure the hassle of
> > maintaining this functionality in tree is justified.  It remains to be
> > seen whether there are any users that won't taint the kernel.
> > Apparently not in the current form of the patch, but who knows.
> 
> It is not msix that taints the kernel, it's uio_pci_generic.  Msix is a 
> tiny feature addition that doesn't change the security situation one bit.
> 
> btw, currently you can map BARs and dd to /dev/mem to your heart's 
> content without tainting the kernel.  I don't see how you can claim that 
> msix support makes the situation worse, when root can access every bit 
> of physical memory, either directly or via DMA.




^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 18:51                   ` Alex Williamson
@ 2015-10-06 21:32                     ` Stephen Hemminger
  2015-10-06 21:41                       ` Alex Williamson
  2015-10-07  6:52                     ` Avi Kivity
  2015-10-07  7:55                     ` Vlad Zolotarov
  2 siblings, 1 reply; 96+ messages in thread
From: Stephen Hemminger @ 2015-10-06 21:32 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avi Kivity, Michael S. Tsirkin, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb,
	alexander.duyck

On Tue, 06 Oct 2015 12:51:20 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> Of course this is entirely unsafe and this no-iommu driver should taint
> the kernel, but it at least standardizes on one userspace API and you're
> already doing completely unsafe things with uio.  vfio should be
> enlightened at least to the point that it allows only privileged users
> access to devices under such a (lack of) iommu

I agree with the design, but not with the taint argument.
(Unless you want to taint any and all use of UIO drivers which can
 already do this).

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 21:32                     ` Stephen Hemminger
@ 2015-10-06 21:41                       ` Alex Williamson
       [not found]                         ` <CAOaVG152OrQz-Bbnpr0VeE+vLH7nMGsG6A3sD7eTQHormNGVUg@mail.gmail.com>
  0 siblings, 1 reply; 96+ messages in thread
From: Alex Williamson @ 2015-10-06 21:41 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Avi Kivity, Michael S. Tsirkin, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb,
	alexander.duyck

On Tue, 2015-10-06 at 22:32 +0100, Stephen Hemminger wrote:
> On Tue, 06 Oct 2015 12:51:20 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > Of course this is entirely unsafe and this no-iommu driver should taint
> > the kernel, but it at least standardizes on one userspace API and you're
> > already doing completely unsafe things with uio.  vfio should be
> > enlightened at least to the point that it allows only privileged users
> > access to devices under such a (lack of) iommu
> 
> I agree with the design, but not with the taint argument.
> (Unless you want to taint any and all use of UIO drivers which can
>  already do this).

Yes, actually, if the bus master bit gets enabled all bets are off.  I
don't see how that leaves a supportable kernel, so we might as well
taint it.  Isn't this exactly why we taint for proprietary drivers, we
have no idea what it has mucked with in kernel space.  This just moves
the proprietary driver out to userspace without an iommu to protect the
host.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 18:51                   ` Alex Williamson
  2015-10-06 21:32                     ` Stephen Hemminger
@ 2015-10-07  6:52                     ` Avi Kivity
  2015-10-07 16:31                       ` Alex Williamson
  2015-10-07  7:55                     ` Vlad Zolotarov
  2 siblings, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-07  6:52 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Michael S. Tsirkin, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck



On 10/06/2015 09:51 PM, Alex Williamson wrote:
> On Tue, 2015-10-06 at 18:23 +0300, Avi Kivity wrote:
>> On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote:
>>> On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
>>>> The only "like VFIO" behavior we implement here is binding the MSI-X
>>>> interrupt notification to eventfd descriptor.
>>> There will be more if you add some basic memory protections.
>>>
>>> Besides, that's not true.
>>> Your patch queries MSI capability, sets # of vectors.
>>> You even hinted you want to add BAR mapping down the road.
>> BAR mapping is already available from sysfs; it is not mandatory.
>>
>>> VFIO does all of that.
>>>
>> Copying vfio maintainer Alex (hi!).
>>
>> vfio's charter is modern iommu-capable configurations. It is designed to
>> be secure enough to be usable by an unprivileged user.
>>
>> For performance and hardware reasons, many dpdk deployments use
>> uio_pci_generic.  They are willing to trade off the security provided by
>> vfio for the performance and deployment flexibility of pci_uio_generic.
>> Forcing these features into vfio will compromise its security and
>> needlessly complicate its code (I guess it can be done with a "null"
>> iommu, but then vfio will have to decide whether it is secure or not).
> It's not just the iommu model vfio uses, it's that vfio is built around
> iommu groups.  For instance to use a device in vfio, the user opens the
> vfio group file and asks for the device within that group.  That's a
> fairly fundamental part of the mechanics to sidestep.
>
> However, is there an opportunity at a lower level?  Systems without an
> iommu typically have dma ops handled via a software iotlb (ie. bounce
> buffers), but I think they simply don't have iommu ops registered.
> Could a no-iommu, iommu subsystem provide enough dummy iommu ops to fake
> out vfio?  It would need to iterate the devices on the bus and come up
> with dummy iommu groups and dummy versions of iommu_map and unmap.  The
> grouping is easy, one device per group, there's no isolation anyway.
> The vfio type1 iommu backend will do pinning, which seems like an
> improvement over the mlock that uio users probably try to do now.

Right now, people use hugetlbfs maps, which both locks the memory and 
provides better performance.

>    I
> guess the no-iommu map would error if the IOVA isn't simply the bus
> address of the page mapped.
>
> Of course this is entirely unsafe and this no-iommu driver should taint
> the kernel, but it at least standardizes on one userspace API and you're
> already doing completely unsafe things with uio.  vfio should be
> enlightened at least to the point that it allows only privileged users
> access to devices under such a (lack of) iommu.

There is an additional complication.  With an iommu, userspace programs 
the device with virtual addresses, but without it, they have to program 
physical addresses.  So vfio would need to communicate this bit of 
information.

We can go further and define a better translation API than the current 
one (reading /proc/pagemap).  But it's going to be a bigger change to 
vfio than I thought at first.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-06 18:51                   ` Alex Williamson
  2015-10-06 21:32                     ` Stephen Hemminger
  2015-10-07  6:52                     ` Avi Kivity
@ 2015-10-07  7:55                     ` Vlad Zolotarov
  2015-10-08  8:48                       ` Michael S. Tsirkin
  2 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-07  7:55 UTC (permalink / raw)
  To: Alex Williamson, Avi Kivity
  Cc: Michael S. Tsirkin, Greg KH, linux-kernel, hjk, corbet,
	bruce.richardson, avi, gleb, stephen, alexander.duyck



On 10/06/15 21:51, Alex Williamson wrote:
> On Tue, 2015-10-06 at 18:23 +0300, Avi Kivity wrote:
>> On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote:
>>> On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
>>>> The only "like VFIO" behavior we implement here is binding the MSI-X
>>>> interrupt notification to eventfd descriptor.
>>> There will be more if you add some basic memory protections.
>>>
>>> Besides, that's not true.
>>> Your patch queries MSI capability, sets # of vectors.
>>> You even hinted you want to add BAR mapping down the road.
>> BAR mapping is already available from sysfs; it is not mandatory.
>>
>>> VFIO does all of that.
>>>
>> Copying vfio maintainer Alex (hi!).
>>
>> vfio's charter is modern iommu-capable configurations. It is designed to
>> be secure enough to be usable by an unprivileged user.
>>
>> For performance and hardware reasons, many dpdk deployments use
>> uio_pci_generic.  They are willing to trade off the security provided by
>> vfio for the performance and deployment flexibility of pci_uio_generic.
>> Forcing these features into vfio will compromise its security and
>> needlessly complicate its code (I guess it can be done with a "null"
>> iommu, but then vfio will have to decide whether it is secure or not).
> It's not just the iommu model vfio uses, it's that vfio is built around
> iommu groups.  For instance to use a device in vfio, the user opens the
> vfio group file and asks for the device within that group.  That's a
> fairly fundamental part of the mechanics to sidestep.
>
> However, is there an opportunity at a lower level?  Systems without an
> iommu typically have dma ops handled via a software iotlb (ie. bounce
> buffers), but I think they simply don't have iommu ops registered.
> Could a no-iommu, iommu subsystem provide enough dummy iommu ops to fake
> out vfio?  It would need to iterate the devices on the bus and come up
> with dummy iommu groups and dummy versions of iommu_map and unmap.  The
> grouping is easy, one device per group, there's no isolation anyway.
> The vfio type1 iommu backend will do pinning, which seems like an
> improvement over the mlock that uio users probably try to do now.  I
> guess the no-iommu map would error if the IOVA isn't simply the bus
> address of the page mapped.
>
> Of course this is entirely unsafe and this no-iommu driver should taint
> the kernel, but it at least standardizes on one userspace API and you're
> already doing completely unsafe things with uio.  vfio should be
> enlightened at least to the point that it allows only privileged users
> access to devices under such a (lack of) iommu.

Thanks for clarification, Alex.
One of the important points in the above description is that vfio has 
been build around IOMMU groups - and that's a good thing!
This means that this ensures the safety for vfio users and IMHO breaking 
this by introducing the no-iommu mode won't bring any good.

What do we have on a negative side of this step:

 1. Just a description of the work that has to be done implies a
    non-trivial code that will have to be maintained later.
 2. This new mode will be absolutely unsafe while some users may
    mistakenly assume that using vfio is safe in all situations like it
    is now.
 3. The vfio user interface is "a bit" more complicated than the one of
    UIO's and if there isn't any added value (see below) users will just
    prefer continue using UIO.

Let's try to analyze the possible positive sides (as u've described them 
above):

 1. /This added feature may allow to standardize the user-space drivers
    interface. Why is it good? - This could allow us to maintain only
    one infrastructure (vfio)./ That's true but unfortunately UIO
    interface is already widely used and thus it can't be just killed.
    Therefore instead of maintaining one unsafe user-space driver
    infrastructure we'll have to maintain two. Therefore the result will
    be exactly the opposite from the expected.
 2. I'm not very familiar with all vfio features but regarding the "vfio
    type1 iommu backend going to do pinning instead of mlock" - another
    alternative to pin the pages in the memory is to use hugetlbfs,
    which UIO users like DPDK do. I may be wrong but I'm not sure that
    in this case using vfio type1 iommu backend would be beneficial it
    terms of performance and performance is usually the most important
    factor for un-safe mode users (e.g. DPDK).


So, considering the above I think that instead of complicating the 
already non-trivial vfio interface even more we'd rather have two types 
of user-space interfaces:

  * safe - VFIO
  * not safe - UIO

The thing is that this is more or less the situation right now and 
according to negatives.3 above it is likely to remain this way (at least 
the UIO part ;)) for some (long) time so all this "adding unsafe mode to 
VFIO" initiative looks completely useless.

thanks,
vlad


>
>>>> This doesn't justifies the
>>>> hassle of implementing IOMMU-less VFIO mode.
>>> This applies to both VFIO and UIO really.  I'm not sure the hassle of
>>> maintaining this functionality in tree is justified.  It remains to be
>>> seen whether there are any users that won't taint the kernel.
>>> Apparently not in the current form of the patch, but who knows.
>> It is not msix that taints the kernel, it's uio_pci_generic.  Msix is a
>> tiny feature addition that doesn't change the security situation one bit.
>>
>> btw, currently you can map BARs and dd to /dev/mem to your heart's
>> content without tainting the kernel.  I don't see how you can claim that
>> msix support makes the situation worse, when root can access every bit
>> of physical memory, either directly or via DMA.
>
>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
       [not found]                         ` <CAOaVG152OrQz-Bbnpr0VeE+vLH7nMGsG6A3sD7eTQHormNGVUg@mail.gmail.com>
@ 2015-10-07  7:57                           ` Vlad Zolotarov
       [not found]                           ` <5614C160.6000203@scylladb.com>
  1 sibling, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-07  7:57 UTC (permalink / raw)
  To: Stephen Hemminger, Alex Williamson
  Cc: Avi Kivity, Michael S. Tsirkin, Greg KH, Linux Kernel, hjk,
	Jonathan Corbet, Bruce.Richardson@intel.com, avi, gleb,
	Alexander Duyck



On 10/07/15 00:58, Stephen Hemminger wrote:
> Go ahead and submit a seperate taint bit for UIO as a patch.

This patch already does this.

thanks,
vlad

>
>
> On Tue, Oct 6, 2015 at 10:41 PM, Alex Williamson 
> <alex.williamson@redhat.com <mailto:alex.williamson@redhat.com>> wrote:
>
>     On Tue, 2015-10-06 at 22:32 +0100, Stephen Hemminger wrote:
>     > On Tue, 06 Oct 2015 12:51:20 -0600
>     > Alex Williamson <alex.williamson@redhat.com
>     <mailto:alex.williamson@redhat.com>> wrote:
>     >
>     > > Of course this is entirely unsafe and this no-iommu driver
>     should taint
>     > > the kernel, but it at least standardizes on one userspace API
>     and you're
>     > > already doing completely unsafe things with uio.  vfio should be
>     > > enlightened at least to the point that it allows only
>     privileged users
>     > > access to devices under such a (lack of) iommu
>     >
>     > I agree with the design, but not with the taint argument.
>     > (Unless you want to taint any and all use of UIO drivers which can
>     >  already do this).
>
>     Yes, actually, if the bus master bit gets enabled all bets are off.  I
>     don't see how that leaves a supportable kernel, so we might as well
>     taint it.  Isn't this exactly why we taint for proprietary drivers, we
>     have no idea what it has mucked with in kernel space.  This just moves
>     the proprietary driver out to userspace without an iommu to
>     protect the
>     host.  Thanks,
>
>     Alex
>
>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
       [not found]                           ` <5614C160.6000203@scylladb.com>
@ 2015-10-07  8:00                             ` Vlad Zolotarov
  2015-10-07  8:01                               ` Vlad Zolotarov
  0 siblings, 1 reply; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-07  8:00 UTC (permalink / raw)
  To: Avi Kivity, Stephen Hemminger, Alex Williamson
  Cc: Michael S. Tsirkin, Greg KH, Linux Kernel, hjk, Jonathan Corbet,
	Bruce.Richardson@intel.com, avi, gleb, Alexander Duyck



On 10/07/15 09:53, Avi Kivity wrote:
> On 10/07/2015 12:58 AM, Stephen Hemminger wrote:
>> Go ahead and submit a seperate taint bit for UIO as a patch.
>>
>
> Taint should only be applied if bus mastering is enabled (to avoid 
> annoying the users of the original uio use case)

Pls., note that this series would enable the legacy INT#X mode if 
possible and this, of course, without enabling bus mastering and without 
tainting the kernel.
This means that the current users of uio_pci_generic won't feel/get any 
difference after/if these patches are applied since before these patches 
it could only be used with the devices that do have INT#X capability.

>
>>
>> On Tue, Oct 6, 2015 at 10:41 PM, Alex Williamson 
>> <alex.williamson@redhat.com <mailto:alex.williamson@redhat.com>> wrote:
>>
>>     On Tue, 2015-10-06 at 22:32 +0100, Stephen Hemminger wrote:
>>     > On Tue, 06 Oct 2015 12:51:20 -0600
>>     > Alex Williamson <alex.williamson@redhat.com
>>     <mailto:alex.williamson@redhat.com>> wrote:
>>     >
>>     > > Of course this is entirely unsafe and this no-iommu driver
>>     should taint
>>     > > the kernel, but it at least standardizes on one userspace API
>>     and you're
>>     > > already doing completely unsafe things with uio.  vfio should be
>>     > > enlightened at least to the point that it allows only
>>     privileged users
>>     > > access to devices under such a (lack of) iommu
>>     >
>>     > I agree with the design, but not with the taint argument.
>>     > (Unless you want to taint any and all use of UIO drivers which can
>>     >  already do this).
>>
>>     Yes, actually, if the bus master bit gets enabled all bets are
>>     off.  I
>>     don't see how that leaves a supportable kernel, so we might as well
>>     taint it.  Isn't this exactly why we taint for proprietary
>>     drivers, we
>>     have no idea what it has mucked with in kernel space. This just moves
>>     the proprietary driver out to userspace without an iommu to
>>     protect the
>>     host.  Thanks,
>>
>>     Alex
>>
>>
>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07  8:00                             ` Vlad Zolotarov
@ 2015-10-07  8:01                               ` Vlad Zolotarov
  0 siblings, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-07  8:01 UTC (permalink / raw)
  To: Avi Kivity, Stephen Hemminger, Alex Williamson
  Cc: Michael S. Tsirkin, Greg KH, Linux Kernel, hjk, Jonathan Corbet,
	Bruce.Richardson@intel.com, avi, gleb, Alexander Duyck



On 10/07/15 11:00, Vlad Zolotarov wrote:
>
>
> On 10/07/15 09:53, Avi Kivity wrote:
>> On 10/07/2015 12:58 AM, Stephen Hemminger wrote:
>>> Go ahead and submit a seperate taint bit for UIO as a patch.
>>>
>>
>> Taint should only be applied if bus mastering is enabled (to avoid 
>> annoying the users of the original uio use case)
>
> Pls., note that this series would enable the legacy INT#X mode if 
> possible 

By default I meant.

> and this, of course, without enabling bus mastering and without 
> tainting the kernel.
> This means that the current users of uio_pci_generic won't feel/get 
> any difference after/if these patches are applied since before these 
> patches it could only be used with the devices that do have INT#X 
> capability.
>
>>
>>>
>>> On Tue, Oct 6, 2015 at 10:41 PM, Alex Williamson 
>>> <alex.williamson@redhat.com <mailto:alex.williamson@redhat.com>> wrote:
>>>
>>>     On Tue, 2015-10-06 at 22:32 +0100, Stephen Hemminger wrote:
>>>     > On Tue, 06 Oct 2015 12:51:20 -0600
>>>     > Alex Williamson <alex.williamson@redhat.com
>>>     <mailto:alex.williamson@redhat.com>> wrote:
>>>     >
>>>     > > Of course this is entirely unsafe and this no-iommu driver
>>>     should taint
>>>     > > the kernel, but it at least standardizes on one userspace API
>>>     and you're
>>>     > > already doing completely unsafe things with uio.  vfio 
>>> should be
>>>     > > enlightened at least to the point that it allows only
>>>     privileged users
>>>     > > access to devices under such a (lack of) iommu
>>>     >
>>>     > I agree with the design, but not with the taint argument.
>>>     > (Unless you want to taint any and all use of UIO drivers which 
>>> can
>>>     >  already do this).
>>>
>>>     Yes, actually, if the bus master bit gets enabled all bets are
>>>     off.  I
>>>     don't see how that leaves a supportable kernel, so we might as well
>>>     taint it.  Isn't this exactly why we taint for proprietary
>>>     drivers, we
>>>     have no idea what it has mucked with in kernel space. This just 
>>> moves
>>>     the proprietary driver out to userspace without an iommu to
>>>     protect the
>>>     host.  Thanks,
>>>
>>>     Alex
>>>
>>>
>>
>


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 16:09           ` Avi Kivity
@ 2015-10-07 10:25             ` Michael S. Tsirkin
  2015-10-07 10:28               ` Avi Kivity
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-07 10:25 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Avi Kivity, Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck

On Tue, Oct 06, 2015 at 07:09:11PM +0300, Avi Kivity wrote:
> 
> 
> On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote:
> >>While it is possible that userspace malfunctions and accidentally programs
> >>MSI incorrectly, the risk is dwarfed by the ability of userspace to program
> >>DMA incorrectly.
> >That seems to imply that for the upstream kernel this is not a valid usecase at all.
> >
> 
> That is trivially incorrect, upstream pci_uio_generic is used with dpdk for
> years.

dpdk used to do polling for years. patch to use interrupts was posted in
june 2015.

>  Are dpdk applications an invalid use case?

The way dpdk is using UIO/sysfs is borderline at best, and can't be used
to justify new interfaces.  They have a more secure mode using VFIO.
That one's more reasonable.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-07 10:25             ` Michael S. Tsirkin
@ 2015-10-07 10:28               ` Avi Kivity
  0 siblings, 0 replies; 96+ messages in thread
From: Avi Kivity @ 2015-10-07 10:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck



On 10/07/2015 01:25 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 07:09:11PM +0300, Avi Kivity wrote:
>>
>> On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote:
>>>> While it is possible that userspace malfunctions and accidentally programs
>>>> MSI incorrectly, the risk is dwarfed by the ability of userspace to program
>>>> DMA incorrectly.
>>> That seems to imply that for the upstream kernel this is not a valid usecase at all.
>>>
>> That is trivially incorrect, upstream pci_uio_generic is used with dpdk for
>> years.
> dpdk used to do polling for years. patch to use interrupts was posted in
> june 2015.

dpdk used interrupts long before that.

>
>>   Are dpdk applications an invalid use case?
> The way dpdk is using UIO/sysfs is borderline at best, and can't be used
> to justify new interfaces.  They have a more secure mode using VFIO.
> That one's more reasonable.
>

Maybe this was not stressed enough times, but not all configurations 
have an iommu, or want to use one.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07  6:52                     ` Avi Kivity
@ 2015-10-07 16:31                       ` Alex Williamson
  2015-10-07 16:39                         ` Avi Kivity
  2015-10-07 20:05                         ` Michael S. Tsirkin
  0 siblings, 2 replies; 96+ messages in thread
From: Alex Williamson @ 2015-10-07 16:31 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Michael S. Tsirkin, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Wed, 2015-10-07 at 09:52 +0300, Avi Kivity wrote:
> 
> On 10/06/2015 09:51 PM, Alex Williamson wrote:
> > On Tue, 2015-10-06 at 18:23 +0300, Avi Kivity wrote:
> >> On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote:
> >>>> The only "like VFIO" behavior we implement here is binding the MSI-X
> >>>> interrupt notification to eventfd descriptor.
> >>> There will be more if you add some basic memory protections.
> >>>
> >>> Besides, that's not true.
> >>> Your patch queries MSI capability, sets # of vectors.
> >>> You even hinted you want to add BAR mapping down the road.
> >> BAR mapping is already available from sysfs; it is not mandatory.
> >>
> >>> VFIO does all of that.
> >>>
> >> Copying vfio maintainer Alex (hi!).
> >>
> >> vfio's charter is modern iommu-capable configurations. It is designed to
> >> be secure enough to be usable by an unprivileged user.
> >>
> >> For performance and hardware reasons, many dpdk deployments use
> >> uio_pci_generic.  They are willing to trade off the security provided by
> >> vfio for the performance and deployment flexibility of pci_uio_generic.
> >> Forcing these features into vfio will compromise its security and
> >> needlessly complicate its code (I guess it can be done with a "null"
> >> iommu, but then vfio will have to decide whether it is secure or not).
> > It's not just the iommu model vfio uses, it's that vfio is built around
> > iommu groups.  For instance to use a device in vfio, the user opens the
> > vfio group file and asks for the device within that group.  That's a
> > fairly fundamental part of the mechanics to sidestep.
> >
> > However, is there an opportunity at a lower level?  Systems without an
> > iommu typically have dma ops handled via a software iotlb (ie. bounce
> > buffers), but I think they simply don't have iommu ops registered.
> > Could a no-iommu, iommu subsystem provide enough dummy iommu ops to fake
> > out vfio?  It would need to iterate the devices on the bus and come up
> > with dummy iommu groups and dummy versions of iommu_map and unmap.  The
> > grouping is easy, one device per group, there's no isolation anyway.
> > The vfio type1 iommu backend will do pinning, which seems like an
> > improvement over the mlock that uio users probably try to do now.
> 
> Right now, people use hugetlbfs maps, which both locks the memory and 
> provides better performance.
> 
> >    I
> > guess the no-iommu map would error if the IOVA isn't simply the bus
> > address of the page mapped.
> >
> > Of course this is entirely unsafe and this no-iommu driver should taint
> > the kernel, but it at least standardizes on one userspace API and you're
> > already doing completely unsafe things with uio.  vfio should be
> > enlightened at least to the point that it allows only privileged users
> > access to devices under such a (lack of) iommu.
> 
> There is an additional complication.  With an iommu, userspace programs 
> the device with virtual addresses, but without it, they have to program 
> physical addresses.  So vfio would need to communicate this bit of 
> information.
> 
> We can go further and define a better translation API than the current 
> one (reading /proc/pagemap).  But it's going to be a bigger change to 
> vfio than I thought at first.

It sounds like a separate vfio iommu backend from type1, one that just
pins the page and returns the bus address.  The curse and benefit would
be that existing type1 users wouldn't "just work" in an insecure mode,
the DMA mapping code would need to be aware of the difference.  Still, I
do really prefer to keep vfio as only exposing a secure, iommu protected
device to the user because surely someone will try and users would
expect that removing iommu restrictions from vfio means they can do
device assignment to VMs w/o an iommu.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07 16:31                       ` Alex Williamson
@ 2015-10-07 16:39                         ` Avi Kivity
  2015-10-07 21:05                           ` Michael S. Tsirkin
  2015-10-07 20:05                         ` Michael S. Tsirkin
  1 sibling, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-07 16:39 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Michael S. Tsirkin, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck



On 10/07/2015 07:31 PM, Alex Williamson wrote:
>>>     I
>>> guess the no-iommu map would error if the IOVA isn't simply the bus
>>> address of the page mapped.
>>>
>>> Of course this is entirely unsafe and this no-iommu driver should taint
>>> the kernel, but it at least standardizes on one userspace API and you're
>>> already doing completely unsafe things with uio.  vfio should be
>>> enlightened at least to the point that it allows only privileged users
>>> access to devices under such a (lack of) iommu.
>> There is an additional complication.  With an iommu, userspace programs
>> the device with virtual addresses, but without it, they have to program
>> physical addresses.  So vfio would need to communicate this bit of
>> information.
>>
>> We can go further and define a better translation API than the current
>> one (reading /proc/pagemap).  But it's going to be a bigger change to
>> vfio than I thought at first.
> It sounds like a separate vfio iommu backend from type1, one that just
> pins the page and returns the bus address.  The curse and benefit would
> be that existing type1 users wouldn't "just work" in an insecure mode,
> the DMA mapping code would need to be aware of the difference.  Still, I
> do really prefer to keep vfio as only exposing a secure, iommu protected
> device to the user because surely someone will try and users would
> expect that removing iommu restrictions from vfio means they can do
> device assignment to VMs w/o an iommu.

That's what I thought as well, but apparently adding msix support to the 
already insecure uio drivers is even worse.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07 16:31                       ` Alex Williamson
  2015-10-07 16:39                         ` Avi Kivity
@ 2015-10-07 20:05                         ` Michael S. Tsirkin
  1 sibling, 0 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-07 20:05 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Avi Kivity, Vlad Zolotarov, Greg KH, linux-kernel, hjk, corbet,
	bruce.richardson, avi, gleb, stephen, alexander.duyck

On Wed, Oct 07, 2015 at 10:31:04AM -0600, Alex Williamson wrote:
> It sounds like a separate vfio iommu backend from type1, one that just
> pins the page and returns the bus address.  The curse and benefit would
> be that existing type1 users wouldn't "just work" in an insecure mode,
> the DMA mapping code would need to be aware of the difference.  Still, I
> do really prefer to keep vfio as only exposing a secure, iommu protected
> device to the user because surely someone will try and users would
> expect that removing iommu restrictions from vfio means they can do
> device assignment to VMs w/o an iommu.

What I had in mind is rather reusing vfio code.

What is needed is all the logic for handling device reset, protecting
BARs and config space regions, MSI/MSI-X and device-specific
work-arounds.

Also, all the interface things such as eventfd.

But I don't think it should be the same char device as vfio -
/dev/vfio/vfio is world-accessible - should be a separate non-world
accessible device, maybe /dev/vfio/noiommu.

This will ensure e.g. qemu does not attempts to use it automatically.
-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07 16:39                         ` Avi Kivity
@ 2015-10-07 21:05                           ` Michael S. Tsirkin
  2015-10-08  4:19                             ` Gleb Natapov
  2015-10-08  5:33                             ` Avi Kivity
  0 siblings, 2 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-07 21:05 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
> That's what I thought as well, but apparently adding msix support to the
> already insecure uio drivers is even worse.

I'm glad you finally agree what these drivers are doing is insecure.

And basically kernel cares about security, no one wants to maintain insecure stuff.

So you guys should think harder whether this code makes any sense upstream.

Getting support from kernel is probably the biggest reason to put code
upstream, and this driver taints kernel unconditionally so you don't get
that.

Alternatively, most of the problem you are trying to solve is for
virtualization - and it is is better addressed at the hypervisor level.
There are enough opensource hypervisors out there - work on IOMMU
support there would be time well spent.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07 21:05                           ` Michael S. Tsirkin
@ 2015-10-08  4:19                             ` Gleb Natapov
  2015-10-08  7:41                               ` Michael S. Tsirkin
  2015-10-08  5:33                             ` Avi Kivity
  1 sibling, 1 reply; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08  4:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 12:05:11AM +0300, Michael S. Tsirkin wrote:
> On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
> > That's what I thought as well, but apparently adding msix support to the
> > already insecure uio drivers is even worse.
> 
> I'm glad you finally agree what these drivers are doing is insecure.
> 
Michael, please stop this meaningless world play. The above is said in
the contexts of a device that is meant to be accessible by regular users
and obviously for that purpose uio is insecure (in its current state btw).
If you give user access to your root block device this device will be
insecure too, so according to your logic block device is insecure?
Pushing the code from uio to vfio means that vfio will have to implement
access policy by itself - allow iommu mode to regular users, but
no-iommu to root only. Implementing policy in the kernel is bad. Well
the alternative is to add /dev/vfio/nommu like you've said, but what
would be the difference between this and uio eludes me.

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07 21:05                           ` Michael S. Tsirkin
  2015-10-08  4:19                             ` Gleb Natapov
@ 2015-10-08  5:33                             ` Avi Kivity
  2015-10-08  7:32                               ` Michael S. Tsirkin
  2015-10-08  8:32                               ` Michael S. Tsirkin
  1 sibling, 2 replies; 96+ messages in thread
From: Avi Kivity @ 2015-10-08  5:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck



On 08/10/15 00:05, Michael S. Tsirkin wrote:
> On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
>> That's what I thought as well, but apparently adding msix support to the
>> already insecure uio drivers is even worse.
> I'm glad you finally agree what these drivers are doing is insecure.
>
> And basically kernel cares about security, no one wants to maintain insecure stuff.
>
> So you guys should think harder whether this code makes any sense upstream.

You simply ignore everything I write, cherry-picking the word "insecure" 
as if it makes your point.  That is very frustrating.

The kernel is not secure against root, even in the restricted "will it 
oops" sense.  You can oops it easily, try dd if=/dev/urandom of=/dev/mem 
(or of=/dev/sda for a more satisfying oops).

> Getting support from kernel is probably the biggest reason to put code
> upstream, and this driver taints kernel unconditionally so you don't get
> that.

The biggest reason is that if a driver gets upstream, in a year or two 
it is universally available.


> Alternatively, most of the problem you are trying to solve is for
> virtualization - and it is is better addressed at the hypervisor level.
> There are enough opensource hypervisors out there - work on IOMMU
> support there would be time well spent.

It is not.  The problem we are trying to solve, and please consider the 
following as if written in all caps, is that some configurations do not 
have an iommu or cannot use it for performance reasons.

It is good practice to defend against root oopsing the kernel, but in 
some cases it cannot be achieved.  A trivial example is a nommu kernel, 
this is another.  In these cases we can give up on this goal, because it 
is not the only reason for the kernel's existence, there are others.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  5:33                             ` Avi Kivity
@ 2015-10-08  7:32                               ` Michael S. Tsirkin
  2015-10-08  8:46                                 ` Avi Kivity
  2015-10-08  8:32                               ` Michael S. Tsirkin
  1 sibling, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08  7:32 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> It is good practice to defend against root oopsing the kernel, but in some
> cases it cannot be achieved.

Absolutely. That's one of the issues with these patches. They don't even
try where it's absolutely possible.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  4:19                             ` Gleb Natapov
@ 2015-10-08  7:41                               ` Michael S. Tsirkin
  2015-10-08  7:59                                 ` Gleb Natapov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08  7:41 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 07:19:13AM +0300, Gleb Natapov wrote:
> Well
> the alternative is to add /dev/vfio/nommu like you've said, but what
> would be the difference between this and uio eludes me.

Are you familiar with vfio that you ask such a question?

Here's the vfio pci code:

$ wc -l drivers/vfio/pci/*
   27 drivers/vfio/pci/Kconfig
    4 drivers/vfio/pci/Makefile
 1217 drivers/vfio/pci/vfio_pci.c
 1602 drivers/vfio/pci/vfio_pci_config.c
  675 drivers/vfio/pci/vfio_pci_intrs.c
   92 drivers/vfio/pci/vfio_pci_private.h
  238 drivers/vfio/pci/vfio_pci_rdwr.c
 3855 total

There's some code dealing with iommu groups in
drivers/vfio/pci/vfio_pci.c,
but most of it is validating input and
presenting a consistent interface to userspace.

This is exactly what's missing here.

There's also drivers/vfio/virqfd.c which deals
with sending interrupts over eventfds correctly.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  7:41                               ` Michael S. Tsirkin
@ 2015-10-08  7:59                                 ` Gleb Natapov
  2015-10-08  9:38                                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08  7:59 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 10:41:53AM +0300, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 07:19:13AM +0300, Gleb Natapov wrote:
> > Well
> > the alternative is to add /dev/vfio/nommu like you've said, but what
> > would be the difference between this and uio eludes me.
> 
> Are you familiar with vfio that you ask such a question?
> 
Yes, I do and I do not see anything of value that vfio can add to nommu
setup besides complexity, but I do see why it will have to have special
interface not applicable to regular vfio (hint: there is not HW to translate
virtual address to physical) and why it will have to be accessible to
root user only.

> Here's the vfio pci code:
> 
> $ wc -l drivers/vfio/pci/*
>    27 drivers/vfio/pci/Kconfig
>     4 drivers/vfio/pci/Makefile
>  1217 drivers/vfio/pci/vfio_pci.c
>  1602 drivers/vfio/pci/vfio_pci_config.c
>   675 drivers/vfio/pci/vfio_pci_intrs.c
>    92 drivers/vfio/pci/vfio_pci_private.h
>   238 drivers/vfio/pci/vfio_pci_rdwr.c
>  3855 total
>
> There's some code dealing with iommu groups in
> drivers/vfio/pci/vfio_pci.c,
> but most of it is validating input and
> presenting a consistent interface to userspace.
> 
What is has to do with the patch series in question? Non patched
uio_generic code does not validate input. If you think it should by all
means write the code (don't break existing use cases while doing so),
but the patch under discussion does not even access pci device from
userspace, so it will not be affected by said filtering.

> This is exactly what's missing here.
It is not missing in this patch series, it is missing from upstream
code. I do not remember this been an issue when uio_generic was accepted
into the kernel. The reason was because it meant to be accessible by root
only. VFIO was designed to be used by regular user from ground up, so
obviously unrestricted access to pci space was out of the question.
Different use cases lead to different designs, how surprising.

> 
> There's also drivers/vfio/virqfd.c which deals
> with sending interrupts over eventfds correctly.
> 
As opposite to this patch that deals with them incorrectly? In what way?

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  5:33                             ` Avi Kivity
  2015-10-08  7:32                               ` Michael S. Tsirkin
@ 2015-10-08  8:32                               ` Michael S. Tsirkin
  2015-10-08  8:52                                 ` Gleb Natapov
  2015-10-08  9:19                                 ` Avi Kivity
  1 sibling, 2 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08  8:32 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> On 08/10/15 00:05, Michael S. Tsirkin wrote:
> >On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
> >>That's what I thought as well, but apparently adding msix support to the
> >>already insecure uio drivers is even worse.
> >I'm glad you finally agree what these drivers are doing is insecure.
> >
> >And basically kernel cares about security, no one wants to maintain insecure stuff.
> >
> >So you guys should think harder whether this code makes any sense upstream.
> 
> You simply ignore everything I write, cherry-picking the word "insecure" as
> if it makes your point.  That is very frustrating.

And I'm sorry about the frustration.  I didn't intend to twist your
words. It's just that I had to spend literally hours trying to explain
that security matters in kernel, and all I was getting back was a
summary "there's no security issue because there are other way to
corrupt memory".

So I was glad when it looked like there's finally an agreement that yes,
there's value in validating userspace input and yes, it's insecure
not to do this.

> It is good practice to defend against root oopsing the kernel, but in some
> cases it cannot be achieved.

I originally included ways to fix issues that I pointed out, ranging
from harder to implement with more overhead but more secure to easier to
implement with less overhead but less secure.  There didn't seem to be
an understanding that the issues are there at all, so I stopped doing
that - seemed like a waste of time.

For example, will it kill your performance to reset devices cleanly, on
open and close, protect them from writes into MSI config, BAR registers
and related capablities etc etc?  And if not, why are you people wasting
time arguing about that?  The only thing I heard is that it's a hassle.
That's true (though if you follow my advice and try to share code with
vfio/pci you get a lot of this logic for free).  So it's an
understandable argument if you just need something that works, quickly.
But if it's such a stopgap hack, there's no need to insist on it
upstream.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  7:32                               ` Michael S. Tsirkin
@ 2015-10-08  8:46                                 ` Avi Kivity
  2015-10-08  9:16                                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-08  8:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck



On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
>> It is good practice to defend against root oopsing the kernel, but in some
>> cases it cannot be achieved.
> Absolutely. That's one of the issues with these patches. They don't even
> try where it's absolutely possible.
>

Are you referring to blocking the maps of the msix BAR areas?

I think there is value in that.  The value is small, because a 
corruption is more likely in the dynamic memory responsible for tens of 
millions of DMA operations per second, rather than a static 4K area, but 
it exists.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-07  7:55                     ` Vlad Zolotarov
@ 2015-10-08  8:48                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08  8:48 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: Alex Williamson, Avi Kivity, Greg KH, linux-kernel, hjk, corbet,
	bruce.richardson, avi, gleb, stephen, alexander.duyck

On Wed, Oct 07, 2015 at 10:55:30AM +0300, Vlad Zolotarov wrote:
>  * not safe - UIO

That's wrong. UIO (in particular uio_pci_generic) can be used
safely in many ways, for example with any device not doing DMA.  I
wouldn't put it upstream otherwise.

Make your driver work in such a way that it can be used safely,
and it can be merged.

But when you try to do this, you will find out just why VFIO/PCI is
1000s of LOC while your patch is only 500.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  8:32                               ` Michael S. Tsirkin
@ 2015-10-08  8:52                                 ` Gleb Natapov
  2015-10-08  9:19                                 ` Avi Kivity
  1 sibling, 0 replies; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08  8:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 11:32:50AM +0300, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > On 08/10/15 00:05, Michael S. Tsirkin wrote:
> > >On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
> > >>That's what I thought as well, but apparently adding msix support to the
> > >>already insecure uio drivers is even worse.
> > >I'm glad you finally agree what these drivers are doing is insecure.
> > >
> > >And basically kernel cares about security, no one wants to maintain insecure stuff.
> > >
> > >So you guys should think harder whether this code makes any sense upstream.
> > 
> > You simply ignore everything I write, cherry-picking the word "insecure" as
> > if it makes your point.  That is very frustrating.
> 
> And I'm sorry about the frustration.  I didn't intend to twist your
> words. It's just that I had to spend literally hours trying to explain
> that security matters in kernel, and all I was getting back was a
> summary "there's no security issue because there are other way to
> corrupt memory".
> 
That's not the (only) answer that you were given. The answers that
you constantly ignore is that the patch in question does not add any
new ways to corrupt memory which are not possible using _upstream_
uio_pci_generic device, so the fact that uio_pci_generic can corrupt
memory cannot be used as a reason to not apply patches that do not corrupt
any memory. You seams to be constantly arguing that uio_pci_generic is
not suitable for upstream.

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  8:46                                 ` Avi Kivity
@ 2015-10-08  9:16                                   ` Michael S. Tsirkin
  2015-10-08  9:44                                     ` Avi Kivity
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08  9:16 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> 
> 
> On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> >On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> >>It is good practice to defend against root oopsing the kernel, but in some
> >>cases it cannot be achieved.
> >Absolutely. That's one of the issues with these patches. They don't even
> >try where it's absolutely possible.
> >
> 
> Are you referring to blocking the maps of the msix BAR areas?

For example. There are more. I listed some of the issues on the mailing
list, and I might have missed some.  VFIO has code to address all this,
people should share code to avoid duplication, or at least read it
to understand the issues.

> I think there is value in that.  The value is small because a
> corruption is more likely in the dynamic memory responsible for tens
> of millions of DMA operations per second, rather than a static 4K
> area, but it exists.

There are other bugs which will hurt e.g. each time application does not
exit gracefully.

But well, heh :) That's precisely my feeling about the whole "running
userspace drivers without an IOMMU" project. The value is small
since modern hardware has fast IOMMUs, but it exists.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  8:32                               ` Michael S. Tsirkin
  2015-10-08  8:52                                 ` Gleb Natapov
@ 2015-10-08  9:19                                 ` Avi Kivity
  2015-10-08 10:26                                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-08  9:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck



On 10/08/2015 11:32 AM, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
>> On 08/10/15 00:05, Michael S. Tsirkin wrote:
>>> On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
>>>> That's what I thought as well, but apparently adding msix support to the
>>>> already insecure uio drivers is even worse.
>>> I'm glad you finally agree what these drivers are doing is insecure.
>>>
>>> And basically kernel cares about security, no one wants to maintain insecure stuff.
>>>
>>> So you guys should think harder whether this code makes any sense upstream.
>> You simply ignore everything I write, cherry-picking the word "insecure" as
>> if it makes your point.  That is very frustrating.
> And I'm sorry about the frustration.  I didn't intend to twist your
> words. It's just that I had to spend literally hours trying to explain
> that security matters in kernel, and all I was getting back was a
> summary "there's no security issue because there are other way to
> corrupt memory".

The word security has several meanings.  The primary meaning is "defense 
against a malicious attacker".  In that sense, there is no added value 
at all, because the attacker is already root, and can already access all 
of kernel and user memory.  Even if the attacker is not root, and just 
has access to a non-iommu-protected device, they can still DMA to and 
from any memory they like.

This sense of the word however is irrelevant for this conversation; the 
user already gave up on it when they chose to use uio_pci_generic 
(either because they have no iommu, or because they need the extra 
performance).

Do we agree that security, in the sense of defense against a malicious 
attacker, is irrelevant for this conversation?

A secondary meaning is protection against inadvertent bugs.  Yes, a 
faulty memory write that happens to land in the msix page, can cause a 
random memory word to be overwritten.  But so can a faulty memory write 
into the rings, or the data structures that support virtual->physical 
translation, the data structures that describe the packets before 
translation, the memory allocator or pool.  The patch extends the 
vulnerable surface, but by a negligible amount.

>
> So I was glad when it looked like there's finally an agreement that yes,
> there's value in validating userspace input and yes, it's insecure
> not to do this.



>
>> It is good practice to defend against root oopsing the kernel, but in some
>> cases it cannot be achieved.
> I originally included ways to fix issues that I pointed out, ranging
> from harder to implement with more overhead but more secure to easier to
> implement with less overhead but less secure.  There didn't seem to be
> an understanding that the issues are there at all, so I stopped doing
> that - seemed like a waste of time.
>
> For example, will it kill your performance to reset devices cleanly, on
> open and close,

I don't recall this being mentioned at all.  It seems completely 
unrelated to a patch adding msix support to uio_pci_generic.

>   protect them from writes into MSI config, BAR registers
> and related capablities etc etc?

Obviously the userspace driver has to write to the BAR area.

If you're talking about the BAR setup registers, yes there is some 
(tiny) value in that, but how is it related to this patch?

Protecting the MSI area in the BARs _is_ related to the patch.  I agree 
it adds value, if small.

>    And if not, why are you people wasting
> time arguing about that?

I you want to use your position as maintainer of uio_pci_generic to get 
people to overhaul the driver for you with unrelated changes, they will 
object.  I can understand a maintainer pointing out the right way to do 
something rather than the wrong way.  But piling on a list of unrelated 
features as prerequisites is, in my opinion, abuse.

Let me repeat that pci_uio_generic is already used for userspace 
drivers, with all the issues that you point out, for a long while now. 
These issues are not exposed by the requirement to use msix. You are not 
protecting the kernel in any way by blocking the patch, you are only 
protecting people with iommu-less configurations from using their hardware.

>    The only thing I heard is that it's a hassle.
> That's true (though if you follow my advice and try to share code with
> vfio/pci you get a lot of this logic for free).

My thinking was that vfio was for secure (in the "defense against 
malicious attackers" sense) while uio_pci_generic was, de-facto at 
least, for use by trusted users.

We are in the strange situation that the Alex is open to adding an 
insecure mode to vfio, while you object to a patch which does not change 
the security of uio_pci_generic in any way; it only makes it more usable 
at the cost of a tiny increase in the bug surface.

>    So it's an
> understandable argument if you just need something that works, quickly.
> But if it's such a stopgap hack, there's no need to insist on it
> upstream.

It is not more or less a hack than uio_pci_generic allowing DMA, or 
/dev/mem, or the module loading interface, or nommu kernels. Security is 
just one aspect of the kernel, not the only one.

It's perfectly reasonable to taint the kernel when insecure DMA is 
enabled, and to allow the administrator to disable the interface 
completely.  What I don't understand is why, given that the user allows 
DMA, we should prevent them from using MSIX in addition.


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  7:59                                 ` Gleb Natapov
@ 2015-10-08  9:38                                   ` Michael S. Tsirkin
  2015-10-08  9:45                                     ` Gleb Natapov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08  9:38 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 10:59:10AM +0300, Gleb Natapov wrote:
> I do not remember this been an issue when uio_generic was accepted
> into the kernel. The reason was because it meant to be accessible by root
> only.

No - because it does not need bus mastering. So it can be used safely
with some devices.

[mst@robin linux]$ git grep pci_set_master|wc -l 533
[mst@robin linux]$ git grep pci_enable|wc -l 1597

Looks like about 2/3 devices don't need to be bus masters.

It's up to admin not to bind it to devices, and that is unfortunate,
but manually binding an incorrect driver to a device is generally
a hard problem to solve.

> > There's also drivers/vfio/virqfd.c which deals
> > with sending interrupts over eventfds correctly.
> > 
> As opposite to this patch that deals with them incorrectly? In what way?

cleanup on fd close is not handled.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  9:16                                   ` Michael S. Tsirkin
@ 2015-10-08  9:44                                     ` Avi Kivity
  2015-10-08 12:06                                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Avi Kivity @ 2015-10-08  9:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck



On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
>>
>> On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
>>> On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
>>>> It is good practice to defend against root oopsing the kernel, but in some
>>>> cases it cannot be achieved.
>>> Absolutely. That's one of the issues with these patches. They don't even
>>> try where it's absolutely possible.
>>>
>> Are you referring to blocking the maps of the msix BAR areas?
> For example. There are more. I listed some of the issues on the mailing
> list, and I might have missed some.  VFIO has code to address all this,
> people should share code to avoid duplication, or at least read it
> to understand the issues.

All but one of those are unrelated to the patch that adds msix support.

>
>> I think there is value in that.  The value is small because a
>> corruption is more likely in the dynamic memory responsible for tens
>> of millions of DMA operations per second, rather than a static 4K
>> area, but it exists.
> There are other bugs which will hurt e.g. each time application does not
> exit gracefully.

uio_pci_generic disables DMA when the device is removed, so we're safe 
here, at least if files are released before the address space.

>
> But well, heh :) That's precisely my feeling about the whole "running
> userspace drivers without an IOMMU" project. The value is small
> since modern hardware has fast IOMMUs, but it exists.
>

For users that don't have iommus at all (usually because it is taken by 
the hypervisor), it has great value.

I can't comment on iommu overhead; for my use case it is likely 
negligible and we will use an iommu when available; but apparently it 
matters for others.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  9:38                                   ` Michael S. Tsirkin
@ 2015-10-08  9:45                                     ` Gleb Natapov
  2015-10-08 12:15                                       ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08  9:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 12:38:28PM +0300, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 10:59:10AM +0300, Gleb Natapov wrote:
> > I do not remember this been an issue when uio_generic was accepted
> > into the kernel. The reason was because it meant to be accessible by root
> > only.
> 
> No - because it does not need bus mastering. So it can be used safely
> with some devices.
> 
It still can be used safely with same devices. Admittedly I did not look
close, but I am sure the patch does not enable bus mastering if MSI
interrupt is not requested. If not, well that can be fixed. But more
importantly it can be used unsafely in its current state. Not only can,
it is widely used so.

> [mst@robin linux]$ git grep pci_set_master|wc -l 533
> [mst@robin linux]$ git grep pci_enable|wc -l 1597
> 
> Looks like about 2/3 devices don't need to be bus masters.
> 
> It's up to admin not to bind it to devices, and that is unfortunate,
> but manually binding an incorrect driver to a device is generally
> a hard problem to solve.
> 
> > > There's also drivers/vfio/virqfd.c which deals
> > > with sending interrupts over eventfds correctly.
> > > 
> > As opposite to this patch that deals with them incorrectly? In what way?
> 
> cleanup on fd close is not handled.
> 
Have you commented about this on the patch and it was not fixed?

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  9:19                                 ` Avi Kivity
@ 2015-10-08 10:26                                   ` Michael S. Tsirkin
  2015-10-08 13:20                                     ` Avi Kivity
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08 10:26 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Thu, Oct 08, 2015 at 12:19:20PM +0300, Avi Kivity wrote:
> 
> 
> On 10/08/2015 11:32 AM, Michael S. Tsirkin wrote:
> >On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> >>On 08/10/15 00:05, Michael S. Tsirkin wrote:
> >>>On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
> >>>>That's what I thought as well, but apparently adding msix support to the
> >>>>already insecure uio drivers is even worse.
> >>>I'm glad you finally agree what these drivers are doing is insecure.
> >>>
> >>>And basically kernel cares about security, no one wants to maintain insecure stuff.
> >>>
> >>>So you guys should think harder whether this code makes any sense upstream.
> >>You simply ignore everything I write, cherry-picking the word "insecure" as
> >>if it makes your point.  That is very frustrating.
> >And I'm sorry about the frustration.  I didn't intend to twist your
> >words. It's just that I had to spend literally hours trying to explain
> >that security matters in kernel, and all I was getting back was a
> >summary "there's no security issue because there are other way to
> >corrupt memory".
> 
> The word security has several meanings.  The primary meaning is "defense
> against a malicious attacker".  In that sense, there is no added value at
> all, because the attacker is already root, and can already access all of
> kernel and user memory.  Even if the attacker is not root, and just has
> access to a non-iommu-protected device, they can still DMA to and from any
> memory they like.
> 
> This sense of the word however is irrelevant for this conversation; the user
> already gave up on it when they chose to use uio_pci_generic (either because
> they have no iommu, or because they need the extra performance).
> 
> Do we agree that security, in the sense of defense against a malicious
> attacker, is irrelevant for this conversation?

No. uio_pci_generic currently can be used in a secure way in
a sense that it's protected againt malicious attacker,
assuming you bind it to a device that does not do DMA.


> A secondary meaning is protection against inadvertent bugs.  Yes, a faulty
> memory write that happens to land in the msix page, can cause a random
> memory word to be overwritten.  But so can a faulty memory write into the
> rings, or the data structures that support virtual->physical translation,
> the data structures that describe the packets before translation, the memory
> allocator or pool.  The patch extends the vulnerable surface, but by a
> negligible amount.
> 
> >
> >So I was glad when it looked like there's finally an agreement that yes,
> >there's value in validating userspace input and yes, it's insecure
> >not to do this.
> 
> 
> 
> >
> >>It is good practice to defend against root oopsing the kernel, but in some
> >>cases it cannot be achieved.
> >I originally included ways to fix issues that I pointed out, ranging
> >from harder to implement with more overhead but more secure to easier to
> >implement with less overhead but less secure.  There didn't seem to be
> >an understanding that the issues are there at all, so I stopped doing
> >that - seemed like a waste of time.
> >
> >For example, will it kill your performance to reset devices cleanly, on
> >open and close,
> 
> I don't recall this being mentioned at all.

http://mid.gmane.org/20151006005527-mutt-send-email-mst@redhat.com

But really, this is just off the top of my head.
These are all issues VFIO developers encountered
and fixed over the years. Go into that code, read it,
and you will discover the issues and the solutions.

>  It seems completely unrelated
> to a patch adding msix support to uio_pci_generic.

It isn't unrelated. It's because with MSIX patch you are enabling bus
mastering in kernel.  So if you start device in a bad state it will
corrupt kernel memory.

> >  protect them from writes into MSI config, BAR registers
> >and related capablities etc etc?
> 
> Obviously the userspace driver has to write to the BAR area.
> 
> If you're talking about the BAR setup registers, yes there is some (tiny)
> value in that, but how is it related to this patch?

If you don't, moving BARs will move the MSI-X region and
protecting it won't help.

> Protecting the MSI area in the BARs _is_ related to the patch.  I agree it
> adds value, if small.
> 
> >   And if not, why are you people wasting
> >time arguing about that?
> 
> I you want to use your position as maintainer of uio_pci_generic to get
> people to overhaul the driver for you with unrelated changes, they will
> object.  I can understand a maintainer pointing out the right way to do
> something rather than the wrong way.  But piling on a list of unrelated
> features as prerequisites is, in my opinion, abuse.

I don't see them as unrelated.  Basically you want to turn
uio_pci_generic into vfio/pci except without an IOMMU.  You will need a
lot of VFIO code then.  That will need a lot of work.  You seem to blame
me for this but IMHO that's because patch author has chosen a wrong
approach.

> Let me repeat that pci_uio_generic is already used for userspace drivers,
> with all the issues that you point out, for a long while now. These issues
> are not exposed by the requirement to use msix.

I answered this already. I don't agree with this.

> You are not protecting the
> kernel in any way by blocking the patch, you are only protecting people with
> iommu-less configurations from using their hardware.

Because it's either this patch or nothing at all? I don't believe that.
Someone come along and write a better one.

> >   The only thing I heard is that it's a hassle.
> >That's true (though if you follow my advice and try to share code with
> >vfio/pci you get a lot of this logic for free).
> 
> My thinking was that vfio was for secure (in the "defense against malicious
> attackers" sense) while uio_pci_generic was, de-facto at least, for use by
> trusted users.

And some are using it in very broken ways. Yes. But now you want
to fix this in stone by tying a kernel/userspace interface
to their broken ways. I think that would be a mistake.

> We are in the strange situation that the Alex is open to adding an insecure
> mode to vfio, 

I don't find this strange. It seems to make sense. VFIO is
already used with DMA capable devices.

> while you object to a patch which does not change the security
> of uio_pci_generic in any way; it only makes it more usable at the cost of a
> tiny increase in the bug surface.

I don't agree with this either. This depends on the device.

> >   So it's an
> >understandable argument if you just need something that works, quickly.
> >But if it's such a stopgap hack, there's no need to insist on it
> >upstream.
> 
> It is not more or less a hack than uio_pci_generic allowing DMA,

It doesn't. sysfs does.

> or
> /dev/mem, or the module loading interface, or nommu kernels. Security is
> just one aspect of the kernel, not the only one.
>
> It's perfectly reasonable to taint the kernel when insecure DMA is enabled,
> and to allow the administrator to disable the interface completely.  What I
> don't understand is why, given that the user allows DMA, we should prevent
> them from using MSIX in addition.

There's no need to prevent MSIX with or without DMA.

But UIO uses sysfs for device access. So if we program MSIX we need to
extend sysfs to protect a ton of registers that are MSIX related from
the user, and do a bunch of setup and cleanup otherwise kernel will be
very confused.

It might be surprising to you how many registers are MSIX related,
but it's true.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  9:44                                     ` Avi Kivity
@ 2015-10-08 12:06                                       ` Michael S. Tsirkin
  2015-10-08 12:27                                         ` Gleb Natapov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08 12:06 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> 
> 
> On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> >>
> >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> >>>>It is good practice to defend against root oopsing the kernel, but in some
> >>>>cases it cannot be achieved.
> >>>Absolutely. That's one of the issues with these patches. They don't even
> >>>try where it's absolutely possible.
> >>>
> >>Are you referring to blocking the maps of the msix BAR areas?
> >For example. There are more. I listed some of the issues on the mailing
> >list, and I might have missed some.  VFIO has code to address all this,
> >people should share code to avoid duplication, or at least read it
> >to understand the issues.
> 
> All but one of those are unrelated to the patch that adds msix support.

They are related because msix support enables bus mastering.  Without it
device is passive and can't harm anyone. With it, suddently you need to
be very careful with the device to avoid corrupting kernel memory.

> >
> >>I think there is value in that.  The value is small because a
> >>corruption is more likely in the dynamic memory responsible for tens
> >>of millions of DMA operations per second, rather than a static 4K
> >>area, but it exists.
> >There are other bugs which will hurt e.g. each time application does not
> >exit gracefully.
> 
> uio_pci_generic disables DMA when the device is removed, so we're safe here,
> at least if files are released before the address space.

No, not really.

You seem to insist on *me* going into VFIO code, digging out
rationale for everything it does and then spelling it out.

If I do it just this once, will you then believe that maybe we don't
have to re-discover all issues and maybe all of VFIO/PCI code shouldn't
just be duplicated in UIO?

The rationale is that when you open the device next, kernel will enable
bus master and if device is in a bad state it might immediately start
doing DMA all over the place.  And it's on open so userspace doesn't
have the chance to bring it to a good state yet.

commit bc4fba77124e2fe4eb14bcb52875c0b0228deace
    vfio-pci: Attempt bus/slot reset on release
fwiw

> >
> >But well, heh :) That's precisely my feeling about the whole "running
> >userspace drivers without an IOMMU" project. The value is small
> >since modern hardware has fast IOMMUs, but it exists.
> >
> 
> For users that don't have iommus at all (usually because it is taken by the
> hypervisor),
> it has great value.

Isn't this what I said? Let me repeat:

	most of the problem you are trying to solve is for
	virtualization - and it is is better addressed at the hypervisor level.
	There are enough opensource hypervisors out there - work on IOMMU
	support there would be time well spent.

http://mid.gmane.org/20151007230553-mutt-send-email-mst@redhat.com


> I can't comment on iommu overhead; for my use case it is likely negligible
> and we will use an iommu when available; but apparently it matters for
> others.

You and Vlad are the only ones who brought this up.
So maybe you should not bring it up anymore.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08  9:45                                     ` Gleb Natapov
@ 2015-10-08 12:15                                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08 12:15 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 12:45:08PM +0300, Gleb Natapov wrote:
> On Thu, Oct 08, 2015 at 12:38:28PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Oct 08, 2015 at 10:59:10AM +0300, Gleb Natapov wrote:
> > > I do not remember this been an issue when uio_generic was accepted
> > > into the kernel. The reason was because it meant to be accessible by root
> > > only.
> > 
> > No - because it does not need bus mastering. So it can be used safely
> > with some devices.
> > 
> It still can be used safely with same devices.

This patch does not add any functionality that can be used safely.
And for no good reason except it's a hassle.

> Admittedly I did not look
> close, but I am sure the patch does not enable bus mastering if MSI
> interrupt is not requested. If not, well that can be fixed. But more
> importantly it can be used unsafely in its current state. Not only can,
> it is widely used so.
> 
> > [mst@robin linux]$ git grep pci_set_master|wc -l 533
> > [mst@robin linux]$ git grep pci_enable|wc -l 1597
> > 
> > Looks like about 2/3 devices don't need to be bus masters.
> > 
> > It's up to admin not to bind it to devices, and that is unfortunate,
> > but manually binding an incorrect driver to a device is generally
> > a hard problem to solve.
> > 
> > > > There's also drivers/vfio/virqfd.c which deals
> > > > with sending interrupts over eventfds correctly.
> > > > 
> > > As opposite to this patch that deals with them incorrectly? In what way?
> > 
> > cleanup on fd close is not handled.
> > 
> Have you commented about this on the patch and it was not fixed?

No - I only noticed this when I poked at VFIO to try and explain
why it's not a good idea to duplicate its code. I'm sure there
are more issues that we'll just have to re-discover the hard way
if we do try.

> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 12:06                                       ` Michael S. Tsirkin
@ 2015-10-08 12:27                                         ` Gleb Natapov
  2015-10-08 13:20                                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08 12:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 03:06:07PM +0300, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> > 
> > 
> > On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> > >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> > >>
> > >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> > >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > >>>>It is good practice to defend against root oopsing the kernel, but in some
> > >>>>cases it cannot be achieved.
> > >>>Absolutely. That's one of the issues with these patches. They don't even
> > >>>try where it's absolutely possible.
> > >>>
> > >>Are you referring to blocking the maps of the msix BAR areas?
> > >For example. There are more. I listed some of the issues on the mailing
> > >list, and I might have missed some.  VFIO has code to address all this,
> > >people should share code to avoid duplication, or at least read it
> > >to understand the issues.
> > 
> > All but one of those are unrelated to the patch that adds msix support.
> 
> They are related because msix support enables bus mastering.  Without it
> device is passive and can't harm anyone. With it, suddently you need to
> be very careful with the device to avoid corrupting kernel memory.
> 
Most (if not all) uio_pci_generic users enable pci bus mastering. The
fact that they do that without even tainting the kernel like the patch
does make current situation much worse that with the patch.

> > I can't comment on iommu overhead; for my use case it is likely negligible
> > and we will use an iommu when available; but apparently it matters for
> > others.
> 
> You and Vlad are the only ones who brought this up.
> So maybe you should not bring it up anymore.
> 
Common, you were CCed to at least this one:

 We have a solution that makes use of IOMMU support with vfio.  The 
 problem is there are multiple cases where that support is either not 
 available, or using the IOMMU provides excess overhead.


http://dpdk.org/ml/archives/dev/2015-October/024560.html

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 12:27                                         ` Gleb Natapov
@ 2015-10-08 13:20                                           ` Michael S. Tsirkin
  2015-10-08 13:28                                             ` Gleb Natapov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08 13:20 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 03:27:37PM +0300, Gleb Natapov wrote:
> On Thu, Oct 08, 2015 at 03:06:07PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> > > 
> > > 
> > > On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> > > >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> > > >>
> > > >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> > > >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > > >>>>It is good practice to defend against root oopsing the kernel, but in some
> > > >>>>cases it cannot be achieved.
> > > >>>Absolutely. That's one of the issues with these patches. They don't even
> > > >>>try where it's absolutely possible.
> > > >>>
> > > >>Are you referring to blocking the maps of the msix BAR areas?
> > > >For example. There are more. I listed some of the issues on the mailing
> > > >list, and I might have missed some.  VFIO has code to address all this,
> > > >people should share code to avoid duplication, or at least read it
> > > >to understand the issues.
> > > 
> > > All but one of those are unrelated to the patch that adds msix support.
> > 
> > They are related because msix support enables bus mastering.  Without it
> > device is passive and can't harm anyone. With it, suddently you need to
> > be very careful with the device to avoid corrupting kernel memory.
> > 
> Most (if not all) uio_pci_generic users enable pci bus mastering. The
> fact that they do that without even tainting the kernel like the patch
> does make current situation much worse that with the patch.

It isn't worse. It's a sane interface. Whoever enables bus mastering
must be careful.  If userspace enables bus mastering then userspace
needs to be very careful with the device to avoid corrupting kernel
memory.  If kernel does it, it's kernel's responsibility.

> > > I can't comment on iommu overhead; for my use case it is likely negligible
> > > and we will use an iommu when available; but apparently it matters for
> > > others.
> > 
> > You and Vlad are the only ones who brought this up.
> > So maybe you should not bring it up anymore.
> > 
> Common, you were CCed to at least this one:
> 
>  We have a solution that makes use of IOMMU support with vfio.  The 
>  problem is there are multiple cases where that support is either not 
>  available, or using the IOMMU provides excess overhead.
> 
> 
> http://dpdk.org/ml/archives/dev/2015-October/024560.html

Thanks for the correction.  I didn't notice that one, and I
misunderstood Avi's comment to mean it's just a theoretical case (taking
"apparently" to mean "maybe").  So someone else did bring it up, it's
not just Avi and Vlad.  I'm sorry, I take my comment back.  It might
help to mention "iommu overhead on pre ivy-bridge x86 systems" - that is
what this email seems to refer to.

> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 10:26                                   ` Michael S. Tsirkin
@ 2015-10-08 13:20                                     ` Avi Kivity
  2015-10-08 14:17                                       ` Michael S. Tsirkin
  2015-10-08 15:31                                       ` Alex Williamson
  0 siblings, 2 replies; 96+ messages in thread
From: Avi Kivity @ 2015-10-08 13:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On 10/08/2015 01:26 PM, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 12:19:20PM +0300, Avi Kivity wrote:
>>
>> On 10/08/2015 11:32 AM, Michael S. Tsirkin wrote:
>>> On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
>>>> On 08/10/15 00:05, Michael S. Tsirkin wrote:
>>>>> On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
>>>>>> That's what I thought as well, but apparently adding msix support to the
>>>>>> already insecure uio drivers is even worse.
>>>>> I'm glad you finally agree what these drivers are doing is insecure.
>>>>>
>>>>> And basically kernel cares about security, no one wants to maintain insecure stuff.
>>>>>
>>>>> So you guys should think harder whether this code makes any sense upstream.
>>>> You simply ignore everything I write, cherry-picking the word "insecure" as
>>>> if it makes your point.  That is very frustrating.
>>> And I'm sorry about the frustration.  I didn't intend to twist your
>>> words. It's just that I had to spend literally hours trying to explain
>>> that security matters in kernel, and all I was getting back was a
>>> summary "there's no security issue because there are other way to
>>> corrupt memory".
>> The word security has several meanings.  The primary meaning is "defense
>> against a malicious attacker".  In that sense, there is no added value at
>> all, because the attacker is already root, and can already access all of
>> kernel and user memory.  Even if the attacker is not root, and just has
>> access to a non-iommu-protected device, they can still DMA to and from any
>> memory they like.
>>
>> This sense of the word however is irrelevant for this conversation; the user
>> already gave up on it when they chose to use uio_pci_generic (either because
>> they have no iommu, or because they need the extra performance).
>>
>> Do we agree that security, in the sense of defense against a malicious
>> attacker, is irrelevant for this conversation?
> No. uio_pci_generic currently can be used in a secure way in
> a sense that it's protected againt malicious attacker,
> assuming you bind it to a device that does not do DMA.

The context of the conversation is dpdk, which only supports DMA.

Do we agree that security, in the sense of defense against a malicious 
attacker, is irrelevant for this conversation, taking this under 
consideration?


>
>> A secondary meaning is protection against inadvertent bugs.  Yes, a faulty
>> memory write that happens to land in the msix page, can cause a random
>> memory word to be overwritten.  But so can a faulty memory write into the
>> rings, or the data structures that support virtual->physical translation,
>> the data structures that describe the packets before translation, the memory
>> allocator or pool.  The patch extends the vulnerable surface, but by a
>> negligible amount.
>>
>>> So I was glad when it looked like there's finally an agreement that yes,
>>> there's value in validating userspace input and yes, it's insecure
>>> not to do this.
>>
>>
>>>> It is good practice to defend against root oopsing the kernel, but in some
>>>> cases it cannot be achieved.
>>> I originally included ways to fix issues that I pointed out, ranging
>> >from harder to implement with more overhead but more secure to easier to
>>> implement with less overhead but less secure.  There didn't seem to be
>>> an understanding that the issues are there at all, so I stopped doing
>>> that - seemed like a waste of time.
>>>
>>> For example, will it kill your performance to reset devices cleanly, on
>>> open and close,
>> I don't recall this being mentioned at all.
> http://mid.gmane.org/20151006005527-mutt-send-email-mst@redhat.com

Down at the moment for me.

> But really, this is just off the top of my head.
> These are all issues VFIO developers encountered
> and fixed over the years. Go into that code, read it,
> and you will discover the issues and the solutions.

vfio is solving a different problem, the problem of security against a 
malicious attacker, one that I'm hoping to agree here that we aren't 
attempting to solve.

People have been happily using uio_pci_generic despite all those 
issues.  All they were missing was msix support.  You can't use that to 
force them to overhaul that driver, or to add a new subsystem to vfio.

>
>>   It seems completely unrelated
>> to a patch adding msix support to uio_pci_generic.
> It isn't unrelated. It's because with MSIX patch you are enabling bus
> mastering in kernel.  So if you start device in a bad state it will
> corrupt kernel memory.

You are right, this patch can regress secure users.

I'd be surprised if there are msix-capable pci devices that do not rely 
on DMA, though.

>
>>>   protect them from writes into MSI config, BAR registers
>>> and related capablities etc etc?
>> Obviously the userspace driver has to write to the BAR area.
>>
>> If you're talking about the BAR setup registers, yes there is some (tiny)
>> value in that, but how is it related to this patch?
> If you don't, moving BARs will move the MSI-X region and
> protecting it won't help.

Won't it just become invisible if you do?

If userspace starts playing with BARs, you lost already, whether msix is 
enabled or not doesn't matter.  It can shadow other BARs, for example.

This is a general weakness of uio_pci_generic, not something exposed by 
this patch.

>
>> Protecting the MSI area in the BARs _is_ related to the patch.  I agree it
>> adds value, if small.
>>
>>>    And if not, why are you people wasting
>>> time arguing about that?
>> I you want to use your position as maintainer of uio_pci_generic to get
>> people to overhaul the driver for you with unrelated changes, they will
>> object.  I can understand a maintainer pointing out the right way to do
>> something rather than the wrong way.  But piling on a list of unrelated
>> features as prerequisites is, in my opinion, abuse.
> I don't see them as unrelated.  Basically you want to turn
> uio_pci_generic into vfio/pci except without an IOMMU.

That is not what we want.  Simply adding msix support is sufficient for 
us.  Everything else was piled on afterwards.

> You will need a
> lot of VFIO code then.  That will need a lot of work.  You seem to blame
> me for this but IMHO that's because patch author has chosen a wrong
> approach.
>
>> Let me repeat that pci_uio_generic is already used for userspace drivers,
>> with all the issues that you point out, for a long while now. These issues
>> are not exposed by the requirement to use msix.
> I answered this already. I don't agree with this.

With the first sentence or the second?  I'm trying really hard to 
understand the problem.

>
>> You are not protecting the
>> kernel in any way by blocking the patch, you are only protecting people with
>> iommu-less configurations from using their hardware.
> Because it's either this patch or nothing at all? I don't believe that.
> Someone come along and write a better one.

 From your description, I can't imagine what the better patch looks 
like, except as a complete overhaul of uio_pci_generic.

Our requirement is to enable msix, not to pretend that it is secure 
while it allows DMA to any piece of memory in the system.

>
>>>    The only thing I heard is that it's a hassle.
>>> That's true (though if you follow my advice and try to share code with
>>> vfio/pci you get a lot of this logic for free).
>> My thinking was that vfio was for secure (in the "defense against malicious
>> attackers" sense) while uio_pci_generic was, de-facto at least, for use by
>> trusted users.
> And some are using it in very broken ways. Yes. But now you want
> to fix this in stone by tying a kernel/userspace interface
> to their broken ways. I think that would be a mistake.

At heart, the brokenness here is that you allow insecure DMA. No amount 
of changes will fix this.

We need an interface for users that are prepared to give up kernel/user 
protection, either because they have no other choice, or because they 
have performance requirements that mandate it. What extra value does 
protecting the BARs against movement add? Nothing.

>
>> We are in the strange situation that the Alex is open to adding an insecure
>> mode to vfio,
> I don't find this strange. It seems to make sense. VFIO is
> already used with DMA capable devices.

It's strange to me because it's charter was for iommu-protected device 
assignment, while uio_pci_generic is for generic pci userspace.


>
>> while you object to a patch which does not change the security
>> of uio_pci_generic in any way; it only makes it more usable at the cost of a
>> tiny increase in the bug surface.
> I don't agree with this either. This depends on the device.
>
>>>    So it's an
>>> understandable argument if you just need something that works, quickly.
>>> But if it's such a stopgap hack, there's no need to insist on it
>>> upstream.
>> It is not more or less a hack than uio_pci_generic allowing DMA,
> It doesn't. sysfs does.
>
>> or
>> /dev/mem, or the module loading interface, or nommu kernels. Security is
>> just one aspect of the kernel, not the only one.
>>
>> It's perfectly reasonable to taint the kernel when insecure DMA is enabled,
>> and to allow the administrator to disable the interface completely.  What I
>> don't understand is why, given that the user allows DMA, we should prevent
>> them from using MSIX in addition.
> There's no need to prevent MSIX with or without DMA.
>
> But UIO uses sysfs for device access. So if we program MSIX we need to
> extend sysfs to protect a ton of registers that are MSIX related from
> the user, and do a bunch of setup and cleanup otherwise kernel will be
> very confused.
>
> It might be surprising to you how many registers are MSIX related,
> but it's true.
>

Userspace can already do all of these things, confusing the kernel. It 
simply doesn't, which is why everything works.  It need only continue 
not to do so for everything to continue to work.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 13:20                                           ` Michael S. Tsirkin
@ 2015-10-08 13:28                                             ` Gleb Natapov
  2015-10-08 16:43                                               ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08 13:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 04:20:04PM +0300, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 03:27:37PM +0300, Gleb Natapov wrote:
> > On Thu, Oct 08, 2015 at 03:06:07PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> > > > 
> > > > 
> > > > On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> > > > >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> > > > >>
> > > > >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> > > > >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > > > >>>>It is good practice to defend against root oopsing the kernel, but in some
> > > > >>>>cases it cannot be achieved.
> > > > >>>Absolutely. That's one of the issues with these patches. They don't even
> > > > >>>try where it's absolutely possible.
> > > > >>>
> > > > >>Are you referring to blocking the maps of the msix BAR areas?
> > > > >For example. There are more. I listed some of the issues on the mailing
> > > > >list, and I might have missed some.  VFIO has code to address all this,
> > > > >people should share code to avoid duplication, or at least read it
> > > > >to understand the issues.
> > > > 
> > > > All but one of those are unrelated to the patch that adds msix support.
> > > 
> > > They are related because msix support enables bus mastering.  Without it
> > > device is passive and can't harm anyone. With it, suddently you need to
> > > be very careful with the device to avoid corrupting kernel memory.
> > > 
> > Most (if not all) uio_pci_generic users enable pci bus mastering. The
> > fact that they do that without even tainting the kernel like the patch
> > does make current situation much worse that with the patch.
> 
> It isn't worse. It's a sane interface. Whoever enables bus mastering
> must be careful.  If userspace enables bus mastering then userspace
> needs to be very careful with the device to avoid corrupting kernel
> memory.  If kernel does it, it's kernel's responsibility.
> 
Although this definition of sanity sounds strange to me, but lets
flow with it for the sake of this email: would it be OK if proposed
interface refused to work if bus mastering is not already enabled by
userspace?
 
--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 13:20                                     ` Avi Kivity
@ 2015-10-08 14:17                                       ` Michael S. Tsirkin
  2015-10-08 15:31                                       ` Alex Williamson
  1 sibling, 0 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08 14:17 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Alex Williamson, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Thu, Oct 08, 2015 at 04:20:12PM +0300, Avi Kivity wrote:
> On 10/08/2015 01:26 PM, Michael S. Tsirkin wrote:
> >On Thu, Oct 08, 2015 at 12:19:20PM +0300, Avi Kivity wrote:
> >>
> >>On 10/08/2015 11:32 AM, Michael S. Tsirkin wrote:
> >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> >>>>On 08/10/15 00:05, Michael S. Tsirkin wrote:
> >>>>>On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote:
> >>>>>>That's what I thought as well, but apparently adding msix support to the
> >>>>>>already insecure uio drivers is even worse.
> >>>>>I'm glad you finally agree what these drivers are doing is insecure.
> >>>>>
> >>>>>And basically kernel cares about security, no one wants to maintain insecure stuff.
> >>>>>
> >>>>>So you guys should think harder whether this code makes any sense upstream.
> >>>>You simply ignore everything I write, cherry-picking the word "insecure" as
> >>>>if it makes your point.  That is very frustrating.
> >>>And I'm sorry about the frustration.  I didn't intend to twist your
> >>>words. It's just that I had to spend literally hours trying to explain
> >>>that security matters in kernel, and all I was getting back was a
> >>>summary "there's no security issue because there are other way to
> >>>corrupt memory".
> >>The word security has several meanings.  The primary meaning is "defense
> >>against a malicious attacker".  In that sense, there is no added value at
> >>all, because the attacker is already root, and can already access all of
> >>kernel and user memory.  Even if the attacker is not root, and just has
> >>access to a non-iommu-protected device, they can still DMA to and from any
> >>memory they like.
> >>
> >>This sense of the word however is irrelevant for this conversation; the user
> >>already gave up on it when they chose to use uio_pci_generic (either because
> >>they have no iommu, or because they need the extra performance).
> >>
> >>Do we agree that security, in the sense of defense against a malicious
> >>attacker, is irrelevant for this conversation?
> >No. uio_pci_generic currently can be used in a secure way in
> >a sense that it's protected againt malicious attacker,
> >assuming you bind it to a device that does not do DMA.
> 
> The context of the conversation is dpdk, which only supports DMA.
> 
> Do we agree that security, in the sense of defense against a malicious
> attacker, is irrelevant for this conversation, taking this under
> consideration?

DPDK has a mode which they call UIO which seems to require people to
disable security.  I agree to that.  It's unfortunate that naming it UIO
gives the whole infrastructure a bad name for security.

But for upstreaming purposes, this doesn't matter:
we don't merge single use interfaces into Linux,
and I do care about interfaces of the code I maintain
being useful in a secure way.

> >
> >>A secondary meaning is protection against inadvertent bugs.  Yes, a faulty
> >>memory write that happens to land in the msix page, can cause a random
> >>memory word to be overwritten.  But so can a faulty memory write into the
> >>rings, or the data structures that support virtual->physical translation,
> >>the data structures that describe the packets before translation, the memory
> >>allocator or pool.  The patch extends the vulnerable surface, but by a
> >>negligible amount.
> >>
> >>>So I was glad when it looked like there's finally an agreement that yes,
> >>>there's value in validating userspace input and yes, it's insecure
> >>>not to do this.
> >>
> >>
> >>>>It is good practice to defend against root oopsing the kernel, but in some
> >>>>cases it cannot be achieved.
> >>>I originally included ways to fix issues that I pointed out, ranging
> >>>from harder to implement with more overhead but more secure to easier to
> >>>implement with less overhead but less secure.  There didn't seem to be
> >>>an understanding that the issues are there at all, so I stopped doing
> >>>that - seemed like a waste of time.
> >>>
> >>>For example, will it kill your performance to reset devices cleanly, on
> >>>open and close,
> >>I don't recall this being mentioned at all.
> >http://mid.gmane.org/20151006005527-mutt-send-email-mst@redhat.com
> 
> Down at the moment for me.

Grep this discussion for "reset", you will find it.

> >But really, this is just off the top of my head.
> >These are all issues VFIO developers encountered
> >and fixed over the years. Go into that code, read it,
> >and you will discover the issues and the solutions.
> 
> vfio is solving a different problem,

It's solving a bunch of problems, including various hardware quirks.

> the problem of security against a
> malicious attacker, one that I'm hoping to agree here that we aren't
> attempting to solve.

You aren't but you should.  uio_pci_generic does do it - though using
legacy interrupts as it does it didn't need to do a lot. But it does
check e.g. interrupt mask support, and doesn't just rely on userspace to
DTRT.

> People have been happily using uio_pci_generic despite all those issues.
> All they were missing was msix support.  You can't use that to force them to
> overhaul that driver, or to add a new subsystem to vfio.

By the way, there's a very simple way to make generic UIO useful on VFs.
Don't enable bus mastering, trigger a timer once a while.


> >
> >>  It seems completely unrelated
> >>to a patch adding msix support to uio_pci_generic.
> >It isn't unrelated. It's because with MSIX patch you are enabling bus
> >mastering in kernel.  So if you start device in a bad state it will
> >corrupt kernel memory.
> 
> You are right, this patch can regress secure users.
> 
> I'd be surprised if there are msix-capable pci devices that do not rely on
> DMA, though.

I would not be surprised at all. The PCI Express specification says:
	MSI/MSI-X interrupt support, which is optional for PCI 3.0 devices, is
	required for PCI Express devices.

> >
> >>>  protect them from writes into MSI config, BAR registers
> >>>and related capablities etc etc?
> >>Obviously the userspace driver has to write to the BAR area.
> >>
> >>If you're talking about the BAR setup registers, yes there is some (tiny)
> >>value in that, but how is it related to this patch?
> >If you don't, moving BARs will move the MSI-X region and
> >protecting it won't help.
> 
> Won't it just become invisible if you do?
> 
> If userspace starts playing with BARs, you lost already, whether msix is
> enabled or not doesn't matter.  It can shadow other BARs, for example.

Of the same device? Sure, but then you only break this device.

At least for PCI Express devices are all behind bridges so
they can't shadow each other BARs.

And for VFs, they don't shadow each other BARs IIRC.


> This is a general weakness of uio_pci_generic, not something exposed by this
> patch.

Patch enables MSIX. Thus the need to protect the MSI-X region. To protect it
you need to make sure it does not move around :)

> >
> >>Protecting the MSI area in the BARs _is_ related to the patch.  I agree it
> >>adds value, if small.
> >>
> >>>   And if not, why are you people wasting
> >>>time arguing about that?
> >>I you want to use your position as maintainer of uio_pci_generic to get
> >>people to overhaul the driver for you with unrelated changes, they will
> >>object.  I can understand a maintainer pointing out the right way to do
> >>something rather than the wrong way.  But piling on a list of unrelated
> >>features as prerequisites is, in my opinion, abuse.
> >I don't see them as unrelated.  Basically you want to turn
> >uio_pci_generic into vfio/pci except without an IOMMU.
> 
> That is not what we want.  Simply adding msix support is sufficient for us.
> Everything else was piled on afterwards.

I'm not stopping you in any way. It's just not the kind of half-baked
interface we should include and support in the upstream kernel, IMHO.

> >You will need a
> >lot of VFIO code then.  That will need a lot of work.  You seem to blame
> >me for this but IMHO that's because patch author has chosen a wrong
> >approach.
> >
> >>Let me repeat that pci_uio_generic is already used for userspace drivers,
> >>with all the issues that you point out, for a long while now. These issues
> >>are not exposed by the requirement to use msix.
> >I answered this already. I don't agree with this.
> 
> With the first sentence or the second?  I'm trying really hard to understand
> the problem.

Enabling msix in kernel exposes additinonal issues.

> >
> >>You are not protecting the
> >>kernel in any way by blocking the patch, you are only protecting people with
> >>iommu-less configurations from using their hardware.
> >Because it's either this patch or nothing at all? I don't believe that.
> >Someone come along and write a better one.
> 
> From your description, I can't imagine what the better patch looks like,
> except as a complete overhaul of uio_pci_generic.

I posted several suggestions over the last several days.
I agree it would be a large change to uio_pci_generic, so
I think a vfio extension would make more sense.

> Our requirement is to enable msix, not to pretend that it is secure while it
> allows DMA to any piece of memory in the system.
> 
> >
> >>>   The only thing I heard is that it's a hassle.
> >>>That's true (though if you follow my advice and try to share code with
> >>>vfio/pci you get a lot of this logic for free).
> >>My thinking was that vfio was for secure (in the "defense against malicious
> >>attackers" sense) while uio_pci_generic was, de-facto at least, for use by
> >>trusted users.
> >And some are using it in very broken ways. Yes. But now you want
> >to fix this in stone by tying a kernel/userspace interface
> >to their broken ways. I think that would be a mistake.
> 
> At heart, the brokenness here is that you allow insecure DMA. No amount of
> changes will fix this.
>
> We need an interface for users that are prepared to give up kernel/user
> protection, either because they have no other choice, or because they have
> performance requirements that mandate it. What extra value does protecting
> the BARs against movement add? Nothing.

Sure. But I am not talking about this usecase. It's an unsupportable
mess, it does not matter for upstream API discussion.  Yea, performance.
But where would you stop? Will you next ask me to merge code that
accesses userspace pointers without checking, because performance? I can
easily see DPDK finding a way to make device DMA into
current_thread_info()->flags to wake a CPU that does monitor on it,
instead of an interrupt, because performance.  Voila, we now have to
keep struct task layout stable because it's part of userspace ABI.  And
so on.

So yes, there's userspace doing crazy things, and we don't want
to break it, but we need to consider the needs of a non-crazy
userspace when we build APIs.


> >
> >>We are in the strange situation that the Alex is open to adding an insecure
> >>mode to vfio,
> >I don't find this strange. It seems to make sense. VFIO is
> >already used with DMA capable devices.
> 
> It's strange to me because it's charter was for iommu-protected device
> assignment, while uio_pci_generic is for generic pci userspace.

I don't know where is the VFIO charter, but you can find the UIO charter
under Documentation. DPDK is not using it as designed.

> 
> >
> >>while you object to a patch which does not change the security
> >>of uio_pci_generic in any way; it only makes it more usable at the cost of a
> >>tiny increase in the bug surface.
> >I don't agree with this either. This depends on the device.
> >
> >>>   So it's an
> >>>understandable argument if you just need something that works, quickly.
> >>>But if it's such a stopgap hack, there's no need to insist on it
> >>>upstream.
> >>It is not more or less a hack than uio_pci_generic allowing DMA,
> >It doesn't. sysfs does.
> >
> >>or
> >>/dev/mem, or the module loading interface, or nommu kernels. Security is
> >>just one aspect of the kernel, not the only one.
> >>
> >>It's perfectly reasonable to taint the kernel when insecure DMA is enabled,
> >>and to allow the administrator to disable the interface completely.  What I
> >>don't understand is why, given that the user allows DMA, we should prevent
> >>them from using MSIX in addition.
> >There's no need to prevent MSIX with or without DMA.
> >
> >But UIO uses sysfs for device access. So if we program MSIX we need to
> >extend sysfs to protect a ton of registers that are MSIX related from
> >the user, and do a bunch of setup and cleanup otherwise kernel will be
> >very confused.
> >
> >It might be surprising to you how many registers are MSIX related,
> >but it's true.
> >
> 
> Userspace can already do all of these things, confusing the kernel. It
> simply doesn't, which is why everything works.  It need only continue not to
> do so for everything to continue to work.

You make it sound as if you are enabling existing APIs for new hardware.
That's not what these patches do, it's a new API you are building,
and new drivers will have to be written to use it. And the API
as defined here has subtle gotchas.  VFIO just solved most of them
already.

Instead of all this, I have a simple suggestion.
UIO spec says:

        For cards that don't generate interrupts but need to be
        polled, there is the possibility to set up a timer that
        triggers the interrupt handler at configurable time intervals.

add code to set this up in uio_pci_generic if there's no interrupt, and
wake userspace.  This will use existing code.  This won't help the new
DPDK interrupt mode (from June 2015), but it helps use it on existing
systems with no new interfaces.

-- 
MST

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 13:20                                     ` Avi Kivity
  2015-10-08 14:17                                       ` Michael S. Tsirkin
@ 2015-10-08 15:31                                       ` Alex Williamson
  1 sibling, 0 replies; 96+ messages in thread
From: Alex Williamson @ 2015-10-08 15:31 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Michael S. Tsirkin, Vlad Zolotarov, Greg KH, linux-kernel, hjk,
	corbet, bruce.richardson, avi, gleb, stephen, alexander.duyck

On Thu, 2015-10-08 at 16:20 +0300, Avi Kivity wrote:
> On 10/08/2015 01:26 PM, Michael S. Tsirkin wrote:
> > On Thu, Oct 08, 2015 at 12:19:20PM +0300, Avi Kivity wrote:
> >> We are in the strange situation that the Alex is open to adding an insecure
> >> mode to vfio,
> > I don't find this strange. It seems to make sense. VFIO is
> > already used with DMA capable devices.
> 
> It's strange to me because it's charter was for iommu-protected device 
> assignment, while uio_pci_generic is for generic pci userspace.

To be clear, I'm not necessarily advocating an insecure mode of vfio,
I'm pointing out that vfio is built on the security, isolation, and
services advertised by the iommu layer.  That layer doesn't exist in a
no-iommu system, but a stub iommu driver that disregards the intended
purpose of iommu groups and implements those services could likely fool
vfio into working.  From a code re-use standpoint, there are some clear
advantages to doing that even though it's rather dastardly at the iommu
level.  There's not too much I can do to prevent such a thing, vfio has
to trust someone and in this case it's the core kernel iommu services.
So if such a task was attempted, I'd want to be involved and enlighten
vfio at least to the point where we can make it clear to users which
uses are secure and which are not.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 13:28                                             ` Gleb Natapov
@ 2015-10-08 16:43                                               ` Michael S. Tsirkin
  2015-10-08 17:01                                                 ` Gleb Natapov
  0 siblings, 1 reply; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08 16:43 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 04:28:34PM +0300, Gleb Natapov wrote:
> On Thu, Oct 08, 2015 at 04:20:04PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Oct 08, 2015 at 03:27:37PM +0300, Gleb Natapov wrote:
> > > On Thu, Oct 08, 2015 at 03:06:07PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> > > > > 
> > > > > 
> > > > > On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> > > > > >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> > > > > >>
> > > > > >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> > > > > >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > > > > >>>>It is good practice to defend against root oopsing the kernel, but in some
> > > > > >>>>cases it cannot be achieved.
> > > > > >>>Absolutely. That's one of the issues with these patches. They don't even
> > > > > >>>try where it's absolutely possible.
> > > > > >>>
> > > > > >>Are you referring to blocking the maps of the msix BAR areas?
> > > > > >For example. There are more. I listed some of the issues on the mailing
> > > > > >list, and I might have missed some.  VFIO has code to address all this,
> > > > > >people should share code to avoid duplication, or at least read it
> > > > > >to understand the issues.
> > > > > 
> > > > > All but one of those are unrelated to the patch that adds msix support.
> > > > 
> > > > They are related because msix support enables bus mastering.  Without it
> > > > device is passive and can't harm anyone. With it, suddently you need to
> > > > be very careful with the device to avoid corrupting kernel memory.
> > > > 
> > > Most (if not all) uio_pci_generic users enable pci bus mastering. The
> > > fact that they do that without even tainting the kernel like the patch
> > > does make current situation much worse that with the patch.
> > 
> > It isn't worse. It's a sane interface. Whoever enables bus mastering
> > must be careful.  If userspace enables bus mastering then userspace
> > needs to be very careful with the device to avoid corrupting kernel
> > memory.  If kernel does it, it's kernel's responsibility.
> > 
> Although this definition of sanity sounds strange to me, but lets
> flow with it for the sake of this email: would it be OK if proposed
> interface refused to work if bus mastering is not already enabled by
> userspace?

An interface could be acceptable if there's a fallback where it
works without BM but slower (e.g. poll pending bits).

But not the proposed one.

Really, there's more to making msi-x work with
userspace drivers than this patch. As I keep telling people, you would
basically reimplement vfio/pci. Go over it, and see for yourself.
Almost everything it does is relevant for msi-x.  It's just wrong to
duplicate so much code.


> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 16:43                                               ` Michael S. Tsirkin
@ 2015-10-08 17:01                                                 ` Gleb Natapov
  2015-10-08 17:39                                                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08 17:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 07:43:04PM +0300, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 04:28:34PM +0300, Gleb Natapov wrote:
> > On Thu, Oct 08, 2015 at 04:20:04PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Oct 08, 2015 at 03:27:37PM +0300, Gleb Natapov wrote:
> > > > On Thu, Oct 08, 2015 at 03:06:07PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> > > > > > 
> > > > > > 
> > > > > > On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> > > > > > >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> > > > > > >>
> > > > > > >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> > > > > > >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > > > > > >>>>It is good practice to defend against root oopsing the kernel, but in some
> > > > > > >>>>cases it cannot be achieved.
> > > > > > >>>Absolutely. That's one of the issues with these patches. They don't even
> > > > > > >>>try where it's absolutely possible.
> > > > > > >>>
> > > > > > >>Are you referring to blocking the maps of the msix BAR areas?
> > > > > > >For example. There are more. I listed some of the issues on the mailing
> > > > > > >list, and I might have missed some.  VFIO has code to address all this,
> > > > > > >people should share code to avoid duplication, or at least read it
> > > > > > >to understand the issues.
> > > > > > 
> > > > > > All but one of those are unrelated to the patch that adds msix support.
> > > > > 
> > > > > They are related because msix support enables bus mastering.  Without it
> > > > > device is passive and can't harm anyone. With it, suddently you need to
> > > > > be very careful with the device to avoid corrupting kernel memory.
> > > > > 
> > > > Most (if not all) uio_pci_generic users enable pci bus mastering. The
> > > > fact that they do that without even tainting the kernel like the patch
> > > > does make current situation much worse that with the patch.
> > > 
> > > It isn't worse. It's a sane interface. Whoever enables bus mastering
> > > must be careful.  If userspace enables bus mastering then userspace
> > > needs to be very careful with the device to avoid corrupting kernel
> > > memory.  If kernel does it, it's kernel's responsibility.
> > > 
> > Although this definition of sanity sounds strange to me, but lets
> > flow with it for the sake of this email: would it be OK if proposed
> > interface refused to work if bus mastering is not already enabled by
> > userspace?
> 
> An interface could be acceptable if there's a fallback where it
> works without BM but slower (e.g. poll pending bits).
> 
OK.

> But not the proposed one.
>
Why? Greg is against ioctl interface so it will be reworked, by besides
that what is wrong with the concept of binding msi-x interrupt to
eventfd?
 
> Really, there's more to making msi-x work with
> userspace drivers than this patch. As I keep telling people, you would
> basically reimplement vfio/pci. Go over it, and see for yourself.
> Almost everything it does is relevant for msi-x.  It's just wrong to
> duplicate so much code.
> 
The patch is tested and works with msi-x. Restricting access to msi-x
registers that vfio does is not relevant here.

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 17:01                                                 ` Gleb Natapov
@ 2015-10-08 17:39                                                   ` Michael S. Tsirkin
  2015-10-08 17:53                                                     ` Gleb Natapov
  2015-10-08 18:38                                                     ` Greg KH
  0 siblings, 2 replies; 96+ messages in thread
From: Michael S. Tsirkin @ 2015-10-08 17:39 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 08:01:21PM +0300, Gleb Natapov wrote:
> On Thu, Oct 08, 2015 at 07:43:04PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Oct 08, 2015 at 04:28:34PM +0300, Gleb Natapov wrote:
> > > On Thu, Oct 08, 2015 at 04:20:04PM +0300, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 08, 2015 at 03:27:37PM +0300, Gleb Natapov wrote:
> > > > > On Thu, Oct 08, 2015 at 03:06:07PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> > > > > > > >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> > > > > > > >>
> > > > > > > >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> > > > > > > >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > > > > > > >>>>It is good practice to defend against root oopsing the kernel, but in some
> > > > > > > >>>>cases it cannot be achieved.
> > > > > > > >>>Absolutely. That's one of the issues with these patches. They don't even
> > > > > > > >>>try where it's absolutely possible.
> > > > > > > >>>
> > > > > > > >>Are you referring to blocking the maps of the msix BAR areas?
> > > > > > > >For example. There are more. I listed some of the issues on the mailing
> > > > > > > >list, and I might have missed some.  VFIO has code to address all this,
> > > > > > > >people should share code to avoid duplication, or at least read it
> > > > > > > >to understand the issues.
> > > > > > > 
> > > > > > > All but one of those are unrelated to the patch that adds msix support.
> > > > > > 
> > > > > > They are related because msix support enables bus mastering.  Without it
> > > > > > device is passive and can't harm anyone. With it, suddently you need to
> > > > > > be very careful with the device to avoid corrupting kernel memory.
> > > > > > 
> > > > > Most (if not all) uio_pci_generic users enable pci bus mastering. The
> > > > > fact that they do that without even tainting the kernel like the patch
> > > > > does make current situation much worse that with the patch.
> > > > 
> > > > It isn't worse. It's a sane interface. Whoever enables bus mastering
> > > > must be careful.  If userspace enables bus mastering then userspace
> > > > needs to be very careful with the device to avoid corrupting kernel
> > > > memory.  If kernel does it, it's kernel's responsibility.
> > > > 
> > > Although this definition of sanity sounds strange to me, but lets
> > > flow with it for the sake of this email: would it be OK if proposed
> > > interface refused to work if bus mastering is not already enabled by
> > > userspace?
> > 
> > An interface could be acceptable if there's a fallback where it
> > works without BM but slower (e.g. poll pending bits).
> > 
> OK.
> 
> > But not the proposed one.
> >
> Why? Greg is against ioctl interface so it will be reworked, by besides
> that what is wrong with the concept of binding msi-x interrupt to
> eventfd?

It's not the binding. Managing msi-x just needs more than the puny
2 ioctls to get # of vectors and set eventfd.

It interacts in strange ways with reset, and with PM, and ...

> > Really, there's more to making msi-x work with
> > userspace drivers than this patch. As I keep telling people, you would
> > basically reimplement vfio/pci. Go over it, and see for yourself.
> > Almost everything it does is relevant for msi-x.  It's just wrong to
> > duplicate so much code.
> > 
> The patch is tested and works with msi-x. Restricting access to msi-x
> registers that vfio does is not relevant here.

It works *for you* with a specific userspace application. I have no idea
how you tested it, and what does the userspace in question do.  But it
seems pretty clear that there are a ton of very reasonable things that
one can do with a device and that break when you enable MSI-X.

You need to find a way to share that logic with vfio/pci.

> --
> 			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 17:39                                                   ` Michael S. Tsirkin
@ 2015-10-08 17:53                                                     ` Gleb Natapov
  2015-10-08 18:38                                                     ` Greg KH
  1 sibling, 0 replies; 96+ messages in thread
From: Gleb Natapov @ 2015-10-08 17:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Alex Williamson, Vlad Zolotarov, Greg KH,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 08:39:10PM +0300, Michael S. Tsirkin wrote:
> On Thu, Oct 08, 2015 at 08:01:21PM +0300, Gleb Natapov wrote:
> > On Thu, Oct 08, 2015 at 07:43:04PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Oct 08, 2015 at 04:28:34PM +0300, Gleb Natapov wrote:
> > > > On Thu, Oct 08, 2015 at 04:20:04PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Oct 08, 2015 at 03:27:37PM +0300, Gleb Natapov wrote:
> > > > > > On Thu, Oct 08, 2015 at 03:06:07PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Oct 08, 2015 at 12:44:09PM +0300, Avi Kivity wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote:
> > > > > > > > >On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote:
> > > > > > > > >>
> > > > > > > > >>On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote:
> > > > > > > > >>>On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote:
> > > > > > > > >>>>It is good practice to defend against root oopsing the kernel, but in some
> > > > > > > > >>>>cases it cannot be achieved.
> > > > > > > > >>>Absolutely. That's one of the issues with these patches. They don't even
> > > > > > > > >>>try where it's absolutely possible.
> > > > > > > > >>>
> > > > > > > > >>Are you referring to blocking the maps of the msix BAR areas?
> > > > > > > > >For example. There are more. I listed some of the issues on the mailing
> > > > > > > > >list, and I might have missed some.  VFIO has code to address all this,
> > > > > > > > >people should share code to avoid duplication, or at least read it
> > > > > > > > >to understand the issues.
> > > > > > > > 
> > > > > > > > All but one of those are unrelated to the patch that adds msix support.
> > > > > > > 
> > > > > > > They are related because msix support enables bus mastering.  Without it
> > > > > > > device is passive and can't harm anyone. With it, suddently you need to
> > > > > > > be very careful with the device to avoid corrupting kernel memory.
> > > > > > > 
> > > > > > Most (if not all) uio_pci_generic users enable pci bus mastering. The
> > > > > > fact that they do that without even tainting the kernel like the patch
> > > > > > does make current situation much worse that with the patch.
> > > > > 
> > > > > It isn't worse. It's a sane interface. Whoever enables bus mastering
> > > > > must be careful.  If userspace enables bus mastering then userspace
> > > > > needs to be very careful with the device to avoid corrupting kernel
> > > > > memory.  If kernel does it, it's kernel's responsibility.
> > > > > 
> > > > Although this definition of sanity sounds strange to me, but lets
> > > > flow with it for the sake of this email: would it be OK if proposed
> > > > interface refused to work if bus mastering is not already enabled by
> > > > userspace?
> > > 
> > > An interface could be acceptable if there's a fallback where it
> > > works without BM but slower (e.g. poll pending bits).
> > > 
> > OK.
> > 
> > > But not the proposed one.
> > >
> > Why? Greg is against ioctl interface so it will be reworked, by besides
> > that what is wrong with the concept of binding msi-x interrupt to
> > eventfd?
> 
> It's not the binding. Managing msi-x just needs more than the puny
> 2 ioctls to get # of vectors and set eventfd.
> 
> It interacts in strange ways with reset, and with PM, and ...
> 
Sorry, I need examples of what you mean. DMA also "interacts in strange
ways with reset, and with PM, and ..." and it does not have any special
handling anywhere in uio-generic. So what special properties msi-x posses
which are not part of a dma. We already agreed that if enabling of bus
mastering is done by userspace all the responsibilities pertaining to
it are also lay in userspace.

> > > Really, there's more to making msi-x work with
> > > userspace drivers than this patch. As I keep telling people, you would
> > > basically reimplement vfio/pci. Go over it, and see for yourself.
> > > Almost everything it does is relevant for msi-x.  It's just wrong to
> > > duplicate so much code.
> > > 
> > The patch is tested and works with msi-x. Restricting access to msi-x
> > registers that vfio does is not relevant here.
> 
> It works *for you* with a specific userspace application. I have no idea
> how you tested it, and what does the userspace in question do.  But it
> seems pretty clear that there are a ton of very reasonable things that
> one can do with a device and that break when you enable MSI-X.
> 
I do not follow. What things break when you enable MSI-X and why would
you enable MSI-X if things that previously worked breaks for you.
Look we cannot work with such vague statements, please be more specific
about issues that needs to be addressed. So far I got two:

 1. kernel should not enable pci bust mastering, leave it to userspace
to do before configuring msi-x
 2. if bus mustering is disabled then poll for interrupts

--
			Gleb.

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-08 17:39                                                   ` Michael S. Tsirkin
  2015-10-08 17:53                                                     ` Gleb Natapov
@ 2015-10-08 18:38                                                     ` Greg KH
  1 sibling, 0 replies; 96+ messages in thread
From: Greg KH @ 2015-10-08 18:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Gleb Natapov, Avi Kivity, Alex Williamson, Vlad Zolotarov,
	linux-kernel, hjk, corbet, bruce.richardson, avi, gleb, stephen,
	alexander.duyck

On Thu, Oct 08, 2015 at 08:39:10PM +0300, Michael S. Tsirkin wrote:
> > Why? Greg is against ioctl interface so it will be reworked, by besides
> > that what is wrong with the concept of binding msi-x interrupt to
> > eventfd?
> 
> It's not the binding. Managing msi-x just needs more than the puny
> 2 ioctls to get # of vectors and set eventfd.
> 
> It interacts in strange ways with reset, and with PM, and ...

Can we please drop this thread right now.  The proposed patches are not
acceptable as-is, and everyone knows that.  The developers are going to
go off and redo things and propose a new set of patches, let's see what
the result is of that work and we can take it from there.

Random complaints about this existing patch is not useful at all
anymore, there's nothing needed to convince anyone about anything here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 96+ messages in thread

* [PATCH v3 1/3] uio: add ioctl support
  2015-10-04 20:39 Vlad Zolotarov
@ 2015-10-04 20:39 ` Vlad Zolotarov
  0 siblings, 0 replies; 96+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:39 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, regkh, corbet
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
 drivers/uio/uio.c          | 15 +++++++++++++++
 include/linux/uio_driver.h |  3 +++
 2 files changed, 18 insertions(+)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 8196581..714b0e5 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -704,6 +704,20 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
 	}
 }
 
+static long uio_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+	struct uio_listener *listener = filep->private_data;
+	struct uio_device *idev = listener->dev;
+
+	if (!idev->info)
+		return -EIO;
+
+	if (!idev->info->ioctl)
+		return -ENOTTY;
+
+	return idev->info->ioctl(idev->info, cmd, arg);
+}
+
 static const struct file_operations uio_fops = {
 	.owner		= THIS_MODULE,
 	.open		= uio_open,
@@ -712,6 +726,7 @@ static const struct file_operations uio_fops = {
 	.write		= uio_write,
 	.mmap		= uio_mmap,
 	.poll		= uio_poll,
+	.unlocked_ioctl	= uio_ioctl,
 	.fasync		= uio_fasync,
 	.llseek		= noop_llseek,
 };
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 32c0e83..10d7833 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -89,6 +89,7 @@ struct uio_device {
  * @mmap:		mmap operation for this uio device
  * @open:		open operation for this uio device
  * @release:		release operation for this uio device
+ * @ioctl:		ioctl handler
  * @irqcontrol:		disable/enable irqs when 0/1 is written to /dev/uioX
  */
 struct uio_info {
@@ -105,6 +106,8 @@ struct uio_info {
 	int (*open)(struct uio_info *info, struct inode *inode);
 	int (*release)(struct uio_info *info, struct inode *inode);
 	int (*irqcontrol)(struct uio_info *info, s32 irq_on);
+	int (*ioctl)(struct uio_info *info, unsigned int cmd,
+		     unsigned long arg);
 };
 
 extern int __must_check
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 96+ messages in thread

end of thread, other threads:[~2015-10-08 18:38 UTC | newest]

Thread overview: 96+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
2015-10-04 20:43 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
2015-10-05  3:03   ` Greg KH
2015-10-05  7:33     ` Vlad Zolotarov
2015-10-05  8:01       ` Greg KH
2015-10-05 10:36         ` Vlad Zolotarov
2015-10-05 20:02           ` Michael S. Tsirkin
     [not found]             ` <CAOYyTHZ2=UCYxuJKvd5S6qxp=84DBq5bMadg5wL0rFLZBh2-8Q@mail.gmail.com>
2015-10-05 22:29               ` Michael S. Tsirkin
2015-10-06  8:33                 ` Vlad Zolotarov
2015-10-06 14:19                   ` Michael S. Tsirkin
2015-10-06 14:30                     ` Gleb Natapov
2015-10-06 15:19                       ` Michael S. Tsirkin
2015-10-06 15:31                         ` Vlad Zolotarov
2015-10-06 15:57                         ` Gleb Natapov
2015-10-04 20:43 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
2015-10-05  3:11   ` Greg KH
2015-10-05  7:41     ` Vlad Zolotarov
2015-10-05  7:56       ` Greg KH
2015-10-05 10:48         ` Vlad Zolotarov
2015-10-05 10:57           ` Greg KH
2015-10-05 11:09             ` Avi Kivity
2015-10-05 13:08               ` Greg KH
2015-10-05 11:41             ` Vlad Zolotarov
2015-10-05 11:47               ` Avi Kivity
2015-10-05 11:53                 ` Vlad Zolotarov
2015-10-05  8:28     ` Avi Kivity
2015-10-05  9:49       ` Greg KH
2015-10-05 10:20         ` Avi Kivity
2015-10-06 14:38           ` Michael S. Tsirkin
2015-10-06 14:43             ` Vlad Zolotarov
2015-10-06 14:56               ` Michael S. Tsirkin
2015-10-06 15:23                 ` Avi Kivity
2015-10-06 18:51                   ` Alex Williamson
2015-10-06 21:32                     ` Stephen Hemminger
2015-10-06 21:41                       ` Alex Williamson
     [not found]                         ` <CAOaVG152OrQz-Bbnpr0VeE+vLH7nMGsG6A3sD7eTQHormNGVUg@mail.gmail.com>
2015-10-07  7:57                           ` Vlad Zolotarov
     [not found]                           ` <5614C160.6000203@scylladb.com>
2015-10-07  8:00                             ` Vlad Zolotarov
2015-10-07  8:01                               ` Vlad Zolotarov
2015-10-07  6:52                     ` Avi Kivity
2015-10-07 16:31                       ` Alex Williamson
2015-10-07 16:39                         ` Avi Kivity
2015-10-07 21:05                           ` Michael S. Tsirkin
2015-10-08  4:19                             ` Gleb Natapov
2015-10-08  7:41                               ` Michael S. Tsirkin
2015-10-08  7:59                                 ` Gleb Natapov
2015-10-08  9:38                                   ` Michael S. Tsirkin
2015-10-08  9:45                                     ` Gleb Natapov
2015-10-08 12:15                                       ` Michael S. Tsirkin
2015-10-08  5:33                             ` Avi Kivity
2015-10-08  7:32                               ` Michael S. Tsirkin
2015-10-08  8:46                                 ` Avi Kivity
2015-10-08  9:16                                   ` Michael S. Tsirkin
2015-10-08  9:44                                     ` Avi Kivity
2015-10-08 12:06                                       ` Michael S. Tsirkin
2015-10-08 12:27                                         ` Gleb Natapov
2015-10-08 13:20                                           ` Michael S. Tsirkin
2015-10-08 13:28                                             ` Gleb Natapov
2015-10-08 16:43                                               ` Michael S. Tsirkin
2015-10-08 17:01                                                 ` Gleb Natapov
2015-10-08 17:39                                                   ` Michael S. Tsirkin
2015-10-08 17:53                                                     ` Gleb Natapov
2015-10-08 18:38                                                     ` Greg KH
2015-10-08  8:32                               ` Michael S. Tsirkin
2015-10-08  8:52                                 ` Gleb Natapov
2015-10-08  9:19                                 ` Avi Kivity
2015-10-08 10:26                                   ` Michael S. Tsirkin
2015-10-08 13:20                                     ` Avi Kivity
2015-10-08 14:17                                       ` Michael S. Tsirkin
2015-10-08 15:31                                       ` Alex Williamson
2015-10-07 20:05                         ` Michael S. Tsirkin
2015-10-07  7:55                     ` Vlad Zolotarov
2015-10-08  8:48                       ` Michael S. Tsirkin
2015-10-06 15:28                 ` Vlad Zolotarov
2015-10-06 14:46       ` Michael S. Tsirkin
2015-10-06 15:27         ` Avi Kivity
2015-10-05  8:41   ` Stephen Hemminger
2015-10-05  9:08     ` Vlad Zolotarov
2015-10-05 10:06       ` Vlad Zolotarov
2015-10-05 20:09         ` Michael S. Tsirkin
2015-10-05  9:11     ` Vlad Zolotarov
2015-10-05 19:16   ` Michael S. Tsirkin
2015-10-04 20:43 ` [PATCH v3 3/3] Documentation: update uio-howto Vlad Zolotarov
2015-10-04 20:45 ` [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
2015-10-05 19:50 ` Michael S. Tsirkin
2015-10-06  8:37   ` Vlad Zolotarov
2015-10-06 14:30     ` Michael S. Tsirkin
2015-10-06 14:40       ` Vlad Zolotarov
2015-10-06 15:13         ` Michael S. Tsirkin
2015-10-06 16:35           ` Vlad Zolotarov
2015-10-06 15:11       ` Avi Kivity
2015-10-06 15:15         ` Michael S. Tsirkin
2015-10-06 16:00           ` Gleb Natapov
2015-10-06 16:09           ` Avi Kivity
2015-10-07 10:25             ` Michael S. Tsirkin
2015-10-07 10:28               ` Avi Kivity
  -- strict thread matches above, loose matches on Subject: below --
2015-10-04 20:39 Vlad Zolotarov
2015-10-04 20:39 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).