linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
@ 2015-10-04 20:39 Vlad Zolotarov
  2015-10-04 20:39 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:39 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, regkh, corbet
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
 
Currently uio_pci_generic supports only legacy INT#x interrupts source. However
there are situations when this is not enough, for instance SR-IOV VF devices that
simply don't have INT#x capability. For such devices uio_pci_generic will simply
fail (more specifically probe() will fail).
 
When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
overhead and thus VFIO is not an option users that develop user-space drivers are left
without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
DPDK) just to be able to use UIO infrastructure.
 
This series provides a generic solution for this problem while preserving the original behaviour
for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).

New in v3:
   - Add __iomem qualifier to temp buffer receiving ioremap value.  

New in v2:
   - Added #include <linux/uaccess.h> to uio_pci_generic.c

Vlad Zolotarov (3):
  uio: add ioctl support
  uio_pci_generic: add MSI/MSI-X support
  Documentation: update uio-howto

 Documentation/DocBook/uio-howto.tmpl |  29 ++-
 drivers/uio/uio.c                    |  15 ++
 drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
 include/linux/uio_driver.h           |   3 +
 include/linux/uio_pci_generic.h      |  36 +++
 5 files changed, 467 insertions(+), 26 deletions(-)
 create mode 100644 include/linux/uio_pci_generic.h

-- 
2.1.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 1/3] uio: add ioctl support
  2015-10-04 20:39 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
@ 2015-10-04 20:39 ` Vlad Zolotarov
  2015-10-04 20:39 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
  2015-10-04 20:39 ` [PATCH v3 3/3] Documentation: update uio-howto Vlad Zolotarov
  2 siblings, 0 replies; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:39 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, regkh, corbet
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
 drivers/uio/uio.c          | 15 +++++++++++++++
 include/linux/uio_driver.h |  3 +++
 2 files changed, 18 insertions(+)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 8196581..714b0e5 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -704,6 +704,20 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
 	}
 }
 
+static long uio_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+	struct uio_listener *listener = filep->private_data;
+	struct uio_device *idev = listener->dev;
+
+	if (!idev->info)
+		return -EIO;
+
+	if (!idev->info->ioctl)
+		return -ENOTTY;
+
+	return idev->info->ioctl(idev->info, cmd, arg);
+}
+
 static const struct file_operations uio_fops = {
 	.owner		= THIS_MODULE,
 	.open		= uio_open,
@@ -712,6 +726,7 @@ static const struct file_operations uio_fops = {
 	.write		= uio_write,
 	.mmap		= uio_mmap,
 	.poll		= uio_poll,
+	.unlocked_ioctl	= uio_ioctl,
 	.fasync		= uio_fasync,
 	.llseek		= noop_llseek,
 };
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 32c0e83..10d7833 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -89,6 +89,7 @@ struct uio_device {
  * @mmap:		mmap operation for this uio device
  * @open:		open operation for this uio device
  * @release:		release operation for this uio device
+ * @ioctl:		ioctl handler
  * @irqcontrol:		disable/enable irqs when 0/1 is written to /dev/uioX
  */
 struct uio_info {
@@ -105,6 +106,8 @@ struct uio_info {
 	int (*open)(struct uio_info *info, struct inode *inode);
 	int (*release)(struct uio_info *info, struct inode *inode);
 	int (*irqcontrol)(struct uio_info *info, s32 irq_on);
+	int (*ioctl)(struct uio_info *info, unsigned int cmd,
+		     unsigned long arg);
 };
 
 extern int __must_check
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support
  2015-10-04 20:39 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
  2015-10-04 20:39 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
@ 2015-10-04 20:39 ` Vlad Zolotarov
  2015-10-04 20:39 ` [PATCH v3 3/3] Documentation: update uio-howto Vlad Zolotarov
  2 siblings, 0 replies; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:39 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, regkh, corbet
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

Add support for MSI and MSI-X interrupt modes:
   - Interrupt mode selection order is:
        INT#X (for backward compatibility) -> MSI-X -> MSI.
   - Add ioctl() commands:
      - UIO_PCI_GENERIC_INT_MODE_GET: query the current interrupt mode.
      - UIO_PCI_GENERIC_IRQ_NUM_GET: query the maximum number of IRQs.
      - UIO_PCI_GENERIC_IRQ_SET: bind the IRQ to eventfd (similar to vfio).
   - Add mappings to all bars (memory and portio): some devices have
     registers related to MSI/MSI-X handling outside BAR0.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
New in v3:
   - Add __iomem qualifier to temp buffer receiving ioremap value.

New in v2:
   - Added #include <linux/uaccess.h> to uio_pci_generic.c

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
 drivers/uio/uio_pci_generic.c   | 410 +++++++++++++++++++++++++++++++++++++---
 include/linux/uio_pci_generic.h |  36 ++++
 2 files changed, 423 insertions(+), 23 deletions(-)
 create mode 100644 include/linux/uio_pci_generic.h

diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index d0b508b..6b8b1789 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -22,16 +22,32 @@
 #include <linux/device.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/msi.h>
 #include <linux/slab.h>
 #include <linux/uio_driver.h>
+#include <linux/uio_pci_generic.h>
+#include <linux/eventfd.h>
+#include <linux/uaccess.h>
 
 #define DRIVER_VERSION	"0.01.0"
 #define DRIVER_AUTHOR	"Michael S. Tsirkin <mst@redhat.com>"
 #define DRIVER_DESC	"Generic UIO driver for PCI 2.3 devices"
 
+struct msix_info {
+	int num_irqs;
+	struct msix_entry *table;
+	struct uio_msix_irq_ctx {
+		struct eventfd_ctx *trigger;	/* MSI-x vector to eventfd */
+		char *name;			/* name in /proc/interrupts */
+	} *ctx;
+};
+
 struct uio_pci_generic_dev {
 	struct uio_info info;
 	struct pci_dev *pdev;
+	struct mutex msix_state_lock;		/* ioctl mutex */
+	enum uio_int_mode int_mode;
+	struct msix_info msix;
 };
 
 static inline struct uio_pci_generic_dev *
@@ -40,9 +56,177 @@ to_uio_pci_generic_dev(struct uio_info *info)
 	return container_of(info, struct uio_pci_generic_dev, info);
 }
 
-/* Interrupt handler. Read/modify/write the command register to disable
- * the interrupt. */
-static irqreturn_t irqhandler(int irq, struct uio_info *info)
+/* Unmap previously ioremap'd resources */
+static void release_iomaps(struct uio_pci_generic_dev *gdev)
+{
+	int i;
+	struct uio_mem *mem = gdev->info.mem;
+
+	for (i = 0; i < MAX_UIO_MAPS; i++, mem++) {
+		if (mem->internal_addr) {
+			iounmap(mem->internal_addr);
+			mem->internal_addr = NULL;
+		}
+	}
+}
+
+static int setup_maps(struct pci_dev *pdev, struct uio_info *info)
+{
+	int i, m = 0, p = 0, err;
+	static const char * const bar_names[] = {
+		"BAR0",	"BAR1",	"BAR2",	"BAR3",	"BAR4",	"BAR5",
+	};
+
+	for (i = 0; i < ARRAY_SIZE(bar_names); i++) {
+		unsigned long start = pci_resource_start(pdev, i);
+		unsigned long flags = pci_resource_flags(pdev, i);
+		unsigned long len = pci_resource_len(pdev, i);
+
+		if (start == 0 || len == 0)
+			continue;
+
+		if (flags & IORESOURCE_MEM) {
+			void __iomem *addr;
+
+			if (m >= MAX_UIO_MAPS)
+				continue;
+
+			addr = ioremap(start, len);
+			if (addr == NULL) {
+				err = -EINVAL;
+				goto fail;
+			}
+
+			info->mem[m].name = bar_names[i];
+			info->mem[m].addr = start;
+			info->mem[m].internal_addr = addr;
+			info->mem[m].size = len;
+			info->mem[m].memtype = UIO_MEM_PHYS;
+			++m;
+		} else if (flags & IORESOURCE_IO) {
+			if (p >= MAX_UIO_PORT_REGIONS)
+				continue;
+
+			info->port[p].name = bar_names[i];
+			info->port[p].start = start;
+			info->port[p].size = len;
+			info->port[p].porttype = UIO_PORT_X86;
+			++p;
+		}
+	}
+
+	return 0;
+fail:
+	for (i = 0; i < m; i++) {
+		iounmap(info->mem[i].internal_addr);
+		info->mem[i].internal_addr = NULL;
+	}
+
+	return err;
+}
+
+static irqreturn_t msix_irqhandler(int irq, void *arg);
+
+/* set the mapping between vector # and existing eventfd. */
+static int set_irq_eventfd(struct uio_pci_generic_dev *gdev, int vec, int fd)
+{
+	struct uio_msix_irq_ctx *ctx;
+	struct eventfd_ctx *trigger;
+	struct pci_dev *pdev = gdev->pdev;
+	int irq, err;
+
+	if (vec >= gdev->msix.num_irqs) {
+		dev_notice(&gdev->pdev->dev, "vec %u >= num_vec %u\n",
+			   vec, gdev->msix.num_irqs);
+		return -ERANGE;
+	}
+
+	irq = gdev->msix.table[vec].vector;
+
+	/* Cleanup existing irq mapping */
+	ctx = &gdev->msix.ctx[vec];
+	if (ctx->trigger) {
+		free_irq(irq, ctx->trigger);
+		eventfd_ctx_put(ctx->trigger);
+		ctx->trigger = NULL;
+	}
+
+	/* Passing -1 is used to disable interrupt */
+	if (fd < 0)
+		return 0;
+
+
+	trigger = eventfd_ctx_fdget(fd);
+	if (IS_ERR(trigger)) {
+		err = PTR_ERR(trigger);
+		dev_notice(&gdev->pdev->dev,
+			   "eventfd ctx get failed: %d\n", err);
+		return err;
+	}
+
+	err = request_irq(irq, msix_irqhandler, 0, ctx->name, trigger);
+	if (err) {
+		dev_notice(&pdev->dev, "request irq failed: %d\n", err);
+		eventfd_ctx_put(trigger);
+		return err;
+	}
+
+	dev_dbg(&pdev->dev, "map vector %u to fd %d trigger %p\n",
+		vec, fd, trigger);
+	ctx->trigger = trigger;
+
+	return 0;
+}
+
+static int uio_pci_generic_ioctl(struct uio_info *info, unsigned int cmd,
+				 unsigned long arg)
+{
+	struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
+	struct uio_pci_generic_irq_set hdr;
+	int err;
+
+	switch (cmd) {
+	case UIO_PCI_GENERIC_IRQ_SET:
+		if (copy_from_user(&hdr, (void __user *)arg, sizeof(hdr)))
+			return -EFAULT;
+
+		/* Locking is needed to ensure two things:
+		 *  1) Two IRQ_SET ioctl()'s are not running in parallel.
+		 *  2) IRQ_SET ioctl() is not running in parallel with remove().
+		 */
+		mutex_lock(&gdev->msix_state_lock);
+		if (gdev->int_mode != UIO_INT_MODE_MSIX) {
+			mutex_unlock(&gdev->msix_state_lock);
+			return -EOPNOTSUPP;
+		}
+
+		err = set_irq_eventfd(gdev, hdr.vec, hdr.fd);
+		mutex_unlock(&gdev->msix_state_lock);
+
+		break;
+	case UIO_PCI_GENERIC_IRQ_NUM_GET:
+		if (gdev->int_mode == UIO_INT_MODE_NONE)
+			err = put_user(0, (u32 __user *)arg);
+		else if (gdev->int_mode != UIO_INT_MODE_MSIX)
+			err = put_user(1, (u32 __user *)arg);
+		else
+			err = put_user(gdev->msix.num_irqs,
+				       (u32 __user *)arg);
+
+		break;
+	case UIO_PCI_GENERIC_INT_MODE_GET:
+		err = put_user(gdev->int_mode, (u32 __user *)arg);
+
+		break;
+	default:
+		err = -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+/* INT#X interrupt handler. */
+static irqreturn_t intx_irqhandler(int irq, struct uio_info *info)
 {
 	struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
 
@@ -53,8 +237,162 @@ static irqreturn_t irqhandler(int irq, struct uio_info *info)
 	return IRQ_HANDLED;
 }
 
-static int probe(struct pci_dev *pdev,
-			   const struct pci_device_id *id)
+/* MSI interrupt handler. */
+static irqreturn_t msi_irqhandler(int irq, struct uio_info *info)
+{
+	/* UIO core will signal the user process. */
+	return IRQ_HANDLED;
+}
+
+/* MSI-X interrupt handler. */
+static irqreturn_t msix_irqhandler(int irq, void *arg)
+{
+	struct eventfd_ctx *trigger = arg;
+
+	pr_devel("irq %u trigger %p\n", irq, trigger);
+
+	eventfd_signal(trigger, 1);
+	return IRQ_HANDLED;
+}
+
+static bool enable_intx(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+
+	if (!pdev->irq || !pci_intx_mask_supported(pdev))
+		return false;
+
+	gdev->int_mode = UIO_INT_MODE_INTX;
+	gdev->info.irq = pdev->irq;
+	gdev->info.irq_flags = IRQF_SHARED;
+	gdev->info.handler = intx_irqhandler;
+
+	return true;
+}
+
+static void set_pci_master(struct pci_dev *pdev)
+{
+	pci_set_master(pdev);
+	dev_warn(&pdev->dev, "Enabling PCI bus mastering. Bogus userspace application is able to trash kernel memory using DMA");
+	add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+}
+
+static bool enable_msi(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+
+	set_pci_master(pdev);
+
+	if (pci_enable_msi(pdev))
+		return false;
+
+	gdev->int_mode = UIO_INT_MODE_MSI;
+	gdev->info.irq = pdev->irq;
+	gdev->info.irq_flags = 0;
+	gdev->info.handler = msi_irqhandler;
+
+	return true;
+}
+
+static bool enable_msix(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+	int i, vectors = pci_msix_vec_count(pdev);
+
+	if (vectors <= 0)
+		return false;
+
+	gdev->msix.table = kcalloc(vectors, sizeof(struct msix_entry),
+				   GFP_KERNEL);
+	if (!gdev->msix.table) {
+		dev_err(&pdev->dev, "Failed to allocate memory for MSI-X table");
+		return false;
+	}
+
+	gdev->msix.ctx = kcalloc(vectors, sizeof(struct uio_msix_irq_ctx),
+				 GFP_KERNEL);
+	if (!gdev->msix.ctx) {
+		dev_err(&pdev->dev, "Failed to allocate memory for MSI-X contexts");
+		goto err_ctx_alloc;
+	}
+
+	for (i = 0; i < vectors; i++) {
+		gdev->msix.table[i].entry = i;
+		gdev->msix.ctx[i].name = kasprintf(GFP_KERNEL,
+						   KBUILD_MODNAME "[%d](%s)",
+						   i, pci_name(pdev));
+		if (!gdev->msix.ctx[i].name)
+			goto err_name_alloc;
+	}
+
+	set_pci_master(pdev);
+
+	if (pci_enable_msix(pdev, gdev->msix.table, vectors))
+		goto err_msix_enable;
+
+	gdev->int_mode = UIO_INT_MODE_MSIX;
+	gdev->info.irq = UIO_IRQ_CUSTOM;
+	gdev->msix.num_irqs = vectors;
+
+	return true;
+
+err_msix_enable:
+	pci_clear_master(pdev);
+err_name_alloc:
+	for (i = 0; i < vectors; i++)
+		kfree(gdev->msix.ctx[i].name);
+
+	kfree(gdev->msix.ctx);
+err_ctx_alloc:
+	kfree(gdev->msix.table);
+
+	return false;
+}
+
+/**
+ * Disable interrupts and free related resources.
+ *
+ * @gdev device handle
+ *
+ * This function should be called after the corresponding UIO device has been
+ * unregistered. This will ensure that there are no currently running ioctl()s
+ * and there won't be any new ones until next probe() call.
+ */
+static void disable_intr(struct uio_pci_generic_dev *gdev)
+{
+	struct pci_dev *pdev = gdev->pdev;
+	int i;
+
+	switch (gdev->int_mode) {
+	case UIO_INT_MODE_MSI:
+		pci_disable_msi(pdev);
+		pci_clear_master(pdev);
+
+		break;
+	case UIO_INT_MODE_MSIX:
+		/* No need for locking here since there shouldn't be any
+		 * ioctl()s running by now.
+		 */
+		for (i = 0; i < gdev->msix.num_irqs; i++) {
+			if (gdev->msix.ctx[i].trigger)
+				set_irq_eventfd(gdev, i, -1);
+
+			kfree(gdev->msix.ctx[i].name);
+		}
+
+		pci_disable_msix(pdev);
+		pci_clear_master(pdev);
+		kfree(gdev->msix.ctx);
+		kfree(gdev->msix.table);
+
+		break;
+	default:
+		break;
+	}
+}
+
+
+static int probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct uio_pci_generic_dev *gdev;
 	int err;
@@ -66,42 +404,64 @@ static int probe(struct pci_dev *pdev,
 		return err;
 	}
 
-	if (!pdev->irq) {
-		dev_warn(&pdev->dev, "No IRQ assigned to device: "
-			 "no support for interrupts?\n");
-		pci_disable_device(pdev);
-		return -ENODEV;
-	}
-
-	if (!pci_intx_mask_supported(pdev)) {
-		err = -ENODEV;
-		goto err_verify;
-	}
-
 	gdev = kzalloc(sizeof(struct uio_pci_generic_dev), GFP_KERNEL);
 	if (!gdev) {
 		err = -ENOMEM;
 		goto err_alloc;
 	}
 
+	gdev->pdev = pdev;
 	gdev->info.name = "uio_pci_generic";
 	gdev->info.version = DRIVER_VERSION;
-	gdev->info.irq = pdev->irq;
-	gdev->info.irq_flags = IRQF_SHARED;
-	gdev->info.handler = irqhandler;
-	gdev->pdev = pdev;
+	gdev->info.ioctl = uio_pci_generic_ioctl;
+	mutex_init(&gdev->msix_state_lock);
+
+	err = pci_request_regions(pdev, "uio_pci_generic");
+	if (err != 0) {
+		dev_err(&pdev->dev, "Cannot request regions\n");
+		goto err_request_regions;
+	}
+
+	/* Enable the corresponding interrupt mode. Try to enable INT#X first
+	 * for backward compatibility.
+	 */
+	if (enable_intx(gdev))
+		dev_info(&pdev->dev, "Using INT#x mode: IRQ %ld",
+			 gdev->info.irq);
+	else if (enable_msix(gdev))
+		dev_info(&pdev->dev, "Using MSI-X mode: number of IRQs %d",
+			 gdev->msix.num_irqs);
+	else if (enable_msi(gdev))
+		dev_info(&pdev->dev, "Using MSI mode: IRQ %ld", gdev->info.irq);
+	else {
+		err = -ENODEV;
+		goto err_verify;
+	}
+
+	/* remap resources */
+	err = setup_maps(pdev, &gdev->info);
+	if (err)
+		goto err_maps;
 
 	err = uio_register_device(&pdev->dev, &gdev->info);
 	if (err)
 		goto err_register;
+
 	pci_set_drvdata(pdev, gdev);
 
 	return 0;
+
 err_register:
+	release_iomaps(gdev);
+err_maps:
+	disable_intr(gdev);
+err_verify:
+	pci_release_regions(pdev);
+err_request_regions:
 	kfree(gdev);
 err_alloc:
-err_verify:
 	pci_disable_device(pdev);
+
 	return err;
 }
 
@@ -110,8 +470,12 @@ static void remove(struct pci_dev *pdev)
 	struct uio_pci_generic_dev *gdev = pci_get_drvdata(pdev);
 
 	uio_unregister_device(&gdev->info);
-	pci_disable_device(pdev);
+	disable_intr(gdev);
+	release_iomaps(gdev);
+	pci_release_regions(pdev);
 	kfree(gdev);
+	pci_disable_device(pdev);
+	pci_set_drvdata(pdev, NULL);
 }
 
 static struct pci_driver uio_pci_driver = {
diff --git a/include/linux/uio_pci_generic.h b/include/linux/uio_pci_generic.h
new file mode 100644
index 0000000..10716fc
--- /dev/null
+++ b/include/linux/uio_pci_generic.h
@@ -0,0 +1,36 @@
+/*
+ * include/linux/uio_pci_generic.h
+ *
+ * Userspace generic PCI IO driver.
+ *
+ * Licensed under the GPLv2 only.
+ */
+
+#ifndef _UIO_PCI_GENERIC_H_
+#define _UIO_PCI_GENERIC_H_
+
+#include <linux/ioctl.h>
+
+enum uio_int_mode {
+	UIO_INT_MODE_NONE,
+	UIO_INT_MODE_INTX,
+	UIO_INT_MODE_MSI,
+	UIO_INT_MODE_MSIX
+};
+
+/* bind the requested IRQ to the given eventfd */
+struct uio_pci_generic_irq_set {
+	int vec; /* index of the IRQ to connect to starting from 0 */
+	int fd;
+};
+
+#define UIO_PCI_GENERIC_BASE		0x86
+
+#define UIO_PCI_GENERIC_IRQ_SET		_IOW('I', UIO_PCI_GENERIC_BASE + 1, \
+		struct uio_pci_generic_irq_set)
+#define UIO_PCI_GENERIC_IRQ_NUM_GET	_IOW('I', UIO_PCI_GENERIC_BASE + 2, \
+		uint32_t)
+#define UIO_PCI_GENERIC_INT_MODE_GET	_IOW('I', UIO_PCI_GENERIC_BASE + 3, \
+		uint32_t)
+
+#endif /* _UIO_PCI_GENERIC_H_ */
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 3/3] Documentation: update uio-howto
  2015-10-04 20:39 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
  2015-10-04 20:39 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
  2015-10-04 20:39 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
@ 2015-10-04 20:39 ` Vlad Zolotarov
  2 siblings, 0 replies; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:39 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, regkh, corbet
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

Change the chapters related to uio_pci_generic that refer interrupt mode.
Add the relevant explanation regarding MSI and MSI-X interrupt modes
support.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
---
 Documentation/DocBook/uio-howto.tmpl | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl
index cd0e452..a176129 100644
--- a/Documentation/DocBook/uio-howto.tmpl
+++ b/Documentation/DocBook/uio-howto.tmpl
@@ -46,6 +46,12 @@ GPL version 2.
 
 <revhistory>
 	<revision>
+	<revnumber>0.10</revnumber>
+	<date>2015-10-04</date>
+	<authorinitials>vz</authorinitials>
+	<revremark>Added MSI and MSI-X support to uio_pci_generic.</revremark>
+	</revision>
+	<revision>
 	<revnumber>0.9</revnumber>
 	<date>2009-07-16</date>
 	<authorinitials>mst</authorinitials>
@@ -935,15 +941,32 @@ and look in the output for failure reasons
 <sect1 id="uio_pci_generic_internals">
 <title>Things to know about uio_pci_generic</title>
 	<para>
-Interrupts are handled using the Interrupt Disable bit in the PCI command
+Interrupts are handled either as MSI-X or MSI interrupts (if the device supports it) or
+as legacy INTx interrupts. By default INTx interrupts are used.
+	</para>
+	<para>
+uio_pci_generic automatically configures a device to use INTx interrupt for backward
+compatibility. If INTx are not available MSI-X interrupts will be used if the device
+supports it and if not MSI interrupts are going to be used. If none of the interrupts
+modes is supported probe() will fail.
+	</para>
+	<para>
+To get the used interrupt mode application has to use UIO_PCI_GENERIC_INT_MODE_GET ioctl
+command.
+UIO_PCI_GENERIC_IRQ_NUM_GET ioctl command may be used to get the total number of IRQs.
+Then UIO_PCI_GENERIC_IRQ_SET ioctl command may be used to bind a specific eventfd to a specific
+IRQ vector.
+	</para>
+	<para>
+Legacy interrupts are handled using the Interrupt Disable bit in the PCI command
 register and Interrupt Status bit in the PCI status register.  All devices
 compliant to PCI 2.3 (circa 2002) and all compliant PCI Express devices should
 support these bits.  uio_pci_generic detects this support, and won't bind to
 devices which do not support the Interrupt Disable Bit in the command register.
 	</para>
 	<para>
-On each interrupt, uio_pci_generic sets the Interrupt Disable bit.
-This prevents the device from generating further interrupts
+If legacy interrupts are used, uio_pci_generic sets the Interrupt Disable bit on
+each interrupt. This prevents the device from generating further interrupts
 until the bit is cleared. The userspace driver should clear this
 bit before blocking and waiting for more interrupts.
 	</para>
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-07 10:25             ` Michael S. Tsirkin
@ 2015-10-07 10:28               ` Avi Kivity
  0 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2015-10-07 10:28 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck



On 10/07/2015 01:25 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 07:09:11PM +0300, Avi Kivity wrote:
>>
>> On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote:
>>>> While it is possible that userspace malfunctions and accidentally programs
>>>> MSI incorrectly, the risk is dwarfed by the ability of userspace to program
>>>> DMA incorrectly.
>>> That seems to imply that for the upstream kernel this is not a valid usecase at all.
>>>
>> That is trivially incorrect, upstream pci_uio_generic is used with dpdk for
>> years.
> dpdk used to do polling for years. patch to use interrupts was posted in
> june 2015.

dpdk used interrupts long before that.

>
>>   Are dpdk applications an invalid use case?
> The way dpdk is using UIO/sysfs is borderline at best, and can't be used
> to justify new interfaces.  They have a more secure mode using VFIO.
> That one's more reasonable.
>

Maybe this was not stressed enough times, but not all configurations 
have an iommu, or want to use one.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 16:09           ` Avi Kivity
@ 2015-10-07 10:25             ` Michael S. Tsirkin
  2015-10-07 10:28               ` Avi Kivity
  0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-10-07 10:25 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Avi Kivity, Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck

On Tue, Oct 06, 2015 at 07:09:11PM +0300, Avi Kivity wrote:
> 
> 
> On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote:
> >>While it is possible that userspace malfunctions and accidentally programs
> >>MSI incorrectly, the risk is dwarfed by the ability of userspace to program
> >>DMA incorrectly.
> >That seems to imply that for the upstream kernel this is not a valid usecase at all.
> >
> 
> That is trivially incorrect, upstream pci_uio_generic is used with dpdk for
> years.

dpdk used to do polling for years. patch to use interrupts was posted in
june 2015.

>  Are dpdk applications an invalid use case?

The way dpdk is using UIO/sysfs is borderline at best, and can't be used
to justify new interfaces.  They have a more secure mode using VFIO.
That one's more reasonable.

-- 
MST

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:13         ` Michael S. Tsirkin
@ 2015-10-06 16:35           ` Vlad Zolotarov
  0 siblings, 0 replies; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-06 16:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/06/15 18:13, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 05:40:23PM +0300, Vlad Zolotarov wrote:
>>> I'm guessing it doesn't enable MSI though, does it?
>> Again, enabling MSI is a matter of a trivial patch configuring device
>> registers on the device BAR.
> No, not really.

Sure that is!
Look at pci_msi_set_enable(): it's a single read and then write, just 
like the code for bus master enabling. The msi capability address may be 
retrieved from lspci and then mapped for instance using sysfs.

Configuring the MSI table is a few more reads and writes too but at the 
bottom line - it's a trivial code too...

>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:15         ` Michael S. Tsirkin
  2015-10-06 16:00           ` Gleb Natapov
@ 2015-10-06 16:09           ` Avi Kivity
  2015-10-07 10:25             ` Michael S. Tsirkin
  1 sibling, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2015-10-06 16:09 UTC (permalink / raw)
  To: Michael S. Tsirkin, Avi Kivity
  Cc: Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck



On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote:
>> While it is possible that userspace malfunctions and accidentally programs
>> MSI incorrectly, the risk is dwarfed by the ability of userspace to program
>> DMA incorrectly.
> That seems to imply that for the upstream kernel this is not a valid usecase at all.
>

That is trivially incorrect, upstream pci_uio_generic is used with dpdk 
for years.  Are dpdk applications an invalid use case?

Again:

- security is not compromised.  you need to be root to (ab)use this.
- all of the potentially compromising functionality has been there from 
day 1
- uio_pci_generic is the only way to provide the required performance on 
some configurations (where kernel drivers, or userspace drivers + iommu 
are too slow)
- uio_pci_generic + msix is the only way to enable userspace drivers on 
some configurations (SRIOV)

The proposed functionality does not increase the attack surface.
The proposed functionality marginally increases the bug surface.
The proposed functionality is a natural evolution of uio_pci_generic.

There is a new class of applications (network function virtualization) 
which require this.  They can't use the kernel drivers because they are 
too slow.  They can't use the iommu because it is either too slow, or 
taken over by the hypervisor.  They are willing to live with less kernel 
protection, because they are a single user application anyway (and since 
they use a kernel bypass, they don't really care that much about the 
kernel).

The kernel serves more use-cases than a desktop or a multi-user 
servers.  Some of these users are willing to trade off protection for 
performance or functionality (an extreme, yet similar, example is 
linux-nommu, which allows any application to access any bit of memory, 
due to the lack of protection hardware).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:15         ` Michael S. Tsirkin
@ 2015-10-06 16:00           ` Gleb Natapov
  2015-10-06 16:09           ` Avi Kivity
  1 sibling, 0 replies; 18+ messages in thread
From: Gleb Natapov @ 2015-10-06 16:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Avi Kivity, Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck

On Tue, Oct 06, 2015 at 06:15:54PM +0300, Michael S. Tsirkin wrote:
> > While it is possible that userspace malfunctions and accidentally programs
> > MSI incorrectly, the risk is dwarfed by the ability of userspace to program
> > DMA incorrectly.
> 
> That seems to imply that for the upstream kernel this is not a valid usecase at all.
> 
Are you implying that uio_pci_generic should be removed from upstream
kernel because Avi did not describe anything that cannot be done with
upstream kernel right now.

--
			Gleb.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 15:11       ` Avi Kivity
@ 2015-10-06 15:15         ` Michael S. Tsirkin
  2015-10-06 16:00           ` Gleb Natapov
  2015-10-06 16:09           ` Avi Kivity
  0 siblings, 2 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 15:15 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Vlad Zolotarov, linux-kernel, hjk, corbet, gregkh,
	bruce.richardson, gleb, stephen, alexander.duyck

> While it is possible that userspace malfunctions and accidentally programs
> MSI incorrectly, the risk is dwarfed by the ability of userspace to program
> DMA incorrectly.

That seems to imply that for the upstream kernel this is not a valid usecase at all.

-- 
MST

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 14:40       ` Vlad Zolotarov
@ 2015-10-06 15:13         ` Michael S. Tsirkin
  2015-10-06 16:35           ` Vlad Zolotarov
  0 siblings, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 15:13 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Tue, Oct 06, 2015 at 05:40:23PM +0300, Vlad Zolotarov wrote:
> >I'm guessing it doesn't enable MSI though, does it?
> 
> Again, enabling MSI is a matter of a trivial patch configuring device
> registers on the device BAR.

No, not really.

-- 
mST

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 14:30     ` Michael S. Tsirkin
  2015-10-06 14:40       ` Vlad Zolotarov
@ 2015-10-06 15:11       ` Avi Kivity
  2015-10-06 15:15         ` Michael S. Tsirkin
  1 sibling, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2015-10-06 15:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, gleb,
	stephen, alexander.duyck



On 10/06/2015 05:30 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 11:37:59AM +0300, Vlad Zolotarov wrote:
>> Bus mastering is easily enabled from the user space (taken from DPDK code):
>>
>> static int
>> pci_uio_set_bus_master(int dev_fd)
>> {
>> 	uint16_t reg;
>> 	int ret;
>>
>> 	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot read command from PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	/* return if bus mastering is already on */
>> 	if (reg & PCI_COMMAND_MASTER)
>> 		return 0;
>>
>> 	reg |= PCI_COMMAND_MASTER;
>>
>> 	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot write command to PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	return 0;
>> }
>>
>> So, this is a non-issue. ;)
> There might be valid reasons for DPDK to do this, e.g. if using VFIO.

DPDK does this when using vfio, and when using uio_pci_generic. All of 
the network cards that DPDK supports require DMA.

> I'm guessing it doesn't enable MSI though, does it?

It does not enable MSI, because the main kernel driver used for 
interacting with the device, pci_uio_generic, does not support MSI. In 
some configurations, PCI INTA is not available, while MSI(X) is, hence 
the desire that pci_uio_generic support MSI.

While it is possible that userspace malfunctions and accidentally 
programs MSI incorrectly, the risk is dwarfed by the ability of 
userspace to program DMA incorrectly.  Under normal operation userspace 
programs tens of millions of DMA operations per second, while it never 
touches the MSI BARs (it is the kernel that programs them).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06 14:30     ` Michael S. Tsirkin
@ 2015-10-06 14:40       ` Vlad Zolotarov
  2015-10-06 15:13         ` Michael S. Tsirkin
  2015-10-06 15:11       ` Avi Kivity
  1 sibling, 1 reply; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-06 14:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/06/15 17:30, Michael S. Tsirkin wrote:
> On Tue, Oct 06, 2015 at 11:37:59AM +0300, Vlad Zolotarov wrote:
>> Bus mastering is easily enabled from the user space (taken from DPDK code):
>>
>> static int
>> pci_uio_set_bus_master(int dev_fd)
>> {
>> 	uint16_t reg;
>> 	int ret;
>>
>> 	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot read command from PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	/* return if bus mastering is already on */
>> 	if (reg & PCI_COMMAND_MASTER)
>> 		return 0;
>>
>> 	reg |= PCI_COMMAND_MASTER;
>>
>> 	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
>> 	if (ret != sizeof(reg)) {
>> 		RTE_LOG(ERR, EAL,
>> 			"Cannot write command to PCI config space!\n");
>> 		return -1;
>> 	}
>>
>> 	return 0;
>> }
>>
>> So, this is a non-issue. ;)
> There might be valid reasons for DPDK to do this, e.g. if using VFIO.

Michael, I'm afraid u are missing the main point here - the code above 
destroys all your long arguments. U can't possibly prevent the root-user 
from enabling the device bus mastering. And as me and other people on 
this thread have already mentioned MSI and MSI-X device configuration is 
controlled from the device BAR, thus may not be prevented too.


>
> I'm guessing it doesn't enable MSI though, does it?

Again, enabling MSI is a matter of a trivial patch configuring device 
registers on the device BAR.

>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-06  8:37   ` Vlad Zolotarov
@ 2015-10-06 14:30     ` Michael S. Tsirkin
  2015-10-06 14:40       ` Vlad Zolotarov
  2015-10-06 15:11       ` Avi Kivity
  0 siblings, 2 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-10-06 14:30 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Tue, Oct 06, 2015 at 11:37:59AM +0300, Vlad Zolotarov wrote:
> Bus mastering is easily enabled from the user space (taken from DPDK code):
> 
> static int
> pci_uio_set_bus_master(int dev_fd)
> {
> 	uint16_t reg;
> 	int ret;
> 
> 	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> 	if (ret != sizeof(reg)) {
> 		RTE_LOG(ERR, EAL,
> 			"Cannot read command from PCI config space!\n");
> 		return -1;
> 	}
> 
> 	/* return if bus mastering is already on */
> 	if (reg & PCI_COMMAND_MASTER)
> 		return 0;
> 
> 	reg |= PCI_COMMAND_MASTER;
> 
> 	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> 	if (ret != sizeof(reg)) {
> 		RTE_LOG(ERR, EAL,
> 			"Cannot write command to PCI config space!\n");
> 		return -1;
> 	}
> 
> 	return 0;
> }
> 
> So, this is a non-issue. ;)

There might be valid reasons for DPDK to do this, e.g. if using VFIO.

I'm guessing it doesn't enable MSI though, does it?

-- 
MST

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-05 19:50 ` Michael S. Tsirkin
@ 2015-10-06  8:37   ` Vlad Zolotarov
  2015-10-06 14:30     ` Michael S. Tsirkin
  0 siblings, 1 reply; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-06  8:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck



On 10/05/15 22:50, Michael S. Tsirkin wrote:
> On Sun, Oct 04, 2015 at 11:43:15PM +0300, Vlad Zolotarov wrote:
>> This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
>>   
>> Currently uio_pci_generic supports only legacy INT#x interrupts source. However
>> there are situations when this is not enough, for instance SR-IOV VF devices that
>> simply don't have INT#x capability. For such devices uio_pci_generic will simply
>> fail (more specifically probe() will fail).
>>   
>> When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
>> overhead and thus VFIO is not an option
>> users that develop user-space drivers are left
>> without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
>> DPDK) just to be able to use UIO infrastructure.
>>   
>> This series provides a generic solution for this problem while preserving the original behaviour
>> for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).
> What is missing here is that drivers using uio_pci_generic generally
> poke at config and BAR sysfs files of the device.
>
> We can not stop them without breaking existing users, but this means
> that we can't enable bus mastering and MSI/MSI-X blindly: userspace
> bugs will corrupt the MSI-X table and/or MSi/MSI-X capability,
> and cause device to overwrite random addresses, corrupting kernel
> memory.
>
> Your solution seems to be a warning in dmesg and tainting the
> kernel, but that's not enough.
>
> You need to add infrastructure to prevent this.

Bus mastering is easily enabled from the user space (taken from DPDK code):

static int
pci_uio_set_bus_master(int dev_fd)
{
	uint16_t reg;
	int ret;

	ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
	if (ret != sizeof(reg)) {
		RTE_LOG(ERR, EAL,
			"Cannot read command from PCI config space!\n");
		return -1;
	}

	/* return if bus mastering is already on */
	if (reg & PCI_COMMAND_MASTER)
		return 0;

	reg |= PCI_COMMAND_MASTER;

	ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
	if (ret != sizeof(reg)) {
		RTE_LOG(ERR, EAL,
			"Cannot write command to PCI config space!\n");
		return -1;
	}

	return 0;
}

So, this is a non-issue. ;)

>
> VFIO has some code to do this, but it's not bound by existing UIO API so it
> simply fails the mmap.  We want I think existing applications to work,
> so I suspect we need to make a hole there (probably map a zero page in
> case apps want to read it, and maybe even set it up for COW in case they
> tweak the PBA which sometimes happens to be in the same page).
>
> Your patches also seem to add in eventfd and mmap capabilities which
> seems to be orthogonal. They are there in VFIO which I'm guessing is the
> real reason you do it.
>
> So, what you are trying to do might be closer to extending VFIO which
> already has a bunch of checks like that.  Yes, it also wants to program
> the IOMMU.  So maybe do it with a separate device that can be root-only,
> so unpriveledged users can't abuse it.
>
> You should Cc, and talk to the VFIO maintainer.
>
>
>
>> New in v3:
>>     - Add __iomem qualifier to temp buffer receiving ioremap value.
>>
>> New in v2:
>>     - Added #include <linux/uaccess.h> to uio_pci_generic.c
>>
>> Vlad Zolotarov (3):
>>    uio: add ioctl support
>>    uio_pci_generic: add MSI/MSI-X support
>>    Documentation: update uio-howto
>>
>>   Documentation/DocBook/uio-howto.tmpl |  29 ++-
>>   drivers/uio/uio.c                    |  15 ++
>>   drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
>>   include/linux/uio_driver.h           |   3 +
>>   include/linux/uio_pci_generic.h      |  36 +++
>>   5 files changed, 467 insertions(+), 26 deletions(-)
>>   create mode 100644 include/linux/uio_pci_generic.h
>>
>> -- 
>> 2.1.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
  2015-10-04 20:45 ` Vlad Zolotarov
@ 2015-10-05 19:50 ` Michael S. Tsirkin
  2015-10-06  8:37   ` Vlad Zolotarov
  1 sibling, 1 reply; 18+ messages in thread
From: Michael S. Tsirkin @ 2015-10-05 19:50 UTC (permalink / raw)
  To: Vlad Zolotarov
  Cc: linux-kernel, hjk, corbet, gregkh, bruce.richardson, avi, gleb,
	stephen, alexander.duyck

On Sun, Oct 04, 2015 at 11:43:15PM +0300, Vlad Zolotarov wrote:
> This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
>  
> Currently uio_pci_generic supports only legacy INT#x interrupts source. However
> there are situations when this is not enough, for instance SR-IOV VF devices that
> simply don't have INT#x capability. For such devices uio_pci_generic will simply
> fail (more specifically probe() will fail).
>  
> When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
> overhead and thus VFIO is not an option
> users that develop user-space drivers are left
> without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
> DPDK) just to be able to use UIO infrastructure.
>  
> This series provides a generic solution for this problem while preserving the original behaviour
> for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).

What is missing here is that drivers using uio_pci_generic generally
poke at config and BAR sysfs files of the device.

We can not stop them without breaking existing users, but this means
that we can't enable bus mastering and MSI/MSI-X blindly: userspace
bugs will corrupt the MSI-X table and/or MSi/MSI-X capability,
and cause device to overwrite random addresses, corrupting kernel
memory.

Your solution seems to be a warning in dmesg and tainting the
kernel, but that's not enough.

You need to add infrastructure to prevent this.

VFIO has some code to do this, but it's not bound by existing UIO API so it
simply fails the mmap.  We want I think existing applications to work,
so I suspect we need to make a hole there (probably map a zero page in
case apps want to read it, and maybe even set it up for COW in case they
tweak the PBA which sometimes happens to be in the same page).

Your patches also seem to add in eventfd and mmap capabilities which
seems to be orthogonal. They are there in VFIO which I'm guessing is the
real reason you do it.

So, what you are trying to do might be closer to extending VFIO which
already has a bunch of checks like that.  Yes, it also wants to program
the IOMMU.  So maybe do it with a separate device that can be root-only,
so unpriveledged users can't abuse it.

You should Cc, and talk to the VFIO maintainer.



> New in v3:
>    - Add __iomem qualifier to temp buffer receiving ioremap value.  
> 
> New in v2:
>    - Added #include <linux/uaccess.h> to uio_pci_generic.c
> 
> Vlad Zolotarov (3):
>   uio: add ioctl support
>   uio_pci_generic: add MSI/MSI-X support
>   Documentation: update uio-howto
> 
>  Documentation/DocBook/uio-howto.tmpl |  29 ++-
>  drivers/uio/uio.c                    |  15 ++
>  drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
>  include/linux/uio_driver.h           |   3 +
>  include/linux/uio_pci_generic.h      |  36 +++
>  5 files changed, 467 insertions(+), 26 deletions(-)
>  create mode 100644 include/linux/uio_pci_generic.h
> 
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
  2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
@ 2015-10-04 20:45 ` Vlad Zolotarov
  2015-10-05 19:50 ` Michael S. Tsirkin
  1 sibling, 0 replies; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:45 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, corbet, gregkh
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck

This is the same v3 but with the correct email address of Greg. In the 
first iteration the first letter of the email was missing... ;)

On 10/04/15 23:43, Vlad Zolotarov wrote:
> This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
>   
> Currently uio_pci_generic supports only legacy INT#x interrupts source. However
> there are situations when this is not enough, for instance SR-IOV VF devices that
> simply don't have INT#x capability. For such devices uio_pci_generic will simply
> fail (more specifically probe() will fail).
>   
> When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
> overhead and thus VFIO is not an option users that develop user-space drivers are left
> without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
> DPDK) just to be able to use UIO infrastructure.
>   
> This series provides a generic solution for this problem while preserving the original behaviour
> for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).
>
> New in v3:
>     - Add __iomem qualifier to temp buffer receiving ioremap value.
>
> New in v2:
>     - Added #include <linux/uaccess.h> to uio_pci_generic.c
>
> Vlad Zolotarov (3):
>    uio: add ioctl support
>    uio_pci_generic: add MSI/MSI-X support
>    Documentation: update uio-howto
>
>   Documentation/DocBook/uio-howto.tmpl |  29 ++-
>   drivers/uio/uio.c                    |  15 ++
>   drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
>   include/linux/uio_driver.h           |   3 +
>   include/linux/uio_pci_generic.h      |  36 +++
>   5 files changed, 467 insertions(+), 26 deletions(-)
>   create mode 100644 include/linux/uio_pci_generic.h
>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver
@ 2015-10-04 20:43 Vlad Zolotarov
  2015-10-04 20:45 ` Vlad Zolotarov
  2015-10-05 19:50 ` Michael S. Tsirkin
  0 siblings, 2 replies; 18+ messages in thread
From: Vlad Zolotarov @ 2015-10-04 20:43 UTC (permalink / raw)
  To: linux-kernel, mst, hjk, corbet, gregkh
  Cc: bruce.richardson, avi, gleb, stephen, alexander.duyck, Vlad Zolotarov

This series add support for MSI and MSI-X interrupts to uio_pci_generic driver.
 
Currently uio_pci_generic supports only legacy INT#x interrupts source. However
there are situations when this is not enough, for instance SR-IOV VF devices that
simply don't have INT#x capability. For such devices uio_pci_generic will simply
fail (more specifically probe() will fail).
 
When IOMMU is either not available (e.g. Amazon EC2) or not acceptable due to performance
overhead and thus VFIO is not an option users that develop user-space drivers are left
without any option but to develop some proprietary UIO drivers (e.g. igb_uio driver in Intel's
DPDK) just to be able to use UIO infrastructure.
 
This series provides a generic solution for this problem while preserving the original behaviour
for devices for which the original uio_pci_generic had worked before (i.e. INT#x will be used by default).

New in v3:
   - Add __iomem qualifier to temp buffer receiving ioremap value.  

New in v2:
   - Added #include <linux/uaccess.h> to uio_pci_generic.c

Vlad Zolotarov (3):
  uio: add ioctl support
  uio_pci_generic: add MSI/MSI-X support
  Documentation: update uio-howto

 Documentation/DocBook/uio-howto.tmpl |  29 ++-
 drivers/uio/uio.c                    |  15 ++
 drivers/uio/uio_pci_generic.c        | 410 +++++++++++++++++++++++++++++++++--
 include/linux/uio_driver.h           |   3 +
 include/linux/uio_pci_generic.h      |  36 +++
 5 files changed, 467 insertions(+), 26 deletions(-)
 create mode 100644 include/linux/uio_pci_generic.h

-- 
2.1.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-10-07 10:29 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-04 20:39 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
2015-10-04 20:39 ` [PATCH v3 1/3] uio: add ioctl support Vlad Zolotarov
2015-10-04 20:39 ` [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support Vlad Zolotarov
2015-10-04 20:39 ` [PATCH v3 3/3] Documentation: update uio-howto Vlad Zolotarov
2015-10-04 20:43 [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver Vlad Zolotarov
2015-10-04 20:45 ` Vlad Zolotarov
2015-10-05 19:50 ` Michael S. Tsirkin
2015-10-06  8:37   ` Vlad Zolotarov
2015-10-06 14:30     ` Michael S. Tsirkin
2015-10-06 14:40       ` Vlad Zolotarov
2015-10-06 15:13         ` Michael S. Tsirkin
2015-10-06 16:35           ` Vlad Zolotarov
2015-10-06 15:11       ` Avi Kivity
2015-10-06 15:15         ` Michael S. Tsirkin
2015-10-06 16:00           ` Gleb Natapov
2015-10-06 16:09           ` Avi Kivity
2015-10-07 10:25             ` Michael S. Tsirkin
2015-10-07 10:28               ` Avi Kivity

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).