From: "Liu, Yi L" <yi.l.liu@intel.com>
To: alex.williamson@redhat.com, kwankhede@nvidia.com
Cc: kevin.tian@intel.com, baolu.lu@linux.intel.com,
	yi.l.liu@intel.com, yi.y.sun@intel.com, joro@8bytes.org,
	jean-philippe.brucker@arm.com, peterx@redhat.com,
	linux-kernel@vger.kernel.org, Liu@vger.kernel.org
Subject: [RFC v2 2/2] vfio/pci: add vfio-pci-mdev driver
Date: Tue, 12 Mar 2019 16:18:23 +0800
Message-ID: <1552378703-11202-3-git-send-email-yi.l.liu@intel.com>
In-Reply-To: <1552378703-11202-1-git-send-email-yi.l.liu@intel.com>

This patch adds a new driver named vfio-pci-mdev. It is similar to
the vfio-pci driver and therefore reuses some symbols from it.
Unlike vfio-pci, however, the new driver wraps a PCI device as a
mediated device. Once a PCI device is bound to vfio-pci-mdev,
user-space access to the device goes through the vfio mdev
framework. The user must create an mdev before exposing the device
to user space.

The new driver serves as a sample driver for the recent changes
from the "vfio/mdev: IOMMU aware mediated device" patchset. It
could also be a good starting point for future device-specific
mdev migration support.

To use this driver:
a) build and load the vfio-pci-mdev.ko module
   run "make menuconfig" and enable VFIO_PCI_MDEV,
   then load the modules with the following commands
   > sudo modprobe vfio
   > sudo modprobe vfio-pci
   > sudo modprobe vfio-pci-mdev

b) unbind the original device driver
   e.g. for a device whose BDF is $dev_bdf, use the following command
   to unbind its original driver
   > echo $dev_bdf > /sys/bus/pci/devices/$dev_bdf/driver/unbind

c) bind the vfio-pci-mdev driver to the physical device
   > echo $vend_id $dev_id > /sys/bus/pci/drivers/vfio-pci-mdev/new_id

d) check the supported mdev types
   > ls /sys/bus/pci/devices/$dev_bdf/mdev_supported_types/
     vfio-pci-mdev-type1
   > ls /sys/bus/pci/devices/$dev_bdf/mdev_supported_types/\
     vfio-pci-mdev-type1/
     available_instances  create  device_api  devices  name

e) create an mdev on this physical device
   > echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1003" > \
     /sys/bus/pci/devices/$dev_bdf/mdev_supported_types/\
     vfio-pci-mdev-type1/create

f) pass the mdev through to the guest
   add the following line to the QEMU boot command
   -device vfio-pci,\
    sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1003

g) destroy the mdev
   > echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1003/\
     remove
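
For convenience, steps b), c), e) and g) above can be sketched as a
small shell helper. This is only an illustration and not part of the
patch: the names write_attr, mdev_setup, mdev_teardown, DRY_RUN and
SYSFS are hypothetical, and only the sysfs paths are taken from the
steps above. With DRY_RUN=1 (the default) the helper prints the writes
instead of performing them, so the paths can be checked first.

```shell
#!/bin/sh
# Illustrative sketch only; helper names and variables are assumptions.

DRY_RUN=${DRY_RUN:-1}   # 1 = print the sysfs writes, 0 = perform them
SYSFS=${SYSFS:-/sys}    # sysfs mount point, overridable for inspection

# write_attr VALUE PATH: write VALUE to a sysfs attribute, or print
# the equivalent command when DRY_RUN is 1.
write_attr() {
    if [ "$DRY_RUN" = 1 ]; then
        printf 'echo %s > %s\n' "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

# mdev_setup BDF VENDOR_ID DEVICE_ID UUID
mdev_setup() {
    bdf=$1 vend_id=$2 dev_id=$3 uuid=$4
    dev="$SYSFS/bus/pci/devices/$bdf"

    # b) unbind the original driver
    write_attr "$bdf" "$dev/driver/unbind"
    # c) bind vfio-pci-mdev through a dynamic ID
    write_attr "$vend_id $dev_id" \
        "$SYSFS/bus/pci/drivers/vfio-pci-mdev/new_id"
    # e) create the mdev with the given UUID
    write_attr "$uuid" \
        "$dev/mdev_supported_types/vfio-pci-mdev-type1/create"
}

# mdev_teardown UUID
mdev_teardown() {
    # g) destroy the mdev
    write_attr 1 "$SYSFS/bus/mdev/devices/$1/remove"
}
```

After sourcing the file, something like
"mdev_setup $dev_bdf $vend_id $dev_id $uuid" prints the three writes;
setting DRY_RUN=0 and running as root would perform them, assuming the
modules from step a) are already loaded.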

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
---
 drivers/vfio/pci/Kconfig         |  11 ++
 drivers/vfio/pci/Makefile        |   5 +
 drivers/vfio/pci/vfio_pci_mdev.c | 334 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 350 insertions(+)
 create mode 100644 drivers/vfio/pci/vfio_pci_mdev.c

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index d0f8e4f..30c6b15 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -9,6 +9,17 @@ config VFIO_PCI
 
 	  If you don't know what to do here, say N.
 
+config VFIO_PCI_MDEV
+	tristate "VFIO MDEV support for PCI devices"
+	depends on VFIO && PCI && EVENTFD && VFIO_PCI && VFIO_MDEV_DEVICE
+	select VFIO_VIRQFD
+	select IRQ_BYPASS_MANAGER
+	help
+	  Support for the PCI VFIO Mdev bus driver.  This is required to
+	  make use of PCI drivers using the VFIO Mdev framework.
+
+	  If you don't know what to do here, say N.
+
 config VFIO_PCI_VGA
 	bool "VFIO PCI support for VGA devices"
 	depends on VFIO_PCI && X86 && VGA_ARB
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 9662c06..e840947 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -3,4 +3,9 @@ vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
 vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
 vfio-pci-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
 
+vfio-pci-mdev-y := vfio_pci_mdev.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
+vfio-pci-mdev-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
+vfio-pci-mdev-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
+
 obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
+obj-$(CONFIG_VFIO_PCI_MDEV) += vfio-pci-mdev.o
diff --git a/drivers/vfio/pci/vfio_pci_mdev.c b/drivers/vfio/pci/vfio_pci_mdev.c
new file mode 100644
index 0000000..15c8f7e
--- /dev/null
+++ b/drivers/vfio/pci/vfio_pci_mdev.c
@@ -0,0 +1,334 @@
+/*
+ * Copyright © 2019 Intel Corporation.
+ *     Author: Liu, Yi L <yi.l.liu@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio_pci.c:
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ *     Author: Alex Williamson <alex.williamson@redhat.com>
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ * Author: Tom Lyon, pugs@cisco.com
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/device.h>
+#include <linux/eventfd.h>
+#include <linux/file.h>
+#include <linux/interrupt.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+#include <linux/pci.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/vgaarb.h>
+#include <linux/nospec.h>
+#include <linux/mdev.h>
+
+#include "vfio_pci_private.h"
+
+#define DRIVER_VERSION  "0.2"
+#define DRIVER_AUTHOR   "Alex Williamson <alex.williamson@redhat.com>"
+#define DRIVER_DESC     "VFIO PCI Mdev - User Level meta-driver"
+
+#define VFIO_PCI_MDEV_NAME  "vfio-pci-mdev"
+
+static ssize_t
+name_show(struct kobject *kobj, struct device *dev, char *buf)
+{
+	return sprintf(buf, "%s-type1\n", dev_name(dev));
+}
+
+MDEV_TYPE_ATTR_RO(name);
+
+static ssize_t
+available_instances_show(struct kobject *kobj, struct device *dev, char *buf)
+{
+	return sprintf(buf, "%d\n", 1);
+}
+
+MDEV_TYPE_ATTR_RO(available_instances);
+
+static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
+		char *buf)
+{
+	return sprintf(buf, "%s\n", VFIO_DEVICE_API_PCI_STRING);
+}
+
+MDEV_TYPE_ATTR_RO(device_api);
+
+static struct attribute *vfio_pci_mdev_types_attrs[] = {
+	&mdev_type_attr_name.attr,
+	&mdev_type_attr_device_api.attr,
+	&mdev_type_attr_available_instances.attr,
+	NULL,
+};
+
+static struct attribute_group vfio_pci_mdev_type_group1 = {
+	.name  = "type1",
+	.attrs = vfio_pci_mdev_types_attrs,
+};
+
+struct attribute_group *vfio_pci_mdev_type_groups[] = {
+	&vfio_pci_mdev_type_group1,
+	NULL,
+};
+
+struct vfio_pci_mdev {
+	struct vfio_pci_device *vdev;
+	struct mdev_device *mdev;
+	unsigned long handle;
+};
+
+static int vfio_pci_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
+{
+	struct device *pdev;
+	struct vfio_pci_device *vdev;
+	struct vfio_pci_mdev *pmdev;
+	int ret;
+
+	pdev = mdev_parent_dev(mdev);
+	vdev = dev_get_drvdata(pdev);
+	pmdev = kzalloc(sizeof(struct vfio_pci_mdev), GFP_KERNEL);
+	if (pmdev == NULL) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	pmdev->mdev = mdev;
+	pmdev->vdev = vdev;
+	mdev_set_drvdata(mdev, pmdev);
+	ret = mdev_set_iommu_device(mdev_dev(mdev), pdev);
+	if (ret) {
+		pr_info("%s, failed to config iommu isolation for mdev: %s on pf: %s\n",
+			__func__, dev_name(mdev_dev(mdev)), dev_name(pdev));
+		kfree(pmdev);
+	}
+
+out:
+	return ret;
+}
+
+static int vfio_pci_mdev_remove(struct mdev_device *mdev)
+{
+	struct vfio_pci_mdev *pmdev = mdev_get_drvdata(mdev);
+
+	kfree(pmdev);
+	pr_info("%s, succeeded for mdev: %s\n", __func__,
+		     dev_name(mdev_dev(mdev)));
+
+	return 0;
+}
+
+static int vfio_pci_mdev_open(struct mdev_device *mdev)
+{
+	int ret = 0;
+	struct vfio_pci_mdev *pmdev = mdev_get_drvdata(mdev);
+
+	ret = vfio_pci_open(pmdev->vdev);
+	if (!ret)
+		pr_info("Succeeded to open mdev: %s on pf: %s\n",
+		dev_name(mdev_dev(mdev)), dev_name(&pmdev->vdev->pdev->dev));
+	else
+		pr_info("Failed to open mdev: %s on pf: %s\n",
+		dev_name(mdev_dev(mdev)), dev_name(&pmdev->vdev->pdev->dev));
+	return ret;
+}
+
+static void vfio_pci_mdev_release(struct mdev_device *mdev)
+{
+	struct vfio_pci_mdev *pmdev = mdev_get_drvdata(mdev);
+
+	pr_info("Release mdev: %s on pf: %s\n",
+		dev_name(mdev_dev(mdev)), dev_name(&pmdev->vdev->pdev->dev));
+	vfio_pci_release(pmdev->vdev);
+}
+
+static long vfio_pci_mdev_ioctl(struct mdev_device *mdev, unsigned int cmd,
+			     unsigned long arg)
+{
+	struct vfio_pci_mdev *pmdev = mdev_get_drvdata(mdev);
+
+	return vfio_pci_ioctl(pmdev->vdev, cmd, arg);
+}
+
+static int vfio_pci_mdev_mmap(struct mdev_device *mdev,
+				struct vm_area_struct *vma)
+{
+	struct vfio_pci_mdev *pmdev = mdev_get_drvdata(mdev);
+
+	return vfio_pci_mmap(pmdev->vdev, vma);
+}
+
+static ssize_t vfio_pci_mdev_read(struct mdev_device *mdev, char __user *buf,
+			size_t count, loff_t *ppos)
+{
+	struct vfio_pci_mdev *pmdev = mdev_get_drvdata(mdev);
+
+	if (!count)
+		return 0;
+
+	return vfio_pci_rw(pmdev->vdev, buf, count, ppos, false);
+}
+
+static ssize_t vfio_pci_mdev_write(struct mdev_device *mdev,
+				const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	struct vfio_pci_mdev *pmdev = mdev_get_drvdata(mdev);
+
+	if (!count)
+		return 0;
+
+	return vfio_pci_rw(pmdev->vdev, (char __user *)buf, count, ppos, true);
+}
+
+static const struct mdev_parent_ops vfio_pci_mdev_ops = {
+	.supported_type_groups	= vfio_pci_mdev_type_groups,
+	.create			= vfio_pci_mdev_create,
+	.remove			= vfio_pci_mdev_remove,
+
+	.open			= vfio_pci_mdev_open,
+	.release		= vfio_pci_mdev_release,
+
+	.read			= vfio_pci_mdev_read,
+	.write			= vfio_pci_mdev_write,
+	.mmap			= vfio_pci_mdev_mmap,
+	.ioctl			= vfio_pci_mdev_ioctl,
+};
+
+static int vfio_pci_mdev_driver_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct vfio_pci_device *vdev;
+	struct iommu_group *group;
+	int ret;
+
+	if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
+		return -EINVAL;
+
+	/*
+	 * Prevent binding to PFs with VFs enabled, this too easily allows
+	 * userspace instance with VFs and PFs from the same device, which
+	 * cannot work.  Disabling SR-IOV here would initiate removing the
+	 * VFs, which would unbind the driver, which is prone to blocking
+	 * if that VF is also in use by vfio-pci.  Just reject these PFs
+	 * and let the user sort it out.
+	 */
+	if (pci_num_vf(pdev)) {
+		pci_warn(pdev, "Cannot bind to PF with SR-IOV enabled\n");
+		return -EBUSY;
+	}
+
+	group = vfio_iommu_group_get(&pdev->dev);
+	if (!group)
+		return -EINVAL;
+
+	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
+	if (!vdev) {
+		vfio_iommu_group_put(group, &pdev->dev);
+		return -ENOMEM;
+	}
+
+	vdev->pdev = pdev;
+	vdev->name = VFIO_PCI_MDEV_NAME;
+	vdev->irq_type = VFIO_PCI_NUM_IRQS;
+	mutex_init(&vdev->igate);
+	spin_lock_init(&vdev->irqlock);
+	mutex_init(&vdev->ioeventfds_lock);
+	INIT_LIST_HEAD(&vdev->ioeventfds_list);
+
+	pci_set_drvdata(pdev, vdev);
+
+	ret = vfio_pci_reflck_attach(vdev);
+	if (ret) {
+		vfio_del_group_dev(&pdev->dev);
+		vfio_iommu_group_put(group, &pdev->dev);
+		kfree(vdev);
+		return ret;
+	}
+
+	ret = vfio_pci_probe_misc(pdev, vdev);
+	if (ret) {
+		pr_err("failed to probe %s\n", dev_name(&pdev->dev));
+		return ret;
+	}
+
+	ret = mdev_register_device(&pdev->dev, &vfio_pci_mdev_ops);
+	if (ret)
+		pr_err("Cannot register mdev for device %s\n",
+			dev_name(&pdev->dev));
+	else
+		pr_info("Wrap device %s as a mdev\n", dev_name(&pdev->dev));
+
+	return ret;
+}
+
+static void vfio_pci_mdev_driver_remove(struct pci_dev *pdev)
+{
+	struct vfio_pci_device *vdev;
+
+	vdev = pci_get_drvdata(pdev);
+	if (!vdev)
+		return;
+
+	vfio_pci_reflck_put(vdev->reflck);
+
+	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
+	kfree(vdev->region);
+	mutex_destroy(&vdev->ioeventfds_lock);
+	kfree(vdev);
+
+	vfio_pci_remove_misc(pdev);
+}
+
+static struct pci_driver vfio_pci_mdev_driver = {
+	.name		= VFIO_PCI_MDEV_NAME,
+	.id_table	= NULL, /* only dynamic ids */
+	.probe		= vfio_pci_mdev_driver_probe,
+	.remove		= vfio_pci_mdev_driver_remove,
+	.err_handler	= &vfio_err_handlers,
+};
+
+static void __exit vfio_pci_mdev_cleanup(void)
+{
+	pci_unregister_driver(&vfio_pci_mdev_driver);
+	vfio_pci_uninit_perm_bits();
+}
+
+static int __init vfio_pci_mdev_init(void)
+{
+	int ret;
+
+	/* Allocate shared config space permission data used by all devices */
+	ret = vfio_pci_init_perm_bits();
+	if (ret)
+		return ret;
+
+	/* Register and scan for devices */
+	ret = pci_register_driver(&vfio_pci_mdev_driver);
+	if (ret)
+		goto out_driver;
+
+	return 0;
+out_driver:
+	vfio_pci_uninit_perm_bits();
+	return ret;
+}
+
+module_init(vfio_pci_mdev_init);
+module_exit(vfio_pci_mdev_cleanup);
+
+MODULE_VERSION(DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
-- 
2.7.4


Thread overview: 11+ messages
2019-03-12  8:18 [RFC v2 0/2] vfio/pci: wrap pci device as a mediated device Liu, Yi L
2019-03-12  8:18 ` [RFC v2 1/2] vfio/pci: export common symbols in vfio-pci Liu, Yi L
2019-03-19 18:14   ` Alex Williamson
2019-03-20 11:49     ` Liu, Yi L
2019-03-20 19:22       ` Alex Williamson
2019-03-23 11:06         ` Liu, Yi L
2019-03-25 18:17           ` Alex Williamson
2019-03-26 12:37             ` Liu, Yi L
2019-03-26 15:35               ` Alex Williamson
2019-03-27  8:42                 ` Liu, Yi L
2019-03-12  8:18 ` Liu, Yi L [this message]
