Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
* [RFC PATCH 0/7] vfio/pci: SR-IOV support
@ 2020-02-04 23:05 Alex Williamson
  2020-02-04 23:05 ` [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks Alex Williamson
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:05 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

There seems to be an ongoing desire to use userspace, vfio-based
drivers for both SR-IOV PF and VF devices.  The fundamental issue
with this concept is that the VF is not fully independent of the PF
driver.  Minimally the PF driver might be able to deny service to the
VF, VF data paths might be dependent on the state of the PF device,
or the PF my have some degree of ability to inspect or manipulate the
VF data.  It therefore would seem irresponsible to unleash VFs onto
the system, managed by a user owned PF.

We address this in a few ways in this series.  First, we can use a bus
notifier and the driver_override facility to make sure VFs are bound
to the vfio-pci driver by default.  This should eliminate the chance
that a VF is accidentally bound and used by host drivers.  We don't
however remove the ability for a host admin to change this override.

The next issue we need to address is how we let userspace drivers
opt-in to this participation with the PF driver.  We do not want an
admin to be able to unwittingly assign one of these VFs to a tenant
that isn't working in collaboration with the PF driver.  We could use
IOMMU grouping, but this seems to push too far towards tightly coupled
PF and VF drivers.  This series introduces a "VF token", implemented
as a UUID, as a shared secret between PF and VF drivers.  The token
needs to be set by the PF driver and used as part of the device
matching by the VF driver.  Provisions in the code also account for
restarting the PF driver with active VF drivers, requiring the PF to
use the current token to re-gain access to the PF.

The above solutions introduce a bit of a modification to the VFIO ABI
and an additional ABI extension.  The modification is that the
VFIO_GROUP_GET_DEVICE_FD ioctl is specified to require a char string
from the user providing the device name.  For this solution, we extend
the syntax to allow the device name followed by key/value pairs.  In
this case we add "vf_token=3e7e882e-1daf-417f-ad8d-882eea5ee337", for
example.  These options are expected to be space separated.  Matching
these key/value pairs is entirely left to the vfio bus driver (ex.
vfio-pci) and the internal ops structure is extended to allow this
optional support.  This extension should be fully backwards compatible
to existing userspace, such code will simply fail to open these newly
exposed devices, as intended.

I've been debating whether instead of the above we should allow the
user to get the device fd as normal, but restrict the interfaces until
the user authenticates, but I'm afraid this would be a less backwards
compatible solution.  It would be just as unclear to the user why a
device read/write/mmap/ioctl failed as it might be to why getting the
device fd could fail.  However in the latter case, I believe we do a
better job of restricting how far userspace code might go before they
ultimately fail.  I'd welcome discussion in the space, and or course
the extension of the GET_DEVICE_FD string.

Finally, the user needs to be able to set a VF token.  I add a
VFIO_DEVICE_FEATURE ioctl for this that's meant to be reusable for
getting, setting, and probing arbitrary features of a device.

I'll reply to this cover letter with a very basic example of a QEMU
update to support this interface, though I haven't found a device yet
that behaves well with the PF running in one VM with the VF in
another, or really even just a PF running in a VM with SR-IOV enabled.
I know these devices exist though, and I suspect QEMU will not be the
primary user of this support for now, but this behavior reaffirms my
concerns to prevent mis-use.

Please comment.  In particular, does this approach meet the DPDK needs
for userspace PF and VF drivers, with the hopefully minor hurdle of
sharing a token between drivers.  The token is of course left to
userspace how to manage, and might be static (and not very secret) for
a given set of drivers.  Thanks,

Alex

---

Alex Williamson (7):
      vfio: Include optional device match in vfio_device_ops callbacks
      vfio/pci: Implement match ops
      vfio/pci: Introduce VF token
      vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
      vfio/pci: Add sriov_configure support
      vfio/pci: Remove dev_fmt definition
      vfio/pci: Cleanup .probe() exit paths


 drivers/vfio/pci/vfio_pci.c         |  315 ++++++++++++++++++++++++++++++++---
 drivers/vfio/pci/vfio_pci_private.h |   10 +
 drivers/vfio/vfio.c                 |   19 ++
 include/linux/vfio.h                |    3 
 include/uapi/linux/vfio.h           |   37 ++++
 5 files changed, 356 insertions(+), 28 deletions(-)


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
@ 2020-02-04 23:05 ` Alex Williamson
  2020-02-06 11:14   ` Cornelia Huck
  2020-02-04 23:05 ` [RFC PATCH 2/7] vfio/pci: Implement match ops Alex Williamson
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:05 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

Allow bus drivers to provide their own callback to match a device to
the user provided string.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/vfio.c  |   19 +++++++++++++++----
 include/linux/vfio.h |    3 +++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 388597930b64..dda1726adda8 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -875,11 +875,22 @@ EXPORT_SYMBOL_GPL(vfio_device_get_from_dev);
 static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 						     char *buf)
 {
-	struct vfio_device *it, *device = NULL;
+	struct vfio_device *it, *device = ERR_PTR(-ENODEV);
 
 	mutex_lock(&group->device_lock);
 	list_for_each_entry(it, &group->device_list, group_next) {
-		if (!strcmp(dev_name(it->dev), buf)) {
+		int ret;
+
+		if (it->ops->match) {
+			ret = it->ops->match(it->device_data, buf);
+			if (ret < 0 && ret != -ENODEV) {
+				device = ERR_PTR(ret);
+				break;
+			}
+		} else
+			ret = strcmp(dev_name(it->dev), buf);
+
+		if (!ret) {
 			device = it;
 			vfio_device_get(device);
 			break;
@@ -1441,8 +1452,8 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
 		return -EPERM;
 
 	device = vfio_device_get_from_name(group, buf);
-	if (!device)
-		return -ENODEV;
+	if (IS_ERR(device))
+		return PTR_ERR(device);
 
 	ret = device->ops->open(device->device_data);
 	if (ret) {
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e42a711a2800..755e0f0e2900 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -26,6 +26,8 @@
  *         operations documented below
  * @mmap: Perform mmap(2) on a region of the device file descriptor
  * @request: Request for the bus driver to release the device
+ * @match: Optional device name match callback (return: 0 for match, -ENODEV
+ *         (or >0) for no match and continue, other -errno: no match and stop)
  */
 struct vfio_device_ops {
 	char	*name;
@@ -39,6 +41,7 @@ struct vfio_device_ops {
 			 unsigned long arg);
 	int	(*mmap)(void *device_data, struct vm_area_struct *vma);
 	void	(*request)(void *device_data, unsigned int count);
+	int	(*match)(void *device_data, char *buf);
 };
 
 extern struct iommu_group *vfio_iommu_group_get(struct device *dev);


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 2/7] vfio/pci: Implement match ops
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
  2020-02-04 23:05 ` [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks Alex Williamson
@ 2020-02-04 23:05 ` Alex Williamson
  2020-02-04 23:06 ` [RFC PATCH 3/7] vfio/pci: Introduce VF token Alex Williamson
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:05 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

This currently serves the same purpose as the default implementation
but will be expanded for additional functionality.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 02206162eaa9..6b3e73a33cbf 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1275,6 +1275,13 @@ static void vfio_pci_request(void *device_data, unsigned int count)
 	mutex_unlock(&vdev->igate);
 }
 
+static int vfio_pci_match(void *device_data, char *buf)
+{
+	struct vfio_pci_device *vdev = device_data;
+
+	return strcmp(pci_name(vdev->pdev), buf) ? -ENODEV : 0;
+}
+
 static const struct vfio_device_ops vfio_pci_ops = {
 	.name		= "vfio-pci",
 	.open		= vfio_pci_open,
@@ -1284,6 +1291,7 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.write		= vfio_pci_write,
 	.mmap		= vfio_pci_mmap,
 	.request	= vfio_pci_request,
+	.match		= vfio_pci_match,
 };
 
 static int vfio_pci_reflck_attach(struct vfio_pci_device *vdev);


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 3/7] vfio/pci: Introduce VF token
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
  2020-02-04 23:05 ` [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks Alex Williamson
  2020-02-04 23:05 ` [RFC PATCH 2/7] vfio/pci: Implement match ops Alex Williamson
@ 2020-02-04 23:06 ` Alex Williamson
  2020-02-05  7:57   ` Liu, Yi L
  2020-02-04 23:06 ` [RFC PATCH 4/7] vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user Alex Williamson
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:06 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

If we enable SR-IOV on a vfio-pci owned PF, the resulting VFs are not
fully isolated from the PF.  The PF can always cause a denial of
service to the VF, if not access data passed through the VF directly.
This is why vfio-pci currently does not bind to PFs with SR-IOV enabled
and does not provide access itself to enabling SR-IOV on a PF.  The
IOMMU grouping mechanism might allow us a solution to this lack of
isolation, however the deficiency isn't actually in the DMA path, so
much as the potential cooperation between PF and VF devices.  Also,
if we were to force VFs into the same IOMMU group as the PF, we severely
limit the utility of having independent drivers managing PFs and VFs
with vfio.

Therefore we introduce the concept of a VF token.  The token is
implemented as a UUID and represents a shared secret which must be set
by the PF driver and used by the VF drivers in order to access a vfio
device file descriptor for the VF.  The ioctl to set the VF token will
be provided in a later commit, this commit implements the underlying
infrastructure.  The concept here is to augment the string the user
passes to match a device within a group in order to retrieve access to
the device descriptor.  For example, rather than passing only the PCI
device name (ex. "0000:03:00.0") the user would also pass a vf_token
UUID (ex. "2ab74924-c335-45f4-9b16-8569e5b08258").  The device match
string therefore becomes:

"0000:03:00.0 vf_token=2ab74924-c335-45f4-9b16-8569e5b08258"

This syntax is expected to be extensible to future options as well, with
the standard being:

"$DEVICE_NAME $OPTION1=$VALUE1 $OPTION2=$VALUE2"

The device name must be first and option=value pairs are separated by
spaces.

The vf_token option is only required for VFs where the PF device is
bound to vfio-pci.  There is no change for PFs using existing host
drivers.

Note that in order to protect existing VF users, not only is it required
to set a vf_token on the PF before VFs devices can be accessed, but also
if there are existing VF users, (re)opening the PF device must also
provide the current vf_token as authentication.  This is intended to
prevent a VF driver starting with a trusted PF driver and later being
replaced by an unknown driver.  A vf_token is not required to open the
PF device when none of the VF devices are in use by vfio-pci drivers.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci.c         |  125 +++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci_private.h |    8 ++
 2 files changed, 132 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 6b3e73a33cbf..ad45ed3e0432 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -463,6 +463,34 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 		vfio_pci_set_power_state(vdev, PCI_D3hot);
 }
 
+static struct pci_driver vfio_pci_driver;
+
+static void vfio_pci_vf_token_user_add(struct vfio_pci_device *vdev, int val)
+{
+	struct vfio_device *pf_dev;
+	struct vfio_pci_device *pf_vdev;
+
+	if (!vdev->pdev->is_virtfn)
+		return;
+
+	pf_dev = vfio_device_get_from_dev(&vdev->pdev->physfn->dev);
+	if (!pf_dev)
+		return;
+
+	if (pci_dev_driver(vdev->pdev->physfn) != &vfio_pci_driver) {
+		vfio_device_put(pf_dev);
+		return;
+	}
+
+	pf_vdev = vfio_device_data(pf_dev);
+
+	mutex_lock(&pf_vdev->vf_token->lock);
+	pf_vdev->vf_token->users += val;
+	mutex_unlock(&pf_vdev->vf_token->lock);
+
+	vfio_device_put(pf_dev);
+}
+
 static void vfio_pci_release(void *device_data)
 {
 	struct vfio_pci_device *vdev = device_data;
@@ -472,6 +500,7 @@ static void vfio_pci_release(void *device_data)
 	if (!(--vdev->refcnt)) {
 		vfio_spapr_pci_eeh_release(vdev->pdev);
 		vfio_pci_disable(vdev);
+		vfio_pci_vf_token_user_add(vdev, -1);
 	}
 
 	mutex_unlock(&vdev->reflck->lock);
@@ -490,6 +519,7 @@ static int vfio_pci_open(void *device_data)
 	mutex_lock(&vdev->reflck->lock);
 
 	if (!vdev->refcnt) {
+		vfio_pci_vf_token_user_add(vdev, 1);
 		ret = vfio_pci_enable(vdev);
 		if (ret)
 			goto error;
@@ -1275,11 +1305,85 @@ static void vfio_pci_request(void *device_data, unsigned int count)
 	mutex_unlock(&vdev->igate);
 }
 
+#define VF_TOKEN_ARG "vf_token="
+
+/* Called with vdev->vf_token->lock */
+static int vfio_pci_vf_token_match(struct vfio_pci_device *vdev, char *opts)
+{
+	char *token;
+	uuid_t uuid;
+	int ret;
+
+	if (!opts)
+		return -EINVAL;
+
+	token = strstr(opts, VF_TOKEN_ARG);
+	if (!token)
+		return -EINVAL;
+
+	token += strlen(VF_TOKEN_ARG);
+
+	ret = uuid_parse(token, &uuid);
+	if (ret)
+		return ret;
+
+	if (!uuid_equal(&uuid, &vdev->vf_token->uuid))
+		return -EACCES;
+
+	return 0;
+}
+
 static int vfio_pci_match(void *device_data, char *buf)
 {
 	struct vfio_pci_device *vdev = device_data;
+	char *opts;
+	int ret;
+
+	opts = strchr(buf, ' ');
+	if (opts) {
+		*opts = 0;
+		opts++;
+	}
+
+	ret = strcmp(pci_name(vdev->pdev), buf);
+	if (ret)
+		return -ENODEV;
+
+	if (vdev->pdev->is_virtfn) {
+		struct vfio_device *pf_dev =
+			vfio_device_get_from_dev(&vdev->pdev->physfn->dev);
 
-	return strcmp(pci_name(vdev->pdev), buf) ? -ENODEV : 0;
+		if (pf_dev) {
+			if (pci_dev_driver(vdev->pdev->physfn) ==
+							&vfio_pci_driver) {
+				struct vfio_pci_device *pf_vdev =
+						vfio_device_data(pf_dev);
+
+				mutex_lock(&pf_vdev->vf_token->lock);
+				ret = vfio_pci_vf_token_match(pf_vdev, opts);
+				mutex_unlock(&pf_vdev->vf_token->lock);
+			}
+
+			vfio_device_put(pf_dev);
+
+			if (ret)
+				return -EACCES;
+		}
+	}
+
+	if (vdev->vf_token) {
+		mutex_lock(&vdev->vf_token->lock);
+
+		if (vdev->vf_token->users)
+			ret = vfio_pci_vf_token_match(vdev, opts);
+
+		mutex_unlock(&vdev->vf_token->lock);
+
+		if (ret)
+			return -EACCES;
+	}
+
+	return 0;
 }
 
 static const struct vfio_device_ops vfio_pci_ops = {
@@ -1351,6 +1455,19 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return ret;
 	}
 
+	if (pdev->sriov) {
+		vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
+		if (!vdev->vf_token) {
+			vfio_pci_reflck_put(vdev->reflck);
+			vfio_del_group_dev(&pdev->dev);
+			vfio_iommu_group_put(group, &pdev->dev);
+			kfree(vdev);
+			return -ENOMEM;
+		}
+		mutex_init(&vdev->vf_token->lock);
+		uuid_gen(&vdev->vf_token->uuid);
+	}
+
 	if (vfio_pci_is_vga(pdev)) {
 		vga_client_register(pdev, vdev, NULL, vfio_pci_set_vga_decode);
 		vga_set_legacy_decoding(pdev,
@@ -1384,6 +1501,12 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 	if (!vdev)
 		return;
 
+	if (vdev->vf_token) {
+		WARN_ON(vdev->vf_token->users);
+		mutex_destroy(&vdev->vf_token->lock);
+		kfree(vdev->vf_token);
+	}
+
 	vfio_pci_reflck_put(vdev->reflck);
 
 	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index ee6ee91718a4..4ca250207ab6 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -12,6 +12,7 @@
 #include <linux/pci.h>
 #include <linux/irqbypass.h>
 #include <linux/types.h>
+#include <linux/uuid.h>
 
 #ifndef VFIO_PCI_PRIVATE_H
 #define VFIO_PCI_PRIVATE_H
@@ -84,6 +85,12 @@ struct vfio_pci_reflck {
 	struct mutex		lock;
 };
 
+struct vfio_pci_vf_token {
+	struct mutex		lock;
+	uuid_t			uuid;
+	u32			users;
+};
+
 struct vfio_pci_device {
 	struct pci_dev		*pdev;
 	void __iomem		*barmap[PCI_STD_RESOURCE_END + 1];
@@ -122,6 +129,7 @@ struct vfio_pci_device {
 	struct list_head	dummy_resources_list;
 	struct mutex		ioeventfds_lock;
 	struct list_head	ioeventfds_list;
+	struct vfio_pci_vf_token	*vf_token;
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 4/7] vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (2 preceding siblings ...)
  2020-02-04 23:06 ` [RFC PATCH 3/7] vfio/pci: Introduce VF token Alex Williamson
@ 2020-02-04 23:06 ` Alex Williamson
  2020-02-04 23:06 ` [RFC PATCH 5/7] vfio/pci: Add sriov_configure support Alex Williamson
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:06 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

The VFIO_DEVICE_FEATURE ioctl is meant to be a general purpose, device
agnostic ioctl for setting, retrieving, and probing device features.
This implementation provides a 16-bit field for specifying a feature
index, where the data porition of the ioctl is determined by the
semantics for the given feature.  Additional flag bits indicate the
direction and nature of the operation; SET indicates user data is
provided into the device feature, GET indicates the device feature is
written out into user data.  The PROBE flag augments determining
whether the given feature is supported, and if provided, whether the
given operation on the feature is supported.

The first user of this ioctl is for setting the vfio-pci VF token,
where the user provides a shared secret key (UUID) on a SR-IOV PF
device, which users must provide when opening associated VF devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci.c |   50 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/vfio.h   |   37 ++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index ad45ed3e0432..d22a9d7bc32a 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1167,6 +1167,56 @@ static long vfio_pci_ioctl(void *device_data,
 
 		return vfio_pci_ioeventfd(vdev, ioeventfd.offset,
 					  ioeventfd.data, count, ioeventfd.fd);
+	} else if (cmd == VFIO_DEVICE_FEATURE) {
+		struct vfio_device_feature feature;
+		uuid_t uuid;
+
+		minsz = offsetofend(struct vfio_device_feature, flags);
+
+		if (copy_from_user(&feature, (void __user *)arg, minsz))
+			return -EFAULT;
+
+		if (feature.argsz < minsz)
+			return -EINVAL;
+
+		if (feature.flags & ~(VFIO_DEVICE_FEATURE_MASK |
+				      VFIO_DEVICE_FEATURE_SET |
+				      VFIO_DEVICE_FEATURE_GET |
+				      VFIO_DEVICE_FEATURE_PROBE))
+			return -EINVAL;
+
+		switch (feature.flags & VFIO_DEVICE_FEATURE_MASK) {
+		case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
+			if (!vdev->vf_token)
+				return -ENOTTY;
+
+			/*
+			 * We do not support GET of the VF Token UUID as this
+			 * could expose the token of the previous device user,
+			 * where their tokens could be statically defined.
+			 */
+			if (feature.flags & VFIO_DEVICE_FEATURE_GET)
+				return -EINVAL;
+
+			if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
+				return 0;
+
+			/* Don't SET unless told to do so */
+			if (!(feature.flags & VFIO_DEVICE_FEATURE_SET))
+				return -EINVAL;
+
+			if (copy_from_user(&uuid, (void __user *)(arg + minsz),
+					   sizeof(uuid)))
+				return -EFAULT;
+
+			mutex_lock(&vdev->vf_token->lock);
+			uuid_copy(&vdev->vf_token->uuid, &uuid);
+			mutex_unlock(&vdev->vf_token->lock);
+
+			return 0;
+		default:
+			return -ENOTTY;
+		}
 	}
 
 	return -ENOTTY;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 9e843a147ead..8d313122f94e 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -707,6 +707,43 @@ struct vfio_device_ioeventfd {
 
 #define VFIO_DEVICE_IOEVENTFD		_IO(VFIO_TYPE, VFIO_BASE + 16)
 
+/**
+ * VFIO_DEVICE_FEATURE - _IORW(VFIO_TYPE, VFIO_BASE + 17,
+ *			       struct vfio_device_feature
+ *
+ * Get, set, or probe feature data of the device.  The feature is selected
+ * using the FEATURE_MASK portion of the flags field.  Support for a feature
+ * can be probed by setting both the FEATURE_MASK and PROBE bits.  A probe
+ * may optionally include the GET and/or SET bits to determine read vs write
+ * access of the feature respectively.  Probing a feature will return success
+ * if the feature is supporedt and all of the optionally indicated GET/SET
+ * methods are supported.  The format of the data portion of the structure is
+ * specific to the given feature.  The data portion is not required for
+ * probing.
+ *
+ * Return 0 on success, -errno on failure.
+ */
+struct vfio_device_feature {
+	__u32	argsz;
+	__u32	flags;
+#define VFIO_DEVICE_FEATURE_MASK	(0xffff) /* 16-bit feature index */
+#define VFIO_DEVICE_FEATURE_GET		(1 << 16) /* Get feature into data[] */
+#define VFIO_DEVICE_FEATURE_SET		(1 << 17) /* Set feature from data[] */
+#define VFIO_DEVICE_FEATURE_PROBE	(1 << 18) /* Probe feature support */
+	__u8	data[];
+};
+
+#define VFIO_DEVICE_FEATURE		_IO(VFIO_TYPE, VFIO_BASE + 17)
+
+/*
+ * Provide support for setting a PCI VF Token, which is used as a shared
+ * secret between PF and VF drivers.  This feature may only be set on a
+ * PCI SR-IOV PF when SR-IOV is enabled on the PF and there are no existing
+ * open VFs.  Data provided when setting this feature is a 16-byte array
+ * (__u8 b[16]), representing a UUID.
+ */
+#define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN	(0)
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 5/7] vfio/pci: Add sriov_configure support
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (3 preceding siblings ...)
  2020-02-04 23:06 ` [RFC PATCH 4/7] vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user Alex Williamson
@ 2020-02-04 23:06 ` Alex Williamson
  2020-02-04 23:06 ` [RFC PATCH 6/7] vfio/pci: Remove dev_fmt definition Alex Williamson
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:06 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

With the VF Token interface we can now expect that a vfio userspace
driver must be in collaboration with the PF driver, an unwitting
userspace driver will not be able to get past the GET_DEVICE_FD step
in accessing the device.  We can now move on to actually allowing
SR-IOV to be enabled by vfio-pci on the PF.  Support for this is not
enabled by default in this commit, but it does provide a module option
for this to be enabled (enable_sriov=1).  Enabling VFs is rather
straightforward, except we don't want to risk that a VF might get
autoprobed and bound to other drivers, so a bus notifier is used to
"capture" VFs to vfio-pci using the driver_override support.  We
assume any later action to bind the device to other drivers is
condoned by the system admin and allow it with a log warning.

vfio-pci will disable SR-IOV on a PF before releasing the device,
allowing a VF driver to be assured other drivers cannot take over the
PF and that any other userspace driver must know the shared VF token.
This support also does not provide a mechanism for the PF userspace
driver itself to manipulate SR-IOV.  With this patch SR-IOV can only
be enabled via the host sysfs interface and the PF driver user cannot
create and remove VF.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci.c         |  113 ++++++++++++++++++++++++++++++++---
 drivers/vfio/pci/vfio_pci_private.h |    2 +
 2 files changed, 104 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index d22a9d7bc32a..026308aa18b5 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -54,6 +54,10 @@ module_param(disable_idle_d3, bool, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(disable_idle_d3,
 		 "Disable using the PCI D3 low power state for idle, unused devices");
 
+static bool enable_sriov;
+module_param(enable_sriov, bool, 0644);
+MODULE_PARM_DESC(enable_sriov, "Enable support for SR-IOV configuration");
+
 static inline bool vfio_vga_disabled(void)
 {
 #ifdef CONFIG_VFIO_PCI_VGA
@@ -1450,6 +1454,34 @@ static const struct vfio_device_ops vfio_pci_ops = {
 
 static int vfio_pci_reflck_attach(struct vfio_pci_device *vdev);
 static void vfio_pci_reflck_put(struct vfio_pci_reflck *reflck);
+static struct pci_driver vfio_pci_driver;
+
+static int vfio_pci_bus_notifier(struct notifier_block *nb,
+				 unsigned long action, void *data)
+{
+	struct vfio_pci_device *vdev = container_of(nb,
+						    struct vfio_pci_device, nb);
+	struct device *dev = data;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (action == BUS_NOTIFY_ADD_DEVICE &&
+	    pdev->is_virtfn && pdev->physfn == vdev->pdev) {
+		pci_info(vdev->pdev, "Captured SR-IOV VF %s driver_override\n",
+			 pci_name(pdev));
+		pdev->driver_override = kasprintf(GFP_KERNEL, "%s",
+						  vfio_pci_ops.name);
+	} else if (action == BUS_NOTIFY_BOUND_DRIVER &&
+		   pdev->is_virtfn && pdev->physfn == vdev->pdev) {
+		struct pci_driver *drv = pci_dev_driver(pdev);
+
+		if (drv && drv != &vfio_pci_driver)
+			pci_warn(vdev->pdev,
+				 "VF %s bound to driver %s while PF bound to vfio-pci\n",
+				 pci_name(pdev), drv->name);
+	}
+
+	return 0;
+}
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
@@ -1461,12 +1493,12 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return -EINVAL;
 
 	/*
-	 * Prevent binding to PFs with VFs enabled, this too easily allows
-	 * userspace instance with VFs and PFs from the same device, which
-	 * cannot work.  Disabling SR-IOV here would initiate removing the
-	 * VFs, which would unbind the driver, which is prone to blocking
-	 * if that VF is also in use by vfio-pci.  Just reject these PFs
-	 * and let the user sort it out.
+	 * Prevent binding to PFs with VFs enabled, the VFs might be in use
+	 * by the host or other users.  We cannot capture the VFs if they
+	 * already exist, nor can we track VF users.  Disabling SR-IOV here
+	 * would initiate removing the VFs, which would unbind the driver,
+	 * which is prone to blocking if that VF is also in use by vfio-pci.
+	 * Just reject these PFs and let the user sort it out.
 	 */
 	if (pci_num_vf(pdev)) {
 		pci_warn(pdev, "Cannot bind to PF with SR-IOV enabled\n");
@@ -1514,6 +1546,18 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 			kfree(vdev);
 			return -ENOMEM;
 		}
+
+		vdev->nb.notifier_call = vfio_pci_bus_notifier;
+		ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
+		if (ret) {
+			kfree(vdev->vf_token);
+			vfio_pci_reflck_put(vdev->reflck);
+			vfio_del_group_dev(&pdev->dev);
+			vfio_iommu_group_put(group, &pdev->dev);
+			kfree(vdev);
+			return ret;
+		}
+
 		mutex_init(&vdev->vf_token->lock);
 		uuid_gen(&vdev->vf_token->uuid);
 	}
@@ -1547,6 +1591,8 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 {
 	struct vfio_pci_device *vdev;
 
+	pci_disable_sriov(pdev);
+
 	vdev = vfio_del_group_dev(&pdev->dev);
 	if (!vdev)
 		return;
@@ -1557,6 +1603,9 @@ static void vfio_pci_remove(struct pci_dev *pdev)
 		kfree(vdev->vf_token);
 	}
 
+	if (vdev->nb.notifier_call)
+		bus_unregister_notifier(&pci_bus_type, &vdev->nb);
+
 	vfio_pci_reflck_put(vdev->reflck);
 
 	vfio_iommu_group_put(pdev->dev.iommu_group, &pdev->dev);
@@ -1605,16 +1654,58 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
 	return PCI_ERS_RESULT_CAN_RECOVER;
 }
 
+static int vfio_pci_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
+{
+	struct vfio_pci_device *vdev;
+	struct vfio_device *device;
+	int ret;
+
+	might_sleep();
+
+	if (!enable_sriov)
+		return -ENOENT;
+
+	device = vfio_device_get_from_dev(&pdev->dev);
+	if (!device)
+		return -ENODEV;
+
+	vdev = vfio_device_data(device);
+	if (!vdev) {
+		vfio_device_put(device);
+		return -ENODEV;
+	}
+
+	mutex_lock(&vdev->reflck->lock);
+
+	if (vdev->refcnt) {
+		mutex_unlock(&vdev->reflck->lock);
+		vfio_device_put(device);
+		return -EBUSY;
+	}
+
+	if (nr_virtfn == 0) {
+		pci_disable_sriov(pdev);
+		ret = 0;
+	} else
+		ret = pci_enable_sriov(pdev, nr_virtfn);
+
+	mutex_unlock(&vdev->reflck->lock);
+	vfio_device_put(device);
+
+	return ret < 0 ? ret : nr_virtfn;
+}
+
 static const struct pci_error_handlers vfio_err_handlers = {
 	.error_detected = vfio_pci_aer_err_detected,
 };
 
 static struct pci_driver vfio_pci_driver = {
-	.name		= "vfio-pci",
-	.id_table	= NULL, /* only dynamic ids */
-	.probe		= vfio_pci_probe,
-	.remove		= vfio_pci_remove,
-	.err_handler	= &vfio_err_handlers,
+	.name			= "vfio-pci",
+	.id_table		= NULL, /* only dynamic ids */
+	.probe			= vfio_pci_probe,
+	.remove			= vfio_pci_remove,
+	.sriov_configure	= vfio_pci_sriov_configure,
+	.err_handler		= &vfio_err_handlers,
 };
 
 static DEFINE_MUTEX(reflck_lock);
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 4ca250207ab6..9951e2557f47 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -13,6 +13,7 @@
 #include <linux/irqbypass.h>
 #include <linux/types.h>
 #include <linux/uuid.h>
+#include <linux/notifier.h>
 
 #ifndef VFIO_PCI_PRIVATE_H
 #define VFIO_PCI_PRIVATE_H
@@ -130,6 +131,7 @@ struct vfio_pci_device {
 	struct mutex		ioeventfds_lock;
 	struct list_head	ioeventfds_list;
 	struct vfio_pci_vf_token	*vf_token;
+	struct notifier_block	nb;
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 6/7] vfio/pci: Remove dev_fmt definition
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (4 preceding siblings ...)
  2020-02-04 23:06 ` [RFC PATCH 5/7] vfio/pci: Add sriov_configure support Alex Williamson
@ 2020-02-04 23:06 ` Alex Williamson
  2020-02-06 13:45   ` Cornelia Huck
  2020-02-04 23:06 ` [RFC PATCH 7/7] vfio/pci: Cleanup .probe() exit paths Alex Williamson
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:06 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

It currently results in messages like:

 "vfio-pci 0000:03:00.0: vfio_pci: ..."

Which is quite a bit redundant.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 026308aa18b5..343fe38ed06b 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -9,7 +9,6 @@
  */
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-#define dev_fmt pr_fmt
 
 #include <linux/device.h>
 #include <linux/eventfd.h>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC PATCH 7/7] vfio/pci: Cleanup .probe() exit paths
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (5 preceding siblings ...)
  2020-02-04 23:06 ` [RFC PATCH 6/7] vfio/pci: Remove dev_fmt definition Alex Williamson
@ 2020-02-04 23:06 ` Alex Williamson
  2020-02-04 23:17 ` [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:06 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

The cleanup is getting a tad long.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/vfio/pci/vfio_pci.c |   54 ++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 29 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 343fe38ed06b..72746f3a137a 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1510,8 +1510,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
 	if (!vdev) {
-		vfio_iommu_group_put(group, &pdev->dev);
-		return -ENOMEM;
+		ret = -ENOMEM;
+		goto out_group_put;
 	}
 
 	vdev->pdev = pdev;
@@ -1522,43 +1522,27 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	INIT_LIST_HEAD(&vdev->ioeventfds_list);
 
 	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
-	if (ret) {
-		vfio_iommu_group_put(group, &pdev->dev);
-		kfree(vdev);
-		return ret;
-	}
+	if (ret)
+		goto out_free;
 
 	ret = vfio_pci_reflck_attach(vdev);
-	if (ret) {
-		vfio_del_group_dev(&pdev->dev);
-		vfio_iommu_group_put(group, &pdev->dev);
-		kfree(vdev);
-		return ret;
-	}
+	if (ret)
+		goto out_del_group_dev;
 
 	if (pdev->sriov) {
 		vdev->vf_token = kzalloc(sizeof(*vdev->vf_token), GFP_KERNEL);
 		if (!vdev->vf_token) {
-			vfio_pci_reflck_put(vdev->reflck);
-			vfio_del_group_dev(&pdev->dev);
-			vfio_iommu_group_put(group, &pdev->dev);
-			kfree(vdev);
-			return -ENOMEM;
-		}
-
-		vdev->nb.notifier_call = vfio_pci_bus_notifier;
-		ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
-		if (ret) {
-			kfree(vdev->vf_token);
-			vfio_pci_reflck_put(vdev->reflck);
-			vfio_del_group_dev(&pdev->dev);
-			vfio_iommu_group_put(group, &pdev->dev);
-			kfree(vdev);
-			return ret;
+			ret = -ENOMEM;
+			goto out_reflck;
 		}
 
 		mutex_init(&vdev->vf_token->lock);
 		uuid_gen(&vdev->vf_token->uuid);
+
+		vdev->nb.notifier_call = vfio_pci_bus_notifier;
+		ret = bus_register_notifier(&pci_bus_type, &vdev->nb);
+		if (ret)
+			goto out_vf_token;
 	}
 
 	if (vfio_pci_is_vga(pdev)) {
@@ -1584,6 +1568,18 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	}
 
 	return ret;
+
+out_vf_token:
+	kfree(vdev->vf_token);
+out_reflck:
+	vfio_pci_reflck_put(vdev->reflck);
+out_del_group_dev:
+	vfio_del_group_dev(&pdev->dev);
+out_free:
+	kfree(vdev);
+out_group_put:
+	vfio_iommu_group_put(group, &pdev->dev);
+	return ret;
 }
 
 static void vfio_pci_remove(struct pci_dev *pdev)


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (6 preceding siblings ...)
  2020-02-04 23:06 ` [RFC PATCH 7/7] vfio/pci: Cleanup .probe() exit paths Alex Williamson
@ 2020-02-04 23:17 ` Alex Williamson
  2020-02-05  7:57   ` Liu, Yi L
  2020-02-05  7:01 ` Christoph Hellwig
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2020-02-04 23:17 UTC (permalink / raw)
  To: kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck


Promised example QEMU test case...

commit 3557c63bcb286c71f3f7242cad632edd9e297d26
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Tue Feb 4 13:47:41 2020 -0700

    vfio-pci: QEMU support for vfio-pci VF tokens
    
    Example support for using a vf_token to gain access to a device as
    well as using the VFIO_DEVICE_FEATURE interface to set the VF token.
    Note that the kernel will disregard the additional option where it's
    not required, such as opening the PF with no VF users, so we can
    always provide it.
    
    NB. It's unclear whether there's value to this QEMU support without
    further exposure of SR-IOV within a VM.  This is meant mostly as a
    test case where the real initial users will likely be DPDK drivers.
    
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 337a173ce7c6..b755b4caa870 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2816,12 +2816,45 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         goto error;
     }
 
-    ret = vfio_get_device(group, vdev->vbasedev.name, &vdev->vbasedev, errp);
+    if (!qemu_uuid_is_null(&vdev->vf_token)) {
+        char *uuid = qemu_uuid_unparse_strdup(&vdev->vf_token);
+
+        tmp = g_strdup_printf("%s vf_token=%s", vdev->vbasedev.name, uuid);
+        g_free(uuid);
+    } else {
+        tmp = g_strdup_printf("%s", vdev->vbasedev.name);
+    }
+
+    ret = vfio_get_device(group, tmp, &vdev->vbasedev, errp);
+
+    g_free(tmp);
+
     if (ret) {
         vfio_put_group(group);
         goto error;
     }
 
+    if (!qemu_uuid_is_null(&vdev->vf_token)) {
+        struct vfio_device_feature *feature;
+
+        feature = g_malloc0(sizeof(*feature) + 16);
+        feature->argsz = sizeof(*feature) + 16;
+        feature->flags = VFIO_DEVICE_FEATURE_PROBE | VFIO_DEVICE_FEATURE_SET |
+                         VFIO_DEVICE_FEATURE_PCI_VF_TOKEN;
+
+        if (!ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feature)) {
+            feature->flags &= ~VFIO_DEVICE_FEATURE_PROBE;
+            memcpy(&feature->data, vdev->vf_token.data, 16);
+            if (ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feature)) {
+                g_free(feature);
+                error_setg_errno(errp, errno, "failed to set vf_token UUID");
+                goto error;
+            }
+        }
+
+        g_free(feature);
+    }
+
     vfio_populate_device(vdev, &err);
     if (err) {
         error_propagate(errp, err);
@@ -3149,6 +3182,7 @@ static void vfio_instance_init(Object *obj)
 static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
     DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
+    DEFINE_PROP_UUID_NODEFAULT("vf_token", VFIOPCIDevice, vf_token),
     DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice,
                             display, ON_OFF_AUTO_OFF),
     DEFINE_PROP_UINT32("xres", VFIOPCIDevice, display_xres, 0),
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 35626cd63e9d..7f25672d7500 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -18,6 +18,7 @@
 #include "qemu/event_notifier.h"
 #include "qemu/queue.h"
 #include "qemu/timer.h"
+#include "qemu/uuid.h"
 
 #define PCI_ANY_ID (~0)
 
@@ -170,6 +171,7 @@ typedef struct VFIOPCIDevice {
     VFIODisplay *dpy;
     Error *migration_blocker;
     Notifier irqchip_change_notifier;
+    QemuUUID vf_token;
 } VFIOPCIDevice;
 
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index fb10370d2928..9bc28cc1d272 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -707,6 +707,43 @@ struct vfio_device_ioeventfd {
 
 #define VFIO_DEVICE_IOEVENTFD		_IO(VFIO_TYPE, VFIO_BASE + 16)
 
+/**
+ * VFIO_DEVICE_FEATURE - _IORW(VFIO_TYPE, VFIO_BASE + 17,
+ *			       struct vfio_device_feature
+ *
+ * Get, set, or probe feature data of the device.  The feature is selected
+ * using the FEATURE_MASK portion of the flags field.  Support for a feature
+ * can be probed by setting both the FEATURE_MASK and PROBE bits.  A probe
+ * may optionally include the GET and/or SET bits to determine read vs write
+ * access of the feature respectively.  Probing a feature will return success
+ * if the feature is supporedt and all of the optionally indicated GET/SET
+ * methods are supported.  The format of the data portion of the structure is
+ * specific to the given feature.  The data portion is not required for
+ * probing.
+ *
+ * Return 0 on success, -errno on failure.
+ */
+struct vfio_device_feature {
+	__u32	argsz;
+	__u32	flags;
+#define VFIO_DEVICE_FEATURE_MASK	(0xffff) /* 16-bit feature index */
+#define VFIO_DEVICE_FEATURE_GET		(1 << 16) /* Get feature into data[] */
+#define VFIO_DEVICE_FEATURE_SET		(1 << 17) /* Set feature from data[] */
+#define VFIO_DEVICE_FEATURE_PROBE	(1 << 18) /* Probe feature support */
+	__u8	data[];
+};
+
+#define VFIO_DEVICE_FEATURE		_IO(VFIO_TYPE, VFIO_BASE + 17)
+
+/*
+ * Provide support for setting a PCI VF Token, which is used as a shared
+ * secret between PF and VF drivers.  This feature may only be set on a
+ * PCI SR-IOV PF when SR-IOV is enabled on the PF and there are no existing
+ * open VFs.  Data provided when setting this feature is a 16-byte array
+ * (__u8 b[16]), representing a UUID.
+ */
+#define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN	(0)
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (7 preceding siblings ...)
  2020-02-04 23:17 ` [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
@ 2020-02-05  7:01 ` Christoph Hellwig
  2020-02-05 13:58   ` Alex Williamson
  2020-02-05  7:57 ` Liu, Yi L
  2020-02-11 11:18 ` Jerin Jacob
  10 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2020-02-05  7:01 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

On Tue, Feb 04, 2020 at 04:05:34PM -0700, Alex Williamson wrote:
> We address this in a few ways in this series.  First, we can use a bus
> notifier and the driver_override facility to make sure VFs are bound
> to the vfio-pci driver by default.  This should eliminate the chance
> that a VF is accidentally bound and used by host drivers.  We don't
> however remove the ability for a host admin to change this override.

That is just such a bad idea.  Using VFs in the host is a perfectly
valid use case that you are breaking.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (8 preceding siblings ...)
  2020-02-05  7:01 ` Christoph Hellwig
@ 2020-02-05  7:57 ` Liu, Yi L
  2020-02-05 14:10   ` Alex Williamson
  2020-02-11 11:18 ` Jerin Jacob
  10 siblings, 1 reply; 25+ messages in thread
From: Liu, Yi L @ 2020-02-05  7:57 UTC (permalink / raw)
  To: Alex Williamson, kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, Richardson, Bruce, cohuck

Hi Alex,

Silly questions on the background:

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, February 5, 2020 7:06 AM
> Subject: [RFC PATCH 0/7] vfio/pci: SR-IOV support
> 
> There seems to be an ongoing desire to use userspace, vfio-based
> drivers for both SR-IOV PF and VF devices. 

Is this series to make PF be bound-able to vfio-pci even SR-IOV is
enabled on such PFs? If yes, is it allowed to assign PF to a VM? or
it can only be used by userspace applications like DPDK?

> The fundamental issue
> with this concept is that the VF is not fully independent of the PF
> driver.  Minimally the PF driver might be able to deny service to the
> VF, VF data paths might be dependent on the state of the PF device,
> or the PF my have some degree of ability to inspect or manipulate the
> VF data.  It therefore would seem irresponsible to unleash VFs onto
> the system, managed by a user owned PF.
> 
> We address this in a few ways in this series.  First, we can use a bus
> notifier and the driver_override facility to make sure VFs are bound
> to the vfio-pci driver by default.  This should eliminate the chance
> that a VF is accidentally bound and used by host drivers.  We don't
> however remove the ability for a host admin to change this override.
> 
> The next issue we need to address is how we let userspace drivers
> opt-in to this participation with the PF driver.  We do not want an
> admin to be able to unwittingly assign one of these VFs to a tenant
> that isn't working in collaboration with the PF driver.  We could use
> IOMMU grouping, but this seems to push too far towards tightly coupled
> PF and VF drivers.  This series introduces a "VF token", implemented
> as a UUID, as a shared secret between PF and VF drivers.  The token
> needs to be set by the PF driver and used as part of the device
> matching by the VF driver.  Provisions in the code also account for
> restarting the PF driver with active VF drivers, requiring the PF to
> use the current token to re-gain access to the PF.

How about the scenario in which PF driver is vfio-based userspace
driver but VF drivers are mixed. This means not all VFs are bound
to vfio-based userspace driver. Is it also supported here? :-)

Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC PATCH 3/7] vfio/pci: Introduce VF token
  2020-02-04 23:06 ` [RFC PATCH 3/7] vfio/pci: Introduce VF token Alex Williamson
@ 2020-02-05  7:57   ` Liu, Yi L
  2020-02-05 14:13     ` Alex Williamson
  0 siblings, 1 reply; 25+ messages in thread
From: Liu, Yi L @ 2020-02-05  7:57 UTC (permalink / raw)
  To: Alex Williamson, kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, Richardson, Bruce, cohuck

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, February 5, 2020 7:06 AM
> To: kvm@vger.kernel.org
> Subject: [RFC PATCH 3/7] vfio/pci: Introduce VF token
> 
> If we enable SR-IOV on a vfio-pci owned PF, the resulting VFs are not
> fully isolated from the PF.  The PF can always cause a denial of
> service to the VF, if not access data passed through the VF directly.
> This is why vfio-pci currently does not bind to PFs with SR-IOV enabled
> and does not provide access itself to enabling SR-IOV on a PF.  The
> IOMMU grouping mechanism might allow us a solution to this lack of
> isolation, however the deficiency isn't actually in the DMA path, so
> much as the potential cooperation between PF and VF devices.  Also,
> if we were to force VFs into the same IOMMU group as the PF, we severely
> limit the utility of having independent drivers managing PFs and VFs
> with vfio.
> 
> Therefore we introduce the concept of a VF token.  The token is
> implemented as a UUID and represents a shared secret which must be set
> by the PF driver and used by the VF drivers in order to access a vfio
> device file descriptor for the VF.  The ioctl to set the VF token will
> be provided in a later commit, this commit implements the underlying
> infrastructure.  The concept here is to augment the string the user
> passes to match a device within a group in order to retrieve access to
> the device descriptor.  For example, rather than passing only the PCI
> device name (ex. "0000:03:00.0") the user would also pass a vf_token
> UUID (ex. "2ab74924-c335-45f4-9b16-8569e5b08258").  The device match
> string therefore becomes:
> 
> "0000:03:00.0 vf_token=2ab74924-c335-45f4-9b16-8569e5b08258"
> 
> This syntax is expected to be extensible to future options as well, with
> the standard being:
> 
> "$DEVICE_NAME $OPTION1=$VALUE1 $OPTION2=$VALUE2"
> 
> The device name must be first and option=value pairs are separated by
> spaces.
> 
> The vf_token option is only required for VFs where the PF device is
> bound to vfio-pci.  There is no change for PFs using existing host
> drivers.
> 
> Note that in order to protect existing VF users, not only is it required
> to set a vf_token on the PF before VFs devices can be accessed, but also
> if there are existing VF users, (re)opening the PF device must also
> provide the current vf_token as authentication.  This is intended to
> prevent a VF driver starting with a trusted PF driver and later being
> replaced by an unknown driver.  A vf_token is not required to open the
> PF device when none of the VF devices are in use by vfio-pci drivers.

So vfio_token is a kind of per-PF token?

Regards,
Yi Liu


^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-04 23:17 ` [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
@ 2020-02-05  7:57   ` Liu, Yi L
  2020-02-05 14:18     ` Alex Williamson
  0 siblings, 1 reply; 25+ messages in thread
From: Liu, Yi L @ 2020-02-05  7:57 UTC (permalink / raw)
  To: Alex Williamson, kvm
  Cc: linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, Richardson, Bruce, cohuck

> From: Alex Williamson <alex.williamson@redhat.com>
> Sent: Wednesday, February 5, 2020 7:18 AM
> To: kvm@vger.kernel.org
> Subject: Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
> 
> 
> Promised example QEMU test case...
> 
> commit 3557c63bcb286c71f3f7242cad632edd9e297d26
> Author: Alex Williamson <alex.williamson@redhat.com>
> Date:   Tue Feb 4 13:47:41 2020 -0700
> 
>     vfio-pci: QEMU support for vfio-pci VF tokens
> 
>     Example support for using a vf_token to gain access to a device as
>     well as using the VFIO_DEVICE_FEATURE interface to set the VF token.
>     Note that the kernel will disregard the additional option where it's
>     not required, such as opening the PF with no VF users, so we can
>     always provide it.
> 
>     NB. It's unclear whether there's value to this QEMU support without
>     further exposure of SR-IOV within a VM.  This is meant mostly as a
>     test case where the real initial users will likely be DPDK drivers.
> 
>     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Just curious how UUID is used across the test. Should the QEMU
which opens VFs add the vfio_token=UUID or the QEMU which
opens PF add the vfio_token=UUID? or both should add vfio_token=UUID.

Regards,
Yi Liu



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-05  7:01 ` Christoph Hellwig
@ 2020-02-05 13:58   ` Alex Williamson
  0 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-05 13:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson, cohuck

On Tue, 4 Feb 2020 23:01:09 -0800
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Feb 04, 2020 at 04:05:34PM -0700, Alex Williamson wrote:
> > We address this in a few ways in this series.  First, we can use a bus
> > notifier and the driver_override facility to make sure VFs are bound
> > to the vfio-pci driver by default.  This should eliminate the chance
> > that a VF is accidentally bound and used by host drivers.  We don't
> > however remove the ability for a host admin to change this override.  
> 
> That is just such a bad idea.  Using VFs in the host is a perfectly
> valid use case that you are breaking.

vfio-pci currently does not allow binding to a PF with VFs enabled and
does not provide an sriov_configure callback, so it's not possible to
have VFs on a vfio-pci bound PF.  Therefore I'm not breaking any
existing use cases.  I'm also not preventing VFs from being used in the
host, I only set a default driver_override value, which can be replaced
if a different driver binding is desired.  So I also don't see that I'm
breaking a usage model here.  I do stand by the idea that VFs sourced
from a user owned PF should not by default be used in the host (ie.
autoprobed on device add).  There's a pci-pf-stub driver that can be
used to create VFs on a PF if no userspace access of the PF is required.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-05  7:57 ` Liu, Yi L
@ 2020-02-05 14:10   ` Alex Williamson
  0 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-05 14:10 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, Richardson, Bruce, cohuck

On Wed, 5 Feb 2020 07:57:21 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> Hi Alex,
> 
> Silly questions on the background:
> 
> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, February 5, 2020 7:06 AM
> > Subject: [RFC PATCH 0/7] vfio/pci: SR-IOV support
> > 
> > There seems to be an ongoing desire to use userspace, vfio-based
> > drivers for both SR-IOV PF and VF devices.   
> 
> Is this series to make PF be bound-able to vfio-pci even SR-IOV is
> enabled on such PFs? If yes, is it allowed to assign PF to a VM? or
> it can only be used by userspace applications like DPDK?

No, this series does not change the behavior of vfio-pci with respect
to probing a PF where VFs are already enabled.  This is still
disallowed.  I haven't seen a use case that requires this and allowing
it tends to subvert the restrictions here.  For instance, if an
existing VF is already in use by a vfio-pci driver, the PF can
transition from a trusted host driver to an unknown userspace driver.

> > The fundamental issue
> > with this concept is that the VF is not fully independent of the PF
> > driver.  Minimally the PF driver might be able to deny service to the
> > VF, VF data paths might be dependent on the state of the PF device,
> > or the PF my have some degree of ability to inspect or manipulate the
> > VF data.  It therefore would seem irresponsible to unleash VFs onto
> > the system, managed by a user owned PF.
> > 
> > We address this in a few ways in this series.  First, we can use a bus
> > notifier and the driver_override facility to make sure VFs are bound
> > to the vfio-pci driver by default.  This should eliminate the chance
> > that a VF is accidentally bound and used by host drivers.  We don't
> > however remove the ability for a host admin to change this override.
> > 
> > The next issue we need to address is how we let userspace drivers
> > opt-in to this participation with the PF driver.  We do not want an
> > admin to be able to unwittingly assign one of these VFs to a tenant
> > that isn't working in collaboration with the PF driver.  We could use
> > IOMMU grouping, but this seems to push too far towards tightly coupled
> > PF and VF drivers.  This series introduces a "VF token", implemented
> > as a UUID, as a shared secret between PF and VF drivers.  The token
> > needs to be set by the PF driver and used as part of the device
> > matching by the VF driver.  Provisions in the code also account for
> > restarting the PF driver with active VF drivers, requiring the PF to
> > use the current token to re-gain access to the PF.  
> 
> How about the scenario in which PF driver is vfio-based userspace
> driver but VF drivers are mixed. This means not all VFs are bound
> to vfio-based userspace driver. Is it also supported here? :-)

It's allowed.  Userspace VF drivers will need to participate in the VF
token scheme, host drivers may be bound to VFs normally after removing
the default driver_override.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 3/7] vfio/pci: Introduce VF token
  2020-02-05  7:57   ` Liu, Yi L
@ 2020-02-05 14:13     ` Alex Williamson
  0 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-05 14:13 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, Richardson, Bruce, cohuck

On Wed, 5 Feb 2020 07:57:29 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, February 5, 2020 7:06 AM
> > To: kvm@vger.kernel.org
> > Subject: [RFC PATCH 3/7] vfio/pci: Introduce VF token
> > 
> > If we enable SR-IOV on a vfio-pci owned PF, the resulting VFs are not
> > fully isolated from the PF.  The PF can always cause a denial of
> > service to the VF, if not access data passed through the VF directly.
> > This is why vfio-pci currently does not bind to PFs with SR-IOV enabled
> > and does not provide access itself to enabling SR-IOV on a PF.  The
> > IOMMU grouping mechanism might allow us a solution to this lack of
> > isolation, however the deficiency isn't actually in the DMA path, so
> > much as the potential cooperation between PF and VF devices.  Also,
> > if we were to force VFs into the same IOMMU group as the PF, we severely
> > limit the utility of having independent drivers managing PFs and VFs
> > with vfio.
> > 
> > Therefore we introduce the concept of a VF token.  The token is
> > implemented as a UUID and represents a shared secret which must be set
> > by the PF driver and used by the VF drivers in order to access a vfio
> > device file descriptor for the VF.  The ioctl to set the VF token will
> > be provided in a later commit, this commit implements the underlying
> > infrastructure.  The concept here is to augment the string the user
> > passes to match a device within a group in order to retrieve access to
> > the device descriptor.  For example, rather than passing only the PCI
> > device name (ex. "0000:03:00.0") the user would also pass a vf_token
> > UUID (ex. "2ab74924-c335-45f4-9b16-8569e5b08258").  The device match
> > string therefore becomes:
> > 
> > "0000:03:00.0 vf_token=2ab74924-c335-45f4-9b16-8569e5b08258"
> > 
> > This syntax is expected to be extensible to future options as well, with
> > the standard being:
> > 
> > "$DEVICE_NAME $OPTION1=$VALUE1 $OPTION2=$VALUE2"
> > 
> > The device name must be first and option=value pairs are separated by
> > spaces.
> > 
> > The vf_token option is only required for VFs where the PF device is
> > bound to vfio-pci.  There is no change for PFs using existing host
> > drivers.
> > 
> > Note that in order to protect existing VF users, not only is it required
> > to set a vf_token on the PF before VFs devices can be accessed, but also
> > if there are existing VF users, (re)opening the PF device must also
> > provide the current vf_token as authentication.  This is intended to
> > prevent a VF driver starting with a trusted PF driver and later being
> > replaced by an unknown driver.  A vf_token is not required to open the
> > PF device when none of the VF devices are in use by vfio-pci drivers.  
> 
> So vfio_token is a kind of per-PF token?

Yes, the token is per-PF.  Note that the token can be changed and it
does not "de-authenticate" opened VFs.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-05  7:57   ` Liu, Yi L
@ 2020-02-05 14:18     ` Alex Williamson
  0 siblings, 0 replies; 25+ messages in thread
From: Alex Williamson @ 2020-02-05 14:18 UTC (permalink / raw)
  To: Liu, Yi L
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, Richardson, Bruce, cohuck

On Wed, 5 Feb 2020 07:57:36 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Wednesday, February 5, 2020 7:18 AM
> > To: kvm@vger.kernel.org
> > Subject: Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
> > 
> > 
> > Promised example QEMU test case...
> > 
> > commit 3557c63bcb286c71f3f7242cad632edd9e297d26
> > Author: Alex Williamson <alex.williamson@redhat.com>
> > Date:   Tue Feb 4 13:47:41 2020 -0700
> > 
> >     vfio-pci: QEMU support for vfio-pci VF tokens
> > 
> >     Example support for using a vf_token to gain access to a device as
> >     well as using the VFIO_DEVICE_FEATURE interface to set the VF token.
> >     Note that the kernel will disregard the additional option where it's
> >     not required, such as opening the PF with no VF users, so we can
> >     always provide it.
> > 
> >     NB. It's unclear whether there's value to this QEMU support without
> >     further exposure of SR-IOV within a VM.  This is meant mostly as a
> >     test case where the real initial users will likely be DPDK drivers.
> > 
> >     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>  
> 
> Just curious how UUID is used across the test. Should the QEMU
> which opens VFs add the vfio_token=UUID or the QEMU which
> opens PF add the vfio_token=UUID? or both should add vfio_token=UUID.

In this example we do both as this covers the case where there are
existing VF users, which requires the PF to also provide the vf_token.
If there are no VF users, the PF is not required to provide a vf_token
and vfio-pci will not fail the device match if a vf_token is provided
but not needed.  In fact, when a PF is probed by vfio-pci a random
vf_token is set, so it's required to use a PF driver to set a known
vf_token before any VF users can access their VFs.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks
  2020-02-04 23:05 ` [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks Alex Williamson
@ 2020-02-06 11:14   ` Cornelia Huck
  2020-02-06 18:18     ` Alex Williamson
  0 siblings, 1 reply; 25+ messages in thread
From: Cornelia Huck @ 2020-02-06 11:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson

On Tue, 04 Feb 2020 16:05:43 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> Allow bus drivers to provide their own callback to match a device to
> the user provided string.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/vfio/vfio.c  |   19 +++++++++++++++----
>  include/linux/vfio.h |    3 +++
>  2 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 388597930b64..dda1726adda8 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -875,11 +875,22 @@ EXPORT_SYMBOL_GPL(vfio_device_get_from_dev);
>  static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
>  						     char *buf)
>  {
> -	struct vfio_device *it, *device = NULL;
> +	struct vfio_device *it, *device = ERR_PTR(-ENODEV);
>  
>  	mutex_lock(&group->device_lock);
>  	list_for_each_entry(it, &group->device_list, group_next) {
> -		if (!strcmp(dev_name(it->dev), buf)) {
> +		int ret;
> +
> +		if (it->ops->match) {
> +			ret = it->ops->match(it->device_data, buf);
> +			if (ret < 0 && ret != -ENODEV) {
> +				device = ERR_PTR(ret);
> +				break;
> +			}
> +		} else
> +			ret = strcmp(dev_name(it->dev), buf);

The asymmetric braces look a bit odd.

> +
> +		if (!ret) {
>  			device = it;
>  			vfio_device_get(device);
>  			break;
> @@ -1441,8 +1452,8 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
>  		return -EPERM;
>  
>  	device = vfio_device_get_from_name(group, buf);
> -	if (!device)
> -		return -ENODEV;
> +	if (IS_ERR(device))
> +		return PTR_ERR(device);
>  
>  	ret = device->ops->open(device->device_data);
>  	if (ret) {
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index e42a711a2800..755e0f0e2900 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -26,6 +26,8 @@
>   *         operations documented below
>   * @mmap: Perform mmap(2) on a region of the device file descriptor
>   * @request: Request for the bus driver to release the device
> + * @match: Optional device name match callback (return: 0 for match, -ENODEV
> + *         (or >0) for no match and continue, other -errno: no match and stop)

I'm wondering about these conventions.

If you basically want a tri-state return (match, don't match/continue,
don't match/stop), putting -ENODEV and >0 together feels odd. I would
rather expect either
- < 0 == don't match/stop, 0 == don't match/continue, > 0 == match, or
- 0 == match, -ENODEV (or some other defined error) == don't
  match/continue, all other values == don't match/abort?

>   */
>  struct vfio_device_ops {
>  	char	*name;
> @@ -39,6 +41,7 @@ struct vfio_device_ops {
>  			 unsigned long arg);
>  	int	(*mmap)(void *device_data, struct vm_area_struct *vma);
>  	void	(*request)(void *device_data, unsigned int count);
> +	int	(*match)(void *device_data, char *buf);
>  };
>  
>  extern struct iommu_group *vfio_iommu_group_get(struct device *dev);
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 6/7] vfio/pci: Remove dev_fmt definition
  2020-02-04 23:06 ` [RFC PATCH 6/7] vfio/pci: Remove dev_fmt definition Alex Williamson
@ 2020-02-06 13:45   ` Cornelia Huck
  0 siblings, 0 replies; 25+ messages in thread
From: Cornelia Huck @ 2020-02-06 13:45 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson

On Tue, 04 Feb 2020 16:06:24 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> It currently results in messages like:
> 
>  "vfio-pci 0000:03:00.0: vfio_pci: ..."
> 
> Which is quite a bit redundant.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/vfio/pci/vfio_pci.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 026308aa18b5..343fe38ed06b 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -9,7 +9,6 @@
>   */
>  
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> -#define dev_fmt pr_fmt
>  
>  #include <linux/device.h>
>  #include <linux/eventfd.h>
> 

Yes, that looks a bit superfluous.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks
  2020-02-06 11:14   ` Cornelia Huck
@ 2020-02-06 18:18     ` Alex Williamson
  2020-02-07  9:33       ` Cornelia Huck
  0 siblings, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2020-02-06 18:18 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson

On Thu, 6 Feb 2020 12:14:19 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Tue, 04 Feb 2020 16:05:43 -0700
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > Allow bus drivers to provide their own callback to match a device to
> > the user provided string.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/vfio/vfio.c  |   19 +++++++++++++++----
> >  include/linux/vfio.h |    3 +++
> >  2 files changed, 18 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> > index 388597930b64..dda1726adda8 100644
> > --- a/drivers/vfio/vfio.c
> > +++ b/drivers/vfio/vfio.c
> > @@ -875,11 +875,22 @@ EXPORT_SYMBOL_GPL(vfio_device_get_from_dev);
> >  static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
> >  						     char *buf)
> >  {
> > -	struct vfio_device *it, *device = NULL;
> > +	struct vfio_device *it, *device = ERR_PTR(-ENODEV);
> >  
> >  	mutex_lock(&group->device_lock);
> >  	list_for_each_entry(it, &group->device_list, group_next) {
> > -		if (!strcmp(dev_name(it->dev), buf)) {
> > +		int ret;
> > +
> > +		if (it->ops->match) {
> > +			ret = it->ops->match(it->device_data, buf);
> > +			if (ret < 0 && ret != -ENODEV) {
> > +				device = ERR_PTR(ret);
> > +				break;
> > +			}
> > +		} else
> > +			ret = strcmp(dev_name(it->dev), buf);  
> 
> The asymmetric braces look a bit odd.

Ok

> > +
> > +		if (!ret) {
> >  			device = it;
> >  			vfio_device_get(device);
> >  			break;
> > @@ -1441,8 +1452,8 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
> >  		return -EPERM;
> >  
> >  	device = vfio_device_get_from_name(group, buf);
> > -	if (!device)
> > -		return -ENODEV;
> > +	if (IS_ERR(device))
> > +		return PTR_ERR(device);
> >  
> >  	ret = device->ops->open(device->device_data);
> >  	if (ret) {
> > diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> > index e42a711a2800..755e0f0e2900 100644
> > --- a/include/linux/vfio.h
> > +++ b/include/linux/vfio.h
> > @@ -26,6 +26,8 @@
> >   *         operations documented below
> >   * @mmap: Perform mmap(2) on a region of the device file descriptor
> >   * @request: Request for the bus driver to release the device
> > + * @match: Optional device name match callback (return: 0 for match, -ENODEV
> > + *         (or >0) for no match and continue, other -errno: no match and stop)  
> 
> I'm wondering about these conventions.
> 
> If you basically want a tri-state return (match, don't match/continue,
> don't match/stop), putting -ENODEV and >0 together feels odd. I would
> rather expect either
> - < 0 == don't match/stop, 0 == don't match/continue, > 0 == match, or

So sort of a bool + errno.  I shied away from this because returning
zero for success, or match, is such a common semantic, especially when
we're replacing a simple strcmp().  I suppose it's logically just
!strcmp() though, which avoids the abort case for a simple
implementation like patch 2/7.

> - 0 == match, -ENODEV (or some other defined error) == don't
>   match/continue, all other values == don't match/abort?

This is closest to what I arrived at in this version, but I found it
necessary to exclude positive values from the no-match/abort and
consider them as no-match/continue because I didn't want to assume the
errno for that case.

I think with your first option we arrive at something like this:

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index dda1726adda8..b5609a411139 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -883,14 +883,15 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
 
 		if (it->ops->match) {
 			ret = it->ops->match(it->device_data, buf);
-			if (ret < 0 && ret != -ENODEV) {
+			if (ret < 0) {
 				device = ERR_PTR(ret);
 				break;
 			}
-		} else
-			ret = strcmp(dev_name(it->dev), buf);
+		} else {
+			ret = !strcmp(dev_name(it->dev), buf);
+		}
 
-		if (!ret) {
+		if (ret) {
 			device = it;
 			vfio_device_get(device);
 			break;
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 755e0f0e2900..029694b977f2 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -26,8 +26,9 @@
  *         operations documented below
  * @mmap: Perform mmap(2) on a region of the device file descriptor
  * @request: Request for the bus driver to release the device
- * @match: Optional device name match callback (return: 0 for match, -ENODEV
- *         (or >0) for no match and continue, other -errno: no match and stop)
+ * @match: Optional device name match callback (return: 0 for no-match, >0 for
+ *         match, -errno for abort (ex. match with insufficient or incorrect
+ *         additional args)
  */
 struct vfio_device_ops {
 	char	*name;


I like that.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks
  2020-02-06 18:18     ` Alex Williamson
@ 2020-02-07  9:33       ` Cornelia Huck
  0 siblings, 0 replies; 25+ messages in thread
From: Cornelia Huck @ 2020-02-07  9:33 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, linux-pci, linux-kernel, dev, mtosatti, thomas, bluca,
	jerinjacobk, bruce.richardson

On Thu, 6 Feb 2020 11:18:42 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Thu, 6 Feb 2020 12:14:19 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Tue, 04 Feb 2020 16:05:43 -0700
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> > > Allow bus drivers to provide their own callback to match a device to
> > > the user provided string.
> > > 
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > ---
> > >  drivers/vfio/vfio.c  |   19 +++++++++++++++----
> > >  include/linux/vfio.h |    3 +++

> I think with your first option we arrive at something like this:
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index dda1726adda8..b5609a411139 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -883,14 +883,15 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
>  
>  		if (it->ops->match) {
>  			ret = it->ops->match(it->device_data, buf);
> -			if (ret < 0 && ret != -ENODEV) {
> +			if (ret < 0) {
>  				device = ERR_PTR(ret);
>  				break;
>  			}
> -		} else
> -			ret = strcmp(dev_name(it->dev), buf);
> +		} else {
> +			ret = !strcmp(dev_name(it->dev), buf);
> +		}
>  
> -		if (!ret) {
> +		if (ret) {
>  			device = it;
>  			vfio_device_get(device);
>  			break;
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 755e0f0e2900..029694b977f2 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -26,8 +26,9 @@
>   *         operations documented below
>   * @mmap: Perform mmap(2) on a region of the device file descriptor
>   * @request: Request for the bus driver to release the device
> - * @match: Optional device name match callback (return: 0 for match, -ENODEV
> - *         (or >0) for no match and continue, other -errno: no match and stop)
> + * @match: Optional device name match callback (return: 0 for no-match, >0 for
> + *         match, -errno for abort (ex. match with insufficient or incorrect
> + *         additional args)
>   */
>  struct vfio_device_ops {
>  	char	*name;
> 
> 
> I like that.  Thanks,
> 
> Alex

Looks good to me.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
                   ` (9 preceding siblings ...)
  2020-02-05  7:57 ` Liu, Yi L
@ 2020-02-11 11:18 ` Jerin Jacob
  2020-02-11 13:57   ` [dpdk-dev] " Thomas Monjalon
  2020-02-11 17:06   ` Alex Williamson
  10 siblings, 2 replies; 25+ messages in thread
From: Jerin Jacob @ 2020-02-11 11:18 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, linux-pci, linux-kernel, dpdk-dev, mtosatti,
	Thomas Monjalon, Luca Boccassi, Richardson, Bruce, cohuck,
	Vamsi Attunuru

On Wed, Feb 5, 2020 at 4:35 AM Alex Williamson
<alex.williamson@redhat.com> wrote:
>
> There seems to be an ongoing desire to use userspace, vfio-based
> drivers for both SR-IOV PF and VF devices.  The fundamental issue
> with this concept is that the VF is not fully independent of the PF
> driver.  Minimally the PF driver might be able to deny service to the
> VF, VF data paths might be dependent on the state of the PF device,
> or the PF my have some degree of ability to inspect or manipulate the
> VF data.  It therefore would seem irresponsible to unleash VFs onto
> the system, managed by a user owned PF.
>
> We address this in a few ways in this series.  First, we can use a bus
> notifier and the driver_override facility to make sure VFs are bound
> to the vfio-pci driver by default.  This should eliminate the chance
> that a VF is accidentally bound and used by host drivers.  We don't
> however remove the ability for a host admin to change this override.
>
> The next issue we need to address is how we let userspace drivers
> opt-in to this participation with the PF driver.  We do not want an
> admin to be able to unwittingly assign one of these VFs to a tenant
> that isn't working in collaboration with the PF driver.  We could use
> IOMMU grouping, but this seems to push too far towards tightly coupled
> PF and VF drivers.  This series introduces a "VF token", implemented
> as a UUID, as a shared secret between PF and VF drivers.  The token
> needs to be set by the PF driver and used as part of the device
> matching by the VF driver.  Provisions in the code also account for
> restarting the PF driver with active VF drivers, requiring the PF to
> use the current token to re-gain access to the PF.

Thanks Alex for the series. DPDK realizes this use-case through, an out of
tree igb_uio module, for non VFIO devices. Supporting this use case, with
VFIO, will be a great enhancement for DPDK as we are planning to
get rid of out of tree modules any focus only on userspace aspects.

From the DPDK perspective, we have following use-cases

1) VF representer or OVS/vSwitch  use cases where
DPDK PF acts as an HW switch to steer traffic to VF
using the rte_flow library backed by HW CAMs.

2) Unlike, other PCI class of devices, Network class of PCIe devices
would have additional
capability on the PF devices such as promiscuous mode support etc
leverage that in DPDK
PF and VF use cases.

That would boil down to the use of the following topology.
a)  PF bound to DPDK/VFIO  and  VF bound to Linux
b)  PF bound to DPDK/VFIO  and  VF bound to DPDK/VFIO

Tested the use case (a) and it works this patch. Tested use case(b), it
works with patch provided both PF and VF under the same application.

Regarding the use case where  PF bound to DPDK/VFIO and
VF bound to DPDK/VFIO are _two different_ processes then sharing the UUID
will be a little tricky thing in terms of usage. But if that is the
purpose of bringing
UUID to the equation then it fine.

Overall this series looks good to me.  We can test the next non-RFC
series and give
Tested-by by after testing with DPDK.


>
> The above solutions introduce a bit of a modification to the VFIO ABI
> and an additional ABI extension.  The modification is that the
> VFIO_GROUP_GET_DEVICE_FD ioctl is specified to require a char string
> from the user providing the device name.  For this solution, we extend
> the syntax to allow the device name followed by key/value pairs.  In
> this case we add "vf_token=3e7e882e-1daf-417f-ad8d-882eea5ee337", for
> example.  These options are expected to be space separated.  Matching
> these key/value pairs is entirely left to the vfio bus driver (ex.
> vfio-pci) and the internal ops structure is extended to allow this
> optional support.  This extension should be fully backwards compatible
> to existing userspace, such code will simply fail to open these newly
> exposed devices, as intended.
>
> I've been debating whether instead of the above we should allow the
> user to get the device fd as normal, but restrict the interfaces until
> the user authenticates, but I'm afraid this would be a less backwards
> compatible solution.  It would be just as unclear to the user why a
> device read/write/mmap/ioctl failed as it might be to why getting the
> device fd could fail.  However in the latter case, I believe we do a
> better job of restricting how far userspace code might go before they
> ultimately fail.  I'd welcome discussion in the space, and or course
> the extension of the GET_DEVICE_FD string.
>
> Finally, the user needs to be able to set a VF token.  I add a
> VFIO_DEVICE_FEATURE ioctl for this that's meant to be reusable for
> getting, setting, and probing arbitrary features of a device.
>
> I'll reply to this cover letter with a very basic example of a QEMU
> update to support this interface, though I haven't found a device yet
> that behaves well with the PF running in one VM with the VF in
> another, or really even just a PF running in a VM with SR-IOV enabled.
> I know these devices exist though, and I suspect QEMU will not be the
> primary user of this support for now, but this behavior reaffirms my
> concerns to prevent mis-use.
>
> Please comment.  In particular, does this approach meet the DPDK needs
> for userspace PF and VF drivers, with the hopefully minor hurdle of
> sharing a token between drivers.  The token is of course left to
> userspace how to manage, and might be static (and not very secret) for
> a given set of drivers.  Thanks,
>
> Alex
>
> ---
>
> Alex Williamson (7):
>       vfio: Include optional device match in vfio_device_ops callbacks
>       vfio/pci: Implement match ops
>       vfio/pci: Introduce VF token
>       vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user
>       vfio/pci: Add sriov_configure support
>       vfio/pci: Remove dev_fmt definition
>       vfio/pci: Cleanup .probe() exit paths
>
>
>  drivers/vfio/pci/vfio_pci.c         |  315 ++++++++++++++++++++++++++++++++---
>  drivers/vfio/pci/vfio_pci_private.h |   10 +
>  drivers/vfio/vfio.c                 |   19 ++
>  include/linux/vfio.h                |    3
>  include/uapi/linux/vfio.h           |   37 ++++
>  5 files changed, 356 insertions(+), 28 deletions(-)
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-11 11:18 ` Jerin Jacob
@ 2020-02-11 13:57   ` " Thomas Monjalon
  2020-02-11 17:06   ` Alex Williamson
  1 sibling, 0 replies; 25+ messages in thread
From: Thomas Monjalon @ 2020-02-11 13:57 UTC (permalink / raw)
  To: Alex Williamson
  Cc: dev, kvm, linux-pci, linux-kernel, dev, mtosatti, Luca Boccassi,
	Richardson, Bruce, cohuck, Vamsi Attunuru, Jerin Jacob

11/02/2020 12:18, Jerin Jacob:
> On Wed, Feb 5, 2020 at 4:35 AM Alex Williamson wrote:
> >
> > There seems to be an ongoing desire to use userspace, vfio-based
> > drivers for both SR-IOV PF and VF devices.  The fundamental issue
> > with this concept is that the VF is not fully independent of the PF
> > driver.  Minimally the PF driver might be able to deny service to the
> > VF, VF data paths might be dependent on the state of the PF device,
> > or the PF my have some degree of ability to inspect or manipulate the
> > VF data.  It therefore would seem irresponsible to unleash VFs onto
> > the system, managed by a user owned PF.
> >
> > We address this in a few ways in this series.  First, we can use a bus
> > notifier and the driver_override facility to make sure VFs are bound
> > to the vfio-pci driver by default.  This should eliminate the chance
> > that a VF is accidentally bound and used by host drivers.  We don't
> > however remove the ability for a host admin to change this override.
> >
> > The next issue we need to address is how we let userspace drivers
> > opt-in to this participation with the PF driver.  We do not want an
> > admin to be able to unwittingly assign one of these VFs to a tenant
> > that isn't working in collaboration with the PF driver.  We could use
> > IOMMU grouping, but this seems to push too far towards tightly coupled
> > PF and VF drivers.  This series introduces a "VF token", implemented
> > as a UUID, as a shared secret between PF and VF drivers.  The token
> > needs to be set by the PF driver and used as part of the device
> > matching by the VF driver.  Provisions in the code also account for
> > restarting the PF driver with active VF drivers, requiring the PF to
> > use the current token to re-gain access to the PF.
> 
> Thanks Alex for the series. DPDK realizes this use-case through, an out of
> tree igb_uio module, for non VFIO devices. Supporting this use case, with
> VFIO, will be a great enhancement for DPDK as we are planning to
> get rid of out of tree modules any focus only on userspace aspects.
[..]
> Regarding the use case where  PF bound to DPDK/VFIO and
> VF bound to DPDK/VFIO are _two different_ processes then sharing the UUID
> will be a little tricky thing in terms of usage. But if that is the
> purpose of bringing UUID to the equation then it fine.
> 
> Overall this series looks good to me.  We can test the next non-RFC
> series and give
> Tested-by by after testing with DPDK.
[..]
> > Please comment.  In particular, does this approach meet the DPDK needs
> > for userspace PF and VF drivers, with the hopefully minor hurdle of
> > sharing a token between drivers.  The token is of course left to
> > userspace how to manage, and might be static (and not very secret) for
> > a given set of drivers.  Thanks,

Thanks Alex, it looks to be a great improvement.

In the meantime, DPDK is going to move igb_uio (an out-of-tree
Linux kernel module) from the main DPDK repository to a side-repo.
This move and this patchset will hopefully encourage using VFIO.
As Jerin said, DPDK prefers relying on upstream Linux modules.



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-11 11:18 ` Jerin Jacob
  2020-02-11 13:57   ` [dpdk-dev] " Thomas Monjalon
@ 2020-02-11 17:06   ` Alex Williamson
  2020-02-11 18:03     ` Jerin Jacob
  1 sibling, 1 reply; 25+ messages in thread
From: Alex Williamson @ 2020-02-11 17:06 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: kvm, linux-pci, linux-kernel, dpdk-dev, mtosatti,
	Thomas Monjalon, Luca Boccassi, Richardson, Bruce, cohuck,
	Vamsi Attunuru

On Tue, 11 Feb 2020 16:48:47 +0530
Jerin Jacob <jerinjacobk@gmail.com> wrote:

> On Wed, Feb 5, 2020 at 4:35 AM Alex Williamson
> <alex.williamson@redhat.com> wrote:
> >
> > There seems to be an ongoing desire to use userspace, vfio-based
> > drivers for both SR-IOV PF and VF devices.  The fundamental issue
> > with this concept is that the VF is not fully independent of the PF
> > driver.  Minimally the PF driver might be able to deny service to the
> > VF, VF data paths might be dependent on the state of the PF device,
> > or the PF my have some degree of ability to inspect or manipulate the
> > VF data.  It therefore would seem irresponsible to unleash VFs onto
> > the system, managed by a user owned PF.
> >
> > We address this in a few ways in this series.  First, we can use a bus
> > notifier and the driver_override facility to make sure VFs are bound
> > to the vfio-pci driver by default.  This should eliminate the chance
> > that a VF is accidentally bound and used by host drivers.  We don't
> > however remove the ability for a host admin to change this override.
> >
> > The next issue we need to address is how we let userspace drivers
> > opt-in to this participation with the PF driver.  We do not want an
> > admin to be able to unwittingly assign one of these VFs to a tenant
> > that isn't working in collaboration with the PF driver.  We could use
> > IOMMU grouping, but this seems to push too far towards tightly coupled
> > PF and VF drivers.  This series introduces a "VF token", implemented
> > as a UUID, as a shared secret between PF and VF drivers.  The token
> > needs to be set by the PF driver and used as part of the device
> > matching by the VF driver.  Provisions in the code also account for
> > restarting the PF driver with active VF drivers, requiring the PF to
> > use the current token to re-gain access to the PF.  
> 
> Thanks Alex for the series. DPDK realizes this use-case through, an out of
> tree igb_uio module, for non VFIO devices. Supporting this use case, with
> VFIO, will be a great enhancement for DPDK as we are planning to
> get rid of out of tree modules any focus only on userspace aspects.
> 
> From the DPDK perspective, we have following use-cases
> 
> 1) VF representer or OVS/vSwitch  use cases where
> DPDK PF acts as an HW switch to steer traffic to VF
> using the rte_flow library backed by HW CAMs.
> 
> 2) Unlike, other PCI class of devices, Network class of PCIe devices
> would have additional
> capability on the PF devices such as promiscuous mode support etc
> leverage that in DPDK
> PF and VF use cases.
> 
> That would boil down to the use of the following topology.
> a)  PF bound to DPDK/VFIO  and  VF bound to Linux
> b)  PF bound to DPDK/VFIO  and  VF bound to DPDK/VFIO
> 
> Tested the use case (a) and it works this patch. Tested use case(b), it
> works with patch provided both PF and VF under the same application.
> 
> Regarding the use case where  PF bound to DPDK/VFIO and
> VF bound to DPDK/VFIO are _two different_ processes then sharing the UUID
> will be a little tricky thing in terms of usage. But if that is the
> purpose of bringing
> UUID to the equation then it fine.
> 
> Overall this series looks good to me.  We can test the next non-RFC
> series and give
> Tested-by by after testing with DPDK.

Thanks Jerin, that's great feedback.  For case b), it is rather the
intention of the shared VF token proposed here that it imposes some
small barrier in validating the collaboration between the PF and VF
drivers.  In a trusted environment, a common UUID might be exposed in a
shared file and the same token could be used by all PFs and VFs on the
system, or datacenter.  The goal is simply to make sure the
collaboration is explicit, I don't want to be fielding support issues
from users assigning PFs and VFs to unrelated VM instances or
unintentionally creating your scenario a) configuration.

With the positive response from you and Thomas, I'll post a non-RFC
version and barring any blockers maybe we can get this in for the v5.7
kernel.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC PATCH 0/7] vfio/pci: SR-IOV support
  2020-02-11 17:06   ` Alex Williamson
@ 2020-02-11 18:03     ` Jerin Jacob
  0 siblings, 0 replies; 25+ messages in thread
From: Jerin Jacob @ 2020-02-11 18:03 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, linux-pci, linux-kernel, dpdk-dev, mtosatti,
	Thomas Monjalon, Luca Boccassi, Richardson, Bruce, cohuck,
	Vamsi Attunuru

On Tue, Feb 11, 2020 at 10:36 PM Alex Williamson
<alex.williamson@redhat.com> wrote:
>
> On Tue, 11 Feb 2020 16:48:47 +0530
> Jerin Jacob <jerinjacobk@gmail.com> wrote:
>
> > On Wed, Feb 5, 2020 at 4:35 AM Alex Williamson
> > <alex.williamson@redhat.com> wrote:
> > >
> > > There seems to be an ongoing desire to use userspace, vfio-based
> > > drivers for both SR-IOV PF and VF devices.  The fundamental issue
> > > with this concept is that the VF is not fully independent of the PF
> > > driver.  Minimally the PF driver might be able to deny service to the
> > > VF, VF data paths might be dependent on the state of the PF device,
> > > or the PF my have some degree of ability to inspect or manipulate the
> > > VF data.  It therefore would seem irresponsible to unleash VFs onto
> > > the system, managed by a user owned PF.
> > >
> > > We address this in a few ways in this series.  First, we can use a bus
> > > notifier and the driver_override facility to make sure VFs are bound
> > > to the vfio-pci driver by default.  This should eliminate the chance
> > > that a VF is accidentally bound and used by host drivers.  We don't
> > > however remove the ability for a host admin to change this override.
> > >
> > > The next issue we need to address is how we let userspace drivers
> > > opt-in to this participation with the PF driver.  We do not want an
> > > admin to be able to unwittingly assign one of these VFs to a tenant
> > > that isn't working in collaboration with the PF driver.  We could use
> > > IOMMU grouping, but this seems to push too far towards tightly coupled
> > > PF and VF drivers.  This series introduces a "VF token", implemented
> > > as a UUID, as a shared secret between PF and VF drivers.  The token
> > > needs to be set by the PF driver and used as part of the device
> > > matching by the VF driver.  Provisions in the code also account for
> > > restarting the PF driver with active VF drivers, requiring the PF to
> > > use the current token to re-gain access to the PF.
> >
> > Thanks Alex for the series. DPDK realizes this use-case through, an out of
> > tree igb_uio module, for non VFIO devices. Supporting this use case, with
> > VFIO, will be a great enhancement for DPDK as we are planning to
> > get rid of out of tree modules any focus only on userspace aspects.
> >
> > From the DPDK perspective, we have following use-cases
> >
> > 1) VF representer or OVS/vSwitch  use cases where
> > DPDK PF acts as an HW switch to steer traffic to VF
> > using the rte_flow library backed by HW CAMs.
> >
> > 2) Unlike, other PCI class of devices, Network class of PCIe devices
> > would have additional
> > capability on the PF devices such as promiscuous mode support etc
> > leverage that in DPDK
> > PF and VF use cases.
> >
> > That would boil down to the use of the following topology.
> > a)  PF bound to DPDK/VFIO  and  VF bound to Linux
> > b)  PF bound to DPDK/VFIO  and  VF bound to DPDK/VFIO
> >
> > Tested the use case (a) and it works this patch. Tested use case(b), it
> > works with patch provided both PF and VF under the same application.
> >
> > Regarding the use case where  PF bound to DPDK/VFIO and
> > VF bound to DPDK/VFIO are _two different_ processes then sharing the UUID
> > will be a little tricky thing in terms of usage. But if that is the
> > purpose of bringing
> > UUID to the equation then it fine.
> >
> > Overall this series looks good to me.  We can test the next non-RFC
> > series and give
> > Tested-by by after testing with DPDK.
>
> Thanks Jerin, that's great feedback.  For case b), it is rather the
> intention of the shared VF token proposed here that it imposes some
> small barrier in validating the collaboration between the PF and VF
> drivers.  In a trusted environment, a common UUID might be exposed in a
> shared file and the same token could be used by all PFs and VFs on the
> system, or datacenter.  The goal is simply to make sure the
> collaboration is explicit, I don't want to be fielding support issues
> from users assigning PFs and VFs to unrelated VM instances or
> unintentionally creating your scenario a) configuration.

Yes. Makes sense from kernel PoV.

DPDK side, probably we will end in hardcoded UUID value.

The tricky part would DPDK PF and QEMU VF integration case.
I am not sure about that integration( a command-line option for UUID) or
something more sophisticated orchestration. Anyway, it is clear from
kernel side,
Something needs to be sorted with the QEMU community.

> With the positive response from you and Thomas, I'll post a non-RFC
> version and barring any blockers maybe we can get this in for the v5.7
> kernel.  Thanks,

Great.

>
> Alex
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, back to index

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-04 23:05 [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
2020-02-04 23:05 ` [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks Alex Williamson
2020-02-06 11:14   ` Cornelia Huck
2020-02-06 18:18     ` Alex Williamson
2020-02-07  9:33       ` Cornelia Huck
2020-02-04 23:05 ` [RFC PATCH 2/7] vfio/pci: Implement match ops Alex Williamson
2020-02-04 23:06 ` [RFC PATCH 3/7] vfio/pci: Introduce VF token Alex Williamson
2020-02-05  7:57   ` Liu, Yi L
2020-02-05 14:13     ` Alex Williamson
2020-02-04 23:06 ` [RFC PATCH 4/7] vfio: Introduce VFIO_DEVICE_FEATURE ioctl and first user Alex Williamson
2020-02-04 23:06 ` [RFC PATCH 5/7] vfio/pci: Add sriov_configure support Alex Williamson
2020-02-04 23:06 ` [RFC PATCH 6/7] vfio/pci: Remove dev_fmt definition Alex Williamson
2020-02-06 13:45   ` Cornelia Huck
2020-02-04 23:06 ` [RFC PATCH 7/7] vfio/pci: Cleanup .probe() exit paths Alex Williamson
2020-02-04 23:17 ` [RFC PATCH 0/7] vfio/pci: SR-IOV support Alex Williamson
2020-02-05  7:57   ` Liu, Yi L
2020-02-05 14:18     ` Alex Williamson
2020-02-05  7:01 ` Christoph Hellwig
2020-02-05 13:58   ` Alex Williamson
2020-02-05  7:57 ` Liu, Yi L
2020-02-05 14:10   ` Alex Williamson
2020-02-11 11:18 ` Jerin Jacob
2020-02-11 13:57   ` [dpdk-dev] " Thomas Monjalon
2020-02-11 17:06   ` Alex Williamson
2020-02-11 18:03     ` Jerin Jacob

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git