* [PATCH V3 0/3] VFIO SRIOV support
@ 2016-08-18  7:29 Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl, ilyal

Changes from V2:
        1. Enabling and disabling SR-IOV is now done
        through the sysfs interface, requiring
        admin privileges.
        2. Since admin privileges are now required
        to enable SR-IOV, most of the security
        measures introduced in RFC V2 were removed.
        Unfortunately we still need a mutex to prevent
        the VFIO user from changing the number of
        VFs while enable_sriov is in progress.

Changes from V1:
        1. The VFs are no longer assigned to the PF's IOMMU group.
        2. Add a pci_enable_sriov_with_override API to allow
        enabling SR-IOV without probing the VFs with the
        default driver.

Changes from RFC V2:
        1. pci_disable_sriov() is now called from a workqueue
        to avoid the situation where a process is blocked
        in pci_disable_sriov() waiting for itself to release the VFs.
        2. A mutex was added to synchronize calls to
        pci_enable_sriov() and pci_disable_sriov().

Changes from RFC V1:
        Due to the security concern raised in RFC V1, we add two patches
        to make sure the VFs belong to the same IOMMU group as
        the PF and are probed by VFIO.

Today the QEMU hypervisor allows assigning a physical device to a VM,
facilitating driver development. However, it does not support enabling
SR-IOV by the VM kernel driver. Our goal is to implement such support,
allowing developers working on SR-IOV physical function drivers to work
inside VMs as well.

This patch series implements the kernel side of our solution.  It extends
the VFIO driver to support the PCIe SR-IOV extended capability with the
following features:
1. The ability to probe SR-IOV BAR sizes.
2. The ability to enable and disable SR-IOV.
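
A minimal userspace sketch of the BAR-sizing arithmetic behind feature 1,
assuming the usual PCI convention that a BAR of a given power-of-two size
reads back with its low (size - 1) address bits forced to zero. The helper
names vf_bar_size_mask() and vf_bar_size_from_mask() are hypothetical and
are not part of this series; BAR flag bits are ignored here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: address mask exposed by a VF BAR of `size` bytes.
 * `size` is assumed to be a nonzero power of two.
 */
static uint64_t vf_bar_size_mask(uint64_t size)
{
	return ~(size - 1);
}

/*
 * Hypothetical helper: recover the BAR size from the address bits read
 * back after writing all ones.  The caller is assumed to have cleared the
 * low flag bits already.  A mask of 0 (unimplemented BAR) yields 0.
 */
static uint64_t vf_bar_size_from_mask(uint64_t mask)
{
	/* The lowest set address bit equals the BAR size. */
	return mask & (~mask + 1);
}
```

For a 64-bit BAR the low and high dwords of the mask would be applied to
two consecutive BAR registers.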

This patch series is going to be used by QEMU to expose SR-IOV capabilities
to the VM. We already have an early prototype based on Knut Omang's patches for
SR-IOV[1].

Limitations:
1. Per SR-IOV spec section 3.3.12, PFs are required to support
4-KB, 8-KB, 64-KB, 256-KB, 1-MB, and 4-MB page sizes.
Unfortunately, the kernel currently initializes the System Page Size register
once and assumes it doesn't change, so we cannot allow guests to change this
register at will. We currently map both Supported Page Sizes and
System Page Size as virtualized and read-only, in violation of the spec.
In practice this is not an issue since both the hypervisor and the
guest typically select the same System Page Size.
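
As an illustration of the page sizes involved, here is a small userspace
sketch. It assumes the encoding where bit n of the Supported Page Sizes and
System Page Size registers stands for a page size of 2^(n + 12) bytes. The
names sriov_pgsize_bytes(), sriov_pgsize_supported() and
SRIOV_REQUIRED_PGSIZES are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/* Assumed encoding: bit n set means a page size of 2^(n + 12) bytes. */
static uint64_t sriov_pgsize_bytes(unsigned int bit)
{
	return 1ULL << (bit + 12);
}

/* Bits for the 4-KB, 8-KB, 64-KB, 256-KB, 1-MB and 4-MB page sizes. */
#define SRIOV_REQUIRED_PGSIZES \
	((1u << 0) | (1u << 1) | (1u << 4) | (1u << 6) | (1u << 8) | (1u << 10))

/* Return 1 if `bytes` is one of the page sizes advertised in `sup_pgsize`. */
static int sriov_pgsize_supported(uint32_t sup_pgsize, uint64_t bytes)
{
	unsigned int bit;

	for (bit = 0; bit < 32; bit++) {
		if ((sup_pgsize & (1u << bit)) &&
		    sriov_pgsize_bytes(bit) == bytes)
			return 1;
	}
	return 0;
}
```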

[1] https://github.com/knuto/qemu/tree/sriov_patches_v6

Ilya Lesokhin (3):
  pci: Extend PCI IOV API
  vfio/pci: Allow control SR-IOV through sysfs interface
  vfio/pci: Add support for SR-IOV extended capablity

 drivers/pci/iov.c                   |  41 ++++++++--
 drivers/vfio/pci/vfio_pci.c         |  43 ++++++++--
 drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
 drivers/vfio/pci/vfio_pci_private.h |   2 +
 include/linux/pci.h                 |  13 +++-
 5 files changed, 219 insertions(+), 31 deletions(-)

-- 
1.8.3.1



* [PATCH V3 1/3] pci: Extend PCI IOV API
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
@ 2016-08-18  7:29 ` Ilya Lesokhin
  2016-08-18 22:09   ` Christoph Hellwig
  2016-08-22 18:51   ` kbuild test robot
  2016-08-18  7:29 ` [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface Ilya Lesokhin
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl, ilyal

1. Add pci_enable_sriov_with_override to allow
enabling sriov with a driver override
on the VFs.

2. Expose pci_iov_set_numvfs and pci_iov_resource_size
to make them available for other modules

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
---
 drivers/pci/iov.c   | 41 +++++++++++++++++++++++++++++++++--------
 include/linux/pci.h | 13 ++++++++++++-
 2 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 2194b44..98f6f10 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -41,7 +41,7 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
  *
  * Update iov->offset and iov->stride when NumVFs is written.
  */
-static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
+void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
 {
 	struct pci_sriov *iov = dev->sriov;
 
@@ -49,6 +49,7 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn)
 	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_OFFSET, &iov->offset);
 	pci_read_config_word(dev, iov->pos + PCI_SRIOV_VF_STRIDE, &iov->stride);
 }
+EXPORT_SYMBOL(pci_iov_set_numvfs);
 
 /*
  * The PF consumes one bus number.  NumVFs, First VF Offset, and VF Stride
@@ -112,8 +113,10 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 
 	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
 }
+EXPORT_SYMBOL(pci_iov_resource_size);
 
-int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
+int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset,
+		       char *driver_override)
 {
 	int i;
 	int rc = -ENOMEM;
@@ -154,14 +157,20 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
 		rc = request_resource(res, &virtfn->resource[i]);
 		BUG_ON(rc);
 	}
-
 	if (reset)
 		__pci_reset_function(virtfn);
 
 	pci_device_add(virtfn, virtfn->bus);
 	mutex_unlock(&iov->dev->sriov->lock);
 
+	if (driver_override) {
+		virtfn->driver_override = kstrdup(driver_override, GFP_KERNEL);
+		if (!virtfn->driver_override)
+			goto failed1;
+	}
+
 	pci_bus_add_device(virtfn);
+
 	sprintf(buf, "virtfn%u", id);
 	rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf);
 	if (rc)
@@ -235,7 +244,8 @@ int __weak pcibios_sriov_disable(struct pci_dev *pdev)
 	return 0;
 }
 
-static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
+static int sriov_enable(struct pci_dev *dev, int nr_virtfn,
+			char *driver_override)
 {
 	int rc;
 	int i;
@@ -321,7 +331,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 	}
 
 	for (i = 0; i < initial; i++) {
-		rc = pci_iov_add_virtfn(dev, i, 0);
+		rc = pci_iov_add_virtfn(dev, i, 0, driver_override);
 		if (rc)
 			goto failed;
 	}
@@ -622,20 +632,35 @@ int pci_iov_bus_range(struct pci_bus *bus)
 }
 
 /**
- * pci_enable_sriov - enable the SR-IOV capability
+ * pci_enable_sriov_with_override - enable the SR-IOV capability
  * @dev: the PCI device
  * @nr_virtfn: number of virtual functions to enable
+ * @driver_override: driver override for VFs
  *
  * Returns 0 on success, or negative on failure.
  */
-int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+int pci_enable_sriov_with_override(struct pci_dev *dev, int nr_virtfn,
+				   char *driver_override)
 {
 	might_sleep();
 
 	if (!dev->is_physfn)
 		return -ENOSYS;
 
-	return sriov_enable(dev, nr_virtfn);
+	return sriov_enable(dev, nr_virtfn, driver_override);
+}
+EXPORT_SYMBOL_GPL(pci_enable_sriov_with_override);
+
+/**
+ * pci_enable_sriov - enable the SR-IOV capability
+ * @dev: the PCI device
+ * @nr_virtfn: number of virtual functions to enable
+ *
+ * Returns 0 on success, or negative on failure.
+ */
+int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
+{
+	return pci_enable_sriov_with_override(dev, nr_virtfn, NULL);
 }
 EXPORT_SYMBOL_GPL(pci_enable_sriov);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b67e4df..54b3059 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1739,15 +1739,20 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
 int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
 int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);
 
+int pci_enable_sriov_with_override(struct pci_dev *dev, int nr_virtfn,
+				   char *driver_override);
 int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
 void pci_disable_sriov(struct pci_dev *dev);
-int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset);
+int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset,
+		       char *driver_override);
 void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset);
 int pci_num_vf(struct pci_dev *dev);
 int pci_vfs_assigned(struct pci_dev *dev);
 int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
 int pci_sriov_get_totalvfs(struct pci_dev *dev);
 resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno);
+
+void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn);
 #else
 static inline int pci_iov_virtfn_bus(struct pci_dev *dev, int id)
 {
@@ -1757,6 +1762,11 @@ static inline int pci_iov_virtfn_devfn(struct pci_dev *dev, int id)
 {
 	return -ENOSYS;
 }
+
+static inline int pci_enable_sriov_with_override(struct pci_dev *dev,
+						 int nr_virtfn,
+						 char *driver_override)
+{ return -ENODEV; }
 static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
 { return -ENODEV; }
 static inline int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset)
@@ -1775,6 +1785,7 @@ static inline int pci_sriov_get_totalvfs(struct pci_dev *dev)
 { return 0; }
 static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 { return 0; }
+static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn) { }
 #endif
 
 #if defined(CONFIG_HOTPLUG_PCI) || defined(CONFIG_HOTPLUG_PCI_MODULE)
-- 
1.8.3.1



* [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
@ 2016-08-18  7:29 ` Ilya Lesokhin
  2016-08-18 22:11   ` Christoph Hellwig
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
  2017-03-08  7:29 ` [PATCH V3 0/3] VFIO SRIOV support Jike Song
  3 siblings, 1 reply; 14+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl, ilyal

This patch allows enabling and disabling SR-IOV for
devices probed by vfio-pci.
Since the devices might be assigned to untrusted entities,
we use driver_override to make sure the VFs are also
probed by the vfio-pci driver.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
---
 drivers/vfio/pci/vfio_pci.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index d624a52..6a203a7 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -297,6 +297,7 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev)
 	struct vfio_pci_dummy_resource *dummy_res, *tmp;
 	int i, bar;
 
+	pci_disable_sriov(pdev);
 	/* Stop the device from further DMA */
 	pci_clear_master(pdev);
 
@@ -1314,12 +1315,25 @@ static const struct pci_error_handlers vfio_err_handlers = {
 	.error_detected = vfio_pci_aer_err_detected,
 };
 
+static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
+{
+	if (!num_vfs) {
+		pci_disable_sriov(pdev);
+		return 0;
+	}
+
+	return pci_enable_sriov_with_override(pdev,
+					      num_vfs,
+					     "vfio-pci");
+}
+
 static struct pci_driver vfio_pci_driver = {
-	.name		= "vfio-pci",
-	.id_table	= NULL, /* only dynamic ids */
-	.probe		= vfio_pci_probe,
-	.remove		= vfio_pci_remove,
-	.err_handler	= &vfio_err_handlers,
+	.name		 = "vfio-pci",
+	.id_table	 = NULL, /* only dynamic ids */
+	.probe		 = vfio_pci_probe,
+	.remove		 = vfio_pci_remove,
+	.err_handler	 = &vfio_err_handlers,
+	.sriov_configure = vfio_pci_sriov_configure,
 };
 
 struct vfio_devices {
-- 
1.8.3.1



* [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
  2016-08-18  7:29 ` [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface Ilya Lesokhin
@ 2016-08-18  7:29 ` Ilya Lesokhin
  2016-08-18 20:32   ` Alex Williamson
  2016-08-22  6:48   ` kbuild test robot
  2017-03-08  7:29 ` [PATCH V3 0/3] VFIO SRIOV support Jike Song
  3 siblings, 2 replies; 14+ messages in thread
From: Ilya Lesokhin @ 2016-08-18  7:29 UTC (permalink / raw)
  To: kvm, linux-pci
  Cc: bhelgaas, alex.williamson, noaos, haggaie, ogerlitz, liranl, ilyal

Add support for PCIE SR-IOV extended capability.
The capability gives the VFIO user the following abilities:
1. Detect that the device has an SR-IOV capability
2. Change sriov_numvfs and read the corresponding changes in
sriov_vf_offset and sriov_vf_stride
3. Probe vf bar sizes

Enabling and disabling SR-IOV is still done through the sysfs interface.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
---
 drivers/vfio/pci/vfio_pci.c         |  23 +++++-
 drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
 drivers/vfio/pci/vfio_pci_private.h |   2 +
 3 files changed, 157 insertions(+), 19 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 6a203a7..807caf2c 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1229,6 +1229,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
 	mutex_init(&vdev->igate);
 	spin_lock_init(&vdev->irqlock);
+	mutex_init(&vdev->sriov_mutex);
 
 	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
 	if (ret) {
@@ -1317,14 +1318,32 @@ static const struct pci_error_handlers vfio_err_handlers = {
 
 static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
 {
+	struct vfio_pci_device *vdev;
+	struct vfio_device *device;
+	int ret = 0;
+
+	device = vfio_device_get_from_dev(&pdev->dev);
+	if (!device)
+		return -EINVAL;
+
+	vdev = vfio_device_data(device);
+	if (!vdev) {
+		vfio_device_put(device);
+		return -EINVAL;
+	}
+
+	mutex_lock(&vdev->sriov_mutex);
 	if (!num_vfs) {
 		pci_disable_sriov(pdev);
-		return 0;
+		goto out;
 	}
 
-	return pci_enable_sriov_with_override(pdev,
+	ret =  pci_enable_sriov_with_override(pdev,
 					      num_vfs,
 					     "vfio-pci");
+out:
+	mutex_unlock(&vdev->sriov_mutex);
+	return ret;
 }
 
 static struct pci_driver vfio_pci_driver = {
diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 688691d..6c813d3 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -448,6 +448,35 @@ static __le32 vfio_generate_bar_flags(struct pci_dev *pdev, int bar)
 	return cpu_to_le32(val);
 }
 
+static void vfio_sriov_bar_fixup(struct vfio_pci_device *vdev,
+				 int sriov_cap_start)
+{
+	struct pci_dev *pdev = vdev->pdev;
+	int i;
+	__le32 *bar;
+	u64 mask;
+
+	bar = (__le32 *)&vdev->vconfig[sriov_cap_start + PCI_SRIOV_BAR];
+
+	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++, bar++) {
+		if (!pci_resource_start(pdev, i)) {
+			*bar = 0; /* Unmapped by host = unimplemented to user */
+			continue;
+		}
+
+		mask = ~(pci_iov_resource_size(pdev, i) - 1);
+
+		*bar &= cpu_to_le32((u32)mask);
+		*bar |= vfio_generate_bar_flags(pdev, i);
+
+		if (*bar & cpu_to_le32(PCI_BASE_ADDRESS_MEM_TYPE_64)) {
+			bar++;
+			*bar &= cpu_to_le32((u32)(mask >> 32));
+			i++;
+		}
+	}
+}
+
 /*
  * Pretend we're hardware and tweak the values of the *virtual* PCI BARs
  * to reflect the hardware capabilities.  This implements BAR sizing.
@@ -901,6 +930,106 @@ static int __init init_pci_ext_cap_pwr_perm(struct perm_bits *perm)
 	return 0;
 }
 
+static int __init init_pci_ext_cap_sriov_perm(struct perm_bits *perm)
+{
+	int i;
+
+	if (alloc_perm_bits(perm, pci_ext_cap_length[PCI_EXT_CAP_ID_SRIOV]))
+		return -ENOMEM;
+
+	/*
+	 * Virtualize the first dword of all express capabilities
+	 * because it includes the next pointer.  This lets us later
+	 * remove capabilities from the chain if we need to.
+	 */
+	p_setd(perm, 0, ALL_VIRT, NO_WRITE);
+
+	/* VF Enable - Virtualized and writable
+	 * Memory Space Enable - Non-virtualized and writable
+	 */
+	p_setw(perm, PCI_SRIOV_CTRL, NO_VIRT,
+	       PCI_SRIOV_CTRL_MSE);
+
+	p_setw(perm, PCI_SRIOV_NUM_VF, (u16)NO_VIRT, (u16)ALL_WRITE);
+	p_setw(perm, PCI_SRIOV_SUP_PGSIZE, (u16)ALL_VIRT, NO_WRITE);
+
+	/* We cannot let user space application change the page size
+	 * so we mark it as read only and trust the user application
+	 * (e.g. qemu) to virtualize this correctly for the guest
+	 */
+	p_setw(perm, PCI_SRIOV_SYS_PGSIZE, (u16)ALL_VIRT, NO_WRITE);
+
+	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
+		p_setd(perm, PCI_SRIOV_BAR + 4 * i, ALL_VIRT, ALL_WRITE);
+
+	return 0;
+}
+
+static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
+{
+	u8 cap;
+	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
+						 PCI_STD_HEADER_SIZEOF;
+	cap = vdev->pci_config_map[pos];
+
+	if (cap == PCI_CAP_ID_BASIC)
+		return 0;
+
+	/* XXX Can we have to abutting capabilities of the same type? */
+	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
+		pos--;
+
+	return pos;
+}
+
+static int vfio_sriov_cap_config_read(struct vfio_pci_device *vdev, int pos,
+				      int count, struct perm_bits *perm,
+				      int offset, __le32 *val)
+{
+	int cap_start = vfio_find_cap_start(vdev, pos);
+
+	vfio_sriov_bar_fixup(vdev, cap_start);
+	return vfio_default_config_read(vdev, pos, count, perm, offset, val);
+}
+
+static int vfio_sriov_cap_config_write(struct vfio_pci_device *vdev, int pos,
+				       int count, struct perm_bits *perm,
+				       int offset, __le32 val)
+{
+	switch (offset) {
+	case  PCI_SRIOV_NUM_VF:
+	/* Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset
+	 * and VF Stride may change when NumVFs changes.
+	 *
+	 * Therefore we should pass valid writes to the hardware.
+	 *
+	 * Per SR-IOV spec sec 3.3.7
+	 * The results are undefined if NumVFs is set to a value greater
+	 * than TotalVFs.
+	 * NumVFs may only be written while VF Enable is Clear.
+	 * If NumVFs is written when VF Enable is Set, the results
+	 * are undefined.
+
+	 * Avoid passing such writes to the Hardware just in case.
+	 */
+		mutex_lock(&vdev->sriov_mutex);
+		if (pci_num_vf(vdev->pdev) ||
+		    val > pci_sriov_get_totalvfs(vdev->pdev)) {
+			mutex_unlock(&vdev->sriov_mutex);
+			return count;
+		}
+
+		pci_iov_set_numvfs(vdev->pdev, val);
+		mutex_unlock(&vdev->sriov_mutex);
+		break;
+	default:
+		break;
+	}
+
+	return vfio_default_config_write(vdev, pos, count, perm,
+					 offset, val);
+}
+
 /*
  * Initialize the shared permission tables
  */
@@ -916,6 +1045,7 @@ void vfio_pci_uninit_perm_bits(void)
 
 	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_ERR]);
 	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
+	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
 }
 
 int __init vfio_pci_init_perm_bits(void)
@@ -938,29 +1068,16 @@ int __init vfio_pci_init_perm_bits(void)
 	ret |= init_pci_ext_cap_pwr_perm(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
 	ecap_perms[PCI_EXT_CAP_ID_VNDR].writefn = vfio_raw_config_write;
 
+	ret |= init_pci_ext_cap_sriov_perm(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
+	ecap_perms[PCI_EXT_CAP_ID_SRIOV].readfn = vfio_sriov_cap_config_read;
+	ecap_perms[PCI_EXT_CAP_ID_SRIOV].writefn = vfio_sriov_cap_config_write;
+
 	if (ret)
 		vfio_pci_uninit_perm_bits();
 
 	return ret;
 }
 
-static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
-{
-	u8 cap;
-	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
-						 PCI_STD_HEADER_SIZEOF;
-	cap = vdev->pci_config_map[pos];
-
-	if (cap == PCI_CAP_ID_BASIC)
-		return 0;
-
-	/* XXX Can we have to abutting capabilities of the same type? */
-	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
-		pos--;
-
-	return pos;
-}
-
 static int vfio_msi_config_read(struct vfio_pci_device *vdev, int pos,
 				int count, struct perm_bits *perm,
 				int offset, __le32 *val)
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 2128de8..02732eb 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -96,6 +96,8 @@ struct vfio_pci_device {
 	struct eventfd_ctx	*err_trigger;
 	struct eventfd_ctx	*req_trigger;
 	struct list_head	dummy_resources_list;
+	struct mutex		sriov_mutex;
+
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
-- 
1.8.3.1



* Re: [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
@ 2016-08-18 20:32   ` Alex Williamson
  2016-08-22  6:48   ` kbuild test robot
  1 sibling, 0 replies; 14+ messages in thread
From: Alex Williamson @ 2016-08-18 20:32 UTC (permalink / raw)
  To: Ilya Lesokhin; +Cc: kvm, linux-pci, bhelgaas, noaos, haggaie, ogerlitz, liranl

On Thu, 18 Aug 2016 10:29:17 +0300
Ilya Lesokhin <ilyal@mellanox.com> wrote:

> Add support for PCIE SR-IOV extended capability.
> The capability gives the VFIO user the following abilities:
> 1. Detect that the device has an SR-IOV capability
> 2. Change sriov_numvfs and read the corresponding changes in
> sriov_vf_offset and sriov_vf_stride
> 3. Probe vf bar sizes
> 
> Enabling and disabling SR-IOV is still done through the sysfs interface.
> 
> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
> Signed-off-by: Noa Osherovich <noaos@mellanox.com>
> Signed-off-by: Haggai Eran <haggaie@mellanox.com>
> ---
>  drivers/vfio/pci/vfio_pci.c         |  23 +++++-
>  drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
>  drivers/vfio/pci/vfio_pci_private.h |   2 +
>  3 files changed, 157 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 6a203a7..807caf2c 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1229,6 +1229,7 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	mutex_init(&vdev->igate);
>  	spin_lock_init(&vdev->irqlock);
> +	mutex_init(&vdev->sriov_mutex);
>  
>  	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
>  	if (ret) {
> @@ -1317,14 +1318,32 @@ static const struct pci_error_handlers vfio_err_handlers = {
>  
>  static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
>  {
> +	struct vfio_pci_device *vdev;
> +	struct vfio_device *device;
> +	int ret = 0;
> +
> +	device = vfio_device_get_from_dev(&pdev->dev);
> +	if (!device)
> +		return -EINVAL;
> +
> +	vdev = vfio_device_data(device);
> +	if (!vdev) {
> +		vfio_device_put(device);
> +		return -EINVAL;
> +	}
> +
> +	mutex_lock(&vdev->sriov_mutex);
>  	if (!num_vfs) {
>  		pci_disable_sriov(pdev);
> -		return 0;
> +		goto out;
>  	}
>  
> -	return pci_enable_sriov_with_override(pdev,
> +	ret =  pci_enable_sriov_with_override(pdev,
>  					      num_vfs,
>  					     "vfio-pci");
> +out:
> +	mutex_unlock(&vdev->sriov_mutex);

vfio_device_put(device);

> +	return ret;
>  }
>  
>  static struct pci_driver vfio_pci_driver = {
> diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> index 688691d..6c813d3 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -448,6 +448,35 @@ static __le32 vfio_generate_bar_flags(struct pci_dev *pdev, int bar)
>  	return cpu_to_le32(val);
>  }
>  
> +static void vfio_sriov_bar_fixup(struct vfio_pci_device *vdev,
> +				 int sriov_cap_start)
> +{
> +	struct pci_dev *pdev = vdev->pdev;
> +	int i;
> +	__le32 *bar;
> +	u64 mask;
> +
> +	bar = (__le32 *)&vdev->vconfig[sriov_cap_start + PCI_SRIOV_BAR];
> +
> +	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++, bar++) {

These are only defined when CONFIG_PCI_IOV is set.

> +		if (!pci_resource_start(pdev, i)) {
> +			*bar = 0; /* Unmapped by host = unimplemented to user */
> +			continue;
> +		}
> +
> +		mask = ~(pci_iov_resource_size(pdev, i) - 1);
> +
> +		*bar &= cpu_to_le32((u32)mask);
> +		*bar |= vfio_generate_bar_flags(pdev, i);
> +
> +		if (*bar & cpu_to_le32(PCI_BASE_ADDRESS_MEM_TYPE_64)) {
> +			bar++;
> +			*bar &= cpu_to_le32((u32)(mask >> 32));
> +			i++;
> +		}
> +	}
> +}
> +
>  /*
>   * Pretend we're hardware and tweak the values of the *virtual* PCI BARs
>   * to reflect the hardware capabilities.  This implements BAR sizing.
> @@ -901,6 +930,106 @@ static int __init init_pci_ext_cap_pwr_perm(struct perm_bits *perm)
>  	return 0;
>  }
>  
> +static int __init init_pci_ext_cap_sriov_perm(struct perm_bits *perm)
> +{
> +	int i;
> +
> +	if (alloc_perm_bits(perm, pci_ext_cap_length[PCI_EXT_CAP_ID_SRIOV]))
> +		return -ENOMEM;
> +
> +	/*
> +	 * Virtualize the first dword of all express capabilities
> +	 * because it includes the next pointer.  This lets us later
> +	 * remove capabilities from the chain if we need to.
> +	 */
> +	p_setd(perm, 0, ALL_VIRT, NO_WRITE);
> +
> +	/* VF Enable - Virtualized and writable

nit, comment style - multi-line comments as above.

Comment doesn't seem to match the code, VFE is neither virtualized nor
writable.

> +	 * Memory Space Enable - Non-virtualized and writable
> +	 */
> +	p_setw(perm, PCI_SRIOV_CTRL, NO_VIRT,
> +	       PCI_SRIOV_CTRL_MSE);
> +
> +	p_setw(perm, PCI_SRIOV_NUM_VF, (u16)NO_VIRT, (u16)ALL_WRITE);
> +	p_setw(perm, PCI_SRIOV_SUP_PGSIZE, (u16)ALL_VIRT, NO_WRITE);

Is this necessary?  What's the purpose in virtualizing it?  Per the
spec, it's read-only in hardware.

> +
> +	/* We cannot let user space application change the page size
> +	 * so we mark it as read only and trust the user application
> +	 * (e.g. qemu) to virtualize this correctly for the guest
> +	 */
> +	p_setw(perm, PCI_SRIOV_SYS_PGSIZE, (u16)ALL_VIRT, NO_WRITE);

But why do we virtualize it?

> +
> +	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
> +		p_setd(perm, PCI_SRIOV_BAR + 4 * i, ALL_VIRT, ALL_WRITE);
> +
> +	return 0;
> +}
> +
> +static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
> +{
> +	u8 cap;
> +	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
> +						 PCI_STD_HEADER_SIZEOF;
> +	cap = vdev->pci_config_map[pos];
> +
> +	if (cap == PCI_CAP_ID_BASIC)
> +		return 0;
> +
> +	/* XXX Can we have to abutting capabilities of the same type? */
> +	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
> +		pos--;
> +
> +	return pos;
> +}
> +
> +static int vfio_sriov_cap_config_read(struct vfio_pci_device *vdev, int pos,
> +				      int count, struct perm_bits *perm,
> +				      int offset, __le32 *val)
> +{
> +	int cap_start = vfio_find_cap_start(vdev, pos);
> +
> +	vfio_sriov_bar_fixup(vdev, cap_start);

Should we make an is_iov_bar() function for at least a little bit of
filtering?

> +	return vfio_default_config_read(vdev, pos, count, perm, offset, val);
> +}
> +
> +static int vfio_sriov_cap_config_write(struct vfio_pci_device *vdev, int pos,
> +				       int count, struct perm_bits *perm,
> +				       int offset, __le32 val)
> +{
> +	switch (offset) {
> +	case  PCI_SRIOV_NUM_VF:
> +	/* Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset
> +	 * and VF Stride may change when NumVFs changes.

Virtualizing this really seems more complicated than what is set forth here.
For instance offset and stride are also affected by ARI, so if the ARI
settings between host and VM don't match, a user like QEMU is going to
need to virtualize offset, stride, and maybe even TotalVFs to
something appropriate for the VM.  There's also the question of why
the physical offset/stride matter at all to a VM when these devices
aren't being initialized in the VM address space and the user/hypervisor
is free to manage them however they see fit.  So I think the only
purpose of virtualizing any of this is so that a VM can potentially
match the bare hardware in the case where the VM and physical system are
sufficiently similar.  Is that correct?

nit, comment style, blank line within the comment block below.

> +	 *
> +	 * Therefore we should pass valid writes to the hardware.
> +	 *
> +	 * Per SR-IOV spec sec 3.3.7
> +	 * The results are undefined if NumVFs is set to a value greater
> +	 * than TotalVFs.
> +	 * NumVFs may only be written while VF Enable is Clear.
> +	 * If NumVFs is written when VF Enable is Set, the results
> +	 * are undefined.
> +
> +	 * Avoid passing such writes to the Hardware just in case.
> +	 */
> +		mutex_lock(&vdev->sriov_mutex);
> +		if (pci_num_vf(vdev->pdev) ||
> +		    val > pci_sriov_get_totalvfs(vdev->pdev)) {
> +			mutex_unlock(&vdev->sriov_mutex);
> +			return count;
> +		}
> +
> +		pci_iov_set_numvfs(vdev->pdev, val);
> +		mutex_unlock(&vdev->sriov_mutex);
> +		break;
> +	default:
> +		break;
> +	}

Seems unnecessary to have a switch statement for a single case, can't
we just wrap this in an "if (offset == PCI_SRIOV_NUM_VF)" block?
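
Roughly, the "if" form could look like the userspace model below. This is
only a sketch of the control flow: struct sriov_model and
model_sriov_write() are hypothetical stand-ins for the device state and the
write handler, the sriov_mutex locking is left out, and the value is treated
as a plain 16-bit integer rather than an __le32.

```c
#include <stdint.h>

#define PCI_SRIOV_NUM_VF	0x10	/* NumVFs offset in the SR-IOV capability */

/* Hypothetical stand-in for the state the real handler would query. */
struct sriov_model {
	int num_vf;	/* models pci_num_vf(): nonzero while VFs are enabled */
	int total_vfs;	/* models pci_sriov_get_totalvfs() */
	int hw_numvfs;	/* last NumVFs value forwarded to the "hardware" */
};

/* Return 1 if the write was passed on, 0 if it was silently dropped. */
static int model_sriov_write(struct sriov_model *m, int offset, uint16_t val)
{
	if (offset == PCI_SRIOV_NUM_VF) {
		/* Drop writes whose results the spec leaves undefined. */
		if (m->num_vf || val > m->total_vfs)
			return 0;

		m->hw_numvfs = val;
	}

	/* Any other offset falls through to the default write path. */
	return 1;
}
```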

> +
> +	return vfio_default_config_write(vdev, pos, count, perm,
> +					 offset, val);
> +}
> +
>  /*
>   * Initialize the shared permission tables
>   */
> @@ -916,6 +1045,7 @@ void vfio_pci_uninit_perm_bits(void)
>  
>  	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_ERR]);
>  	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
> +	free_perm_bits(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
>  }
>  
>  int __init vfio_pci_init_perm_bits(void)
> @@ -938,29 +1068,16 @@ int __init vfio_pci_init_perm_bits(void)
>  	ret |= init_pci_ext_cap_pwr_perm(&ecap_perms[PCI_EXT_CAP_ID_PWR]);
>  	ecap_perms[PCI_EXT_CAP_ID_VNDR].writefn = vfio_raw_config_write;
>  
> +	ret |= init_pci_ext_cap_sriov_perm(&ecap_perms[PCI_EXT_CAP_ID_SRIOV]);
> +	ecap_perms[PCI_EXT_CAP_ID_SRIOV].readfn = vfio_sriov_cap_config_read;
> +	ecap_perms[PCI_EXT_CAP_ID_SRIOV].writefn = vfio_sriov_cap_config_write;
> +
>  	if (ret)
>  		vfio_pci_uninit_perm_bits();
>  
>  	return ret;
>  }
>  
> -static int vfio_find_cap_start(struct vfio_pci_device *vdev, int pos)
> -{
> -	u8 cap;
> -	int base = (pos >= PCI_CFG_SPACE_SIZE) ? PCI_CFG_SPACE_SIZE :
> -						 PCI_STD_HEADER_SIZEOF;
> -	cap = vdev->pci_config_map[pos];
> -
> -	if (cap == PCI_CAP_ID_BASIC)
> -		return 0;
> -
> -	/* XXX Can we have to abutting capabilities of the same type? */
> -	while (pos - 1 >= base && vdev->pci_config_map[pos - 1] == cap)
> -		pos--;
> -
> -	return pos;
> -}
> -
>  static int vfio_msi_config_read(struct vfio_pci_device *vdev, int pos,
>  				int count, struct perm_bits *perm,
>  				int offset, __le32 *val)
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index 2128de8..02732eb 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -96,6 +96,8 @@ struct vfio_pci_device {
>  	struct eventfd_ctx	*err_trigger;
>  	struct eventfd_ctx	*req_trigger;
>  	struct list_head	dummy_resources_list;
> +	struct mutex		sriov_mutex;
> +

whitespace

>  };
>  
>  #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)



* Re: [PATCH V3 1/3] pci: Extend PCI IOV API
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
@ 2016-08-18 22:09   ` Christoph Hellwig
  2016-08-22 18:51   ` kbuild test robot
  1 sibling, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2016-08-18 22:09 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, noaos, haggaie,
	ogerlitz, liranl

On Thu, Aug 18, 2016 at 10:29:15AM +0300, Ilya Lesokhin wrote:
> 1. Add pci_enable_sriov_with_override to allow
> enabling sriov with a driver override
> on the VFs.
> 
> 2. Expose pci_iov_set_numvfs and pci_iov_resource_size
> to make them available for other modules

Please use EXPORT_SYMBOL_GPL for such low-level exports.
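
For readers unfamiliar with the convention, a minimal sketch of a GPL-only
export follows. The helper name example_iov_helper is hypothetical and does
not appear in the patch, and the fallback macro exists only so the snippet
also compiles outside a kernel tree. In a real kernel build,
EXPORT_SYMBOL_GPL comes from <linux/export.h> and restricts the symbol to
modules that declare a GPL-compatible license.

```c
#ifdef __KERNEL__
#include <linux/export.h>
#else
/* Userspace fallback (assumption for illustration only): expands to a
 * harmless declaration so the file still compiles. */
#define EXPORT_SYMBOL_GPL(sym) extern int export_symbol_gpl_stub_##sym
#endif

/* Hypothetical low-level helper standing in for the functions the patch
 * makes available to other modules. */
int example_iov_helper(int nr_virtfn)
{
	/* Reject negative VF counts; otherwise echo the count back. */
	return nr_virtfn < 0 ? -1 : nr_virtfn;
}
EXPORT_SYMBOL_GPL(example_iov_helper);
```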

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface
  2016-08-18  7:29 ` [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface Ilya Lesokhin
@ 2016-08-18 22:11   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2016-08-18 22:11 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, noaos, haggaie,
	ogerlitz, liranl

On Thu, Aug 18, 2016 at 10:29:16AM +0300, Ilya Lesokhin wrote:
> +static int vfio_pci_sriov_configure(struct pci_dev *pdev, int num_vfs)
> +{
> +	if (!num_vfs) {
> +		pci_disable_sriov(pdev);
> +		return 0;
> +	}
> +
> +	return pci_enable_sriov_with_override(pdev,
> +					      num_vfs,
> +					     "vfio-pci");
> +}

I have to admit that I don't really like this API.  Would it be a
major burden for your use case to just have a flag that disables
automatic driver attachment for VFs and requires manual binding
instead?
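
One way to picture the manual-binding alternative from userspace is sketched
below. It relies on the existing PCI sysfs attributes driver_override and
drivers_probe; the helper names are made up for illustration, and the flag
that would suppress automatic probing is not shown because no such flag
exists in this series.

```c
#include <stdio.h>
#include <stddef.h>

/* Build "/sys/bus/pci/devices/<bdf>/<attr>"; returns -1 if buf is too small. */
int pci_attr_path(char *buf, size_t len, const char *bdf, const char *attr)
{
	int n = snprintf(buf, len, "/sys/bus/pci/devices/%s/%s", bdf, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a string to a sysfs attribute; returns 0 on success, -1 on failure. */
int sysfs_write(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Set the driver override for one VF, then ask the PCI core to probe it. */
int bind_vf_manually(const char *vf_bdf, const char *driver)
{
	char path[256];

	if (pci_attr_path(path, sizeof(path), vf_bdf, "driver_override"))
		return -1;
	if (sysfs_write(path, driver))
		return -1;
	return sysfs_write("/sys/bus/pci/drivers_probe", vf_bdf);
}
```

For example, bind_vf_manually("0000:03:00.1", "vfio-pci") would bind a VF at
that (made-up) address to vfio-pci; it needs root and a real device.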

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
  2016-08-18 20:32   ` Alex Williamson
@ 2016-08-22  6:48   ` kbuild test robot
  1 sibling, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2016-08-22  6:48 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, kvm, linux-pci, bhelgaas, alex.williamson, noaos,
	haggaie, ogerlitz, liranl, ilyal

[-- Attachment #1: Type: text/plain, Size: 1883 bytes --]

Hi Ilya,

[auto build test ERROR on vfio/next]
[also build test ERROR on v4.8-rc3 next-20160819]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/VFIO-SRIOV-support/20160818-153802
base:   https://github.com/awilliam/linux-vfio.git next
config: x86_64-randconfig-s1-08191332 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/vfio/pci/vfio_pci_config.c: In function 'vfio_sriov_bar_fixup':
>> drivers/vfio/pci/vfio_pci_config.c:461: error: 'PCI_IOV_RESOURCES' undeclared (first use in this function)
   drivers/vfio/pci/vfio_pci_config.c:461: error: (Each undeclared identifier is reported only once
   drivers/vfio/pci/vfio_pci_config.c:461: error: for each function it appears in.)
>> drivers/vfio/pci/vfio_pci_config.c:461: error: 'PCI_IOV_RESOURCE_END' undeclared (first use in this function)

vim +/PCI_IOV_RESOURCES +461 drivers/vfio/pci/vfio_pci_config.c

   455		int i;
   456		__le32 *bar;
   457		u64 mask;
   458	
   459		bar = (__le32 *)&vdev->vconfig[sriov_cap_start + PCI_SRIOV_BAR];
   460	
 > 461		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++, bar++) {
   462			if (!pci_resource_start(pdev, i)) {
   463				*bar = 0; /* Unmapped by host = unimplemented to user */
   464				continue;
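
A likely cause, stated as an assumption since the attached .config is not
shown here: PCI_IOV_RESOURCES and PCI_IOV_RESOURCE_END are only defined in
include/linux/pci.h when CONFIG_PCI_IOV is set, and a randconfig build can
leave it unset. The sketch below mirrors that arrangement with stand-in
definitions (the DEMO_* names and demo_sriov_bar_fixup are hypothetical) and
guards the loop so that it compiles either way.

```c
#include <stdint.h>

#define PCI_SRIOV_NUM_BARS 6

/* Stand-in for the resource enum; the IOV entries exist only with
 * CONFIG_PCI_IOV, which is what makes the unguarded loop fail to build. */
enum {
	DEMO_STD_RESOURCE_END = 5,
	DEMO_ROM_RESOURCE,
#ifdef CONFIG_PCI_IOV
	PCI_IOV_RESOURCES,
	PCI_IOV_RESOURCE_END = PCI_IOV_RESOURCES + PCI_SRIOV_NUM_BARS - 1,
#endif
	DEMO_NUM_RESOURCES
};

/*
 * Zero the virtual VF BAR for every IOV resource the host left unmapped.
 * Returns the number of BARs cleared; without CONFIG_PCI_IOV the loop is
 * compiled out and nothing is touched.
 */
int demo_sriov_bar_fixup(const uint64_t *res_start, uint32_t *vf_bar)
{
	int cleared = 0;
#ifdef CONFIG_PCI_IOV
	int i;

	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++, vf_bar++) {
		if (!res_start[i]) {
			*vf_bar = 0; /* unmapped by host = unimplemented to user */
			cleared++;
		}
	}
#else
	(void)res_start;
	(void)vf_bar;
#endif
	return cleared;
}
```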

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 21515 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 1/3] pci: Extend PCI IOV API
  2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
  2016-08-18 22:09   ` Christoph Hellwig
@ 2016-08-22 18:51   ` kbuild test robot
  1 sibling, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2016-08-22 18:51 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kbuild-all, kvm, linux-pci, bhelgaas, alex.williamson, noaos,
	haggaie, ogerlitz, liranl, ilyal

[-- Attachment #1: Type: text/plain, Size: 2438 bytes --]

Hi Ilya,

[auto build test ERROR on vfio/next]
[also build test ERROR on v4.8-rc3 next-20160822]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
[Suggest to use git(>=2.9.0) format-patch --base=<commit> (or --base=auto for convenience) to record what (public, well-known) commit your patch series was built on]
[Check https://git-scm.com/docs/git-format-patch for more information]

url:    https://github.com/0day-ci/linux/commits/Ilya-Lesokhin/VFIO-SRIOV-support/20160818-153802
base:   https://github.com/awilliam/linux-vfio.git next
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 5.4.0-6) 5.4.0 20160609
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/eeh_driver.c: In function 'eeh_add_virt_device':
>> arch/powerpc/kernel/eeh_driver.c:444:2: error: too few arguments to function 'pci_iov_add_virtfn'
     pci_iov_add_virtfn(edev->physfn, pdn->vf_index, 0);
     ^
   In file included from arch/powerpc/kernel/eeh_driver.c:29:0:
   include/linux/pci.h:1746:5: note: declared here
    int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset,
        ^

vim +/pci_iov_add_virtfn +444 arch/powerpc/kernel/eeh_driver.c

67086e32 Wei Yang 2016-03-04  438  		eeh_pcid_put(dev);
67086e32 Wei Yang 2016-03-04  439  		if (driver->err_handler)
67086e32 Wei Yang 2016-03-04  440  			return NULL;
67086e32 Wei Yang 2016-03-04  441  	}
67086e32 Wei Yang 2016-03-04  442  
67086e32 Wei Yang 2016-03-04  443  #ifdef CONFIG_PPC_POWERNV
67086e32 Wei Yang 2016-03-04 @444  	pci_iov_add_virtfn(edev->physfn, pdn->vf_index, 0);
67086e32 Wei Yang 2016-03-04  445  #endif
67086e32 Wei Yang 2016-03-04  446  	return NULL;
67086e32 Wei Yang 2016-03-04  447  }
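
The error shows the usual hazard of adding a parameter to a shared function:
every caller, including arch code such as eeh_driver.c, has to be updated in
the same patch. The sketch below illustrates one common alternative, a thin
wrapper that keeps the old three-argument form. All names are hypothetical,
and the extra parameter is only a placeholder because the new declaration is
truncated in the log above.

```c
#include <stddef.h>

/* Hypothetical extended function: 'extra' stands in for whatever argument
 * the patch added. The return value encodes which inputs were seen. */
int demo_add_virtfn_ext(void *dev, int id, int reset, const char *extra)
{
	if (dev == NULL || id < 0)
		return -1;
	return (extra != NULL ? 2 : 0) + (reset ? 1 : 0);
}

/* Wrapper with the old signature, so existing callers build unchanged. */
static inline int demo_add_virtfn(void *dev, int id, int reset)
{
	return demo_add_virtfn_ext(dev, id, reset, NULL);
}
```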

:::::: The code at line 444 was first introduced by commit
:::::: 67086e32b56481531ab1292b284e074b1a8d764c powerpc/eeh: powerpc/eeh: Support error recovery for VF PE

:::::: TO: Wei Yang <weiyang@linux.vnet.ibm.com>
:::::: CC: Michael Ellerman <mpe@ellerman.id.au>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 49243 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V3 0/3] VFIO SRIOV support
  2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
                   ` (2 preceding siblings ...)
  2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
@ 2017-03-08  7:29 ` Jike Song
  2017-03-09  6:24     ` Ilya Lesokhin
  3 siblings, 1 reply; 14+ messages in thread
From: Jike Song @ 2017-03-08  7:29 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, noaos, haggaie,
	ogerlitz, liranl, You, Lizhen

On 08/18/2016 03:29 PM, Ilya Lesokhin wrote:
> Changes from V2:
>         1. Enabling and disabling SR-IOV is now done
>         through the sysfs interface, requiring
>         admin privileges.
>         2. Since admin privileges are now required
>         to enable SR-IOV, most of the security
>         measures introduced in RFC V2 were removed.
>         Unfortunately we still need a mutex to prevent
>         the VFIO user from changing the number of
>         VFs while enable_sriov is in progress.
> 
> Changes from V1:
>         1. The VFs are no longer assigned to the PF's IOMMU group
>         2. Add a pci_enable_sriov_with_override API to allow
>         enabling SR-IOV without probing the VFs with the
>         default driver
> 
> Changes from RFC V2:
>         1. pci_disable_sriov() is now called from a workqueue
>         to avoid the situation where a process is blocked
>         in pci_disable_sriov() waiting for itself to release the VFs.
>         2. a mutex was added to synchronize calls to
>         pci_enable_sriov() and pci_disable_sriov()
> 
> Changes from RFC V1:
>         Due to the security concern raised in RFC V1, we add two patches
>         to make sure the VFs belong to the same IOMMU group as
>         the PF and are probed by VFIO.
> 
> Today the QEMU hypervisor allows assigning a physical device to a VM,
> facilitating driver development. However, it does not support enabling
> SR-IOV by the VM kernel driver. Our goal is to implement such support,
> allowing developers working on SR-IOV physical function drivers to work
> inside VMs as well.
> 
> This patch series implements the kernel side of our solution.  It extends
> the VFIO driver to support the PCIe SR-IOV extended capability with the
> following features:
> 1. The ability to probe SR-IOV BAR sizes.
> 2. The ability to enable and disable SR-IOV.
> 
> This patch series is going to be used by QEMU to expose SR-IOV capabilities
> to the VM. We already have an early prototype based on Knut Omang's patches for
> SR-IOV[1].
> 
> Limitations:
> 1. Per SR-IOV spec section 3.3.12, PFs are required to support
> 4-KB, 8-KB, 64-KB, 256-KB, 1-MB, and 4-MB page sizes.
> Unfortunately, the kernel currently initializes the System Page Size register
> once and assumes it doesn't change; therefore we cannot allow guests to change
> this register at will. We currently map both the Supported Page Sizes and
> System Page Size as virtualized and read-only, in violation of the spec.
> In practice this is not an issue since both the hypervisor and the
> guest typically select the same System Page Size.
> 
> [1] https://github.com/knuto/qemu/tree/sriov_patches_v6
> 
> Ilya Lesokhin (3):
>   pci: Extend PCI IOV API
>   vfio/pci: Allow control SR-IOV through sysfs interface
>   vfio/pci: Add support for SR-IOV extended capablity
> 
>  drivers/pci/iov.c                   |  41 ++++++++--
>  drivers/vfio/pci/vfio_pci.c         |  43 ++++++++--
>  drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
>  drivers/vfio/pci/vfio_pci_private.h |   2 +
>  include/linux/pci.h                 |  13 +++-
>  5 files changed, 219 insertions(+), 31 deletions(-)
> 

+Lizhen


Hi Ilya,

Sorry for jumping in abruptly. We are also looking forward to using a PF
within a VM. Would you please share your plans with us? Will there be a
v4 shortly?


--
Thanks,
Jike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V3 0/3] VFIO SRIOV support
@ 2017-03-09  6:24     ` Ilya Lesokhin
  0 siblings, 0 replies; 14+ messages in thread
From: Ilya Lesokhin @ 2017-03-09  6:24 UTC (permalink / raw)
  To: Jike Song
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, Noa Osherovich,
	Haggai Eran, Or Gerlitz, Liran Liss, You, Lizhen

Hi Jike,
I don't have a plan to work on it in the near future, but we can share the code if you are interested.

Thanks,
Ilya

> -----Original Message-----
> From: Jike Song [mailto:jike.song@intel.com]
> Sent: Wednesday, March 08, 2017 9:30 AM
> To: Ilya Lesokhin <ilyal@mellanox.com>
> Cc: kvm@vger.kernel.org; linux-pci@vger.kernel.org; bhelgaas@google.com;
> alex.williamson@redhat.com; Noa Osherovich <noaos@mellanox.com>;
> Haggai Eran <haggaie@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>;
> Liran Liss <liranl@mellanox.com>; You, Lizhen <lizhen.you@intel.com>
> Subject: Re: [PATCH V3 0/3] VFIO SRIOV support
> 
> On 08/18/2016 03:29 PM, Ilya Lesokhin wrote:
> > Changes from V2:
> >         1. Enabling and disabling SR-IOV is now done
> >         through the sysfs interface, requiring
> >         admin privileges.
> >         2. Since admin privileges are now required
> >         to enable SR-IOV, most of the security
> >         measures introduced in RFC V2 were removed.
> >         Unfortunately we still need a mutex to prevent
> >         the VFIO user from changing the number of
> >         VFs while enable_sriov is in progress.
> >
> > Changes from V1:
> >         1. The VFs are no longer assigned to the PF's IOMMU group
> >         2. Add a pci_enable_sriov_with_override API to allow
> >         enabling SR-IOV without probing the VFs with the
> >         default driver
> >
> > Changes from RFC V2:
> >         1. pci_disable_sriov() is now called from a workqueue
> >         to avoid the situation where a process is blocked
> >         in pci_disable_sriov() waiting for itself to release the VFs.
> >         2. a mutex was added to synchronize calls to
> >         pci_enable_sriov() and pci_disable_sriov()
> >
> > Changes from RFC V1:
> >         Due to the security concern raised in RFC V1, we add two patches
> >         to make sure the VFs belong to the same IOMMU group as
> >         the PF and are probed by VFIO.
> >
> > Today the QEMU hypervisor allows assigning a physical device to a VM,
> > facilitating driver development. However, it does not support enabling
> > SR-IOV by the VM kernel driver. Our goal is to implement such support,
> > allowing developers working on SR-IOV physical function drivers to
> > work inside VMs as well.
> >
> > This patch series implements the kernel side of our solution.  It
> > extends the VFIO driver to support the PCIe SR-IOV extended capability
> > with the following features:
> > 1. The ability to probe SR-IOV BAR sizes.
> > 2. The ability to enable and disable SR-IOV.
> >
> > This patch series is going to be used by QEMU to expose SR-IOV
> > capabilities to the VM. We already have an early prototype based on Knut
> > Omang's patches for SR-IOV[1].
> >
> > Limitations:
> > 1. Per SR-IOV spec section 3.3.12, PFs are required to support 4-KB,
> > 8-KB, 64-KB, 256-KB, 1-MB, and 4-MB page sizes.
> > Unfortunately, the kernel currently initializes the System Page Size
> > register once and assumes it doesn't change; therefore we cannot allow
> > guests to change this register at will. We currently map both the
> > Supported Page Sizes and System Page Size as virtualized and read-only,
> > in violation of the spec.
> > In practice this is not an issue since both the hypervisor and the
> > guest typically select the same System Page Size.
> >
> > [1] https://github.com/knuto/qemu/tree/sriov_patches_v6
> >
> > Ilya Lesokhin (3):
> >   pci: Extend PCI IOV API
> >   vfio/pci: Allow control SR-IOV through sysfs interface
> >   vfio/pci: Add support for SR-IOV extended capablity
> >
> >  drivers/pci/iov.c                   |  41 ++++++++--
> >  drivers/vfio/pci/vfio_pci.c         |  43 ++++++++--
> >  drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
> >  drivers/vfio/pci/vfio_pci_private.h |   2 +
> >  include/linux/pci.h                 |  13 +++-
> >  5 files changed, 219 insertions(+), 31 deletions(-)
> >
> 
> +Lizhen
> 
> 
> Hi Ilya,
> 
> Sorry for jumping in abruptly. We are also looking forward to using a PF
> within a VM. Would you please share your plans with us? Will there be
> a v4 shortly?
> 
> 
> --
> Thanks,
> Jike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V3 0/3] VFIO SRIOV support
@ 2017-03-09  6:29       ` You, Lizhen
  0 siblings, 0 replies; 14+ messages in thread
From: You, Lizhen @ 2017-03-09  6:29 UTC (permalink / raw)
  To: Ilya Lesokhin, Song, Jike
  Cc: kvm, linux-pci, bhelgaas, alex.williamson, Noa Osherovich,
	Haggai Eran, Or Gerlitz, Liran Liss

Hi Ilya,

We'd like to give it a try. If you can share the code, that would be really appreciated! Do you also have a copy of the related QEMU code?

Thanks,
Lizhen

-----Original Message-----
From: Ilya Lesokhin [mailto:ilyal@mellanox.com] 
Sent: Thursday, March 9, 2017 2:24 PM
To: Song, Jike <jike.song@intel.com>
Cc: kvm@vger.kernel.org; linux-pci@vger.kernel.org; bhelgaas@google.com; alex.williamson@redhat.com; Noa Osherovich <noaos@mellanox.com>; Haggai Eran <haggaie@mellanox.com>; Or Gerlitz <ogerlitz@mellanox.com>; Liran Liss <liranl@mellanox.com>; You, Lizhen <lizhen.you@intel.com>
Subject: RE: [PATCH V3 0/3] VFIO SRIOV support

Hi Jike,
I don't have a plan to work on it in the near future, but we can share the code if you are interested.

Thanks,
Ilya

> -----Original Message-----
> From: Jike Song [mailto:jike.song@intel.com]
> Sent: Wednesday, March 08, 2017 9:30 AM
> To: Ilya Lesokhin <ilyal@mellanox.com>
> Cc: kvm@vger.kernel.org; linux-pci@vger.kernel.org; 
> bhelgaas@google.com; alex.williamson@redhat.com; Noa Osherovich 
> <noaos@mellanox.com>; Haggai Eran <haggaie@mellanox.com>; Or Gerlitz 
> <ogerlitz@mellanox.com>; Liran Liss <liranl@mellanox.com>; You, Lizhen 
> <lizhen.you@intel.com>
> Subject: Re: [PATCH V3 0/3] VFIO SRIOV support
> 
> On 08/18/2016 03:29 PM, Ilya Lesokhin wrote:
> > Changes from V2:
> >         1. Enabling and disabling SR-IOV is now done
> >         through the sysfs interface, requiring
> >         admin privileges.
> >         2. Since admin privileges are now required
> >         to enable SR-IOV, most of the security
> >         measures introduced in RFC V2 were removed.
> >         Unfortunately we still need a mutex to prevent
> >         the VFIO user from changing the number of
> >         VFs while enable_sriov is in progress.
> >
> > Changes from V1:
> >         1. The VFs are no longer assigned to the PF's IOMMU group
> >         2. Add a pci_enable_sriov_with_override API to allow
> >         enabling SR-IOV without probing the VFs with the
> >         default driver
> >
> > Changes from RFC V2:
> >         1. pci_disable_sriov() is now called from a workqueue
> >         to avoid the situation where a process is blocked
> >         in pci_disable_sriov() waiting for itself to release the VFs.
> >         2. a mutex was added to synchronize calls to
> >         pci_enable_sriov() and pci_disable_sriov()
> >
> > Changes from RFC V1:
> >         Due to the security concern raised in RFC V1, we add two patches
> >         to make sure the VFs belong to the same IOMMU group as
> >         the PF and are probed by VFIO.
> >
> > Today the QEMU hypervisor allows assigning a physical device to a 
> > VM, facilitating driver development. However, it does not support 
> > enabling SR-IOV by the VM kernel driver. Our goal is to implement 
> > such support, allowing developers working on SR-IOV physical 
> > function drivers to work inside VMs as well.
> >
> > This patch series implements the kernel side of our solution.  It 
> > extends the VFIO driver to support the PCIe SR-IOV extended
> > capability with the following features:
> > 1. The ability to probe SR-IOV BAR sizes.
> > 2. The ability to enable and disable SR-IOV.
> >
> > This patch series is going to be used by QEMU to expose SR-IOV 
> > capabilities to VM. We already have an early prototype based on Knut 
> > Omang's patches for SR-IOV[1].
> >
> > Limitations:
> > 1. Per SR-IOV spec section 3.3.12, PFs are required to support 4-KB, 
> > 8-KB, 64-KB, 256-KB, 1-MB, and 4-MB page sizes.
> > Unfortunately, the kernel currently initializes the System Page Size
> > register once and assumes it does not change, so we cannot
> > allow guests to change this register at will. We currently map both
> > the Supported Page Sizes and System Page Size as virtualized and
> > read-only, in violation of the spec.
> > In practice this is not an issue since both the hypervisor and the 
> > guest typically select the same System Page Size.
> >
> > [1] https://github.com/knuto/qemu/tree/sriov_patches_v6
> >
> > Ilya Lesokhin (3):
> >   pci: Extend PCI IOV API
> >   vfio/pci: Allow control SR-IOV through sysfs interface
> >   vfio/pci: Add support for SR-IOV extended capablity
> >
> >  drivers/pci/iov.c                   |  41 ++++++++--
> >  drivers/vfio/pci/vfio_pci.c         |  43 ++++++++--
> >  drivers/vfio/pci/vfio_pci_config.c  | 151 ++++++++++++++++++++++++++++++++----
> >  drivers/vfio/pci/vfio_pci_private.h |   2 +
> >  include/linux/pci.h                 |  13 +++-
> >  5 files changed, 219 insertions(+), 31 deletions(-)
> >
> 
> +Lizhen
> 
> 
> Hi Ilya,
> 
> Sorry for jumping in abruptly. We are also looking forward to having a
> PF used within a VM. Would you please share your next plans with us?
> Is a v4 likely soon?
> 
> 
> --
> Thanks,
> Jike

end of thread, other threads:[~2017-03-09  6:32 UTC | newest]

Thread overview: 14+ messages
2016-08-18  7:29 [PATCH V3 0/3] VFIO SRIOV support Ilya Lesokhin
2016-08-18  7:29 ` [PATCH V3 1/3] pci: Extend PCI IOV API Ilya Lesokhin
2016-08-18 22:09   ` Christoph Hellwig
2016-08-22 18:51   ` kbuild test robot
2016-08-18  7:29 ` [PATCH V3 2/3] vfio/pci: Allow control SR-IOV through sysfs interface Ilya Lesokhin
2016-08-18 22:11   ` Christoph Hellwig
2016-08-18  7:29 ` [PATCH V3 3/3] vfio/pci: Add support for SR-IOV extended capablity Ilya Lesokhin
2016-08-18 20:32   ` Alex Williamson
2016-08-22  6:48   ` kbuild test robot
2017-03-08  7:29 ` [PATCH V3 0/3] VFIO SRIOV support Jike Song
2017-03-09  6:24   ` Ilya Lesokhin
2017-03-09  6:24     ` Ilya Lesokhin
2017-03-09  6:29     ` You, Lizhen
2017-03-09  6:29       ` You, Lizhen
