All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC net-next PATCH 0/4] Add new settings for ethtool
@ 2011-07-27 22:17 Greg Rose
  2011-07-27 22:17 ` [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM Greg Rose
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Greg Rose @ 2011-07-27 22:17 UTC (permalink / raw)
  To: netdev; +Cc: davem, bhutchings, jeffrey.t.kirsher

This series of patches implements several changes to the way SR-IOV
can be configured to allow for per port assignment of the number of
VFs.

In addition, patches 1/4 and 2/4 of this patch series adds a new flag
bit to the PCI device structure dev_flags that is used by the kvm
module to indicate when a PCI device is assigned to a guest virtual
machine (VM) and changes to the ixgbe driver to make use of this new
bit.  The advantage these first two patches provide is to prevent
destruction of virtual functions (VFs) by the physical function
(PF) driver when it is removed if the PF driver finds that any VFs
are still assigned to guest VMs.  This in turn prevents panics that
will result when the VF device disappears from the guest VM without
notification.

Patches 3/4 and 4/4 of this patch series implement new settings for
Ethtool that allow for reconfiguration of the number of VFs per PF
without resorting to use of the module parameter which is only capable
of setting a single number of VFs per PF, to set the number of VM
queues for devices that support it and an on/off switch for the
anti-spoofing feature.  A follow on patch for Ethtool that implements
these settings will be sent separately.

Patches 1/4 and 2/4 of this series could also be considered for
submission to the current net tree and also the stable tree since
they prevent catastrophic results when the PF driver is removed
while VFs are assigned to guest VMs.

---

Greg Rose (4):
      ixgbe: Add support for new ethtool settings
      ethtool: Add new set commands
      ixgbe: Reconfigure SR-IOV Init
      pci: Add flag indicating device has been assigned by KVM


 drivers/net/ixgbe/ixgbe.h         |    2 
 drivers/net/ixgbe/ixgbe_ethtool.c |   96 ++++++++++++++++++
 drivers/net/ixgbe/ixgbe_main.c    |  109 ++------------------
 drivers/net/ixgbe/ixgbe_sriov.c   |  197 +++++++++++++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_sriov.h   |    6 +
 drivers/net/ixgbe/ixgbe_type.h    |    4 +
 include/linux/ethtool.h           |   11 ++
 include/linux/pci.h               |    2 
 virt/kvm/assigned-dev.c           |    2 
 virt/kvm/iommu.c                  |    4 +
 10 files changed, 333 insertions(+), 100 deletions(-)

-- 
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-27 22:17 [RFC net-next PATCH 0/4] Add new settings for ethtool Greg Rose
@ 2011-07-27 22:17 ` Greg Rose
  2011-07-28 15:11   ` Ian Campbell
  2011-07-27 22:17 ` [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init Greg Rose
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 32+ messages in thread
From: Greg Rose @ 2011-07-27 22:17 UTC (permalink / raw)
  To: netdev; +Cc: davem, bhutchings, jeffrey.t.kirsher

Device drivers that create and destroy SR-IOV virtual functions via
calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
failures if they attempt to destroy VFs while they are assigned to
guest virtual machines.  By adding a flag for use by the KVM module
to indicate that a device is assigned a device driver can check that
flag and avoid destroying VFs while they are assigned and avoid system
failures.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
---

 include/linux/pci.h     |    2 ++
 virt/kvm/assigned-dev.c |    2 ++
 virt/kvm/iommu.c        |    4 ++++
 3 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2d29218..a297ca2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -174,6 +174,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG = (__force pci_dev_flags_t) 1,
 	/* Device configuration is irrevocably lost if disabled into D3 */
 	PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
+	/* Provide indication device is assigned by KVM */
+	PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
 };
 
 enum pci_irq_reroute_variant {
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 6cc4b97..f401de1 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -205,6 +205,8 @@ static void kvm_free_assigned_device(struct kvm *kvm,
 	else
 		pci_restore_state(assigned_dev->dev);
 
+	assigned_dev->dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
+
 	pci_release_regions(assigned_dev->dev);
 	pci_disable_device(assigned_dev->dev);
 	pci_dev_put(assigned_dev->dev);
diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index 62a9caf..cffc530 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -181,6 +181,8 @@ int kvm_assign_device(struct kvm *kvm,
 			goto out_unmap;
 	}
 
+	pdev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
+
 	printk(KERN_DEBUG "assign device %x:%x:%x.%x\n",
 		assigned_dev->host_segnr,
 		assigned_dev->host_busnr,
@@ -209,6 +211,8 @@ int kvm_deassign_device(struct kvm *kvm,
 
 	iommu_detach_device(domain, &pdev->dev);
 
+	pdev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
+
 	printk(KERN_DEBUG "deassign device %x:%x:%x.%x\n",
 		assigned_dev->host_segnr,
 		assigned_dev->host_busnr,


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init
  2011-07-27 22:17 [RFC net-next PATCH 0/4] Add new settings for ethtool Greg Rose
  2011-07-27 22:17 ` [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM Greg Rose
@ 2011-07-27 22:17 ` Greg Rose
  2011-07-28  5:26   ` David Miller
  2011-07-27 22:17 ` [RFC net-next PATCH 3/4] ethtool: Add new set commands Greg Rose
  2011-07-27 22:18 ` [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool settings Greg Rose
  3 siblings, 1 reply; 32+ messages in thread
From: Greg Rose @ 2011-07-27 22:17 UTC (permalink / raw)
  To: netdev; +Cc: davem, bhutchings, jeffrey.t.kirsher

Use the PCI device flag indicating if a VF is assigned to a guest VM
to guard against destroying VFs upon driver removal.  Implement
additional feature to detect if VFs already exist when the driver
is loaded and if so configure them and set the driver state to
SR-IOV enabled.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
---

 drivers/net/ixgbe/ixgbe.h       |    1 
 drivers/net/ixgbe/ixgbe_main.c  |  104 ++-------------------
 drivers/net/ixgbe/ixgbe_sriov.c |  195 +++++++++++++++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_sriov.h |    5 +
 drivers/net/ixgbe/ixgbe_type.h  |    4 +
 5 files changed, 213 insertions(+), 96 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index e04a8e4..ed9836f 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -120,6 +120,7 @@ struct vf_data_storage {
 	u16 pf_vlan; /* When set, guest VLAN config not allowed. */
 	u16 pf_qos;
 	u16 tx_rate;
+	struct pci_dev *vfdev;
 };
 
 struct vf_macvlans {
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 1be6175..06ba9f2 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -160,41 +160,6 @@ MODULE_VERSION(DRV_VERSION);
 
 #define DEFAULT_DEBUG_LEVEL_SHIFT 3
 
-static inline void ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
-{
-	struct ixgbe_hw *hw = &adapter->hw;
-	u32 gcr;
-	u32 gpie;
-	u32 vmdctl;
-
-#ifdef CONFIG_PCI_IOV
-	/* disable iov and allow time for transactions to clear */
-	pci_disable_sriov(adapter->pdev);
-#endif
-
-	/* turn off device IOV mode */
-	gcr = IXGBE_READ_REG(hw, IXGBE_GCR_EXT);
-	gcr &= ~(IXGBE_GCR_EXT_SRIOV);
-	IXGBE_WRITE_REG(hw, IXGBE_GCR_EXT, gcr);
-	gpie = IXGBE_READ_REG(hw, IXGBE_GPIE);
-	gpie &= ~IXGBE_GPIE_VTMODE_MASK;
-	IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie);
-
-	/* set default pool back to 0 */
-	vmdctl = IXGBE_READ_REG(hw, IXGBE_VT_CTL);
-	vmdctl &= ~IXGBE_VT_CTL_POOL_MASK;
-	IXGBE_WRITE_REG(hw, IXGBE_VT_CTL, vmdctl);
-
-	/* take a breather then clean up driver data */
-	msleep(100);
-
-	kfree(adapter->vfinfo);
-	adapter->vfinfo = NULL;
-
-	adapter->num_vfs = 0;
-	adapter->flags &= ~IXGBE_FLAG_SRIOV_ENABLED;
-}
-
 static void ixgbe_service_event_schedule(struct ixgbe_adapter *adapter)
 {
 	if (!test_bit(__IXGBE_DOWN, &adapter->state) &&
@@ -7237,11 +7202,8 @@ static void __devinit ixgbe_probe_vf(struct ixgbe_adapter *adapter,
 {
 #ifdef CONFIG_PCI_IOV
 	struct ixgbe_hw *hw = &adapter->hw;
-	int err;
-	int num_vf_macvlans, i;
-	struct vf_macvlans *mv_list;
 
-	if (hw->mac.type == ixgbe_mac_82598EB || !max_vfs)
+	if (hw->mac.type == ixgbe_mac_82598EB)
 		return;
 
 	/* The 82599 supports up to 64 VFs per physical function
@@ -7250,60 +7212,7 @@ static void __devinit ixgbe_probe_vf(struct ixgbe_adapter *adapter,
 	 * physical function
 	 */
 	adapter->num_vfs = (max_vfs > 63) ? 63 : max_vfs;
-	adapter->flags |= IXGBE_FLAG_SRIOV_ENABLED;
-	err = pci_enable_sriov(adapter->pdev, adapter->num_vfs);
-	if (err) {
-		e_err(probe, "Failed to enable PCI sriov: %d\n", err);
-		goto err_novfs;
-	}
-
-	num_vf_macvlans = hw->mac.num_rar_entries -
-		(IXGBE_MAX_PF_MACVLANS + 1 + adapter->num_vfs);
-
-	adapter->mv_list = mv_list = kcalloc(num_vf_macvlans,
-					     sizeof(struct vf_macvlans),
-					     GFP_KERNEL);
-	if (mv_list) {
-		/* Initialize list of VF macvlans */
-		INIT_LIST_HEAD(&adapter->vf_mvs.l);
-		for (i = 0; i < num_vf_macvlans; i++) {
-			mv_list->vf = -1;
-			mv_list->free = true;
-			mv_list->rar_entry = hw->mac.num_rar_entries -
-				(i + adapter->num_vfs + 1);
-			list_add(&mv_list->l, &adapter->vf_mvs.l);
-			mv_list++;
-		}
-	}
-
-	/* If call to enable VFs succeeded then allocate memory
-	 * for per VF control structures.
-	 */
-	adapter->vfinfo =
-		kcalloc(adapter->num_vfs,
-			sizeof(struct vf_data_storage), GFP_KERNEL);
-	if (adapter->vfinfo) {
-		/* Now that we're sure SR-IOV is enabled
-		 * and memory allocated set up the mailbox parameters
-		 */
-		ixgbe_init_mbx_params_pf(hw);
-		memcpy(&hw->mbx.ops, ii->mbx_ops,
-		       sizeof(hw->mbx.ops));
-
-		/* Disable RSC when in SR-IOV mode */
-		adapter->flags2 &= ~(IXGBE_FLAG2_RSC_CAPABLE |
-				     IXGBE_FLAG2_RSC_ENABLED);
-		return;
-	}
-
-	/* Oh oh */
-	e_err(probe, "Unable to allocate memory for VF Data Storage - "
-	      "SRIOV disabled\n");
-	pci_disable_sriov(adapter->pdev);
-
-err_novfs:
-	adapter->flags &= ~IXGBE_FLAG_SRIOV_ENABLED;
-	adapter->num_vfs = 0;
+	ixgbe_enable_sriov(adapter, ii);
 #endif /* CONFIG_PCI_IOV */
 }
 
@@ -7752,8 +7661,13 @@ static void __devexit ixgbe_remove(struct pci_dev *pdev)
 	if (netdev->reg_state == NETREG_REGISTERED)
 		unregister_netdev(netdev);
 
-	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
-		ixgbe_disable_sriov(adapter);
+	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
+		if (!(ixgbe_check_vf_assignment(adapter)))
+			ixgbe_disable_sriov(adapter);
+		else
+			e_dev_warn("Unloading driver while VFs are assigned "
+				   "- VFs will not be deallocated\n");
+	}
 
 	ixgbe_clear_interrupt_scheme(adapter);
 
diff --git a/drivers/net/ixgbe/ixgbe_sriov.c b/drivers/net/ixgbe/ixgbe_sriov.c
index d99d01e..cdb2f0c 100644
--- a/drivers/net/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ixgbe/ixgbe_sriov.c
@@ -43,6 +43,158 @@
 
 #include "ixgbe_sriov.h"
 
+static int ixgbe_find_enabled_vfs(struct ixgbe_adapter *adapter)
+{
+	struct pci_dev *pdev = adapter->pdev;
+	struct pci_dev *pvfdev;
+	u16 vf_devfn = 0;
+	int device_id;
+	int vfs_found = 0;
+
+	switch (adapter->hw.mac.type) {
+		case ixgbe_mac_82599EB:
+			device_id = IXGBE_DEV_ID_82599_VF;
+			break;
+		case ixgbe_mac_X540:
+			device_id = IXGBE_DEV_ID_X540_VF;
+			break;
+		default:
+			device_id = 0;
+			break;
+	}
+
+	vf_devfn = pdev->devfn + 0x80;
+	pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID, device_id, NULL);
+	while (pvfdev) {
+		if (pvfdev->devfn == vf_devfn)
+			vfs_found++;
+		vf_devfn += 2;
+		pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID,
+					device_id, pvfdev);
+	}
+
+	return vfs_found;
+}
+
+void ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
+			 const struct ixgbe_info *ii)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	int err = 0;
+	int num_vf_macvlans, i;
+	struct vf_macvlans *mv_list;
+	int pre_existing_vfs = 0;
+
+	pre_existing_vfs = ixgbe_find_enabled_vfs(adapter);
+	if (!pre_existing_vfs && !adapter->num_vfs)
+		return;
+
+	/* If there are pre-existing VFs then we have to force
+ 	 * use of that many because they were not deleted the last
+ 	 * time someone removed the PF driver.  That would have
+ 	 * been because they were allocated to guest VMs and can't
+ 	 * be removed.  Go ahead and just re-enable the old amount.
+ 	 * If the user wants to change the number of VFs they can
+ 	 * use ethtool while making sure no VFs are allocated to
+ 	 * guest VMs... i.e. the right way.
+ 	 */
+	if (pre_existing_vfs)
+		adapter->num_vfs = pre_existing_vfs;
+	else
+		err = pci_enable_sriov(adapter->pdev, adapter->num_vfs);
+	if (err) {
+		e_err(probe, "Failed to enable PCI sriov: %d\n", err);
+		goto err_novfs;
+	}
+	adapter->flags |= IXGBE_FLAG_SRIOV_ENABLED;
+
+	e_info(probe, "SR-IOV enabled with %d VFs\n", adapter->num_vfs);
+
+	num_vf_macvlans = hw->mac.num_rar_entries -
+	(IXGBE_MAX_PF_MACVLANS + 1 + adapter->num_vfs);
+
+	adapter->mv_list = mv_list = kcalloc(num_vf_macvlans,
+					     sizeof(struct vf_macvlans),
+					     GFP_KERNEL);
+	if (mv_list) {
+		/* Initialize list of VF macvlans */
+		INIT_LIST_HEAD(&adapter->vf_mvs.l);
+		for (i = 0; i < num_vf_macvlans; i++) {
+			mv_list->vf = -1;
+			mv_list->free = true;
+			mv_list->rar_entry = hw->mac.num_rar_entries -
+				(i + adapter->num_vfs + 1);
+			list_add(&mv_list->l, &adapter->vf_mvs.l);
+			mv_list++;
+		}
+	}
+
+	/* If call to enable VFs succeeded then allocate memory
+	 * for per VF control structures.
+	 */
+	adapter->vfinfo =
+		kcalloc(adapter->num_vfs,
+			sizeof(struct vf_data_storage), GFP_KERNEL);
+	if (adapter->vfinfo) {
+		/* Now that we're sure SR-IOV is enabled
+		 * and memory allocated set up the mailbox parameters
+		 */
+		ixgbe_init_mbx_params_pf(hw);
+		memcpy(&hw->mbx.ops, ii->mbx_ops,
+		       sizeof(hw->mbx.ops));
+
+		/* Disable RSC when in SR-IOV mode */
+		adapter->flags2 &= ~(IXGBE_FLAG2_RSC_CAPABLE |
+				     IXGBE_FLAG2_RSC_ENABLED);
+		return;
+	}
+
+	/* Oh oh */
+	e_err(probe, "Unable to allocate memory for VF Data Storage - "
+	      "SRIOV disabled\n");
+	pci_disable_sriov(adapter->pdev);
+
+err_novfs:
+	adapter->flags &= ~IXGBE_FLAG_SRIOV_ENABLED;
+	adapter->num_vfs = 0;
+}
+
+void ixgbe_disable_sriov(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	u32 gcr;
+	u32 gpie;
+	u32 vmdctl;
+
+#ifdef CONFIG_PCI_IOV
+	/* disable iov and allow time for transactions to clear */
+	pci_disable_sriov(adapter->pdev);
+#endif
+
+	/* turn off device IOV mode */
+	gcr = IXGBE_READ_REG(hw, IXGBE_GCR_EXT);
+	gcr &= ~(IXGBE_GCR_EXT_SRIOV);
+	IXGBE_WRITE_REG(hw, IXGBE_GCR_EXT, gcr);
+	gpie = IXGBE_READ_REG(hw, IXGBE_GPIE);
+	gpie &= ~IXGBE_GPIE_VTMODE_MASK;
+	IXGBE_WRITE_REG(hw, IXGBE_GPIE, gpie);
+
+	/* set default pool back to 0 */
+	vmdctl = IXGBE_READ_REG(hw, IXGBE_VT_CTL);
+	vmdctl &= ~IXGBE_VT_CTL_POOL_MASK;
+	IXGBE_WRITE_REG(hw, IXGBE_VT_CTL, vmdctl);
+
+	/* take a breather then clean up driver data */
+	msleep(100);
+
+	kfree(adapter->vfinfo);
+	kfree(adapter->mv_list);
+	adapter->vfinfo = NULL;
+
+	adapter->num_vfs = 0;
+	adapter->flags &= ~IXGBE_FLAG_SRIOV_ENABLED;
+}
+
 static int ixgbe_set_vf_multicasts(struct ixgbe_adapter *adapter,
 				   int entries, u16 *hash_list, u32 vf)
 {
@@ -273,12 +425,28 @@ static int ixgbe_set_vf_macvlan(struct ixgbe_adapter *adapter,
 	return 0;
 }
 
+int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter)
+{
+	int i;
+	for (i = 0; i < adapter->num_vfs; i++) {
+		if (adapter->vfinfo[i].vfdev->dev_flags &
+			PCI_DEV_FLAGS_ASSIGNED) {
+		return true;
+		}
+	}
+	return false;
+}
+
 int ixgbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask)
 {
 	unsigned char vf_mac_addr[6];
 	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
 	unsigned int vfn = (event_mask & 0x3f);
-
+	struct pci_dev *pvfdev;
+	unsigned int device_id;
+	u16 thisvf_devfn = (pdev->devfn + 0x80 + (vfn << 1)) |
+				(pdev->devfn & 1);
+ 
 	bool enable = ((event_mask & 0x10000000U) != 0);
 
 	if (enable) {
@@ -290,6 +458,31 @@ int ixgbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask)
 		 * for it later.
 		 */
 		memcpy(adapter->vfinfo[vfn].vf_mac_addresses, vf_mac_addr, 6);
+
+		switch (adapter->hw.mac.type) {
+		case ixgbe_mac_82599EB:
+			device_id = IXGBE_DEV_ID_82599_VF;
+			break;
+		case ixgbe_mac_X540:
+			device_id = IXGBE_DEV_ID_X540_VF;
+			break;
+		default:
+			device_id = 0;
+			break;
+		}
+
+		pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID, device_id, NULL);
+		while (pvfdev) {
+			if (pvfdev->devfn == thisvf_devfn)
+				break;
+			pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID,
+						device_id, pvfdev);
+		}
+		if (pvfdev)
+			adapter->vfinfo[vfn].vfdev = pvfdev;
+		else
+			e_err(drv, "Couldn't find pci dev ptr for VF %4.4x\n",
+			      thisvf_devfn);
 	}
 
 	return 0;
diff --git a/drivers/net/ixgbe/ixgbe_sriov.h b/drivers/net/ixgbe/ixgbe_sriov.h
index 3417556..2781847 100644
--- a/drivers/net/ixgbe/ixgbe_sriov.h
+++ b/drivers/net/ixgbe/ixgbe_sriov.h
@@ -41,6 +41,11 @@ int ixgbe_ndo_set_vf_bw(struct net_device *netdev, int vf, int tx_rate);
 int ixgbe_ndo_get_vf_config(struct net_device *netdev,
 			    int vf, struct ifla_vf_info *ivi);
 void ixgbe_check_vf_rate_limit(struct ixgbe_adapter *adapter);
+void ixgbe_disable_sriov(struct ixgbe_adapter *adapter);
+void ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
+			const struct ixgbe_info *ii);
+int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter);
+
 
 #endif /* _IXGBE_SRIOV_H_ */
 
diff --git a/drivers/net/ixgbe/ixgbe_type.h b/drivers/net/ixgbe/ixgbe_type.h
index e0d970e..648cf15 100644
--- a/drivers/net/ixgbe/ixgbe_type.h
+++ b/drivers/net/ixgbe/ixgbe_type.h
@@ -65,6 +65,10 @@
 #define IXGBE_DEV_ID_82599_LS            0x154F
 #define IXGBE_DEV_ID_X540T               0x1528
 
+/* VF Device IDs */
+#define IXGBE_DEV_ID_82599_VF           0x10ED
+#define IXGBE_DEV_ID_X540_VF            0x1515
+
 /* General Registers */
 #define IXGBE_CTRL      0x00000
 #define IXGBE_STATUS    0x00008


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-27 22:17 [RFC net-next PATCH 0/4] Add new settings for ethtool Greg Rose
  2011-07-27 22:17 ` [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM Greg Rose
  2011-07-27 22:17 ` [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init Greg Rose
@ 2011-07-27 22:17 ` Greg Rose
  2011-07-28  5:27   ` David Miller
  2011-07-28 21:20   ` Ben Hutchings
  2011-07-27 22:18 ` [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool settings Greg Rose
  3 siblings, 2 replies; 32+ messages in thread
From: Greg Rose @ 2011-07-27 22:17 UTC (permalink / raw)
  To: netdev; +Cc: davem, bhutchings, jeffrey.t.kirsher

Add new set commands to configure the number of SR-IOV VFs, the
number of VM queues and spoof checking on/off switch.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
---

 include/linux/ethtool.h |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index c6e427a..c4972ba 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -36,12 +36,14 @@ struct ethtool_cmd {
 	__u8	mdio_support;
 	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
 	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
+	__u32	num_vfs;	/* Enable SR-IOV VFs */
+	__u32	num_vmqs;	/* Set number of queues for VMDq */
 	__u16	speed_hi;       /* The forced speed (upper
 				 * bits) in Mbps. Please use
 				 * ethtool_cmd_speed()/_set() to
 				 * access it */
 	__u8	eth_tp_mdix;
-	__u8	reserved2;
+	__u8	spoof_check;	/* Enable/Disable anti-spoofing */
 	__u32	lp_advertising;	/* Features the link partner advertises */
 	__u32	reserved[2];
 };
@@ -1121,6 +1123,13 @@ struct ethtool_ops {
 #define AUTONEG_DISABLE		0x00
 #define AUTONEG_ENABLE		0x01
 
+/* Enable or disable MAC and/or VLAN spoofchecking.If this is
+ * set to enable, then depending on the controller capabilities
+ * MAC and/or VLAN spoofing will be turned on.
+ */
+#define SPOOFCHECK_DISABLE     0x00
+#define SPOOFCHECK_ENABLE      0x01
+
 /* Mode MDI or MDI-X */
 #define ETH_TP_MDI_INVALID	0x00
 #define ETH_TP_MDI		0x01


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool settings
  2011-07-27 22:17 [RFC net-next PATCH 0/4] Add new settings for ethtool Greg Rose
                   ` (2 preceding siblings ...)
  2011-07-27 22:17 ` [RFC net-next PATCH 3/4] ethtool: Add new set commands Greg Rose
@ 2011-07-27 22:18 ` Greg Rose
  2011-07-28 11:54   ` Michał Mirosław
  3 siblings, 1 reply; 32+ messages in thread
From: Greg Rose @ 2011-07-27 22:18 UTC (permalink / raw)
  To: netdev; +Cc: davem, bhutchings, jeffrey.t.kirsher

Adds ixgbe driver support for new ethtool settings for SR-IOV re-init,
number of VM queues and anti-spoofing ON/OFF switch.


Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
---

 drivers/net/ixgbe/ixgbe.h         |    1 
 drivers/net/ixgbe/ixgbe_ethtool.c |   96 +++++++++++++++++++++++++++++++++++++
 drivers/net/ixgbe/ixgbe_main.c    |    5 ++
 drivers/net/ixgbe/ixgbe_sriov.c   |    2 -
 drivers/net/ixgbe/ixgbe_sriov.h   |    3 -
 5 files changed, 103 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index ed9836f..c826d7e 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -504,6 +504,7 @@ struct ixgbe_adapter {
 	struct hlist_head fdir_filter_list;
 	union ixgbe_atr_input fdir_mask;
 	int fdir_filter_count;
+	const struct ixgbe_info *saved_ii;
 };
 
 struct ixgbe_fdir_filter {
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index dc64955..4a8d3e5 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -38,6 +38,7 @@
 #include <linux/uaccess.h>
 
 #include "ixgbe.h"
+#include "ixgbe_sriov.h"
 
 
 #define IXGBE_ALL_RAR_ENTRIES 16
@@ -314,6 +315,67 @@ static int ixgbe_get_settings(struct net_device *netdev,
 	return 0;
 }
 
+static int ixgbe_reinit_sriov(struct net_device *netdev, int new_vfs)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(netdev);
+	struct pci_dev *pdev = adapter->pdev;
+	int err;
+	int i;
+
+	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
+		if (ixgbe_check_vf_assignment(adapter)) {
+			netdev_warn(netdev, "%s	",
+				    "reconfigure of SR-IOV VFs "
+				    "not supported while VFs are "
+				    "assigned to guest VMs\n");
+			return -EBUSY;
+		}
+	}
+	if (netif_running(netdev)) {
+		netdev_warn(netdev, "%s",
+			    "Cannot reconfigure SR-IOV "
+			    "while interface is up\n"
+			    "Please bring the interface "
+			    "down first\n");
+		return -EBUSY;
+	}
+
+	ixgbe_clear_interrupt_scheme(adapter);
+
+	if (adapter->num_vfs)
+		ixgbe_disable_sriov(adapter);
+
+	adapter->num_vfs = (new_vfs > 63) ? 63 : new_vfs;
+
+	if (adapter->num_vfs) {
+		ixgbe_enable_sriov(adapter, adapter->saved_ii);
+		for (i = 0; i < adapter->num_vfs; i++)
+			ixgbe_vf_configure(pdev, (i | 0x10000000));
+	}
+
+	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
+		adapter->flags &= ~(IXGBE_FLAG_RSS_ENABLED |
+				    IXGBE_FLAG_DCB_ENABLED);
+		netdev->features &= ~NETIF_F_RXHASH;
+	} else {
+		adapter->flags |= IXGBE_FLAG_RSS_ENABLED;
+		netdev->features |= NETIF_F_RXHASH;
+	}
+
+	err = ixgbe_init_interrupt_scheme(adapter);
+	/*
+	 * If we can't init some sort of interrupt scheme then the device
+	 * is hosed - just print a warning and bail.  Nothing will work
+	 * but at least we've put a message in the system log telling why.
+	 */
+	if (err)
+		e_dev_err("Cannot initialize interrupts for device\n");
+	else
+		ixgbe_reset(adapter);
+
+	return err;
+}
+
 static int ixgbe_set_settings(struct net_device *netdev,
                               struct ethtool_cmd *ecmd)
 {
@@ -322,6 +384,40 @@ static int ixgbe_set_settings(struct net_device *netdev,
 	u32 advertised, old;
 	s32 err = 0;
 
+	if (hw->mac.type == ixgbe_mac_82598EB)
+		goto skip_sriov_checks;
+
+	if (ecmd->num_vfs != adapter->num_vfs) {
+		if (!(adapter->flags & IXGBE_FLAG_DCB_ENABLED)) {
+			err = ixgbe_reinit_sriov(netdev, ecmd->num_vfs);
+			if (err)
+				return err;
+		} else {
+			return -EINVAL;
+		}
+	}
+
+	if ((ecmd->spoof_check == SPOOFCHECK_ENABLE)
+	    && !adapter->antispoofing_enabled) {
+		int i;
+		hw->mac.ops.set_mac_anti_spoofing(hw, true,
+						  adapter->num_vfs);
+		for (i = 0; i < adapter->num_vfs; i++)
+			hw->mac.ops.set_vlan_anti_spoofing(hw, true, i);
+		adapter->antispoofing_enabled = true;
+	} else if ((ecmd->spoof_check == SPOOFCHECK_DISABLE)
+		 && adapter->antispoofing_enabled) {
+		int i;
+		hw->mac.ops.set_mac_anti_spoofing(hw, false,
+						  adapter->num_vfs);
+		for (i = 0; i < adapter->num_vfs; i++)
+			hw->mac.ops.set_vlan_anti_spoofing(hw, false, i);
+		adapter->antispoofing_enabled = false;
+	}
+
+skip_sriov_checks:
+
+
 	if ((hw->phy.media_type == ixgbe_media_type_copper) ||
 	    (hw->phy.multispeed_fiber)) {
 		/* 10000/copper and 1000/copper must autoneg
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 06ba9f2..fba7ff0 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -7206,6 +7206,9 @@ static void __devinit ixgbe_probe_vf(struct ixgbe_adapter *adapter,
 	if (hw->mac.type == ixgbe_mac_82598EB)
 		return;
 
+	/* need to save this away in case SR-IOV is reconfigured */
+	adapter->saved_ii = ii;
+
 	/* The 82599 supports up to 64 VFs per physical function
 	 * but this implementation limits allocation to 63 so that
 	 * basic networking resources are still available to the
@@ -7589,7 +7592,7 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
 		e_info(probe, "IOV is enabled with %d VFs\n", adapter->num_vfs);
 		for (i = 0; i < adapter->num_vfs; i++)
-			ixgbe_vf_configuration(pdev, (i | 0x10000000));
+			ixgbe_vf_configure(pdev, (i | 0x10000000));
 	}
 
 	/* Inform firmware of driver version */
diff --git a/drivers/net/ixgbe/ixgbe_sriov.c b/drivers/net/ixgbe/ixgbe_sriov.c
index cdb2f0c..0da639e 100644
--- a/drivers/net/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ixgbe/ixgbe_sriov.c
@@ -437,7 +437,7 @@ int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter)
 	return false;
 }
 
-int ixgbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask)
+int ixgbe_vf_configure(struct pci_dev *pdev, unsigned int event_mask)
 {
 	unsigned char vf_mac_addr[6];
 	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
diff --git a/drivers/net/ixgbe/ixgbe_sriov.h b/drivers/net/ixgbe/ixgbe_sriov.h
index 2781847..f588bf5 100644
--- a/drivers/net/ixgbe/ixgbe_sriov.h
+++ b/drivers/net/ixgbe/ixgbe_sriov.h
@@ -30,7 +30,7 @@
 
 void ixgbe_restore_vf_multicasts(struct ixgbe_adapter *adapter);
 void ixgbe_msg_task(struct ixgbe_adapter *adapter);
-int ixgbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask);
+int ixgbe_vf_configure(struct pci_dev *pdev, unsigned int event_mask);
 void ixgbe_disable_tx_rx(struct ixgbe_adapter *adapter);
 void ixgbe_ping_all_vfs(struct ixgbe_adapter *adapter);
 void ixgbe_dump_registers(struct ixgbe_adapter *adapter);
@@ -46,6 +46,5 @@ void ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
 			const struct ixgbe_info *ii);
 int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter);
 
-
 #endif /* _IXGBE_SRIOV_H_ */
 


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init
  2011-07-27 22:17 ` [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init Greg Rose
@ 2011-07-28  5:26   ` David Miller
  2011-07-28 15:44     ` Rose, Gregory V
  0 siblings, 1 reply; 32+ messages in thread
From: David Miller @ 2011-07-28  5:26 UTC (permalink / raw)
  To: gregory.v.rose; +Cc: netdev, bhutchings, jeffrey.t.kirsher

From: Greg Rose <gregory.v.rose@intel.com>
Date: Wed, 27 Jul 2011 15:17:54 -0700

> +	int i;
> +	for (i = 0; i < adapter->num_vfs; i++) {
> +		if (adapter->vfinfo[i].vfdev->dev_flags &
> +			PCI_DEV_FLAGS_ASSIGNED) {
> +		return true;
> +		}
> +	}

Bad formatting and indentation, please fix this.

> +		pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID, device_id, NULL);
> +		while (pvfdev) {
> +			if (pvfdev->devfn == thisvf_devfn)
> +				break;
> +			pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID,
> +						device_id, pvfdev);
> +		}
> +		if (pvfdev)
> +			adapter->vfinfo[vfn].vfdev = pvfdev;

pci_get_*() grabs a reference to any non-NULL pci device object
returned, where does this reference get released?  I scanned
all uses of x.vfdev and x->vfdev and could not find the necessary
release.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-27 22:17 ` [RFC net-next PATCH 3/4] ethtool: Add new set commands Greg Rose
@ 2011-07-28  5:27   ` David Miller
  2011-07-28 15:51     ` Rose, Gregory V
  2011-07-28 21:20   ` Ben Hutchings
  1 sibling, 1 reply; 32+ messages in thread
From: David Miller @ 2011-07-28  5:27 UTC (permalink / raw)
  To: gregory.v.rose; +Cc: netdev, bhutchings, jeffrey.t.kirsher

From: Greg Rose <gregory.v.rose@intel.com>
Date: Wed, 27 Jul 2011 15:17:59 -0700

> Add new set commands to configure the number of SR-IOV VFs, the
> number of VM queues and spoof checking on/off switch.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> ---
> 
>  include/linux/ethtool.h |   11 ++++++++++-
>  1 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index c6e427a..c4972ba 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -36,12 +36,14 @@ struct ethtool_cmd {
>  	__u8	mdio_support;
>  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
>  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> +	__u32	num_vmqs;	/* Set number of queues for VMDq */

You can't change the layout of this datastructure in this way without
breaking every ethtool binary out there.

You have to find another place to add these knobs.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool settings
  2011-07-27 22:18 ` [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool settings Greg Rose
@ 2011-07-28 11:54   ` Michał Mirosław
  2011-07-28 15:52     ` Rose, Gregory V
  0 siblings, 1 reply; 32+ messages in thread
From: Michał Mirosław @ 2011-07-28 11:54 UTC (permalink / raw)
  To: Greg Rose; +Cc: netdev, davem, bhutchings, jeffrey.t.kirsher

2011/7/28 Greg Rose <gregory.v.rose@intel.com>:
> Adds ixgbe driver support for new ethtool settings for SR-IOV re-init,
> number of VM queues and anti-spoofing ON/OFF switch.
[...]
> +static int ixgbe_reinit_sriov(struct net_device *netdev, int new_vfs)
> +{
[...]
> +       if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
> +               adapter->flags &= ~(IXGBE_FLAG_RSS_ENABLED |
> +                                   IXGBE_FLAG_DCB_ENABLED);
> +               netdev->features &= ~NETIF_F_RXHASH;
> +       } else {
> +               adapter->flags |= IXGBE_FLAG_RSS_ENABLED;
> +               netdev->features |= NETIF_F_RXHASH;
> +       }

Please use ndo_fix_features/ndo_set_features callbacks for this.

Best Regards,
Michał Mirosław

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-27 22:17 ` [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM Greg Rose
@ 2011-07-28 15:11   ` Ian Campbell
  2011-07-28 15:58     ` Rose, Gregory V
  2011-07-29 16:51     ` Jesse Barnes
  0 siblings, 2 replies; 32+ messages in thread
From: Ian Campbell @ 2011-07-28 15:11 UTC (permalink / raw)
  To: Greg Rose, Konrad Rzeszutek Wilk
  Cc: netdev, davem, bhutchings, jeffrey.t.kirsher, Jesse Barnes, linux-pci

On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> Device drivers that create and destroy SR-IOV virtual functions via
> calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
> failures if they attempt to destroy VFs while they are assigned to
> guest virtual machines.  By adding a flag for use by the KVM module
> to indicate that a device is assigned a device driver can check that
> flag and avoid destroying VFs while they are assigned and avoid system
> failures.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> ---
> 
>  include/linux/pci.h     |    2 ++

I added Jesse and linux-pci to CC.

>  virt/kvm/assigned-dev.c |    2 ++
>  virt/kvm/iommu.c        |    4 ++++
>  3 files changed, 8 insertions(+), 0 deletions(-)

I suppose this would also be useful in Xen's pciback or any other system
which does passthrough? (Konrad CC'd for pciback)

Is there some common lower layer we could hook this in to? (does
iommu_attach/detach_device make sense?) Or shall we just add the flag
manipulations to pciback as well?

Ian.

> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2d29218..a297ca2 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -174,6 +174,8 @@ enum pci_dev_flags {
>  	PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG = (__force pci_dev_flags_t) 1,
>  	/* Device configuration is irrevocably lost if disabled into D3 */
>  	PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
> +	/* Provide indication device is assigned by KVM */
> +	PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
>  };
>  
>  enum pci_irq_reroute_variant {
> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> index 6cc4b97..f401de1 100644
> --- a/virt/kvm/assigned-dev.c
> +++ b/virt/kvm/assigned-dev.c
> @@ -205,6 +205,8 @@ static void kvm_free_assigned_device(struct kvm *kvm,
>  	else
>  		pci_restore_state(assigned_dev->dev);
>  
> +	assigned_dev->dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
> +
>  	pci_release_regions(assigned_dev->dev);
>  	pci_disable_device(assigned_dev->dev);
>  	pci_dev_put(assigned_dev->dev);
> diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
> index 62a9caf..cffc530 100644
> --- a/virt/kvm/iommu.c
> +++ b/virt/kvm/iommu.c
> @@ -181,6 +181,8 @@ int kvm_assign_device(struct kvm *kvm,
>  			goto out_unmap;
>  	}
>  
> +	pdev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
> +
>  	printk(KERN_DEBUG "assign device %x:%x:%x.%x\n",
>  		assigned_dev->host_segnr,
>  		assigned_dev->host_busnr,
> @@ -209,6 +211,8 @@ int kvm_deassign_device(struct kvm *kvm,
>  
>  	iommu_detach_device(domain, &pdev->dev);
>  
> +	pdev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
> +
>  	printk(KERN_DEBUG "deassign device %x:%x:%x.%x\n",
>  		assigned_dev->host_segnr,
>  		assigned_dev->host_busnr,
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
Ian Campbell

All progress is based upon a universal innate desire of every organism
to live beyond its income.
		-- Samuel Butler, "Notebooks"

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init
  2011-07-28  5:26   ` David Miller
@ 2011-07-28 15:44     ` Rose, Gregory V
  0 siblings, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 15:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings, Kirsher, Jeffrey T

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Wednesday, July 27, 2011 10:27 PM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init
> 
> From: Greg Rose <gregory.v.rose@intel.com>
> Date: Wed, 27 Jul 2011 15:17:54 -0700
> 
> > +	int i;
> > +	for (i = 0; i < adapter->num_vfs; i++) {
> > +		if (adapter->vfinfo[i].vfdev->dev_flags &
> > +			PCI_DEV_FLAGS_ASSIGNED) {
> > +		return true;
> > +		}
> > +	}
> 
> Bad formatting and indentation, please fix this.

Will do.

> 
> > +		pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID, device_id,
> NULL);
> > +		while (pvfdev) {
> > +			if (pvfdev->devfn == thisvf_devfn)
> > +				break;
> > +			pvfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID,
> > +						device_id, pvfdev);
> > +		}
> > +		if (pvfdev)
> > +			adapter->vfinfo[vfn].vfdev = pvfdev;
> 
> pci_get_*() grabs a reference to any non-NULL pci device object
> returned, where does this reference get released?  I scanned
> all uses of x.vfdev and x->vfdev and could not find the necessary
> release.

I was ignorant of this behavior.  I'll fix it.

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28  5:27   ` David Miller
@ 2011-07-28 15:51     ` Rose, Gregory V
  2011-07-28 16:14       ` David Miller
       [not found]       ` <539DF151-E442-4375-8777-19676B95059B@qlogic.com>
  0 siblings, 2 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 15:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings, Kirsher, Jeffrey T

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Wednesday, July 27, 2011 10:28 PM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> From: Greg Rose <gregory.v.rose@intel.com>
> Date: Wed, 27 Jul 2011 15:17:59 -0700
> 
> > Add new set commands to configure the number of SR-IOV VFs, the
> > number of VM queues and spoof checking on/off switch.
> >
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > ---
> >
> >  include/linux/ethtool.h |   11 ++++++++++-
> >  1 files changed, 10 insertions(+), 1 deletions(-)
> >
> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> > index c6e427a..c4972ba 100644
> > --- a/include/linux/ethtool.h
> > +++ b/include/linux/ethtool.h
> > @@ -36,12 +36,14 @@ struct ethtool_cmd {
> >  	__u8	mdio_support;
> >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> 
> You can't change the layout of this datastructure in this way without
> breaking every ethtool binary out there.
> 
> You have to find another place to add these knobs.

Perhaps at the end of the ethtool_cmd structure?  Something like this:

/* This should work for both 32 and 64 bit userland. */
struct ethtool_cmd {
        __u32   cmd;
        __u32   supported;      /* Features this interface supports */
        __u32   advertising;    /* Features this interface advertises */
        __u16   speed;          /* The forced speed (lower bits) in
                                 * Mbps. Please use
                                 * ethtool_cmd_speed()/_set() to
                                 * access it */
        __u8    duplex;         /* Duplex, half or full */
        __u8    port;           /* Which connector port */
        __u8    phy_address;
        __u8    transceiver;    /* Which transceiver to use */
        __u8    autoneg;        /* Enable or disable autonegotiation */
        __u8    mdio_support;
        __u32   maxtxpkt;       /* Tx pkts before generating tx int */
        __u32   maxrxpkt;       /* Rx pkts before generating rx int */
        __u16   speed_hi;       /* The forced speed (upper
                                 * bits) in Mbps. Please use
                                 * ethtool_cmd_speed()/_set() to
                                 * access it */
        __u8    eth_tp_mdix;
        __u8    spoof_check;    /* Enable/Disable anti-spoofing */
        __u32   lp_advertising; /* Features the link partner advertises */
        __u32   reserved[2];
        __u32   num_vfs;        /* Enable SR-IOV VFs */
        __u32   num_vmqs;       /* Set number of queues for VMDq */
};

- Greg



^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool settings
  2011-07-28 11:54   ` Michał Mirosław
@ 2011-07-28 15:52     ` Rose, Gregory V
  0 siblings, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 15:52 UTC (permalink / raw)
  To: Michał Mirosław; +Cc: netdev, davem, bhutchings, Kirsher, Jeffrey T

> -----Original Message-----
> From: Michał Mirosław [mailto:mirqus@gmail.com]
> Sent: Thursday, July 28, 2011 4:54 AM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; davem@davemloft.net;
> bhutchings@solarflare.com; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool
> settings
> 
> 2011/7/28 Greg Rose <gregory.v.rose@intel.com>:
> > Adds ixgbe driver support for new ethtool settings for SR-IOV re-init,
> > number of VM queues and anti-spoofing ON/OFF switch.
> [...]
> > +static int ixgbe_reinit_sriov(struct net_device *netdev, int new_vfs)
> > +{
> [...]
> > +       if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
> > +               adapter->flags &= ~(IXGBE_FLAG_RSS_ENABLED |
> > +                                   IXGBE_FLAG_DCB_ENABLED);
> > +               netdev->features &= ~NETIF_F_RXHASH;
> > +       } else {
> > +               adapter->flags |= IXGBE_FLAG_RSS_ENABLED;
> > +               netdev->features |= NETIF_F_RXHASH;
> > +       }
> 
> Please use ndo_fix_features/ndo_set_features callbacks for this.

OK, sure.  I'll add a separate patch to do that.

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-28 15:11   ` Ian Campbell
@ 2011-07-28 15:58     ` Rose, Gregory V
  2011-07-28 16:27       ` Ian Campbell
  2011-07-29 16:51     ` Jesse Barnes
  1 sibling, 1 reply; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 15:58 UTC (permalink / raw)
  To: Ian Campbell, Konrad Rzeszutek Wilk
  Cc: netdev, davem, bhutchings, Kirsher, Jeffrey T, Jesse Barnes, linux-pci

> -----Original Message-----
> From: Ian Campbell [mailto:ijc@hellion.org.uk]
> Sent: Thursday, July 28, 2011 8:11 AM
> To: Rose, Gregory V; Konrad Rzeszutek Wilk
> Cc: netdev@vger.kernel.org; davem@davemloft.net;
> bhutchings@solarflare.com; Kirsher, Jeffrey T; Jesse Barnes; linux-
> pci@vger.kernel.org
> Subject: Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has
> been assigned by KVM
> 
> On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > Device drivers that create and destroy SR-IOV virtual functions via
> > calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
> > failures if they attempt to destroy VFs while they are assigned to
> > guest virtual machines.  By adding a flag for use by the KVM module
> > to indicate that a device is assigned a device driver can check that
> > flag and avoid destroying VFs while they are assigned and avoid system
> > failures.
> >
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > ---
> >
> >  include/linux/pci.h     |    2 ++
> 
> I added Jesse and linux-pci to CC.
> 
> >  virt/kvm/assigned-dev.c |    2 ++
> >  virt/kvm/iommu.c        |    4 ++++
> >  3 files changed, 8 insertions(+), 0 deletions(-)
> 
> I suppose this would also be useful in Xen's pciback or any other system
> which does passthrough? (Konrad CC'd for pciback)

Definitely yes.  Xen experiences the same issues when the PF driver is removed while VFs are assigned to guests.

> 
> Is there some common lower layer we could hook this in to? (does
> iommu_attach/detach_device make sense?) Or shall we just add the flag
> manipulations to pciback as well?
[Greg Rose] 

I was unaware of any common lower layers, i.e I didn't know that Xen also uses the iommu_attach/detach_device calls.  It took me a week of digging around in the KVM module code just to find these hooks.  Generally I stick to Ethernet device drivers and I'm not that familiar with device pass through code.  I was just confronted with a problem and looking for some way to fix it.

;^)

That sounds like a good idea, let me have a look at it.

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 15:51     ` Rose, Gregory V
@ 2011-07-28 16:14       ` David Miller
  2011-07-28 16:21         ` Rose, Gregory V
  2011-07-28 21:14         ` Ben Hutchings
       [not found]       ` <539DF151-E442-4375-8777-19676B95059B@qlogic.com>
  1 sibling, 2 replies; 32+ messages in thread
From: David Miller @ 2011-07-28 16:14 UTC (permalink / raw)
  To: gregory.v.rose; +Cc: netdev, bhutchings, jeffrey.t.kirsher

From: "Rose, Gregory V" <gregory.v.rose@intel.com>
Date: Thu, 28 Jul 2011 08:51:05 -0700

>> -----Original Message-----
>> From: David Miller [mailto:davem@davemloft.net]
>> Sent: Wednesday, July 27, 2011 10:28 PM
>> To: Rose, Gregory V
>> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
>> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
>> 
>> From: Greg Rose <gregory.v.rose@intel.com>
>> Date: Wed, 27 Jul 2011 15:17:59 -0700
>> 
>> > Add new set commands to configure the number of SR-IOV VFs, the
>> > number of VM queues and spoof checking on/off switch.
>> >
>> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
>> > ---
>> >
>> >  include/linux/ethtool.h |   11 ++++++++++-
>> >  1 files changed, 10 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>> > index c6e427a..c4972ba 100644
>> > --- a/include/linux/ethtool.h
>> > +++ b/include/linux/ethtool.h
>> > @@ -36,12 +36,14 @@ struct ethtool_cmd {
>> >  	__u8	mdio_support;
>> >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
>> >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
>> > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
>> > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
>> 
>> You can't change the layout of this datastructure in this way without
>> breaking every ethtool binary out there.
>> 
>> You have to find another place to add these knobs.
> 
> Perhaps at the end of the ethtool_cmd structure?  Something like this:

Either use the two reserved u32's we have there, or create a new
ethtool command and control structure.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 16:14       ` David Miller
@ 2011-07-28 16:21         ` Rose, Gregory V
  2011-07-28 21:14         ` Ben Hutchings
  1 sibling, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 16:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings, Kirsher, Jeffrey T

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Thursday, July 28, 2011 9:15 AM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> From: "Rose, Gregory V" <gregory.v.rose@intel.com>
> Date: Thu, 28 Jul 2011 08:51:05 -0700
> 
> >> -----Original Message-----
> >> From: David Miller [mailto:davem@davemloft.net]
> >> Sent: Wednesday, July 27, 2011 10:28 PM
> >> To: Rose, Gregory V
> >> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey
> T
> >> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> >>
> >> From: Greg Rose <gregory.v.rose@intel.com>
> >> Date: Wed, 27 Jul 2011 15:17:59 -0700
> >>
> >> > Add new set commands to configure the number of SR-IOV VFs, the
> >> > number of VM queues and spoof checking on/off switch.
> >> >
> >> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> >> > ---
> >> >
> >> >  include/linux/ethtool.h |   11 ++++++++++-
> >> >  1 files changed, 10 insertions(+), 1 deletions(-)
> >> >
> >> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> >> > index c6e427a..c4972ba 100644
> >> > --- a/include/linux/ethtool.h
> >> > +++ b/include/linux/ethtool.h
> >> > @@ -36,12 +36,14 @@ struct ethtool_cmd {
> >> >  	__u8	mdio_support;
> >> >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> >> >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> >> > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> >> > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> >>
> >> You can't change the layout of this datastructure in this way without
> >> breaking every ethtool binary out there.
> >>
> >> You have to find another place to add these knobs.
> >
> > Perhaps at the end of the ethtool_cmd structure?  Something like this:
> 
> Either use the two reserved u32's we have there, or create a new
> ethtool command and control structure.

OK, I see what you're saying.  The size of the structure can never be changed.  I'm slow sometimes but I do eventually get there.  I didn't want to use the reserved u32's because I didn't know what they were reserved for.  The num_vfs and num_vmqs values will actually never exceed 8 bits.  I'll stuff them into the space used by one of the u32's and then add padding to align and keep the same structure size.

Thanks for the feedback and help.  I'll address this and the other feedback I've gotten and come back later with V2 of the RFC.

- Greg



^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-28 15:58     ` Rose, Gregory V
@ 2011-07-28 16:27       ` Ian Campbell
  2011-07-28 16:42         ` Rose, Gregory V
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2011-07-28 16:27 UTC (permalink / raw)
  To: Rose, Gregory V
  Cc: Konrad Rzeszutek Wilk, netdev, davem, bhutchings, Kirsher,
	Jeffrey T, Jesse Barnes, linux-pci

On Thu, 2011-07-28 at 08:58 -0700, Rose, Gregory V wrote:
> > -----Original Message-----
> > From: Ian Campbell [mailto:ijc@hellion.org.uk]
> > Sent: Thursday, July 28, 2011 8:11 AM
> > To: Rose, Gregory V; Konrad Rzeszutek Wilk
> > Cc: netdev@vger.kernel.org; davem@davemloft.net;
> > bhutchings@solarflare.com; Kirsher, Jeffrey T; Jesse Barnes; linux-
> > pci@vger.kernel.org
> > Subject: Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has
> > been assigned by KVM
> > 
> > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > Device drivers that create and destroy SR-IOV virtual functions via
> > > calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
> > > failures if they attempt to destroy VFs while they are assigned to
> > > guest virtual machines.  By adding a flag for use by the KVM module
> > > to indicate that a device is assigned a device driver can check that
> > > flag and avoid destroying VFs while they are assigned and avoid system
> > > failures.
> > >
> > > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > > ---
> > >
> > >  include/linux/pci.h     |    2 ++
> > 
> > I added Jesse and linux-pci to CC.
> > 
> > >  virt/kvm/assigned-dev.c |    2 ++
> > >  virt/kvm/iommu.c        |    4 ++++
> > >  3 files changed, 8 insertions(+), 0 deletions(-)
> > 
> > I suppose this would also be useful in Xen's pciback or any other system
> > which does passthrough? (Konrad CC'd for pciback)
> 
> Definitely yes.  Xen experiences the same issues when the PF driver is
> removed while VFs are assigned to guests.
> 
> > 
> > Is there some common lower layer we could hook this in to? (does
> > iommu_attach/detach_device make sense?) Or shall we just add the flag
> > manipulations to pciback as well?
> [Greg Rose] 
> 
> I was unaware of any common lower layers, i.e I didn't know that Xen
> also uses the iommu_attach/detach_device calls.

my mistake -- under Xen the iommu is driver by the hypervisor and not
the domain 0 kernel so there is no iommu_* in pciback.

>   It took me a week of digging around in the KVM module code just to
> find these hooks.

I'm not actually sure where in pciback the right place to put this would
be is, perhaps Konrad has an idea.

>   Generally I stick to Ethernet device drivers and I'm not that
> familiar with device pass through code.  I was just confronted with a
> problem and looking for some way to fix it.
> 
> ;^)
> 
> That sounds like a good idea, let me have a look at it.
> 
> - Greg
> 

-- 
Ian Campbell

Traffic signals in New York are just rough guidelines.
		-- David Letterman

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-28 16:27       ` Ian Campbell
@ 2011-07-28 16:42         ` Rose, Gregory V
  2011-07-29 16:54           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 16:42 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Konrad Rzeszutek Wilk, netdev, davem, bhutchings, Kirsher,
	Jeffrey T, Jesse Barnes, linux-pci

> -----Original Message-----
> From: Ian Campbell [mailto:ijc@hellion.org.uk]
> Sent: Thursday, July 28, 2011 9:28 AM
> To: Rose, Gregory V
> Cc: Konrad Rzeszutek Wilk; netdev@vger.kernel.org; davem@davemloft.net;
> bhutchings@solarflare.com; Kirsher, Jeffrey T; Jesse Barnes; linux-
> pci@vger.kernel.org
> Subject: RE: [RFC net-next PATCH 1/4] pci: Add flag indicating device has
> been assigned by KVM
> 
> On Thu, 2011-07-28 at 08:58 -0700, Rose, Gregory V wrote:
> > > -----Original Message-----
> > > From: Ian Campbell [mailto:ijc@hellion.org.uk]
> > > Sent: Thursday, July 28, 2011 8:11 AM
> > > To: Rose, Gregory V; Konrad Rzeszutek Wilk
> > > Cc: netdev@vger.kernel.org; davem@davemloft.net;
> > > bhutchings@solarflare.com; Kirsher, Jeffrey T; Jesse Barnes; linux-
> > > pci@vger.kernel.org
> > > Subject: Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device
> has
> > > been assigned by KVM
> > >
> > > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > > Device drivers that create and destroy SR-IOV virtual functions via
> > > > calls to pci_enable_sriov() and pci_disable_sriov can cause
> catastrophic
> > > > failures if they attempt to destroy VFs while they are assigned to
> > > > guest virtual machines.  By adding a flag for use by the KVM module
> > > > to indicate that a device is assigned a device driver can check that
> > > > flag and avoid destroying VFs while they are assigned and avoid
> system
> > > > failures.
> > > >
> > > > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > > > ---
> > > >
> > > >  include/linux/pci.h     |    2 ++
> > >
> > > I added Jesse and linux-pci to CC.
> > >
> > > >  virt/kvm/assigned-dev.c |    2 ++
> > > >  virt/kvm/iommu.c        |    4 ++++
> > > >  3 files changed, 8 insertions(+), 0 deletions(-)
> > >
> > > I suppose this would also be useful in Xen's pciback or any other
> system
> > > which does passthrough? (Konrad CC'd for pciback)
> >
> > Definitely yes.  Xen experiences the same issues when the PF driver is
> > removed while VFs are assigned to guests.
> >
> > >
> > > Is there some common lower layer we could hook this in to? (does
> > > iommu_attach/detach_device make sense?) Or shall we just add the flag
> > > manipulations to pciback as well?
> > [Greg Rose]
> >
> > I was unaware of any common lower layers, i.e I didn't know that Xen
> > also uses the iommu_attach/detach_device calls.
> 
> my mistake -- under Xen the iommu is driver by the hypervisor and not
> the domain 0 kernel so there is no iommu_* in pciback.
> 
> >   It took me a week of digging around in the KVM module code just to
> > find these hooks.
> 
> I'm not actually sure where in pciback the right place to put this would
> be is, perhaps Konrad has an idea.

OK, but I hope Xen can still use the dev_flag assignment bit.

Thanks,

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
       [not found]       ` <539DF151-E442-4375-8777-19676B95059B@qlogic.com>
@ 2011-07-28 20:38         ` Rose, Gregory V
  2011-07-28 22:01           ` Anirban Chakraborty
  2011-07-28 22:01           ` Anirban Chakraborty
  0 siblings, 2 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 20:38 UTC (permalink / raw)
  To: Anirban Chakraborty
  Cc: David Miller, netdev, Ben Hutchings, Kirsher, Jeffrey T


> From: Anirban Chakraborty [mailto:anirban.chakraborty@qlogic.com] 
> Sent: Thursday, July 28, 2011 12:04 PM
> To: Rose, Gregory V
> Cc: David Miller; netdev; Ben Hutchings; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> 
> On Jul 28, 2011, at 8:51 AM, Rose, Gregory V wrote:
> 
> 
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Wednesday, July 27, 2011 10:28 PM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> From: Greg Rose <gregory.v.rose@intel.com>
> Date: Wed, 27 Jul 2011 15:17:59 -0700
> 
> Add new set commands to configure the number of SR-IOV VFs, the
> number of VM queues and spoof checking on/off switch.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> ---
> 
> include/linux/ethtool.h |   11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index c6e427a..c4972ba 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -36,12 +36,14 @@ struct ethtool_cmd {
> 	__u8	mdio_support;
> 	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> 	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> 
> You can't change the layout of this datastructure in this way without
> breaking every ethtool binary out there.
> 
> You have to find another place to add these knobs.
> 
> Perhaps at the end of the ethtool_cmd structure?  Something like this:
> 
> /* This should work for both 32 and 64 bit userland. */
> struct ethtool_cmd {
>        __u32   cmd;
>        __u32   supported;      /* Features this interface supports */
>        __u32   advertising;    /* Features this interface advertises */
>        __u16   speed;          /* The forced speed (lower bits) in
>                                 * Mbps. Please use
>                                 * ethtool_cmd_speed()/_set() to
>                                 * access it */
>        __u8    duplex;         /* Duplex, half or full */
>        __u8    port;           /* Which connector port */
>        __u8    phy_address;
>        __u8    transceiver;    /* Which transceiver to use */
>        __u8    autoneg;        /* Enable or disable autonegotiation */
>        __u8    mdio_support;
>        __u32   maxtxpkt;       /* Tx pkts before generating tx int */
>        __u32   maxrxpkt;       /* Rx pkts before generating rx int */
>        __u16   speed_hi;       /* The forced speed (upper
>                                 * bits) in Mbps. Please use
>                                 * ethtool_cmd_speed()/_set() to
>                                 * access it */
>        __u8    eth_tp_mdix;
>        __u8    spoof_check;    /* Enable/Disable anti-spoofing */
>        __u32   lp_advertising; /* Features the link partner advertises */
>        __u32   reserved[2];
>        __u32   num_vfs;        /* Enable SR-IOV VFs */
>        __u32   num_vmqs;       /* Set number of queues for VMDq */
> };
> 
> If I understood it correctly, you are trying to set/unset spoofing on per
> eth interface,  which could be a PF on the hypervisor or a pci passthru-ed
> VF in the linux guest.  There are other security features that one could set
> for a port on the VF (lets call it vport),  e.g. setting a port VLAN ID for
> a VF and specifying if the VF (VM) is allowed to send tagged/untagged
> packets, setting a vport in port mirroring mode so that the PF can monitor
> the traffic on a VF,  setting a vport in promiscuous mode etc. 
> 
> Does it make sense to try to use ip link util to specify all these parameters,
> since ip link already does the  job of setting VF properties and VF port
> profile?
> 
> Any thoughts?
> 

Sure, that's a possibility too.  I was considering ethtool for this since MAC addresses and VLANs are fairly specific to Ethernet whereas netlink might apply to other types of physical networks.  At least that's my understanding.

However, I have no strong feelings about it and if community consensus is to use ip link instead then that's fine by me.

Of course, patches implementing such would be quite welcome also.

;^)

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 16:14       ` David Miller
  2011-07-28 16:21         ` Rose, Gregory V
@ 2011-07-28 21:14         ` Ben Hutchings
  2011-07-28 21:16           ` Rose, Gregory V
  1 sibling, 1 reply; 32+ messages in thread
From: Ben Hutchings @ 2011-07-28 21:14 UTC (permalink / raw)
  To: David Miller; +Cc: gregory.v.rose, netdev, jeffrey.t.kirsher

On Thu, 2011-07-28 at 09:14 -0700, David Miller wrote:
> From: "Rose, Gregory V" <gregory.v.rose@intel.com>
> Date: Thu, 28 Jul 2011 08:51:05 -0700
> 
> >> -----Original Message-----
> >> From: David Miller [mailto:davem@davemloft.net]
> >> Sent: Wednesday, July 27, 2011 10:28 PM
> >> To: Rose, Gregory V
> >> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
> >> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> >> 
> >> From: Greg Rose <gregory.v.rose@intel.com>
> >> Date: Wed, 27 Jul 2011 15:17:59 -0700
> >> 
> >> > Add new set commands to configure the number of SR-IOV VFs, the
> >> > number of VM queues and spoof checking on/off switch.
> >> >
> >> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> >> > ---
> >> >
> >> >  include/linux/ethtool.h |   11 ++++++++++-
> >> >  1 files changed, 10 insertions(+), 1 deletions(-)
> >> >
> >> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> >> > index c6e427a..c4972ba 100644
> >> > --- a/include/linux/ethtool.h
> >> > +++ b/include/linux/ethtool.h
> >> > @@ -36,12 +36,14 @@ struct ethtool_cmd {
> >> >  	__u8	mdio_support;
> >> >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> >> >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> >> > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> >> > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> >> 
> >> You can't change the layout of this datastructure in this way without
> >> breaking every ethtool binary out there.
> >> 
> >> You have to find another place to add these knobs.
> > 
> > Perhaps at the end of the ethtool_cmd structure?  Something like this:
> 
> Either use the two reserved u32's we have there, or create a new
> ethtool command and control structure.

I'm going to insist on the latter.  As I see it, struct ethtool_cmd is
for Ethernet PHY settings.  (The max{rx,tx}pkt fields weren't, but
they're obsolete.)

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 21:14         ` Ben Hutchings
@ 2011-07-28 21:16           ` Rose, Gregory V
  0 siblings, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 21:16 UTC (permalink / raw)
  To: Ben Hutchings, David Miller; +Cc: netdev, Kirsher, Jeffrey T

> -----Original Message-----
> From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> Sent: Thursday, July 28, 2011 2:15 PM
> To: David Miller
> Cc: Rose, Gregory V; netdev@vger.kernel.org; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> On Thu, 2011-07-28 at 09:14 -0700, David Miller wrote:
> > From: "Rose, Gregory V" <gregory.v.rose@intel.com>
> > Date: Thu, 28 Jul 2011 08:51:05 -0700
> >
> > >> -----Original Message-----
> > >> From: David Miller [mailto:davem@davemloft.net]
> > >> Sent: Wednesday, July 27, 2011 10:28 PM
> > >> To: Rose, Gregory V
> > >> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher,
> Jeffrey T
> > >> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> > >>
> > >> From: Greg Rose <gregory.v.rose@intel.com>
> > >> Date: Wed, 27 Jul 2011 15:17:59 -0700
> > >>
> > >> > Add new set commands to configure the number of SR-IOV VFs, the
> > >> > number of VM queues and spoof checking on/off switch.
> > >> >
> > >> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > >> > ---
> > >> >
> > >> >  include/linux/ethtool.h |   11 ++++++++++-
> > >> >  1 files changed, 10 insertions(+), 1 deletions(-)
> > >> >
> > >> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> > >> > index c6e427a..c4972ba 100644
> > >> > --- a/include/linux/ethtool.h
> > >> > +++ b/include/linux/ethtool.h
> > >> > @@ -36,12 +36,14 @@ struct ethtool_cmd {
> > >> >  	__u8	mdio_support;
> > >> >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> > >> >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> > >> > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> > >> > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> > >>
> > >> You can't change the layout of this datastructure in this way without
> > >> breaking every ethtool binary out there.
> > >>
> > >> You have to find another place to add these knobs.
> > >
> > > Perhaps at the end of the ethtool_cmd structure?  Something like this:
> >
> > Either use the two reserved u32's we have there, or create a new
> > ethtool command and control structure.
> 
> I'm going to insist on the latter.  As I see it, struct ethtool_cmd is
> for Ethernet PHY settings.  (The max{rx,tx}pkt fields weren't, but
> they're obsolete.)

OK, that's what I'll do.

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-27 22:17 ` [RFC net-next PATCH 3/4] ethtool: Add new set commands Greg Rose
  2011-07-28  5:27   ` David Miller
@ 2011-07-28 21:20   ` Ben Hutchings
  2011-07-28 21:34     ` Rose, Gregory V
  1 sibling, 1 reply; 32+ messages in thread
From: Ben Hutchings @ 2011-07-28 21:20 UTC (permalink / raw)
  To: Greg Rose; +Cc: netdev, davem, jeffrey.t.kirsher

On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> Add new set commands to configure the number of SR-IOV VFs, the
> number of VM queues and spoof checking on/off switch.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> ---
> 
>  include/linux/ethtool.h |   11 ++++++++++-
>  1 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index c6e427a..c4972ba 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -36,12 +36,14 @@ struct ethtool_cmd {
>  	__u8	mdio_support;
>  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
>  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> +	__u32	num_vfs;	/* Enable SR-IOV VFs */

What are the semantics of changing this after VFs have already been set
up?

> +	__u32	num_vmqs;	/* Set number of queues for VMDq */

VMDq is an Intel proprietary name.  Please specify this in generic
terms.

>  	__u16	speed_hi;       /* The forced speed (upper
>  				 * bits) in Mbps. Please use
>  				 * ethtool_cmd_speed()/_set() to
>  				 * access it */
>  	__u8	eth_tp_mdix;
> -	__u8	reserved2;
> +	__u8	spoof_check;	/* Enable/Disable anti-spoofing */
>  	__u32	lp_advertising;	/* Features the link partner advertises */
>  	__u32	reserved[2];
>  };
> @@ -1121,6 +1123,13 @@ struct ethtool_ops {
>  #define AUTONEG_DISABLE		0x00
>  #define AUTONEG_ENABLE		0x01
>  
> +/* Enable or disable MAC and/or VLAN spoofchecking.If this is
> + * set to enable, then depending on the controller capabilities
> + * MAC and/or VLAN spoofing will be turned on.
> + */
> +#define SPOOFCHECK_DISABLE     0x00
> +#define SPOOFCHECK_ENABLE      0x01

I'm not sure it's necessary to continue defining macros for booleans.

This should not 'depend on controller capabilities'.  If the
administrator tries to enable this feature on a device that does not
support it, the request must return an error rather than failing
silently.  (This is another reason to define new commands.)

Ben.

>  /* Mode MDI or MDI-X */
>  #define ETH_TP_MDI_INVALID	0x00
>  #define ETH_TP_MDI		0x01
> 

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 21:20   ` Ben Hutchings
@ 2011-07-28 21:34     ` Rose, Gregory V
  2011-07-28 22:04       ` Ben Hutchings
  0 siblings, 1 reply; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 21:34 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, davem, Kirsher, Jeffrey T

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Ben Hutchings
> Sent: Thursday, July 28, 2011 2:20 PM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; davem@davemloft.net; Kirsher, Jeffrey T
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > Add new set commands to configure the number of SR-IOV VFs, the
> > number of VM queues and spoof checking on/off switch.
> >
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > ---
> >
> >  include/linux/ethtool.h |   11 ++++++++++-
> >  1 files changed, 10 insertions(+), 1 deletions(-)
> >
> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> > index c6e427a..c4972ba 100644
> > --- a/include/linux/ethtool.h
> > +++ b/include/linux/ethtool.h
> > @@ -36,12 +36,14 @@ struct ethtool_cmd {
> >  	__u8	mdio_support;
> >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> 
> What are the semantics of changing this after VFs have already been set
> up?

There's an example of it patch 4/4 in this set.  The PF driver checks if any VFs are assigned and active and if not then it will disable and destroy all the current VFs via a call to pci_disable_sriov() and then call pci_enable_sriov() with the new number of VFs.  Or, if the number of new VFs is zero then SR-IOV is left disabled on that PF.

Mostly this is to accommodate customer requests to be able to set a different number of VFs per PF or to only have specified PFs enable VFs.  The current usage of the max_vfs module parameter is unwieldy in this sense as you must enable SR-IOV VFs on all physical functions found during the device probe with the same number of VFs.

> 
> > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> 
> VMDq is an Intel proprietary name.  Please specify this in generic
> terms.

I'll use the more generic term VM queues.

> 
> >  	__u16	speed_hi;       /* The forced speed (upper
> >  				 * bits) in Mbps. Please use
> >  				 * ethtool_cmd_speed()/_set() to
> >  				 * access it */
> >  	__u8	eth_tp_mdix;
> > -	__u8	reserved2;
> > +	__u8	spoof_check;	/* Enable/Disable anti-spoofing */
> >  	__u32	lp_advertising;	/* Features the link partner advertises */
> >  	__u32	reserved[2];
> >  };
> > @@ -1121,6 +1123,13 @@ struct ethtool_ops {
> >  #define AUTONEG_DISABLE		0x00
> >  #define AUTONEG_ENABLE		0x01
> >
> > +/* Enable or disable MAC and/or VLAN spoofchecking.If this is
> > + * set to enable, then depending on the controller capabilities
> > + * MAC and/or VLAN spoofing will be turned on.
> > + */
> > +#define SPOOFCHECK_DISABLE     0x00
> > +#define SPOOFCHECK_ENABLE      0x01
> 
> I'm not sure it's necessary to continue defining macros for booleans.

OK.  I tend to just follow the conventions in the code I see when I'm not that familiar with it but in this case I'll just use true and false.

> 
> This should not 'depend on controller capabilities'.  If the
> administrator tries to enable this feature on a device that does not
> support it, the request must return an error rather than failing
> silently.  (This is another reason to define new commands.)

Alright, I'll fix it up to work that way.

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 20:38         ` Rose, Gregory V
@ 2011-07-28 22:01           ` Anirban Chakraborty
  2011-07-28 22:04             ` Rose, Gregory V
  2011-07-28 22:04             ` Rose, Gregory V
  2011-07-28 22:01           ` Anirban Chakraborty
  1 sibling, 2 replies; 32+ messages in thread
From: Anirban Chakraborty @ 2011-07-28 22:01 UTC (permalink / raw)
  To: Rose, Gregory V
  Cc: David Miller, netdev, Ben Hutchings, Kirsher, Jeffrey T, virtualization


On Jul 28, 2011, at 1:38 PM, Rose, Gregory V wrote:

> 
>> From: Anirban Chakraborty [mailto:anirban.chakraborty@qlogic.com] 
>> Sent: Thursday, July 28, 2011 12:04 PM
>> To: Rose, Gregory V
>> Cc: David Miller; netdev; Ben Hutchings; Kirsher, Jeffrey T
>> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
>> 
>> 
>> On Jul 28, 2011, at 8:51 AM, Rose, Gregory V wrote:
>> 
>> 
>> -----Original Message-----
>> From: David Miller [mailto:davem@davemloft.net]
>> Sent: Wednesday, July 27, 2011 10:28 PM
>> To: Rose, Gregory V
>> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
>> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
>> 
>> From: Greg Rose <gregory.v.rose@intel.com>
>> Date: Wed, 27 Jul 2011 15:17:59 -0700
>> 
>> Add new set commands to configure the number of SR-IOV VFs, the
>> number of VM queues and spoof checking on/off switch.
>> 
>> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
>> ---
>> 
>> include/linux/ethtool.h |   11 ++++++++++-
>> 1 files changed, 10 insertions(+), 1 deletions(-)
>> 
>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>> index c6e427a..c4972ba 100644
>> --- a/include/linux/ethtool.h
>> +++ b/include/linux/ethtool.h
>> @@ -36,12 +36,14 @@ struct ethtool_cmd {
>> 	__u8	mdio_support;
>> 	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
>> 	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
>> +	__u32	num_vfs;	/* Enable SR-IOV VFs */
>> +	__u32	num_vmqs;	/* Set number of queues for VMDq */
>> 
>> You can't change the layout of this datastructure in this way without
>> breaking every ethtool binary out there.
>> 
>> You have to find another place to add these knobs.
>> 
>> Perhaps at the end of the ethtool_cmd structure?  Something like this:
>> 
>> /* This should work for both 32 and 64 bit userland. */
>> struct ethtool_cmd {
>>       __u32   cmd;
>>       __u32   supported;      /* Features this interface supports */
>>       __u32   advertising;    /* Features this interface advertises */
>>       __u16   speed;          /* The forced speed (lower bits) in
>>                                * Mbps. Please use
>>                                * ethtool_cmd_speed()/_set() to
>>                                * access it */
>>       __u8    duplex;         /* Duplex, half or full */
>>       __u8    port;           /* Which connector port */
>>       __u8    phy_address;
>>       __u8    transceiver;    /* Which transceiver to use */
>>       __u8    autoneg;        /* Enable or disable autonegotiation */
>>       __u8    mdio_support;
>>       __u32   maxtxpkt;       /* Tx pkts before generating tx int */
>>       __u32   maxrxpkt;       /* Rx pkts before generating rx int */
>>       __u16   speed_hi;       /* The forced speed (upper
>>                                * bits) in Mbps. Please use
>>                                * ethtool_cmd_speed()/_set() to
>>                                * access it */
>>       __u8    eth_tp_mdix;
>>       __u8    spoof_check;    /* Enable/Disable anti-spoofing */
>>       __u32   lp_advertising; /* Features the link partner advertises */
>>       __u32   reserved[2];
>>       __u32   num_vfs;        /* Enable SR-IOV VFs */
>>       __u32   num_vmqs;       /* Set number of queues for VMDq */
>> };
>> 
>> If I understood it correctly, you are trying to set/unset spoofing on per
>> eth interface,  which could be a PF on the hypervisor or a pci passthru-ed
>> VF in the linux guest.  There are other security features that one could set
>> for a port on the VF (lets call it vport),  e.g. setting a port VLAN ID for
>> a VF and specifying if the VF (VM) is allowed to send tagged/untagged
>> packets, setting a vport in port mirroring mode so that the PF can monitor
>> the traffic on a VF,  setting a vport in promiscuous mode etc. 
>> 
>> Does it make sense to try to use ip link util to specify all these parameters,
>> since ip link already does the  job of setting VF properties and VF port
>> profile?
>> 
>> Any thoughts?
>> 
> 
> Sure, that's a possibility too.  I was considering ethtool for this since MAC addresses and VLANs are fairly specific to Ethernet whereas netlink might apply to other types of physical networks.  At least that's my understanding.

You could specify VF MAC and VLANs using netlink today.
e.g. ip link set ethX vf # mac, vlan etc.
Adding spoofing as follows would do it.
ip link set ethX vf # spoof on|off 

Having SR-IOV features scattered among ethtool and ip link may be inconvenient for the end users.
CC-ing virtualization list.

> 
> However, I have no strong feelings about it and if community consensus is to use ip link instead then that's fine by me.
> 
> Of course, patches implementing such would be quite welcome also.

I could take a stab at it at the netlink side, if there is a consensus.

-Anirban



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 20:38         ` Rose, Gregory V
  2011-07-28 22:01           ` Anirban Chakraborty
@ 2011-07-28 22:01           ` Anirban Chakraborty
  1 sibling, 0 replies; 32+ messages in thread
From: Anirban Chakraborty @ 2011-07-28 22:01 UTC (permalink / raw)
  To: Rose, Gregory V
  Cc: Ben Hutchings, netdev, virtualization, David Miller, Kirsher, Jeffrey T


On Jul 28, 2011, at 1:38 PM, Rose, Gregory V wrote:

> 
>> From: Anirban Chakraborty [mailto:anirban.chakraborty@qlogic.com] 
>> Sent: Thursday, July 28, 2011 12:04 PM
>> To: Rose, Gregory V
>> Cc: David Miller; netdev; Ben Hutchings; Kirsher, Jeffrey T
>> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
>> 
>> 
>> On Jul 28, 2011, at 8:51 AM, Rose, Gregory V wrote:
>> 
>> 
>> -----Original Message-----
>> From: David Miller [mailto:davem@davemloft.net]
>> Sent: Wednesday, July 27, 2011 10:28 PM
>> To: Rose, Gregory V
>> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com; Kirsher, Jeffrey T
>> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
>> 
>> From: Greg Rose <gregory.v.rose@intel.com>
>> Date: Wed, 27 Jul 2011 15:17:59 -0700
>> 
>> Add new set commands to configure the number of SR-IOV VFs, the
>> number of VM queues and spoof checking on/off switch.
>> 
>> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
>> ---
>> 
>> include/linux/ethtool.h |   11 ++++++++++-
>> 1 files changed, 10 insertions(+), 1 deletions(-)
>> 
>> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
>> index c6e427a..c4972ba 100644
>> --- a/include/linux/ethtool.h
>> +++ b/include/linux/ethtool.h
>> @@ -36,12 +36,14 @@ struct ethtool_cmd {
>> 	__u8	mdio_support;
>> 	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
>> 	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
>> +	__u32	num_vfs;	/* Enable SR-IOV VFs */
>> +	__u32	num_vmqs;	/* Set number of queues for VMDq */
>> 
>> You can't change the layout of this datastructure in this way without
>> breaking every ethtool binary out there.
>> 
>> You have to find another place to add these knobs.
>> 
>> Perhaps at the end of the ethtool_cmd structure?  Something like this:
>> 
>> /* This should work for both 32 and 64 bit userland. */
>> struct ethtool_cmd {
>>       __u32   cmd;
>>       __u32   supported;      /* Features this interface supports */
>>       __u32   advertising;    /* Features this interface advertises */
>>       __u16   speed;          /* The forced speed (lower bits) in
>>                                * Mbps. Please use
>>                                * ethtool_cmd_speed()/_set() to
>>                                * access it */
>>       __u8    duplex;         /* Duplex, half or full */
>>       __u8    port;           /* Which connector port */
>>       __u8    phy_address;
>>       __u8    transceiver;    /* Which transceiver to use */
>>       __u8    autoneg;        /* Enable or disable autonegotiation */
>>       __u8    mdio_support;
>>       __u32   maxtxpkt;       /* Tx pkts before generating tx int */
>>       __u32   maxrxpkt;       /* Rx pkts before generating rx int */
>>       __u16   speed_hi;       /* The forced speed (upper
>>                                * bits) in Mbps. Please use
>>                                * ethtool_cmd_speed()/_set() to
>>                                * access it */
>>       __u8    eth_tp_mdix;
>>       __u8    spoof_check;    /* Enable/Disable anti-spoofing */
>>       __u32   lp_advertising; /* Features the link partner advertises */
>>       __u32   reserved[2];
>>       __u32   num_vfs;        /* Enable SR-IOV VFs */
>>       __u32   num_vmqs;       /* Set number of queues for VMDq */
>> };
>> 
>> If I understood it correctly, you are trying to set/unset spoofing on per
>> eth interface,  which could be a PF on the hypervisor or a pci passthru-ed
>> VF in the linux guest.  There are other security features that one could set
>> for a port on the VF (lets call it vport),  e.g. setting a port VLAN ID for
>> a VF and specifying if the VF (VM) is allowed to send tagged/untagged
>> packets, setting a vport in port mirroring mode so that the PF can monitor
>> the traffic on a VF,  setting a vport in promiscuous mode etc. 
>> 
>> Does it make sense to try to use ip link util to specify all these parameters,
>> since ip link already does the  job of setting VF properties and VF port
>> profile?
>> 
>> Any thoughts?
>> 
> 
> Sure, that's a possibility too.  I was considering ethtool for this since MAC addresses and VLANs are fairly specific to Ethernet whereas netlink might apply to other types of physical networks.  At least that's my understanding.

You could specify VF MAC and VLANs using netlink today.
e.g. ip link set ethX vf # mac, vlan etc.
Adding spoofing as follows would do it.
ip link set ethX vf # spoof on|off 

Having SR-IOV features scattered among ethtool and ip link may be inconvenient for the end users.
CC-ing virtualization list.

> 
> However, I have no strong feelings about it and if community consensus is to use ip link instead then that's fine by me.
> 
> Of course, patches implementing such would be quite welcome also.

I could take a stab at it at the netlink side, if there is a consensus.

-Anirban

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 22:01           ` Anirban Chakraborty
@ 2011-07-28 22:04             ` Rose, Gregory V
  2011-07-28 22:04             ` Rose, Gregory V
  1 sibling, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 22:04 UTC (permalink / raw)
  To: Anirban Chakraborty
  Cc: David Miller, netdev, Ben Hutchings, Kirsher, Jeffrey T, virtualization

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Anirban Chakraborty
> Sent: Thursday, July 28, 2011 3:01 PM
> To: Rose, Gregory V
> Cc: David Miller; netdev; Ben Hutchings; Kirsher, Jeffrey T;
> virtualization@lists.linux-foundation.org
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> 
> On Jul 28, 2011, at 1:38 PM, Rose, Gregory V wrote:
> 
> >
> >> From: Anirban Chakraborty [mailto:anirban.chakraborty@qlogic.com]
> >> Sent: Thursday, July 28, 2011 12:04 PM
> >> To: Rose, Gregory V
> >> Cc: David Miller; netdev; Ben Hutchings; Kirsher, Jeffrey T
> >> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> >>
> >>
> >> If I understood it correctly, you are trying to set/unset spoofing on
> per
> >> eth interface,  which could be a PF on the hypervisor or a pci
> passthru-ed
> >> VF in the linux guest.  There are other security features that one
> could set
> >> for a port on the VF (lets call it vport),  e.g. setting a port VLAN ID
> for
> >> a VF and specifying if the VF (VM) is allowed to send tagged/untagged
> >> packets, setting a vport in port mirroring mode so that the PF can
> monitor
> >> the traffic on a VF,  setting a vport in promiscuous mode etc.
> >>
> >> Does it make sense to try to use ip link util to specify all these
> parameters,
> >> since ip link already does the  job of setting VF properties and VF
> port
> >> profile?
> >>
> >> Any thoughts?
> >>
> >
> > Sure, that's a possibility too.  I was considering ethtool for this
> since MAC addresses and VLANs are fairly specific to Ethernet whereas
> netlink might apply to other types of physical networks.  At least that's
> my understanding.
> 
> You could specify VF MAC and VLANs using netlink today.
> e.g. ip link set ethX vf # mac, vlan etc.
> Adding spoofing as follows would do it.
> ip link set ethX vf # spoof on|off
> 
> Having SR-IOV features scattered among ethtool and ip link may be
> inconvenient for the end users.
> CC-ing virtualization list.
> 
> >
> > However, I have no strong feelings about it and if community consensus
> is to use ip link instead then that's fine by me.
> >
> > Of course, patches implementing such would be quite welcome also.
> 
> I could take a stab at it at the netlink side, if there is a consensus.

Now that I think about it I'm seeing it more your way.  I'll drop the anti-spoofing stuff from my ethtool patches.  If you get the time to provide patches to netlink for anti-spoofing then that's great, otherwise I'll do it when I get done with the SR-IOV reconfig stuff.

Thanks,

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 22:01           ` Anirban Chakraborty
  2011-07-28 22:04             ` Rose, Gregory V
@ 2011-07-28 22:04             ` Rose, Gregory V
  1 sibling, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 22:04 UTC (permalink / raw)
  To: Anirban Chakraborty
  Cc: Ben Hutchings, netdev, virtualization, David Miller, Kirsher, Jeffrey T

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Anirban Chakraborty
> Sent: Thursday, July 28, 2011 3:01 PM
> To: Rose, Gregory V
> Cc: David Miller; netdev; Ben Hutchings; Kirsher, Jeffrey T;
> virtualization@lists.linux-foundation.org
> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> 
> On Jul 28, 2011, at 1:38 PM, Rose, Gregory V wrote:
> 
> >
> >> From: Anirban Chakraborty [mailto:anirban.chakraborty@qlogic.com]
> >> Sent: Thursday, July 28, 2011 12:04 PM
> >> To: Rose, Gregory V
> >> Cc: David Miller; netdev; Ben Hutchings; Kirsher, Jeffrey T
> >> Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> >>
> >>
> >> If I understood it correctly, you are trying to set/unset spoofing on
> per
> >> eth interface,  which could be a PF on the hypervisor or a pci
> passthru-ed
> >> VF in the linux guest.  There are other security features that one
> could set
> >> for a port on the VF (lets call it vport),  e.g. setting a port VLAN ID
> for
> >> a VF and specifying if the VF (VM) is allowed to send tagged/untagged
> >> packets, setting a vport in port mirroring mode so that the PF can
> monitor
> >> the traffic on a VF,  setting a vport in promiscuous mode etc.
> >>
> >> Does it make sense to try to use ip link util to specify all these
> parameters,
> >> since ip link already does the  job of setting VF properties and VF
> port
> >> profile?
> >>
> >> Any thoughts?
> >>
> >
> > Sure, that's a possibility too.  I was considering ethtool for this
> since MAC addresses and VLANs are fairly specific to Ethernet whereas
> netlink might apply to other types of physical networks.  At least that's
> my understanding.
> 
> You could specify VF MAC and VLANs using netlink today.
> e.g. ip link set ethX vf # mac, vlan etc.
> Adding spoofing as follows would do it.
> ip link set ethX vf # spoof on|off
> 
> Having SR-IOV features scattered among ethtool and ip link may be
> inconvenient for the end users.
> CC-ing virtualization list.
> 
> >
> > However, I have no strong feelings about it and if community consensus
> is to use ip link instead then that's fine by me.
> >
> > Of course, patches implementing such would be quite welcome also.
> 
> I could take a stab at it at the netlink side, if there is a consensus.

Now that I think about it I'm seeing it more your way.  I'll drop the anti-spoofing stuff from my ethtool patches.  If you get the time to provide patches to netlink for anti-spoofing then that's great, otherwise I'll do it when I get done with the SR-IOV reconfig stuff.

Thanks,

- Greg

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 21:34     ` Rose, Gregory V
@ 2011-07-28 22:04       ` Ben Hutchings
  2011-07-28 22:25         ` Rose, Gregory V
  0 siblings, 1 reply; 32+ messages in thread
From: Ben Hutchings @ 2011-07-28 22:04 UTC (permalink / raw)
  To: Rose, Gregory V; +Cc: netdev, davem, Kirsher, Jeffrey T

On Thu, 2011-07-28 at 14:34 -0700, Rose, Gregory V wrote:
> > -----Original Message-----
> > From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> > On Behalf Of Ben Hutchings
> > Sent: Thursday, July 28, 2011 2:20 PM
> > To: Rose, Gregory V
> > Cc: netdev@vger.kernel.org; davem@davemloft.net; Kirsher, Jeffrey T
> > Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> > 
> > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > Add new set commands to configure the number of SR-IOV VFs, the
> > > number of VM queues and spoof checking on/off switch.
> > >
> > > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > > ---
> > >
> > >  include/linux/ethtool.h |   11 ++++++++++-
> > >  1 files changed, 10 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> > > index c6e427a..c4972ba 100644
> > > --- a/include/linux/ethtool.h
> > > +++ b/include/linux/ethtool.h
> > > @@ -36,12 +36,14 @@ struct ethtool_cmd {
> > >  	__u8	mdio_support;
> > >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> > >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> > > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> > 
> > What are the semantics of changing this after VFs have already been set
> > up?
> 
> There's an example of it patch 4/4 in this set.  The PF driver checks
> if any VFs are assigned and active and if not then it will disable and
> destroy all the current VFs via a call to pci_disable_sriov() and then
> call pci_enable_sriov() with the new number of VFs.  Or, if the number
> of new VFs is zero then SR-IOV is left disabled on that PF.

And otherwise the request fails?

> Mostly this is to accommodate customer requests to be able to set a
> different number of VFs per PF or to only have specified PFs enable
> VFs.  The current usage of the max_vfs module parameter is unwieldy in
> this sense as you must enable SR-IOV VFs on all physical functions
> found during the device probe with the same number of VFs.

Right, this makes a lot of sense.

> > > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> > 
> > VMDq is an Intel proprietary name.  Please specify this in generic
> > terms.
> 
> I'll use the more generic term VM queues.
[...]

Still ambiguous.  I think what you actually intend is that this will be
the number of RX queues and the number of TX queues per VF.  Right?
You might want to allow for different numbers of RX and TX queues.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
  2011-07-28 22:04       ` Ben Hutchings
@ 2011-07-28 22:25         ` Rose, Gregory V
  0 siblings, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-28 22:25 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, davem, Kirsher, Jeffrey T

> -----Original Message-----
> From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> Sent: Thursday, July 28, 2011 3:04 PM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; davem@davemloft.net; Kirsher, Jeffrey T
> Subject: RE: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> 
> On Thu, 2011-07-28 at 14:34 -0700, Rose, Gregory V wrote:
> > > -----Original Message-----
> > > From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org]
> > > On Behalf Of Ben Hutchings
> > > Sent: Thursday, July 28, 2011 2:20 PM
> > > To: Rose, Gregory V
> > > Cc: netdev@vger.kernel.org; davem@davemloft.net; Kirsher, Jeffrey T
> > > Subject: Re: [RFC net-next PATCH 3/4] ethtool: Add new set commands
> > >
> > > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > > Add new set commands to configure the number of SR-IOV VFs, the
> > > > number of VM queues and spoof checking on/off switch.
> > > >
> > > > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > > > ---
> > > >
> > > >  include/linux/ethtool.h |   11 ++++++++++-
> > > >  1 files changed, 10 insertions(+), 1 deletions(-)
> > > >
> > > > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> > > > index c6e427a..c4972ba 100644
> > > > --- a/include/linux/ethtool.h
> > > > +++ b/include/linux/ethtool.h
> > > > @@ -36,12 +36,14 @@ struct ethtool_cmd {
> > > >  	__u8	mdio_support;
> > > >  	__u32	maxtxpkt;	/* Tx pkts before generating tx int */
> > > >  	__u32	maxrxpkt;	/* Rx pkts before generating rx int */
> > > > +	__u32	num_vfs;	/* Enable SR-IOV VFs */
> > >
> > > What are the semantics of changing this after VFs have already been
> set
> > > up?
> >
> > There's an example of it patch 4/4 in this set.  The PF driver checks
> > if any VFs are assigned and active and if not then it will disable and
> > destroy all the current VFs via a call to pci_disable_sriov() and then
> > call pci_enable_sriov() with the new number of VFs.  Or, if the number
> > of new VFs is zero then SR-IOV is left disabled on that PF.
> 
> And otherwise the request fails?

Yes, it returns -EBUSY if VFs are in use (assigned to guest VMs) or the PF driver is up and running.  It can return other errors from the post configuration setup of interrupt resources.

> 
> > Mostly this is to accommodate customer requests to be able to set a
> > different number of VFs per PF or to only have specified PFs enable
> > VFs.  The current usage of the max_vfs module parameter is unwieldy in
> > this sense as you must enable SR-IOV VFs on all physical functions
> > found during the device probe with the same number of VFs.
> 
> Right, this makes a lot of sense.
> 
> > > > +	__u32	num_vmqs;	/* Set number of queues for VMDq */
> > >
> > > VMDq is an Intel proprietary name.  Please specify this in generic
> > > terms.
> >
> > I'll use the more generic term VM queues.
> [...]
> 
> Still ambiguous.  I think what you actually intend is that this will be
> the number of RX queues and the number of TX queues per VF.  Right?
> You might want to allow for different numbers of RX and TX queues.

OK, that's not a problem.  Our devices only support matching numbers of queue pairs but I suppose other devices might have the ability to have different numbers of Rx/Tx queues per VF.  I'll rework it that way while I'm adding the new command.

The number of queues might also be used in a scenario similar to the VMware Net Queues (TM) feature.  There is no support for that in Linux at the moment but it could be supported in the future.

- Greg


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-28 15:11   ` Ian Campbell
  2011-07-28 15:58     ` Rose, Gregory V
@ 2011-07-29 16:51     ` Jesse Barnes
  2011-07-29 16:54       ` Rose, Gregory V
  1 sibling, 1 reply; 32+ messages in thread
From: Jesse Barnes @ 2011-07-29 16:51 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Greg Rose, Konrad Rzeszutek Wilk, netdev, davem, bhutchings,
	jeffrey.t.kirsher, linux-pci

On Thu, 28 Jul 2011 16:11:17 +0100
Ian Campbell <ijc@hellion.org.uk> wrote:

> On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > Device drivers that create and destroy SR-IOV virtual functions via
> > calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
> > failures if they attempt to destroy VFs while they are assigned to
> > guest virtual machines.  By adding a flag for use by the KVM module
> > to indicate that a device is assigned a device driver can check that
> > flag and avoid destroying VFs while they are assigned and avoid system
> > failures.
> > 
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > ---
> > 
> >  include/linux/pci.h     |    2 ++
> 
> I added Jesse and linux-pci to CC.
> 
> >  virt/kvm/assigned-dev.c |    2 ++
> >  virt/kvm/iommu.c        |    4 ++++
> >  3 files changed, 8 insertions(+), 0 deletions(-)
> 
> I suppose this would also be useful in Xen's pciback or any other system
> which does passthrough? (Konrad CC'd for pciback)
> 
> Is there some common lower layer we could hook this in to? (does
> iommu_attach/detach_device make sense?) Or shall we just add the flag
> manipulations to pciback as well?
> 
> Ian.
> 
> > 
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 2d29218..a297ca2 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -174,6 +174,8 @@ enum pci_dev_flags {
> >  	PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG = (__force pci_dev_flags_t) 1,
> >  	/* Device configuration is irrevocably lost if disabled into D3 */
> >  	PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
> > +	/* Provide indication device is assigned by KVM */
> > +	PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
> >  };

Looks fine, but I'd make the comment less redundant with the code, e.g.
"set when the device is assigned to a guest instance" or somesuch.

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-28 16:42         ` Rose, Gregory V
@ 2011-07-29 16:54           ` Konrad Rzeszutek Wilk
  2011-07-30  4:00             ` Jeff Kirsher
  0 siblings, 1 reply; 32+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-07-29 16:54 UTC (permalink / raw)
  To: Rose, Gregory V
  Cc: Ian Campbell, netdev, davem, bhutchings, Kirsher, Jeffrey T,
	Jesse Barnes, linux-pci

> > > > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > > > Device drivers that create and destroy SR-IOV virtual functions via
> > > > > calls to pci_enable_sriov() and pci_disable_sriov can cause
> > catastrophic
> > > > > failures if they attempt to destroy VFs while they are assigned to
> > > > > guest virtual machines.  By adding a flag for use by the KVM module
> > > > > to indicate that a device is assigned a device driver can check that
> > > > > flag and avoid destroying VFs while they are assigned and avoid
> > system
> > > > > failures.
> OK, but I hope Xen can still use the dev_flag assignment bit.

Yeah, I think the attached patch would do it, but I need to double check it.
Do you have a git tree with this patchset?

Um, so you are fixing up ixgbe only - what about the cxgb4 and be driver?
Shouldn't they also get some of this treatment?



diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 206c4ce0..0d72e84 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -250,6 +250,7 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
 		goto out;
 
 	dev_dbg(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
+	dev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
 	if (xen_register_device_domain_owner(dev,
 					     pdev->xdev->otherend_id) != 0) {
 		dev_err(&dev->dev, "device has been assigned to another " \
@@ -289,6 +290,7 @@ static int xen_pcibk_remove_device(struct xen_pcibk_device *pdev,
 	}
 
 	dev_dbg(&dev->dev, "unregistering for %d\n", pdev->xdev->otherend_id);
+	dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
 	xen_unregister_device_domain_owner(dev);
 
 	xen_pcibk_release_pci_dev(pdev, dev);

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* RE: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-29 16:51     ` Jesse Barnes
@ 2011-07-29 16:54       ` Rose, Gregory V
  0 siblings, 0 replies; 32+ messages in thread
From: Rose, Gregory V @ 2011-07-29 16:54 UTC (permalink / raw)
  To: Jesse Barnes, Ian Campbell
  Cc: Konrad Rzeszutek Wilk, netdev, davem, bhutchings, Kirsher,
	Jeffrey T, linux-pci

> -----Original Message-----
> From: Jesse Barnes [mailto:jbarnes@virtuousgeek.org]
> Sent: Friday, July 29, 2011 9:52 AM
> To: Ian Campbell
> Cc: Rose, Gregory V; Konrad Rzeszutek Wilk; netdev@vger.kernel.org;
> davem@davemloft.net; bhutchings@solarflare.com; Kirsher, Jeffrey T; linux-
> pci@vger.kernel.org
> Subject: Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has
> been assigned by KVM
> 
> On Thu, 28 Jul 2011 16:11:17 +0100
> Ian Campbell <ijc@hellion.org.uk> wrote:
> 
> > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > Device drivers that create and destroy SR-IOV virtual functions via
> > > calls to pci_enable_sriov() and pci_disable_sriov can cause
> catastrophic
> > > failures if they attempt to destroy VFs while they are assigned to
> > > guest virtual machines.  By adding a flag for use by the KVM module
> > > to indicate that a device is assigned a device driver can check that
> > > flag and avoid destroying VFs while they are assigned and avoid system
> > > failures.
> > >
> > > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > > ---
> > >
> > >  include/linux/pci.h     |    2 ++
> >
> > I added Jesse and linux-pci to CC.
> >
> > >  virt/kvm/assigned-dev.c |    2 ++
> > >  virt/kvm/iommu.c        |    4 ++++
> > >  3 files changed, 8 insertions(+), 0 deletions(-)
> >
> > I suppose this would also be useful in Xen's pciback or any other system
> > which does passthrough? (Konrad CC'd for pciback)
> >
> > Is there some common lower layer we could hook this in to? (does
> > iommu_attach/detach_device make sense?) Or shall we just add the flag
> > manipulations to pciback as well?
> >
> > Ian.
> >
> > >
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 2d29218..a297ca2 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -174,6 +174,8 @@ enum pci_dev_flags {
> > >  	PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG = (__force pci_dev_flags_t) 1,
> > >  	/* Device configuration is irrevocably lost if disabled into D3 */
> > >  	PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
> > > +	/* Provide indication device is assigned by KVM */
> > > +	PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
> > >  };
> 
> Looks fine, but I'd make the comment less redundant with the code, e.g.
> "set when the device is assigned to a guest instance" or somesuch.

Sure, sounds good to me.

Rev 2 of the RFC patches will be out in a couple of weeks, I'm away next week.

Thanks,

- Greg

> 
> --
> Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
  2011-07-29 16:54           ` Konrad Rzeszutek Wilk
@ 2011-07-30  4:00             ` Jeff Kirsher
  0 siblings, 0 replies; 32+ messages in thread
From: Jeff Kirsher @ 2011-07-30  4:00 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Rose, Gregory V, Ian Campbell, netdev, davem, bhutchings,
	Jesse Barnes, linux-pci

On Fri, Jul 29, 2011 at 09:54, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
>> > > > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
>> > > > > Device drivers that create and destroy SR-IOV virtual functions via
>> > > > > calls to pci_enable_sriov() and pci_disable_sriov can cause
>> > catastrophic
>> > > > > failures if they attempt to destroy VFs while they are assigned to
>> > > > > guest virtual machines.  By adding a flag for use by the KVM module
>> > > > > to indicate that a device is assigned a device driver can check that
>> > > > > flag and avoid destroying VFs while they are assigned and avoid
>> > system
>> > > > > failures.
>> OK, but I hope Xen can still use the dev_flag assignment bit.
>
> Yeah, I think the attached patch would do it, but I need to double check it.
> Do you have a git tree with this patchset?

Not yet, Greg was send these patches out as a RFC to make sure this
was the correct direction to go.

>
> Um, so you are fixing up ixgbe only - what about the cxgb4 and be driver?
> Shouldn't they also get some of this treatment?
>

Yes, Greg made changes to ixgbe to give an idea of what changes would
need to be made with the suggested changes.  From the feedback Greg
received, there is definite changes needed.  When Greg gets finished
with the necessary changes to the kernel/ethtool, we will have a
better idea of what changes are needed for the drivers.

-- 
Cheers,
Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2011-07-30  4:00 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-27 22:17 [RFC net-next PATCH 0/4] Add new settings for ethtool Greg Rose
2011-07-27 22:17 ` [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM Greg Rose
2011-07-28 15:11   ` Ian Campbell
2011-07-28 15:58     ` Rose, Gregory V
2011-07-28 16:27       ` Ian Campbell
2011-07-28 16:42         ` Rose, Gregory V
2011-07-29 16:54           ` Konrad Rzeszutek Wilk
2011-07-30  4:00             ` Jeff Kirsher
2011-07-29 16:51     ` Jesse Barnes
2011-07-29 16:54       ` Rose, Gregory V
2011-07-27 22:17 ` [RFC net-next PATCH 2/4] ixgbe: Reconfigure SR-IOV Init Greg Rose
2011-07-28  5:26   ` David Miller
2011-07-28 15:44     ` Rose, Gregory V
2011-07-27 22:17 ` [RFC net-next PATCH 3/4] ethtool: Add new set commands Greg Rose
2011-07-28  5:27   ` David Miller
2011-07-28 15:51     ` Rose, Gregory V
2011-07-28 16:14       ` David Miller
2011-07-28 16:21         ` Rose, Gregory V
2011-07-28 21:14         ` Ben Hutchings
2011-07-28 21:16           ` Rose, Gregory V
     [not found]       ` <539DF151-E442-4375-8777-19676B95059B@qlogic.com>
2011-07-28 20:38         ` Rose, Gregory V
2011-07-28 22:01           ` Anirban Chakraborty
2011-07-28 22:04             ` Rose, Gregory V
2011-07-28 22:04             ` Rose, Gregory V
2011-07-28 22:01           ` Anirban Chakraborty
2011-07-28 21:20   ` Ben Hutchings
2011-07-28 21:34     ` Rose, Gregory V
2011-07-28 22:04       ` Ben Hutchings
2011-07-28 22:25         ` Rose, Gregory V
2011-07-27 22:18 ` [RFC net-next PATCH 4/4] ixgbe: Add support for new ethtool settings Greg Rose
2011-07-28 11:54   ` Michał Mirosław
2011-07-28 15:52     ` Rose, Gregory V

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.