All of lore.kernel.org
 help / color / mirror / Atom feed
* [rfc 0/4] igb: bandwidth allocation
@ 2009-11-05  0:58 Simon Horman
  2009-11-05  0:58 ` [rfc 1/4] igb: Add igb_cleanup_vf() Simon Horman
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Simon Horman @ 2009-11-05  0:58 UTC (permalink / raw)
  To: e1000-devel, netdev; +Cc: Arnd Bergmann, Jeff Kirsher

Hi,

this series of patches exposes the bandwidth allocation
hardware support of the Intel 82576. It does so through
a rather hackish sysfs entry. That interface is just intended
for testing so that the exposed hardware feature can
be exercised. I would like to find a generic way to expose
this feature to user-space.


>From horms@vergenet.net Thu Nov  5 12:06:27 2009
Message-Id: <20091105010627.123781635@vergenet.net>
User-Agent: quilt/0.48-1
Date: Thu, 05 Nov 2009 11:58:48 +1100
From: Simon Horman <horms@verge.net.au>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>,
 Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
 Arnd Bergmann <arndbergmann@googlemail.com>
To: e1000-devel@lists.sourceforge.net,
 netdev@vger.kernel.org
Subject: [rfc 1/4] igb: Add igb_cleanup_vf()
References: <20091105005847.941190065@vergenet.net>
Content-Disposition: inline; filename=igb_cleanup_vf.patch

Move virtual finction cleanup code into igb_cleanup_vf() and for the sake
of symmetry rename igb_probe_vfs() as igb_init_vf().

Although these functions aren't entirely symmetrical it should aid
maintenance by making the relationship between initialisation and cleanup
more obvious.

Note that there appears to be no way for adapter->vfs_allocated_count to be
non-zero for the case where CONFIG_PCI_IOV is not set, so reseting this
value was moved to inside the relvant #ifdef.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:38:58.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 16:36:07.000000000 +0900
@@ -87,6 +87,7 @@ void igb_update_stats(struct igb_adapter
 static int igb_probe(struct pci_dev *, const struct pci_device_id *);
 static void __devexit igb_remove(struct pci_dev *pdev);
 static int igb_sw_init(struct igb_adapter *);
+static void igb_cleanup_vf(struct igb_adapter * adapter);
 static int igb_open(struct net_device *);
 static int igb_close(struct net_device *);
 static void igb_configure_tx(struct igb_adapter *);
@@ -650,21 +651,7 @@ static void igb_set_interrupt_capability
 
 	/* If we can't do MSI-X, try MSI */
 msi_only:
-#ifdef CONFIG_PCI_IOV
-	/* disable SR-IOV for non MSI-X configurations */
-	if (adapter->vf_data) {
-		struct e1000_hw *hw = &adapter->hw;
-		/* disable iov and allow time for transactions to clear */
-		pci_disable_sriov(adapter->pdev);
-		msleep(500);
-
-		kfree(adapter->vf_data);
-		adapter->vf_data = NULL;
-		wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
-		msleep(100);
-		dev_info(&adapter->pdev->dev, "IOV Disabled\n");
-	}
-#endif
+	igb_cleanup_vf(adapter);
 	adapter->vfs_allocated_count = 0;
 	adapter->flags |= IGB_FLAG_QUEUE_PAIRS;
 	adapter->num_rx_queues = 1;
@@ -1734,7 +1721,7 @@ static void __devexit igb_remove(struct 
 }
 
 /**
- * igb_probe_vfs - Initialize vf data storage and add VFs to pci config space
+ * igb_init_vf - Initialize vf data storage and add VFs to pci config space
  * @adapter: board private structure to initialize
  *
  * This function initializes the vf specific data storage and then attempts to
@@ -1742,7 +1729,7 @@ static void __devexit igb_remove(struct 
  * mor expensive time wise to disable SR-IOV than it is to allocate and free
  * the memory for the VFs.
  **/
-static void __devinit igb_probe_vfs(struct igb_adapter * adapter)
+static void __devinit igb_init_vf(struct igb_adapter * adapter)
 {
 #ifdef CONFIG_PCI_IOV
 	struct pci_dev *pdev = adapter->pdev;
@@ -1782,6 +1769,35 @@ static void __devinit igb_probe_vfs(stru
 }
 
 /**
+ * igb_cleanup_vf - Clean up vf data and remove vfs from pci config space
+ * @adapter: board private structure to initialize
+ *
+ * This function cleans-up the vf specific data storage and then attempts to
+ * deallocate the VFs.
+ **/
+static void igb_cleanup_vf(struct igb_adapter * adapter)
+{
+#ifdef CONFIG_PCI_IOV
+	struct e1000_hw *hw = &adapter->hw;
+
+	if (!adapter->vf_data)
+		return;
+
+	/* disable iov and allow time for transactions to clear */
+	pci_disable_sriov(adapter->pdev);
+	msleep(500);
+
+	kfree(adapter->vf_data);
+	adapter->vf_data = NULL;
+	adapter->vfs_allocated_count = 0;
+
+	wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
+	msleep(100);
+	dev_info(&adapter->pdev->dev, "IOV Disabled\n");
+#endif
+}
+
+/**
  * igb_sw_init - Initialize general software structures (struct igb_adapter)
  * @adapter: board private structure to initialize
  *
@@ -1816,7 +1832,7 @@ static int __devinit igb_sw_init(struct 
 		return -ENOMEM;
 	}
 
-	igb_probe_vfs(adapter);
+	igb_init_vf(adapter);
 
 	/* Explicitly disable IRQ since the NIC can be in any state. */
 	igb_irq_disable(adapter);


>From horms@vergenet.net Thu Nov  5 12:06:27 2009
Message-Id: <20091105010627.464560295@vergenet.net>
User-Agent: quilt/0.48-1
Date: Thu, 05 Nov 2009 11:58:49 +1100
From: Simon Horman <horms@verge.net.au>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>,
 Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
 Arnd Bergmann <arndbergmann@googlemail.com>
To: e1000-devel@lists.sourceforge.net,
 netdev@vger.kernel.org
Subject: [rfc 2/4] igb: Initialise adapter->vfs_allocated_count in igb_init_vf()
References: <20091105005847.941190065@vergenet.net>
Content-Disposition: inline; filename=igb_init_vf-vfs_allocated_count.patch

Initialise adapter->vfs_allocated_count in igb_init_vf()
and only do so if the VFs are successfully created.

This seems a lot tidier to me, for starters igb_init_vf() is no longer
spliced in two by an #ifdef just to allow vfs_allocated_count to be reset to
0 if CONFIG_PCI_IOV is unset.

This change will break things if igb_init_interrupt_scheme() relies on
adapter->vfs_allocated_count, but on inspection it appears not to.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:46:06.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 04:51:10.000000000 +0900
@@ -1723,48 +1723,48 @@ static void __devexit igb_remove(struct 
 /**
  * igb_init_vf - Initialize vf data storage and add VFs to pci config space
  * @adapter: board private structure to initialize
+ * @vfn: requested number of virtual functions
  *
  * This function initializes the vf specific data storage and then attempts to
  * allocate the VFs.  The reason for ordering it this way is because it is much
  * mor expensive time wise to disable SR-IOV than it is to allocate and free
  * the memory for the VFs.
  **/
-static void __devinit igb_init_vf(struct igb_adapter * adapter)
+static void __devinit igb_init_vf(struct igb_adapter * adapter, int vfn)
 {
 #ifdef CONFIG_PCI_IOV
 	struct pci_dev *pdev = adapter->pdev;
+	struct e1000_hw *hw = &adapter->hw;
 
-	if (adapter->vfs_allocated_count > 7)
-		adapter->vfs_allocated_count = 7;
+	if (hw->mac.type != e1000_82576 || !vfn)
+		return;
 
-	if (adapter->vfs_allocated_count) {
-		adapter->vf_data = kcalloc(adapter->vfs_allocated_count,
-		                           sizeof(struct vf_data_storage),
-		                           GFP_KERNEL);
-		/* if allocation failed then we do not support SR-IOV */
-		if (!adapter->vf_data) {
-			adapter->vfs_allocated_count = 0;
-			dev_err(&pdev->dev, "Unable to allocate memory for VF "
-			        "Data Storage\n");
-		}
+	if (vfn > 7)
+		vfn = 7;
+
+	adapter->vf_data = kcalloc(vfn, sizeof(struct vf_data_storage),
+				   GFP_KERNEL);
+	/* if allocation failed then we do not support SR-IOV */
+	if (!adapter->vf_data) {
+		dev_err(&pdev->dev, "Unable to allocate memory for VF "
+		        "Data Storage\n");
+		return;
 	}
 
-	if (pci_enable_sriov(pdev, adapter->vfs_allocated_count)) {
+	if (pci_enable_sriov(pdev, vfn)) {
 		kfree(adapter->vf_data);
 		adapter->vf_data = NULL;
-#endif /* CONFIG_PCI_IOV */
-		adapter->vfs_allocated_count = 0;
-#ifdef CONFIG_PCI_IOV
 	} else {
 		unsigned char mac_addr[ETH_ALEN];
 		int i;
-		dev_info(&pdev->dev, "%d vfs allocated\n",
-		         adapter->vfs_allocated_count);
-		for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
+		for (i = 0; i < vfn; i++) {
 			random_ether_addr(mac_addr);
 			igb_set_vf_mac(adapter, i, mac_addr);
 		}
 	}
+
+	adapter->vfs_allocated_count = vfn;
 #endif /* CONFIG_PCI_IOV */
 }
 
@@ -1821,18 +1821,13 @@ static int __devinit igb_sw_init(struct 
 	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
-#ifdef CONFIG_PCI_IOV
-	if (hw->mac.type == e1000_82576)
-		adapter->vfs_allocated_count = max_vfs;
-
-#endif /* CONFIG_PCI_IOV */
 	/* This call may decrease the number of queues */
 	if (igb_init_interrupt_scheme(adapter)) {
 		dev_err(&pdev->dev, "Unable to allocate memory for queues\n");
 		return -ENOMEM;
 	}
 
-	igb_init_vf(adapter);
+	igb_init_vf(adapter, max_vfs);
 
 	/* Explicitly disable IRQ since the NIC can be in any state. */
 	igb_irq_disable(adapter);


>From horms@vergenet.net Thu Nov  5 12:06:28 2009
Message-Id: <20091105010627.806283906@vergenet.net>
User-Agent: quilt/0.48-1
Date: Thu, 05 Nov 2009 11:58:50 +1100
From: Simon Horman <horms@verge.net.au>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>,
 Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
 Arnd Bergmann <arndbergmann@googlemail.com>
To: e1000-devel@lists.sourceforge.net,
 netdev@vger.kernel.org
Subject: [rfc 3/4] igb: Common error path in igb_init_vfs()
References: <20091105005847.941190065@vergenet.net>
Content-Disposition: inline; filename=igb_init_vf-err.patch

Drop out into an error path on error.

This is a bit superfluous as things stand, though arguably
it already makes the code cleaner. But it should help things a lot
if igb_init_vfs() has a several things to unwind on error.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:52:16.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 16:36:01.000000000 +0900
@@ -1735,6 +1735,8 @@ static void __devinit igb_init_vf(struct
 #ifdef CONFIG_PCI_IOV
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_hw *hw = &adapter->hw;
+	unsigned char mac_addr[ETH_ALEN];
+	int i;
 
 	if (hw->mac.type != e1000_82576 || !vfn)
 		return;
@@ -1751,20 +1753,21 @@ static void __devinit igb_init_vf(struct
 		return;
 	}
 
-	if (pci_enable_sriov(pdev, vfn)) {
-		kfree(adapter->vf_data);
-		adapter->vf_data = NULL;
-	} else {
-		unsigned char mac_addr[ETH_ALEN];
-		int i;
-		dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
-		for (i = 0; i < vfn; i++) {
-			random_ether_addr(mac_addr);
-			igb_set_vf_mac(adapter, i, mac_addr);
-		}
+	if (pci_enable_sriov(pdev, vfn))
+		goto err;
+
+	dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
+	for (i = 0; i < vfn; i++) {
+		random_ether_addr(mac_addr);
+		igb_set_vf_mac(adapter, i, mac_addr);
 	}
 
 	adapter->vfs_allocated_count = vfn;
+
+	return;
+err:
+	kfree(adapter->vf_data);
+	adapter->vf_data = NULL;
 #endif /* CONFIG_PCI_IOV */
 }
 


>From horms@vergenet.net Thu Nov  5 12:06:28 2009
Message-Id: <20091105010628.148945886@vergenet.net>
User-Agent: quilt/0.48-1
Date: Thu, 05 Nov 2009 11:58:51 +1100
From: Simon Horman <horms@verge.net.au>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>,
 Jeff Kirsher <jeffrey.t.kirsher@intel.com>,
 Arnd Bergmann <arndbergmann@googlemail.com>
To: e1000-devel@lists.sourceforge.net,
 netdev@vger.kernel.org
Subject: [rfc 4/4] igb: expose 82576 bandiwidth allocation
References: <20091105005847.941190065@vergenet.net>
Content-Disposition: inline; filename=igb-ba.patch

The 82576 has support for bandwidth allocation to VFs.

Contrary to the documentation in the 82576 datasheet v2.41 this
appears to work as follows:

* The ratio supplied is always proportional to 1Gbit/s,
  regardless of if the link speed.
* The ratio supplied is an upper-bound on bandwidth available
  to the VF, not a minimun guarantee

This patch exposes bandwidth control to userspace through a simple
per-device (PF) sysfs file, bandwidth_allocation.

* The file contains a whitespace delimited list of values, one per VF.
* The first value corresponds to the first VF and so on.
* Valid values are integers from 0 to 1000
* A value of 0 indicates that bandwidth_allocation is disabled.
* Other values indicate the allocated bandwidth, in 1/1000ths of a gigabit/s

e.g. The following for a PF with 4 VFs allocates ~20Mbits/ to VF 1,
     ~100Mbit/s to VF 2, and leave the other 2 VFs with no allocation.

     echo "20 100 0 0" > /sys/class/net/eth3/device/bandwidth_allocation

This interface is intended to allow testing of the hardware feature.
There are ongoing discussions about how to expose this feature
to user-space in a more generic way.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:55:06.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 05:12:54.000000000 +0900
@@ -47,6 +47,9 @@
 #ifdef CONFIG_IGB_DCA
 #include <linux/dca.h>
 #endif
+#ifdef CONFIG_PCI_IOV
+#include <linux/ctype.h>
+#endif
 #include "igb.h"
 
 #define DRV_VERSION "1.3.16-k2"
@@ -152,6 +155,15 @@ static unsigned int max_vfs = 0;
 module_param(max_vfs, uint, 0);
 MODULE_PARM_DESC(max_vfs, "Maximum number of virtual functions to allocate "
                  "per physical function");
+
+static ssize_t igb_set_bandwidth_allocation(struct device *,
+					    struct device_attribute *,
+					    const char *, size_t);
+static ssize_t igb_show_bandwidth_allocation(struct device *,
+					     struct device_attribute *,
+					     char *);
+DEVICE_ATTR(bandwidth_allocation, S_IRUGO | S_IWUSR,
+	    igb_show_bandwidth_allocation, igb_set_bandwidth_allocation);
 #endif /* CONFIG_PCI_IOV */
 
 static pci_ers_result_t igb_io_error_detected(struct pci_dev *,
@@ -1754,7 +1766,18 @@ static void __devinit igb_init_vf(struct
 	}
 
 	if (pci_enable_sriov(pdev, vfn))
-		goto err;
+		goto err_kfree;
+
+	if (device_create_file(&pdev->dev, &dev_attr_bandwidth_allocation))
+		goto err_sriov;
+
+	adapter->bandwidth_allocation = kcalloc(vfn, sizeof(unsigned int),
+						GFP_KERNEL);
+	if (!adapter->bandwidth_allocation)
+		goto err_file;
+	memset(adapter->bandwidth_allocation, vfn * sizeof(unsigned int), 0);
+
+	spin_lock_init(&adapter->bandwidth_allocation_lock);
 
 	dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
 	for (i = 0; i < vfn; i++) {
@@ -1765,7 +1788,11 @@ static void __devinit igb_init_vf(struct
 	adapter->vfs_allocated_count = vfn;
 
 	return;
-err:
+err_file:
+	device_remove_file(&pdev->dev, &dev_attr_bandwidth_allocation);
+err_sriov:
+	pci_disable_sriov(pdev);
+err_kfree:
 	kfree(adapter->vf_data);
 	adapter->vf_data = NULL;
 #endif /* CONFIG_PCI_IOV */
@@ -1781,6 +1808,7 @@ err:
 static void igb_cleanup_vf(struct igb_adapter * adapter)
 {
 #ifdef CONFIG_PCI_IOV
+	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_hw *hw = &adapter->hw;
 
 	if (!adapter->vf_data)
@@ -1797,6 +1825,9 @@ static void igb_cleanup_vf(struct igb_ad
 	wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
 	msleep(100);
 	dev_info(&adapter->pdev->dev, "IOV Disabled\n");
+
+	device_remove_file(&pdev->dev, &dev_attr_bandwidth_allocation);
+	kfree(adapter->bandwidth_allocation);
 #endif
 }
 
@@ -2088,6 +2119,123 @@ void igb_configure_tx_ring(struct igb_ad
 	wr32(E1000_TXDCTL(reg_idx), txdctl);
 }
 
+#ifdef CONFIG_PCI_IOV
+static void igb_disable_bandwidth_allocation_vf(struct e1000_hw *hw, int vf)
+{
+	wr32(E1000_VMBASEL, vf);
+	wr32(E1000_VMBAC, 0);
+}
+
+static void igb_disable_bandwidth_allocation(struct igb_adapter *adapter)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	int i;
+
+	for (i = 0; i < adapter->vfs_allocated_count; i++)
+		igb_disable_bandwidth_allocation_vf(hw, i);
+}
+
+static void igb_enable_bandwidth_allocation_vf(struct e1000_hw *hw, int vf,
+					       unsigned int allocation)
+{
+	u32 rq;
+
+	/* Allocation is expressed as 1000ths of link speed [+]
+	 *
+	 * rq is calcualted as 1 / (allocation / 1000) = 1000 / allocation
+	 *
+	 * E1000_VMBAC_RF_INT_SHIFT and E1000_VMBAC_RF_MASK are used
+	 * to marshal the result into the desired format: 23 bits of
+	 * which 14 are to the right of the decimal point.
+	 *
+	 * [+] According to the the 82576 v2.41 datasheet rq should
+	 *     be a ratio of the link speed, however, empirically
+	 *     it appears to always be a ration of to 1Gbit/s,
+	 *     even when the link is 100Mbit/s.
+	 */
+	rq = ((1000 << E1000_VMBAC_RF_INT_SHIFT) / allocation) &
+	     E1000_VMBAC_RF_MASK;
+
+	wr32(E1000_VMBASEL, vf);
+	wr32(E1000_VMBAC, rq|E1000_VMBAC_RC_ENA);
+}
+
+static void igb_enable_bandwidth_allocation(struct igb_adapter *adapter)
+{
+	u32 i, reg;
+	struct e1000_hw *hw = &adapter->hw;
+
+	/* Only enable bandwidth_allocation if it has been set
+	 * and the link speed is 100Mbit/s or 1Gbit/s */
+	if (!adapter->bandwidth_allocation ||
+	    (adapter->link_speed != SPEED_100 &&
+	     adapter->link_speed != SPEED_1000)) {
+		igb_disable_bandwidth_allocation(adapter);
+		return;
+	}
+
+	for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		wr32(E1000_VMBASEL, i);
+		if (adapter->bandwidth_allocation[i])
+			igb_enable_bandwidth_allocation_vf(hw, i,
+					adapter->bandwidth_allocation[i]);
+		else
+			igb_disable_bandwidth_allocation_vf(hw, i);
+
+		/* XXX:
+		 *
+		 * The 82576 datasheet, section 4.5.11.1.5.1 "Configuring Tx
+		 * Bandwidth to VMs" states that the desired setting is:
+		 * VMBAMMW.MMW_SIZE = 16 * MSS
+		 *
+		 * But isn't  MSS a property of skbs that are using tso
+		 * rather than adapters?
+		 *
+		 * If so, should we use the maximum value here? */
+		/* XXX: Should this go inside or outside the for loop ? */
+		reg = 64 * 16;
+		wr32(E1000_VMBAMMW, reg);
+	}
+}
+#endif
+
+static void igb_check_bandwidth_allocation(struct igb_adapter *adapter)
+{
+#ifdef CONFIG_PCI_IOV
+	u32 vmbacs;
+	struct e1000_hw *hw = &adapter->hw;
+
+	if (!adapter->vf_data)
+		return;
+
+	/* The 82576 datasheet, section 4.5.11.1.5.2 "Link Speed Change
+	 * Procedure" describes the sequence below. However the
+	 * SPEED_CHG never seems to be set.
+	 */
+	vmbacs = rd32(E1000_VMBACS);
+	if (vmbacs & E1000_VMBACS_SPEED_CHG) {
+		/* XXX: Never seem to get here */
+		int err = 0;
+
+		if (vmbacs & E1000_VMBACS_VMBA_SET) {
+			igb_disable_bandwidth_allocation(adapter);
+			err = 1;
+		}
+
+		vmbacs &= ~E1000_VMBACS_SPEED_CHG;
+		wr32(E1000_VMBACS, vmbacs);
+
+		if (err)
+			return;
+	}
+
+	spin_lock(&adapter->bandwidth_allocation_lock);
+	igb_enable_bandwidth_allocation(adapter);
+	spin_unlock(&adapter->bandwidth_allocation_lock);
+#endif
+	return;
+}
+
 /**
  * igb_configure_tx - Configure transmit Unit after Reset
  * @adapter: board private structure
@@ -2969,6 +3117,8 @@ static void igb_watchdog_task(struct wor
 				break;
 			}
 
+			igb_check_bandwidth_allocation(adapter);
+
 			netif_carrier_on(netdev);
 
 			igb_ping_all_vfs(adapter);
@@ -5854,4 +6004,101 @@ static void igb_vmm_control(struct igb_a
 	}
 }
 
+#ifdef CONFIG_PCI_IOV
+static ssize_t igb_show_bandwidth_allocation(struct device *dev,
+					     struct device_attribute *attr,
+					     char *buf)
+{
+	struct net_device *netdev = dev_get_drvdata(dev);
+	struct igb_adapter *adapter = netdev_priv(netdev);
+	int i;
+
+	if (!adapter->vf_data)
+		return -ENOENT;
+
+	*buf = '\0';
+	for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		if (i > 0)
+			strcat(buf, " ");
+		sprintf(buf + strlen(buf), "%i",
+			adapter->bandwidth_allocation[i]);
+	}
+	strcat(buf, "\n");
+
+	return strlen(buf);
+}
+
+static unsigned long igb_strtoul(const char *cp, char **endp, unsigned int base)
+{
+	const char *orig = cp;
+	unsigned long x;
+
+	while (isspace(*cp))
+		cp++;
+
+	x = simple_strtoul(cp, endp, base);
+	if (cp == *endp)
+		*endp = (char *)orig;
+
+	return x;
+}
+
+static ssize_t igb_set_bandwidth_allocation(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct net_device *netdev = dev_get_drvdata(dev);
+	struct igb_adapter *adapter = netdev_priv(netdev);
+	int i;
+	size_t len;
+	ssize_t status = -ENOENT;
+	unsigned int *new, total;
+	unsigned long x;
+	const char *p;
+	char *next_p;
+
+	if (!adapter->vf_data)
+		return -ENOENT;
+
+	len = adapter->vfs_allocated_count * sizeof(unsigned int);
+
+	new = kmalloc(len, GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+
+	p = buf;
+	total = 0;
+	for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		x = igb_strtoul(p, &next_p, 10);
+		if (p == next_p) {
+			dev_err(dev, "not enough values\n");
+			goto err;
+		}
+		if (x > 1000) {
+			dev_err(dev, "value is too large\n");
+			goto err;
+		}
+		new[i] = x;
+		total += x;
+		p = next_p;
+	}
+
+	/* Check for trailing rubbish */
+	igb_strtoul(p, &next_p, 10);
+	if (p != next_p) {
+		dev_err(dev, "trailing rubbish\n");
+		goto err;
+	}
+
+	spin_lock(&adapter->bandwidth_allocation_lock);
+	memcpy(adapter->bandwidth_allocation, new, len);
+	igb_enable_bandwidth_allocation(adapter);
+	spin_unlock(&adapter->bandwidth_allocation_lock);
+
+	status = count;
+err:
+	kfree(new);
+	return status;
+}
+#endif /* CONFIG_PCI_IOV */
 /* igb_main.c */
Index: net-next-2.6/drivers/net/igb/e1000_regs.h
===================================================================
--- net-next-2.6.orig/drivers/net/igb/e1000_regs.h	2009-11-05 03:07:08.000000000 +0900
+++ net-next-2.6/drivers/net/igb/e1000_regs.h	2009-11-05 05:01:35.000000000 +0900
@@ -308,6 +308,16 @@
 #define E1000_VLVF(_n)         (0x05D00 + (4 * (_n))) /* VLAN Virtual Machine
                                                        * Filter - RW */
 
+/* Tx Bandwidth Allocation to VM Registers */
+#define E1000_VMBACS	0x03600 /* VM Bandwidth Allocation
+				 * Control & Status - RW */
+#define E1000_VMBAMMW	0x03670 /* VM Bandwidth Allocation
+				 * Max Memory Window - RW */
+#define E1000_VMBASEL	0x03604 /* VM Bandwidth Allocation
+				 * Select - RW */
+#define E1000_VMBAC	0x03608 /* VM Bandwidth Allocation
+				 * Config - RW */
+
 #define wr32(reg, value) (writel(value, hw->hw_addr + reg))
 #define rd32(reg) (readl(hw->hw_addr + reg))
 #define wrfl() ((void)rd32(E1000_STATUS))
Index: net-next-2.6/drivers/net/igb/e1000_defines.h
===================================================================
--- net-next-2.6.orig/drivers/net/igb/e1000_defines.h	2009-11-05 03:07:08.000000000 +0900
+++ net-next-2.6/drivers/net/igb/e1000_defines.h	2009-11-05 05:01:35.000000000 +0900
@@ -711,4 +711,13 @@
 #define E1000_VFTA_ENTRY_MASK                0x7F
 #define E1000_VFTA_ENTRY_BIT_SHIFT_MASK      0x1F
 
+/* VM Bandwidth Allocation Control & Status */
+#define E1000_VMBACS_VMBA_SET		0x00001000
+#define E1000_VMBACS_SPEED_CHG		0x80000000
+
+/* VM Bandwidth Allocation Config */
+#define E1000_VMBAC_RF_INT_SHIFT	14
+#define E1000_VMBAC_RF_MASK		((1<<23)-1)	/* RF_DEC and RF_INT */
+#define E1000_VMBAC_RC_ENA		0x80000000
+
 #endif
Index: net-next-2.6/drivers/net/igb/igb.h
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb.h	2009-11-05 03:07:08.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb.h	2009-11-05 05:01:35.000000000 +0900
@@ -315,6 +315,10 @@ struct igb_adapter {
 	u16 rx_ring_count;
 	unsigned int vfs_allocated_count;
 	struct vf_data_storage *vf_data;
+#ifdef CONFIG_PCI_IOV
+	unsigned int *bandwidth_allocation;
+	spinlock_t bandwidth_allocation_lock;
+#endif
 };
 
 #define IGB_FLAG_HAS_MSI           (1 << 0)



------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [rfc 1/4] igb: Add igb_cleanup_vf()
  2009-11-05  0:58 [rfc 0/4] igb: bandwidth allocation Simon Horman
@ 2009-11-05  0:58 ` Simon Horman
  2009-11-05  0:58 ` [rfc 2/4] igb: Initialise adapter->vfs_allocated_count in igb_init_vf() Simon Horman
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Simon Horman @ 2009-11-05  0:58 UTC (permalink / raw)
  To: e1000-devel, netdev; +Cc: Arnd Bergmann, Jeff Kirsher

[-- Attachment #1: igb_cleanup_vf.patch --]
[-- Type: text/plain, Size: 4204 bytes --]

Move virtual finction cleanup code into igb_cleanup_vf() and for the sake
of symmetry rename igb_probe_vfs() as igb_init_vf().

Although these functions aren't entirely symmetrical it should aid
maintenance by making the relationship between initialisation and cleanup
more obvious.

Note that there appears to be no way for adapter->vfs_allocated_count to be
non-zero for the case where CONFIG_PCI_IOV is not set, so reseting this
value was moved to inside the relvant #ifdef.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:38:58.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 16:36:07.000000000 +0900
@@ -87,6 +87,7 @@ void igb_update_stats(struct igb_adapter
 static int igb_probe(struct pci_dev *, const struct pci_device_id *);
 static void __devexit igb_remove(struct pci_dev *pdev);
 static int igb_sw_init(struct igb_adapter *);
+static void igb_cleanup_vf(struct igb_adapter * adapter);
 static int igb_open(struct net_device *);
 static int igb_close(struct net_device *);
 static void igb_configure_tx(struct igb_adapter *);
@@ -650,21 +651,7 @@ static void igb_set_interrupt_capability
 
 	/* If we can't do MSI-X, try MSI */
 msi_only:
-#ifdef CONFIG_PCI_IOV
-	/* disable SR-IOV for non MSI-X configurations */
-	if (adapter->vf_data) {
-		struct e1000_hw *hw = &adapter->hw;
-		/* disable iov and allow time for transactions to clear */
-		pci_disable_sriov(adapter->pdev);
-		msleep(500);
-
-		kfree(adapter->vf_data);
-		adapter->vf_data = NULL;
-		wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
-		msleep(100);
-		dev_info(&adapter->pdev->dev, "IOV Disabled\n");
-	}
-#endif
+	igb_cleanup_vf(adapter);
 	adapter->vfs_allocated_count = 0;
 	adapter->flags |= IGB_FLAG_QUEUE_PAIRS;
 	adapter->num_rx_queues = 1;
@@ -1734,7 +1721,7 @@ static void __devexit igb_remove(struct 
 }
 
 /**
- * igb_probe_vfs - Initialize vf data storage and add VFs to pci config space
+ * igb_init_vf - Initialize vf data storage and add VFs to pci config space
  * @adapter: board private structure to initialize
  *
  * This function initializes the vf specific data storage and then attempts to
@@ -1742,7 +1729,7 @@ static void __devexit igb_remove(struct 
  * mor expensive time wise to disable SR-IOV than it is to allocate and free
  * the memory for the VFs.
  **/
-static void __devinit igb_probe_vfs(struct igb_adapter * adapter)
+static void __devinit igb_init_vf(struct igb_adapter * adapter)
 {
 #ifdef CONFIG_PCI_IOV
 	struct pci_dev *pdev = adapter->pdev;
@@ -1782,6 +1769,35 @@ static void __devinit igb_probe_vfs(stru
 }
 
 /**
+ * igb_cleanup_vf - Clean up vf data and remove vfs from pci config space
+ * @adapter: board private structure to initialize
+ *
+ * This function cleans-up the vf specific data storage and then attempts to
+ * deallocate the VFs.
+ **/
+static void igb_cleanup_vf(struct igb_adapter * adapter)
+{
+#ifdef CONFIG_PCI_IOV
+	struct e1000_hw *hw = &adapter->hw;
+
+	if (!adapter->vf_data)
+		return;
+
+	/* disable iov and allow time for transactions to clear */
+	pci_disable_sriov(adapter->pdev);
+	msleep(500);
+
+	kfree(adapter->vf_data);
+	adapter->vf_data = NULL;
+	adapter->vfs_allocated_count = 0;
+
+	wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
+	msleep(100);
+	dev_info(&adapter->pdev->dev, "IOV Disabled\n");
+#endif
+}
+
+/**
  * igb_sw_init - Initialize general software structures (struct igb_adapter)
  * @adapter: board private structure to initialize
  *
@@ -1816,7 +1832,7 @@ static int __devinit igb_sw_init(struct 
 		return -ENOMEM;
 	}
 
-	igb_probe_vfs(adapter);
+	igb_init_vf(adapter);
 
 	/* Explicitly disable IRQ since the NIC can be in any state. */
 	igb_irq_disable(adapter);


------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [rfc 2/4] igb: Initialise adapter->vfs_allocated_count in igb_init_vf()
  2009-11-05  0:58 [rfc 0/4] igb: bandwidth allocation Simon Horman
  2009-11-05  0:58 ` [rfc 1/4] igb: Add igb_cleanup_vf() Simon Horman
@ 2009-11-05  0:58 ` Simon Horman
  2009-11-05  0:58 ` [rfc 3/4] igb: Common error path in igb_init_vfs() Simon Horman
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Simon Horman @ 2009-11-05  0:58 UTC (permalink / raw)
  To: e1000-devel, netdev; +Cc: Arnd Bergmann, Jeff Kirsher

[-- Attachment #1: igb_init_vf-vfs_allocated_count.patch --]
[-- Type: text/plain, Size: 4095 bytes --]

Initialise adapter->vfs_allocated_count in igb_init_vf()
and only do so if the VFs are successfully created.

This seems a lot tidier to me, for starters igb_init_vf() is no longer
spliced in two by an #ifdef just to allow vfs_allocated_count to be reset to
0 if CONFIG_PCI_IOV is unset.

This change will break things if igb_init_interrupt_scheme() relies on
adapter->vfs_allocated_count, but on inspection it appears not to.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:46:06.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 04:51:10.000000000 +0900
@@ -1723,48 +1723,48 @@ static void __devexit igb_remove(struct 
 /**
  * igb_init_vf - Initialize vf data storage and add VFs to pci config space
  * @adapter: board private structure to initialize
+ * @vfn: requested number of virtual functions
  *
  * This function initializes the vf specific data storage and then attempts to
  * allocate the VFs.  The reason for ordering it this way is because it is much
  * mor expensive time wise to disable SR-IOV than it is to allocate and free
  * the memory for the VFs.
  **/
-static void __devinit igb_init_vf(struct igb_adapter * adapter)
+static void __devinit igb_init_vf(struct igb_adapter * adapter, int vfn)
 {
 #ifdef CONFIG_PCI_IOV
 	struct pci_dev *pdev = adapter->pdev;
+	struct e1000_hw *hw = &adapter->hw;
 
-	if (adapter->vfs_allocated_count > 7)
-		adapter->vfs_allocated_count = 7;
+	if (hw->mac.type != e1000_82576 || !vfn)
+		return;
 
-	if (adapter->vfs_allocated_count) {
-		adapter->vf_data = kcalloc(adapter->vfs_allocated_count,
-		                           sizeof(struct vf_data_storage),
-		                           GFP_KERNEL);
-		/* if allocation failed then we do not support SR-IOV */
-		if (!adapter->vf_data) {
-			adapter->vfs_allocated_count = 0;
-			dev_err(&pdev->dev, "Unable to allocate memory for VF "
-			        "Data Storage\n");
-		}
+	if (vfn > 7)
+		vfn = 7;
+
+	adapter->vf_data = kcalloc(vfn, sizeof(struct vf_data_storage),
+				   GFP_KERNEL);
+	/* if allocation failed then we do not support SR-IOV */
+	if (!adapter->vf_data) {
+		dev_err(&pdev->dev, "Unable to allocate memory for VF "
+		        "Data Storage\n");
+		return;
 	}
 
-	if (pci_enable_sriov(pdev, adapter->vfs_allocated_count)) {
+	if (pci_enable_sriov(pdev, vfn)) {
 		kfree(adapter->vf_data);
 		adapter->vf_data = NULL;
-#endif /* CONFIG_PCI_IOV */
-		adapter->vfs_allocated_count = 0;
-#ifdef CONFIG_PCI_IOV
 	} else {
 		unsigned char mac_addr[ETH_ALEN];
 		int i;
-		dev_info(&pdev->dev, "%d vfs allocated\n",
-		         adapter->vfs_allocated_count);
-		for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
+		for (i = 0; i < vfn; i++) {
 			random_ether_addr(mac_addr);
 			igb_set_vf_mac(adapter, i, mac_addr);
 		}
 	}
+
+	adapter->vfs_allocated_count = vfn;
 #endif /* CONFIG_PCI_IOV */
 }
 
@@ -1821,18 +1821,13 @@ static int __devinit igb_sw_init(struct 
 	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
-#ifdef CONFIG_PCI_IOV
-	if (hw->mac.type == e1000_82576)
-		adapter->vfs_allocated_count = max_vfs;
-
-#endif /* CONFIG_PCI_IOV */
 	/* This call may decrease the number of queues */
 	if (igb_init_interrupt_scheme(adapter)) {
 		dev_err(&pdev->dev, "Unable to allocate memory for queues\n");
 		return -ENOMEM;
 	}
 
-	igb_init_vf(adapter);
+	igb_init_vf(adapter, max_vfs);
 
 	/* Explicitly disable IRQ since the NIC can be in any state. */
 	igb_irq_disable(adapter);


------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [rfc 3/4] igb: Common error path in igb_init_vfs()
  2009-11-05  0:58 [rfc 0/4] igb: bandwidth allocation Simon Horman
  2009-11-05  0:58 ` [rfc 1/4] igb: Add igb_cleanup_vf() Simon Horman
  2009-11-05  0:58 ` [rfc 2/4] igb: Initialise adapter->vfs_allocated_count in igb_init_vf() Simon Horman
@ 2009-11-05  0:58 ` Simon Horman
  2009-11-05  0:58 ` [rfc 4/4] igb: expose 82576 bandiwidth allocation Simon Horman
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Simon Horman @ 2009-11-05  0:58 UTC (permalink / raw)
  To: e1000-devel, netdev; +Cc: Arnd Bergmann, Jeff Kirsher

[-- Attachment #1: igb_init_vf-err.patch --]
[-- Type: text/plain, Size: 1927 bytes --]

Drop out into an error path on error.

This is a bit superfluous as things stand, though arguably
it already makes the code cleaner. But it should help things a lot
if igb_init_vfs() has a several things to unwind on error.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:52:16.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 16:36:01.000000000 +0900
@@ -1735,6 +1735,8 @@ static void __devinit igb_init_vf(struct
 #ifdef CONFIG_PCI_IOV
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_hw *hw = &adapter->hw;
+	unsigned char mac_addr[ETH_ALEN];
+	int i;
 
 	if (hw->mac.type != e1000_82576 || !vfn)
 		return;
@@ -1751,20 +1753,21 @@ static void __devinit igb_init_vf(struct
 		return;
 	}
 
-	if (pci_enable_sriov(pdev, vfn)) {
-		kfree(adapter->vf_data);
-		adapter->vf_data = NULL;
-	} else {
-		unsigned char mac_addr[ETH_ALEN];
-		int i;
-		dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
-		for (i = 0; i < vfn; i++) {
-			random_ether_addr(mac_addr);
-			igb_set_vf_mac(adapter, i, mac_addr);
-		}
+	if (pci_enable_sriov(pdev, vfn))
+		goto err;
+
+	dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
+	for (i = 0; i < vfn; i++) {
+		random_ether_addr(mac_addr);
+		igb_set_vf_mac(adapter, i, mac_addr);
 	}
 
 	adapter->vfs_allocated_count = vfn;
+
+	return;
+err:
+	kfree(adapter->vf_data);
+	adapter->vf_data = NULL;
 #endif /* CONFIG_PCI_IOV */
 }
 


------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [rfc 4/4] igb: expose 82576 bandiwidth allocation
  2009-11-05  0:58 [rfc 0/4] igb: bandwidth allocation Simon Horman
                   ` (2 preceding siblings ...)
  2009-11-05  0:58 ` [rfc 3/4] igb: Common error path in igb_init_vfs() Simon Horman
@ 2009-11-05  0:58 ` Simon Horman
  2009-11-05 23:00   ` Alexander Duyck
  2009-11-05  1:46 ` [rfc 0/4] igb: bandwidth allocation Jeff Kirsher
  2009-11-05 12:09 ` Andi Kleen
  5 siblings, 1 reply; 14+ messages in thread
From: Simon Horman @ 2009-11-05  0:58 UTC (permalink / raw)
  To: e1000-devel, netdev; +Cc: Arnd Bergmann, Jeff Kirsher

[-- Attachment #1: igb-ba.patch --]
[-- Type: text/plain, Size: 12335 bytes --]

The 82576 has support for bandwidth allocation to VFs.

Contrary to the documentation in the 82576 datasheet v2.41 this
appears to work as follows:

* The ratio supplied is always proportional to 1Gbit/s,
  regardless of if the link speed.
* The ratio supplied is an upper-bound on bandwidth available
  to the VF, not a minimun guarantee

This patch exposes bandwidth control to userspace through a simple
per-device (PF) sysfs file, bandwidth_allocation.

* The file contains a whitespace delimited list of values, one per VF.
* The first value corresponds to the first VF and so on.
* Valid values are integers from 0 to 1000
* A value of 0 indicates that bandwidth_allocation is disabled.
* Other values indicate the allocated bandwidth, in 1/1000ths of a gigabit/s

e.g. The following for a PF with 4 VFs allocates ~20Mbits/ to VF 1,
     ~100Mbit/s to VF 2, and leave the other 2 VFs with no allocation.

     echo "20 100 0 0" > /sys/class/net/eth3/device/bandwidth_allocation

This interface is intended to allow testing of the hardware feature.
There are ongoing discussions about how to expose this feature
to user-space in a more generic way.

Signed-off-by: Simon Horman <horms@verge.net.au>

Index: net-next-2.6/drivers/net/igb/igb_main.c
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb_main.c	2009-11-05 04:55:06.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb_main.c	2009-11-05 05:12:54.000000000 +0900
@@ -47,6 +47,9 @@
 #ifdef CONFIG_IGB_DCA
 #include <linux/dca.h>
 #endif
+#ifdef CONFIG_PCI_IOV
+#include <linux/ctype.h>
+#endif
 #include "igb.h"
 
 #define DRV_VERSION "1.3.16-k2"
@@ -152,6 +155,15 @@ static unsigned int max_vfs = 0;
 module_param(max_vfs, uint, 0);
 MODULE_PARM_DESC(max_vfs, "Maximum number of virtual functions to allocate "
                  "per physical function");
+
+static ssize_t igb_set_bandwidth_allocation(struct device *,
+					    struct device_attribute *,
+					    const char *, size_t);
+static ssize_t igb_show_bandwidth_allocation(struct device *,
+					     struct device_attribute *,
+					     char *);
+DEVICE_ATTR(bandwidth_allocation, S_IRUGO | S_IWUSR,
+	    igb_show_bandwidth_allocation, igb_set_bandwidth_allocation);
 #endif /* CONFIG_PCI_IOV */
 
 static pci_ers_result_t igb_io_error_detected(struct pci_dev *,
@@ -1754,7 +1766,18 @@ static void __devinit igb_init_vf(struct
 	}
 
 	if (pci_enable_sriov(pdev, vfn))
-		goto err;
+		goto err_kfree;
+
+	if (device_create_file(&pdev->dev, &dev_attr_bandwidth_allocation))
+		goto err_sriov;
+
+	adapter->bandwidth_allocation = kcalloc(vfn, sizeof(unsigned int),
+						GFP_KERNEL);
+	if (!adapter->bandwidth_allocation)
+		goto err_file;
+	memset(adapter->bandwidth_allocation, vfn * sizeof(unsigned int), 0);
+
+	spin_lock_init(&adapter->bandwidth_allocation_lock);
 
 	dev_info(&pdev->dev, "%d vfs allocated\n", vfn);
 	for (i = 0; i < vfn; i++) {
@@ -1765,7 +1788,11 @@ static void __devinit igb_init_vf(struct
 	adapter->vfs_allocated_count = vfn;
 
 	return;
-err:
+err_file:
+	device_remove_file(&pdev->dev, &dev_attr_bandwidth_allocation);
+err_sriov:
+	pci_disable_sriov(pdev);
+err_kfree:
 	kfree(adapter->vf_data);
 	adapter->vf_data = NULL;
 #endif /* CONFIG_PCI_IOV */
@@ -1781,6 +1808,7 @@ err:
 static void igb_cleanup_vf(struct igb_adapter * adapter)
 {
 #ifdef CONFIG_PCI_IOV
+	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_hw *hw = &adapter->hw;
 
 	if (!adapter->vf_data)
@@ -1797,6 +1825,9 @@ static void igb_cleanup_vf(struct igb_ad
 	wr32(E1000_IOVCTL, E1000_IOVCTL_REUSE_VFQ);
 	msleep(100);
 	dev_info(&adapter->pdev->dev, "IOV Disabled\n");
+
+	device_remove_file(&pdev->dev, &dev_attr_bandwidth_allocation);
+	kfree(adapter->bandwidth_allocation);
 #endif
 }
 
@@ -2088,6 +2119,123 @@ void igb_configure_tx_ring(struct igb_ad
 	wr32(E1000_TXDCTL(reg_idx), txdctl);
 }
 
+#ifdef CONFIG_PCI_IOV
+static void igb_disable_bandwidth_allocation_vf(struct e1000_hw *hw, int vf)
+{
+	wr32(E1000_VMBASEL, vf);
+	wr32(E1000_VMBAC, 0);
+}
+
+static void igb_disable_bandwidth_allocation(struct igb_adapter *adapter)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	int i;
+
+	for (i = 0; i < adapter->vfs_allocated_count; i++)
+		igb_disable_bandwidth_allocation_vf(hw, i);
+}
+
+static void igb_enable_bandwidth_allocation_vf(struct e1000_hw *hw, int vf,
+					       unsigned int allocation)
+{
+	u32 rq;
+
+	/* Allocation is expressed as 1000ths of link speed [+]
+	 *
+	 * rq is calcualted as 1 / (allocation / 1000) = 1000 / allocation
+	 *
+	 * E1000_VMBAC_RF_INT_SHIFT and E1000_VMBAC_RF_MASK are used
+	 * to marshal the result into the desired format: 23 bits of
+	 * which 14 are to the right of the decimal point.
+	 *
+	 * [+] According to the the 82576 v2.41 datasheet rq should
+	 *     be a ratio of the link speed, however, empirically
+	 *     it appears to always be a ration of to 1Gbit/s,
+	 *     even when the link is 100Mbit/s.
+	 */
+	rq = ((1000 << E1000_VMBAC_RF_INT_SHIFT) / allocation) &
+	     E1000_VMBAC_RF_MASK;
+
+	wr32(E1000_VMBASEL, vf);
+	wr32(E1000_VMBAC, rq|E1000_VMBAC_RC_ENA);
+}
+
+static void igb_enable_bandwidth_allocation(struct igb_adapter *adapter)
+{
+	u32 i, reg;
+	struct e1000_hw *hw = &adapter->hw;
+
+	/* Only enable bandwidth_allocation if it has been set
+	 * and the link speed is 100Mbit/s or 1Gbit/s */
+	if (!adapter->bandwidth_allocation ||
+	    (adapter->link_speed != SPEED_100 &&
+	     adapter->link_speed != SPEED_1000)) {
+		igb_disable_bandwidth_allocation(adapter);
+		return;
+	}
+
+	for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		wr32(E1000_VMBASEL, i);
+		if (adapter->bandwidth_allocation[i])
+			igb_enable_bandwidth_allocation_vf(hw, i,
+					adapter->bandwidth_allocation[i]);
+		else
+			igb_disable_bandwidth_allocation_vf(hw, i);
+
+		/* XXX:
+		 *
+		 * The 82576 datasheet, section 4.5.11.1.5.1 "Configuring Tx
+		 * Bandwidth to VMs" states that the desired setting is:
+		 * VMBAMMW.MMW_SIZE = 16 * MSS
+		 *
+		 * But isn't  MSS a property of skbs that are using tso
+		 * rather than adapters?
+		 *
+		 * If so, should we use the maximum value here? */
+		/* XXX: Should this go inside or outside the for loop ? */
+		reg = 64 * 16;
+		wr32(E1000_VMBAMMW, reg);
+	}
+}
+#endif
+
+static void igb_check_bandwidth_allocation(struct igb_adapter *adapter)
+{
+#ifdef CONFIG_PCI_IOV
+	u32 vmbacs;
+	struct e1000_hw *hw = &adapter->hw;
+
+	if (!adapter->vf_data)
+		return;
+
+	/* The 82576 datasheet, section 4.5.11.1.5.2 "Link Speed Change
+	 * Procedure" describes the sequence below. However the
+	 * SPEED_CHG never seems to be set.
+	 */
+	vmbacs = rd32(E1000_VMBACS);
+	if (vmbacs & E1000_VMBACS_SPEED_CHG) {
+		/* XXX: Never seem to get here */
+		int err = 0;
+
+		if (vmbacs & E1000_VMBACS_VMBA_SET) {
+			igb_disable_bandwidth_allocation(adapter);
+			err = 1;
+		}
+
+		vmbacs &= ~E1000_VMBACS_SPEED_CHG;
+		wr32(E1000_VMBACS, vmbacs);
+
+		if (err)
+			return;
+	}
+
+	spin_lock(&adapter->bandwidth_allocation_lock);
+	igb_enable_bandwidth_allocation(adapter);
+	spin_unlock(&adapter->bandwidth_allocation_lock);
+#endif
+	return;
+}
+
 /**
  * igb_configure_tx - Configure transmit Unit after Reset
  * @adapter: board private structure
@@ -2969,6 +3117,8 @@ static void igb_watchdog_task(struct wor
 				break;
 			}
 
+			igb_check_bandwidth_allocation(adapter);
+
 			netif_carrier_on(netdev);
 
 			igb_ping_all_vfs(adapter);
@@ -5854,4 +6004,101 @@ static void igb_vmm_control(struct igb_a
 	}
 }
 
+#ifdef CONFIG_PCI_IOV
+static ssize_t igb_show_bandwidth_allocation(struct device *dev,
+					     struct device_attribute *attr,
+					     char *buf)
+{
+	struct net_device *netdev = dev_get_drvdata(dev);
+	struct igb_adapter *adapter = netdev_priv(netdev);
+	int i;
+
+	if (!adapter->vf_data)
+		return -ENOENT;
+
+	*buf = '\0';
+	for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		if (i > 0)
+			strcat(buf, " ");
+		sprintf(buf + strlen(buf), "%i",
+			adapter->bandwidth_allocation[i]);
+	}
+	strcat(buf, "\n");
+
+	return strlen(buf);
+}
+
+static unsigned long igb_strtoul(const char *cp, char **endp, unsigned int base)
+{
+	const char *orig = cp;
+	unsigned long x;
+
+	while (isspace(*cp))
+		cp++;
+
+	x = simple_strtoul(cp, endp, base);
+	if (cp == *endp)
+		*endp = (char *)orig;
+
+	return x;
+}
+
+static ssize_t igb_set_bandwidth_allocation(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct net_device *netdev = dev_get_drvdata(dev);
+	struct igb_adapter *adapter = netdev_priv(netdev);
+	int i;
+	size_t len;
+	ssize_t status = -ENOENT;
+	unsigned int *new, total;
+	unsigned long x;
+	const char *p;
+	char *next_p;
+
+	if (!adapter->vf_data)
+		return -ENOENT;
+
+	len = adapter->vfs_allocated_count * sizeof(unsigned int);
+
+	new = kmalloc(len, GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+
+	p = buf;
+	total = 0;
+	for (i = 0; i < adapter->vfs_allocated_count; i++) {
+		x = igb_strtoul(p, &next_p, 10);
+		if (p == next_p) {
+			dev_err(dev, "not enough values\n");
+			goto err;
+		}
+		if (x > 1000) {
+			dev_err(dev, "value is too large\n");
+			goto err;
+		}
+		new[i] = x;
+		total += x;
+		p = next_p;
+	}
+
+	/* Check for trailing rubbish */
+	igb_strtoul(p, &next_p, 10);
+	if (p != next_p) {
+		dev_err(dev, "trailing rubbish\n");
+		goto err;
+	}
+
+	spin_lock(&adapter->bandwidth_allocation_lock);
+	memcpy(adapter->bandwidth_allocation, new, len);
+	igb_enable_bandwidth_allocation(adapter);
+	spin_unlock(&adapter->bandwidth_allocation_lock);
+
+	status = count;
+err:
+	kfree(new);
+	return status;
+}
+#endif /* CONFIG_PCI_IOV */
 /* igb_main.c */
Index: net-next-2.6/drivers/net/igb/e1000_regs.h
===================================================================
--- net-next-2.6.orig/drivers/net/igb/e1000_regs.h	2009-11-05 03:07:08.000000000 +0900
+++ net-next-2.6/drivers/net/igb/e1000_regs.h	2009-11-05 05:01:35.000000000 +0900
@@ -308,6 +308,16 @@
 #define E1000_VLVF(_n)         (0x05D00 + (4 * (_n))) /* VLAN Virtual Machine
                                                        * Filter - RW */
 
+/* Tx Bandwidth Allocation to VM Registers */
+#define E1000_VMBACS	0x03600 /* VM Bandwidth Allocation
+				 * Control & Status - RW */
+#define E1000_VMBAMMW	0x03670 /* VM Bandwidth Allocation
+				 * Max Memory Window - RW */
+#define E1000_VMBASEL	0x03604 /* VM Bandwidth Allocation
+				 * Select - RW */
+#define E1000_VMBAC	0x03608 /* VM Bandwidth Allocation
+				 * Config - RW */
+
 #define wr32(reg, value) (writel(value, hw->hw_addr + reg))
 #define rd32(reg) (readl(hw->hw_addr + reg))
 #define wrfl() ((void)rd32(E1000_STATUS))
Index: net-next-2.6/drivers/net/igb/e1000_defines.h
===================================================================
--- net-next-2.6.orig/drivers/net/igb/e1000_defines.h	2009-11-05 03:07:08.000000000 +0900
+++ net-next-2.6/drivers/net/igb/e1000_defines.h	2009-11-05 05:01:35.000000000 +0900
@@ -711,4 +711,13 @@
 #define E1000_VFTA_ENTRY_MASK                0x7F
 #define E1000_VFTA_ENTRY_BIT_SHIFT_MASK      0x1F
 
+/* VM Bandwidth Allocation Control & Status */
+#define E1000_VMBACS_VMBA_SET		0x00001000
+#define E1000_VMBACS_SPEED_CHG		0x80000000
+
+/* VM Bandwidth Allocation Config */
+#define E1000_VMBAC_RF_INT_SHIFT	14
+#define E1000_VMBAC_RF_MASK		((1<<23)-1)	/* RF_DEC and RF_INT */
+#define E1000_VMBAC_RC_ENA		0x80000000
+
 #endif
Index: net-next-2.6/drivers/net/igb/igb.h
===================================================================
--- net-next-2.6.orig/drivers/net/igb/igb.h	2009-11-05 03:07:08.000000000 +0900
+++ net-next-2.6/drivers/net/igb/igb.h	2009-11-05 05:01:35.000000000 +0900
@@ -315,6 +315,10 @@ struct igb_adapter {
 	u16 rx_ring_count;
 	unsigned int vfs_allocated_count;
 	struct vf_data_storage *vf_data;
+#ifdef CONFIG_PCI_IOV
+	unsigned int *bandwidth_allocation;
+	spinlock_t bandwidth_allocation_lock;
+#endif
 };
 
 #define IGB_FLAG_HAS_MSI           (1 << 0)


------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 0/4] igb: bandwidth allocation
  2009-11-05  0:58 [rfc 0/4] igb: bandwidth allocation Simon Horman
                   ` (3 preceding siblings ...)
  2009-11-05  0:58 ` [rfc 4/4] igb: expose 82576 bandiwidth allocation Simon Horman
@ 2009-11-05  1:46 ` Jeff Kirsher
  2009-11-05  2:21   ` Simon Horman
  2009-11-05 12:09 ` Andi Kleen
  5 siblings, 1 reply; 14+ messages in thread
From: Jeff Kirsher @ 2009-11-05  1:46 UTC (permalink / raw)
  To: Simon Horman; +Cc: e1000-devel, netdev, Arnd Bergmann

On Wed, Nov 4, 2009 at 16:58, Simon Horman <horms@verge.net.au> wrote:
> Hi,
>
> this series of patches exposes the bandwidth allocation
> hardware support of the Intel 82576. It does so through
> a rather hackish sysfs entry. That interface is just intended
> for testing so that the exposed hardware feature can
> be exercised. I would like to find a generic way to expose
> this feature to user-space.
>

Thanks Simon.  I have add the 4 patch series to my tree for testing.

-- 
Cheers,
Jeff

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 0/4] igb: bandwidth allocation
  2009-11-05  1:46 ` [rfc 0/4] igb: bandwidth allocation Jeff Kirsher
@ 2009-11-05  2:21   ` Simon Horman
  2009-11-14  8:01     ` Jeff Kirsher
  0 siblings, 1 reply; 14+ messages in thread
From: Simon Horman @ 2009-11-05  2:21 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: e1000-devel, netdev, Alexander Duyck, Arnd Bergmann

On Wed, Nov 04, 2009 at 05:46:50PM -0800, Jeff Kirsher wrote:
> On Wed, Nov 4, 2009 at 16:58, Simon Horman <horms@verge.net.au> wrote:
> > Hi,
> >
> > this series of patches exposes the bandwidth allocation
> > hardware support of the Intel 82576. It does so through
> > a rather hackish sysfs entry. That interface is just intended
> > for testing so that the exposed hardware feature can
> > be exercised. I would like to find a generic way to expose
> > this feature to user-space.
> >
> 
> Thanks Simon.  I have add the 4 patch series to my tree for testing.

Thanks. I wanted to get the code out rather than sitting on it
for lack of a better user-space interface. Although there
is a lot of fluff the actual register twiddling for
bandwidth allocation turned out to be quite simple.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 0/4] igb: bandwidth allocation
  2009-11-05  0:58 [rfc 0/4] igb: bandwidth allocation Simon Horman
                   ` (4 preceding siblings ...)
  2009-11-05  1:46 ` [rfc 0/4] igb: bandwidth allocation Jeff Kirsher
@ 2009-11-05 12:09 ` Andi Kleen
  5 siblings, 0 replies; 14+ messages in thread
From: Andi Kleen @ 2009-11-05 12:09 UTC (permalink / raw)
  To: Simon Horman; +Cc: e1000-devel, netdev, Arnd Bergmann, Jeff Kirsher

Simon Horman <horms@verge.net.au> writes:

> Hi,
>
> this series of patches exposes the bandwidth allocation
> hardware support of the Intel 82576. It does so through
> a rather hackish sysfs entry. That interface is just intended
> for testing so that the exposed hardware feature can
> be exercised. I would like to find a generic way to expose
> this feature to user-space.

It would be cool if you had a interface that did a software
fallback for NICs that don't support this.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 4/4] igb: expose 82576 bandiwidth allocation
  2009-11-05  0:58 ` [rfc 4/4] igb: expose 82576 bandiwidth allocation Simon Horman
@ 2009-11-05 23:00   ` Alexander Duyck
  2009-11-05 23:30     ` Simon Horman
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Duyck @ 2009-11-05 23:00 UTC (permalink / raw)
  To: Simon Horman; +Cc: e1000-devel, netdev, Kirsher, Jeffrey T, Arnd Bergmann

Simon Horman wrote:
> The 82576 has support for bandwidth allocation to VFs.
> 
> Contrary to the documentation in the 82576 datasheet v2.41 this
> appears to work as follows:
> 
> * The ratio supplied is always proportional to 1Gbit/s,
>   regardless of if the link speed.
> * The ratio supplied is an upper-bound on bandwidth available
>   to the VF, not a minimun guarantee
> 
> This patch exposes bandwidth control to userspace through a simple
> per-device (PF) sysfs file, bandwidth_allocation.
> 
> * The file contains a whitespace delimited list of values, one per VF.
> * The first value corresponds to the first VF and so on.
> * Valid values are integers from 0 to 1000
> * A value of 0 indicates that bandwidth_allocation is disabled.
> * Other values indicate the allocated bandwidth, in 1/1000ths of a gigabit/s
> 
> e.g. The following for a PF with 4 VFs allocates ~20Mbits/ to VF 1,
>      ~100Mbit/s to VF 2, and leave the other 2 VFs with no allocation.
> 
>      echo "20 100 0 0" > /sys/class/net/eth3/device/bandwidth_allocation
> 
> This interface is intended to allow testing of the hardware feature.
> There are ongoing discussions about how to expose this feature
> to user-space in a more generic way.
> 
> Signed-off-by: Simon Horman <horms@verge.net.au>
> 

Of the patches it looks like the only one that really has any issues is 
this one and it is mostly due to the sysfs implementation.  The others I 
would say can be applied and pushed up into the net-next-2.6 tree.

We're currently working on an iproute2 based solution for configuring 
VFs and can incorporate this functionality into it at some point in the 
future.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 4/4] igb: expose 82576 bandiwidth allocation
  2009-11-05 23:00   ` Alexander Duyck
@ 2009-11-05 23:30     ` Simon Horman
  2009-11-05 23:42       ` Alexander Duyck
  0 siblings, 1 reply; 14+ messages in thread
From: Simon Horman @ 2009-11-05 23:30 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: e1000-devel, netdev, Arnd Bergmann, Kirsher, Jeffrey T

On Thu, Nov 05, 2009 at 03:00:02PM -0800, Alexander Duyck wrote:
> Simon Horman wrote:
> >The 82576 has support for bandwidth allocation to VFs.
> >
> >Contrary to the documentation in the 82576 datasheet v2.41 this
> >appears to work as follows:
> >
> >* The ratio supplied is always proportional to 1Gbit/s,
> >  regardless of if the link speed.
> >* The ratio supplied is an upper-bound on bandwidth available
> >  to the VF, not a minimun guarantee
> >
> >This patch exposes bandwidth control to userspace through a simple
> >per-device (PF) sysfs file, bandwidth_allocation.
> >
> >* The file contains a whitespace delimited list of values, one per VF.
> >* The first value corresponds to the first VF and so on.
> >* Valid values are integers from 0 to 1000
> >* A value of 0 indicates that bandwidth_allocation is disabled.
> >* Other values indicate the allocated bandwidth, in 1/1000ths of a gigabit/s
> >
> >e.g. The following for a PF with 4 VFs allocates ~20Mbits/ to VF 1,
> >     ~100Mbit/s to VF 2, and leave the other 2 VFs with no allocation.
> >
> >     echo "20 100 0 0" > /sys/class/net/eth3/device/bandwidth_allocation
> >
> >This interface is intended to allow testing of the hardware feature.
> >There are ongoing discussions about how to expose this feature
> >to user-space in a more generic way.
> >
> >Signed-off-by: Simon Horman <horms@verge.net.au>
> >
> 
> Of the patches it looks like the only one that really has any issues
> is this one and it is mostly due to the sysfs implementation.  The
> others I would say can be applied and pushed up into the
> net-next-2.6 tree.

Thanks, I suspected as much.

> We're currently working on an iproute2 based solution for
> configuring VFs and can incorporate this functionality into it at
> some point in the future.

Do you have any pointers to discussions relating to that interface.
Do you think it would be worth putting in the sysfs interface in the
mean-time, or would you rather wait?

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 4/4] igb: expose 82576 bandiwidth allocation
  2009-11-05 23:30     ` Simon Horman
@ 2009-11-05 23:42       ` Alexander Duyck
  2009-11-06  3:57         ` Simon Horman
  0 siblings, 1 reply; 14+ messages in thread
From: Alexander Duyck @ 2009-11-05 23:42 UTC (permalink / raw)
  To: Simon Horman; +Cc: e1000-devel, netdev, Kirsher, Jeffrey T, Arnd Bergmann

Simon Horman wrote:
> On Thu, Nov 05, 2009 at 03:00:02PM -0800, Alexander Duyck wrote:
>> Simon Horman wrote:
>>> The 82576 has support for bandwidth allocation to VFs.
>>>
>>> Contrary to the documentation in the 82576 datasheet v2.41 this
>>> appears to work as follows:
>>>
>>> * The ratio supplied is always proportional to 1Gbit/s,
>>>  regardless of if the link speed.
>>> * The ratio supplied is an upper-bound on bandwidth available
>>>  to the VF, not a minimun guarantee
>>>
>>> This patch exposes bandwidth control to userspace through a simple
>>> per-device (PF) sysfs file, bandwidth_allocation.
>>>
>>> * The file contains a whitespace delimited list of values, one per VF.
>>> * The first value corresponds to the first VF and so on.
>>> * Valid values are integers from 0 to 1000
>>> * A value of 0 indicates that bandwidth_allocation is disabled.
>>> * Other values indicate the allocated bandwidth, in 1/1000ths of a gigabit/s
>>>
>>> e.g. The following for a PF with 4 VFs allocates ~20Mbits/ to VF 1,
>>>     ~100Mbit/s to VF 2, and leave the other 2 VFs with no allocation.
>>>
>>>     echo "20 100 0 0" > /sys/class/net/eth3/device/bandwidth_allocation
>>>
>>> This interface is intended to allow testing of the hardware feature.
>>> There are ongoing discussions about how to expose this feature
>>> to user-space in a more generic way.
>>>
>>> Signed-off-by: Simon Horman <horms@verge.net.au>
>>>
>> Of the patches it looks like the only one that really has any issues
>> is this one and it is mostly due to the sysfs implementation.  The
>> others I would say can be applied and pushed up into the
>> net-next-2.6 tree.
> 
> Thanks, I suspected as much.
> 
>> We're currently working on an iproute2 based solution for
>> configuring VFs and can incorporate this functionality into it at
>> some point in the future.
> 
> Do you have any pointers to discussions relating to that interface.
> Do you think it would be worth putting in the sysfs interface in the
> mean-time, or would you rather wait?

I'm not the one working on the interface so I don't know much about it 
other than the fact it is being worked on.  Hopefully we should see 
something in regards to that soon though.

If anything it might be of some use to split this up into 2 patches. 
One that contains the sysfs bits, and another for enabling the bandwidth 
control registers.  We won't be able to get the sysfs interface accepted 
upstream so there isn't much point in us keeping it around for any other 
purpose than testing to verify the registers work as you have described.

Thanks,

Alex

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 4/4] igb: expose 82576 bandiwidth allocation
  2009-11-05 23:42       ` Alexander Duyck
@ 2009-11-06  3:57         ` Simon Horman
  0 siblings, 0 replies; 14+ messages in thread
From: Simon Horman @ 2009-11-06  3:57 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: e1000-devel, netdev, Kirsher, Jeffrey T, Arnd Bergmann

On Thu, Nov 05, 2009 at 03:42:28PM -0800, Alexander Duyck wrote:
> Simon Horman wrote:
> >On Thu, Nov 05, 2009 at 03:00:02PM -0800, Alexander Duyck wrote:
> >>Simon Horman wrote:
> >>>The 82576 has support for bandwidth allocation to VFs.
> >>>
> >>>Contrary to the documentation in the 82576 datasheet v2.41 this
> >>>appears to work as follows:
> >>>
> >>>* The ratio supplied is always proportional to 1Gbit/s,
> >>> regardless of if the link speed.
> >>>* The ratio supplied is an upper-bound on bandwidth available
> >>> to the VF, not a minimun guarantee
> >>>
> >>>This patch exposes bandwidth control to userspace through a simple
> >>>per-device (PF) sysfs file, bandwidth_allocation.
> >>>
> >>>* The file contains a whitespace delimited list of values, one per VF.
> >>>* The first value corresponds to the first VF and so on.
> >>>* Valid values are integers from 0 to 1000
> >>>* A value of 0 indicates that bandwidth_allocation is disabled.
> >>>* Other values indicate the allocated bandwidth, in 1/1000ths of a gigabit/s
> >>>
> >>>e.g. The following for a PF with 4 VFs allocates ~20Mbits/ to VF 1,
> >>>    ~100Mbit/s to VF 2, and leave the other 2 VFs with no allocation.
> >>>
> >>>    echo "20 100 0 0" > /sys/class/net/eth3/device/bandwidth_allocation
> >>>
> >>>This interface is intended to allow testing of the hardware feature.
> >>>There are ongoing discussions about how to expose this feature
> >>>to user-space in a more generic way.
> >>>
> >>>Signed-off-by: Simon Horman <horms@verge.net.au>
> >>>
> >>Of the patches it looks like the only one that really has any issues
> >>is this one and it is mostly due to the sysfs implementation.  The
> >>others I would say can be applied and pushed up into the
> >>net-next-2.6 tree.
> >
> >Thanks, I suspected as much.
> >
> >>We're currently working on an iproute2 based solution for
> >>configuring VFs and can incorporate this functionality into it at
> >>some point in the future.
> >
> >Do you have any pointers to discussions relating to that interface.
> >Do you think it would be worth putting in the sysfs interface in the
> >mean-time, or would you rather wait?
> 
> I'm not the one working on the interface so I don't know much about
> it other than the fact it is being worked on.  Hopefully we should
> see something in regards to that soon though.

Great, I look forward to hearing something soon.

> If anything it might be of some use to split this up into 2 patches.
> One that contains the sysfs bits, and another for enabling the
> bandwidth control registers.  We won't be able to get the sysfs
> interface accepted upstream so there isn't much point in us keeping
> it around for any other purpose than testing to verify the registers
> work as you have described.

Understood with regards to the sysfs interface being umergable.
Do you think the non-sysfs portion would be acceptable even
though there would be no in-tree callers?


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 0/4] igb: bandwidth allocation
  2009-11-05  2:21   ` Simon Horman
@ 2009-11-14  8:01     ` Jeff Kirsher
  2009-11-25  6:31       ` Simon Horman
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Kirsher @ 2009-11-14  8:01 UTC (permalink / raw)
  To: Simon Horman; +Cc: e1000-devel, netdev, Arnd Bergmann

On Wed, Nov 4, 2009 at 18:21, Simon Horman <horms@verge.net.au> wrote:
> On Wed, Nov 04, 2009 at 05:46:50PM -0800, Jeff Kirsher wrote:
>> On Wed, Nov 4, 2009 at 16:58, Simon Horman <horms@verge.net.au> wrote:
>> > Hi,
>> >
>> > this series of patches exposes the bandwidth allocation
>> > hardware support of the Intel 82576. It does so through
>> > a rather hackish sysfs entry. That interface is just intended
>> > for testing so that the exposed hardware feature can
>> > be exercised. I would like to find a generic way to expose
>> > this feature to user-space.
>> >
>>
>> Thanks Simon.  I have add the 4 patch series to my tree for testing.
>
> Thanks. I wanted to get the code out rather than sitting on it
> for lack of a better user-space interface. Although there
> is a lot of fluff the actual register twiddling for
> bandwidth allocation turned out to be quite simple.
>

Simon -
After doing some testing on the series of patches, we are getting a
panic with these patches applied to net-next.  I have provided below
the panic we saw, right now we have a large patch load so a bisect
will have to wait.  Hopefully with time permitting, we will be able to
revisit these patches soon.

igbvf 0000:02:10.0: PF still resetting
igbvf 0000:02:10.2: PF still resetting
igbvf 0000:02:10.4: PF still resetting
igbvf 0000:02:10.6: PF still resetting
igbvf 0000:02:11.0: PF still resetting
igbvf 0000:02:11.2: PF still resetting
igbvf 0000:02:11.4: PF still resetting
igbvf 0000:02:10.1: PF still resetting
igbvf 0000:02:10.3: PF still resetting
igbvf 0000:02:10.5: PF still resetting
igbvf 0000:02:10.7: PF still resetting
igbvf 0000:02:11.1: PF still resetting
igbvf 0000:02:11.3: PF still resetting
igbvf 0000:02:11.5: PF still resetting
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa0087d3c>] igb_xmit_frame_ring_adv+0x1d/0x71a [igb] PGD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC last sysfs file:
/sys/bus/pci/slots/1-1/address CPU 1 Modules linked in: igbvf igb ipv6
autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table video output
sbs sbshc joydev e1000e e1000 i2c_i801 shpchp i2c_core pcspkr iTCO_wdt
iTCO_vendor_support usb_storage [last unloaded: igb]
Pid: 52, comm: events/1 Not tainted 2.6.32-rc3-net-next-vf-tag111109 #1 S5520HC
RIP: 0010:[<ffffffffa0087d3c>]  [<ffffffffa0087d3c>]
igb_xmit_frame_ring_adv+0x1d/0x71a [igb]
RSP: 0018:ffff8801f8803aa0  EFLAGS: 00010286
RAX: 0000000000000001 RBX: 0000000000000600 RCX: 0000000000000000
RDX: ffff8803609e4700 RSI: 0000000000000000 RDI: ffff88036559d500
RBP: ffff8801f8803b10 R08: 0000000000000000 R09: ffff8801f8994018
R10: ffff88036559d500 R11: ffffffff81e9caf8 R12: ffff88036559d500
R13: ffff88036559d500 R14: 0000000000000000 R15: ffff88036791a4c0
FS:  0000000000000000(0000) GS:ffff8801f8800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process events/1 (pid: 52, threadinfo ffff8801e91a6000, task
ffff8801e91abfc0)
Stack:
 ffff8801e91abfc0 0000000000000000 0000000000000001 0000000000000000
<0> ffff8801f8803b20 0000000000000246 0000000000000000
ffffffff812f7558 <0> ffff880100000000 ffff8803609f4e00
ffff88036559d500 ffff8803609e4000 Call Trace:
 <IRQ>
 [<ffffffff812f7558>] ? __netif_tx_lock+0x16/0x1f
[<ffffffffa008c130>] igb_xmit_frame_adv+0x3a/0x3c [igb]
[<ffffffff812e55d7>] dev_hard_start_xmit+0x1d1/0x27e
[<ffffffff812f7cc2>] sch_direct_xmit+0x68/0x14e  [<ffffffff812e59bd>]
dev_queue_xmit+0x22e/0x378  [<ffffffff812e588d>] ?
dev_queue_xmit+0xfe/0x378  [<ffffffff8104b565>] ?
_local_bh_enable_ip+0x9c/0xa7  [<ffffffff812eb72c>]
neigh_resolve_output+0x1f0/0x225  [<ffffffffa013a3be>]
ip6_output_finish+0x6f/0xd6 [ipv6]  [<ffffffffa013adbb>]
ip6_output2+0x328/0x337 [ipv6]  [<ffffffffa013bcce>]
ip6_output+0xf04/0xf28 [ipv6]  [<ffffffffa01547af>] dst_output+0xb/0xd
[ipv6]  [<ffffffffa0155ea7>] mld_sendpack+0x2aa/0x48f [ipv6]
[<ffffffffa0157001>] mld_ifc_timer_expire+0x1c8/0x201 [ipv6]
[<ffffffffa0156e39>] ? mld_ifc_timer_expire+0x0/0x201 [ipv6]
[<ffffffff81051667>] run_timer_softirq+0x1b0/0x26b
[<ffffffff810515da>] ? run_timer_softirq+0x123/0x26b
[<ffffffff8106681b>] ? ktime_get+0x73/0x8e  [<ffffffff8106a79e>] ?
tick_dev_program_event+0x2a/0x9c  [<ffffffff8104bcf1>]
__do_softirq+0xd5/0x19d  [<ffffffff8100cc9c>] call_softirq+0x1c/0x34
[<ffffffff8100e2d8>] do_softirq+0x33/0x6b  [<ffffffff8104b959>]
irq_exit+0x36/0x8a  [<ffffffff81020129>]
smp_apic_timer_interrupt+0x78/0x87
 [<ffffffff8100c673>] apic_timer_interrupt+0x13/0x20  <EOI>
[<ffffffff8130dec4>] ? rt_cache_flush+0x154/0x169
[<ffffffff8133feeb>] ? fib_netdev_event+0x84/0x8b
[<ffffffff81382b1c>] ? notifier_call_chain+0x33/0x5b
[<ffffffff812f0fd5>] ? linkwatch_event+0x0/0x31  [<ffffffff81062679>]
? __raw_notifier_call_chain+0x9/0xb  [<ffffffff8106268a>] ?
raw_notifier_call_chain+0xf/0x11  [<ffffffff812e5b9d>] ?
call_netdevice_notifiers+0x16/0x18


-- 
Cheers,
Jeff

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [rfc 0/4] igb: bandwidth allocation
  2009-11-14  8:01     ` Jeff Kirsher
@ 2009-11-25  6:31       ` Simon Horman
  0 siblings, 0 replies; 14+ messages in thread
From: Simon Horman @ 2009-11-25  6:31 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: e1000-devel, netdev, Alexander Duyck, Arnd Bergmann

On Sat, Nov 14, 2009 at 12:01:32AM -0800, Jeff Kirsher wrote:
> On Wed, Nov 4, 2009 at 18:21, Simon Horman <horms@verge.net.au> wrote:
> > On Wed, Nov 04, 2009 at 05:46:50PM -0800, Jeff Kirsher wrote:
> >> On Wed, Nov 4, 2009 at 16:58, Simon Horman <horms@verge.net.au> wrote:
> >> > Hi,
> >> >
> >> > this series of patches exposes the bandwidth allocation
> >> > hardware support of the Intel 82576. It does so through
> >> > a rather hackish sysfs entry. That interface is just intended
> >> > for testing so that the exposed hardware feature can
> >> > be exercised. I would like to find a generic way to expose
> >> > this feature to user-space.
> >> >
> >>
> >> Thanks Simon.  I have add the 4 patch series to my tree for testing.
> >
> > Thanks. I wanted to get the code out rather than sitting on it
> > for lack of a better user-space interface. Although there
> > is a lot of fluff the actual register twiddling for
> > bandwidth allocation turned out to be quite simple.
> >
> 
> Simon -
> After doing some testing on the series of patches, we are getting a
> panic with these patches applied to net-next.  I have provided below
> the panic we saw, right now we have a large patch load so a bisect
> will have to wait.  Hopefully with time permitting, we will be able to
> revisit these patches soon.

Hi Jeff,

sorry for not getting back to you earlier, I've been caught up with
family matters for the past few weeks (my wife had a baby!).

It seems that the problem was caused by the second patch in the series
moving the initialisation of adapter->vfs_allocated_count.
I will submit a fresh patch series to resolve this and some
other minor problems.



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2009-11-25  6:31 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-05  0:58 [rfc 0/4] igb: bandwidth allocation Simon Horman
2009-11-05  0:58 ` [rfc 1/4] igb: Add igb_cleanup_vf() Simon Horman
2009-11-05  0:58 ` [rfc 2/4] igb: Initialise adapter->vfs_allocated_count in igb_init_vf() Simon Horman
2009-11-05  0:58 ` [rfc 3/4] igb: Common error path in igb_init_vfs() Simon Horman
2009-11-05  0:58 ` [rfc 4/4] igb: expose 82576 bandiwidth allocation Simon Horman
2009-11-05 23:00   ` Alexander Duyck
2009-11-05 23:30     ` Simon Horman
2009-11-05 23:42       ` Alexander Duyck
2009-11-06  3:57         ` Simon Horman
2009-11-05  1:46 ` [rfc 0/4] igb: bandwidth allocation Jeff Kirsher
2009-11-05  2:21   ` Simon Horman
2009-11-14  8:01     ` Jeff Kirsher
2009-11-25  6:31       ` Simon Horman
2009-11-05 12:09 ` Andi Kleen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.