netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net-next 00/14][pull request] Intel Wired LAN Driver Updates
@ 2012-03-13  4:03 Jeff Kirsher
  2012-03-13  4:03 ` [net-next 01/14] igb: fix ethtool offline test Jeff Kirsher
                   ` (14 more replies)
  0 siblings, 15 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, sassmann

This series of patches contains fixes/cleanups igb, ixgbe and net.
Majority of the patches are against ixgbe and this series is part
one of three to update ixgbe.

The following are changes since commit f124488e4713dc9afa2028553261b1d399286e68:
  bnx2x: code doesn't use stats for allocating Rx BDs
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master

Alexander Duyck (12):
  ixgbe: add support for byte queue limits
  net: Fix issue with netdev_tx_reset_queue not resetting queue from
    XOFF state
  net: Add memory barriers to prevent possible race in byte queue
    limits
  ixgbe: Do no clear Tx status bits since eop_desc provides enough info
  ixgbe: Reorder adapter contents for better cache utilization
  ixgbe: Address issues with Tx WHTRESH value not being set correctly
  ixgbe: Correct Adaptive Interrupt Moderation so that it will change
    values
  ixgbe: Default to queue pairs when number of queues is less than CPUs
  ixgbe: Drop unnecessary napi_schedule_prep and spare blank line from
    ixgbe_intr
  ixgbe: Allocate rings as part of the q_vector
  ixgbe: Add iterator for cycling through rings on a q_vector
  ixgbe: Simplify logic for ethtool loopback frame creation and testing

Jeff Kirsher (2):
  igb: fix ethtool offline test
  ixgbe: remove tie between NAPI work limits and interrupt moderation

 drivers/net/ethernet/intel/igb/igb_ethtool.c     |    7 +
 drivers/net/ethernet/intel/igb/igb_main.c        |    3 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  115 +++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |   71 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |  547 +++++++++++-----------
 include/linux/netdevice.h                        |   50 ++-
 6 files changed, 409 insertions(+), 384 deletions(-)

-- 
1.7.7.6

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [net-next 01/14] igb: fix ethtool offline test
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 02/14] ixgbe: remove tie between NAPI work limits and interrupt moderation Jeff Kirsher
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, sassmann, Eric Dumazet, John Fastabend

A bug was introduced with the following patch:

  Commmit bdbc063129e811264cd6c311d8c2d9b95de01231
  Author: Eric Dumazet <eric.dumazet@gmail.com>
  igb: Add support for byte queue limits.

The ethtool offline tests will cause a perpetual link flap, this
is because the tests also need to account for byte queue limits (BQL).

CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by:  Jeff Pieper  <jeffrey.e.pieper@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_ethtool.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index aa399a8..e10821a 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -1577,7 +1577,9 @@ static int igb_clean_test_rings(struct igb_ring *rx_ring,
 	union e1000_adv_rx_desc *rx_desc;
 	struct igb_rx_buffer *rx_buffer_info;
 	struct igb_tx_buffer *tx_buffer_info;
+	struct netdev_queue *txq;
 	u16 rx_ntc, tx_ntc, count = 0;
+	unsigned int total_bytes = 0, total_packets = 0;
 
 	/* initialize next to clean and descriptor values */
 	rx_ntc = rx_ring->next_to_clean;
@@ -1601,6 +1603,8 @@ static int igb_clean_test_rings(struct igb_ring *rx_ring,
 
 		/* unmap buffer on tx side */
 		tx_buffer_info = &tx_ring->tx_buffer_info[tx_ntc];
+		total_bytes += tx_buffer_info->bytecount;
+		total_packets += tx_buffer_info->gso_segs;
 		igb_unmap_and_free_tx_resource(tx_ring, tx_buffer_info);
 
 		/* increment rx/tx next to clean counters */
@@ -1615,6 +1619,9 @@ static int igb_clean_test_rings(struct igb_ring *rx_ring,
 		rx_desc = IGB_RX_DESC(rx_ring, rx_ntc);
 	}
 
+	txq = netdev_get_tx_queue(tx_ring->netdev, tx_ring->queue_index);
+	netdev_tx_completed_queue(txq, total_packets, total_bytes);
+
 	/* re-map buffers to ring, store next to clean values */
 	igb_alloc_rx_buffers(rx_ring, count);
 	rx_ring->next_to_clean = rx_ntc;
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 02/14] ixgbe: remove tie between NAPI work limits and interrupt moderation
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
  2012-03-13  4:03 ` [net-next 01/14] igb: fix ethtool offline test Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 03/14] ixgbe: add support for byte queue limits Jeff Kirsher
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, sassmann, Ben Hutchings

As noted by Ben Hutchings and David Miller, work limits for NAPI
should not be tied to interrupt moderation parameters.  This
should be handled by NAPI, possibly through sysfs.

Neil Horman & Stephen Hemminger are working on a solution for
NAPI currently.  In the meantime, remove this tie between
work limits and interrupt moderation.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 8e2e473..4846206 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -2108,8 +2108,6 @@ static int ixgbe_get_coalesce(struct net_device *netdev,
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 
-	ec->tx_max_coalesced_frames_irq = adapter->tx_work_limit;
-
 	/* only valid if in constant ITR mode */
 	if (adapter->rx_itr_setting <= 1)
 		ec->rx_coalesce_usecs = adapter->rx_itr_setting;
@@ -2177,9 +2175,6 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
 	    && ec->tx_coalesce_usecs)
 		return -EINVAL;
 
-	if (ec->tx_max_coalesced_frames_irq)
-		adapter->tx_work_limit = ec->tx_max_coalesced_frames_irq;
-
 	if ((ec->rx_coalesce_usecs > (IXGBE_MAX_EITR >> 2)) ||
 	    (ec->tx_coalesce_usecs > (IXGBE_MAX_EITR >> 2)))
 		return -EINVAL;
@@ -2214,7 +2209,6 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
 
 	for (i = 0; i < num_vectors; i++) {
 		q_vector = adapter->q_vector[i];
-		q_vector->tx.work_limit = adapter->tx_work_limit;
 		if (q_vector->tx.count && !q_vector->rx.count)
 			/* tx only */
 			q_vector->itr = tx_itr_param;
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 03/14] ixgbe: add support for byte queue limits
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
  2012-03-13  4:03 ` [net-next 01/14] igb: fix ethtool offline test Jeff Kirsher
  2012-03-13  4:03 ` [net-next 02/14] ixgbe: remove tie between NAPI work limits and interrupt moderation Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state Jeff Kirsher
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann

From: Alexander Duyck <alexander.h.duyck@intel.com>

This adds support for byte queue limits (BQL).

Based on patch from Eric Dumazet for igb.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    5 +++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    7 +++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 2807a25..f05bfdb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -643,4 +643,9 @@ extern int ixgbe_fcoe_get_hbainfo(struct net_device *netdev,
 				  struct netdev_fcoe_hbainfo *info);
 #endif /* IXGBE_FCOE */
 
+static inline struct netdev_queue *txring_txq(const struct ixgbe_ring *ring)
+{
+	return netdev_get_tx_queue(ring->netdev, ring->queue_index);
+}
+
 #endif /* _IXGBE_H_ */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 23a4665..0609643 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -840,6 +840,9 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector *q_vector,
 		return true;
 	}
 
+	netdev_tx_completed_queue(txring_txq(tx_ring),
+				  total_packets, total_bytes);
+
 #define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
 	if (unlikely(total_packets && netif_carrier_ok(tx_ring->netdev) &&
 		     (ixgbe_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD))) {
@@ -2617,6 +2620,8 @@ void ixgbe_configure_tx_ring(struct ixgbe_adapter *adapter,
 	/* enable queue */
 	IXGBE_WRITE_REG(hw, IXGBE_TXDCTL(reg_idx), txdctl);
 
+	netdev_tx_reset_queue(txring_txq(ring));
+
 	/* TXDCTL.EN will return 0 on 82598 if link is down, so skip it */
 	if (hw->mac.type == ixgbe_mac_82598EB &&
 	    !(IXGBE_READ_REG(hw, IXGBE_LINKS) & IXGBE_LINKS_UP))
@@ -6808,6 +6813,8 @@ static void ixgbe_tx_map(struct ixgbe_ring *tx_ring,
 	tx_buffer_info->gso_segs = gso_segs;
 	tx_buffer_info->skb = skb;
 
+	netdev_tx_sent_queue(txring_txq(tx_ring), tx_buffer_info->bytecount);
+
 	/* set the timestamp */
 	first->time_stamp = jiffies;
 
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (2 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 03/14] ixgbe: add support for byte queue limits Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-04-20 21:19   ` Tom Herbert
  2012-03-13  4:03 ` [net-next 05/14] net: Add memory barriers to prevent possible race in byte queue limits Jeff Kirsher
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

We are seeing dev_watchdog hangs on several drivers.  I suspect this is due
to the __QUEUE_STATE_STACK_XOFF bit being set prior to a reset for link
change, and then not being cleared by netdev_tx_reset_queue.  This change
corrects that.

In addition we were seeing dev_watchdog hangs on igb after running the
ethtool tests.  We found this to be due to the fact that the ethtool test
runs the same logic as ndo_start_xmit, but we were never clearing the XOFF
flag since the loopback test in ethtool does not do byte queue accounting.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |    3 ++-
 include/linux/netdevice.h                 |    1 +
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index fda8247..e96cef8 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2752,6 +2752,8 @@ void igb_configure_tx_ring(struct igb_adapter *adapter,
 
 	txdctl |= E1000_TXDCTL_QUEUE_ENABLE;
 	wr32(E1000_TXDCTL(reg_idx), txdctl);
+
+	netdev_tx_reset_queue(txring_txq(ring));
 }
 
 /**
@@ -3244,7 +3246,6 @@ static void igb_clean_tx_ring(struct igb_ring *tx_ring)
 		buffer_info = &tx_ring->tx_buffer_info[i];
 		igb_unmap_and_free_tx_resource(tx_ring, buffer_info);
 	}
-	netdev_tx_reset_queue(txring_txq(tx_ring));
 
 	size = sizeof(struct igb_tx_buffer) * tx_ring->count;
 	memset(tx_ring->tx_buffer_info, 0, size);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b195a34..4bf314f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1939,6 +1939,7 @@ static inline void netdev_completed_queue(struct net_device *dev,
 static inline void netdev_tx_reset_queue(struct netdev_queue *q)
 {
 #ifdef CONFIG_BQL
+	clear_bit(__QUEUE_STATE_STACK_XOFF, &q->state);
 	dql_reset(&q->dql);
 #endif
 }
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 05/14] net: Add memory barriers to prevent possible race in byte queue limits
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (3 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 06/14] ixgbe: Do no clear Tx status bits since eop_desc provides enough info Jeff Kirsher
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change adds a memory barrier to the byte queue limit code to address a
possible race as has been seen in the past with the
netif_stop_queue/netif_wake_queue logic.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 include/linux/netdevice.h |   49 ++++++++++++++++++++++++++++++--------------
 1 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4bf314f..4535a4e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1899,12 +1899,22 @@ static inline void netdev_tx_sent_queue(struct netdev_queue *dev_queue,
 {
 #ifdef CONFIG_BQL
 	dql_queued(&dev_queue->dql, bytes);
-	if (unlikely(dql_avail(&dev_queue->dql) < 0)) {
-		set_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state);
-		if (unlikely(dql_avail(&dev_queue->dql) >= 0))
-			clear_bit(__QUEUE_STATE_STACK_XOFF,
-			    &dev_queue->state);
-	}
+
+	if (likely(dql_avail(&dev_queue->dql) >= 0))
+		return;
+
+	set_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state);
+
+	/*
+	 * The XOFF flag must be set before checking the dql_avail below,
+	 * because in netdev_tx_completed_queue we update the dql_completed
+	 * before checking the XOFF flag.
+	 */
+	smp_mb();
+
+	/* check again in case another CPU has just made room avail */
+	if (unlikely(dql_avail(&dev_queue->dql) >= 0))
+		clear_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state);
 #endif
 }
 
@@ -1917,16 +1927,23 @@ static inline void netdev_tx_completed_queue(struct netdev_queue *dev_queue,
 					     unsigned pkts, unsigned bytes)
 {
 #ifdef CONFIG_BQL
-	if (likely(bytes)) {
-		dql_completed(&dev_queue->dql, bytes);
-		if (unlikely(test_bit(__QUEUE_STATE_STACK_XOFF,
-		    &dev_queue->state) &&
-		    dql_avail(&dev_queue->dql) >= 0)) {
-			if (test_and_clear_bit(__QUEUE_STATE_STACK_XOFF,
-			     &dev_queue->state))
-				netif_schedule_queue(dev_queue);
-		}
-	}
+	if (unlikely(!bytes))
+		return;
+
+	dql_completed(&dev_queue->dql, bytes);
+
+	/*
+	 * Without the memory barrier there is a small possiblity that
+	 * netdev_tx_sent_queue will miss the update and cause the queue to
+	 * be stopped forever
+	 */
+	smp_mb();
+
+	if (dql_avail(&dev_queue->dql) < 0)
+		return;
+
+	if (test_and_clear_bit(__QUEUE_STATE_STACK_XOFF, &dev_queue->state))
+		netif_schedule_queue(dev_queue);
 #endif
 }
 
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 06/14] ixgbe: Do no clear Tx status bits since eop_desc provides enough info
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (4 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 05/14] net: Add memory barriers to prevent possible race in byte queue limits Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 07/14] ixgbe: Reorder adapter contents for better cache utilization Jeff Kirsher
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

There isn't any need to clear the status bits in the descriptors due to the
fact that the eop_desc provides enough information for us to know
that we have cleaned to the last packet that the software has put on the
ring.  The status bits are cleared as a part of putting the frame on the
ring so as long as we do not read the descriptor bit prior to reading the
value eop_desc we should be able to guarantee that we will not clean beyond
the end of the current data stream.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0609643..a2c14bf 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -763,6 +763,9 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector *q_vector,
 		if (!eop_desc)
 			break;
 
+		/* prevent any other reads prior to eop_desc */
+		rmb();
+
 		/* if DD is not set pending work has not been completed */
 		if (!(eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)))
 			break;
@@ -773,12 +776,8 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector *q_vector,
 		/* clear next_to_watch to prevent false hangs */
 		tx_buffer->next_to_watch = NULL;
 
-		/* prevent any other reads prior to eop_desc being verified */
-		rmb();
-
 		do {
 			ixgbe_unmap_tx_resource(tx_ring, tx_buffer);
-			tx_desc->wb.status = 0;
 			if (likely(tx_desc == eop_desc)) {
 				eop_desc = NULL;
 				dev_kfree_skb_any(tx_buffer->skb);
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 07/14] ixgbe: Reorder adapter contents for better cache utilization
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (5 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 06/14] ixgbe: Do no clear Tx status bits since eop_desc provides enough info Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 08/14] ixgbe: Address issues with Tx WHTRESH value not being set correctly Jeff Kirsher
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change moves several frequently accessed items together into one cache
line in order to reduce cache misses in the hot-path.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h |   94 +++++++++++++++---------------
 1 files changed, 48 insertions(+), 46 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index f05bfdb..c5cd2b6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -373,8 +373,16 @@ static inline u16 ixgbe_desc_unused(struct ixgbe_ring *ring)
 #define MIN_MSIX_Q_VECTORS 2
 #define MIN_MSIX_COUNT (MIN_MSIX_Q_VECTORS + NON_Q_VECTORS)
 
+/* default to trying for four seconds */
+#define IXGBE_TRY_LINK_TIMEOUT (4 * HZ)
+
 /* board specific private data structure */
 struct ixgbe_adapter {
+	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+	/* OS defined structs */
+	struct net_device *netdev;
+	struct pci_dev *pdev;
+
 	unsigned long state;
 
 	/* Some features need tri-state capability,
@@ -418,59 +426,50 @@ struct ixgbe_adapter {
 #define IXGBE_FLAG2_RESET_REQUESTED             (u32)(1 << 6)
 #define IXGBE_FLAG2_FDIR_REQUIRES_REINIT        (u32)(1 << 7)
 
-	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
-	u16 bd_number;
-	struct ixgbe_q_vector *q_vector[MAX_MSIX_Q_VECTORS];
 
-	/* DCB parameters */
-	struct ieee_pfc *ixgbe_ieee_pfc;
-	struct ieee_ets *ixgbe_ieee_ets;
-	struct ixgbe_dcb_config dcb_cfg;
-	struct ixgbe_dcb_config temp_dcb_cfg;
-	u8 dcb_set_bitmap;
-	u8 dcbx_cap;
-	enum ixgbe_fc_mode last_lfc_mode;
-
-	/* Interrupt Throttle Rate */
-	u32 rx_itr_setting;
-	u32 tx_itr_setting;
-	u16 eitr_low;
-	u16 eitr_high;
-
-	/* Work limits */
+	/* Tx fast path data */
+	int num_tx_queues;
+	u16 tx_itr_setting;
 	u16 tx_work_limit;
 
+	/* Rx fast path data */
+	int num_rx_queues;
+	u16 rx_itr_setting;
+
 	/* TX */
 	struct ixgbe_ring *tx_ring[MAX_TX_QUEUES] ____cacheline_aligned_in_smp;
-	int num_tx_queues;
-	u32 tx_timeout_count;
-	bool detect_tx_hung;
 
 	u64 restart_queue;
 	u64 lsc_int;
+	u32 tx_timeout_count;
 
 	/* RX */
-	struct ixgbe_ring *rx_ring[MAX_RX_QUEUES] ____cacheline_aligned_in_smp;
-	int num_rx_queues;
+	struct ixgbe_ring *rx_ring[MAX_RX_QUEUES];
 	int num_rx_pools;		/* == num_rx_queues in 82598 */
 	int num_rx_queues_per_pool;	/* 1 if 82598, can be many if 82599 */
 	u64 hw_csum_rx_error;
 	u64 hw_rx_no_dma_resources;
+	u64 rsc_total_count;
+	u64 rsc_total_flush;
 	u64 non_eop_descs;
-	int num_msix_vectors;
-	int max_msix_q_vectors;         /* true count of q_vectors for device */
-	struct ixgbe_ring_feature ring_feature[RING_F_ARRAY_SIZE];
-	struct msix_entry *msix_entries;
-
 	u32 alloc_rx_page_failed;
 	u32 alloc_rx_buff_failed;
 
-/* default to trying for four seconds */
-#define IXGBE_TRY_LINK_TIMEOUT (4 * HZ)
+	struct ixgbe_q_vector *q_vector[MAX_MSIX_Q_VECTORS];
 
-	/* OS defined structs */
-	struct net_device *netdev;
-	struct pci_dev *pdev;
+	/* DCB parameters */
+	struct ieee_pfc *ixgbe_ieee_pfc;
+	struct ieee_ets *ixgbe_ieee_ets;
+	struct ixgbe_dcb_config dcb_cfg;
+	struct ixgbe_dcb_config temp_dcb_cfg;
+	u8 dcb_set_bitmap;
+	u8 dcbx_cap;
+	enum ixgbe_fc_mode last_lfc_mode;
+
+	int num_msix_vectors;
+	int max_msix_q_vectors;         /* true count of q_vectors for device */
+	struct ixgbe_ring_feature ring_feature[RING_F_ARRAY_SIZE];
+	struct msix_entry *msix_entries;
 
 	u32 test_icr;
 	struct ixgbe_ring test_tx_ring;
@@ -481,10 +480,6 @@ struct ixgbe_adapter {
 	u16 msg_enable;
 	struct ixgbe_hw_stats stats;
 
-	/* Interrupt Throttle Rate */
-	u32 rx_eitr_param;
-	u32 tx_eitr_param;
-
 	u64 tx_busy;
 	unsigned int tx_ring_count;
 	unsigned int rx_ring_count;
@@ -493,25 +488,35 @@ struct ixgbe_adapter {
 	bool link_up;
 	unsigned long link_check_timeout;
 
-	struct work_struct service_task;
 	struct timer_list service_timer;
+	struct work_struct service_task;
+
+	struct hlist_head fdir_filter_list;
+	unsigned long fdir_overflow; /* number of times ATR was backed off */
+	union ixgbe_atr_input fdir_mask;
+	int fdir_filter_count;
 	u32 fdir_pballoc;
 	u32 atr_sample_rate;
-	unsigned long fdir_overflow; /* number of times ATR was backed off */
 	spinlock_t fdir_perfect_lock;
+
 #ifdef IXGBE_FCOE
 	struct ixgbe_fcoe fcoe;
 #endif /* IXGBE_FCOE */
-	u64 rsc_total_count;
-	u64 rsc_total_flush;
 	u32 wol;
+
+	/* Interrupt Throttle Rate */
+	u16 eitr_low;
+	u16 eitr_high;
+
+	u16 bd_number;
+
 	u16 eeprom_verh;
 	u16 eeprom_verl;
 	u16 eeprom_cap;
 
 	int node;
-	u32 led_reg;
 	u32 interrupt_event;
+	u32 led_reg;
 
 	/* SR-IOV */
 	DECLARE_BITMAP(active_vfs, IXGBE_MAX_VF_FUNCTIONS);
@@ -521,9 +526,6 @@ struct ixgbe_adapter {
 	struct vf_macvlans vf_mvs;
 	struct vf_macvlans *mv_list;
 
-	struct hlist_head fdir_filter_list;
-	union ixgbe_atr_input fdir_mask;
-	int fdir_filter_count;
 	u32 timer_event_accumulator;
 	u32 vferr_refcount;
 };
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 08/14] ixgbe: Address issues with Tx WHTRESH value not being set correctly
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (6 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 07/14] ixgbe: Reorder adapter contents for better cache utilization Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 09/14] ixgbe: Correct Adaptive Interrupt Moderation so that it will change values Jeff Kirsher
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change is meant to address the fact that the tx_itr_setting was
dropping to 0 when no separate Tx vectors were provided.  This had resulted
in the driver incorrectly configuring the Tx ring with a WTHRESH of 1 in
order to avoid Tx hangs even though that was not necessary. This change
makes it so that we instead take a look at the Tx ring's q_vector to
determine if the ring will have an ITR value less than 8us.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index a2c14bf..fc3c33a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2595,12 +2595,15 @@ void ixgbe_configure_tx_ring(struct ixgbe_adapter *adapter,
 	 * to or less than the number of on chip descriptors, which is
 	 * currently 40.
 	 */
-	if (!adapter->tx_itr_setting || !adapter->rx_itr_setting)
+	if (!ring->q_vector || (ring->q_vector->itr < 8))
 		txdctl |= (1 << 16);	/* WTHRESH = 1 */
 	else
 		txdctl |= (8 << 16);	/* WTHRESH = 8 */
 
-	/* PTHRESH=32 is needed to avoid a Tx hang with DFP enabled. */
+	/*
+	 * Setting PTHRESH to 32 both improves performance
+	 * and avoids a TX hang with DFP enabled
+	 */
 	txdctl |= (1 << 8) |	/* HTHRESH = 1 */
 		   32;		/* PTHRESH = 32 */
 
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 09/14] ixgbe: Correct Adaptive Interrupt Moderation so that it will change values
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (7 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 08/14] ixgbe: Address issues with Tx WHTRESH value not being set correctly Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 10/14] ixgbe: Default to queue pairs when number of queues is less than CPUs Jeff Kirsher
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change corrects an issue in which Adaptive Interrupt Moderation was
not changing values due to the fact that we were performing an and
operation on the resultant value that was causing the value to never change
from the default 20K interrupts per second.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index fc3c33a..bad9c05 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1834,7 +1834,7 @@ void ixgbe_write_eitr(struct ixgbe_q_vector *q_vector)
 	struct ixgbe_adapter *adapter = q_vector->adapter;
 	struct ixgbe_hw *hw = &adapter->hw;
 	int v_idx = q_vector->v_idx;
-	u32 itr_reg = q_vector->itr;
+	u32 itr_reg = q_vector->itr & IXGBE_MAX_EITR;
 
 	switch (adapter->hw.mac.type) {
 	case ixgbe_mac_82598EB:
@@ -1886,7 +1886,7 @@ static void ixgbe_set_itr(struct ixgbe_q_vector *q_vector)
 			  ((9 * new_itr) + q_vector->itr);
 
 		/* save the algorithm value here */
-		q_vector->itr = new_itr & IXGBE_MAX_EITR;
+		q_vector->itr = new_itr;
 
 		ixgbe_write_eitr(q_vector);
 	}
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 10/14] ixgbe: Default to queue pairs when number of queues is less than CPUs
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (8 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 09/14] ixgbe: Correct Adaptive Interrupt Moderation so that it will change values Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 11/14] ixgbe: Drop unnecessary napi_schedule_prep and spare blank line from ixgbe_intr Jeff Kirsher
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

The old code had several errors in how it was determining the vector
budget.  In order to simplify things this patch updates the code so that it
will attempt to always allocated paired Rx/Tx vectors instead of attempting
to allocate individual vectors when the number of queues is less than the
number of CPUs.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   14 +++++++-------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index c5cd2b6..f25b4e2 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -370,7 +370,7 @@ static inline u16 ixgbe_desc_unused(struct ixgbe_ring *ring)
 #define MAX_MSIX_Q_VECTORS MAX_MSIX_Q_VECTORS_82599
 #define MAX_MSIX_COUNT MAX_MSIX_VECTORS_82599
 
-#define MIN_MSIX_Q_VECTORS 2
+#define MIN_MSIX_Q_VECTORS 1
 #define MIN_MSIX_COUNT (MIN_MSIX_Q_VECTORS + NON_Q_VECTORS)
 
 /* default to trying for four seconds */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index bad9c05..80ae919 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -4561,11 +4561,9 @@ static void ixgbe_acquire_msix_vectors(struct ixgbe_adapter *adapter,
 {
 	int err, vector_threshold;
 
-	/* We'll want at least 3 (vector_threshold):
-	 * 1) TxQ[0] Cleanup
-	 * 2) RxQ[0] Cleanup
-	 * 3) Other (Link Status Change, etc.)
-	 * 4) TCP Timer (optional)
+	/* We'll want at least 2 (vector_threshold):
+	 * 1) TxQ[0] + RxQ[0] handler
+	 * 2) Other (Link Status Change, etc.)
 	 */
 	vector_threshold = MIN_MSIX_COUNT;
 
@@ -4919,9 +4917,11 @@ static int ixgbe_set_interrupt_capability(struct ixgbe_adapter *adapter)
 	 * doesn't do us much good if we have a lot more vectors
 	 * than CPU's.  So let's be conservative and only ask for
 	 * (roughly) the same number of vectors as there are CPU's.
+	 * The default is to use pairs of vectors.
 	 */
-	v_budget = min(adapter->num_rx_queues + adapter->num_tx_queues,
-		       (int)num_online_cpus()) + NON_Q_VECTORS;
+	v_budget = max(adapter->num_rx_queues, adapter->num_tx_queues);
+	v_budget = min_t(int, v_budget, num_online_cpus());
+	v_budget += NON_Q_VECTORS;
 
 	/*
 	 * At the same time, hardware can only support a maximum of
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 11/14] ixgbe: Drop unnecessary napi_schedule_prep and spare blank line from ixgbe_intr
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (9 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 10/14] ixgbe: Default to queue pairs when number of queues is less than CPUs Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 12/14] ixgbe: Allocate rings as part of the q_vector Jeff Kirsher
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This patch is a minor cleanup to address the unnecessary use of
napi_schedule_prep in ixgbe_intr and to also remove a blank line that is
not needed since it is separating a comment from the line it is explaining.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 80ae919..b1f53ed 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2401,16 +2401,13 @@ static irqreturn_t ixgbe_intr(int irq, void *data)
 
 	ixgbe_check_fan_failure(adapter, eicr);
 
-	if (napi_schedule_prep(&(q_vector->napi))) {
-		/* would disable interrupts here but EIAM disabled it */
-		__napi_schedule(&(q_vector->napi));
-	}
+	/* would disable interrupts here but EIAM disabled it */
+	napi_schedule(&q_vector->napi);
 
 	/*
 	 * re-enable link(maybe) and non-queue interrupts, no flush.
 	 * ixgbe_poll will re-enable the queue interrupts
 	 */
-
 	if (!test_bit(__IXGBE_DOWN, &adapter->state))
 		ixgbe_irq_enable(adapter, false, false);
 
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 12/14] ixgbe: Allocate rings as part of the q_vector
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (10 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 11/14] ixgbe: Drop unnecessary napi_schedule_prep and spare blank line from ixgbe_intr Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 13/14] ixgbe: Add iterator for cycling through rings on a q_vector Jeff Kirsher
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This patch makes the rings a part of the q_vector directly instead of
indirectly.  Specifically on x86 systems this helps to avoid any cache
set conflicts between the q_vector, the tx_rings, and the rx_rings as the
critical stride is 4K and in order to cross that boundary you would need to
have over 15 rings on a single q_vector.

In addition this allows for smarter allocations when Flow Director is
enabled.  Previously Flow Director would set the irq_affinity hints based
on the CPU and was still using a node interleaving approach which on some
systems would end up with the two values mismatched.  With the new approach
we can set the affinity for the irq_vector and use the CPU for that
affinity to determine the node value for the node and the rings.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |   10 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |    2 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |  489 +++++++++++-----------
 3 files changed, 244 insertions(+), 257 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index f25b4e2..699899a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -254,10 +254,8 @@ struct ixgbe_ring {
 		struct ixgbe_tx_queue_stats tx_stats;
 		struct ixgbe_rx_queue_stats rx_stats;
 	};
-	int numa_node;
 	unsigned int size;		/* length in bytes */
 	dma_addr_t dma;			/* phys. address of descriptor ring */
-	struct rcu_head rcu;
 	struct ixgbe_q_vector *q_vector; /* back-pointer to host q_vector */
 } ____cacheline_internodealigned_in_smp;
 
@@ -317,8 +315,13 @@ struct ixgbe_q_vector {
 	struct ixgbe_ring_container rx, tx;
 
 	struct napi_struct napi;
-	cpumask_var_t affinity_mask;
+	cpumask_t affinity_mask;
+	int numa_node;
+	struct rcu_head rcu;	/* to avoid race with update stats on free */
 	char name[IFNAMSIZ + 9];
+
+	/* for dynamic allocation of rings associated with this q_vector */
+	struct ixgbe_ring ring[0] ____cacheline_internodealigned_in_smp;
 };
 
 /*
@@ -514,7 +517,6 @@ struct ixgbe_adapter {
 	u16 eeprom_verl;
 	u16 eeprom_cap;
 
-	int node;
 	u32 interrupt_event;
 	u32 led_reg;
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 4846206..51d159b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -1591,7 +1591,6 @@ static int ixgbe_setup_desc_rings(struct ixgbe_adapter *adapter)
 	tx_ring->dev = &adapter->pdev->dev;
 	tx_ring->netdev = adapter->netdev;
 	tx_ring->reg_idx = adapter->tx_ring[0]->reg_idx;
-	tx_ring->numa_node = adapter->node;
 
 	err = ixgbe_setup_tx_resources(tx_ring);
 	if (err)
@@ -1617,7 +1616,6 @@ static int ixgbe_setup_desc_rings(struct ixgbe_adapter *adapter)
 	rx_ring->netdev = adapter->netdev;
 	rx_ring->reg_idx = adapter->rx_ring[0]->reg_idx;
 	rx_ring->rx_buf_len = IXGBE_RXBUFFER_2K;
-	rx_ring->numa_node = adapter->node;
 
 	err = ixgbe_setup_rx_resources(rx_ring);
 	if (err) {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index b1f53ed..604540d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1893,7 +1893,7 @@ static void ixgbe_set_itr(struct ixgbe_q_vector *q_vector)
 }
 
 /**
- * ixgbe_check_overtemp_subtask - check for over tempurature
+ * ixgbe_check_overtemp_subtask - check for over temperature
  * @adapter: pointer to adapter
  **/
 static void ixgbe_check_overtemp_subtask(struct ixgbe_adapter *adapter)
@@ -2205,78 +2205,6 @@ static irqreturn_t ixgbe_msix_clean_rings(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static inline void map_vector_to_rxq(struct ixgbe_adapter *a, int v_idx,
-				     int r_idx)
-{
-	struct ixgbe_q_vector *q_vector = a->q_vector[v_idx];
-	struct ixgbe_ring *rx_ring = a->rx_ring[r_idx];
-
-	rx_ring->q_vector = q_vector;
-	rx_ring->next = q_vector->rx.ring;
-	q_vector->rx.ring = rx_ring;
-	q_vector->rx.count++;
-}
-
-static inline void map_vector_to_txq(struct ixgbe_adapter *a, int v_idx,
-				     int t_idx)
-{
-	struct ixgbe_q_vector *q_vector = a->q_vector[v_idx];
-	struct ixgbe_ring *tx_ring = a->tx_ring[t_idx];
-
-	tx_ring->q_vector = q_vector;
-	tx_ring->next = q_vector->tx.ring;
-	q_vector->tx.ring = tx_ring;
-	q_vector->tx.count++;
-	q_vector->tx.work_limit = a->tx_work_limit;
-}
-
-/**
- * ixgbe_map_rings_to_vectors - Maps descriptor rings to vectors
- * @adapter: board private structure to initialize
- *
- * This function maps descriptor rings to the queue-specific vectors
- * we were allotted through the MSI-X enabling code.  Ideally, we'd have
- * one vector per ring/queue, but on a constrained vector budget, we
- * group the rings as "efficiently" as possible.  You would add new
- * mapping configurations in here.
- **/
-static void ixgbe_map_rings_to_vectors(struct ixgbe_adapter *adapter)
-{
-	int q_vectors = adapter->num_msix_vectors - NON_Q_VECTORS;
-	int rxr_remaining = adapter->num_rx_queues, rxr_idx = 0;
-	int txr_remaining = adapter->num_tx_queues, txr_idx = 0;
-	int v_start = 0;
-
-	/* only one q_vector if MSI-X is disabled. */
-	if (!(adapter->flags & IXGBE_FLAG_MSIX_ENABLED))
-		q_vectors = 1;
-
-	/*
-	 * If we don't have enough vectors for a 1-to-1 mapping, we'll have to
-	 * group them so there are multiple queues per vector.
-	 *
-	 * Re-adjusting *qpv takes care of the remainder.
-	 */
-	for (; v_start < q_vectors && rxr_remaining; v_start++) {
-		int rqpv = DIV_ROUND_UP(rxr_remaining, q_vectors - v_start);
-		for (; rqpv; rqpv--, rxr_idx++, rxr_remaining--)
-			map_vector_to_rxq(adapter, v_start, rxr_idx);
-	}
-
-	/*
-	 * If there are not enough q_vectors for each ring to have it's own
-	 * vector then we must pair up Rx/Tx on a each vector
-	 */
-	if ((v_start + txr_remaining) > q_vectors)
-		v_start = 0;
-
-	for (; v_start < q_vectors && txr_remaining; v_start++) {
-		int tqpv = DIV_ROUND_UP(txr_remaining, q_vectors - v_start);
-		for (; tqpv; tqpv--, txr_idx++, txr_remaining--)
-			map_vector_to_txq(adapter, v_start, txr_idx);
-	}
-}
-
 /**
  * ixgbe_request_msix_irqs - Initialize MSI-X interrupts
  * @adapter: board private structure
@@ -2320,14 +2248,14 @@ static int ixgbe_request_msix_irqs(struct ixgbe_adapter *adapter)
 		if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
 			/* assign the mask for this irq */
 			irq_set_affinity_hint(entry->vector,
-					      q_vector->affinity_mask);
+					      &q_vector->affinity_mask);
 		}
 	}
 
 	err = request_irq(adapter->msix_entries[vector].vector,
 			  ixgbe_msix_other, 0, netdev->name, adapter);
 	if (err) {
-		e_err(probe, "request_irq for msix_lsc failed: %d\n", err);
+		e_err(probe, "request_irq for msix_other failed: %d\n", err);
 		goto free_queue_irqs;
 	}
 
@@ -2414,31 +2342,6 @@ static irqreturn_t ixgbe_intr(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static inline void ixgbe_reset_q_vectors(struct ixgbe_adapter *adapter)
-{
-	int q_vectors = adapter->num_msix_vectors - NON_Q_VECTORS;
-	int i;
-
-	/* legacy and MSI only use one vector */
-	if (!(adapter->flags & IXGBE_FLAG_MSIX_ENABLED))
-		q_vectors = 1;
-
-	for (i = 0; i < adapter->num_rx_queues; i++) {
-		adapter->rx_ring[i]->q_vector = NULL;
-		adapter->rx_ring[i]->next = NULL;
-	}
-	for (i = 0; i < adapter->num_tx_queues; i++) {
-		adapter->tx_ring[i]->q_vector = NULL;
-		adapter->tx_ring[i]->next = NULL;
-	}
-
-	for (i = 0; i < q_vectors; i++) {
-		struct ixgbe_q_vector *q_vector = adapter->q_vector[i];
-		memset(&q_vector->rx, 0, sizeof(struct ixgbe_ring_container));
-		memset(&q_vector->tx, 0, sizeof(struct ixgbe_ring_container));
-	}
-}
-
 /**
  * ixgbe_request_irq - initialize interrupts
  * @adapter: board private structure
@@ -2451,9 +2354,6 @@ static int ixgbe_request_irq(struct ixgbe_adapter *adapter)
 	struct net_device *netdev = adapter->netdev;
 	int err;
 
-	/* map all of the rings to the q_vectors */
-	ixgbe_map_rings_to_vectors(adapter);
-
 	if (adapter->flags & IXGBE_FLAG_MSIX_ENABLED)
 		err = ixgbe_request_msix_irqs(adapter);
 	else if (adapter->flags & IXGBE_FLAG_MSI_ENABLED)
@@ -2463,13 +2363,9 @@ static int ixgbe_request_irq(struct ixgbe_adapter *adapter)
 		err = request_irq(adapter->pdev->irq, ixgbe_intr, IRQF_SHARED,
 				  netdev->name, adapter);
 
-	if (err) {
+	if (err)
 		e_err(probe, "request_irq failed, Error %d\n", err);
 
-		/* place q_vectors and rings back into a known good state */
-		ixgbe_reset_q_vectors(adapter);
-	}
-
 	return err;
 }
 
@@ -2499,9 +2395,6 @@ static void ixgbe_free_irq(struct ixgbe_adapter *adapter)
 	} else {
 		free_irq(adapter->pdev->irq, adapter);
 	}
-
-	/* clear q_vector state information */
-	ixgbe_reset_q_vectors(adapter);
 }
 
 /**
@@ -4828,75 +4721,6 @@ static void ixgbe_cache_ring_register(struct ixgbe_adapter *adapter)
 }
 
 /**
- * ixgbe_alloc_queues - Allocate memory for all rings
- * @adapter: board private structure to initialize
- *
- * We allocate one ring per queue at run-time since we don't know the
- * number of queues at compile-time.  The polling_netdev array is
- * intended for Multiqueue, but should work fine with a single queue.
- **/
-static int ixgbe_alloc_queues(struct ixgbe_adapter *adapter)
-{
-	int rx = 0, tx = 0, nid = adapter->node;
-
-	if (nid < 0 || !node_online(nid))
-		nid = first_online_node;
-
-	for (; tx < adapter->num_tx_queues; tx++) {
-		struct ixgbe_ring *ring;
-
-		ring = kzalloc_node(sizeof(*ring), GFP_KERNEL, nid);
-		if (!ring)
-			ring = kzalloc(sizeof(*ring), GFP_KERNEL);
-		if (!ring)
-			goto err_allocation;
-		ring->count = adapter->tx_ring_count;
-		ring->queue_index = tx;
-		ring->numa_node = nid;
-		ring->dev = &adapter->pdev->dev;
-		ring->netdev = adapter->netdev;
-
-		adapter->tx_ring[tx] = ring;
-	}
-
-	for (; rx < adapter->num_rx_queues; rx++) {
-		struct ixgbe_ring *ring;
-
-		ring = kzalloc_node(sizeof(*ring), GFP_KERNEL, nid);
-		if (!ring)
-			ring = kzalloc(sizeof(*ring), GFP_KERNEL);
-		if (!ring)
-			goto err_allocation;
-		ring->count = adapter->rx_ring_count;
-		ring->queue_index = rx;
-		ring->numa_node = nid;
-		ring->dev = &adapter->pdev->dev;
-		ring->netdev = adapter->netdev;
-
-		/*
-		 * 82599 errata, UDP frames with a 0 checksum can be marked as
-		 * checksum errors.
-		 */
-		if (adapter->hw.mac.type == ixgbe_mac_82599EB)
-			set_bit(__IXGBE_RX_CSUM_UDP_ZERO_ERR, &ring->state);
-
-		adapter->rx_ring[rx] = ring;
-	}
-
-	ixgbe_cache_ring_register(adapter);
-
-	return 0;
-
-err_allocation:
-	while (tx)
-		kfree(adapter->tx_ring[--tx]);
-
-	while (rx)
-		kfree(adapter->rx_ring[--rx]);
-	return -ENOMEM;
-}
-
-/**
  * ixgbe_set_interrupt_capability - set MSI-X or MSI if supported
  * @adapter: board private structure to initialize
  *
@@ -4927,7 +4751,7 @@ static int ixgbe_set_interrupt_capability(struct ixgbe_adapter *adapter)
 	 * descriptor queues supported by our device.  Thus, we cap it off in
 	 * those rare cases where the cpu count also exceeds our vector limit.
 	 */
-	v_budget = min(v_budget, (int)hw->mac.max_msix_vectors);
+	v_budget = min_t(int, v_budget, hw->mac.max_msix_vectors);
 
 	/* A failure in MSI-X entry allocation isn't fatal, but it does
 	 * mean we disable MSI-X capabilities of the adapter. */
@@ -4974,6 +4798,164 @@ out:
 	return err;
 }
 
+static void ixgbe_add_ring(struct ixgbe_ring *ring,
+			   struct ixgbe_ring_container *head)
+{
+	ring->next = head->ring;
+	head->ring = ring;
+	head->count++;
+}
+
+/**
+ * ixgbe_alloc_q_vector - Allocate memory for a single interrupt vector
+ * @adapter: board private structure to initialize
+ * @v_idx: index of vector in adapter struct
+ *
+ * We allocate one q_vector.  If allocation fails we return -ENOMEM.
+ **/
+static int ixgbe_alloc_q_vector(struct ixgbe_adapter *adapter, int v_idx,
+				int txr_count, int txr_idx,
+				int rxr_count, int rxr_idx)
+{
+	struct ixgbe_q_vector *q_vector;
+	struct ixgbe_ring *ring;
+	int node = -1;
+	int cpu = -1;
+	int ring_count, size;
+
+	ring_count = txr_count + rxr_count;
+	size = sizeof(struct ixgbe_q_vector) +
+	       (sizeof(struct ixgbe_ring) * ring_count);
+
+	/* customize cpu for Flow Director mapping */
+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
+		if (cpu_online(v_idx)) {
+			cpu = v_idx;
+			node = cpu_to_node(cpu);
+		}
+	}
+
+	/* allocate q_vector and rings */
+	q_vector = kzalloc_node(size, GFP_KERNEL, node);
+	if (!q_vector)
+		q_vector = kzalloc(size, GFP_KERNEL);
+	if (!q_vector)
+		return -ENOMEM;
+
+	/* setup affinity mask and node */
+	if (cpu != -1)
+		cpumask_set_cpu(cpu, &q_vector->affinity_mask);
+	else
+		cpumask_copy(&q_vector->affinity_mask, cpu_online_mask);
+	q_vector->numa_node = node;
+
+	/* initialize NAPI */
+	netif_napi_add(adapter->netdev, &q_vector->napi,
+		       ixgbe_poll, 64);
+
+	/* tie q_vector and adapter together */
+	adapter->q_vector[v_idx] = q_vector;
+	q_vector->adapter = adapter;
+	q_vector->v_idx = v_idx;
+
+	/* initialize work limits */
+	q_vector->tx.work_limit = adapter->tx_work_limit;
+
+	/* initialize pointer to rings */
+	ring = q_vector->ring;
+
+	while (txr_count) {
+		/* assign generic ring traits */
+		ring->dev = &adapter->pdev->dev;
+		ring->netdev = adapter->netdev;
+
+		/* configure backlink on ring */
+		ring->q_vector = q_vector;
+
+		/* update q_vector Tx values */
+		ixgbe_add_ring(ring, &q_vector->tx);
+
+		/* apply Tx specific ring traits */
+		ring->count = adapter->tx_ring_count;
+		ring->queue_index = txr_idx;
+
+		/* assign ring to adapter */
+		adapter->tx_ring[txr_idx] = ring;
+
+		/* update count and index */
+		txr_count--;
+		txr_idx++;
+
+		/* push pointer to next ring */
+		ring++;
+	}
+
+	while (rxr_count) {
+		/* assign generic ring traits */
+		ring->dev = &adapter->pdev->dev;
+		ring->netdev = adapter->netdev;
+
+		/* configure backlink on ring */
+		ring->q_vector = q_vector;
+
+		/* update q_vector Rx values */
+		ixgbe_add_ring(ring, &q_vector->rx);
+
+		/*
+		 * 82599 errata, UDP frames with a 0 checksum
+		 * can be marked as checksum errors.
+		 */
+		if (adapter->hw.mac.type == ixgbe_mac_82599EB)
+			set_bit(__IXGBE_RX_CSUM_UDP_ZERO_ERR, &ring->state);
+
+		/* apply Rx specific ring traits */
+		ring->count = adapter->rx_ring_count;
+		ring->queue_index = rxr_idx;
+
+		/* assign ring to adapter */
+		adapter->rx_ring[rxr_idx] = ring;
+
+		/* update count and index */
+		rxr_count--;
+		rxr_idx++;
+
+		/* push pointer to next ring */
+		ring++;
+	}
+
+	return 0;
+}
+
+/**
+ * ixgbe_free_q_vector - Free memory allocated for specific interrupt vector
+ * @adapter: board private structure to initialize
+ * @v_idx: Index of vector to be freed
+ *
+ * This function frees the memory allocated to the q_vector.  In addition if
+ * NAPI is enabled it will delete any references to the NAPI struct prior
+ * to freeing the q_vector.
+ **/
+static void ixgbe_free_q_vector(struct ixgbe_adapter *adapter, int v_idx)
+{
+	struct ixgbe_q_vector *q_vector = adapter->q_vector[v_idx];
+	struct ixgbe_ring *ring;
+
+	for (ring = q_vector->tx.ring; ring != NULL; ring = ring->next)
+		adapter->tx_ring[ring->queue_index] = NULL;
+
+	for (ring = q_vector->rx.ring; ring != NULL; ring = ring->next)
+		adapter->rx_ring[ring->queue_index] = NULL;
+
+	adapter->q_vector[v_idx] = NULL;
+	netif_napi_del(&q_vector->napi);
+
+	/*
+	 * ixgbe_get_stats64() might access the rings on this vector,
+	 * we must wait a grace period before freeing it.
+	 */
+	kfree_rcu(q_vector, rcu);
+}
+
 /**
  * ixgbe_alloc_q_vectors - Allocate memory for interrupt vectors
  * @adapter: board private structure to initialize
@@ -4983,33 +4965,46 @@ out:
  **/
 static int ixgbe_alloc_q_vectors(struct ixgbe_adapter *adapter)
 {
-	int v_idx, num_q_vectors;
-	struct ixgbe_q_vector *q_vector;
+	int q_vectors = adapter->num_msix_vectors - NON_Q_VECTORS;
+	int rxr_remaining = adapter->num_rx_queues;
+	int txr_remaining = adapter->num_tx_queues;
+	int rxr_idx = 0, txr_idx = 0, v_idx = 0;
+	int err;
 
-	if (adapter->flags & IXGBE_FLAG_MSIX_ENABLED)
-		num_q_vectors = adapter->num_msix_vectors - NON_Q_VECTORS;
-	else
-		num_q_vectors = 1;
+	/* only one q_vector if MSI-X is disabled. */
+	if (!(adapter->flags & IXGBE_FLAG_MSIX_ENABLED))
+		q_vectors = 1;
 
-	for (v_idx = 0; v_idx < num_q_vectors; v_idx++) {
-		q_vector = kzalloc_node(sizeof(struct ixgbe_q_vector),
-					GFP_KERNEL, adapter->node);
-		if (!q_vector)
-			q_vector = kzalloc(sizeof(struct ixgbe_q_vector),
-					   GFP_KERNEL);
-		if (!q_vector)
-			goto err_out;
+	if (q_vectors >= (rxr_remaining + txr_remaining)) {
+		for (; rxr_remaining; v_idx++, q_vectors--) {
+			int rqpv = DIV_ROUND_UP(rxr_remaining, q_vectors);
+			err = ixgbe_alloc_q_vector(adapter, v_idx,
+						   0, 0, rqpv, rxr_idx);
 
-		q_vector->adapter = adapter;
-		q_vector->v_idx = v_idx;
+			if (err)
+				goto err_out;
+
+			/* update counts and index */
+			rxr_remaining -= rqpv;
+			rxr_idx += rqpv;
+		}
+	}
 
-		/* Allocate the affinity_hint cpumask, configure the mask */
-		if (!alloc_cpumask_var(&q_vector->affinity_mask, GFP_KERNEL))
+	for (; q_vectors; v_idx++, q_vectors--) {
+		int rqpv = DIV_ROUND_UP(rxr_remaining, q_vectors);
+		int tqpv = DIV_ROUND_UP(txr_remaining, q_vectors);
+		err = ixgbe_alloc_q_vector(adapter, v_idx,
+					   tqpv, txr_idx,
+					   rqpv, rxr_idx);
+
+		if (err)
 			goto err_out;
-		cpumask_set_cpu(v_idx, q_vector->affinity_mask);
-		netif_napi_add(adapter->netdev, &q_vector->napi,
-			       ixgbe_poll, 64);
-		adapter->q_vector[v_idx] = q_vector;
+
+		/* update counts and index */
+		rxr_remaining -= rqpv;
+		rxr_idx += rqpv;
+		txr_remaining -= tqpv;
+		txr_idx += tqpv;
 	}
 
 	return 0;
@@ -5017,12 +5012,9 @@ static int ixgbe_alloc_q_vectors(struct ixgbe_adapter *adapter)
 err_out:
 	while (v_idx) {
 		v_idx--;
-		q_vector = adapter->q_vector[v_idx];
-		netif_napi_del(&q_vector->napi);
-		free_cpumask_var(q_vector->affinity_mask);
-		kfree(q_vector);
-		adapter->q_vector[v_idx] = NULL;
+		ixgbe_free_q_vector(adapter, v_idx);
 	}
+
 	return -ENOMEM;
 }
 
@@ -5036,20 +5028,15 @@ err_out:
  **/
 static void ixgbe_free_q_vectors(struct ixgbe_adapter *adapter)
 {
-	int v_idx, num_q_vectors;
+	int v_idx, q_vectors;
 
 	if (adapter->flags & IXGBE_FLAG_MSIX_ENABLED)
-		num_q_vectors = adapter->num_msix_vectors - NON_Q_VECTORS;
+		q_vectors = adapter->num_msix_vectors - NON_Q_VECTORS;
 	else
-		num_q_vectors = 1;
+		q_vectors = 1;
 
-	for (v_idx = 0; v_idx < num_q_vectors; v_idx++) {
-		struct ixgbe_q_vector *q_vector = adapter->q_vector[v_idx];
-		adapter->q_vector[v_idx] = NULL;
-		netif_napi_del(&q_vector->napi);
-		free_cpumask_var(q_vector->affinity_mask);
-		kfree(q_vector);
-	}
+	for (v_idx = 0; v_idx < q_vectors; v_idx++)
+		ixgbe_free_q_vector(adapter, v_idx);
 }
 
 static void ixgbe_reset_interrupt_capability(struct ixgbe_adapter *adapter)
@@ -5096,11 +5083,7 @@ int ixgbe_init_interrupt_scheme(struct ixgbe_adapter *adapter)
 		goto err_alloc_q_vectors;
 	}
 
-	err = ixgbe_alloc_queues(adapter);
-	if (err) {
-		e_dev_err("Unable to allocate memory for queues\n");
-		goto err_alloc_queues;
-	}
+	ixgbe_cache_ring_register(adapter);
 
 	e_dev_info("Multiqueue %s: Rx Queue count = %u, Tx Queue count = %u\n",
 		   (adapter->num_rx_queues > 1) ? "Enabled" : "Disabled",
@@ -5110,8 +5093,6 @@ int ixgbe_init_interrupt_scheme(struct ixgbe_adapter *adapter)
 
 	return 0;
 
-err_alloc_queues:
-	ixgbe_free_q_vectors(adapter);
 err_alloc_q_vectors:
 	ixgbe_reset_interrupt_capability(adapter);
 err_set_interrupt:
@@ -5127,22 +5108,6 @@ err_set_interrupt:
  **/
 void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter)
 {
-	int i;
-
-	for (i = 0; i < adapter->num_tx_queues; i++) {
-		kfree(adapter->tx_ring[i]);
-		adapter->tx_ring[i] = NULL;
-	}
-	for (i = 0; i < adapter->num_rx_queues; i++) {
-		struct ixgbe_ring *ring = adapter->rx_ring[i];
-
-		/* ixgbe_get_stats64() might access this ring, we must wait
-		 * a grace period before freeing it.
-		 */
-		kfree_rcu(ring, rcu);
-		adapter->rx_ring[i] = NULL;
-	}
-
 	adapter->num_tx_queues = 0;
 	adapter->num_rx_queues = 0;
 
@@ -5286,9 +5251,6 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
 		return -EIO;
 	}
 
-	/* get assigned NUMA node */
-	adapter->node = dev_to_node(&pdev->dev);
-
 	set_bit(__IXGBE_DOWN, &adapter->state);
 
 	return 0;
@@ -5303,10 +5265,16 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
 int ixgbe_setup_tx_resources(struct ixgbe_ring *tx_ring)
 {
 	struct device *dev = tx_ring->dev;
+	int orig_node = dev_to_node(dev);
+	int numa_node = -1;
 	int size;
 
 	size = sizeof(struct ixgbe_tx_buffer) * tx_ring->count;
-	tx_ring->tx_buffer_info = vzalloc_node(size, tx_ring->numa_node);
+
+	if (tx_ring->q_vector)
+		numa_node = tx_ring->q_vector->numa_node;
+
+	tx_ring->tx_buffer_info = vzalloc_node(size, numa_node);
 	if (!tx_ring->tx_buffer_info)
 		tx_ring->tx_buffer_info = vzalloc(size);
 	if (!tx_ring->tx_buffer_info)
@@ -5316,8 +5284,15 @@ int ixgbe_setup_tx_resources(struct ixgbe_ring *tx_ring)
 	tx_ring->size = tx_ring->count * sizeof(union ixgbe_adv_tx_desc);
 	tx_ring->size = ALIGN(tx_ring->size, 4096);
 
-	tx_ring->desc = dma_alloc_coherent(dev, tx_ring->size,
-					   &tx_ring->dma, GFP_KERNEL);
+	set_dev_node(dev, numa_node);
+	tx_ring->desc = dma_alloc_coherent(dev,
+					   tx_ring->size,
+					   &tx_ring->dma,
+					   GFP_KERNEL);
+	set_dev_node(dev, orig_node);
+	if (!tx_ring->desc)
+		tx_ring->desc = dma_alloc_coherent(dev, tx_ring->size,
+						   &tx_ring->dma, GFP_KERNEL);
 	if (!tx_ring->desc)
 		goto err;
 
@@ -5366,10 +5341,16 @@ static int ixgbe_setup_all_tx_resources(struct ixgbe_adapter *adapter)
 int ixgbe_setup_rx_resources(struct ixgbe_ring *rx_ring)
 {
 	struct device *dev = rx_ring->dev;
+	int orig_node = dev_to_node(dev);
+	int numa_node = -1;
 	int size;
 
 	size = sizeof(struct ixgbe_rx_buffer) * rx_ring->count;
-	rx_ring->rx_buffer_info = vzalloc_node(size, rx_ring->numa_node);
+
+	if (rx_ring->q_vector)
+		numa_node = rx_ring->q_vector->numa_node;
+
+	rx_ring->rx_buffer_info = vzalloc_node(size, numa_node);
 	if (!rx_ring->rx_buffer_info)
 		rx_ring->rx_buffer_info = vzalloc(size);
 	if (!rx_ring->rx_buffer_info)
@@ -5379,9 +5360,15 @@ int ixgbe_setup_rx_resources(struct ixgbe_ring *rx_ring)
 	rx_ring->size = rx_ring->count * sizeof(union ixgbe_adv_rx_desc);
 	rx_ring->size = ALIGN(rx_ring->size, 4096);
 
-	rx_ring->desc = dma_alloc_coherent(dev, rx_ring->size,
-					   &rx_ring->dma, GFP_KERNEL);
-
+	set_dev_node(dev, numa_node);
+	rx_ring->desc = dma_alloc_coherent(dev,
+					   rx_ring->size,
+					   &rx_ring->dma,
+					   GFP_KERNEL);
+	set_dev_node(dev, orig_node);
+	if (!rx_ring->desc)
+		rx_ring->desc = dma_alloc_coherent(dev, rx_ring->size,
+						   &rx_ring->dma, GFP_KERNEL);
 	if (!rx_ring->desc)
 		goto err;
 
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 13/14] ixgbe: Add iterator for cycling through rings on a q_vector
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (11 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 12/14] ixgbe: Allocate rings as part of the q_vector Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  4:03 ` [net-next 14/14] ixgbe: Simplify logic for ethtool loopback frame creation and testing Jeff Kirsher
  2012-03-13  5:55 ` [net-next 00/14][pull request] Intel Wired LAN Driver Updates David Miller
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

Since there are multiple spots where we have to cycle through all of the
rings on a q_vector it makes sense to just add a function for iterating
through all of them.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    4 ++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   16 ++++++++--------
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 699899a..03175cb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -296,6 +296,10 @@ struct ixgbe_ring_container {
 	u8 itr;				/* current ITR setting for ring */
 };
 
+/* iterator for handling rings in ring container */
+#define ixgbe_for_each_ring(pos, head) \
+	for (pos = (head).ring; pos != NULL; pos = pos->next)
+
 #define MAX_RX_PACKET_BUFFERS ((adapter->flags & IXGBE_FLAG_DCB_ENABLED) \
                               ? 8 : 1)
 #define MAX_TX_PACKET_BUFFERS MAX_RX_PACKET_BUFFERS
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 604540d..95240ab 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -928,10 +928,10 @@ static void ixgbe_update_dca(struct ixgbe_q_vector *q_vector)
 	if (q_vector->cpu == cpu)
 		goto out_no_update;
 
-	for (ring = q_vector->tx.ring; ring != NULL; ring = ring->next)
+	ixgbe_for_each_ring(ring, q_vector->tx)
 		ixgbe_update_tx_dca(adapter, ring, cpu);
 
-	for (ring = q_vector->rx.ring; ring != NULL; ring = ring->next)
+	ixgbe_for_each_ring(ring, q_vector->rx)
 		ixgbe_update_rx_dca(adapter, ring, cpu);
 
 	q_vector->cpu = cpu;
@@ -1706,10 +1706,10 @@ static void ixgbe_configure_msix(struct ixgbe_adapter *adapter)
 		struct ixgbe_ring *ring;
 		q_vector = adapter->q_vector[v_idx];
 
-		for (ring = q_vector->rx.ring; ring != NULL; ring = ring->next)
+		ixgbe_for_each_ring(ring, q_vector->rx)
 			ixgbe_set_ivar(adapter, 0, ring->reg_idx, v_idx);
 
-		for (ring = q_vector->tx.ring; ring != NULL; ring = ring->next)
+		ixgbe_for_each_ring(ring, q_vector->tx)
 			ixgbe_set_ivar(adapter, 1, ring->reg_idx, v_idx);
 
 		if (q_vector->tx.ring && !q_vector->rx.ring) {
@@ -4195,7 +4195,7 @@ static int ixgbe_poll(struct napi_struct *napi, int budget)
 		ixgbe_update_dca(q_vector);
 #endif
 
-	for (ring = q_vector->tx.ring; ring != NULL; ring = ring->next)
+	ixgbe_for_each_ring(ring, q_vector->tx)
 		clean_complete &= !!ixgbe_clean_tx_irq(q_vector, ring);
 
 	/* attempt to distribute budget to each queue fairly, but don't allow
@@ -4205,7 +4205,7 @@ static int ixgbe_poll(struct napi_struct *napi, int budget)
 	else
 		per_ring_budget = budget;
 
-	for (ring = q_vector->rx.ring; ring != NULL; ring = ring->next)
+	ixgbe_for_each_ring(ring, q_vector->rx)
 		clean_complete &= ixgbe_clean_rx_irq(q_vector, ring,
 						     per_ring_budget);
 
@@ -4940,10 +4940,10 @@ static void ixgbe_free_q_vector(struct ixgbe_adapter *adapter, int v_idx)
 	struct ixgbe_q_vector *q_vector = adapter->q_vector[v_idx];
 	struct ixgbe_ring *ring;
 
-	for (ring = q_vector->tx.ring; ring != NULL; ring = ring->next)
+	ixgbe_for_each_ring(ring, q_vector->tx)
 		adapter->tx_ring[ring->queue_index] = NULL;
 
-	for (ring = q_vector->rx.ring; ring != NULL; ring = ring->next)
+	ixgbe_for_each_ring(ring, q_vector->rx)
 		adapter->rx_ring[ring->queue_index] = NULL;
 
 	adapter->q_vector[v_idx] = NULL;
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [net-next 14/14] ixgbe: Simplify logic for ethtool loopback frame creation and testing
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (12 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 13/14] ixgbe: Add iterator for cycling through rings on a q_vector Jeff Kirsher
@ 2012-03-13  4:03 ` Jeff Kirsher
  2012-03-13  5:55 ` [net-next 00/14][pull request] Intel Wired LAN Driver Updates David Miller
  14 siblings, 0 replies; 23+ messages in thread
From: Jeff Kirsher @ 2012-03-13  4:03 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change makes it a bit easier to do the loopback frame creating and
testing.  Previously we were doing an and to drop the last bit, and then
dividing the frame_size by 2 in order to get locations for frame bytes and
testing.  Instead we can simplify it by just shifting the register one bit
to the right and using that for the frame offsets.

This change also replaces all instances of rx_buffer_info with just
rx_buffer since that is closer to the name of the actual structure being
used and can save a few extra characters.

In addition I have updated the logic for cleaning up a test frame so that
we pass an rx_buffer instead of the sk_buff.  The main motivation behind
this is changes that will replace the sk_buff with just a page in the
future.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |   63 +++++++++++-----------
 1 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 51d159b..1216bae 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -1701,63 +1701,65 @@ static void ixgbe_loopback_cleanup(struct ixgbe_adapter *adapter)
 }
 
 static void ixgbe_create_lbtest_frame(struct sk_buff *skb,
-                                      unsigned int frame_size)
+				      unsigned int frame_size)
 {
 	memset(skb->data, 0xFF, frame_size);
-	frame_size &= ~1;
-	memset(&skb->data[frame_size / 2], 0xAA, frame_size / 2 - 1);
-	memset(&skb->data[frame_size / 2 + 10], 0xBE, 1);
-	memset(&skb->data[frame_size / 2 + 12], 0xAF, 1);
+	frame_size >>= 1;
+	memset(&skb->data[frame_size], 0xAA, frame_size / 2 - 1);
+	memset(&skb->data[frame_size + 10], 0xBE, 1);
+	memset(&skb->data[frame_size + 12], 0xAF, 1);
 }
 
-static int ixgbe_check_lbtest_frame(struct sk_buff *skb,
-                                    unsigned int frame_size)
+static bool ixgbe_check_lbtest_frame(struct ixgbe_rx_buffer *rx_buffer,
+				     unsigned int frame_size)
 {
-	frame_size &= ~1;
-	if (*(skb->data + 3) == 0xFF) {
-		if ((*(skb->data + frame_size / 2 + 10) == 0xBE) &&
-		    (*(skb->data + frame_size / 2 + 12) == 0xAF)) {
-			return 0;
-		}
-	}
-	return 13;
+	unsigned char *data;
+	bool match = true;
+
+	frame_size >>= 1;
+
+	data = rx_buffer->skb->data;
+
+	if (data[3] != 0xFF ||
+	    data[frame_size + 10] != 0xBE ||
+	    data[frame_size + 12] != 0xAF)
+		match = false;
+
+	return match;
 }
 
 static u16 ixgbe_clean_test_rings(struct ixgbe_ring *rx_ring,
-                                  struct ixgbe_ring *tx_ring,
-                                  unsigned int size)
+				  struct ixgbe_ring *tx_ring,
+				  unsigned int size)
 {
 	union ixgbe_adv_rx_desc *rx_desc;
-	struct ixgbe_rx_buffer *rx_buffer_info;
-	struct ixgbe_tx_buffer *tx_buffer_info;
-	const int bufsz = rx_ring->rx_buf_len;
-	u32 staterr;
+	struct ixgbe_rx_buffer *rx_buffer;
+	struct ixgbe_tx_buffer *tx_buffer;
 	u16 rx_ntc, tx_ntc, count = 0;
 
 	/* initialize next to clean and descriptor values */
 	rx_ntc = rx_ring->next_to_clean;
 	tx_ntc = tx_ring->next_to_clean;
 	rx_desc = IXGBE_RX_DESC(rx_ring, rx_ntc);
-	staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
 
-	while (staterr & IXGBE_RXD_STAT_DD) {
+	while (ixgbe_test_staterr(rx_desc, IXGBE_RXD_STAT_DD)) {
 		/* check Rx buffer */
-		rx_buffer_info = &rx_ring->rx_buffer_info[rx_ntc];
+		rx_buffer = &rx_ring->rx_buffer_info[rx_ntc];
 
 		/* unmap Rx buffer, will be remapped by alloc_rx_buffers */
 		dma_unmap_single(rx_ring->dev,
-		                 rx_buffer_info->dma,
-				 bufsz,
+				 rx_buffer->dma,
+				 rx_ring->rx_buf_len,
 				 DMA_FROM_DEVICE);
-		rx_buffer_info->dma = 0;
+		rx_buffer->dma = 0;
 
 		/* verify contents of skb */
-		if (!ixgbe_check_lbtest_frame(rx_buffer_info->skb, size))
+		if (ixgbe_check_lbtest_frame(rx_buffer, size))
 			count++;
 
 		/* unmap buffer on Tx side */
-		tx_buffer_info = &tx_ring->tx_buffer_info[tx_ntc];
-		ixgbe_unmap_and_free_tx_resource(tx_ring, tx_buffer_info);
+		tx_buffer = &tx_ring->tx_buffer_info[tx_ntc];
+		ixgbe_unmap_and_free_tx_resource(tx_ring, tx_buffer);
 
 		/* increment Rx/Tx next to clean counters */
 		rx_ntc++;
@@ -1769,7 +1771,6 @@ static u16 ixgbe_clean_test_rings(struct ixgbe_ring *rx_ring,
 
 		/* fetch next descriptor */
 		rx_desc = IXGBE_RX_DESC(rx_ring, rx_ntc);
-		staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
 	}
 
 	/* re-map buffers to ring, store next to clean values */
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [net-next 00/14][pull request] Intel Wired LAN Driver Updates
  2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
                   ` (13 preceding siblings ...)
  2012-03-13  4:03 ` [net-next 14/14] ixgbe: Simplify logic for ethtool loopback frame creation and testing Jeff Kirsher
@ 2012-03-13  5:55 ` David Miller
  14 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2012-03-13  5:55 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon, 12 Mar 2012 21:03:38 -0700

> This series of patches contains fixes/cleanups igb, ixgbe and net.
> Majority of the patches are against ixgbe and this series is part
> one of three to update ixgbe.
> 
> The following are changes since commit f124488e4713dc9afa2028553261b1d399286e68:
>   bnx2x: code doesn't use stats for allocating Rx BDs
> and are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master

Pulled, thanks Jeff.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-03-13  4:03 ` [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state Jeff Kirsher
@ 2012-04-20 21:19   ` Tom Herbert
  2012-04-20 21:31     ` John Fastabend
  0 siblings, 1 reply; 23+ messages in thread
From: Tom Herbert @ 2012-04-20 21:19 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, Alexander Duyck, netdev, gospo, sassmann

Hi Jeff,

> @@ -3244,7 +3246,6 @@ static void igb_clean_tx_ring(struct igb_ring *tx_ring)
>                buffer_info = &tx_ring->tx_buffer_info[i];
>                igb_unmap_and_free_tx_resource(tx_ring, buffer_info);
>        }
> -       netdev_tx_reset_queue(txring_txq(tx_ring));
>
Why is it necessary to remove this?  If rings are being freed with
going through completion path then we would need this reset.  Is this
being done elsewhere maybe?

Tom

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-04-20 21:19   ` Tom Herbert
@ 2012-04-20 21:31     ` John Fastabend
  2012-04-21  3:24       ` Alexander Duyck
  0 siblings, 1 reply; 23+ messages in thread
From: John Fastabend @ 2012-04-20 21:31 UTC (permalink / raw)
  To: Tom Herbert, Alexander Duyck; +Cc: Jeff Kirsher, davem, netdev, gospo, sassmann

On 4/20/2012 2:19 PM, Tom Herbert wrote:
> Hi Jeff,
> 
>> @@ -3244,7 +3246,6 @@ static void igb_clean_tx_ring(struct igb_ring *tx_ring)
>>                buffer_info = &tx_ring->tx_buffer_info[i];
>>                igb_unmap_and_free_tx_resource(tx_ring, buffer_info);
>>        }
>> -       netdev_tx_reset_queue(txring_txq(tx_ring));
>>
> Why is it necessary to remove this?  If rings are being freed with
> going through completion path then we would need this reset.  Is this
> being done elsewhere maybe?
> 

Alex moved this into the igb_configure_tx_ring() iirc to catch an ethtool
test case. He assured me when I reviewed the patch that igb_configure_tx_ring()
would always be called before netif_carrier_on() so I think(?) that should
be OK.

The above removed case was called after netif_carrier_off() anyways.

> Tom
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-04-20 21:31     ` John Fastabend
@ 2012-04-21  3:24       ` Alexander Duyck
  2012-04-21  6:01         ` Tom Herbert
  0 siblings, 1 reply; 23+ messages in thread
From: Alexander Duyck @ 2012-04-21  3:24 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Alexander Duyck, Jeff Kirsher, davem, netdev, gospo, sassmann,
	John Fastabend

On Fri, Apr 20, 2012 at 2:31 PM, John Fastabend
<john.r.fastabend@intel.com> wrote:
> On 4/20/2012 2:19 PM, Tom Herbert wrote:
>> Hi Jeff,
>>
>>> @@ -3244,7 +3246,6 @@ static void igb_clean_tx_ring(struct igb_ring *tx_ring)
>>>                buffer_info = &tx_ring->tx_buffer_info[i];
>>>                igb_unmap_and_free_tx_resource(tx_ring, buffer_info);
>>>        }
>>> -       netdev_tx_reset_queue(txring_txq(tx_ring));
>>>
>> Why is it necessary to remove this?  If rings are being freed with
>> going through completion path then we would need this reset.  Is this
>> being done elsewhere maybe?
>>
>
> Alex moved this into the igb_configure_tx_ring() iirc to catch an ethtool
> test case. He assured me when I reviewed the patch that igb_configure_tx_ring()
> would always be called before netif_carrier_on() so I think(?) that should
> be OK.
>
> The above removed case was called after netif_carrier_off() anyways.

I don't recall the exact reason now, as John said I think it had to do
with the ethtool tests not using the same cleanup routine and leaving
us in a bad state.  I am pretty sure there was some path in which
where the call was didn't work but I do not recall the exact details
now.  Most of the reason for moving it is due to the fact that the
reset is now also clearing the bit, and from the driver perspective we
didn't need it in two places.  After looking it all over again, I
suppose this causes a cosmetic issue for the bql "inflight" statistic
in sysfs since the value will be retained until the interface is
brought back up with this change.  Is that an issue or something that
can be lived with since the interface isn't active anyway in this
case?

Thanks,

Alex

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-04-21  3:24       ` Alexander Duyck
@ 2012-04-21  6:01         ` Tom Herbert
  2012-04-21 16:27           ` John Fastabend
  0 siblings, 1 reply; 23+ messages in thread
From: Tom Herbert @ 2012-04-21  6:01 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, Jeff Kirsher, davem, netdev, gospo, sassmann,
	John Fastabend

> I don't recall the exact reason now, as John said I think it had to do
> with the ethtool tests not using the same cleanup routine and leaving
> us in a bad state.  I am pretty sure there was some path in which
> where the call was didn't work but I do not recall the exact details
> now.  Most of the reason for moving it is due to the fact that the
> reset is now also clearing the bit, and from the driver perspective we
> didn't need it in two places.  After looking it all over again, I
> suppose this causes a cosmetic issue for the bql "inflight" statistic
> in sysfs since the value will be retained until the interface is
> brought back up with this change.  Is that an issue or something that
> can be lived with since the interface isn't active anyway in this
> case?
>
On the surface, it seems cleaner and probably a simpler convention to
clear the state when the buffers are being freed.  If there's a good
reason not to do this for igb, it makes me wonder if this should be
done the same way in other drivers...

Tom

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-04-21  6:01         ` Tom Herbert
@ 2012-04-21 16:27           ` John Fastabend
  2012-04-21 20:18             ` David Miller
  0 siblings, 1 reply; 23+ messages in thread
From: John Fastabend @ 2012-04-21 16:27 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Alexander Duyck, Alexander Duyck, Jeff Kirsher, davem, netdev,
	gospo, sassmann

On 4/20/2012 11:01 PM, Tom Herbert wrote:
>> I don't recall the exact reason now, as John said I think it had to do
>> with the ethtool tests not using the same cleanup routine and leaving
>> us in a bad state.  I am pretty sure there was some path in which
>> where the call was didn't work but I do not recall the exact details
>> now.  Most of the reason for moving it is due to the fact that the
>> reset is now also clearing the bit, and from the driver perspective we
>> didn't need it in two places.  After looking it all over again, I
>> suppose this causes a cosmetic issue for the bql "inflight" statistic
>> in sysfs since the value will be retained until the interface is
>> brought back up with this change.  Is that an issue or something that
>> can be lived with since the interface isn't active anyway in this
>> case?
>>
> On the surface, it seems cleaner and probably a simpler convention to
> clear the state when the buffers are being freed.  If there's a good
> reason not to do this for igb, it makes me wonder if this should be
> done the same way in other drivers...
> 
> Tom

Just dug up the old thread. This was to fix a bug that byte queue limits
was causing with some of the loopback tests evoked from ethtool. The
loopback test uses a separate path to free the buffers which wasn't
calling netdev_tx_reset_queue().

We should likely just explicitly clear the state in each routine rather
than try to be clever IMO.

.John

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-04-21 16:27           ` John Fastabend
@ 2012-04-21 20:18             ` David Miller
  2012-04-23  3:21               ` John Fastabend
  0 siblings, 1 reply; 23+ messages in thread
From: David Miller @ 2012-04-21 20:18 UTC (permalink / raw)
  To: john.r.fastabend
  Cc: therbert, alexander.duyck, alexander.h.duyck, jeffrey.t.kirsher,
	netdev, gospo, sassmann

From: John Fastabend <john.r.fastabend@intel.com>
Date: Sat, 21 Apr 2012 09:27:31 -0700

> On 4/20/2012 11:01 PM, Tom Herbert wrote:
>>> I don't recall the exact reason now, as John said I think it had to do
>>> with the ethtool tests not using the same cleanup routine and leaving
>>> us in a bad state.  I am pretty sure there was some path in which
>>> where the call was didn't work but I do not recall the exact details
>>> now.  Most of the reason for moving it is due to the fact that the
>>> reset is now also clearing the bit, and from the driver perspective we
>>> didn't need it in two places.  After looking it all over again, I
>>> suppose this causes a cosmetic issue for the bql "inflight" statistic
>>> in sysfs since the value will be retained until the interface is
>>> brought back up with this change.  Is that an issue or something that
>>> can be lived with since the interface isn't active anyway in this
>>> case?
>>>
>> On the surface, it seems cleaner and probably a simpler convention to
>> clear the state when the buffers are being freed.  If there's a good
>> reason not to do this for igb, it makes me wonder if this should be
>> done the same way in other drivers...
>> 
>> Tom
> 
> Just dug up the old thread. This was to fix a bug that byte queue limits
> was causing with some of the loopback tests evoked from ethtool. The
> loopback test uses a separate path to free the buffers which wasn't
> calling netdev_tx_reset_queue().
> 
> We should likely just explicitly clear the state in each routine rather
> than try to be clever IMO.

Agreed.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state
  2012-04-21 20:18             ` David Miller
@ 2012-04-23  3:21               ` John Fastabend
  0 siblings, 0 replies; 23+ messages in thread
From: John Fastabend @ 2012-04-23  3:21 UTC (permalink / raw)
  To: David Miller, Kirsher, Jeffrey T
  Cc: therbert, alexander.duyck, alexander.h.duyck, jeffrey.t.kirsher,
	netdev, gospo, sassmann

On 4/21/2012 1:18 PM, David Miller wrote:
> From: John Fastabend <john.r.fastabend@intel.com>
> Date: Sat, 21 Apr 2012 09:27:31 -0700
> 
>> On 4/20/2012 11:01 PM, Tom Herbert wrote:
>>>> I don't recall the exact reason now, as John said I think it had to do
>>>> with the ethtool tests not using the same cleanup routine and leaving
>>>> us in a bad state.  I am pretty sure there was some path in which
>>>> where the call was didn't work but I do not recall the exact details
>>>> now.  Most of the reason for moving it is due to the fact that the
>>>> reset is now also clearing the bit, and from the driver perspective we
>>>> didn't need it in two places.  After looking it all over again, I
>>>> suppose this causes a cosmetic issue for the bql "inflight" statistic
>>>> in sysfs since the value will be retained until the interface is
>>>> brought back up with this change.  Is that an issue or something that
>>>> can be lived with since the interface isn't active anyway in this
>>>> case?
>>>>
>>> On the surface, it seems cleaner and probably a simpler convention to
>>> clear the state when the buffers are being freed.  If there's a good
>>> reason not to do this for igb, it makes me wonder if this should be
>>> done the same way in other drivers...
>>>
>>> Tom
>>
>> Just dug up the old thread. This was to fix a bug that byte queue limits
>> was causing with some of the loopback tests evoked from ethtool. The
>> loopback test uses a separate path to free the buffers which wasn't
>> calling netdev_tx_reset_queue().
>>
>> We should likely just explicitly clear the state in each routine rather
>> than try to be clever IMO.
> 
> Agreed.

Further it can cause a real bug (the qdisc stack bit doesn't get cleared) if
the real_num_tx_queues is decreased for example in ixgbe when DCB is disabled.
I'll send a patch to JeffK.

Thanks,
John

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2012-04-23  3:21 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-13  4:03 [net-next 00/14][pull request] Intel Wired LAN Driver Updates Jeff Kirsher
2012-03-13  4:03 ` [net-next 01/14] igb: fix ethtool offline test Jeff Kirsher
2012-03-13  4:03 ` [net-next 02/14] ixgbe: remove tie between NAPI work limits and interrupt moderation Jeff Kirsher
2012-03-13  4:03 ` [net-next 03/14] ixgbe: add support for byte queue limits Jeff Kirsher
2012-03-13  4:03 ` [net-next 04/14] net: Fix issue with netdev_tx_reset_queue not resetting queue from XOFF state Jeff Kirsher
2012-04-20 21:19   ` Tom Herbert
2012-04-20 21:31     ` John Fastabend
2012-04-21  3:24       ` Alexander Duyck
2012-04-21  6:01         ` Tom Herbert
2012-04-21 16:27           ` John Fastabend
2012-04-21 20:18             ` David Miller
2012-04-23  3:21               ` John Fastabend
2012-03-13  4:03 ` [net-next 05/14] net: Add memory barriers to prevent possible race in byte queue limits Jeff Kirsher
2012-03-13  4:03 ` [net-next 06/14] ixgbe: Do no clear Tx status bits since eop_desc provides enough info Jeff Kirsher
2012-03-13  4:03 ` [net-next 07/14] ixgbe: Reorder adapter contents for better cache utilization Jeff Kirsher
2012-03-13  4:03 ` [net-next 08/14] ixgbe: Address issues with Tx WHTRESH value not being set correctly Jeff Kirsher
2012-03-13  4:03 ` [net-next 09/14] ixgbe: Correct Adaptive Interrupt Moderation so that it will change values Jeff Kirsher
2012-03-13  4:03 ` [net-next 10/14] ixgbe: Default to queue pairs when number of queues is less than CPUs Jeff Kirsher
2012-03-13  4:03 ` [net-next 11/14] ixgbe: Drop unnecessary napi_schedule_prep and spare blank line from ixgbe_intr Jeff Kirsher
2012-03-13  4:03 ` [net-next 12/14] ixgbe: Allocate rings as part of the q_vector Jeff Kirsher
2012-03-13  4:03 ` [net-next 13/14] ixgbe: Add iterator for cycling through rings on a q_vector Jeff Kirsher
2012-03-13  4:03 ` [net-next 14/14] ixgbe: Simplify logic for ethtool loopback frame creation and testing Jeff Kirsher
2012-03-13  5:55 ` [net-next 00/14][pull request] Intel Wired LAN Driver Updates David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).