netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18
@ 2014-11-19  4:10 Jeff Kirsher
  2014-11-19  4:10 ` [net-next 01/15] ixgbevf: Update ixgbevf_alloc_rx_buffers to handle clearing of status bits Jeff Kirsher
                   ` (14 more replies)
  0 siblings, 15 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains updates to ixgbevf, i40e and i40evf.

Emil updates ixgbevf with much of the work that Alex Duyck did while at
Intel.  First updates the driver to clear the status bits on allocation
instead of in the cleanup routine, this way we can leave the recieve
descriptor rings as a read only memory block until we actually have
buffers to give back to the hardware.  Clean up ixgbevf_clean_rx_irq()
by creating ixgbevf_process_skb_field() to merge several similar
operations into this new function.  Cleanup temporary variables within
the receive hot-path and reducing the scope of variables that do not
need to exist outside the main loop.  Save on stack space by just
storing our updated values back in next_to_clean instead of using
a stack variable, which also collapses the size the function.  Improve
performace on IOMMU enabled systems and reduce cache misses by changing
the basic receive patch for ixgbevf so that instead of receiving the
data into an skb, it is received into a double buffered page.  Add
netpoll support by creating ixgbevf_netpoll(), which is a callback for
.ndo_poll_controller to allow for the VF interface to be used with
netconsole.

Mitch provides several cleanups and trivial fixes for i40e and i40evf.
First is a fix the overloading of the msg_size field in the
arq_event_info struct by splitting the field into two and renaming to
indicate the actual function of each field.  Updates code comments
to match the actual function.  Cleanup several checkpatch.pl warnings
by adding or removing blank lines, aligning function parameters, and
correcting over-long lines (which makes the code more readable).

Shannon provides a patch for i40e to write the extra bits that will
turn off the ITR wait for the interrupt, since we want the SW INT to
go off as soon as possible.

The following are changes since commit e3e3217029a35c579bf100998b43976d0b1cb8d7:
  icmp: Remove some spurious dropped packet profile hits from the ICMP path
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master

Emil Tantilov (9):
  ixgbevf: Update ixgbevf_alloc_rx_buffers to handle clearing of status
    bits
  ixgbevf: Test Rx status bits directly out of the descriptor
  ixgbevf: Combine the logic for post Rx processing into single function
  ixgbevf: Cleanup variable usage, improve stack performance
  ixgbevf: reorder main loop in ixgbe_clean_rx_irq to allow for
    do/while/continue
  ixgbevf: Update Rx next to clean in real time
  ixgbevf: Change receive model to use double buffered page based
    receives
  ixgbevf: compare total_rx_packets and budget in ixgbevf_clean_rx_irq
  ixgbevf: add netpoll support

Mitch Williams (5):
  i40e: don't overload fields
  i40evf: update header comments
  i40evf: make checkpatch happy
  i40evf: make comparisons consistent
  i40evf: remove unnecessary else

Shannon Nelson (1):
  i40e: trigger SW INT with no ITR wait

 drivers/net/ethernet/intel/i40e/i40e_adminq.c      |   6 +-
 drivers/net/ethernet/intel/i40e/i40e_adminq.h      |   3 +-
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c     |   5 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c        |  17 +-
 drivers/net/ethernet/intel/i40evf/i40e_adminq.c    |   6 +-
 drivers/net/ethernet/intel/i40evf/i40e_adminq.h    |   3 +-
 drivers/net/ethernet/intel/i40evf/i40evf.h         |   2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c |  14 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    |  71 +-
 .../net/ethernet/intel/i40evf/i40evf_virtchnl.c    |  23 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h       |  39 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c  | 728 ++++++++++++++-------
 12 files changed, 592 insertions(+), 325 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [net-next 01/15] ixgbevf: Update ixgbevf_alloc_rx_buffers to handle clearing of status bits
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 02/15] ixgbevf: Test Rx status bits directly out of the descriptor Jeff Kirsher
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

Instead of clearing the status bits in the cleanup it makes more sense to
just clear the status bits on allocation.  This way we can leave the Rx
descriptor rings as a read only memory block until we actually have buffers
to give back to the hardware.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 132 +++++++++++++---------
 1 file changed, 80 insertions(+), 52 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 030a219..deda74d 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -143,21 +143,6 @@ u32 ixgbevf_read_reg(struct ixgbe_hw *hw, u32 reg)
 	return value;
 }
 
-static inline void ixgbevf_release_rx_desc(struct ixgbevf_ring *rx_ring,
-					   u32 val)
-{
-	rx_ring->next_to_use = val;
-
-	/*
-	 * Force memory writes to complete before letting h/w
-	 * know there are new descriptors to fetch.  (Only
-	 * applicable for weak-ordered memory model archs,
-	 * such as IA-64).
-	 */
-	wmb();
-	ixgbevf_write_tail(rx_ring, val);
-}
-
 /**
  * ixgbevf_set_ivar - set IVAR registers - maps interrupt causes to vectors
  * @adapter: pointer to adapter struct
@@ -424,52 +409,99 @@ static inline void ixgbevf_rx_checksum(struct ixgbevf_ring *ring,
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
 }
 
+static bool ixgbevf_alloc_mapped_skb(struct ixgbevf_ring *rx_ring,
+				     struct ixgbevf_rx_buffer *bi)
+{
+	struct sk_buff *skb = bi->skb;
+	dma_addr_t dma = bi->dma;
+
+	if (unlikely(skb))
+		return true;
+
+	skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
+					rx_ring->rx_buf_len);
+	if (unlikely(!skb)) {
+		rx_ring->rx_stats.alloc_rx_buff_failed++;
+		return false;
+	}
+
+	dma = dma_map_single(rx_ring->dev, skb->data,
+			     rx_ring->rx_buf_len, DMA_FROM_DEVICE);
+
+	/* if mapping failed free memory back to system since
+	 * there isn't much point in holding memory we can't use
+	 */
+	if (dma_mapping_error(rx_ring->dev, dma)) {
+		dev_kfree_skb_any(skb);
+
+		rx_ring->rx_stats.alloc_rx_buff_failed++;
+		return false;
+	}
+
+	bi->skb = skb;
+	bi->dma = dma;
+
+	return true;
+}
+
 /**
  * ixgbevf_alloc_rx_buffers - Replace used receive buffers; packet split
  * @rx_ring: rx descriptor ring (for a specific queue) to setup buffers on
+ * @cleaned_count: number of buffers to replace
  **/
 static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
-				     int cleaned_count)
+				     u16 cleaned_count)
 {
 	union ixgbe_adv_rx_desc *rx_desc;
 	struct ixgbevf_rx_buffer *bi;
 	unsigned int i = rx_ring->next_to_use;
 
-	while (cleaned_count--) {
-		rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
-		bi = &rx_ring->rx_buffer_info[i];
-
-		if (!bi->skb) {
-			struct sk_buff *skb;
+	/* nothing to do or no valid netdev defined */
+	if (!cleaned_count || !rx_ring->netdev)
+		return;
 
-			skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-							rx_ring->rx_buf_len);
-			if (!skb)
-				goto no_buffers;
+	rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
+	bi = &rx_ring->rx_buffer_info[i];
+	i -= rx_ring->count;
 
-			bi->skb = skb;
+	do {
+		if (!ixgbevf_alloc_mapped_skb(rx_ring, bi))
+			break;
 
-			bi->dma = dma_map_single(rx_ring->dev, skb->data,
-						 rx_ring->rx_buf_len,
-						 DMA_FROM_DEVICE);
-			if (dma_mapping_error(rx_ring->dev, bi->dma)) {
-				dev_kfree_skb(skb);
-				bi->skb = NULL;
-				dev_err(rx_ring->dev, "Rx DMA map failed\n");
-				break;
-			}
-		}
+		/* Refresh the desc even if pkt_addr didn't change
+		 * because each write-back erases this info.
+		 */
 		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma);
 
+		rx_desc++;
+		bi++;
 		i++;
-		if (i == rx_ring->count)
-			i = 0;
-	}
+		if (unlikely(!i)) {
+			rx_desc = IXGBEVF_RX_DESC(rx_ring, 0);
+			bi = rx_ring->rx_buffer_info;
+			i -= rx_ring->count;
+		}
+
+		/* clear the hdr_addr for the next_to_use descriptor */
+		rx_desc->read.hdr_addr = 0;
+
+		cleaned_count--;
+	} while (cleaned_count);
 
-no_buffers:
-	rx_ring->rx_stats.alloc_rx_buff_failed++;
-	if (rx_ring->next_to_use != i)
-		ixgbevf_release_rx_desc(rx_ring, i);
+	i += rx_ring->count;
+
+	if (rx_ring->next_to_use != i) {
+		/* record the next descriptor to use */
+		rx_ring->next_to_use = i;
+
+		/* Force memory writes to complete before letting h/w
+		 * know there are new descriptors to fetch.  (Only
+		 * applicable for weak-ordered memory model archs,
+		 * such as IA-64).
+		 */
+		wmb();
+		ixgbevf_write_tail(rx_ring, i);
+	}
 }
 
 static inline void ixgbevf_irq_enable_queues(struct ixgbevf_adapter *adapter,
@@ -489,8 +521,8 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 	struct sk_buff *skb;
 	unsigned int i;
 	u32 len, staterr;
-	int cleaned_count = 0;
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
+	u16 cleaned_count = ixgbevf_desc_unused(rx_ring);
 
 	i = rx_ring->next_to_clean;
 	rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
@@ -571,8 +603,6 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		ixgbevf_rx_skb(q_vector, skb, staterr, rx_desc);
 
 next_desc:
-		rx_desc->wb.upper.status_error = 0;
-
 		/* return some buffers to hardware, one at a time is too slow */
 		if (cleaned_count >= IXGBEVF_RX_BUFFER_WRITE) {
 			ixgbevf_alloc_rx_buffers(rx_ring, cleaned_count);
@@ -587,11 +617,6 @@ next_desc:
 	}
 
 	rx_ring->next_to_clean = i;
-	cleaned_count = ixgbevf_desc_unused(rx_ring);
-
-	if (cleaned_count)
-		ixgbevf_alloc_rx_buffers(rx_ring, cleaned_count);
-
 	u64_stats_update_begin(&rx_ring->syncp);
 	rx_ring->stats.packets += total_rx_packets;
 	rx_ring->stats.bytes += total_rx_bytes;
@@ -599,6 +624,9 @@ next_desc:
 	q_vector->rx.total_packets += total_rx_packets;
 	q_vector->rx.total_bytes += total_rx_bytes;
 
+	if (cleaned_count)
+		ixgbevf_alloc_rx_buffers(rx_ring, cleaned_count);
+
 	return total_rx_packets;
 }
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 02/15] ixgbevf: Test Rx status bits directly out of the descriptor
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
  2014-11-19  4:10 ` [net-next 01/15] ixgbevf: Update ixgbevf_alloc_rx_buffers to handle clearing of status bits Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 03/15] ixgbevf: Combine the logic for post Rx processing into single function Jeff Kirsher
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

Instead of keeping a local copy of the status bits from the descriptor
we can just read them directly - this is accomplished with the addition
of ixgbevf_test_staterr().

In addition instead of doing a byteswap on the status bits value, we
can byteswap the constant values we are testing since that can be done
at compile time which should help to improve performance on big-endian
systems.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  7 +++
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 59 ++++++++++-------------
 2 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index ba96cb5..5f7d2f3 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -307,6 +307,13 @@ static inline bool ixgbevf_qv_disable(struct ixgbevf_q_vector *q_vector)
 	((_eitr) ? (1000000000 / ((_eitr) * 256)) : 8)
 #define EITR_REG_TO_INTS_PER_SEC EITR_INTS_PER_SEC_TO_REG
 
+/* ixgbevf_test_staterr - tests bits in Rx descriptor status and error fields */
+static inline __le32 ixgbevf_test_staterr(union ixgbe_adv_rx_desc *rx_desc,
+					  const u32 stat_err_bits)
+{
+	return rx_desc->wb.upper.status_error & cpu_to_le32(stat_err_bits);
+}
+
 static inline u16 ixgbevf_desc_unused(struct ixgbevf_ring *ring)
 {
 	u16 ntc = ring->next_to_clean;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index deda74d..19062dc 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -331,15 +331,14 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
  * ixgbevf_receive_skb - Send a completed packet up the stack
  * @q_vector: structure containing interrupt and ring information
  * @skb: packet to send up
- * @status: hardware indication of status of receive
  * @rx_desc: rx descriptor
  **/
 static void ixgbevf_receive_skb(struct ixgbevf_q_vector *q_vector,
-				struct sk_buff *skb, u8 status,
+				struct sk_buff *skb,
 				union ixgbe_adv_rx_desc *rx_desc)
 {
 	struct ixgbevf_adapter *adapter = q_vector->adapter;
-	bool is_vlan = (status & IXGBE_RXD_STAT_VP);
+	bool is_vlan = !!ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_VP);
 	u16 tag = le16_to_cpu(rx_desc->wb.upper.vlan);
 
 	if (is_vlan && test_bit(tag & VLAN_VID_MASK, adapter->active_vlans))
@@ -355,11 +354,10 @@ static void ixgbevf_receive_skb(struct ixgbevf_q_vector *q_vector,
  * ixgbevf_rx_skb - Helper function to determine proper Rx method
  * @q_vector: structure containing interrupt and ring information
  * @skb: packet to send up
- * @status: hardware indication of status of receive
  * @rx_desc: rx descriptor
  **/
 static void ixgbevf_rx_skb(struct ixgbevf_q_vector *q_vector,
-			   struct sk_buff *skb, u8 status,
+			   struct sk_buff *skb,
 			   union ixgbe_adv_rx_desc *rx_desc)
 {
 #ifdef CONFIG_NET_RX_BUSY_POLL
@@ -372,17 +370,17 @@ static void ixgbevf_rx_skb(struct ixgbevf_q_vector *q_vector,
 	}
 #endif /* CONFIG_NET_RX_BUSY_POLL */
 
-	ixgbevf_receive_skb(q_vector, skb, status, rx_desc);
+	ixgbevf_receive_skb(q_vector, skb, rx_desc);
 }
 
-/**
- * ixgbevf_rx_checksum - indicate in skb if hw indicated a good cksum
- * @ring: pointer to Rx descriptor ring structure
- * @status_err: hardware indication of status of receive
+/* ixgbevf_rx_checksum - indicate in skb if hw indicated a good cksum
+ * @ring: structure containig ring specific data
+ * @rx_desc: current Rx descriptor being processed
  * @skb: skb currently being received and modified
- **/
+ */
 static inline void ixgbevf_rx_checksum(struct ixgbevf_ring *ring,
-				       u32 status_err, struct sk_buff *skb)
+				       union ixgbe_adv_rx_desc *rx_desc,
+				       struct sk_buff *skb)
 {
 	skb_checksum_none_assert(skb);
 
@@ -391,16 +389,16 @@ static inline void ixgbevf_rx_checksum(struct ixgbevf_ring *ring,
 		return;
 
 	/* if IP and error */
-	if ((status_err & IXGBE_RXD_STAT_IPCS) &&
-	    (status_err & IXGBE_RXDADV_ERR_IPE)) {
+	if (ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_IPCS) &&
+	    ixgbevf_test_staterr(rx_desc, IXGBE_RXDADV_ERR_IPE)) {
 		ring->rx_stats.csum_err++;
 		return;
 	}
 
-	if (!(status_err & IXGBE_RXD_STAT_L4CS))
+	if (!ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_L4CS))
 		return;
 
-	if (status_err & IXGBE_RXDADV_ERR_TCPE) {
+	if (ixgbevf_test_staterr(rx_desc, IXGBE_RXDADV_ERR_TCPE)) {
 		ring->rx_stats.csum_err++;
 		return;
 	}
@@ -520,33 +518,29 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 	struct ixgbevf_rx_buffer *rx_buffer_info, *next_buffer;
 	struct sk_buff *skb;
 	unsigned int i;
-	u32 len, staterr;
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = ixgbevf_desc_unused(rx_ring);
 
 	i = rx_ring->next_to_clean;
 	rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
-	staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
 	rx_buffer_info = &rx_ring->rx_buffer_info[i];
 
-	while (staterr & IXGBE_RXD_STAT_DD) {
+	while (ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_DD)) {
 		if (!budget)
 			break;
 		budget--;
 
 		rmb(); /* read descriptor and rx_buffer_info after status DD */
-		len = le16_to_cpu(rx_desc->wb.upper.length);
+
 		skb = rx_buffer_info->skb;
 		prefetch(skb->data - NET_IP_ALIGN);
 		rx_buffer_info->skb = NULL;
 
-		if (rx_buffer_info->dma) {
-			dma_unmap_single(rx_ring->dev, rx_buffer_info->dma,
-					 rx_ring->rx_buf_len,
-					 DMA_FROM_DEVICE);
-			rx_buffer_info->dma = 0;
-			skb_put(skb, len);
-		}
+		dma_unmap_single(rx_ring->dev, rx_buffer_info->dma,
+				 rx_ring->rx_buf_len,
+				 DMA_FROM_DEVICE);
+		rx_buffer_info->dma = 0;
+		skb_put(skb, le16_to_cpu(rx_desc->wb.upper.length));
 
 		i++;
 		if (i == rx_ring->count)
@@ -558,7 +552,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 
 		next_buffer = &rx_ring->rx_buffer_info[i];
 
-		if (!(staterr & IXGBE_RXD_STAT_EOP)) {
+		if (!(ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_EOP))) {
 			skb->next = next_buffer->skb;
 			IXGBE_CB(skb->next)->prev = skb;
 			rx_ring->rx_stats.non_eop_descs++;
@@ -576,12 +570,13 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		}
 
 		/* ERR_MASK will only have valid bits if EOP set */
-		if (unlikely(staterr & IXGBE_RXDADV_ERR_FRAME_ERR_MASK)) {
+		if (unlikely(ixgbevf_test_staterr(rx_desc,
+					    IXGBE_RXDADV_ERR_FRAME_ERR_MASK))) {
 			dev_kfree_skb_irq(skb);
 			goto next_desc;
 		}
 
-		ixgbevf_rx_checksum(rx_ring, staterr, skb);
+		ixgbevf_rx_checksum(rx_ring, rx_desc, skb);
 
 		/* probably a little skewed due to removing CRC */
 		total_rx_bytes += skb->len;
@@ -600,7 +595,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 			goto next_desc;
 		}
 
-		ixgbevf_rx_skb(q_vector, skb, staterr, rx_desc);
+		ixgbevf_rx_skb(q_vector, skb, rx_desc);
 
 next_desc:
 		/* return some buffers to hardware, one at a time is too slow */
@@ -612,8 +607,6 @@ next_desc:
 		/* use prefetched values */
 		rx_desc = next_rxd;
 		rx_buffer_info = &rx_ring->rx_buffer_info[i];
-
-		staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
 	}
 
 	rx_ring->next_to_clean = i;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 03/15] ixgbevf: Combine the logic for post Rx processing into single function
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
  2014-11-19  4:10 ` [net-next 01/15] ixgbevf: Update ixgbevf_alloc_rx_buffers to handle clearing of status bits Jeff Kirsher
  2014-11-19  4:10 ` [net-next 02/15] ixgbevf: Test Rx status bits directly out of the descriptor Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 04/15] ixgbevf: Cleanup variable usage, improve stack performance Jeff Kirsher
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

This patch cleans up ixgbevf_clean_rx_irq() by merging several similar
operations into a new function - ixgbevf_process_skb_fields().

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  4 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 68 ++++++++++++-----------
 2 files changed, 38 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index 5f7d2f3..90d5751 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -346,8 +346,10 @@ static inline void ixgbevf_write_tail(struct ixgbevf_ring *ring, u32 value)
 
 /* board specific private data structure */
 struct ixgbevf_adapter {
-	struct timer_list watchdog_timer;
+	/* this field must be first, see ixgbevf_process_skb_fields */
 	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
+
+	struct timer_list watchdog_timer;
 	struct work_struct reset_task;
 	struct ixgbevf_q_vector *q_vector[MAX_MSIX_Q_VECTORS];
 
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 19062dc..36b005e 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -328,37 +328,12 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 }
 
 /**
- * ixgbevf_receive_skb - Send a completed packet up the stack
- * @q_vector: structure containing interrupt and ring information
- * @skb: packet to send up
- * @rx_desc: rx descriptor
- **/
-static void ixgbevf_receive_skb(struct ixgbevf_q_vector *q_vector,
-				struct sk_buff *skb,
-				union ixgbe_adv_rx_desc *rx_desc)
-{
-	struct ixgbevf_adapter *adapter = q_vector->adapter;
-	bool is_vlan = !!ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_VP);
-	u16 tag = le16_to_cpu(rx_desc->wb.upper.vlan);
-
-	if (is_vlan && test_bit(tag & VLAN_VID_MASK, adapter->active_vlans))
-		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), tag);
-
-	if (!(adapter->flags & IXGBE_FLAG_IN_NETPOLL))
-		napi_gro_receive(&q_vector->napi, skb);
-	else
-		netif_rx(skb);
-}
-
-/**
  * ixgbevf_rx_skb - Helper function to determine proper Rx method
  * @q_vector: structure containing interrupt and ring information
  * @skb: packet to send up
- * @rx_desc: rx descriptor
  **/
 static void ixgbevf_rx_skb(struct ixgbevf_q_vector *q_vector,
-			   struct sk_buff *skb,
-			   union ixgbe_adv_rx_desc *rx_desc)
+			   struct sk_buff *skb)
 {
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	skb_mark_napi_id(skb, &q_vector->napi);
@@ -369,8 +344,10 @@ static void ixgbevf_rx_skb(struct ixgbevf_q_vector *q_vector,
 		return;
 	}
 #endif /* CONFIG_NET_RX_BUSY_POLL */
-
-	ixgbevf_receive_skb(q_vector, skb, rx_desc);
+	if (!(q_vector->adapter->flags & IXGBE_FLAG_IN_NETPOLL))
+		napi_gro_receive(&q_vector->napi, skb);
+	else
+		netif_rx(skb);
 }
 
 /* ixgbevf_rx_checksum - indicate in skb if hw indicated a good cksum
@@ -407,6 +384,32 @@ static inline void ixgbevf_rx_checksum(struct ixgbevf_ring *ring,
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
 }
 
+/* ixgbevf_process_skb_fields - Populate skb header fields from Rx descriptor
+ * @rx_ring: rx descriptor ring packet is being transacted on
+ * @rx_desc: pointer to the EOP Rx descriptor
+ * @skb: pointer to current skb being populated
+ *
+ * This function checks the ring, descriptor, and packet information in
+ * order to populate the checksum, VLAN, protocol, and other fields within
+ * the skb.
+ */
+static void ixgbevf_process_skb_fields(struct ixgbevf_ring *rx_ring,
+				       union ixgbe_adv_rx_desc *rx_desc,
+				       struct sk_buff *skb)
+{
+	ixgbevf_rx_checksum(rx_ring, rx_desc, skb);
+
+	if (ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_VP)) {
+		u16 vid = le16_to_cpu(rx_desc->wb.upper.vlan);
+		unsigned long *active_vlans = netdev_priv(rx_ring->netdev);
+
+		if (test_bit(vid & VLAN_VID_MASK, active_vlans))
+			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vid);
+	}
+
+	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
+}
+
 static bool ixgbevf_alloc_mapped_skb(struct ixgbevf_ring *rx_ring,
 				     struct ixgbevf_rx_buffer *bi)
 {
@@ -576,14 +579,10 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 			goto next_desc;
 		}
 
-		ixgbevf_rx_checksum(rx_ring, rx_desc, skb);
-
 		/* probably a little skewed due to removing CRC */
 		total_rx_bytes += skb->len;
 		total_rx_packets++;
 
-		skb->protocol = eth_type_trans(skb, rx_ring->netdev);
-
 		/* Workaround hardware that can't do proper VEPA multicast
 		 * source pruning.
 		 */
@@ -595,7 +594,10 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 			goto next_desc;
 		}
 
-		ixgbevf_rx_skb(q_vector, skb, rx_desc);
+		/* populate checksum, VLAN, and protocol */
+		ixgbevf_process_skb_fields(rx_ring, rx_desc, skb);
+
+		ixgbevf_rx_skb(q_vector, skb);
 
 next_desc:
 		/* return some buffers to hardware, one at a time is too slow */
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 04/15] ixgbevf: Cleanup variable usage, improve stack performance
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (2 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 03/15] ixgbevf: Combine the logic for post Rx processing into single function Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 05/15] ixgbevf: reorder main loop in ixgbe_clean_rx_irq to allow for do/while/continue Jeff Kirsher
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

This change is meant to help cleanup the usage of temporary variables
within the Rx hot-path by removing unnecessary variables and reducing
the scope of variables that do not need to exist outside the main loop.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 36b005e..f864da9 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -517,26 +517,28 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 				struct ixgbevf_ring *rx_ring,
 				int budget)
 {
-	union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
-	struct ixgbevf_rx_buffer *rx_buffer_info, *next_buffer;
-	struct sk_buff *skb;
+	union ixgbe_adv_rx_desc *rx_desc;
 	unsigned int i;
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = ixgbevf_desc_unused(rx_ring);
 
 	i = rx_ring->next_to_clean;
 	rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
-	rx_buffer_info = &rx_ring->rx_buffer_info[i];
 
 	while (ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_DD)) {
+		union ixgbe_adv_rx_desc *next_rxd;
+		struct ixgbevf_rx_buffer *rx_buffer_info;
+		struct sk_buff *skb;
+
 		if (!budget)
 			break;
 		budget--;
 
 		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
+		rx_buffer_info = &rx_ring->rx_buffer_info[i];
 		skb = rx_buffer_info->skb;
-		prefetch(skb->data - NET_IP_ALIGN);
+		prefetch(skb->data);
 		rx_buffer_info->skb = NULL;
 
 		dma_unmap_single(rx_ring->dev, rx_buffer_info->dma,
@@ -545,18 +547,17 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		rx_buffer_info->dma = 0;
 		skb_put(skb, le16_to_cpu(rx_desc->wb.upper.length));
 
+		cleaned_count++;
+
 		i++;
 		if (i == rx_ring->count)
 			i = 0;
 
 		next_rxd = IXGBEVF_RX_DESC(rx_ring, i);
 		prefetch(next_rxd);
-		cleaned_count++;
-
-		next_buffer = &rx_ring->rx_buffer_info[i];
 
 		if (!(ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_EOP))) {
-			skb->next = next_buffer->skb;
+			skb->next = rx_ring->rx_buffer_info[i].skb;
 			IXGBE_CB(skb->next)->prev = skb;
 			rx_ring->rx_stats.non_eop_descs++;
 			goto next_desc;
@@ -609,6 +610,7 @@ next_desc:
 		/* use prefetched values */
 		rx_desc = next_rxd;
 		rx_buffer_info = &rx_ring->rx_buffer_info[i];
+		rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
 	}
 
 	rx_ring->next_to_clean = i;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 05/15] ixgbevf: reorder main loop in ixgbe_clean_rx_irq to allow for do/while/continue
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (3 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 04/15] ixgbevf: Cleanup variable usage, improve stack performance Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 06/15] ixgbevf: Update Rx next to clean in real time Jeff Kirsher
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

This change allows us to go from a loop based on the descriptor to one
primarily based on the budget. The advantage to this is that we can avoid
carrying too many values from one iteration to the next.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 64 ++++++++++++-----------
 1 file changed, 34 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index f864da9..2206992 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -517,35 +517,48 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 				struct ixgbevf_ring *rx_ring,
 				int budget)
 {
-	union ixgbe_adv_rx_desc *rx_desc;
 	unsigned int i;
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = ixgbevf_desc_unused(rx_ring);
 
 	i = rx_ring->next_to_clean;
-	rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
 
-	while (ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_DD)) {
-		union ixgbe_adv_rx_desc *next_rxd;
-		struct ixgbevf_rx_buffer *rx_buffer_info;
+	do {
+		union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
+		struct ixgbevf_rx_buffer *rx_buffer;
 		struct sk_buff *skb;
 
-		if (!budget)
+		/* return some buffers to hardware, one at a time is too slow */
+		if (cleaned_count >= IXGBEVF_RX_BUFFER_WRITE) {
+			ixgbevf_alloc_rx_buffers(rx_ring, cleaned_count);
+			cleaned_count = 0;
+		}
+
+		rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
+		rx_buffer = &rx_ring->rx_buffer_info[i];
+
+		if (!ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_DD))
 			break;
-		budget--;
 
-		rmb(); /* read descriptor and rx_buffer_info after status DD */
+		/* This memory barrier is needed to keep us from reading
+		 * any other fields out of the rx_desc until we know the
+		 * RXD_STAT_DD bit is set
+		 */
+		rmb();
 
-		rx_buffer_info = &rx_ring->rx_buffer_info[i];
-		skb = rx_buffer_info->skb;
+		skb = rx_buffer->skb;
 		prefetch(skb->data);
-		rx_buffer_info->skb = NULL;
 
-		dma_unmap_single(rx_ring->dev, rx_buffer_info->dma,
+		/* pull the header of the skb in */
+		__skb_put(skb, le16_to_cpu(rx_desc->wb.upper.length));
+
+		dma_unmap_single(rx_ring->dev, rx_buffer->dma,
 				 rx_ring->rx_buf_len,
 				 DMA_FROM_DEVICE);
-		rx_buffer_info->dma = 0;
-		skb_put(skb, le16_to_cpu(rx_desc->wb.upper.length));
+
+		/* clear skb reference in buffer info structure */
+		rx_buffer->skb = NULL;
+		rx_buffer->dma = 0;
 
 		cleaned_count++;
 
@@ -560,7 +573,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 			skb->next = rx_ring->rx_buffer_info[i].skb;
 			IXGBE_CB(skb->next)->prev = skb;
 			rx_ring->rx_stats.non_eop_descs++;
-			goto next_desc;
+			continue;
 		}
 
 		/* we should not be chaining buffers, if we did drop the skb */
@@ -570,14 +583,14 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 				skb = IXGBE_CB(skb)->prev;
 				dev_kfree_skb(this);
 			} while (skb);
-			goto next_desc;
+			continue;
 		}
 
 		/* ERR_MASK will only have valid bits if EOP set */
 		if (unlikely(ixgbevf_test_staterr(rx_desc,
 					    IXGBE_RXDADV_ERR_FRAME_ERR_MASK))) {
 			dev_kfree_skb_irq(skb);
-			goto next_desc;
+			continue;
 		}
 
 		/* probably a little skewed due to removing CRC */
@@ -592,7 +605,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		    ether_addr_equal(rx_ring->netdev->dev_addr,
 				     eth_hdr(skb)->h_source)) {
 			dev_kfree_skb_irq(skb);
-			goto next_desc;
+			continue;
 		}
 
 		/* populate checksum, VLAN, and protocol */
@@ -600,18 +613,9 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 
 		ixgbevf_rx_skb(q_vector, skb);
 
-next_desc:
-		/* return some buffers to hardware, one at a time is too slow */
-		if (cleaned_count >= IXGBEVF_RX_BUFFER_WRITE) {
-			ixgbevf_alloc_rx_buffers(rx_ring, cleaned_count);
-			cleaned_count = 0;
-		}
-
-		/* use prefetched values */
-		rx_desc = next_rxd;
-		rx_buffer_info = &rx_ring->rx_buffer_info[i];
-		rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
-	}
+		/* update budget accounting */
+		budget--;
+	} while (likely(budget));
 
 	rx_ring->next_to_clean = i;
 	u64_stats_update_begin(&rx_ring->syncp);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 06/15] ixgbevf: Update Rx next to clean in real time
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (4 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 05/15] ixgbevf: reorder main loop in ixgbe_clean_rx_irq to allow for do/while/continue Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives Jeff Kirsher
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

Since the next_to_clean value is only accessed by the Rx interrupt handler
we can save on stack space by just storing our updated values back in
next_to_clean instead of using the stack variable i.  This should help to
reduce stack space and we can further collapse the size of the function.

Also removed  non_eop_descs counter as it was never shown in the stats.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  2 -
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 55 +++++++++++++++--------
 2 files changed, 36 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index 90d5751..72a354b 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -79,7 +79,6 @@ struct ixgbevf_tx_queue_stats {
 };
 
 struct ixgbevf_rx_queue_stats {
-	u64 non_eop_descs;
 	u64 alloc_rx_page_failed;
 	u64 alloc_rx_buff_failed;
 	u64 csum_err;
@@ -372,7 +371,6 @@ struct ixgbevf_adapter {
 	struct ixgbevf_ring *rx_ring[MAX_TX_QUEUES]; /* One per active queue */
 	u64 hw_csum_rx_error;
 	u64 hw_rx_no_dma_resources;
-	u64 non_eop_descs;
 	int num_msix_vectors;
 	u32 alloc_rx_page_failed;
 	u32 alloc_rx_buff_failed;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 2206992..20bebd2 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -410,6 +410,35 @@ static void ixgbevf_process_skb_fields(struct ixgbevf_ring *rx_ring,
 	skb->protocol = eth_type_trans(skb, rx_ring->netdev);
 }
 
+/**
+ * ixgbevf_is_non_eop - process handling of non-EOP buffers
+ * @rx_ring: Rx ring being processed
+ * @rx_desc: Rx descriptor for current buffer
+ * @skb: current socket buffer containing buffer in progress
+ *
+ * This function updates next to clean.  If the buffer is an EOP buffer
+ * this function exits returning false, otherwise it will place the
+ * sk_buff in the next buffer to be chained and return true indicating
+ * that this is in fact a non-EOP buffer.
+ **/
+static bool ixgbevf_is_non_eop(struct ixgbevf_ring *rx_ring,
+			       union ixgbe_adv_rx_desc *rx_desc,
+			       struct sk_buff *skb)
+{
+	u32 ntc = rx_ring->next_to_clean + 1;
+
+	/* fetch, update, and store next to clean */
+	ntc = (ntc < rx_ring->count) ? ntc : 0;
+	rx_ring->next_to_clean = ntc;
+
+	prefetch(IXGBEVF_RX_DESC(rx_ring, ntc));
+
+	if (likely(ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_EOP)))
+		return false;
+
+	return true;
+}
+
 static bool ixgbevf_alloc_mapped_skb(struct ixgbevf_ring *rx_ring,
 				     struct ixgbevf_rx_buffer *bi)
 {
@@ -517,16 +546,14 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 				struct ixgbevf_ring *rx_ring,
 				int budget)
 {
-	unsigned int i;
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = ixgbevf_desc_unused(rx_ring);
 
-	i = rx_ring->next_to_clean;
-
 	do {
-		union ixgbe_adv_rx_desc *rx_desc, *next_rxd;
+		union ixgbe_adv_rx_desc *rx_desc;
 		struct ixgbevf_rx_buffer *rx_buffer;
 		struct sk_buff *skb;
+		u16 ntc;
 
 		/* return some buffers to hardware, one at a time is too slow */
 		if (cleaned_count >= IXGBEVF_RX_BUFFER_WRITE) {
@@ -534,8 +561,9 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 			cleaned_count = 0;
 		}
 
-		rx_desc = IXGBEVF_RX_DESC(rx_ring, i);
-		rx_buffer = &rx_ring->rx_buffer_info[i];
+		ntc = rx_ring->next_to_clean;
+		rx_desc = IXGBEVF_RX_DESC(rx_ring, ntc);
+		rx_buffer = &rx_ring->rx_buffer_info[ntc];
 
 		if (!ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_DD))
 			break;
@@ -562,19 +590,9 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 
 		cleaned_count++;
 
-		i++;
-		if (i == rx_ring->count)
-			i = 0;
-
-		next_rxd = IXGBEVF_RX_DESC(rx_ring, i);
-		prefetch(next_rxd);
-
-		if (!(ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_EOP))) {
-			skb->next = rx_ring->rx_buffer_info[i].skb;
-			IXGBE_CB(skb->next)->prev = skb;
-			rx_ring->rx_stats.non_eop_descs++;
+		/* place incomplete frames back on ring for completion */
+		if (ixgbevf_is_non_eop(rx_ring, rx_desc, skb))
 			continue;
-		}
 
 		/* we should not be chaining buffers, if we did drop the skb */
 		if (IXGBE_CB(skb)->prev) {
@@ -617,7 +635,6 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		budget--;
 	} while (likely(budget));
 
-	rx_ring->next_to_clean = i;
 	u64_stats_update_begin(&rx_ring->syncp);
 	rx_ring->stats.packets += total_rx_packets;
 	rx_ring->stats.bytes += total_rx_bytes;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (5 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 06/15] ixgbevf: Update Rx next to clean in real time Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19 18:24   ` Alexander Duyck
  2014-11-19  4:10 ` [net-next 08/15] ixgbevf: compare total_rx_packets and budget in ixgbevf_clean_rx_irq Jeff Kirsher
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

This patch changes the basic receive path for ixgbevf so that instead of
receiving the data into an skb it is received into a double buffered page.
The main change is that the receives will be done in pages only and then
pull the header out of the page and copy it into the sk_buff data.

This has the advantages of reduced cache misses and improved performance on
IOMMU enabled systems.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  24 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 452 +++++++++++++++-------
 2 files changed, 331 insertions(+), 145 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index 72a354b..2362001 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -58,8 +58,9 @@ struct ixgbevf_tx_buffer {
 };
 
 struct ixgbevf_rx_buffer {
-	struct sk_buff *skb;
 	dma_addr_t dma;
+	struct page *page;
+	unsigned int page_offset;
 };
 
 struct ixgbevf_stats {
@@ -91,9 +92,10 @@ struct ixgbevf_ring {
 	void *desc;			/* descriptor ring memory */
 	dma_addr_t dma;			/* phys. address of descriptor ring */
 	unsigned int size;		/* length in bytes */
-	unsigned int count;		/* amount of descriptors */
-	unsigned int next_to_use;
-	unsigned int next_to_clean;
+	u16 count;			/* amount of descriptors */
+	u16 next_to_use;
+	u16 next_to_clean;
+	u16 next_to_alloc;
 
 	union {
 		struct ixgbevf_tx_buffer *tx_buffer_info;
@@ -109,12 +111,11 @@ struct ixgbevf_ring {
 
 	u64 hw_csum_rx_error;
 	u8 __iomem *tail;
+	struct sk_buff *skb;
 
 	u16 reg_idx; /* holds the special value that gets the hardware register
 		      * offset associated with this ring, which is different
 		      * for DCB and RSS modes */
-
-	u16 rx_buf_len;
 	int queue_index; /* needed for multiqueue queue management */
 };
 
@@ -133,12 +134,10 @@ struct ixgbevf_ring {
 
 /* Supported Rx Buffer Sizes */
 #define IXGBEVF_RXBUFFER_256   256    /* Used for packet split */
-#define IXGBEVF_RXBUFFER_2K    2048
-#define IXGBEVF_RXBUFFER_4K    4096
-#define IXGBEVF_RXBUFFER_8K    8192
-#define IXGBEVF_RXBUFFER_10K   10240
+#define IXGBEVF_RXBUFFER_2048	2048
 
 #define IXGBEVF_RX_HDR_SIZE IXGBEVF_RXBUFFER_256
+#define IXGBEVF_RX_BUFSZ	IXGBEVF_RXBUFFER_2048
 
 #define MAXIMUM_ETHERNET_VLAN_SIZE (VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
 
@@ -430,11 +429,6 @@ enum ixbgevf_state_t {
 	__IXGBEVF_WORK_INIT,
 };
 
-struct ixgbevf_cb {
-	struct sk_buff *prev;
-};
-#define IXGBE_CB(skb) ((struct ixgbevf_cb *)(skb)->cb)
-
 enum ixgbevf_boards {
 	board_82599_vf,
 	board_X540_vf,
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 20bebd2..2ca7c96 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -422,8 +422,7 @@ static void ixgbevf_process_skb_fields(struct ixgbevf_ring *rx_ring,
  * that this is in fact a non-EOP buffer.
  **/
 static bool ixgbevf_is_non_eop(struct ixgbevf_ring *rx_ring,
-			       union ixgbe_adv_rx_desc *rx_desc,
-			       struct sk_buff *skb)
+			       union ixgbe_adv_rx_desc *rx_desc)
 {
 	u32 ntc = rx_ring->next_to_clean + 1;
 
@@ -439,37 +438,40 @@ static bool ixgbevf_is_non_eop(struct ixgbevf_ring *rx_ring,
 	return true;
 }
 
-static bool ixgbevf_alloc_mapped_skb(struct ixgbevf_ring *rx_ring,
-				     struct ixgbevf_rx_buffer *bi)
+static bool ixgbevf_alloc_mapped_page(struct ixgbevf_ring *rx_ring,
+				      struct ixgbevf_rx_buffer *bi)
 {
-	struct sk_buff *skb = bi->skb;
+	struct page *page = bi->page;
 	dma_addr_t dma = bi->dma;
 
-	if (unlikely(skb))
+	/* since we are recycling buffers we should seldom need to alloc */
+	if (likely(page))
 		return true;
 
-	skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-					rx_ring->rx_buf_len);
-	if (unlikely(!skb)) {
-		rx_ring->rx_stats.alloc_rx_buff_failed++;
+	/* alloc new page for storage */
+	page = dev_alloc_page();
+	if (unlikely(!page)) {
+		rx_ring->rx_stats.alloc_rx_page_failed++;
 		return false;
 	}
 
-	dma = dma_map_single(rx_ring->dev, skb->data,
-			     rx_ring->rx_buf_len, DMA_FROM_DEVICE);
+	/* map page for use */
+	dma = dma_map_page(rx_ring->dev, page, 0,
+			   PAGE_SIZE, DMA_FROM_DEVICE);
 
 	/* if mapping failed free memory back to system since
 	 * there isn't much point in holding memory we can't use
 	 */
 	if (dma_mapping_error(rx_ring->dev, dma)) {
-		dev_kfree_skb_any(skb);
+		__free_page(page);
 
 		rx_ring->rx_stats.alloc_rx_buff_failed++;
 		return false;
 	}
 
-	bi->skb = skb;
 	bi->dma = dma;
+	bi->page = page;
+	bi->page_offset = 0;
 
 	return true;
 }
@@ -495,13 +497,13 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 	i -= rx_ring->count;
 
 	do {
-		if (!ixgbevf_alloc_mapped_skb(rx_ring, bi))
+		if (!ixgbevf_alloc_mapped_page(rx_ring, bi))
 			break;
 
 		/* Refresh the desc even if pkt_addr didn't change
 		 * because each write-back erases this info.
 		 */
-		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma);
+		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset);
 
 		rx_desc++;
 		bi++;
@@ -524,6 +526,9 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 		/* record the next descriptor to use */
 		rx_ring->next_to_use = i;
 
+		/* update next to alloc since we have filled the ring */
+		rx_ring->next_to_alloc = i;
+
 		/* Force memory writes to complete before letting h/w
 		 * know there are new descriptors to fetch.  (Only
 		 * applicable for weak-ordered memory model archs,
@@ -534,6 +539,257 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
 	}
 }
 
+/* ixgbevf_pull_tail - ixgbevf specific version of skb_pull_tail
+ * @rx_ring: Rx descriptor ring packet is being transacted on
+ * @skb: pointer to current skb being adjusted
+ *
+ * This function is an ixgbevf specific version of __pskb_pull_tail.  The
+ * main difference between this version and the original function is that
+ * this function can make several assumptions about the state of things
+ * that allow for significant optimizations versus the standard function.
+ * As a result we can do things like drop a frag and maintain an accurate
+ * truesize for the skb.
+ */
+static void ixgbevf_pull_tail(struct ixgbevf_ring *rx_ring,
+			      struct sk_buff *skb)
+{
+	struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[0];
+	unsigned char *va;
+	unsigned int pull_len;
+
+	/* it is valid to use page_address instead of kmap since we are
+	 * working with pages allocated out of the lomem pool per
+	 * alloc_page(GFP_ATOMIC)
+	 */
+	va = skb_frag_address(frag);
+
+	/* we need the header to contain the greater of either ETH_HLEN or
+	 * 60 bytes if the skb->len is less than 60 for skb_pad.
+	 */
+	pull_len = eth_get_headlen(va, IXGBEVF_RX_HDR_SIZE);
+
+	/* align pull length to size of long to optimize memcpy performance */
+	skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
+
+	/* update all of the pointers */
+	skb_frag_size_sub(frag, pull_len);
+	frag->page_offset += pull_len;
+	skb->data_len -= pull_len;
+	skb->tail += pull_len;
+}
+
+/* ixgbevf_cleanup_headers - Correct corrupted or empty headers
+ * @rx_ring: Rx descriptor ring packet is being transacted on
+ * @rx_desc: pointer to the EOP Rx descriptor
+ * @skb: pointer to current skb being fixed
+ *
+ * Check for corrupted packet headers caused by senders on the local L2
+ * embedded NIC switch not setting up their Tx Descriptors right.  These
+ * should be very rare.
+ *
+ * Also address the case where we are pulling data in on pages only
+ * and as such no data is present in the skb header.
+ *
+ * In addition if skb is not at least 60 bytes we need to pad it so that
+ * it is large enough to qualify as a valid Ethernet frame.
+ *
+ * Returns true if an error was encountered and skb was freed.
+ */
+static bool ixgbevf_cleanup_headers(struct ixgbevf_ring *rx_ring,
+				    union ixgbe_adv_rx_desc *rx_desc,
+				    struct sk_buff *skb)
+{
+	/* verify that the packet does not have any known errors */
+	if (unlikely(ixgbevf_test_staterr(rx_desc,
+					  IXGBE_RXDADV_ERR_FRAME_ERR_MASK))) {
+		struct net_device *netdev = rx_ring->netdev;
+
+		if (!(netdev->features & NETIF_F_RXALL)) {
+			dev_kfree_skb_any(skb);
+			return true;
+		}
+	}
+
+	/* place header in linear portion of buffer */
+	if (skb_is_nonlinear(skb))
+		ixgbevf_pull_tail(rx_ring, skb);
+
+	/* if skb_pad returns an error the skb was freed */
+	if (unlikely(skb->len < 60)) {
+		int pad_len = 60 - skb->len;
+
+		if (skb_pad(skb, pad_len))
+			return true;
+		__skb_put(skb, pad_len);
+	}
+
+	return false;
+}
+
+/* ixgbevf_reuse_rx_page - page flip buffer and store it back on the ring
+ * @rx_ring: Rx descriptor ring to store buffers on
+ * @old_buff: donor buffer to have page reused
+ *
+ * Synchronizes page for reuse by the adapter
+ */
+static void ixgbevf_reuse_rx_page(struct ixgbevf_ring *rx_ring,
+				  struct ixgbevf_rx_buffer *old_buff)
+{
+	struct ixgbevf_rx_buffer *new_buff;
+	u16 nta = rx_ring->next_to_alloc;
+
+	new_buff = &rx_ring->rx_buffer_info[nta];
+
+	/* update, and store next to alloc */
+	nta++;
+	rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
+
+	/* transfer page from old buffer to new buffer */
+	new_buff->page = old_buff->page;
+	new_buff->dma = old_buff->dma;
+	new_buff->page_offset = old_buff->page_offset;
+
+	/* sync the buffer for use by the device */
+	dma_sync_single_range_for_device(rx_ring->dev, new_buff->dma,
+					 new_buff->page_offset,
+					 IXGBEVF_RX_BUFSZ,
+					 DMA_FROM_DEVICE);
+}
+
+/* ixgbevf_add_rx_frag - Add contents of Rx buffer to sk_buff
+ * @rx_ring: Rx descriptor ring to transact packets on
+ * @rx_buffer: buffer containing page to add
+ * @rx_desc: descriptor containing length of buffer written by hardware
+ * @skb: sk_buff to place the data into
+ *
+ * This function will add the data contained in rx_buffer->page to the skb.
+ * This is done either through a direct copy if the data in the buffer is
+ * less than the skb header size, otherwise it will just attach the page as
+ * a frag to the skb.
+ *
+ * The function will then update the page offset if necessary and return
+ * true if the buffer can be reused by the adapter.
+ */
+static bool ixgbevf_add_rx_frag(struct ixgbevf_ring *rx_ring,
+				struct ixgbevf_rx_buffer *rx_buffer,
+				union ixgbe_adv_rx_desc *rx_desc,
+				struct sk_buff *skb)
+{
+	struct page *page = rx_buffer->page;
+	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
+#if (PAGE_SIZE < 8192)
+	unsigned int truesize = IXGBEVF_RX_BUFSZ;
+#else
+	unsigned int truesize = ALIGN(size, L1_CACHE_BYTES);
+#endif
+
+	if ((size <= IXGBEVF_RX_HDR_SIZE) && !skb_is_nonlinear(skb)) {
+		unsigned char *va = page_address(page) + rx_buffer->page_offset;
+
+		memcpy(__skb_put(skb, size), va, ALIGN(size, sizeof(long)));
+
+		/* we can reuse buffer as-is, just make sure it is local */
+		if (likely(page_to_nid(page) == numa_node_id()))
+			return true;
+
+		/* this page cannot be reused so discard it */
+		put_page(page);
+		return false;
+	}
+
+	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
+			rx_buffer->page_offset, size, truesize);
+
+	/* avoid re-using remote pages */
+	if (unlikely(page_to_nid(page) != numa_node_id()))
+		return false;
+
+#if (PAGE_SIZE < 8192)
+	/* if we are only owner of page we can reuse it */
+	if (unlikely(page_count(page) != 1))
+		return false;
+
+	/* flip page offset to other buffer */
+	rx_buffer->page_offset ^= IXGBEVF_RX_BUFSZ;
+
+	/* since we are the only owner of the page and we need to
+	 * increment it.
+	 */
+	atomic_inc(&page->_count);
+#else
+	/* move offset up to the next cache line */
+	rx_buffer->page_offset += truesize;
+
+	if (rx_buffer->page_offset > (PAGE_SIZE - IXGBEVF_RX_BUFSZ))
+		return false;
+
+	/* bump ref count on page before it is given to the stack */
+	get_page(page);
+#endif
+
+	return true;
+}
+
+static struct sk_buff *ixgbevf_fetch_rx_buffer(struct ixgbevf_ring *rx_ring,
+					       union ixgbe_adv_rx_desc *rx_desc,
+					       struct sk_buff *skb)
+{
+	struct ixgbevf_rx_buffer *rx_buffer;
+	struct page *page;
+
+	rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+	page = rx_buffer->page;
+	prefetchw(page);
+
+	if (likely(!skb)) {
+		void *page_addr = page_address(page) +
+				  rx_buffer->page_offset;
+
+		/* prefetch first cache line of first page */
+		prefetch(page_addr);
+#if L1_CACHE_BYTES < 128
+		prefetch(page_addr + L1_CACHE_BYTES);
+#endif
+
+		/* allocate a skb to store the frags */
+		skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
+						IXGBEVF_RX_HDR_SIZE);
+		if (unlikely(!skb)) {
+			rx_ring->rx_stats.alloc_rx_buff_failed++;
+			return NULL;
+		}
+
+		/* we will be copying header into skb->data in
+		 * pskb_may_pull so it is in our interest to prefetch
+		 * it now to avoid a possible cache miss
+		 */
+		prefetchw(skb->data);
+	}
+
+	/* we are reusing so sync this buffer for CPU use */
+	dma_sync_single_range_for_cpu(rx_ring->dev,
+				      rx_buffer->dma,
+				      rx_buffer->page_offset,
+				      IXGBEVF_RX_BUFSZ,
+				      DMA_FROM_DEVICE);
+
+	/* pull page into skb */
+	if (ixgbevf_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
+		/* hand second half of page back to the ring */
+		ixgbevf_reuse_rx_page(rx_ring, rx_buffer);
+	} else {
+		/* we are not reusing the buffer so unmap it */
+		dma_unmap_page(rx_ring->dev, rx_buffer->dma,
+			       PAGE_SIZE, DMA_FROM_DEVICE);
+	}
+
+	/* clear contents of buffer_info */
+	rx_buffer->dma = 0;
+	rx_buffer->page = NULL;
+
+	return skb;
+}
+
 static inline void ixgbevf_irq_enable_queues(struct ixgbevf_adapter *adapter,
 					     u32 qmask)
 {
@@ -548,12 +804,10 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 {
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = ixgbevf_desc_unused(rx_ring);
+	struct sk_buff *skb = rx_ring->skb;
 
 	do {
 		union ixgbe_adv_rx_desc *rx_desc;
-		struct ixgbevf_rx_buffer *rx_buffer;
-		struct sk_buff *skb;
-		u16 ntc;
 
 		/* return some buffers to hardware, one at a time is too slow */
 		if (cleaned_count >= IXGBEVF_RX_BUFFER_WRITE) {
@@ -561,9 +815,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 			cleaned_count = 0;
 		}
 
-		ntc = rx_ring->next_to_clean;
-		rx_desc = IXGBEVF_RX_DESC(rx_ring, ntc);
-		rx_buffer = &rx_ring->rx_buffer_info[ntc];
+		rx_desc = IXGBEVF_RX_DESC(rx_ring, rx_ring->next_to_clean);
 
 		if (!ixgbevf_test_staterr(rx_desc, IXGBE_RXD_STAT_DD))
 			break;
@@ -574,40 +826,22 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		 */
 		rmb();
 
-		skb = rx_buffer->skb;
-		prefetch(skb->data);
-
-		/* pull the header of the skb in */
-		__skb_put(skb, le16_to_cpu(rx_desc->wb.upper.length));
-
-		dma_unmap_single(rx_ring->dev, rx_buffer->dma,
-				 rx_ring->rx_buf_len,
-				 DMA_FROM_DEVICE);
+		/* retrieve a buffer from the ring */
+		skb = ixgbevf_fetch_rx_buffer(rx_ring, rx_desc, skb);
 
-		/* clear skb reference in buffer info structure */
-		rx_buffer->skb = NULL;
-		rx_buffer->dma = 0;
+		/* exit if we failed to retrieve a buffer */
+		if (!skb)
+			break;
 
 		cleaned_count++;
 
-		/* place incomplete frames back on ring for completion */
-		if (ixgbevf_is_non_eop(rx_ring, rx_desc, skb))
+		/* fetch next buffer in frame if non-eop */
+		if (ixgbevf_is_non_eop(rx_ring, rx_desc))
 			continue;
 
-		/* we should not be chaining buffers, if we did drop the skb */
-		if (IXGBE_CB(skb)->prev) {
-			do {
-				struct sk_buff *this = skb;
-				skb = IXGBE_CB(skb)->prev;
-				dev_kfree_skb(this);
-			} while (skb);
-			continue;
-		}
-
-		/* ERR_MASK will only have valid bits if EOP set */
-		if (unlikely(ixgbevf_test_staterr(rx_desc,
-					    IXGBE_RXDADV_ERR_FRAME_ERR_MASK))) {
-			dev_kfree_skb_irq(skb);
+		/* verify the packet layout is correct */
+		if (ixgbevf_cleanup_headers(rx_ring, rx_desc, skb)) {
+			skb = NULL;
 			continue;
 		}
 
@@ -631,10 +865,16 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 
 		ixgbevf_rx_skb(q_vector, skb);
 
+		/* reset skb pointer */
+		skb = NULL;
+
 		/* update budget accounting */
 		budget--;
 	} while (likely(budget));
 
+	/* place incomplete frames back on ring for completion */
+	rx_ring->skb = skb;
+
 	u64_stats_update_begin(&rx_ring->syncp);
 	rx_ring->stats.packets += total_rx_packets;
 	rx_ring->stats.bytes += total_rx_bytes;
@@ -642,9 +882,6 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 	q_vector->rx.total_packets += total_rx_packets;
 	q_vector->rx.total_bytes += total_rx_bytes;
 
-	if (cleaned_count)
-		ixgbevf_alloc_rx_buffers(rx_ring, cleaned_count);
-
 	return total_rx_packets;
 }
 
@@ -1275,19 +1512,15 @@ static void ixgbevf_configure_tx(struct ixgbevf_adapter *adapter)
 
 static void ixgbevf_configure_srrctl(struct ixgbevf_adapter *adapter, int index)
 {
-	struct ixgbevf_ring *rx_ring;
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 srrctl;
 
-	rx_ring = adapter->rx_ring[index];
-
 	srrctl = IXGBE_SRRCTL_DROP_EN;
 
+	srrctl |= IXGBEVF_RX_HDR_SIZE << IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT;
+	srrctl |= IXGBEVF_RX_BUFSZ >> IXGBE_SRRCTL_BSIZEPKT_SHIFT;
 	srrctl |= IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
 
-	srrctl |= ALIGN(rx_ring->rx_buf_len, 1024) >>
-		  IXGBE_SRRCTL_BSIZEPKT_SHIFT;
-
 	IXGBE_WRITE_REG(hw, IXGBE_VFSRRCTL(index), srrctl);
 }
 
@@ -1306,40 +1539,6 @@ static void ixgbevf_setup_psrtype(struct ixgbevf_adapter *adapter)
 	IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
 }
 
-static void ixgbevf_set_rx_buffer_len(struct ixgbevf_adapter *adapter)
-{
-	struct ixgbe_hw *hw = &adapter->hw;
-	struct net_device *netdev = adapter->netdev;
-	int max_frame = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
-	int i;
-	u16 rx_buf_len;
-
-	/* notify the PF of our intent to use this size of frame */
-	ixgbevf_rlpml_set_vf(hw, max_frame);
-
-	/* PF will allow an extra 4 bytes past for vlan tagged frames */
-	max_frame += VLAN_HLEN;
-
-	/*
-	 * Allocate buffer sizes that fit well into 32K and
-	 * take into account max frame size of 9.5K
-	 */
-	if ((hw->mac.type == ixgbe_mac_X540_vf) &&
-	    (max_frame <= MAXIMUM_ETHERNET_VLAN_SIZE))
-		rx_buf_len = MAXIMUM_ETHERNET_VLAN_SIZE;
-	else if (max_frame <= IXGBEVF_RXBUFFER_2K)
-		rx_buf_len = IXGBEVF_RXBUFFER_2K;
-	else if (max_frame <= IXGBEVF_RXBUFFER_4K)
-		rx_buf_len = IXGBEVF_RXBUFFER_4K;
-	else if (max_frame <= IXGBEVF_RXBUFFER_8K)
-		rx_buf_len = IXGBEVF_RXBUFFER_8K;
-	else
-		rx_buf_len = IXGBEVF_RXBUFFER_10K;
-
-	for (i = 0; i < adapter->num_rx_queues; i++)
-		adapter->rx_ring[i]->rx_buf_len = rx_buf_len;
-}
-
 #define IXGBEVF_MAX_RX_DESC_POLL 10
 static void ixgbevf_disable_rx_queue(struct ixgbevf_adapter *adapter,
 				     struct ixgbevf_ring *ring)
@@ -1417,12 +1616,13 @@ static void ixgbevf_configure_rx_ring(struct ixgbevf_adapter *adapter,
 	/* reset ntu and ntc to place SW in sync with hardwdare */
 	ring->next_to_clean = 0;
 	ring->next_to_use = 0;
+	ring->next_to_alloc = 0;
 
 	ixgbevf_configure_srrctl(adapter, reg_idx);
 
-	/* prevent DMA from exceeding buffer space available */
-	rxdctl &= ~IXGBE_RXDCTL_RLPMLMASK;
-	rxdctl |= ring->rx_buf_len | IXGBE_RXDCTL_RLPML_EN;
+	/* allow any size packet since we can handle overflow */
+	rxdctl &= ~IXGBE_RXDCTL_RLPML_EN;
+
 	rxdctl |= IXGBE_RXDCTL_ENABLE | IXGBE_RXDCTL_VME;
 	IXGBE_WRITE_REG(hw, IXGBE_VFRXDCTL(reg_idx), rxdctl);
 
@@ -1439,11 +1639,13 @@ static void ixgbevf_configure_rx_ring(struct ixgbevf_adapter *adapter,
 static void ixgbevf_configure_rx(struct ixgbevf_adapter *adapter)
 {
 	int i;
+	struct ixgbe_hw *hw = &adapter->hw;
+	struct net_device *netdev = adapter->netdev;
 
 	ixgbevf_setup_psrtype(adapter);
 
-	/* set_rx_buffer_len must be called before ring initialization */
-	ixgbevf_set_rx_buffer_len(adapter);
+	/* notify the PF of our intent to use this size of frame */
+	ixgbevf_rlpml_set_vf(hw, netdev->mtu + ETH_HLEN + ETH_FCS_LEN);
 
 	/* Setup the HW Rx Head and Tail Descriptor Pointers and
 	 * the Base and Length of the Rx Descriptor Ring */
@@ -1748,32 +1950,32 @@ void ixgbevf_up(struct ixgbevf_adapter *adapter)
  **/
 static void ixgbevf_clean_rx_ring(struct ixgbevf_ring *rx_ring)
 {
+	struct device *dev = rx_ring->dev;
 	unsigned long size;
 	unsigned int i;
 
+	/* Free Rx ring sk_buff */
+	if (rx_ring->skb) {
+		dev_kfree_skb(rx_ring->skb);
+		rx_ring->skb = NULL;
+	}
+
+	/* ring already cleared, nothing to do */
 	if (!rx_ring->rx_buffer_info)
 		return;
 
-	/* Free all the Rx ring sk_buffs */
+	/* Free all the Rx ring pages */
 	for (i = 0; i < rx_ring->count; i++) {
-		struct ixgbevf_rx_buffer *rx_buffer_info;
+		struct ixgbevf_rx_buffer *rx_buffer;
 
-		rx_buffer_info = &rx_ring->rx_buffer_info[i];
-		if (rx_buffer_info->dma) {
-			dma_unmap_single(rx_ring->dev, rx_buffer_info->dma,
-					 rx_ring->rx_buf_len,
-					 DMA_FROM_DEVICE);
-			rx_buffer_info->dma = 0;
-		}
-		if (rx_buffer_info->skb) {
-			struct sk_buff *skb = rx_buffer_info->skb;
-			rx_buffer_info->skb = NULL;
-			do {
-				struct sk_buff *this = skb;
-				skb = IXGBE_CB(skb)->prev;
-				dev_kfree_skb(this);
-			} while (skb);
-		}
+		rx_buffer = &rx_ring->rx_buffer_info[i];
+		if (rx_buffer->dma)
+			dma_unmap_page(dev, rx_buffer->dma,
+				       PAGE_SIZE, DMA_FROM_DEVICE);
+		rx_buffer->dma = 0;
+		if (rx_buffer->page)
+			__free_page(rx_buffer->page);
+		rx_buffer->page = NULL;
 	}
 
 	size = sizeof(struct ixgbevf_rx_buffer) * rx_ring->count;
@@ -3320,21 +3522,11 @@ static int ixgbevf_set_mac(struct net_device *netdev, void *p)
 static int ixgbevf_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct ixgbevf_adapter *adapter = netdev_priv(netdev);
+	struct ixgbe_hw *hw = &adapter->hw;
 	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
-	int max_possible_frame = MAXIMUM_ETHERNET_VLAN_SIZE;
-
-	switch (adapter->hw.api_version) {
-	case ixgbe_mbox_api_11:
-		max_possible_frame = IXGBE_MAX_JUMBO_FRAME_SIZE;
-		break;
-	default:
-		if (adapter->hw.mac.type == ixgbe_mac_X540_vf)
-			max_possible_frame = IXGBE_MAX_JUMBO_FRAME_SIZE;
-		break;
-	}
 
 	/* MTU < 68 is an error and causes problems on some kernels */
-	if ((new_mtu < 68) || (max_frame > max_possible_frame))
+	if ((new_mtu < 68) || (max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE))
 		return -EINVAL;
 
 	hw_dbg(&adapter->hw, "changing MTU from %d to %d\n",
@@ -3342,8 +3534,8 @@ static int ixgbevf_change_mtu(struct net_device *netdev, int new_mtu)
 	/* must set new MTU before calling down or up */
 	netdev->mtu = new_mtu;
 
-	if (netif_running(netdev))
-		ixgbevf_reinit_locked(adapter);
+	/* notify the PF of our intent to use this size of frame */
+	ixgbevf_rlpml_set_vf(hw, max_frame);
 
 	return 0;
 }
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 08/15] ixgbevf: compare total_rx_packets and budget in ixgbevf_clean_rx_irq
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (6 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 09/15] ixgbevf: add netpoll support Jeff Kirsher
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

total_rx_packets is the number of packets we had cleaned, and budget is
the total number of packets that we could clean per poll. Instead of
altering both of these values we can save ourselves one write to memory by
just comparing total_rx_packets to the budget and as long as we are less
than budget we continue cleaning.

Also change the do{}while logic to while{} in order to avoid processing
packets when budget is 0.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 2ca7c96..dcc50cd 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -806,7 +806,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 	u16 cleaned_count = ixgbevf_desc_unused(rx_ring);
 	struct sk_buff *skb = rx_ring->skb;
 
-	do {
+	while (likely(total_rx_packets < budget)) {
 		union ixgbe_adv_rx_desc *rx_desc;
 
 		/* return some buffers to hardware, one at a time is too slow */
@@ -847,7 +847,6 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 
 		/* probably a little skewed due to removing CRC */
 		total_rx_bytes += skb->len;
-		total_rx_packets++;
 
 		/* Workaround hardware that can't do proper VEPA multicast
 		 * source pruning.
@@ -869,8 +868,8 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 		skb = NULL;
 
 		/* update budget accounting */
-		budget--;
-	} while (likely(budget));
+		total_rx_packets++;
+	}
 
 	/* place incomplete frames back on ring for completion */
 	rx_ring->skb = skb;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 09/15] ixgbevf: add netpoll support
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (7 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 08/15] ixgbevf: compare total_rx_packets and budget in ixgbevf_clean_rx_irq Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 10/15] i40e: don't overload fields Jeff Kirsher
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem
  Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene,
	Alexander Duyck, Jeff Kirsher

From: Emil Tantilov <emil.s.tantilov@intel.com>

This patch adds ixgbevf_netpoll() a callback for .ndo_poll_controller to
allow for the VF interface to be used with netconsole.

CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  2 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 29 ++++++++++++++++++-----
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index 2362001..1900ae6 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -379,7 +379,7 @@ struct ixgbevf_adapter {
 	 */
 	u32 flags;
 #define IXGBE_FLAG_IN_WATCHDOG_TASK             (u32)(1)
-#define IXGBE_FLAG_IN_NETPOLL                   (u32)(1 << 1)
+
 #define IXGBEVF_FLAG_QUEUE_RESET_REQUESTED	(u32)(1 << 2)
 
 	struct msix_entry *msix_entries;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index dcc50cd..beb2ff1 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -344,10 +344,8 @@ static void ixgbevf_rx_skb(struct ixgbevf_q_vector *q_vector,
 		return;
 	}
 #endif /* CONFIG_NET_RX_BUSY_POLL */
-	if (!(q_vector->adapter->flags & IXGBE_FLAG_IN_NETPOLL))
-		napi_gro_receive(&q_vector->napi, skb);
-	else
-		netif_rx(skb);
+
+	napi_gro_receive(&q_vector->napi, skb);
 }
 
 /* ixgbevf_rx_checksum - indicate in skb if hw indicated a good cksum
@@ -916,12 +914,10 @@ static int ixgbevf_poll(struct napi_struct *napi, int budget)
 	else
 		per_ring_budget = budget;
 
-	adapter->flags |= IXGBE_FLAG_IN_NETPOLL;
 	ixgbevf_for_each_ring(ring, q_vector->rx)
 		clean_complete &= (ixgbevf_clean_rx_irq(q_vector, ring,
 							per_ring_budget)
 				   < per_ring_budget);
-	adapter->flags &= ~IXGBE_FLAG_IN_NETPOLL;
 
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	ixgbevf_qv_unlock_napi(q_vector);
@@ -3539,6 +3535,24 @@ static int ixgbevf_change_mtu(struct net_device *netdev, int new_mtu)
 	return 0;
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/* Polling 'interrupt' - used by things like netconsole to send skbs
+ * without having to re-enable interrupts. It's not called while
+ * the interrupt routine is executing.
+ */
+static void ixgbevf_netpoll(struct net_device *netdev)
+{
+	struct ixgbevf_adapter *adapter = netdev_priv(netdev);
+	int i;
+
+	/* if interface is down do nothing */
+	if (test_bit(__IXGBEVF_DOWN, &adapter->state))
+		return;
+	for (i = 0; i < adapter->num_rx_queues; i++)
+		ixgbevf_msix_clean_rings(0, adapter->q_vector[i]);
+}
+#endif /* CONFIG_NET_POLL_CONTROLLER */
+
 static int ixgbevf_suspend(struct pci_dev *pdev, pm_message_t state)
 {
 	struct net_device *netdev = pci_get_drvdata(pdev);
@@ -3675,6 +3689,9 @@ static const struct net_device_ops ixgbevf_netdev_ops = {
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	.ndo_busy_poll		= ixgbevf_busy_poll_recv,
 #endif
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	.ndo_poll_controller	= ixgbevf_netpoll,
+#endif
 };
 
 static void ixgbevf_assign_netdev_ops(struct net_device *dev)
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 10/15] i40e: don't overload fields
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (8 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 09/15] ixgbevf: add netpoll support Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 11/15] i40evf: update header comments Jeff Kirsher
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem; +Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Mitch Williams <mitch.a.williams@intel.com>

Overloading the msg_size field in the arq_event_info struct is just a
bad idea. It leads to repeated bugs when the structure is used in a
loop, since the input value (buffer size) is overwritten by the output
value (actual message length).

Fix this by splitting the field into two and renaming to indicate the
actual function of each field.

Since the arq_event struct has now changed, we need to change the drivers
to support this. Note that we no longer need to initialize the buffer size
each time we go through a loop as this value is no longer destroyed by
arq processing.

In the process, we also fix a bug in i40evf_verify_api_ver where the
buffer size was not correctly reinitialized each time through the loop.

Change-ID: Ic7f9633cdd6f871f93e698dfb095e29c696f5581
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Ashish Shah <ashish.n.shah@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_adminq.c       |  6 +++---
 drivers/net/ethernet/intel/i40e/i40e_adminq.h       |  3 ++-
 drivers/net/ethernet/intel/i40e/i40e_main.c         |  7 +++----
 drivers/net/ethernet/intel/i40evf/i40e_adminq.c     |  6 +++---
 drivers/net/ethernet/intel/i40evf/i40e_adminq.h     |  3 ++-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c     |  7 +++----
 drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c | 11 +++++------
 7 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.c b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
index f7f6206..5bb4914 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
@@ -980,10 +980,10 @@ i40e_status i40e_clean_arq_element(struct i40e_hw *hw,
 
 	e->desc = *desc;
 	datalen = le16_to_cpu(desc->datalen);
-	e->msg_size = min(datalen, e->msg_size);
-	if (e->msg_buf != NULL && (e->msg_size != 0))
+	e->msg_len = min(datalen, e->buf_len);
+	if (e->msg_buf != NULL && (e->msg_len != 0))
 		memcpy(e->msg_buf, hw->aq.arq.r.arq_bi[desc_idx].va,
-		       e->msg_size);
+		       e->msg_len);
 
 	i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE, "AQRX: desc and buffer:\n");
 	i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc, e->msg_buf,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.h b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
index df0bd09..003a227 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
@@ -76,7 +76,8 @@ struct i40e_asq_cmd_details {
 /* ARQ event information */
 struct i40e_arq_event_info {
 	struct i40e_aq_desc desc;
-	u16 msg_size;
+	u16 msg_len;
+	u16 buf_len;
 	u8 *msg_buf;
 };
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index c998d82..3ebab03 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5750,13 +5750,12 @@ static void i40e_clean_adminq_subtask(struct i40e_pf *pf)
 	if (oldval != val)
 		wr32(&pf->hw, pf->hw.aq.asq.len, val);
 
-	event.msg_size = I40E_MAX_AQ_BUF_SIZE;
-	event.msg_buf = kzalloc(event.msg_size, GFP_KERNEL);
+	event.buf_len = I40E_MAX_AQ_BUF_SIZE;
+	event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
 	if (!event.msg_buf)
 		return;
 
 	do {
-		event.msg_size = I40E_MAX_AQ_BUF_SIZE; /* reinit each time */
 		ret = i40e_clean_arq_element(hw, &event, &pending);
 		if (ret == I40E_ERR_ADMIN_QUEUE_NO_WORK)
 			break;
@@ -5777,7 +5776,7 @@ static void i40e_clean_adminq_subtask(struct i40e_pf *pf)
 					le32_to_cpu(event.desc.cookie_high),
 					le32_to_cpu(event.desc.cookie_low),
 					event.msg_buf,
-					event.msg_size);
+					event.msg_len);
 			break;
 		case i40e_aqc_opc_lldp_update_mib:
 			dev_dbg(&pf->pdev->dev, "ARQ: Update LLDP MIB event received\n");
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq.c b/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
index 500ca21..d7e446f 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
@@ -929,10 +929,10 @@ i40e_status i40evf_clean_arq_element(struct i40e_hw *hw,
 
 	e->desc = *desc;
 	datalen = le16_to_cpu(desc->datalen);
-	e->msg_size = min(datalen, e->msg_size);
-	if (e->msg_buf != NULL && (e->msg_size != 0))
+	e->msg_len = min(datalen, e->buf_len);
+	if (e->msg_buf != NULL && (e->msg_len != 0))
 		memcpy(e->msg_buf, hw->aq.arq.r.arq_bi[desc_idx].va,
-		       e->msg_size);
+		       e->msg_len);
 
 	if (i40e_is_nvm_update_op(&e->desc))
 		hw->aq.nvm_busy = false;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq.h b/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
index f40cfac..0d58378 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
@@ -76,7 +76,8 @@ struct i40e_asq_cmd_details {
 /* ARQ event information */
 struct i40e_arq_event_info {
 	struct i40e_aq_desc desc;
-	u16 msg_size;
+	u16 msg_len;
+	u16 buf_len;
 	u8 *msg_buf;
 };
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 4892278..d378e13 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1632,8 +1632,8 @@ static void i40evf_adminq_task(struct work_struct *work)
 	if (adapter->flags & I40EVF_FLAG_PF_COMMS_FAILED)
 		return;
 
-	event.msg_size = I40EVF_MAX_AQ_BUF_SIZE;
-	event.msg_buf = kzalloc(event.msg_size, GFP_KERNEL);
+	event.buf_len = I40EVF_MAX_AQ_BUF_SIZE;
+	event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
 	if (!event.msg_buf)
 		return;
 
@@ -1645,10 +1645,9 @@ static void i40evf_adminq_task(struct work_struct *work)
 
 		i40evf_virtchnl_completion(adapter, v_msg->v_opcode,
 					   v_msg->v_retval, event.msg_buf,
-					   event.msg_size);
+					   event.msg_len);
 		if (pending != 0) {
 			memset(event.msg_buf, 0, I40EVF_MAX_AQ_BUF_SIZE);
-			event.msg_size = I40EVF_MAX_AQ_BUF_SIZE;
 		}
 	} while (pending);
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 49bfdb5..422dd2d 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -92,8 +92,8 @@ int i40evf_verify_api_ver(struct i40evf_adapter *adapter)
 	enum i40e_virtchnl_ops op;
 	i40e_status err;
 
-	event.msg_size = I40EVF_MAX_AQ_BUF_SIZE;
-	event.msg_buf = kzalloc(event.msg_size, GFP_KERNEL);
+	event.buf_len = I40EVF_MAX_AQ_BUF_SIZE;
+	event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
 	if (!event.msg_buf) {
 		err = -ENOMEM;
 		goto out;
@@ -169,15 +169,14 @@ int i40evf_get_vf_config(struct i40evf_adapter *adapter)
 
 	len =  sizeof(struct i40e_virtchnl_vf_resource) +
 		I40E_MAX_VF_VSI * sizeof(struct i40e_virtchnl_vsi_resource);
-	event.msg_size = len;
-	event.msg_buf = kzalloc(event.msg_size, GFP_KERNEL);
+	event.buf_len = len;
+	event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
 	if (!event.msg_buf) {
 		err = -ENOMEM;
 		goto out;
 	}
 
 	while (1) {
-		event.msg_size = len;
 		/* When the AQ is empty, i40evf_clean_arq_element will return
 		 * nonzero and this loop will terminate.
 		 */
@@ -191,7 +190,7 @@ int i40evf_get_vf_config(struct i40evf_adapter *adapter)
 	}
 
 	err = (i40e_status)le32_to_cpu(event.desc.cookie_low);
-	memcpy(adapter->vf_res, event.msg_buf, min(event.msg_size, len));
+	memcpy(adapter->vf_res, event.msg_buf, min(event.msg_len, len));
 
 	i40e_vf_parse_hw_config(hw, adapter->vf_res);
 out_alloc:
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 11/15] i40evf: update header comments
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (9 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 10/15] i40e: don't overload fields Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 12/15] i40evf: make checkpatch happy Jeff Kirsher
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem; +Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Mitch Williams <mitch.a.williams@intel.com>

No code changes. Update comments to match actual function declarations.

Change-ID: Ib830d2f154ee917a104955c0914267fc98f3d2c8
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
index 876411c..b2cd9ea 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
@@ -621,7 +621,7 @@ static u32 i40evf_get_rxfh_indir_size(struct net_device *netdev)
  * i40evf_get_rxfh - get the rx flow hash indirection table
  * @netdev: network interface device structure
  * @indir: indirection table
- * @key: hash key (will be %NULL until get_rxfh_key_size is implemented)
+ * @key: hash key
  *
  * Reads the indirection table directly from the hardware. Always returns 0.
  **/
@@ -646,7 +646,7 @@ static int i40evf_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key)
  * i40evf_set_rxfh - set the rx flow hash indirection table
  * @netdev: network interface device structure
  * @indir: indirection table
- * @key: hash key (will be %NULL until get_rxfh_key_size is implemented)
+ * @key: hash key
  *
  * Returns -EINVAL if the table specifies an inavlid queue id, otherwise
  * returns 0 after programming the table.
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 12/15] i40evf: make checkpatch happy
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (10 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 11/15] i40evf: update header comments Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:59   ` Joe Perches
  2014-11-19  4:10 ` [net-next 13/15] i40evf: make comparisons consistent Jeff Kirsher
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem; +Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Mitch Williams <mitch.a.williams@intel.com>

This patch is the result of running checkpatch on the i40evf driver with
the --strict option. The vast majority of changes are adding/removing
blank lines, aligning function parameters, and correcting over-long
lines.

The only possible functional change is changing the flags member of the
adapter structure to be non-volatile. However, according to the kernel
documentation, this is not necessary and the volatile should be removed.

Change-ID: Ie8c6414800924f529bef831e8845292b970fe2ed
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf.h         |  2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c | 10 +++--
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    | 48 ++++++++++++----------
 .../net/ethernet/intel/i40evf/i40evf_virtchnl.c    | 12 +++---
 4 files changed, 39 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf.h b/drivers/net/ethernet/intel/i40evf/i40evf.h
index 1113f8a..9812247 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf.h
+++ b/drivers/net/ethernet/intel/i40evf/i40evf.h
@@ -244,7 +244,7 @@ struct i40evf_adapter {
 	struct i40e_hw hw; /* defined in i40e_type.h */
 
 	enum i40evf_state_t state;
-	volatile unsigned long crit_section;
+	unsigned long crit_section;
 
 	struct work_struct watchdog_task;
 	bool netdev_registered;
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
index b2cd9ea..69a269b 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
@@ -58,7 +58,7 @@ static const struct i40evf_stats i40evf_gstrings_stats[] = {
 
 #define I40EVF_GLOBAL_STATS_LEN ARRAY_SIZE(i40evf_gstrings_stats)
 #define I40EVF_QUEUE_STATS_LEN(_dev) \
-	(((struct i40evf_adapter *) \
+	(((struct i40evf_adapter *)\
 		netdev_priv(_dev))->num_active_queues \
 		  * 2 * (sizeof(struct i40e_queue_stats) / sizeof(u64)))
 #define I40EVF_STATS_LEN(_dev) \
@@ -175,6 +175,7 @@ static void i40evf_get_strings(struct net_device *netdev, u32 sset, u8 *data)
 static u32 i40evf_get_msglevel(struct net_device *netdev)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
+
 	return adapter->msg_enable;
 }
 
@@ -189,6 +190,7 @@ static u32 i40evf_get_msglevel(struct net_device *netdev)
 static void i40evf_set_msglevel(struct net_device *netdev, u32 data)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
+
 	adapter->msg_enable = data;
 }
 
@@ -219,7 +221,7 @@ static void i40evf_get_drvinfo(struct net_device *netdev,
  * but the number of rings is not reported.
  **/
 static void i40evf_get_ringparam(struct net_device *netdev,
-				  struct ethtool_ringparam *ring)
+				 struct ethtool_ringparam *ring)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 
@@ -280,7 +282,7 @@ static int i40evf_set_ringparam(struct net_device *netdev,
  * this functionality.
  **/
 static int i40evf_get_coalesce(struct net_device *netdev,
-			     struct ethtool_coalesce *ec)
+			       struct ethtool_coalesce *ec)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 	struct i40e_vsi *vsi = &adapter->vsi;
@@ -308,7 +310,7 @@ static int i40evf_get_coalesce(struct net_device *netdev,
  * Change current coalescing settings.
  **/
 static int i40evf_set_coalesce(struct net_device *netdev,
-			     struct ethtool_coalesce *ec)
+			       struct ethtool_coalesce *ec)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 	struct i40e_hw *hw = &adapter->hw;
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index d378e13..777d3a5 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -185,6 +185,7 @@ static void i40evf_tx_timeout(struct net_device *netdev)
 static void i40evf_misc_irq_disable(struct i40evf_adapter *adapter)
 {
 	struct i40e_hw *hw = &adapter->hw;
+
 	wr32(hw, I40E_VFINT_DYN_CTL01, 0);
 
 	/* read flush */
@@ -200,6 +201,7 @@ static void i40evf_misc_irq_disable(struct i40evf_adapter *adapter)
 static void i40evf_misc_irq_enable(struct i40evf_adapter *adapter)
 {
 	struct i40e_hw *hw = &adapter->hw;
+
 	wr32(hw, I40E_VFINT_DYN_CTL01, I40E_VFINT_DYN_CTL01_INTENA_MASK |
 				       I40E_VFINT_DYN_CTL01_ITR_INDX_MASK);
 	wr32(hw, I40E_VFINT_ICR0_ENA1, I40E_VFINT_ICR0_ENA_ADMINQ_MASK);
@@ -226,7 +228,6 @@ static void i40evf_irq_disable(struct i40evf_adapter *adapter)
 	}
 	/* read flush */
 	rd32(hw, I40E_VFGEN_RSTAT);
-
 }
 
 /**
@@ -253,8 +254,7 @@ void i40evf_irq_enable_queues(struct i40evf_adapter *adapter, u32 mask)
  * @adapter: board private structure
  * @mask: bitmap of vectors to trigger
  **/
-static void i40evf_fire_sw_int(struct i40evf_adapter *adapter,
-					    u32 mask)
+static void i40evf_fire_sw_int(struct i40evf_adapter *adapter, u32 mask)
 {
 	struct i40e_hw *hw = &adapter->hw;
 	int i;
@@ -551,6 +551,7 @@ static void i40evf_free_traffic_irqs(struct i40evf_adapter *adapter)
 {
 	int i;
 	int q_vectors;
+
 	q_vectors = adapter->num_msix_vectors - NONQ_VECS;
 
 	for (i = 0; i < q_vectors; i++) {
@@ -584,6 +585,7 @@ static void i40evf_configure_tx(struct i40evf_adapter *adapter)
 {
 	struct i40e_hw *hw = &adapter->hw;
 	int i;
+
 	for (i = 0; i < adapter->num_active_queues; i++)
 		adapter->tx_rings[i]->tail = hw->hw_addr + I40E_QTX_TAIL1(i);
 }
@@ -705,7 +707,7 @@ static void i40evf_del_vlan(struct i40evf_adapter *adapter, u16 vlan)
  * @vid: VLAN tag
  **/
 static int i40evf_vlan_rx_add_vid(struct net_device *netdev,
-			 __always_unused __be16 proto, u16 vid)
+				  __always_unused __be16 proto, u16 vid)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 
@@ -720,7 +722,7 @@ static int i40evf_vlan_rx_add_vid(struct net_device *netdev,
  * @vid: VLAN tag
  **/
 static int i40evf_vlan_rx_kill_vid(struct net_device *netdev,
-			  __always_unused __be16 proto, u16 vid)
+				   __always_unused __be16 proto, u16 vid)
 {
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 
@@ -881,6 +883,7 @@ static void i40evf_napi_enable_all(struct i40evf_adapter *adapter)
 
 	for (q_idx = 0; q_idx < q_vectors; q_idx++) {
 		struct napi_struct *napi;
+
 		q_vector = adapter->q_vector[q_idx];
 		napi = &q_vector->napi;
 		napi_enable(napi);
@@ -920,6 +923,7 @@ static void i40evf_configure(struct i40evf_adapter *adapter)
 
 	for (i = 0; i < adapter->num_active_queues; i++) {
 		struct i40e_ring *ring = adapter->rx_rings[i];
+
 		i40evf_alloc_rx_buffers(ring, ring->count);
 		ring->next_to_use = ring->count - 1;
 		writel(ring->next_to_use, ring->tail);
@@ -1088,7 +1092,7 @@ static int i40evf_alloc_queues(struct i40evf_adapter *adapter)
 		struct i40e_ring *tx_ring;
 		struct i40e_ring *rx_ring;
 
-		tx_ring = kzalloc(sizeof(struct i40e_ring) * 2, GFP_KERNEL);
+		tx_ring = kzalloc(sizeof(*tx_ring) * 2, GFP_KERNEL);
 		if (!tx_ring)
 			goto err_out;
 
@@ -1172,14 +1176,14 @@ static int i40evf_alloc_q_vectors(struct i40evf_adapter *adapter)
 	num_q_vectors = adapter->num_msix_vectors - NONQ_VECS;
 
 	for (q_idx = 0; q_idx < num_q_vectors; q_idx++) {
-		q_vector = kzalloc(sizeof(struct i40e_q_vector), GFP_KERNEL);
+		q_vector = kzalloc(sizeof(*q_vector), GFP_KERNEL);
 		if (!q_vector)
 			goto err_out;
 		q_vector->adapter = adapter;
 		q_vector->vsi = &adapter->vsi;
 		q_vector->v_idx = q_idx;
 		netif_napi_add(adapter->netdev, &q_vector->napi,
-				       i40evf_napi_poll, NAPI_POLL_WEIGHT);
+			       i40evf_napi_poll, NAPI_POLL_WEIGHT);
 		adapter->q_vector[q_idx] = q_vector;
 	}
 
@@ -1265,8 +1269,8 @@ int i40evf_init_interrupt_scheme(struct i40evf_adapter *adapter)
 	}
 
 	dev_info(&adapter->pdev->dev, "Multiqueue %s: Queue pair count = %u",
-		(adapter->num_active_queues > 1) ? "Enabled" :
-		"Disabled", adapter->num_active_queues);
+		 (adapter->num_active_queues > 1) ? "Enabled" : "Disabled",
+		 adapter->num_active_queues);
 
 	return 0;
 err_alloc_queues:
@@ -1284,6 +1288,7 @@ err_set_interrupt:
 static void i40evf_watchdog_timer(unsigned long data)
 {
 	struct i40evf_adapter *adapter = (struct i40evf_adapter *)data;
+
 	schedule_work(&adapter->watchdog_task);
 	/* timer will be rescheduled in watchdog task */
 }
@@ -1295,8 +1300,8 @@ static void i40evf_watchdog_timer(unsigned long data)
 static void i40evf_watchdog_task(struct work_struct *work)
 {
 	struct i40evf_adapter *adapter = container_of(work,
-					  struct i40evf_adapter,
-					  watchdog_task);
+						      struct i40evf_adapter,
+						      watchdog_task);
 	struct i40e_hw *hw = &adapter->hw;
 	uint32_t rstat_val;
 
@@ -1334,7 +1339,7 @@ static void i40evf_watchdog_task(struct work_struct *work)
 
 	/* check for reset */
 	rstat_val = rd32(hw, I40E_VFGEN_RSTAT) &
-			    I40E_VFGEN_RSTAT_VFR_STATE_MASK;
+		    I40E_VFGEN_RSTAT_VFR_STATE_MASK;
 	if (!(adapter->flags & I40EVF_FLAG_RESET_PENDING) &&
 	    (rstat_val != I40E_VFR_VFACTIVE) &&
 	    (rstat_val != I40E_VFR_COMPLETED)) {
@@ -1575,12 +1580,12 @@ continue_reset:
 	/* kill and reinit the admin queue */
 	if (i40evf_shutdown_adminq(hw))
 		dev_warn(&adapter->pdev->dev,
-			"%s: Failed to destroy the Admin Queue resources\n",
-			__func__);
+			 "%s: Failed to destroy the Admin Queue resources\n",
+			 __func__);
 	err = i40evf_init_adminq(hw);
 	if (err)
 		dev_info(&adapter->pdev->dev, "%s: init_adminq failed: %d\n",
-			__func__, err);
+			 __func__, err);
 
 	adapter->aq_pending = 0;
 	adapter->aq_required = 0;
@@ -1646,9 +1651,8 @@ static void i40evf_adminq_task(struct work_struct *work)
 		i40evf_virtchnl_completion(adapter, v_msg->v_opcode,
 					   v_msg->v_retval, event.msg_buf,
 					   event.msg_len);
-		if (pending != 0) {
+		if (pending != 0)
 			memset(event.msg_buf, 0, I40EVF_MAX_AQ_BUF_SIZE);
-		}
 	} while (pending);
 
 	/* check for error indications */
@@ -1705,7 +1709,6 @@ static void i40evf_free_all_tx_resources(struct i40evf_adapter *adapter)
 	for (i = 0; i < adapter->num_active_queues; i++)
 		if (adapter->tx_rings[i]->desc)
 			i40evf_free_tx_resources(adapter->tx_rings[i]);
-
 }
 
 /**
@@ -2019,7 +2022,7 @@ static void i40evf_init_task(struct work_struct *work)
 		err = i40evf_check_reset_complete(hw);
 		if (err) {
 			dev_info(&pdev->dev, "Device is still in reset (%d), retrying\n",
-				err);
+				 err);
 			goto err;
 		}
 		hw->aq.num_arq_entries = I40EVF_AQ_LEN;
@@ -2051,7 +2054,7 @@ static void i40evf_init_task(struct work_struct *work)
 		err = i40evf_verify_api_ver(adapter);
 		if (err) {
 			dev_info(&pdev->dev, "Unable to verify API version (%d), retrying\n",
-				err);
+				 err);
 			if (err == I40E_ERR_ADMIN_QUEUE_NO_WORK) {
 				dev_info(&pdev->dev, "Resending request\n");
 				err = i40evf_send_api_ver(adapter);
@@ -2500,8 +2503,9 @@ static struct pci_driver i40evf_driver = {
 static int __init i40evf_init_module(void)
 {
 	int ret;
+
 	pr_info("i40evf: %s - version %s\n", i40evf_driver_string,
-	       i40evf_driver_version);
+		i40evf_driver_version);
 
 	pr_info("%s\n", i40evf_copyright);
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 422dd2d..07c13b0 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -395,7 +395,7 @@ void i40evf_add_ether_addrs(struct i40evf_adapter *adapter)
 	      (count * sizeof(struct i40e_virtchnl_ether_addr));
 	if (len > I40EVF_MAX_AQ_BUF_SIZE) {
 		dev_warn(&adapter->pdev->dev, "%s: Too many MAC address changes in one request\n",
-			__func__);
+			 __func__);
 		count = (I40EVF_MAX_AQ_BUF_SIZE -
 			 sizeof(struct i40e_virtchnl_ether_addr_list)) /
 			sizeof(struct i40e_virtchnl_ether_addr);
@@ -456,7 +456,7 @@ void i40evf_del_ether_addrs(struct i40evf_adapter *adapter)
 	      (count * sizeof(struct i40e_virtchnl_ether_addr));
 	if (len > I40EVF_MAX_AQ_BUF_SIZE) {
 		dev_warn(&adapter->pdev->dev, "%s: Too many MAC address changes in one request\n",
-			__func__);
+			 __func__);
 		count = (I40EVF_MAX_AQ_BUF_SIZE -
 			 sizeof(struct i40e_virtchnl_ether_addr_list)) /
 			sizeof(struct i40e_virtchnl_ether_addr);
@@ -518,7 +518,7 @@ void i40evf_add_vlans(struct i40evf_adapter *adapter)
 	      (count * sizeof(u16));
 	if (len > I40EVF_MAX_AQ_BUF_SIZE) {
 		dev_warn(&adapter->pdev->dev, "%s: Too many VLAN changes in one request\n",
-			__func__);
+			 __func__);
 		count = (I40EVF_MAX_AQ_BUF_SIZE -
 			 sizeof(struct i40e_virtchnl_vlan_filter_list)) /
 			sizeof(u16);
@@ -578,7 +578,7 @@ void i40evf_del_vlans(struct i40evf_adapter *adapter)
 	      (count * sizeof(u16));
 	if (len > I40EVF_MAX_AQ_BUF_SIZE) {
 		dev_warn(&adapter->pdev->dev, "%s: Too many VLAN changes in one request\n",
-			__func__);
+			 __func__);
 		count = (I40EVF_MAX_AQ_BUF_SIZE -
 			 sizeof(struct i40e_virtchnl_vlan_filter_list)) /
 			sizeof(u16);
@@ -637,6 +637,7 @@ void i40evf_set_promiscuous(struct i40evf_adapter *adapter, int flags)
 void i40evf_request_stats(struct i40evf_adapter *adapter)
 {
 	struct i40e_virtchnl_queue_select vqs;
+
 	if (adapter->current_op != I40E_VIRTCHNL_OP_UNKNOWN) {
 		/* no error message, this isn't crucial */
 		return;
@@ -711,7 +712,6 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
 				"%s: Unknown event %d from pf\n",
 				__func__, vpe->event);
 			break;
-
 		}
 		return;
 	}
@@ -776,7 +776,7 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
 		break;
 	default:
 		dev_warn(&adapter->pdev->dev, "%s: Received unexpected message %d from PF\n",
-			__func__, v_opcode);
+			 __func__, v_opcode);
 		break;
 	} /* switch v_opcode */
 	adapter->current_op = I40E_VIRTCHNL_OP_UNKNOWN;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 13/15] i40evf: make comparisons consistent
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (11 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 12/15] i40evf: make checkpatch happy Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 14/15] i40evf: remove unnecessary else Jeff Kirsher
  2014-11-19  4:10 ` [net-next 15/15] i40e: trigger SW INT with no ITR wait Jeff Kirsher
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem; +Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Mitch Williams <mitch.a.williams@intel.com>

Most of the null-checking in this driver is of the style if (!foo),
except these few. Make these checks consistent with the rest of the
code.

Change-ID: I991924f34072fa607a1b626a8b3f1fa5195d43e9
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 777d3a5..233710c 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -669,9 +669,9 @@ i40evf_vlan_filter *i40evf_add_vlan(struct i40evf_adapter *adapter, u16 vlan)
 	struct i40evf_vlan_filter *f;
 
 	f = i40evf_find_vlan(adapter, vlan);
-	if (NULL == f) {
+	if (!f) {
 		f = kzalloc(sizeof(*f), GFP_ATOMIC);
-		if (NULL == f)
+		if (!f)
 			return NULL;
 
 		f->vlan = vlan;
@@ -774,9 +774,9 @@ i40evf_mac_filter *i40evf_add_filter(struct i40evf_adapter *adapter,
 		udelay(1);
 
 	f = i40evf_find_filter(adapter, macaddr);
-	if (NULL == f) {
+	if (!f) {
 		f = kzalloc(sizeof(*f), GFP_ATOMIC);
-		if (NULL == f) {
+		if (!f) {
 			clear_bit(__I40EVF_IN_CRITICAL_TASK,
 				  &adapter->crit_section);
 			return NULL;
@@ -2138,7 +2138,7 @@ static void i40evf_init_task(struct work_struct *work)
 	ether_addr_copy(netdev->perm_addr, adapter->hw.mac.addr);
 
 	f = kzalloc(sizeof(*f), GFP_ATOMIC);
-	if (NULL == f)
+	if (!f)
 		goto err_sw_init;
 
 	ether_addr_copy(f->macaddr, adapter->hw.mac.addr);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 14/15] i40evf: remove unnecessary else
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (12 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 13/15] i40evf: make comparisons consistent Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  2014-11-19  4:10 ` [net-next 15/15] i40e: trigger SW INT with no ITR wait Jeff Kirsher
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem; +Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Mitch Williams <mitch.a.williams@intel.com>

Since the if part of this statement contains a break, there's no reason
for the else. Clean up the code and make it more obvious that the delay
happens each time through the loop.

Change-ID: I9292eaf7dd687688bdc401b8bd8d1d14f6944460
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 233710c..8e01009 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1513,8 +1513,7 @@ static void i40evf_reset_task(struct work_struct *work)
 		if ((rstat_val != I40E_VFR_VFACTIVE) &&
 		    (rstat_val != I40E_VFR_COMPLETED))
 			break;
-		else
-			msleep(I40EVF_RESET_WAIT_MS);
+		msleep(I40EVF_RESET_WAIT_MS);
 	}
 	if (i == I40EVF_RESET_WAIT_COUNT) {
 		adapter->flags &= ~I40EVF_FLAG_RESET_PENDING;
@@ -1528,8 +1527,7 @@ static void i40evf_reset_task(struct work_struct *work)
 		if ((rstat_val == I40E_VFR_VFACTIVE) ||
 		    (rstat_val == I40E_VFR_COMPLETED))
 			break;
-		else
-			msleep(I40EVF_RESET_WAIT_MS);
+		msleep(I40EVF_RESET_WAIT_MS);
 	}
 	if (i == I40EVF_RESET_WAIT_COUNT) {
 		struct i40evf_mac_filter *f, *ftmp;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [net-next 15/15] i40e: trigger SW INT with no ITR wait
  2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
                   ` (13 preceding siblings ...)
  2014-11-19  4:10 ` [net-next 14/15] i40evf: remove unnecessary else Jeff Kirsher
@ 2014-11-19  4:10 ` Jeff Kirsher
  14 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  4:10 UTC (permalink / raw)
  To: davem; +Cc: Shannon Nelson, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Shannon Nelson <shannon.nelson@intel.com>

Since we want the SW INT to go off as soon as possible, write the
extra bits that will turn off the ITR wait for the interrupt.

Change-ID: I6d5382ba60840fa32abb7dea17c839eb4b5f68f7
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c |  5 ++++-
 drivers/net/ethernet/intel/i40e/i40e_main.c    | 10 ++++++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index bb1698a..b2402851 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -1400,7 +1400,10 @@ static int i40e_intr_test(struct net_device *netdev, u64 *data)
 	netif_info(pf, hw, netdev, "interrupt test\n");
 	wr32(&pf->hw, I40E_PFINT_DYN_CTL0,
 	     (I40E_PFINT_DYN_CTL0_INTENA_MASK |
-	      I40E_PFINT_DYN_CTL0_SWINT_TRIG_MASK));
+	      I40E_PFINT_DYN_CTL0_SWINT_TRIG_MASK |
+	      I40E_PFINT_DYN_CTL0_ITR_INDX_MASK |
+	      I40E_PFINT_DYN_CTL0_SW_ITR_INDX_ENA_MASK |
+	      I40E_PFINT_DYN_CTL0_SW_ITR_INDX_MASK));
 	usleep_range(1000, 2000);
 	*data = (swc_old == pf->sw_int_count);
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 3ebab03..3913329 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5565,11 +5565,17 @@ static void i40e_check_hang_subtask(struct i40e_pf *pf)
 			if (!(pf->flags & I40E_FLAG_MSIX_ENABLED)) {
 				wr32(&vsi->back->hw, I40E_PFINT_DYN_CTL0,
 				     (I40E_PFINT_DYN_CTL0_INTENA_MASK |
-				      I40E_PFINT_DYN_CTL0_SWINT_TRIG_MASK));
+				      I40E_PFINT_DYN_CTL0_SWINT_TRIG_MASK |
+				      I40E_PFINT_DYN_CTL0_ITR_INDX_MASK |
+				      I40E_PFINT_DYN_CTL0_SW_ITR_INDX_ENA_MASK |
+				      I40E_PFINT_DYN_CTL0_SW_ITR_INDX_MASK));
 			} else {
 				u16 vec = vsi->base_vector - 1;
 				u32 val = (I40E_PFINT_DYN_CTLN_INTENA_MASK |
-					   I40E_PFINT_DYN_CTLN_SWINT_TRIG_MASK);
+				      I40E_PFINT_DYN_CTLN_SWINT_TRIG_MASK |
+				      I40E_PFINT_DYN_CTLN_ITR_INDX_MASK |
+				      I40E_PFINT_DYN_CTLN_SW_ITR_INDX_ENA_MASK |
+				      I40E_PFINT_DYN_CTLN_SW_ITR_INDX_MASK);
 				for (i = 0; i < vsi->num_q_vectors; i++, vec++)
 					wr32(&vsi->back->hw,
 					     I40E_PFINT_DYN_CTLN(vec), val);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [net-next 12/15] i40evf: make checkpatch happy
  2014-11-19  4:10 ` [net-next 12/15] i40evf: make checkpatch happy Jeff Kirsher
@ 2014-11-19  4:59   ` Joe Perches
  2014-11-19  9:55     ` Jeff Kirsher
  0 siblings, 1 reply; 20+ messages in thread
From: Joe Perches @ 2014-11-19  4:59 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, Mitch Williams, netdev, nhorman, sassmann, jogreene

On Tue, 2014-11-18 at 20:10 -0800, Jeff Kirsher wrote:
> This patch is the result of running checkpatch on the i40evf driver with
> the --strict option. The vast majority of changes are adding/removing
> blank lines, aligning function parameters, and correcting over-long
> lines.

Hey Mitch, Jeff:

> diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
[]
> @@ -1265,8 +1269,8 @@ int i40evf_init_interrupt_scheme(struct i40evf_adapter *adapter)
>  	}
>  
>  	dev_info(&adapter->pdev->dev, "Multiqueue %s: Queue pair count = %u",
> -		(adapter->num_active_queues > 1) ? "Enabled" :
> -		"Disabled", adapter->num_active_queues);
> +		 (adapter->num_active_queues > 1) ? "Enabled" : "Disabled",
> +		 adapter->num_active_queues);

You could add a newline to that format one day.

> diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
[]
> @@ -711,7 +712,6 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
>  				"%s: Unknown event %d from pf\n",
>  				__func__, vpe->event);
>  			break;
> -
>  		}
>  		return;
>  	}
> @@ -776,7 +776,7 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
>  		break;
>  	default:
>  		dev_warn(&adapter->pdev->dev, "%s: Received unexpected message %d from PF\n",
> -			__func__, v_opcode);
> +			 __func__, v_opcode);
>  		break;
>  	} /* switch v_opcode */
>  	adapter->current_op = I40E_VIRTCHNL_OP_UNKNOWN;

And be consistent with PF vs pf usage too.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [net-next 12/15] i40evf: make checkpatch happy
  2014-11-19  4:59   ` Joe Perches
@ 2014-11-19  9:55     ` Jeff Kirsher
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff Kirsher @ 2014-11-19  9:55 UTC (permalink / raw)
  To: Joe Perches; +Cc: davem, Mitch Williams, netdev, nhorman, sassmann, jogreene

[-- Attachment #1: Type: text/plain, Size: 2056 bytes --]

On Tue, 2014-11-18 at 20:59 -0800, Joe Perches wrote:
> On Tue, 2014-11-18 at 20:10 -0800, Jeff Kirsher wrote:
> > This patch is the result of running checkpatch on the i40evf driver with
> > the --strict option. The vast majority of changes are adding/removing
> > blank lines, aligning function parameters, and correcting over-long
> > lines.
> 
> Hey Mitch, Jeff:
> 
> > diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
> []
> > @@ -1265,8 +1269,8 @@ int i40evf_init_interrupt_scheme(struct i40evf_adapter *adapter)
> >  	}
> >  
> >  	dev_info(&adapter->pdev->dev, "Multiqueue %s: Queue pair count = %u",
> > -		(adapter->num_active_queues > 1) ? "Enabled" :
> > -		"Disabled", adapter->num_active_queues);
> > +		 (adapter->num_active_queues > 1) ? "Enabled" : "Disabled",
> > +		 adapter->num_active_queues);
> 
> You could add a newline to that format one day.
> 
> > diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
> []
> > @@ -711,7 +712,6 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
> >  				"%s: Unknown event %d from pf\n",
> >  				__func__, vpe->event);
> >  			break;
> > -
> >  		}
> >  		return;
> >  	}
> > @@ -776,7 +776,7 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
> >  		break;
> >  	default:
> >  		dev_warn(&adapter->pdev->dev, "%s: Received unexpected message %d from PF\n",
> > -			__func__, v_opcode);
> > +			 __func__, v_opcode);
> >  		break;
> >  	} /* switch v_opcode */
> >  	adapter->current_op = I40E_VIRTCHNL_OP_UNKNOWN;
> 
> And be consistent with PF vs pf usage too.

Thanks for noticing, I had not caught that there was inconsistent use.
I try to catch acronyms that are not capitalized, and after looking
through the code, I see that there is inconsistent use of VF vs vf as
well.  I am putting together a patch now that will make the use of both
acronyms consistent throughout our code.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives
  2014-11-19  4:10 ` [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives Jeff Kirsher
@ 2014-11-19 18:24   ` Alexander Duyck
  2014-11-19 23:06     ` Tantilov, Emil S
  0 siblings, 1 reply; 20+ messages in thread
From: Alexander Duyck @ 2014-11-19 18:24 UTC (permalink / raw)
  To: Jeff Kirsher, davem; +Cc: Emil Tantilov, netdev, nhorman, sassmann, jogreene

There were a few things in this patch that should be addressed.

Comments inline below.

- Alex

On 11/18/2014 08:10 PM, Jeff Kirsher wrote:
> From: Emil Tantilov <emil.s.tantilov@intel.com>
>
> This patch changes the basic receive path for ixgbevf so that instead of
> receiving the data into an skb it is received into a double buffered page.
> The main change is that the receives will be done in pages only and then
> pull the header out of the page and copy it into the sk_buff data.
>
> This has the advantages of reduced cache misses and improved performance on
> IOMMU enabled systems.
>
> CC: Alexander Duyck <alexander.h.duyck@redhat.com>
> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>   drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  24 +-
>   drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 452 +++++++++++++++-------
>   2 files changed, 331 insertions(+), 145 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
> index 72a354b..2362001 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
> @@ -58,8 +58,9 @@ struct ixgbevf_tx_buffer {
>   };
>
>   struct ixgbevf_rx_buffer {
> -	struct sk_buff *skb;
>   	dma_addr_t dma;
> +	struct page *page;
> +	unsigned int page_offset;
>   };
>
>   struct ixgbevf_stats {
> @@ -91,9 +92,10 @@ struct ixgbevf_ring {
>   	void *desc;			/* descriptor ring memory */
>   	dma_addr_t dma;			/* phys. address of descriptor ring */
>   	unsigned int size;		/* length in bytes */
> -	unsigned int count;		/* amount of descriptors */
> -	unsigned int next_to_use;
> -	unsigned int next_to_clean;
> +	u16 count;			/* amount of descriptors */
> +	u16 next_to_use;
> +	u16 next_to_clean;
> +	u16 next_to_alloc;
>
>   	union {
>   		struct ixgbevf_tx_buffer *tx_buffer_info;
> @@ -109,12 +111,11 @@ struct ixgbevf_ring {
>
>   	u64 hw_csum_rx_error;
>   	u8 __iomem *tail;
> +	struct sk_buff *skb;
>
>   	u16 reg_idx; /* holds the special value that gets the hardware register
>   		      * offset associated with this ring, which is different
>   		      * for DCB and RSS modes */
> -
> -	u16 rx_buf_len;
>   	int queue_index; /* needed for multiqueue queue management */
>   };
>
> @@ -133,12 +134,10 @@ struct ixgbevf_ring {
>
>   /* Supported Rx Buffer Sizes */
>   #define IXGBEVF_RXBUFFER_256   256    /* Used for packet split */
> -#define IXGBEVF_RXBUFFER_2K    2048
> -#define IXGBEVF_RXBUFFER_4K    4096
> -#define IXGBEVF_RXBUFFER_8K    8192
> -#define IXGBEVF_RXBUFFER_10K   10240
> +#define IXGBEVF_RXBUFFER_2048	2048
>
>   #define IXGBEVF_RX_HDR_SIZE IXGBEVF_RXBUFFER_256
> +#define IXGBEVF_RX_BUFSZ	IXGBEVF_RXBUFFER_2048
>
>   #define MAXIMUM_ETHERNET_VLAN_SIZE (VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
>
> @@ -430,11 +429,6 @@ enum ixbgevf_state_t {
>   	__IXGBEVF_WORK_INIT,
>   };
>
> -struct ixgbevf_cb {
> -	struct sk_buff *prev;
> -};
> -#define IXGBE_CB(skb) ((struct ixgbevf_cb *)(skb)->cb)
> -
>   enum ixgbevf_boards {
>   	board_82599_vf,
>   	board_X540_vf,
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index 20bebd2..2ca7c96 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -422,8 +422,7 @@ static void ixgbevf_process_skb_fields(struct ixgbevf_ring *rx_ring,
>    * that this is in fact a non-EOP buffer.
>    **/
>   static bool ixgbevf_is_non_eop(struct ixgbevf_ring *rx_ring,
> -			       union ixgbe_adv_rx_desc *rx_desc,
> -			       struct sk_buff *skb)
> +			       union ixgbe_adv_rx_desc *rx_desc)
>   {
>   	u32 ntc = rx_ring->next_to_clean + 1;
>
> @@ -439,37 +438,40 @@ static bool ixgbevf_is_non_eop(struct ixgbevf_ring *rx_ring,
>   	return true;
>   }
>
> -static bool ixgbevf_alloc_mapped_skb(struct ixgbevf_ring *rx_ring,
> -				     struct ixgbevf_rx_buffer *bi)
> +static bool ixgbevf_alloc_mapped_page(struct ixgbevf_ring *rx_ring,
> +				      struct ixgbevf_rx_buffer *bi)
>   {
> -	struct sk_buff *skb = bi->skb;
> +	struct page *page = bi->page;
>   	dma_addr_t dma = bi->dma;
>
> -	if (unlikely(skb))
> +	/* since we are recycling buffers we should seldom need to alloc */
> +	if (likely(page))
>   		return true;
>
> -	skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
> -					rx_ring->rx_buf_len);
> -	if (unlikely(!skb)) {
> -		rx_ring->rx_stats.alloc_rx_buff_failed++;
> +	/* alloc new page for storage */
> +	page = dev_alloc_page();
> +	if (unlikely(!page)) {
> +		rx_ring->rx_stats.alloc_rx_page_failed++;
>   		return false;
>   	}
>
> -	dma = dma_map_single(rx_ring->dev, skb->data,
> -			     rx_ring->rx_buf_len, DMA_FROM_DEVICE);
> +	/* map page for use */
> +	dma = dma_map_page(rx_ring->dev, page, 0,
> +			   PAGE_SIZE, DMA_FROM_DEVICE);
>
>   	/* if mapping failed free memory back to system since
>   	 * there isn't much point in holding memory we can't use
>   	 */
>   	if (dma_mapping_error(rx_ring->dev, dma)) {
> -		dev_kfree_skb_any(skb);
> +		__free_page(page);
>
>   		rx_ring->rx_stats.alloc_rx_buff_failed++;
>   		return false;
>   	}
>
> -	bi->skb = skb;
>   	bi->dma = dma;
> +	bi->page = page;
> +	bi->page_offset = 0;
>
>   	return true;
>   }
> @@ -495,13 +497,13 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>   	i -= rx_ring->count;
>
>   	do {
> -		if (!ixgbevf_alloc_mapped_skb(rx_ring, bi))
> +		if (!ixgbevf_alloc_mapped_page(rx_ring, bi))
>   			break;
>
>   		/* Refresh the desc even if pkt_addr didn't change
>   		 * because each write-back erases this info.
>   		 */
> -		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma);
> +		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset);
>
>   		rx_desc++;
>   		bi++;
> @@ -524,6 +526,9 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>   		/* record the next descriptor to use */
>   		rx_ring->next_to_use = i;
>
> +		/* update next to alloc since we have filled the ring */
> +		rx_ring->next_to_alloc = i;
> +
>   		/* Force memory writes to complete before letting h/w
>   		 * know there are new descriptors to fetch.  (Only
>   		 * applicable for weak-ordered memory model archs,
> @@ -534,6 +539,257 @@ static void ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>   	}
>   }
>
> +/* ixgbevf_pull_tail - ixgbevf specific version of skb_pull_tail
> + * @rx_ring: Rx descriptor ring packet is being transacted on
> + * @skb: pointer to current skb being adjusted
> + *
> + * This function is an ixgbevf specific version of __pskb_pull_tail.  The
> + * main difference between this version and the original function is that
> + * this function can make several assumptions about the state of things
> + * that allow for significant optimizations versus the standard function.
> + * As a result we can do things like drop a frag and maintain an accurate
> + * truesize for the skb.
> + */
> +static void ixgbevf_pull_tail(struct ixgbevf_ring *rx_ring,
> +			      struct sk_buff *skb)
> +{
> +	struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[0];
> +	unsigned char *va;
> +	unsigned int pull_len;
> +
> +	/* it is valid to use page_address instead of kmap since we are
> +	 * working with pages allocated out of the lomem pool per
> +	 * alloc_page(GFP_ATOMIC)
> +	 */
> +	va = skb_frag_address(frag);
> +
> +	/* we need the header to contain the greater of either ETH_HLEN or
> +	 * 60 bytes if the skb->len is less than 60 for skb_pad.
> +	 */
> +	pull_len = eth_get_headlen(va, IXGBEVF_RX_HDR_SIZE);
> +
> +	/* align pull length to size of long to optimize memcpy performance */
> +	skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
> +
> +	/* update all of the pointers */
> +	skb_frag_size_sub(frag, pull_len);
> +	frag->page_offset += pull_len;
> +	skb->data_len -= pull_len;
> +	skb->tail += pull_len;
> +}

I really think we should look at making this into a generic function. 
Maybe I will submit something later today to get a common function 
placed in the code.  Maybe something like eth_pull_tail.

> +/* ixgbevf_cleanup_headers - Correct corrupted or empty headers
> + * @rx_ring: Rx descriptor ring packet is being transacted on
> + * @rx_desc: pointer to the EOP Rx descriptor
> + * @skb: pointer to current skb being fixed
> + *
> + * Check for corrupted packet headers caused by senders on the local L2
> + * embedded NIC switch not setting up their Tx Descriptors right.  These
> + * should be very rare.
> + *
> + * Also address the case where we are pulling data in on pages only
> + * and as such no data is present in the skb header.
> + *
> + * In addition if skb is not at least 60 bytes we need to pad it so that
> + * it is large enough to qualify as a valid Ethernet frame.
> + *
> + * Returns true if an error was encountered and skb was freed.
> + */
> +static bool ixgbevf_cleanup_headers(struct ixgbevf_ring *rx_ring,
> +				    union ixgbe_adv_rx_desc *rx_desc,
> +				    struct sk_buff *skb)
> +{
> +	/* verify that the packet does not have any known errors */
> +	if (unlikely(ixgbevf_test_staterr(rx_desc,
> +					  IXGBE_RXDADV_ERR_FRAME_ERR_MASK))) {
> +		struct net_device *netdev = rx_ring->netdev;
> +
> +		if (!(netdev->features & NETIF_F_RXALL)) {
> +			dev_kfree_skb_any(skb);
> +			return true;
> +		}
> +	}
> +
> +	/* place header in linear portion of buffer */
> +	if (skb_is_nonlinear(skb))
> +		ixgbevf_pull_tail(rx_ring, skb);
> +
> +	/* if skb_pad returns an error the skb was freed */
> +	if (unlikely(skb->len < 60)) {
> +		int pad_len = 60 - skb->len;
> +
> +		if (skb_pad(skb, pad_len))
> +			return true;
> +		__skb_put(skb, pad_len);
> +	}
> +
> +	return false;
> +}

The same goes for the padding bit here.  Maybe something like eth_skb_pad.

> +/* ixgbevf_reuse_rx_page - page flip buffer and store it back on the ring
> + * @rx_ring: Rx descriptor ring to store buffers on
> + * @old_buff: donor buffer to have page reused
> + *
> + * Synchronizes page for reuse by the adapter
> + */
> +static void ixgbevf_reuse_rx_page(struct ixgbevf_ring *rx_ring,
> +				  struct ixgbevf_rx_buffer *old_buff)
> +{
> +	struct ixgbevf_rx_buffer *new_buff;
> +	u16 nta = rx_ring->next_to_alloc;
> +
> +	new_buff = &rx_ring->rx_buffer_info[nta];
> +
> +	/* update, and store next to alloc */
> +	nta++;
> +	rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
> +
> +	/* transfer page from old buffer to new buffer */
> +	new_buff->page = old_buff->page;
> +	new_buff->dma = old_buff->dma;
> +	new_buff->page_offset = old_buff->page_offset;
> +
> +	/* sync the buffer for use by the device */
> +	dma_sync_single_range_for_device(rx_ring->dev, new_buff->dma,
> +					 new_buff->page_offset,
> +					 IXGBEVF_RX_BUFSZ,
> +					 DMA_FROM_DEVICE);
> +}
> +
> +/* ixgbevf_add_rx_frag - Add contents of Rx buffer to sk_buff
> + * @rx_ring: Rx descriptor ring to transact packets on
> + * @rx_buffer: buffer containing page to add
> + * @rx_desc: descriptor containing length of buffer written by hardware
> + * @skb: sk_buff to place the data into
> + *
> + * This function will add the data contained in rx_buffer->page to the skb.
> + * This is done either through a direct copy if the data in the buffer is
> + * less than the skb header size, otherwise it will just attach the page as
> + * a frag to the skb.
> + *
> + * The function will then update the page offset if necessary and return
> + * true if the buffer can be reused by the adapter.
> + */
> +static bool ixgbevf_add_rx_frag(struct ixgbevf_ring *rx_ring,
> +				struct ixgbevf_rx_buffer *rx_buffer,
> +				union ixgbe_adv_rx_desc *rx_desc,
> +				struct sk_buff *skb)
> +{
> +	struct page *page = rx_buffer->page;
> +	unsigned int size = le16_to_cpu(rx_desc->wb.upper.length);
> +#if (PAGE_SIZE < 8192)
> +	unsigned int truesize = IXGBEVF_RX_BUFSZ;
> +#else
> +	unsigned int truesize = ALIGN(size, L1_CACHE_BYTES);
> +#endif
> +
> +	if ((size <= IXGBEVF_RX_HDR_SIZE) && !skb_is_nonlinear(skb)) {
> +		unsigned char *va = page_address(page) + rx_buffer->page_offset;
> +
> +		memcpy(__skb_put(skb, size), va, ALIGN(size, sizeof(long)));
> +
> +		/* we can reuse buffer as-is, just make sure it is local */
> +		if (likely(page_to_nid(page) == numa_node_id()))
> +			return true;
> +
> +		/* this page cannot be reused so discard it */
> +		put_page(page);
> +		return false;
> +	}
> +
> +	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
> +			rx_buffer->page_offset, size, truesize);
> +
> +	/* avoid re-using remote pages */
> +	if (unlikely(page_to_nid(page) != numa_node_id()))
> +		return false;
> +

This is missing the pfmemalloc fix that is already in igb, and that I 
submitted for ixgbe and fm10k.


> +#if (PAGE_SIZE < 8192)
> +	/* if we are only owner of page we can reuse it */
> +	if (unlikely(page_count(page) != 1))
> +		return false;
> +
> +	/* flip page offset to other buffer */
> +	rx_buffer->page_offset ^= IXGBEVF_RX_BUFSZ;
> +
> +	/* since we are the only owner of the page and we need to
> +	 * increment it.
> +	 */
> +	atomic_inc(&page->_count);
> +#else
> +	/* move offset up to the next cache line */
> +	rx_buffer->page_offset += truesize;
> +
> +	if (rx_buffer->page_offset > (PAGE_SIZE - IXGBEVF_RX_BUFSZ))
> +		return false;
> +
> +	/* bump ref count on page before it is given to the stack */
> +	get_page(page);
> +#endif
> +
> +	return true;
> +}

The get_page and atomic_inc calls can be pulled out and placed after if 
#if/else logic.  The preference is to use atomic_inc since you are using 
an order 0 page and don't need to check for PageTail().

> +static struct sk_buff *ixgbevf_fetch_rx_buffer(struct ixgbevf_ring *rx_ring,
> +					       union ixgbe_adv_rx_desc *rx_desc,
> +					       struct sk_buff *skb)
> +{
> +	struct ixgbevf_rx_buffer *rx_buffer;
> +	struct page *page;
> +
> +	rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
> +	page = rx_buffer->page;
> +	prefetchw(page);
> +
> +	if (likely(!skb)) {
> +		void *page_addr = page_address(page) +
> +				  rx_buffer->page_offset;
> +
> +		/* prefetch first cache line of first page */
> +		prefetch(page_addr);
> +#if L1_CACHE_BYTES < 128
> +		prefetch(page_addr + L1_CACHE_BYTES);
> +#endif
> +
> +		/* allocate a skb to store the frags */
> +		skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
> +						IXGBEVF_RX_HDR_SIZE);
> +		if (unlikely(!skb)) {
> +			rx_ring->rx_stats.alloc_rx_buff_failed++;
> +			return NULL;
> +		}
> +
> +		/* we will be copying header into skb->data in
> +		 * pskb_may_pull so it is in our interest to prefetch
> +		 * it now to avoid a possible cache miss
> +		 */
> +		prefetchw(skb->data);
> +	}
> +
> +	/* we are reusing so sync this buffer for CPU use */
> +	dma_sync_single_range_for_cpu(rx_ring->dev,
> +				      rx_buffer->dma,
> +				      rx_buffer->page_offset,
> +				      IXGBEVF_RX_BUFSZ,
> +				      DMA_FROM_DEVICE);
> +
> +	/* pull page into skb */
> +	if (ixgbevf_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
> +		/* hand second half of page back to the ring */
> +		ixgbevf_reuse_rx_page(rx_ring, rx_buffer);
> +	} else {
> +		/* we are not reusing the buffer so unmap it */
> +		dma_unmap_page(rx_ring->dev, rx_buffer->dma,
> +			       PAGE_SIZE, DMA_FROM_DEVICE);
> +	}
> +
> +	/* clear contents of buffer_info */
> +	rx_buffer->dma = 0;
> +	rx_buffer->page = NULL;
> +
> +	return skb;
> +}
> +
>   static inline void ixgbevf_irq_enable_queues(struct ixgbevf_adapter *adapter,
>   					     u32 qmask)
>   {

<snip>

> @@ -3320,21 +3522,11 @@ static int ixgbevf_set_mac(struct net_device *netdev, void *p)
>   static int ixgbevf_change_mtu(struct net_device *netdev, int new_mtu)
>   {
>   	struct ixgbevf_adapter *adapter = netdev_priv(netdev);
> +	struct ixgbe_hw *hw = &adapter->hw;
>   	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
> -	int max_possible_frame = MAXIMUM_ETHERNET_VLAN_SIZE;
> -
> -	switch (adapter->hw.api_version) {
> -	case ixgbe_mbox_api_11:
> -		max_possible_frame = IXGBE_MAX_JUMBO_FRAME_SIZE;
> -		break;
> -	default:
> -		if (adapter->hw.mac.type == ixgbe_mac_X540_vf)
> -			max_possible_frame = IXGBE_MAX_JUMBO_FRAME_SIZE;
> -		break;
> -	}
>
>   	/* MTU < 68 is an error and causes problems on some kernels */
> -	if ((new_mtu < 68) || (max_frame > max_possible_frame))
> +	if ((new_mtu < 68) || (max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE))
>   		return -EINVAL;
>
>   	hw_dbg(&adapter->hw, "changing MTU from %d to %d\n",

This is wrong.  You are still limited by the PF so if it is a version 
1.0 mailbox on an 82599 you cannot enable jumbo frames.  Yes you can 
support it but the PF won't let you do it.

> @@ -3342,8 +3534,8 @@ static int ixgbevf_change_mtu(struct net_device *netdev, int new_mtu)
>   	/* must set new MTU before calling down or up */
>   	netdev->mtu = new_mtu;
>
> -	if (netif_running(netdev))
> -		ixgbevf_reinit_locked(adapter);
> +	/* notify the PF of our intent to use this size of frame */
> +	ixgbevf_rlpml_set_vf(hw, max_frame);
>
>   	return 0;
>   }
>

This is the reason why the change is wrong.  If the mailbox api is 
version 1.0 you cannot support jumbo frames so ixgbevf_rlmpl_set_vf will 
return an error via the mailbox indicating that the message is not 
supported.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives
  2014-11-19 18:24   ` Alexander Duyck
@ 2014-11-19 23:06     ` Tantilov, Emil S
  0 siblings, 0 replies; 20+ messages in thread
From: Tantilov, Emil S @ 2014-11-19 23:06 UTC (permalink / raw)
  To: Alexander Duyck, Kirsher, Jeffrey T, davem
  Cc: netdev, nhorman, sassmann, jogreene

>-----Original Message-----
>From: Alexander Duyck [mailto:alexander.h.duyck@redhat.com]
>Sent: Wednesday, November 19, 2014 10:25 AM
>To: Kirsher, Jeffrey T; davem@davemloft.net
>Cc: Tantilov, Emil S; netdev@vger.kernel.org;
>nhorman@redhat.com; sassmann@redhat.com; jogreene@redhat.com
>Subject: Re: [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives
>
>There were a few things in this patch that should be addressed.
>
>Comments inline below.
>
>- Alex
>
>On 11/18/2014 08:10 PM, Jeff Kirsher wrote:
>> From: Emil Tantilov <emil.s.tantilov@intel.com>
>>
>> This patch changes the basic receive path for ixgbevf so that instead of
>> receiving the data into an skb it is received into a double buffered page.
>> The main change is that the receives will be done in pages only and then
>> pull the header out of the page and copy it into the sk_buff data.
>>
>> This has the advantages of reduced cache misses and improved performance on
>> IOMMU enabled systems.
>>
>> CC: Alexander Duyck <alexander.h.duyck@redhat.com>
>> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
>> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
>> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>> ---
>>   drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |  24 +-
>>   drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 452 +++++++++++++++-------
>>   2 files changed, 331 insertions(+), 145 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>> index 72a354b..2362001 100644
>> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
>> @@ -58,8 +58,9 @@ struct ixgbevf_tx_buffer {
>>   };
>>
>>   struct ixgbevf_rx_buffer {
>> -	struct sk_buff *skb;
>>   	dma_addr_t dma;
>> +	struct page *page;
>> +	unsigned int page_offset;
>>   };
>>
>>   struct ixgbevf_stats {
>> @@ -91,9 +92,10 @@ struct ixgbevf_ring {
>>   	void *desc;			/* descriptor ring memory */
>>   	dma_addr_t dma;			/* phys. address of
>descriptor ring */
>>   	unsigned int size;		/* length in bytes */
>> -	unsigned int count;		/* amount of descriptors */
>> -	unsigned int next_to_use;
>> -	unsigned int next_to_clean;
>> +	u16 count;			/* amount of descriptors */
>> +	u16 next_to_use;
>> +	u16 next_to_clean;
>> +	u16 next_to_alloc;
>>
>>   	union {
>>   		struct ixgbevf_tx_buffer *tx_buffer_info;
>> @@ -109,12 +111,11 @@ struct ixgbevf_ring {
>>
>>   	u64 hw_csum_rx_error;
>>   	u8 __iomem *tail;
>> +	struct sk_buff *skb;
>>
>>   	u16 reg_idx; /* holds the special value that gets the hardware register
>>   		      * offset associated with this ring, which is different
>>   		      * for DCB and RSS modes */
>> -
>> -	u16 rx_buf_len;
>>   	int queue_index; /* needed for multiqueue queue management */
>>   };
>>
>> @@ -133,12 +134,10 @@ struct ixgbevf_ring {
>>
>>   /* Supported Rx Buffer Sizes */
>>   #define IXGBEVF_RXBUFFER_256   256    /* Used for packet split */
>> -#define IXGBEVF_RXBUFFER_2K    2048
>> -#define IXGBEVF_RXBUFFER_4K    4096
>> -#define IXGBEVF_RXBUFFER_8K    8192
>> -#define IXGBEVF_RXBUFFER_10K   10240
>> +#define IXGBEVF_RXBUFFER_2048	2048
>>
>>   #define IXGBEVF_RX_HDR_SIZE IXGBEVF_RXBUFFER_256
>> +#define IXGBEVF_RX_BUFSZ	IXGBEVF_RXBUFFER_2048
>>
>>   #define MAXIMUM_ETHERNET_VLAN_SIZE (VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
>>
>> @@ -430,11 +429,6 @@ enum ixbgevf_state_t {
>>   	__IXGBEVF_WORK_INIT,
>>   };
>>
>> -struct ixgbevf_cb {
>> -	struct sk_buff *prev;
>> -};
>> -#define IXGBE_CB(skb) ((struct ixgbevf_cb *)(skb)->cb)
>> -
>>   enum ixgbevf_boards {
>>   	board_82599_vf,
>>   	board_X540_vf,
>> diff --git
>a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>> index 20bebd2..2ca7c96 100644
>> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
>> @@ -422,8 +422,7 @@ static void
>ixgbevf_process_skb_fields(struct ixgbevf_ring *rx_ring,
>>    * that this is in fact a non-EOP buffer.
>>    **/
>>   static bool ixgbevf_is_non_eop(struct ixgbevf_ring *rx_ring,
>> -			       union ixgbe_adv_rx_desc *rx_desc,
>> -			       struct sk_buff *skb)
>> +			       union ixgbe_adv_rx_desc *rx_desc)
>>   {
>>   	u32 ntc = rx_ring->next_to_clean + 1;
>>
>> @@ -439,37 +438,40 @@ static bool
>ixgbevf_is_non_eop(struct ixgbevf_ring *rx_ring,
>>   	return true;
>>   }
>>
>> -static bool ixgbevf_alloc_mapped_skb(struct ixgbevf_ring *rx_ring,
>> -				     struct ixgbevf_rx_buffer *bi)
>> +static bool ixgbevf_alloc_mapped_page(struct ixgbevf_ring *rx_ring,
>> +				      struct ixgbevf_rx_buffer *bi)
>>   {
>> -	struct sk_buff *skb = bi->skb;
>> +	struct page *page = bi->page;
>>   	dma_addr_t dma = bi->dma;
>>
>> -	if (unlikely(skb))
>> +	/* since we are recycling buffers we should seldom need to alloc */
>> +	if (likely(page))
>>   		return true;
>>
>> -	skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
>> -					rx_ring->rx_buf_len);
>> -	if (unlikely(!skb)) {
>> -		rx_ring->rx_stats.alloc_rx_buff_failed++;
>> +	/* alloc new page for storage */
>> +	page = dev_alloc_page();
>> +	if (unlikely(!page)) {
>> +		rx_ring->rx_stats.alloc_rx_page_failed++;
>>   		return false;
>>   	}
>>
>> -	dma = dma_map_single(rx_ring->dev, skb->data,
>> -			     rx_ring->rx_buf_len, DMA_FROM_DEVICE);
>> +	/* map page for use */
>> +	dma = dma_map_page(rx_ring->dev, page, 0,
>> +			   PAGE_SIZE, DMA_FROM_DEVICE);
>>
>>   	/* if mapping failed free memory back to system since
>>   	 * there isn't much point in holding memory we can't use
>>   	 */
>>   	if (dma_mapping_error(rx_ring->dev, dma)) {
>> -		dev_kfree_skb_any(skb);
>> +		__free_page(page);
>>
>>   		rx_ring->rx_stats.alloc_rx_buff_failed++;
>>   		return false;
>>   	}
>>
>> -	bi->skb = skb;
>>   	bi->dma = dma;
>> +	bi->page = page;
>> +	bi->page_offset = 0;
>>
>>   	return true;
>>   }
>> @@ -495,13 +497,13 @@ static void
>ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>>   	i -= rx_ring->count;
>>
>>   	do {
>> -		if (!ixgbevf_alloc_mapped_skb(rx_ring, bi))
>> +		if (!ixgbevf_alloc_mapped_page(rx_ring, bi))
>>   			break;
>>
>>   		/* Refresh the desc even if pkt_addr didn't change
>>   		 * because each write-back erases this info.
>>   		 */
>> -		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma);
>> +		rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset);
>>
>>   		rx_desc++;
>>   		bi++;
>> @@ -524,6 +526,9 @@ static void
>ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>>   		/* record the next descriptor to use */
>>   		rx_ring->next_to_use = i;
>>
>> +		/* update next to alloc since we have filled the ring */
>> +		rx_ring->next_to_alloc = i;
>> +
>>   		/* Force memory writes to complete before letting h/w
>>   		 * know there are new descriptors to fetch.  (Only
>>   		 * applicable for weak-ordered memory model archs,
>> @@ -534,6 +539,257 @@ static void
>ixgbevf_alloc_rx_buffers(struct ixgbevf_ring *rx_ring,
>>   	}
>>   }
>>
>> +/* ixgbevf_pull_tail - ixgbevf specific version of skb_pull_tail
>> + * @rx_ring: Rx descriptor ring packet is being transacted on
>> + * @skb: pointer to current skb being adjusted 
>> + *
>> + * This function is an ixgbevf specific version of __pskb_pull_tail.  The
>> + * main difference between this version and the original function is that
>> + * this function can make several assumptions about the state of things
>> + * that allow for significant optimizations versus the standard function.
>> + * As a result we can do things like drop a frag and maintain an accurate
>> + * truesize for the skb.
>> + */
>> +static void ixgbevf_pull_tail(struct ixgbevf_ring *rx_ring,
>> +			      struct sk_buff *skb)
>> +{
>> +	struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[0];
>> +	unsigned char *va;
>> +	unsigned int pull_len;
>> +
>> +	/* it is valid to use page_address instead of kmap since we are
>> +	 * working with pages allocated out of the lomem pool per
>> +	 * alloc_page(GFP_ATOMIC)
>> +	 */
>> +	va = skb_frag_address(frag);
>> +
>> +	/* we need the header to contain the greater of either ETH_HLEN or
>> +	 * 60 bytes if the skb->len is less than 60 for skb_pad.
>> +	 */
>> +	pull_len = eth_get_headlen(va, IXGBEVF_RX_HDR_SIZE);
>> +
>> +	/* align pull length to size of long to optimize memcpy performance */
>> +	skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));
>> +
>> +	/* update all of the pointers */
>> +	skb_frag_size_sub(frag, pull_len);
>> +	frag->page_offset += pull_len;
>> +	skb->data_len -= pull_len;
>> +	skb->tail += pull_len;
>> +}
>
>I really think we should look at making this into a generic function.
>Maybe I will submit something later today to get a common function
>placed in the code.  Maybe something like eth_pull_tail.

Sounds good.

>
>> +/* ixgbevf_cleanup_headers - Correct corrupted or empty headers
>> + * @rx_ring: Rx descriptor ring packet is being transacted on
>> + * @rx_desc: pointer to the EOP Rx descriptor
>> + * @skb: pointer to current skb being fixed
>> + *
>> + * Check for corrupted packet headers caused by senders on the local L2
>> + * embedded NIC switch not setting up their Tx Descriptors right.  These
>> + * should be very rare.
>> + *
>> + * Also address the case where we are pulling data in on pages only
>> + * and as such no data is present in the skb header.
>> + *
>> + * In addition if skb is not at least 60 bytes we need to pad it so that
>> + * it is large enough to qualify as a valid Ethernet frame.
>> + *
>> + * Returns true if an error was encountered and skb was freed.
>> + */
>> +static bool ixgbevf_cleanup_headers(struct ixgbevf_ring *rx_ring,
>> +				    union ixgbe_adv_rx_desc *rx_desc,
>> +				    struct sk_buff *skb)
>> +{
>> +	/* verify that the packet does not have any known errors */
>> +	if (unlikely(ixgbevf_test_staterr(rx_desc,
>> +
>IXGBE_RXDADV_ERR_FRAME_ERR_MASK))) {
>> +		struct net_device *netdev = rx_ring->netdev;
>> +
>> +		if (!(netdev->features & NETIF_F_RXALL)) {
>> +			dev_kfree_skb_any(skb);
>> +			return true;
>> +		}
>> +	}
>> +
>> +	/* place header in linear portion of buffer */
>> +	if (skb_is_nonlinear(skb))
>> +		ixgbevf_pull_tail(rx_ring, skb);
>> +
>> +	/* if skb_pad returns an error the skb was freed */
>> +	if (unlikely(skb->len < 60)) {
>> +		int pad_len = 60 - skb->len;
>> +
>> +		if (skb_pad(skb, pad_len))
>> +			return true;
>> +		__skb_put(skb, pad_len);
>> +	}
>> +
>> +	return false;
>> +}
>
>The same goes for the padding bit here.  Maybe something like eth_skb_pad.

OK

>> +/* ixgbevf_reuse_rx_page - page flip buffer and store it back on the ring
>> + * @rx_ring: Rx descriptor ring to store buffers on
>> + * @old_buff: donor buffer to have page reused
>> + *
>> + * Synchronizes page for reuse by the adapter
>> + */
>> +static void ixgbevf_reuse_rx_page(struct ixgbevf_ring *rx_ring,
>> +				  struct ixgbevf_rx_buffer *old_buff)
>> +{
>> +	struct ixgbevf_rx_buffer *new_buff;
>> +	u16 nta = rx_ring->next_to_alloc;
>> +
>> +	new_buff = &rx_ring->rx_buffer_info[nta];
>> +
>> +	/* update, and store next to alloc */
>> +	nta++;
>> +	rx_ring->next_to_alloc = (nta < rx_ring->count) ? nta : 0;
>> +
>> +	/* transfer page from old buffer to new buffer */
>> +	new_buff->page = old_buff->page;
>> +	new_buff->dma = old_buff->dma;
>> +	new_buff->page_offset = old_buff->page_offset;
>> +
>> +	/* sync the buffer for use by the device */
>> +	dma_sync_single_range_for_device(rx_ring->dev, new_buff->dma,
>> +					 new_buff->page_offset,
>> +					 IXGBEVF_RX_BUFSZ,
>> +					 DMA_FROM_DEVICE);
>> +}
>> +
>> +/* ixgbevf_add_rx_frag - Add contents of Rx buffer to sk_buff
>> + * @rx_ring: Rx descriptor ring to transact packets on
>> + * @rx_buffer: buffer containing page to add
>> + * @rx_desc: descriptor containing length of buffer written by hardware
>> + * @skb: sk_buff to place the data into
>> + *
>> + * This function will add the data contained in rx_buffer->page to the skb.
>> + * This is done either through a direct copy if the data in the buffer is
>> + * less than the skb header size, otherwise it will just attach the page as
>> + * a frag to the skb.
>> + *
>> + * The function will then update the page offset if necessary and return
>> + * true if the buffer can be reused by the adapter.
>> + */
>> +static bool ixgbevf_add_rx_frag(struct ixgbevf_ring *rx_ring,
>> +				struct ixgbevf_rx_buffer *rx_buffer,
>> +				union ixgbe_adv_rx_desc *rx_desc,
>> +				struct sk_buff *skb)
>> +{
>> +	struct page *page = rx_buffer->page;
>> +	unsigned int size = le16_to_cpu(rx_desc-
>>wb.upper.length);
>> +#if (PAGE_SIZE < 8192)
>> +	unsigned int truesize = IXGBEVF_RX_BUFSZ;
>> +#else
>> +	unsigned int truesize = ALIGN(size, L1_CACHE_BYTES);
>> +#endif
>> +
>> +	if ((size <= IXGBEVF_RX_HDR_SIZE) && !skb_is_nonlinear(skb)) {
>> +		unsigned char *va = page_address(page) + rx_buffer->page_offset;
>> +
>> +		memcpy(__skb_put(skb, size), va, ALIGN(size, sizeof(long)));
>> +
>> +		/* we can reuse buffer as-is, just make sure it is local */
>> +		if (likely(page_to_nid(page) == numa_node_id()))
>> +			return true;
>> +
>> +		/* this page cannot be reused so discard it */
>> +		put_page(page);
>> +		return false;
>> +	}
>> +
>> +	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
>> +			rx_buffer->page_offset, size, truesize);
>> +
>> +	/* avoid re-using remote pages */
>> +	if (unlikely(page_to_nid(page) != numa_node_id()))
>> +		return false;
>> +
>
>This is missing the pfmemalloc fix that is already in igb,
>and that I submitted for ixgbe and fm10k.

Just because of timing. I can redo the patch with the pfmemalloc check added.

>
>> +#if (PAGE_SIZE < 8192)
>> +	/* if we are only owner of page we can reuse it */
>> +	if (unlikely(page_count(page) != 1))
>> +		return false;
>> +
>> +	/* flip page offset to other buffer */
>> +	rx_buffer->page_offset ^= IXGBEVF_RX_BUFSZ;
>> +
>> +	/* since we are the only owner of the page and we need to
>> +	 * increment it.
>> +	 */
>> +	atomic_inc(&page->_count);
>> +#else
>> +	/* move offset up to the next cache line */
>> +	rx_buffer->page_offset += truesize;
>> +
>> +	if (rx_buffer->page_offset > (PAGE_SIZE - IXGBEVF_RX_BUFSZ))
>> +		return false;
>> +
>> +	/* bump ref count on page before it is given to the stack */
>> +	get_page(page);
>> +#endif
>> +
>> +	return true;
>> +}
>
>The get_page and atomic_inc calls can be pulled out and
>placed after if #if/else logic.  The preference is to use atomic_inc since
>you are using an order 0 page and don't need to check for PageTail().

I can do that,

>> +static struct sk_buff *ixgbevf_fetch_rx_buffer(struct ixgbevf_ring *rx_ring,
>> +					       union ixgbe_adv_rx_desc *rx_desc,
>> +					       struct sk_buff *skb)
>> +{
>> +	struct ixgbevf_rx_buffer *rx_buffer;
>> +	struct page *page;
>> +
>> +	rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
>> +	page = rx_buffer->page;
>> +	prefetchw(page);
>> +
>> +	if (likely(!skb)) {
>> +		void *page_addr = page_address(page) +
>> +				  rx_buffer->page_offset;
>> +
>> +		/* prefetch first cache line of first page */
>> +		prefetch(page_addr);
>> +#if L1_CACHE_BYTES < 128
>> +		prefetch(page_addr + L1_CACHE_BYTES);
>> +#endif
>> +
>> +		/* allocate a skb to store the frags */
>> +		skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
>> +						IXGBEVF_RX_HDR_SIZE);
>> +		if (unlikely(!skb)) {
>> +			rx_ring->rx_stats.alloc_rx_buff_failed++;
>> +			return NULL;
>> +		}
>> +
>> +		/* we will be copying header into skb->data in
>> +		 * pskb_may_pull so it is in our interest to prefetch
>> +		 * it now to avoid a possible cache miss
>> +		 */
>> +		prefetchw(skb->data);
>> +	}
>> +
>> +	/* we are reusing so sync this buffer for CPU use */
>> +	dma_sync_single_range_for_cpu(rx_ring->dev,
>> +				      rx_buffer->dma,
>> +				      rx_buffer->page_offset,
>> +				      IXGBEVF_RX_BUFSZ,
>> +				      DMA_FROM_DEVICE);
>> +
>> +	/* pull page into skb */
>> +	if (ixgbevf_add_rx_frag(rx_ring, rx_buffer, rx_desc, skb)) {
>> +		/* hand second half of page back to the ring */
>> +		ixgbevf_reuse_rx_page(rx_ring, rx_buffer);
>> +	} else {
>> +		/* we are not reusing the buffer so unmap it */
>> +		dma_unmap_page(rx_ring->dev, rx_buffer->dma,
>> +			       PAGE_SIZE, DMA_FROM_DEVICE);
>> +	}
>> +
>> +	/* clear contents of buffer_info */
>> +	rx_buffer->dma = 0;
>> +	rx_buffer->page = NULL;
>> +
>> +	return skb;
>> +}
>> +
>>   static inline void ixgbevf_irq_enable_queues(struct ixgbevf_adapter *adapter,
>>   					     u32 qmask)
>>   {
>
><snip>
>
>> @@ -3320,21 +3522,11 @@ static int ixgbevf_set_mac(struct net_device *netdev, void *p)
>>   static int ixgbevf_change_mtu(struct net_device *netdev, int new_mtu)
>>   {
>>   	struct ixgbevf_adapter *adapter = netdev_priv(netdev);
>> +	struct ixgbe_hw *hw = &adapter->hw;
>>   	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
>> -	int max_possible_frame = MAXIMUM_ETHERNET_VLAN_SIZE;
>> -
>> -	switch (adapter->hw.api_version) {
>> -	case ixgbe_mbox_api_11:
>> -		max_possible_frame = IXGBE_MAX_JUMBO_FRAME_SIZE;
>> -		break;
>> -	default:
>> -		if (adapter->hw.mac.type == ixgbe_mac_X540_vf)
>> -			max_possible_frame = IXGBE_MAX_JUMBO_FRAME_SIZE;
>> -		break;
>> -	}
>>
>>   	/* MTU < 68 is an error and causes problems on some kernels */
>> -	if ((new_mtu < 68) || (max_frame > max_possible_frame))
>> +	if ((new_mtu < 68) || (max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE))
>>   		return -EINVAL;
>>
>>   	hw_dbg(&adapter->hw, "changing MTU from %d to %d\n",
>
>This is wrong. You are still limited by the PF so if it is a version
>1.0 mailbox on an 82599 you cannot enable jumbo frames.  Yes you can
>support it but the PF won't let you do it.
>
>> @@ -3342,8 +3534,8 @@ static int ixgbevf_change_mtu(struct
>net_device *netdev, int new_mtu)
>>   	/* must set new MTU before calling down or up */
>>   	netdev->mtu = new_mtu;
>>
>> -	if (netif_running(netdev))
>> -		ixgbevf_reinit_locked(adapter);
>> +	/* notify the PF of our intent to use this size of frame */
>> +	ixgbevf_rlpml_set_vf(hw, max_frame);
>>
>>   	return 0;
>>   }
>>
>
>This is the reason why the change is wrong.  If the mailbox api is
>version 1.0 you cannot support jumbo frames so ixgbevf_rlmpl_set_vf will
>return an error via the mailbox indicating that the message >is not
>supported.

I'll have to see how this change was introduced, but I can drop it for now since it's not directly related to this change anyway.

Thanks,
Emil

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-11-19 23:06 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-19  4:10 [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2014-11-18 Jeff Kirsher
2014-11-19  4:10 ` [net-next 01/15] ixgbevf: Update ixgbevf_alloc_rx_buffers to handle clearing of status bits Jeff Kirsher
2014-11-19  4:10 ` [net-next 02/15] ixgbevf: Test Rx status bits directly out of the descriptor Jeff Kirsher
2014-11-19  4:10 ` [net-next 03/15] ixgbevf: Combine the logic for post Rx processing into single function Jeff Kirsher
2014-11-19  4:10 ` [net-next 04/15] ixgbevf: Cleanup variable usage, improve stack performance Jeff Kirsher
2014-11-19  4:10 ` [net-next 05/15] ixgbevf: reorder main loop in ixgbe_clean_rx_irq to allow for do/while/continue Jeff Kirsher
2014-11-19  4:10 ` [net-next 06/15] ixgbevf: Update Rx next to clean in real time Jeff Kirsher
2014-11-19  4:10 ` [net-next 07/15] ixgbevf: Change receive model to use double buffered page based receives Jeff Kirsher
2014-11-19 18:24   ` Alexander Duyck
2014-11-19 23:06     ` Tantilov, Emil S
2014-11-19  4:10 ` [net-next 08/15] ixgbevf: compare total_rx_packets and budget in ixgbevf_clean_rx_irq Jeff Kirsher
2014-11-19  4:10 ` [net-next 09/15] ixgbevf: add netpoll support Jeff Kirsher
2014-11-19  4:10 ` [net-next 10/15] i40e: don't overload fields Jeff Kirsher
2014-11-19  4:10 ` [net-next 11/15] i40evf: update header comments Jeff Kirsher
2014-11-19  4:10 ` [net-next 12/15] i40evf: make checkpatch happy Jeff Kirsher
2014-11-19  4:59   ` Joe Perches
2014-11-19  9:55     ` Jeff Kirsher
2014-11-19  4:10 ` [net-next 13/15] i40evf: make comparisons consistent Jeff Kirsher
2014-11-19  4:10 ` [net-next 14/15] i40evf: remove unnecessary else Jeff Kirsher
2014-11-19  4:10 ` [net-next 15/15] i40e: trigger SW INT with no ITR wait Jeff Kirsher

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).