All of lore.kernel.org
 help / color / mirror / Atom feed
* e1000: convert to build_skb/napi_gro_frags api
@ 2014-08-30 18:28 Florian Westphal
  2014-08-30 18:28 ` [PATCH 1/7] e1000: move e1000_tbi_adjust_stats to where its used Florian Westphal
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev

e1000 driver preallocates skbs, then sends them up the stack.

This series changes the rx routine to only preallocate data buffers,
then initialize skb right before passing it up the stack.

This gives slight performance inprovements as the skb will be
fresh in the cache once its allocated/initialized.

The default-mtu 1500 path is converted to build_skb()/netdev_alloc_frag.

Jumbo path is converted to napi_gro_frags.

Note that this has only been tested with kvms e1000 emulation.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/7] e1000: move e1000_tbi_adjust_stats to where its used
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
@ 2014-08-30 18:28 ` Florian Westphal
  2014-08-30 18:28 ` [PATCH 2/7] e1000: move tbi workaround code into helper function Florian Westphal
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev, Florian Westphal

... and make it static.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 drivers/net/ethernet/intel/e1000/e1000_hw.c   | 78 ---------------------------
 drivers/net/ethernet/intel/e1000/e1000_hw.h   |  2 -
 drivers/net/ethernet/intel/e1000/e1000_main.c | 77 ++++++++++++++++++++++++++
 3 files changed, 77 insertions(+), 80 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_hw.c b/drivers/net/ethernet/intel/e1000/e1000_hw.c
index 1acf503..45c8c864 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_hw.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_hw.c
@@ -4837,84 +4837,6 @@ void e1000_update_adaptive(struct e1000_hw *hw)
 }
 
 /**
- * e1000_tbi_adjust_stats
- * @hw: Struct containing variables accessed by shared code
- * @frame_len: The length of the frame in question
- * @mac_addr: The Ethernet destination address of the frame in question
- *
- * Adjusts the statistic counters when a frame is accepted by TBI_ACCEPT
- */
-void e1000_tbi_adjust_stats(struct e1000_hw *hw, struct e1000_hw_stats *stats,
-			    u32 frame_len, u8 *mac_addr)
-{
-	u64 carry_bit;
-
-	/* First adjust the frame length. */
-	frame_len--;
-	/* We need to adjust the statistics counters, since the hardware
-	 * counters overcount this packet as a CRC error and undercount
-	 * the packet as a good packet
-	 */
-	/* This packet should not be counted as a CRC error. */
-	stats->crcerrs--;
-	/* This packet does count as a Good Packet Received. */
-	stats->gprc++;
-
-	/* Adjust the Good Octets received counters */
-	carry_bit = 0x80000000 & stats->gorcl;
-	stats->gorcl += frame_len;
-	/* If the high bit of Gorcl (the low 32 bits of the Good Octets
-	 * Received Count) was one before the addition,
-	 * AND it is zero after, then we lost the carry out,
-	 * need to add one to Gorch (Good Octets Received Count High).
-	 * This could be simplified if all environments supported
-	 * 64-bit integers.
-	 */
-	if (carry_bit && ((stats->gorcl & 0x80000000) == 0))
-		stats->gorch++;
-	/* Is this a broadcast or multicast?  Check broadcast first,
-	 * since the test for a multicast frame will test positive on
-	 * a broadcast frame.
-	 */
-	if (is_broadcast_ether_addr(mac_addr))
-		/* Broadcast packet */
-		stats->bprc++;
-	else if (is_multicast_ether_addr(mac_addr))
-		/* Multicast packet */
-		stats->mprc++;
-
-	if (frame_len == hw->max_frame_size) {
-		/* In this case, the hardware has overcounted the number of
-		 * oversize frames.
-		 */
-		if (stats->roc > 0)
-			stats->roc--;
-	}
-
-	/* Adjust the bin counters when the extra byte put the frame in the
-	 * wrong bin. Remember that the frame_len was adjusted above.
-	 */
-	if (frame_len == 64) {
-		stats->prc64++;
-		stats->prc127--;
-	} else if (frame_len == 127) {
-		stats->prc127++;
-		stats->prc255--;
-	} else if (frame_len == 255) {
-		stats->prc255++;
-		stats->prc511--;
-	} else if (frame_len == 511) {
-		stats->prc511++;
-		stats->prc1023--;
-	} else if (frame_len == 1023) {
-		stats->prc1023++;
-		stats->prc1522--;
-	} else if (frame_len == 1522) {
-		stats->prc1522++;
-	}
-}
-
-/**
  * e1000_get_bus_info
  * @hw: Struct containing variables accessed by shared code
  *
diff --git a/drivers/net/ethernet/intel/e1000/e1000_hw.h b/drivers/net/ethernet/intel/e1000/e1000_hw.h
index 11578c8..5cf7268 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_hw.h
+++ b/drivers/net/ethernet/intel/e1000/e1000_hw.h
@@ -393,8 +393,6 @@ s32 e1000_blink_led_start(struct e1000_hw *hw);
 /* Everything else */
 void e1000_reset_adaptive(struct e1000_hw *hw);
 void e1000_update_adaptive(struct e1000_hw *hw);
-void e1000_tbi_adjust_stats(struct e1000_hw *hw, struct e1000_hw_stats *stats,
-			    u32 frame_len, u8 * mac_addr);
 void e1000_get_bus_info(struct e1000_hw *hw);
 void e1000_pci_set_mwi(struct e1000_hw *hw);
 void e1000_pci_clear_mwi(struct e1000_hw *hw);
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index cbc330b..fe56fac 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -3978,6 +3978,83 @@ static void e1000_receive_skb(struct e1000_adapter *adapter, u8 status,
 }
 
 /**
+ * e1000_tbi_adjust_stats
+ * @hw: Struct containing variables accessed by shared code
+ * @frame_len: The length of the frame in question
+ * @mac_addr: The Ethernet destination address of the frame in question
+ *
+ * Adjusts the statistic counters when a frame is accepted by TBI_ACCEPT
+ */
+static void e1000_tbi_adjust_stats(struct e1000_hw *hw,
+				   struct e1000_hw_stats *stats,
+				   u32 frame_len, const u8 *mac_addr)
+{
+	u64 carry_bit;
+
+	/* First adjust the frame length. */
+	frame_len--;
+	/* We need to adjust the statistics counters, since the hardware
+	 * counters overcount this packet as a CRC error and undercount
+	 * the packet as a good packet
+	 */
+	/* This packet should not be counted as a CRC error. */
+	stats->crcerrs--;
+	/* This packet does count as a Good Packet Received. */
+	stats->gprc++;
+
+	/* Adjust the Good Octets received counters */
+	carry_bit = 0x80000000 & stats->gorcl;
+	stats->gorcl += frame_len;
+	/* If the high bit of Gorcl (the low 32 bits of the Good Octets
+	 * Received Count) was one before the addition,
+	 * AND it is zero after, then we lost the carry out,
+	 * need to add one to Gorch (Good Octets Received Count High).
+	 * This could be simplified if all environments supported
+	 * 64-bit integers.
+	 */
+	if (carry_bit && ((stats->gorcl & 0x80000000) == 0))
+		stats->gorch++;
+	/* Is this a broadcast or multicast?  Check broadcast first,
+	 * since the test for a multicast frame will test positive on
+	 * a broadcast frame.
+	 */
+	if (is_broadcast_ether_addr(mac_addr))
+		stats->bprc++;
+	else if (is_multicast_ether_addr(mac_addr))
+		stats->mprc++;
+
+	if (frame_len == hw->max_frame_size) {
+		/* In this case, the hardware has overcounted the number of
+		 * oversize frames.
+		 */
+		if (stats->roc > 0)
+			stats->roc--;
+	}
+
+	/* Adjust the bin counters when the extra byte put the frame in the
+	 * wrong bin. Remember that the frame_len was adjusted above.
+	 */
+	if (frame_len == 64) {
+		stats->prc64++;
+		stats->prc127--;
+	} else if (frame_len == 127) {
+		stats->prc127++;
+		stats->prc255--;
+	} else if (frame_len == 255) {
+		stats->prc255++;
+		stats->prc511--;
+	} else if (frame_len == 511) {
+		stats->prc511++;
+		stats->prc1023--;
+	} else if (frame_len == 1023) {
+		stats->prc1023++;
+		stats->prc1522--;
+	} else if (frame_len == 1522) {
+		stats->prc1522++;
+	}
+}
+
+/**
  * e1000_clean_jumbo_rx_irq - Send received data up the network stack; legacy
  * @adapter: board private structure
  * @rx_ring: ring to clean
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/7] e1000: move tbi workaround code into helper function
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
  2014-08-30 18:28 ` [PATCH 1/7] e1000: move e1000_tbi_adjust_stats to where its used Florian Westphal
@ 2014-08-30 18:28 ` Florian Westphal
  2014-08-30 18:28 ` [PATCH 3/7] e1000: perform copybreak ahead of dma unmap Florian Westphal
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev, Florian Westphal

Its the same in both handlers.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c | 63 ++++++++++++++-------------
 1 file changed, 33 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index fe56fac..f79ba40 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -4054,6 +4054,26 @@ static void e1000_tbi_adjust_stats(struct e1000_hw *hw,
 	}
 }
 
+static bool e1000_tbi_should_accept(struct e1000_adapter *adapter,
+				    u8 status, u8 errors,
+				    u32 length, const u8 *data)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	u8 last_byte = *(data + length - 1);
+
+	if (TBI_ACCEPT(hw, status, errors, length, last_byte)) {
+		unsigned long irq_flags;
+
+		spin_lock_irqsave(&adapter->stats_lock, irq_flags);
+		e1000_tbi_adjust_stats(hw, &adapter->stats, length, data);
+		spin_unlock_irqrestore(&adapter->stats_lock, irq_flags);
+
+		return true;
+	}
+
+	return false;
+}
+
 /**
  * e1000_clean_jumbo_rx_irq - Send received data up the network stack; legacy
  * @adapter: board private structure
@@ -4068,12 +4088,10 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 				     struct e1000_rx_ring *rx_ring,
 				     int *work_done, int work_to_do)
 {
-	struct e1000_hw *hw = &adapter->hw;
 	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc, *next_rxd;
 	struct e1000_buffer *buffer_info, *next_buffer;
-	unsigned long irq_flags;
 	u32 length;
 	unsigned int i;
 	int cleaned_count = 0;
@@ -4114,23 +4132,15 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 		/* errors is only valid for DD + EOP descriptors */
 		if (unlikely((status & E1000_RXD_STAT_EOP) &&
 		    (rx_desc->errors & E1000_RXD_ERR_FRAME_ERR_MASK))) {
-			u8 *mapped;
-			u8 last_byte;
-
-			mapped = page_address(buffer_info->page);
-			last_byte = *(mapped + length - 1);
-			if (TBI_ACCEPT(hw, status, rx_desc->errors, length,
-				       last_byte)) {
-				spin_lock_irqsave(&adapter->stats_lock,
-						  irq_flags);
-				e1000_tbi_adjust_stats(hw, &adapter->stats,
-						       length, mapped);
-				spin_unlock_irqrestore(&adapter->stats_lock,
-						       irq_flags);
+			u8 *mapped = page_address(buffer_info->page);
+
+			if (e1000_tbi_should_accept(adapter, status,
+						    rx_desc->errors,
+						    length, mapped)) {
 				length--;
+			} else if (netdev->features & NETIF_F_RXALL) {
+				goto process_skb;
 			} else {
-				if (netdev->features & NETIF_F_RXALL)
-					goto process_skb;
 				/* recycle both page and skb */
 				buffer_info->skb = skb;
 				/* an error means any chain goes out the window
@@ -4281,12 +4291,10 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 			       struct e1000_rx_ring *rx_ring,
 			       int *work_done, int work_to_do)
 {
-	struct e1000_hw *hw = &adapter->hw;
 	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc, *next_rxd;
 	struct e1000_buffer *buffer_info, *next_buffer;
-	unsigned long flags;
 	u32 length;
 	unsigned int i;
 	int cleaned_count = 0;
@@ -4336,7 +4344,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 
 		if (adapter->discarding) {
 			/* All receives must fit into a single buffer */
-			e_dbg("Receive packet consumed multiple buffers\n");
+			netdev_dbg(netdev, "Receive packet consumed multiple buffers\n");
 			/* recycle */
 			buffer_info->skb = skb;
 			if (status & E1000_RXD_STAT_EOP)
@@ -4345,18 +4353,13 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 		}
 
 		if (unlikely(rx_desc->errors & E1000_RXD_ERR_FRAME_ERR_MASK)) {
-			u8 last_byte = *(skb->data + length - 1);
-			if (TBI_ACCEPT(hw, status, rx_desc->errors, length,
-				       last_byte)) {
-				spin_lock_irqsave(&adapter->stats_lock, flags);
-				e1000_tbi_adjust_stats(hw, &adapter->stats,
-						       length, skb->data);
-				spin_unlock_irqrestore(&adapter->stats_lock,
-						       flags);
+			if (e1000_tbi_should_accept(adapter, status,
+						    rx_desc->errors,
+						    length, skb->data)) {
 				length--;
+			} else if (netdev->features & NETIF_F_RXALL) {
+				goto process_skb;
 			} else {
-				if (netdev->features & NETIF_F_RXALL)
-					goto process_skb;
 				/* recycle */
 				buffer_info->skb = skb;
 				goto next_desc;
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/7] e1000: perform copybreak ahead of dma unmap
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
  2014-08-30 18:28 ` [PATCH 1/7] e1000: move e1000_tbi_adjust_stats to where its used Florian Westphal
  2014-08-30 18:28 ` [PATCH 2/7] e1000: move tbi workaround code into helper function Florian Westphal
@ 2014-08-30 18:28 ` Florian Westphal
  2014-08-30 18:28 ` [PATCH 4/7] e1000: add and use e1000_rx_buffer info for rx Florian Westphal
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev, Florian Westphal

Currently we unmap the dma range, then copy to new skb.
Change this so we can keep the mapping in case the data is copied.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c | 73 ++++++++++++++++-----------
 1 file changed, 43 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index f79ba40..0cb05a5 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -4074,6 +4074,16 @@ static bool e1000_tbi_should_accept(struct e1000_adapter *adapter,
 	return false;
 }
 
+static struct sk_buff *e1000_alloc_rx_skb(struct e1000_adapter *adapter,
+					  unsigned int bufsz)
+{
+	struct sk_buff *skb = netdev_alloc_skb_ip_align(adapter->netdev, bufsz);
+
+	if (unlikely(!skb))
+		adapter->alloc_rx_buff_failed++;
+	return skb;
+}
+
 /**
  * e1000_clean_jumbo_rx_irq - Send received data up the network stack; legacy
  * @adapter: board private structure
@@ -4259,25 +4269,25 @@ next_desc:
 /* this should improve performance for small packets with large amounts
  * of reassembly being done in the stack
  */
-static void e1000_check_copybreak(struct net_device *netdev,
-				 struct e1000_buffer *buffer_info,
-				 u32 length, struct sk_buff **skb)
+static struct sk_buff *e1000_copybreak(struct e1000_adapter *adapter,
+				       struct e1000_buffer *buffer_info,
+				       u32 length, const void *data)
 {
-	struct sk_buff *new_skb;
+	struct sk_buff *skb;
 
 	if (length > copybreak)
-		return;
+		return NULL;
 
-	new_skb = netdev_alloc_skb_ip_align(netdev, length);
-	if (!new_skb)
-		return;
+	skb = e1000_alloc_rx_skb(adapter, length);
+	if (!skb)
+		return NULL;
+
+	dma_sync_single_for_cpu(&adapter->pdev->dev, buffer_info->dma,
+				length, DMA_FROM_DEVICE);
+
+	memcpy(skb_put(skb, length), data, length);
 
-	skb_copy_to_linear_data_offset(new_skb, -NET_IP_ALIGN,
-				       (*skb)->data - NET_IP_ALIGN,
-				       length + NET_IP_ALIGN);
-	/* save the skb in buffer_info as good */
-	buffer_info->skb = *skb;
-	*skb = new_skb;
+	return skb;
 }
 
 /**
@@ -4315,10 +4325,18 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
-		skb = buffer_info->skb;
-		buffer_info->skb = NULL;
+		length = le16_to_cpu(rx_desc->length);
 
-		prefetch(skb->data - NET_IP_ALIGN);
+		prefetch(buffer_info->skb->data - NET_IP_ALIGN);
+		skb = e1000_copybreak(adapter, buffer_info, length,
+				      buffer_info->skb->data);
+		if (!skb) {
+			skb = buffer_info->skb;
+			buffer_info->skb = NULL;
+			dma_unmap_single(&pdev->dev, buffer_info->dma,
+					 buffer_info->length, DMA_FROM_DEVICE);
+			buffer_info->dma = 0;
+		}
 
 		if (++i == rx_ring->count) i = 0;
 		next_rxd = E1000_RX_DESC(*rx_ring, i);
@@ -4328,11 +4346,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 
 		cleaned = true;
 		cleaned_count++;
-		dma_unmap_single(&pdev->dev, buffer_info->dma,
-				 buffer_info->length, DMA_FROM_DEVICE);
-		buffer_info->dma = 0;
 
-		length = le16_to_cpu(rx_desc->length);
 		/* !EOP means multiple descriptors were used to store a single
 		 * packet, if thats the case we need to toss it.  In fact, we
 		 * to toss every packet with the EOP bit clear and the next
@@ -4345,8 +4359,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 		if (adapter->discarding) {
 			/* All receives must fit into a single buffer */
 			netdev_dbg(netdev, "Receive packet consumed multiple buffers\n");
-			/* recycle */
-			buffer_info->skb = skb;
+			dev_kfree_skb(skb);
 			if (status & E1000_RXD_STAT_EOP)
 				adapter->discarding = false;
 			goto next_desc;
@@ -4360,8 +4373,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 			} else if (netdev->features & NETIF_F_RXALL) {
 				goto process_skb;
 			} else {
-				/* recycle */
-				buffer_info->skb = skb;
+				dev_kfree_skb(skb);
 				goto next_desc;
 			}
 		}
@@ -4376,9 +4388,10 @@ process_skb:
 			 */
 			length -= 4;
 
-		e1000_check_copybreak(netdev, buffer_info, length, &skb);
-
-		skb_put(skb, length);
+		if (buffer_info->skb == NULL)
+			skb_put(skb, length);
+		else /* copybreak skb */
+			skb_trim(skb, length);
 
 		/* Receive Checksum Offload */
 		e1000_rx_checksum(adapter,
@@ -4524,7 +4537,7 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 		skb = buffer_info->skb;
 		if (skb) {
 			skb_trim(skb, 0);
-			goto map_skb;
+			goto skip;
 		}
 
 		skb = netdev_alloc_skb_ip_align(netdev, bufsz);
@@ -4561,7 +4574,6 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 		}
 		buffer_info->skb = skb;
 		buffer_info->length = adapter->rx_buffer_len;
-map_skb:
 		buffer_info->dma = dma_map_single(&pdev->dev,
 						  skb->data,
 						  buffer_info->length,
@@ -4599,6 +4611,7 @@ map_skb:
 		rx_desc = E1000_RX_DESC(*rx_ring, i);
 		rx_desc->buffer_addr = cpu_to_le64(buffer_info->dma);
 
+skip:
 		if (unlikely(++i == rx_ring->count))
 			i = 0;
 		buffer_info = &rx_ring->buffer_info[i];
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/7] e1000: add and use e1000_rx_buffer info for rx
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
                   ` (2 preceding siblings ...)
  2014-08-30 18:28 ` [PATCH 3/7] e1000: perform copybreak ahead of dma unmap Florian Westphal
@ 2014-08-30 18:28 ` Florian Westphal
  2014-08-30 18:28 ` [PATCH 5/7] e1000: rename struct e1000_buffer to e1000_tx_buffer Florian Westphal
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev, Florian Westphal

e1000 uses the same metadata struct for rx and tx.  But tx and rx have
different requirements.

For rx, we only need to store a buffer and a DMA address.

Followup patch will remove skb for rx, bringing rx_buffer_info down
to 16 bytes on x86_64.

[ buffer_info is 48 bytes ]

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 drivers/net/ethernet/intel/e1000/e1000.h         |  8 +++++-
 drivers/net/ethernet/intel/e1000/e1000_ethtool.c |  7 ++---
 drivers/net/ethernet/intel/e1000/e1000_main.c    | 35 ++++++++++++------------
 3 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h b/drivers/net/ethernet/intel/e1000/e1000.h
index 10a0f22..81efe33 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -160,6 +160,12 @@ struct e1000_buffer {
 	u16 mapped_as_page;
 };
 
+struct e1000_rx_buffer {
+	struct sk_buff *skb;
+	dma_addr_t dma;
+	struct page *page;
+};
+
 struct e1000_tx_ring {
 	/* pointer to the descriptor ring memory */
 	void *desc;
@@ -195,7 +201,7 @@ struct e1000_rx_ring {
 	/* next descriptor to check for DD status bit */
 	unsigned int next_to_clean;
 	/* array of buffer information structs */
-	struct e1000_buffer *buffer_info;
+	struct e1000_rx_buffer *buffer_info;
 	struct sk_buff *rx_skb_top;
 
 	/* cpu for rx queue */
diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
index cca5bca..fbf7a4d 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
@@ -972,7 +972,7 @@ static void e1000_free_desc_rings(struct e1000_adapter *adapter)
 			if (rxdr->buffer_info[i].dma)
 				dma_unmap_single(&pdev->dev,
 						 rxdr->buffer_info[i].dma,
-						 rxdr->buffer_info[i].length,
+						 E1000_RXBUFFER_2048,
 						 DMA_FROM_DEVICE);
 			if (rxdr->buffer_info[i].skb)
 				dev_kfree_skb(rxdr->buffer_info[i].skb);
@@ -1069,7 +1069,7 @@ static int e1000_setup_desc_rings(struct e1000_adapter *adapter)
 	if (!rxdr->count)
 		rxdr->count = E1000_DEFAULT_RXD;
 
-	rxdr->buffer_info = kcalloc(rxdr->count, sizeof(struct e1000_buffer),
+	rxdr->buffer_info = kcalloc(rxdr->count, sizeof(struct e1000_rx_buffer),
 				    GFP_KERNEL);
 	if (!rxdr->buffer_info) {
 		ret_val = 5;
@@ -1108,7 +1108,6 @@ static int e1000_setup_desc_rings(struct e1000_adapter *adapter)
 		}
 		skb_reserve(skb, NET_IP_ALIGN);
 		rxdr->buffer_info[i].skb = skb;
-		rxdr->buffer_info[i].length = E1000_RXBUFFER_2048;
 		rxdr->buffer_info[i].dma =
 			dma_map_single(&pdev->dev, skb->data,
 				       E1000_RXBUFFER_2048, DMA_FROM_DEVICE);
@@ -1444,7 +1443,7 @@ static int e1000_run_loopback_test(struct e1000_adapter *adapter)
 		do { /* receive the sent packets */
 			dma_sync_single_for_cpu(&pdev->dev,
 						rxdr->buffer_info[l].dma,
-						rxdr->buffer_info[l].length,
+						E1000_RXBUFFER_2048,
 						DMA_FROM_DEVICE);
 
 			ret_val = e1000_check_lbtest_frame(
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 0cb05a5..5ff48af 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -1687,7 +1687,7 @@ static int e1000_setup_rx_resources(struct e1000_adapter *adapter,
 	struct pci_dev *pdev = adapter->pdev;
 	int size, desc_len;
 
-	size = sizeof(struct e1000_buffer) * rxdr->count;
+	size = sizeof(struct e1000_rx_buffer) * rxdr->count;
 	rxdr->buffer_info = vzalloc(size);
 	if (!rxdr->buffer_info)
 		return -ENOMEM;
@@ -2062,7 +2062,7 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter,
 				struct e1000_rx_ring *rx_ring)
 {
 	struct e1000_hw *hw = &adapter->hw;
-	struct e1000_buffer *buffer_info;
+	struct e1000_rx_buffer *buffer_info;
 	struct pci_dev *pdev = adapter->pdev;
 	unsigned long size;
 	unsigned int i;
@@ -2073,12 +2073,12 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter,
 		if (buffer_info->dma &&
 		    adapter->clean_rx == e1000_clean_rx_irq) {
 			dma_unmap_single(&pdev->dev, buffer_info->dma,
-			                 buffer_info->length,
+					 adapter->rx_buffer_len,
 					 DMA_FROM_DEVICE);
 		} else if (buffer_info->dma &&
 		           adapter->clean_rx == e1000_clean_jumbo_rx_irq) {
 			dma_unmap_page(&pdev->dev, buffer_info->dma,
-				       buffer_info->length,
+				       adapter->rx_buffer_len,
 				       DMA_FROM_DEVICE);
 		}
 
@@ -2099,7 +2099,7 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter,
 		rx_ring->rx_skb_top = NULL;
 	}
 
-	size = sizeof(struct e1000_buffer) * rx_ring->count;
+	size = sizeof(struct e1000_rx_buffer) * rx_ring->count;
 	memset(rx_ring->buffer_info, 0, size);
 
 	/* Zero out the descriptor ring */
@@ -3412,7 +3412,7 @@ rx_ring_summary:
 
 	for (i = 0; rx_ring->desc && (i < rx_ring->count); i++) {
 		struct e1000_rx_desc *rx_desc = E1000_RX_DESC(*rx_ring, i);
-		struct e1000_buffer *buffer_info = &rx_ring->buffer_info[i];
+		struct e1000_rx_buffer *buffer_info = &rx_ring->buffer_info[i];
 		struct my_u { __le64 a; __le64 b; };
 		struct my_u *u = (struct my_u *)rx_desc;
 		const char *type;
@@ -3948,7 +3948,7 @@ static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
 /**
  * e1000_consume_page - helper function
  **/
-static void e1000_consume_page(struct e1000_buffer *bi, struct sk_buff *skb,
+static void e1000_consume_page(struct e1000_rx_buffer *bi, struct sk_buff *skb,
 			       u16 length)
 {
 	bi->page = NULL;
@@ -4101,7 +4101,7 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc, *next_rxd;
-	struct e1000_buffer *buffer_info, *next_buffer;
+	struct e1000_rx_buffer *buffer_info, *next_buffer;
 	u32 length;
 	unsigned int i;
 	int cleaned_count = 0;
@@ -4134,7 +4134,7 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 		cleaned = true;
 		cleaned_count++;
 		dma_unmap_page(&pdev->dev, buffer_info->dma,
-			       buffer_info->length, DMA_FROM_DEVICE);
+			       adapter->rx_buffer_len, DMA_FROM_DEVICE);
 		buffer_info->dma = 0;
 
 		length = le16_to_cpu(rx_desc->length);
@@ -4270,7 +4270,7 @@ next_desc:
  * of reassembly being done in the stack
  */
 static struct sk_buff *e1000_copybreak(struct e1000_adapter *adapter,
-				       struct e1000_buffer *buffer_info,
+				       struct e1000_rx_buffer *buffer_info,
 				       u32 length, const void *data)
 {
 	struct sk_buff *skb;
@@ -4304,7 +4304,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc, *next_rxd;
-	struct e1000_buffer *buffer_info, *next_buffer;
+	struct e1000_rx_buffer *buffer_info, *next_buffer;
 	u32 length;
 	unsigned int i;
 	int cleaned_count = 0;
@@ -4334,7 +4334,8 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 			skb = buffer_info->skb;
 			buffer_info->skb = NULL;
 			dma_unmap_single(&pdev->dev, buffer_info->dma,
-					 buffer_info->length, DMA_FROM_DEVICE);
+					 adapter->rx_buffer_len,
+					 DMA_FROM_DEVICE);
 			buffer_info->dma = 0;
 		}
 
@@ -4440,7 +4441,7 @@ e1000_alloc_jumbo_rx_buffers(struct e1000_adapter *adapter,
 	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc;
-	struct e1000_buffer *buffer_info;
+	struct e1000_rx_buffer *buffer_info;
 	struct sk_buff *skb;
 	unsigned int i;
 	unsigned int bufsz = 256 - 16 /*for skb_reserve */ ;
@@ -4463,7 +4464,6 @@ e1000_alloc_jumbo_rx_buffers(struct e1000_adapter *adapter,
 		}
 
 		buffer_info->skb = skb;
-		buffer_info->length = adapter->rx_buffer_len;
 check_page:
 		/* allocate a new page if necessary */
 		if (!buffer_info->page) {
@@ -4477,7 +4477,7 @@ check_page:
 		if (!buffer_info->dma) {
 			buffer_info->dma = dma_map_page(&pdev->dev,
 							buffer_info->page, 0,
-							buffer_info->length,
+							PAGE_SIZE,
 							DMA_FROM_DEVICE);
 			if (dma_mapping_error(&pdev->dev, buffer_info->dma)) {
 				put_page(buffer_info->page);
@@ -4525,7 +4525,7 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc;
-	struct e1000_buffer *buffer_info;
+	struct e1000_rx_buffer *buffer_info;
 	struct sk_buff *skb;
 	unsigned int i;
 	unsigned int bufsz = adapter->rx_buffer_len;
@@ -4573,10 +4573,9 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 			dev_kfree_skb(oldskb);
 		}
 		buffer_info->skb = skb;
-		buffer_info->length = adapter->rx_buffer_len;
 		buffer_info->dma = dma_map_single(&pdev->dev,
 						  skb->data,
-						  buffer_info->length,
+						  adapter->rx_buffer_len,
 						  DMA_FROM_DEVICE);
 		if (dma_mapping_error(&pdev->dev, buffer_info->dma)) {
 			dev_kfree_skb(skb);
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/7] e1000: rename struct e1000_buffer to e1000_tx_buffer
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
                   ` (3 preceding siblings ...)
  2014-08-30 18:28 ` [PATCH 4/7] e1000: add and use e1000_rx_buffer info for rx Florian Westphal
@ 2014-08-30 18:28 ` Florian Westphal
  2014-08-30 18:28 ` [PATCH 6/7] e1000: convert to build_skb Florian Westphal
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev, Florian Westphal

and remove *page, its only used for rx.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 drivers/net/ethernet/intel/e1000/e1000.h         |  9 ++++-----
 drivers/net/ethernet/intel/e1000/e1000_ethtool.c |  2 +-
 drivers/net/ethernet/intel/e1000/e1000_main.c    | 23 ++++++++++++-----------
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h b/drivers/net/ethernet/intel/e1000/e1000.h
index 81efe33..4c2a102 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -148,16 +148,15 @@ struct e1000_adapter;
 /* wrapper around a pointer to a socket buffer,
  * so a DMA handle can be stored along with the buffer
  */
-struct e1000_buffer {
+struct e1000_tx_buffer {
 	struct sk_buff *skb;
 	dma_addr_t dma;
-	struct page *page;
 	unsigned long time_stamp;
 	u16 length;
 	u16 next_to_watch;
-	unsigned int segs;
+	bool mapped_as_page;
+	unsigned short segs;
 	unsigned int bytecount;
-	u16 mapped_as_page;
 };
 
 struct e1000_rx_buffer {
@@ -180,7 +179,7 @@ struct e1000_tx_ring {
 	/* next descriptor to check for DD status bit */
 	unsigned int next_to_clean;
 	/* array of buffer information structs */
-	struct e1000_buffer *buffer_info;
+	struct e1000_tx_buffer *buffer_info;
 
 	u16 tdh;
 	u16 tdt;
diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
index fbf7a4d..380a730 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
@@ -1010,7 +1010,7 @@ static int e1000_setup_desc_rings(struct e1000_adapter *adapter)
 	if (!txdr->count)
 		txdr->count = E1000_DEFAULT_TXD;
 
-	txdr->buffer_info = kcalloc(txdr->count, sizeof(struct e1000_buffer),
+	txdr->buffer_info = kcalloc(txdr->count, sizeof(struct e1000_tx_buffer),
 				    GFP_KERNEL);
 	if (!txdr->buffer_info) {
 		ret_val = 1;
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 5ff48af..73cd509 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -1497,7 +1497,7 @@ static int e1000_setup_tx_resources(struct e1000_adapter *adapter,
 	struct pci_dev *pdev = adapter->pdev;
 	int size;
 
-	size = sizeof(struct e1000_buffer) * txdr->count;
+	size = sizeof(struct e1000_tx_buffer) * txdr->count;
 	txdr->buffer_info = vzalloc(size);
 	if (!txdr->buffer_info)
 		return -ENOMEM;
@@ -1947,8 +1947,9 @@ void e1000_free_all_tx_resources(struct e1000_adapter *adapter)
 		e1000_free_tx_resources(adapter, &adapter->tx_ring[i]);
 }
 
-static void e1000_unmap_and_free_tx_resource(struct e1000_adapter *adapter,
-					     struct e1000_buffer *buffer_info)
+static void
+e1000_unmap_and_free_tx_resource(struct e1000_adapter *adapter,
+				 struct e1000_tx_buffer *buffer_info)
 {
 	if (buffer_info->dma) {
 		if (buffer_info->mapped_as_page)
@@ -1977,7 +1978,7 @@ static void e1000_clean_tx_ring(struct e1000_adapter *adapter,
 				struct e1000_tx_ring *tx_ring)
 {
 	struct e1000_hw *hw = &adapter->hw;
-	struct e1000_buffer *buffer_info;
+	struct e1000_tx_buffer *buffer_info;
 	unsigned long size;
 	unsigned int i;
 
@@ -1989,7 +1990,7 @@ static void e1000_clean_tx_ring(struct e1000_adapter *adapter,
 	}
 
 	netdev_reset_queue(adapter->netdev);
-	size = sizeof(struct e1000_buffer) * tx_ring->count;
+	size = sizeof(struct e1000_tx_buffer) * tx_ring->count;
 	memset(tx_ring->buffer_info, 0, size);
 
 	/* Zero out the descriptor ring */
@@ -2677,7 +2678,7 @@ static int e1000_tso(struct e1000_adapter *adapter,
 		     struct e1000_tx_ring *tx_ring, struct sk_buff *skb)
 {
 	struct e1000_context_desc *context_desc;
-	struct e1000_buffer *buffer_info;
+	struct e1000_tx_buffer *buffer_info;
 	unsigned int i;
 	u32 cmd_length = 0;
 	u16 ipcse = 0, tucse, mss;
@@ -2748,7 +2749,7 @@ static bool e1000_tx_csum(struct e1000_adapter *adapter,
 			  struct e1000_tx_ring *tx_ring, struct sk_buff *skb)
 {
 	struct e1000_context_desc *context_desc;
-	struct e1000_buffer *buffer_info;
+	struct e1000_tx_buffer *buffer_info;
 	unsigned int i;
 	u8 css;
 	u32 cmd_len = E1000_TXD_CMD_DEXT;
@@ -2807,7 +2808,7 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 {
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
-	struct e1000_buffer *buffer_info;
+	struct e1000_tx_buffer *buffer_info;
 	unsigned int len = skb_headlen(skb);
 	unsigned int offset = 0, size, count = 0, i;
 	unsigned int f, bytecount, segs;
@@ -2953,7 +2954,7 @@ static void e1000_tx_queue(struct e1000_adapter *adapter,
 {
 	struct e1000_hw *hw = &adapter->hw;
 	struct e1000_tx_desc *tx_desc = NULL;
-	struct e1000_buffer *buffer_info;
+	struct e1000_tx_buffer *buffer_info;
 	u32 txd_upper = 0, txd_lower = E1000_TXD_CMD_IFCS;
 	unsigned int i;
 
@@ -3370,7 +3371,7 @@ static void e1000_dump(struct e1000_adapter *adapter)
 
 	for (i = 0; tx_ring->desc && (i < tx_ring->count); i++) {
 		struct e1000_tx_desc *tx_desc = E1000_TX_DESC(*tx_ring, i);
-		struct e1000_buffer *buffer_info = &tx_ring->buffer_info[i];
+		struct e1000_tx_buffer *buffer_info = &tx_ring->buffer_info[i];
 		struct my_u { __le64 a; __le64 b; };
 		struct my_u *u = (struct my_u *)tx_desc;
 		const char *type;
@@ -3808,7 +3809,7 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter,
 	struct e1000_hw *hw = &adapter->hw;
 	struct net_device *netdev = adapter->netdev;
 	struct e1000_tx_desc *tx_desc, *eop_desc;
-	struct e1000_buffer *buffer_info;
+	struct e1000_tx_buffer *buffer_info;
 	unsigned int i, eop;
 	unsigned int count = 0;
 	unsigned int total_tx_bytes=0, total_tx_packets=0;
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/7] e1000: convert to build_skb
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
                   ` (4 preceding siblings ...)
  2014-08-30 18:28 ` [PATCH 5/7] e1000: rename struct e1000_buffer to e1000_tx_buffer Florian Westphal
@ 2014-08-30 18:28 ` Florian Westphal
  2014-08-30 18:28 ` [PATCH 7/7] e1000: switch to napi_gro_frags api Florian Westphal
  2014-09-02  0:30 ` e1000: convert to build_skb/napi_gro_frags api Jeff Kirsher
  7 siblings, 0 replies; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev, Florian Westphal

instead of preallocating rx skbs, allocate them right before sending
inbound packet up the stack.

e1000-kvm, mtu1500, netperf TCP_STREAM:
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
old: 87380  16384  16384    60.00    4532.40
new: 87380  16384  16384    60.00    4599.05

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 drivers/net/ethernet/intel/e1000/e1000.h         |   6 +-
 drivers/net/ethernet/intel/e1000/e1000_ethtool.c |  29 +--
 drivers/net/ethernet/intel/e1000/e1000_main.c    | 216 ++++++++++++-----------
 3 files changed, 131 insertions(+), 120 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000.h b/drivers/net/ethernet/intel/e1000/e1000.h
index 4c2a102..6970710 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -160,9 +160,11 @@ struct e1000_tx_buffer {
 };
 
 struct e1000_rx_buffer {
-	struct sk_buff *skb;
+	union {
+		struct page *page; /* jumbo: alloc_page */
+		u8 *data; /* else, netdev_alloc_frag */
+	} rxbuf;
 	dma_addr_t dma;
-	struct page *page;
 };
 
 struct e1000_tx_ring {
diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
index 380a730..ee139bf 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
@@ -974,8 +974,7 @@ static void e1000_free_desc_rings(struct e1000_adapter *adapter)
 						 rxdr->buffer_info[i].dma,
 						 E1000_RXBUFFER_2048,
 						 DMA_FROM_DEVICE);
-			if (rxdr->buffer_info[i].skb)
-				dev_kfree_skb(rxdr->buffer_info[i].skb);
+			kfree(rxdr->buffer_info[i].rxbuf.data);
 		}
 	}
 
@@ -1099,24 +1098,25 @@ static int e1000_setup_desc_rings(struct e1000_adapter *adapter)
 
 	for (i = 0; i < rxdr->count; i++) {
 		struct e1000_rx_desc *rx_desc = E1000_RX_DESC(*rxdr, i);
-		struct sk_buff *skb;
+		u8 *buf;
 
-		skb = alloc_skb(E1000_RXBUFFER_2048 + NET_IP_ALIGN, GFP_KERNEL);
-		if (!skb) {
+		buf = kzalloc(E1000_RXBUFFER_2048 + NET_SKB_PAD + NET_IP_ALIGN,
+			      GFP_KERNEL);
+		if (!buf) {
 			ret_val = 7;
 			goto err_nomem;
 		}
-		skb_reserve(skb, NET_IP_ALIGN);
-		rxdr->buffer_info[i].skb = skb;
+		rxdr->buffer_info[i].rxbuf.data = buf;
+
 		rxdr->buffer_info[i].dma =
-			dma_map_single(&pdev->dev, skb->data,
+			dma_map_single(&pdev->dev,
+				       buf + NET_SKB_PAD + NET_IP_ALIGN,
 				       E1000_RXBUFFER_2048, DMA_FROM_DEVICE);
 		if (dma_mapping_error(&pdev->dev, rxdr->buffer_info[i].dma)) {
 			ret_val = 8;
 			goto err_nomem;
 		}
 		rx_desc->buffer_addr = cpu_to_le64(rxdr->buffer_info[i].dma);
-		memset(skb->data, 0x00, skb->len);
 	}
 
 	return 0;
@@ -1390,13 +1390,13 @@ static void e1000_create_lbtest_frame(struct sk_buff *skb,
 	memset(&skb->data[frame_size / 2 + 12], 0xAF, 1);
 }
 
-static int e1000_check_lbtest_frame(struct sk_buff *skb,
+static int e1000_check_lbtest_frame(const unsigned char *data,
 				    unsigned int frame_size)
 {
 	frame_size &= ~1;
-	if (*(skb->data + 3) == 0xFF) {
-		if ((*(skb->data + frame_size / 2 + 10) == 0xBE) &&
-		   (*(skb->data + frame_size / 2 + 12) == 0xAF)) {
+	if (*(data + 3) == 0xFF) {
+		if ((*(data + frame_size / 2 + 10) == 0xBE) &&
+		    (*(data + frame_size / 2 + 12) == 0xAF)) {
 			return 0;
 		}
 	}
@@ -1447,7 +1447,8 @@ static int e1000_run_loopback_test(struct e1000_adapter *adapter)
 						DMA_FROM_DEVICE);
 
 			ret_val = e1000_check_lbtest_frame(
-					rxdr->buffer_info[l].skb,
+					rxdr->buffer_info[l].rxbuf.data +
+					NET_SKB_PAD + NET_IP_ALIGN,
 					1024);
 			if (!ret_val)
 				good_cnt++;
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 73cd509..9ac9a77 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -2054,6 +2054,28 @@ void e1000_free_all_rx_resources(struct e1000_adapter *adapter)
 		e1000_free_rx_resources(adapter, &adapter->rx_ring[i]);
 }
 
+#define E1000_HEADROOM (NET_SKB_PAD + NET_IP_ALIGN)
+static unsigned int e1000_frag_len(const struct e1000_adapter *a)
+{
+	return SKB_DATA_ALIGN(a->rx_buffer_len + E1000_HEADROOM) +
+		SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+}
+
+static void *e1000_alloc_frag(const struct e1000_adapter *a)
+{
+	unsigned int len = e1000_frag_len(a);
+	u8 *data = netdev_alloc_frag(len);
+
+	if (likely(data))
+		data += E1000_HEADROOM;
+	return data;
+}
+
+static void e1000_free_frag(const void *data)
+{
+	put_page(virt_to_head_page(data));
+}
+
 /**
  * e1000_clean_rx_ring - Free Rx Buffers per Queue
  * @adapter: board private structure
@@ -2068,30 +2090,30 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter,
 	unsigned long size;
 	unsigned int i;
 
-	/* Free all the Rx ring sk_buffs */
+	/* Free all the rx netfrags */
 	for (i = 0; i < rx_ring->count; i++) {
 		buffer_info = &rx_ring->buffer_info[i];
-		if (buffer_info->dma &&
-		    adapter->clean_rx == e1000_clean_rx_irq) {
-			dma_unmap_single(&pdev->dev, buffer_info->dma,
-					 adapter->rx_buffer_len,
-					 DMA_FROM_DEVICE);
-		} else if (buffer_info->dma &&
-		           adapter->clean_rx == e1000_clean_jumbo_rx_irq) {
-			dma_unmap_page(&pdev->dev, buffer_info->dma,
-				       adapter->rx_buffer_len,
-				       DMA_FROM_DEVICE);
+		if (adapter->clean_rx == e1000_clean_rx_irq) {
+			if (buffer_info->dma)
+				dma_unmap_single(&pdev->dev, buffer_info->dma,
+						 adapter->rx_buffer_len,
+						 DMA_FROM_DEVICE);
+			if (buffer_info->rxbuf.data) {
+				e1000_free_frag(buffer_info->rxbuf.data);
+				buffer_info->rxbuf.data = NULL;
+			}
+		} else if (adapter->clean_rx == e1000_clean_jumbo_rx_irq) {
+			if (buffer_info->dma)
+				dma_unmap_page(&pdev->dev, buffer_info->dma,
+					       adapter->rx_buffer_len,
+					       DMA_FROM_DEVICE);
+			if (buffer_info->rxbuf.page) {
+				put_page(buffer_info->rxbuf.page);
+				buffer_info->rxbuf.page = NULL;
+			}
 		}
 
 		buffer_info->dma = 0;
-		if (buffer_info->page) {
-			put_page(buffer_info->page);
-			buffer_info->page = NULL;
-		}
-		if (buffer_info->skb) {
-			dev_kfree_skb(buffer_info->skb);
-			buffer_info->skb = NULL;
-		}
 	}
 
 	/* there also may be some cached data from a chained receive */
@@ -3427,7 +3449,7 @@ rx_ring_summary:
 
 		pr_info("R[0x%03X]     %016llX %016llX %016llX %p %s\n",
 			i, le64_to_cpu(u->a), le64_to_cpu(u->b),
-			(u64)buffer_info->dma, buffer_info->skb, type);
+			(u64)buffer_info->dma, buffer_info->rxbuf.data, type);
 	} /* for */
 
 	/* dump the descriptor caches */
@@ -3947,12 +3969,12 @@ static void e1000_rx_checksum(struct e1000_adapter *adapter, u32 status_err,
 }
 
 /**
- * e1000_consume_page - helper function
+ * e1000_consume_page - helper function for jumbo rx path
  **/
 static void e1000_consume_page(struct e1000_rx_buffer *bi, struct sk_buff *skb,
 			       u16 length)
 {
-	bi->page = NULL;
+	bi->rxbuf.page = NULL;
 	skb->len += length;
 	skb->data_len += length;
 	skb->truesize += PAGE_SIZE;
@@ -4108,6 +4130,7 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 	int cleaned_count = 0;
 	bool cleaned = false;
 	unsigned int total_rx_bytes=0, total_rx_packets=0;
+	static const unsigned int bufsz = 256 - 16; /* for skb_reserve */
 
 	i = rx_ring->next_to_clean;
 	rx_desc = E1000_RX_DESC(*rx_ring, i);
@@ -4123,8 +4146,6 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
-		skb = buffer_info->skb;
-		buffer_info->skb = NULL;
 
 		if (++i == rx_ring->count) i = 0;
 		next_rxd = E1000_RX_DESC(*rx_ring, i);
@@ -4143,7 +4164,7 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 		/* errors is only valid for DD + EOP descriptors */
 		if (unlikely((status & E1000_RXD_STAT_EOP) &&
 		    (rx_desc->errors & E1000_RXD_ERR_FRAME_ERR_MASK))) {
-			u8 *mapped = page_address(buffer_info->page);
+			u8 *mapped = page_address(buffer_info->rxbuf.page);
 
 			if (e1000_tbi_should_accept(adapter, status,
 						    rx_desc->errors,
@@ -4152,8 +4173,6 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 			} else if (netdev->features & NETIF_F_RXALL) {
 				goto process_skb;
 			} else {
-				/* recycle both page and skb */
-				buffer_info->skb = skb;
 				/* an error means any chain goes out the window
 				 * too
 				 */
@@ -4170,16 +4189,18 @@ process_skb:
 			/* this descriptor is only the beginning (or middle) */
 			if (!rxtop) {
 				/* this is the beginning of a chain */
-				rxtop = skb;
-				skb_fill_page_desc(rxtop, 0, buffer_info->page,
+				rxtop = e1000_alloc_rx_skb(adapter, bufsz);
+				if (!rxtop)
+					break;
+
+				skb_fill_page_desc(rxtop, 0,
+						   buffer_info->rxbuf.page,
 						   0, length);
 			} else {
 				/* this is the middle of a chain */
 				skb_fill_page_desc(rxtop,
 				    skb_shinfo(rxtop)->nr_frags,
-				    buffer_info->page, 0, length);
-				/* re-use the skb, only consumed the page */
-				buffer_info->skb = skb;
+				    buffer_info->rxbuf.page, 0, length);
 			}
 			e1000_consume_page(buffer_info, rxtop, length);
 			goto next_desc;
@@ -4188,32 +4209,33 @@ process_skb:
 				/* end of the chain */
 				skb_fill_page_desc(rxtop,
 				    skb_shinfo(rxtop)->nr_frags,
-				    buffer_info->page, 0, length);
-				/* re-use the current skb, we only consumed the
-				 * page
-				 */
-				buffer_info->skb = skb;
+				    buffer_info->rxbuf.page, 0, length);
 				skb = rxtop;
 				rxtop = NULL;
 				e1000_consume_page(buffer_info, skb, length);
 			} else {
+				struct page *p;
 				/* no chain, got EOP, this buf is the packet
 				 * copybreak to save the put_page/alloc_page
 				 */
+				skb = e1000_alloc_rx_skb(adapter, bufsz);
+				if (!skb)
+					break;
+				p = buffer_info->rxbuf.page;
 				if (length <= copybreak &&
 				    skb_tailroom(skb) >= length) {
 					u8 *vaddr;
-					vaddr = kmap_atomic(buffer_info->page);
+
+					vaddr = kmap_atomic(p);
 					memcpy(skb_tail_pointer(skb), vaddr,
 					       length);
 					kunmap_atomic(vaddr);
 					/* re-use the page, so don't erase
-					 * buffer_info->page
+					 * buffer_info->rxbuf.page
 					 */
 					skb_put(skb, length);
 				} else {
-					skb_fill_page_desc(skb, 0,
-							   buffer_info->page, 0,
+					skb_fill_page_desc(skb, 0, p, 0,
 							   length);
 					e1000_consume_page(buffer_info, skb,
 							   length);
@@ -4318,6 +4340,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 
 	while (rx_desc->status & E1000_RXD_STAT_DD) {
 		struct sk_buff *skb;
+		u8 *data;
 		u8 status;
 
 		if (*work_done >= work_to_do)
@@ -4328,16 +4351,24 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 		status = rx_desc->status;
 		length = le16_to_cpu(rx_desc->length);
 
-		prefetch(buffer_info->skb->data - NET_IP_ALIGN);
-		skb = e1000_copybreak(adapter, buffer_info, length,
-				      buffer_info->skb->data);
+		data = buffer_info->rxbuf.data;
+		prefetch(data);
+		skb = e1000_copybreak(adapter, buffer_info, length, data);
 		if (!skb) {
-			skb = buffer_info->skb;
-			buffer_info->skb = NULL;
+			unsigned int frag_len = e1000_frag_len(adapter);
+
+			skb = build_skb(data - E1000_HEADROOM, frag_len);
+			if (!skb) {
+				adapter->alloc_rx_buff_failed++;
+				break;
+			}
+
+			skb_reserve(skb, E1000_HEADROOM);
 			dma_unmap_single(&pdev->dev, buffer_info->dma,
 					 adapter->rx_buffer_len,
 					 DMA_FROM_DEVICE);
 			buffer_info->dma = 0;
+			buffer_info->rxbuf.data = NULL;
 		}
 
 		if (++i == rx_ring->count) i = 0;
@@ -4370,7 +4401,7 @@ static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 		if (unlikely(rx_desc->errors & E1000_RXD_ERR_FRAME_ERR_MASK)) {
 			if (e1000_tbi_should_accept(adapter, status,
 						    rx_desc->errors,
-						    length, skb->data)) {
+						    length, data)) {
 				length--;
 			} else if (netdev->features & NETIF_F_RXALL) {
 				goto process_skb;
@@ -4390,7 +4421,7 @@ process_skb:
 			 */
 			length -= 4;
 
-		if (buffer_info->skb == NULL)
+		if (buffer_info->rxbuf.data == NULL)
 			skb_put(skb, length);
 		else /* copybreak skb */
 			skb_trim(skb, length);
@@ -4439,37 +4470,19 @@ static void
 e1000_alloc_jumbo_rx_buffers(struct e1000_adapter *adapter,
 			     struct e1000_rx_ring *rx_ring, int cleaned_count)
 {
-	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc;
 	struct e1000_rx_buffer *buffer_info;
-	struct sk_buff *skb;
 	unsigned int i;
-	unsigned int bufsz = 256 - 16 /*for skb_reserve */ ;
 
 	i = rx_ring->next_to_use;
 	buffer_info = &rx_ring->buffer_info[i];
 
 	while (cleaned_count--) {
-		skb = buffer_info->skb;
-		if (skb) {
-			skb_trim(skb, 0);
-			goto check_page;
-		}
-
-		skb = netdev_alloc_skb_ip_align(netdev, bufsz);
-		if (unlikely(!skb)) {
-			/* Better luck next round */
-			adapter->alloc_rx_buff_failed++;
-			break;
-		}
-
-		buffer_info->skb = skb;
-check_page:
 		/* allocate a new page if necessary */
-		if (!buffer_info->page) {
-			buffer_info->page = alloc_page(GFP_ATOMIC);
-			if (unlikely(!buffer_info->page)) {
+		if (!buffer_info->rxbuf.page) {
+			buffer_info->rxbuf.page = alloc_page(GFP_ATOMIC);
+			if (unlikely(!buffer_info->rxbuf.page)) {
 				adapter->alloc_rx_buff_failed++;
 				break;
 			}
@@ -4477,17 +4490,15 @@ check_page:
 
 		if (!buffer_info->dma) {
 			buffer_info->dma = dma_map_page(&pdev->dev,
-							buffer_info->page, 0,
-							PAGE_SIZE,
+							buffer_info->rxbuf.page, 0,
+							adapter->rx_buffer_len,
 							DMA_FROM_DEVICE);
 			if (dma_mapping_error(&pdev->dev, buffer_info->dma)) {
-				put_page(buffer_info->page);
-				dev_kfree_skb(skb);
-				buffer_info->page = NULL;
-				buffer_info->skb = NULL;
+				put_page(buffer_info->rxbuf.page);
+				buffer_info->rxbuf.page = NULL;
 				buffer_info->dma = 0;
 				adapter->alloc_rx_buff_failed++;
-				break; /* while !buffer_info->skb */
+				break;
 			}
 		}
 
@@ -4523,11 +4534,9 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 				   int cleaned_count)
 {
 	struct e1000_hw *hw = &adapter->hw;
-	struct net_device *netdev = adapter->netdev;
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_rx_desc *rx_desc;
 	struct e1000_rx_buffer *buffer_info;
-	struct sk_buff *skb;
 	unsigned int i;
 	unsigned int bufsz = adapter->rx_buffer_len;
 
@@ -4535,55 +4544,52 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 	buffer_info = &rx_ring->buffer_info[i];
 
 	while (cleaned_count--) {
-		skb = buffer_info->skb;
-		if (skb) {
-			skb_trim(skb, 0);
+		void *data;
+
+		if (buffer_info->rxbuf.data)
 			goto skip;
-		}
 
-		skb = netdev_alloc_skb_ip_align(netdev, bufsz);
-		if (unlikely(!skb)) {
+		data = e1000_alloc_frag(adapter);
+		if (!data) {
 			/* Better luck next round */
 			adapter->alloc_rx_buff_failed++;
 			break;
 		}
 
 		/* Fix for errata 23, can't cross 64kB boundary */
-		if (!e1000_check_64k_bound(adapter, skb->data, bufsz)) {
-			struct sk_buff *oldskb = skb;
+		if (!e1000_check_64k_bound(adapter, data, bufsz)) {
+			void *olddata = data;
 			e_err(rx_err, "skb align check failed: %u bytes at "
-			      "%p\n", bufsz, skb->data);
+			      "%p\n", bufsz, data);
 			/* Try again, without freeing the previous */
-			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
+			data = e1000_alloc_frag(adapter);
 			/* Failed allocation, critical failure */
-			if (!skb) {
-				dev_kfree_skb(oldskb);
+			if (!data) {
+				e1000_free_frag(olddata);
 				adapter->alloc_rx_buff_failed++;
 				break;
 			}
 
-			if (!e1000_check_64k_bound(adapter, skb->data, bufsz)) {
+			if (!e1000_check_64k_bound(adapter, data, bufsz)) {
 				/* give up */
-				dev_kfree_skb(skb);
-				dev_kfree_skb(oldskb);
+				e1000_free_frag(data);
+				e1000_free_frag(olddata);
 				adapter->alloc_rx_buff_failed++;
-				break; /* while !buffer_info->skb */
+				break;
 			}
 
 			/* Use new allocation */
-			dev_kfree_skb(oldskb);
+			e1000_free_frag(olddata);
 		}
-		buffer_info->skb = skb;
 		buffer_info->dma = dma_map_single(&pdev->dev,
-						  skb->data,
+						  data,
 						  adapter->rx_buffer_len,
 						  DMA_FROM_DEVICE);
 		if (dma_mapping_error(&pdev->dev, buffer_info->dma)) {
-			dev_kfree_skb(skb);
-			buffer_info->skb = NULL;
+			e1000_free_frag(data);
 			buffer_info->dma = 0;
 			adapter->alloc_rx_buff_failed++;
-			break; /* while !buffer_info->skb */
+			break;
 		}
 
 		/* XXX if it was allocated cleanly it will never map to a
@@ -4597,21 +4603,23 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter,
 			e_err(rx_err, "dma align check failed: %u bytes at "
 			      "%p\n", adapter->rx_buffer_len,
 			      (void *)(unsigned long)buffer_info->dma);
-			dev_kfree_skb(skb);
-			buffer_info->skb = NULL;
 
 			dma_unmap_single(&pdev->dev, buffer_info->dma,
 					 adapter->rx_buffer_len,
 					 DMA_FROM_DEVICE);
+
+			e1000_free_frag(data);
+			buffer_info->rxbuf.data = NULL;
 			buffer_info->dma = 0;
 
 			adapter->alloc_rx_buff_failed++;
-			break; /* while !buffer_info->skb */
+			break;
 		}
+		buffer_info->rxbuf.data = data;
+ skip:
 		rx_desc = E1000_RX_DESC(*rx_ring, i);
 		rx_desc->buffer_addr = cpu_to_le64(buffer_info->dma);
 
-skip:
 		if (unlikely(++i == rx_ring->count))
 			i = 0;
 		buffer_info = &rx_ring->buffer_info[i];
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 7/7] e1000: switch to napi_gro_frags api
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
                   ` (5 preceding siblings ...)
  2014-08-30 18:28 ` [PATCH 6/7] e1000: convert to build_skb Florian Westphal
@ 2014-08-30 18:28 ` Florian Westphal
  2014-08-31 12:03   ` Sergei Shtylyov
  2014-09-02  0:30 ` e1000: convert to build_skb/napi_gro_frags api Jeff Kirsher
  7 siblings, 1 reply; 10+ messages in thread
From: Florian Westphal @ 2014-08-30 18:28 UTC (permalink / raw)
  To: e1000-devel; +Cc: netdev, Florian Westphal

napi_gro_frags allows skb re-use in case GRO can merge payload pages
into an skb on the gro lists.

netperf TCP_STREAM, kvm-e1000 emulation, mtu 9k:
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
old: 87380  16384  16384    30.00  8985.78
new: 87380  16384  16384    30.00  9907.05

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c | 50 ++++++++++++++++++---------
 1 file changed, 33 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 9ac9a77..784a67f 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -2117,10 +2117,8 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter,
 	}
 
 	/* there also may be some cached data from a chained receive */
-	if (rx_ring->rx_skb_top) {
-		dev_kfree_skb(rx_ring->rx_skb_top);
-		rx_ring->rx_skb_top = NULL;
-	}
+	napi_free_frags(&adapter->napi);
+	rx_ring->rx_skb_top = NULL;
 
 	size = sizeof(struct e1000_rx_buffer) * rx_ring->count;
 	memset(rx_ring->buffer_info, 0, size);
@@ -4130,7 +4128,6 @@ static bool e1000_clean_jumbo_rx_irq(struct e1000_adapter *adapter,
 	int cleaned_count = 0;
 	bool cleaned = false;
 	unsigned int total_rx_bytes=0, total_rx_packets=0;
-	static const unsigned int bufsz = 256 - 16; /* for skb_reserve */
 
 	i = rx_ring->next_to_clean;
 	rx_desc = E1000_RX_DESC(*rx_ring, i);
@@ -4189,7 +4186,7 @@ process_skb:
 			/* this descriptor is only the beginning (or middle) */
 			if (!rxtop) {
 				/* this is the beginning of a chain */
-				rxtop = e1000_alloc_rx_skb(adapter, bufsz);
+				rxtop = napi_get_frags(&adapter->napi);
 				if (!rxtop)
 					break;
 
@@ -4218,14 +4215,17 @@ process_skb:
 				/* no chain, got EOP, this buf is the packet
 				 * copybreak to save the put_page/alloc_page
 				 */
-				skb = e1000_alloc_rx_skb(adapter, bufsz);
-				if (!skb)
-					break;
 				p = buffer_info->rxbuf.page;
-				if (length <= copybreak &&
-				    skb_tailroom(skb) >= length) {
+				if (length <= copybreak) {
 					u8 *vaddr;
 
+					if (likely(!(netdev->features & NETIF_F_RXFCS)))
+						length -= 4;
+					skb = e1000_alloc_rx_skb(adapter,
+								 length);
+					if (!skb)
+						break;
+
 					vaddr = kmap_atomic(p);
 					memcpy(skb_tail_pointer(skb), vaddr,
 					       length);
@@ -4234,7 +4234,23 @@ process_skb:
 					 * buffer_info->rxbuf.page
 					 */
 					skb_put(skb, length);
+					e1000_rx_checksum(adapter,
+							  (u32)(status) |
+							  ((u32)(rx_desc->errors) << 24),
+							  le16_to_cpu(rx_desc->csum), skb);
+
+					total_rx_bytes += skb->len;
+					total_rx_packets++;
+
+					e1000_receive_skb(adapter, status,
+							  rx_desc->special, skb);
+					goto next_desc;
 				} else {
+					skb = napi_get_frags(&adapter->napi);
+					if (!skb) {
+						adapter->alloc_rx_buff_failed++;
+						break;
+					}
 					skb_fill_page_desc(skb, 0, p, 0,
 							   length);
 					e1000_consume_page(buffer_info, skb,
@@ -4254,14 +4270,14 @@ process_skb:
 			pskb_trim(skb, skb->len - 4);
 		total_rx_packets++;
 
-		/* eth type trans needs skb->data to point to something */
-		if (!pskb_may_pull(skb, ETH_HLEN)) {
-			e_err(drv, "pskb_may_pull failed.\n");
-			dev_kfree_skb(skb);
-			goto next_desc;
+		if (status & E1000_RXD_STAT_VP) {
+			__le16 vlan = rx_desc->special;
+			u16 vid = le16_to_cpu(vlan) & E1000_RXD_SPC_VLAN_MASK;
+
+			__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vid);
 		}
 
-		e1000_receive_skb(adapter, status, rx_desc->special, skb);
+		napi_gro_frags(&adapter->napi);
 
 next_desc:
 		rx_desc->status = 0;
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 7/7] e1000: switch to napi_gro_frags api
  2014-08-30 18:28 ` [PATCH 7/7] e1000: switch to napi_gro_frags api Florian Westphal
@ 2014-08-31 12:03   ` Sergei Shtylyov
  0 siblings, 0 replies; 10+ messages in thread
From: Sergei Shtylyov @ 2014-08-31 12:03 UTC (permalink / raw)
  To: Florian Westphal, e1000-devel; +Cc: netdev

Hello.

On 8/30/2014 10:28 PM, Florian Westphal wrote:

> napi_gro_frags allows skb re-use in case GRO can merge payload pages
> into an skb on the gro lists.

> netperf TCP_STREAM, kvm-e1000 emulation, mtu 9k:
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
> old: 87380  16384  16384    30.00  8985.78
> new: 87380  16384  16384    30.00  9907.05

> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>   drivers/net/ethernet/intel/e1000/e1000_main.c | 50 ++++++++++++++++++---------
>   1 file changed, 33 insertions(+), 17 deletions(-)

> diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
> index 9ac9a77..784a67f 100644
> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
[...]
> @@ -4234,7 +4234,23 @@ process_skb:
>   					 * buffer_info->rxbuf.page
>   					 */
>   					skb_put(skb, length);
> +					e1000_rx_checksum(adapter,
> +							  (u32)(status) |
> +							  ((u32)(rx_desc->errors) << 24),

    Inner parens are not needed.

[...]

WBR, Sergei


------------------------------------------------------------------------------
Slashdot TV.  
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: e1000: convert to build_skb/napi_gro_frags api
  2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
                   ` (6 preceding siblings ...)
  2014-08-30 18:28 ` [PATCH 7/7] e1000: switch to napi_gro_frags api Florian Westphal
@ 2014-09-02  0:30 ` Jeff Kirsher
  7 siblings, 0 replies; 10+ messages in thread
From: Jeff Kirsher @ 2014-09-02  0:30 UTC (permalink / raw)
  To: Florian Westphal; +Cc: e1000-devel, netdev


[-- Attachment #1.1: Type: text/plain, Size: 731 bytes --]

On Sat, 2014-08-30 at 20:28 +0200, Florian Westphal wrote:
> e1000 driver preallocates skbs, then sends them up the stack.
> 
> This series changes the rx routine to only preallocate data buffers,
> then initialize skb right before passing it up the stack.
> 
> This gives slight performance inprovements as the skb will be
> fresh in the cache once its allocated/initialized.
> 
> The default-mtu 1500 path is converted to build_skb()/netdev_alloc_frag.
> 
> Jumbo path is converted to napi_gro_frags.
> 
> Note that this has only been tested with kvms e1000 emulation.
> 

Can you re-spin the series with Sergei's suggest change for patch 7,
please?  I will await your v2 before picking up this series, thanks!

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 155 bytes --]

------------------------------------------------------------------------------
Slashdot TV.  
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/

[-- Attachment #3: Type: text/plain, Size: 257 bytes --]

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-09-02  0:30 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-30 18:28 e1000: convert to build_skb/napi_gro_frags api Florian Westphal
2014-08-30 18:28 ` [PATCH 1/7] e1000: move e1000_tbi_adjust_stats to where its used Florian Westphal
2014-08-30 18:28 ` [PATCH 2/7] e1000: move tbi workaround code into helper function Florian Westphal
2014-08-30 18:28 ` [PATCH 3/7] e1000: perform copybreak ahead of dma unmap Florian Westphal
2014-08-30 18:28 ` [PATCH 4/7] e1000: add and use e1000_rx_buffer info for rx Florian Westphal
2014-08-30 18:28 ` [PATCH 5/7] e1000: rename struct e1000_buffer to e1000_tx_buffer Florian Westphal
2014-08-30 18:28 ` [PATCH 6/7] e1000: convert to build_skb Florian Westphal
2014-08-30 18:28 ` [PATCH 7/7] e1000: switch to napi_gro_frags api Florian Westphal
2014-08-31 12:03   ` Sergei Shtylyov
2014-09-02  0:30 ` e1000: convert to build_skb/napi_gro_frags api Jeff Kirsher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.