All of lore.kernel.org
 help / color / mirror / Atom feed
* [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06
@ 2015-01-07  1:31 Jeff Kirsher
  2015-01-07  1:31 ` [net v2 1/3] i40e: fix un-necessary Tx hangs Jeff Kirsher
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Jeff Kirsher @ 2015-01-07  1:31 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene

This series contains fixes to i40e only.

Jesse provides a fix for when the driver was polling with interrupts
disabled the hardware would occasionally not write back descriptors.
His fix causes the driver to detect this situation and force an interrupt
to fire which will flush the stuck descriptor.

Anjali provides a couple of fixes, the first corrects an issue where
the receive port checksum error counter was incrementing incorrectly with
UDP encapsulated tunneled traffic.  The second fix resolves an issue where
the driver was examining the outer protocol layer to set the inner protocol
layer checksum offload.  In the case of TCP over IPv6 over an IPv4 based
VXLAN, the inner checksum offloads would be set to look for IPv4/UDP
instead of IPv6/TCP, so fixed the issue so that the driver will look at
the proper layer for encapsulation offload settings.

v2: fixed a bug in patch 01 of the series, where the interrupt rate impacted
    4 port workloads by reducing throughput.

The following are changes since commit 2abad79afa700e837cb4feed170141292e0720c0:
  qla3xxx: don't allow never end busy loop
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net master

Anjali Singhai (2):
  i40e: Fix Rx checksum error counter
  i40e: Fix bug with TCP over IPv6 over VXLAN

Jesse Brandeburg (1):
  i40e: fix un-necessary Tx hangs

 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 104 +++++++++++++++++++---------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |   1 +
 2 files changed, 73 insertions(+), 32 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [net v2 1/3] i40e: fix un-necessary Tx hangs
  2015-01-07  1:31 [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 Jeff Kirsher
@ 2015-01-07  1:31 ` Jeff Kirsher
  2015-01-07  1:31 ` [net v2 2/3] i40e: Fix Rx checksum error counter Jeff Kirsher
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Jeff Kirsher @ 2015-01-07  1:31 UTC (permalink / raw)
  To: davem; +Cc: Jesse Brandeburg, netdev, nhorman, sassmann, jogreene, Jeff Kirsher

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

When the driver was polling with interrupts disabled the hardware
will occasionally not write back descriptors.  This patch causes
the driver to detect this situation and force an interrupt to
fire which will flush the stuck descriptor.  Does not conflict
with napi because if we are already polling the napi_schedule is
ignored.  Additionally the extra interrupts are rate limited, so
don't cause a burden to the CPU.

Change-ID: Iba4616d2a71288672a5f08e4512e2704b97335e8
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
v2: fixed a bug where the interrupt rate impacted 4 port workloads by
    reducing throughput.

 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 56 ++++++++++++++++++++++++-----
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |  1 +
 2 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 04b4414..f145aaf 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -658,6 +658,8 @@ static inline u32 i40e_get_head(struct i40e_ring *tx_ring)
 	return le32_to_cpu(*(volatile __le32 *)head);
 }
 
+#define WB_STRIDE 0x3
+
 /**
  * i40e_clean_tx_irq - Reclaim resources after transmit completes
  * @tx_ring:  tx ring to clean
@@ -759,6 +761,18 @@ static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget)
 	tx_ring->q_vector->tx.total_bytes += total_bytes;
 	tx_ring->q_vector->tx.total_packets += total_packets;
 
+	/* check to see if there are any non-cache aligned descriptors
+	 * waiting to be written back, and kick the hardware to force
+	 * them to be written back in case of napi polling
+	 */
+	if (budget &&
+	    !((i & WB_STRIDE) == WB_STRIDE) &&
+	    !test_bit(__I40E_DOWN, &tx_ring->vsi->state) &&
+	    (I40E_DESC_UNUSED(tx_ring) != tx_ring->count))
+		tx_ring->arm_wb = true;
+	else
+		tx_ring->arm_wb = false;
+
 	if (check_for_tx_hang(tx_ring) && i40e_check_tx_hang(tx_ring)) {
 		/* schedule immediate reset if we believe we hung */
 		dev_info(tx_ring->dev, "Detected Tx Unit Hang\n"
@@ -777,13 +791,16 @@ static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget)
 		netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
 
 		dev_info(tx_ring->dev,
-			 "tx hang detected on queue %d, resetting adapter\n",
+			 "tx hang detected on queue %d, reset requested\n",
 			 tx_ring->queue_index);
 
-		tx_ring->netdev->netdev_ops->ndo_tx_timeout(tx_ring->netdev);
+		/* do not fire the reset immediately, wait for the stack to
+		 * decide we are truly stuck, also prevents every queue from
+		 * simultaneously requesting a reset
+		 */
 
-		/* the adapter is about to reset, no point in enabling stuff */
-		return true;
+		/* the adapter is about to reset, no point in enabling polling */
+		budget = 1;
 	}
 
 	netdev_tx_completed_queue(netdev_get_tx_queue(tx_ring->netdev,
@@ -806,7 +823,25 @@ static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget)
 		}
 	}
 
-	return budget > 0;
+	return !!budget;
+}
+
+/**
+ * i40e_force_wb - Arm hardware to do a wb on noncache aligned descriptors
+ * @vsi: the VSI we care about
+ * @q_vector: the vector  on which to force writeback
+ *
+ **/
+static void i40e_force_wb(struct i40e_vsi *vsi, struct i40e_q_vector *q_vector)
+{
+	u32 val = I40E_PFINT_DYN_CTLN_INTENA_MASK |
+		  I40E_PFINT_DYN_CTLN_SWINT_TRIG_MASK |
+		  I40E_PFINT_DYN_CTLN_SW_ITR_INDX_ENA_MASK
+		  /* allow 00 to be written to the index */;
+
+	wr32(&vsi->back->hw,
+	     I40E_PFINT_DYN_CTLN(q_vector->v_idx + vsi->base_vector - 1),
+	     val);
 }
 
 /**
@@ -1581,6 +1616,7 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 	struct i40e_vsi *vsi = q_vector->vsi;
 	struct i40e_ring *ring;
 	bool clean_complete = true;
+	bool arm_wb = false;
 	int budget_per_ring;
 
 	if (test_bit(__I40E_DOWN, &vsi->state)) {
@@ -1591,8 +1627,10 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 	/* Since the actual Tx work is minimal, we can give the Tx a larger
 	 * budget and be more aggressive about cleaning up the Tx descriptors.
 	 */
-	i40e_for_each_ring(ring, q_vector->tx)
+	i40e_for_each_ring(ring, q_vector->tx) {
 		clean_complete &= i40e_clean_tx_irq(ring, vsi->work_limit);
+		arm_wb |= ring->arm_wb;
+	}
 
 	/* We attempt to distribute budget to each Rx queue fairly, but don't
 	 * allow the budget to go below 1 because that would exit polling early.
@@ -1603,8 +1641,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 		clean_complete &= i40e_clean_rx_irq(ring, budget_per_ring);
 
 	/* If work not completed, return budget and polling will return */
-	if (!clean_complete)
+	if (!clean_complete) {
+		if (arm_wb)
+			i40e_force_wb(vsi, q_vector);
 		return budget;
+	}
 
 	/* Work is done so exit the polling mode and re-enable the interrupt */
 	napi_complete(napi);
@@ -2198,7 +2239,6 @@ static void i40e_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
 	/* Place RS bit on last descriptor of any packet that spans across the
 	 * 4th descriptor (WB_STRIDE aka 0x3) in a 64B cacheline.
 	 */
-#define WB_STRIDE 0x3
 	if (((i & WB_STRIDE) != WB_STRIDE) &&
 	    (first <= &tx_ring->tx_bi[i]) &&
 	    (first >= &tx_ring->tx_bi[i & ~WB_STRIDE])) {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index e60d3ac..18b0023 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -241,6 +241,7 @@ struct i40e_ring {
 	unsigned long last_rx_timestamp;
 
 	bool ring_active;		/* is ring online or not */
+	bool arm_wb;		/* do something to arm write back */
 
 	/* stats structs */
 	struct i40e_queue_stats	stats;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [net v2 2/3] i40e: Fix Rx checksum error counter
  2015-01-07  1:31 [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 Jeff Kirsher
  2015-01-07  1:31 ` [net v2 1/3] i40e: fix un-necessary Tx hangs Jeff Kirsher
@ 2015-01-07  1:31 ` Jeff Kirsher
  2015-01-07  5:43   ` Tom Herbert
  2015-01-07  1:31 ` [net v2 3/3] i40e: Fix bug with TCP over IPv6 over VXLAN Jeff Kirsher
  2015-01-09  3:43 ` [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 David Miller
  3 siblings, 1 reply; 9+ messages in thread
From: Jeff Kirsher @ 2015-01-07  1:31 UTC (permalink / raw)
  To: davem
  Cc: Anjali Singhai, netdev, nhorman, sassmann, jogreene, Greg Rose,
	Jeff Kirsher

From: Anjali Singhai <anjali.singhai@intel.com>

The Rx port checksum error counter was incrementing incorrectly with
UDP encapsulated tunneled traffic.  This patch fixes the problem so that
the port_rx_csum counter will show accurate statistics.

Signed-off-by: Anjali Singhai <anjali.singhai@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index f145aaf..38c7638 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1325,9 +1325,7 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
 	 * so the total length of IPv4 header is IHL*4 bytes
 	 * The UDP_0 bit *may* bet set if the *inner* header is UDP
 	 */
-	if (ipv4_tunnel &&
-	    (decoded.inner_prot != I40E_RX_PTYPE_INNER_PROT_UDP) &&
-	    !(rx_status & (1 << I40E_RX_DESC_STATUS_UDP_0_SHIFT))) {
+	if (ipv4_tunnel) {
 		skb->transport_header = skb->mac_header +
 					sizeof(struct ethhdr) +
 					(ip_hdr(skb)->ihl * 4);
@@ -1337,15 +1335,19 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
 					  skb->protocol == htons(ETH_P_8021AD))
 					  ? VLAN_HLEN : 0;
 
-		rx_udp_csum = udp_csum(skb);
-		iph = ip_hdr(skb);
-		csum = csum_tcpudp_magic(
-				iph->saddr, iph->daddr,
-				(skb->len - skb_transport_offset(skb)),
-				IPPROTO_UDP, rx_udp_csum);
+		if ((ip_hdr(skb)->protocol == IPPROTO_UDP) &&
+		    (udp_hdr(skb)->check != 0)) {
+			rx_udp_csum = udp_csum(skb);
+			iph = ip_hdr(skb);
+			csum = csum_tcpudp_magic(
+					iph->saddr, iph->daddr,
+					(skb->len - skb_transport_offset(skb)),
+					IPPROTO_UDP, rx_udp_csum);
 
-		if (udp_hdr(skb)->check != csum)
-			goto checksum_fail;
+			if (udp_hdr(skb)->check != csum)
+				goto checksum_fail;
+
+		} /* else its GRE and so no outer UDP header */
 	}
 
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [net v2 3/3] i40e: Fix bug with TCP over IPv6 over VXLAN
  2015-01-07  1:31 [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 Jeff Kirsher
  2015-01-07  1:31 ` [net v2 1/3] i40e: fix un-necessary Tx hangs Jeff Kirsher
  2015-01-07  1:31 ` [net v2 2/3] i40e: Fix Rx checksum error counter Jeff Kirsher
@ 2015-01-07  1:31 ` Jeff Kirsher
  2015-01-09  3:43 ` [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 David Miller
  3 siblings, 0 replies; 9+ messages in thread
From: Jeff Kirsher @ 2015-01-07  1:31 UTC (permalink / raw)
  To: davem
  Cc: Anjali Singhai, netdev, nhorman, sassmann, jogreene, Greg Rose,
	Jeff Kirsher

From: Anjali Singhai <anjali.singhai@intel.com>

The driver was examining the outer protocol layer to set the inner protocol
layer checksum offload.  In the case of TCP over IPV6 over an IPv4 based
VXLAN the inner checksum offloads would be set to look for IPv4/UDP instead
of IPv6/TCP.  This code fixes that so that the driver will look at the
proper layer for encapsulation offload settings.

Signed-off-by: Anjali Singhai <anjali.singhai@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 38c7638..cecb340 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1883,17 +1883,16 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb,
 	if (err < 0)
 		return err;
 
-	if (protocol == htons(ETH_P_IP)) {
-		iph = skb->encapsulation ? inner_ip_hdr(skb) : ip_hdr(skb);
+	iph = skb->encapsulation ? inner_ip_hdr(skb) : ip_hdr(skb);
+	ipv6h = skb->encapsulation ? inner_ipv6_hdr(skb) : ipv6_hdr(skb);
+
+	if (iph->version == 4) {
 		tcph = skb->encapsulation ? inner_tcp_hdr(skb) : tcp_hdr(skb);
 		iph->tot_len = 0;
 		iph->check = 0;
 		tcph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr,
 						 0, IPPROTO_TCP, 0);
-	} else if (skb_is_gso_v6(skb)) {
-
-		ipv6h = skb->encapsulation ? inner_ipv6_hdr(skb)
-					   : ipv6_hdr(skb);
+	} else if (ipv6h->version == 6) {
 		tcph = skb->encapsulation ? inner_tcp_hdr(skb) : tcp_hdr(skb);
 		ipv6h->payload_len = 0;
 		tcph->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr,
@@ -1989,13 +1988,9 @@ static void i40e_tx_enable_csum(struct sk_buff *skb, u32 tx_flags,
 					 I40E_TX_CTX_EXT_IP_IPV4_NO_CSUM;
 			}
 		} else if (tx_flags & I40E_TX_FLAGS_IPV6) {
-			if (tx_flags & I40E_TX_FLAGS_TSO) {
-				*cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV6;
+			*cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV6;
+			if (tx_flags & I40E_TX_FLAGS_TSO)
 				ip_hdr(skb)->check = 0;
-			} else {
-				*cd_tunneling |=
-					 I40E_TX_CTX_EXT_IP_IPV4_NO_CSUM;
-			}
 		}
 
 		/* Now set the ctx descriptor fields */
@@ -2005,7 +2000,10 @@ static void i40e_tx_enable_csum(struct sk_buff *skb, u32 tx_flags,
 				   ((skb_inner_network_offset(skb) -
 					skb_transport_offset(skb)) >> 1) <<
 				   I40E_TXD_CTX_QW0_NATLEN_SHIFT;
-
+		if (this_ip_hdr->version == 6) {
+			tx_flags &= ~I40E_TX_FLAGS_IPV4;
+			tx_flags |= I40E_TX_FLAGS_IPV6;
+		}
 	} else {
 		network_hdr_len = skb_network_header_len(skb);
 		this_ip_hdr = ip_hdr(skb);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [net v2 2/3] i40e: Fix Rx checksum error counter
  2015-01-07  1:31 ` [net v2 2/3] i40e: Fix Rx checksum error counter Jeff Kirsher
@ 2015-01-07  5:43   ` Tom Herbert
  2015-01-07 20:14     ` Singhai, Anjali
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Herbert @ 2015-01-07  5:43 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: David Miller, Anjali Singhai, Linux Netdev List, nhorman,
	sassmann, jogreene, Greg Rose

On Tue, Jan 6, 2015 at 5:31 PM, Jeff Kirsher
<jeffrey.t.kirsher@intel.com> wrote:
> From: Anjali Singhai <anjali.singhai@intel.com>
>
> The Rx port checksum error counter was incrementing incorrectly with
> UDP encapsulated tunneled traffic.  This patch fixes the problem so that
> the port_rx_csum counter will show accurate statistics.
>
> Signed-off-by: Anjali Singhai <anjali.singhai@intel.com>
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> Tested-by: Jim Young <james.m.young@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index f145aaf..38c7638 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -1325,9 +1325,7 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
>          * so the total length of IPv4 header is IHL*4 bytes
>          * The UDP_0 bit *may* bet set if the *inner* header is UDP
>          */
> -       if (ipv4_tunnel &&
> -           (decoded.inner_prot != I40E_RX_PTYPE_INNER_PROT_UDP) &&
> -           !(rx_status & (1 << I40E_RX_DESC_STATUS_UDP_0_SHIFT))) {
> +       if (ipv4_tunnel) {
>                 skb->transport_header = skb->mac_header +
>                                         sizeof(struct ethhdr) +
>                                         (ip_hdr(skb)->ihl * 4);
> @@ -1337,15 +1335,19 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
>                                           skb->protocol == htons(ETH_P_8021AD))
>                                           ? VLAN_HLEN : 0;
>
> -               rx_udp_csum = udp_csum(skb);
> -               iph = ip_hdr(skb);
> -               csum = csum_tcpudp_magic(
> -                               iph->saddr, iph->daddr,
> -                               (skb->len - skb_transport_offset(skb)),
> -                               IPPROTO_UDP, rx_udp_csum);
> +               if ((ip_hdr(skb)->protocol == IPPROTO_UDP) &&
> +                   (udp_hdr(skb)->check != 0)) {
> +                       rx_udp_csum = udp_csum(skb);

Doesn't this compute the whole checksum of the packet making the fact
that device verified inner checksum pretty much irrelevant? It would
probably be just as well to return CHECKSUM_NONE and let the stack
deal with it and remove all this complexity.

> +                       iph = ip_hdr(skb);
> +                       csum = csum_tcpudp_magic(
> +                                       iph->saddr, iph->daddr,
> +                                       (skb->len - skb_transport_offset(skb)),
> +                                       IPPROTO_UDP, rx_udp_csum);
>
> -               if (udp_hdr(skb)->check != csum)
> -                       goto checksum_fail;
> +                       if (udp_hdr(skb)->check != csum)
> +                               goto checksum_fail;
> +
> +               } /* else its GRE and so no outer UDP header */
>         }
>
>         skb->ip_summed = CHECKSUM_UNNECESSARY;
> --
> 1.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [net v2 2/3] i40e: Fix Rx checksum error counter
  2015-01-07  5:43   ` Tom Herbert
@ 2015-01-07 20:14     ` Singhai, Anjali
  2015-01-07 21:34       ` Tom Herbert
  0 siblings, 1 reply; 9+ messages in thread
From: Singhai, Anjali @ 2015-01-07 20:14 UTC (permalink / raw)
  To: Tom Herbert, Kirsher, Jeffrey T
  Cc: David Miller, Linux Netdev List, nhorman, sassmann, jogreene,
	Rose, Gregory V

On Tue, 6 Jan 2015 21:43:57 -0800
Tom Herbert <therbert@google.com> wrote:
> > @@ -1337,15 +1335,19 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
> >                                           skb->protocol == htons(ETH_P_8021AD))
> >                                           ? VLAN_HLEN : 0;
> >
> > -               rx_udp_csum = udp_csum(skb);
> > -               iph = ip_hdr(skb);
> > -               csum = csum_tcpudp_magic(
> > -                               iph->saddr, iph->daddr,
> > -                               (skb->len - skb_transport_offset(skb)),
> > -                               IPPROTO_UDP, rx_udp_csum);
> > +               if ((ip_hdr(skb)->protocol == IPPROTO_UDP) &&
> > +                   (udp_hdr(skb)->check != 0)) {
> > +                       rx_udp_csum = udp_csum(skb);
> 
> Doesn't this compute the whole checksum of the packet making the fact 
> that device verified inner checksum pretty much irrelevant? It would 
> probably be just as well to return CHECKSUM_NONE and let the stack 
> deal with it and remove all this complexity.

This is only calculating outer UDP csum, inner csums are offloaded and so is the outer IP csum. Overall this is less work than asking the stack to do all of those by marking it as CHECKSUM_UNNECESSARY. We do have a patch in line to use csum_level but I believe even with that we would be asking the stack to do more work than necessary if we indicate that only inner checksums are offloaded. With our HW we are able to offload 3 out of 4 csums.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [net v2 2/3] i40e: Fix Rx checksum error counter
  2015-01-07 20:14     ` Singhai, Anjali
@ 2015-01-07 21:34       ` Tom Herbert
  2015-01-08 20:33         ` Singhai, Anjali
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Herbert @ 2015-01-07 21:34 UTC (permalink / raw)
  To: Singhai, Anjali
  Cc: Kirsher, Jeffrey T, David Miller, Linux Netdev List, nhorman,
	sassmann, jogreene, Rose, Gregory V

On Wed, Jan 7, 2015 at 12:14 PM, Singhai, Anjali
<anjali.singhai@intel.com> wrote:
> On Tue, 6 Jan 2015 21:43:57 -0800
> Tom Herbert <therbert@google.com> wrote:
>> > @@ -1337,15 +1335,19 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
>> >                                           skb->protocol == htons(ETH_P_8021AD))
>> >                                           ? VLAN_HLEN : 0;
>> >
>> > -               rx_udp_csum = udp_csum(skb);
>> > -               iph = ip_hdr(skb);
>> > -               csum = csum_tcpudp_magic(
>> > -                               iph->saddr, iph->daddr,
>> > -                               (skb->len - skb_transport_offset(skb)),
>> > -                               IPPROTO_UDP, rx_udp_csum);
>> > +               if ((ip_hdr(skb)->protocol == IPPROTO_UDP) &&
>> > +                   (udp_hdr(skb)->check != 0)) {
>> > +                       rx_udp_csum = udp_csum(skb);
>>
>> Doesn't this compute the whole checksum of the packet making the fact
>> that device verified inner checksum pretty much irrelevant? It would
>> probably be just as well to return CHECKSUM_NONE and let the stack
>> deal with it and remove all this complexity.
>
> This is only calculating outer UDP csum, inner csums are offloaded and so is the outer IP csum. Overall this is less work than asking the stack to do all of those by marking it as CHECKSUM_UNNECESSARY. We do have a patch in line to use csum_level but I believe even with that we would be asking the stack to do more work than necessary if we indicate that only inner checksums are offloaded. With our HW we are able to offload 3 out of 4 csums.

The stack will only compute the checksum over the packet zero or one
times. The fact that the driver is doing it instead of the stack makes
little difference, once the decision is made to checksum the packet
the performance gains of offloading are lost. If you defer to the
stack then this complex code is removed and there is the possibility
that the checksum doesn't even need to be calculated at all (like UDP
packet is bad or no listener for port).

What does "we are able to offload 3 out of 4 csums" mean?

Thanks,
Tom

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [net v2 2/3] i40e: Fix Rx checksum error counter
  2015-01-07 21:34       ` Tom Herbert
@ 2015-01-08 20:33         ` Singhai, Anjali
  0 siblings, 0 replies; 9+ messages in thread
From: Singhai, Anjali @ 2015-01-08 20:33 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Kirsher, Jeffrey T, David Miller, Linux Netdev List, nhorman,
	sassmann, jogreene, Rose, Gregory V

On Wed, Jan 7, 2015 at 12:14 PM, Singhai, Anjali <anjali.singhai@intel.com> wrote:
> On Tue, 6 Jan 2015 21:43:57 -0800
> Tom Herbert <therbert@google.com> wrote:
>>> > @@ -1337,15 +1335,19 @@ static inline void i40e_rx_checksum(struct i40e_vsi *vsi,
>>> >                                           skb->protocol == htons(ETH_P_8021AD))
>>> >                                           ? VLAN_HLEN : 0;
>>> >
>>> > -               rx_udp_csum = udp_csum(skb);
>>> > -               iph = ip_hdr(skb);
>>> > -               csum = csum_tcpudp_magic(
>>> > -                               iph->saddr, iph->daddr,
>>> > -                               (skb->len - skb_transport_offset(skb)),
>>> > -                               IPPROTO_UDP, rx_udp_csum);
>>> > +               if ((ip_hdr(skb)->protocol == IPPROTO_UDP) &&
>>> > +                   (udp_hdr(skb)->check != 0)) {
>>> > +                       rx_udp_csum = udp_csum(skb);
>>>
>>> Doesn't this compute the whole checksum of the packet making the 
>>> fact that device verified inner checksum pretty much irrelevant? It 
>>> would probably be just as well to return CHECKSUM_NONE and let the 
>>> stack deal with it and remove all this complexity.
>>
>> This is only calculating outer UDP csum, inner csums are offloaded and so is the outer IP csum. Overall this is less work than asking the stack to do all of those by marking it as CHECKSUM_UNNECESSARY. We do have a patch in line to use csum_level but I believe even with that we would be asking the stack to do more work than necessary if we indicate that only inner checksums are offloaded. With our HW we are able to offload 3 out of 4 csums.
>
>The stack will only compute the checksum over the packet zero or one times. The fact that the driver is doing it instead of the stack makes little difference, once the decision is made to checksum the packet the performance gains of offloading are lost. If you defer to the stack then this complex code is removed and there is the possibility that the checksum doesn't even need to be calculated at all (like UDP packet is bad or no listener for port).
>
>What does "we are able to offload 3 out of 4 csums" mean?

With Fortville, the HW lets you offload inner IP and TCP/UDP checksum as well as outer IP checksum. The one that is not offloaded is outer UDP. I guess in most cases, outer UDP checksum will be zero and we do check for that, in the less likely case where outer UDP  checksum is non-zero we could defer it to the stack.  I guess I will simplify the code, although this patch is just fixing bugs already present in the upstream driver.

Thanks
Anjali


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06
  2015-01-07  1:31 [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 Jeff Kirsher
                   ` (2 preceding siblings ...)
  2015-01-07  1:31 ` [net v2 3/3] i40e: Fix bug with TCP over IPv6 over VXLAN Jeff Kirsher
@ 2015-01-09  3:43 ` David Miller
  3 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2015-01-09  3:43 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue,  6 Jan 2015 17:31:54 -0800

> This series contains fixes to i40e only.
> 
> Jesse provides a fix for when the driver was polling with interrupts
> disabled the hardware would occasionally not write back descriptors.
> His fix causes the driver to detect this situation and force an interrupt
> to fire which will flush the stuck descriptor.
> 
> Anjali provides a couple of fixes, the first corrects an issue where
> the receive port checksum error counter was incrementing incorrectly with
> UDP encapsulated tunneled traffic.  The second fix resolves an issue where
> the driver was examining the outer protocol layer to set the inner protocol
> layer checksum offload.  In the case of TCP over IPv6 over an IPv4 based
> VXLAN, the inner checksum offloads would be set to look for IPv4/UDP
> instead of IPv6/TCP, so fixed the issue so that the driver will look at
> the proper layer for encapsulation offload settings.
> 
> v2: fixed a bug in patch 01 of the series, where the interrupt rate impacted
>     4 port workloads by reducing throughput.

Pulled, thanks Jeff.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-01-09  3:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-07  1:31 [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 Jeff Kirsher
2015-01-07  1:31 ` [net v2 1/3] i40e: fix un-necessary Tx hangs Jeff Kirsher
2015-01-07  1:31 ` [net v2 2/3] i40e: Fix Rx checksum error counter Jeff Kirsher
2015-01-07  5:43   ` Tom Herbert
2015-01-07 20:14     ` Singhai, Anjali
2015-01-07 21:34       ` Tom Herbert
2015-01-08 20:33         ` Singhai, Anjali
2015-01-07  1:31 ` [net v2 3/3] i40e: Fix bug with TCP over IPv6 over VXLAN Jeff Kirsher
2015-01-09  3:43 ` [net v2 0/3][pull request] Intel Wired LAN Driver Updates 2015-01-06 David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.