* [PATCH net-next V1 0/2] Helper to find length of headers in an ethernet frame
@ 2014-07-28 11:27 Amir Vadai
  2014-07-28 11:28 ` [PATCH net-next V1 1/2] net: Header length compution function Amir Vadai
  2014-07-28 11:28 ` [PATCH net-next V1 2/2] net/mlx4_en: Copy exact header to SKB linear part Amir Vadai
  0 siblings, 2 replies; 7+ messages in thread
From: Amir Vadai @ 2014-07-28 11:27 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, Alexander Duyck, netdev, Amir Vadai, Or Gerlitz,
	Yevgeny Petrilin, Ido Shamay

Hi,

This patchset is based on the patch suggested by Eric Dumazet to calculate
the length of the headers in an Ethernet frame using the flow dissector [1].
We tested it in the Mellanox lab: non-GRO traffic shows some bandwidth
improvement with the patch, and in any case the right approach is to copy the
exact header rather than an arbitrary number of bytes.
We also tested a built-in function (the original patch [2]). Eric's solution
gives the same numbers (even slightly better).

[1] - http://patchwork.ozlabs.org/patch/350413/
[2] - http://patchwork.ozlabs.org/patch/347031/

CC: Alexander Duyck <alexander.h.duyck@intel.com>

Changes from V0:
- Patch 2/2: net/mlx4_en: Copy exact header to SKB linear part
  - Fixed commit message
  - Added empty line after declaration

Amir Vadai (1):
  net/mlx4_en: Copy exact header to SKB linear part

Eric Dumazet (1):
  net: Header length compution function

 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 14 +++++++++-----
 include/linux/skbuff.h                     |  1 +
 net/core/flow_dissector.c                  | 24 ++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 5 deletions(-)

-- 
1.8.3.4

* [PATCH net-next V1 1/2] net: Header length compution function
  2014-07-28 11:27 [PATCH net-next V1 0/2] Helper to find length of headers in an ethernet frame Amir Vadai
@ 2014-07-28 11:28 ` Amir Vadai
  2014-07-28 14:50   ` Alexander Duyck
  2014-07-28 11:28 ` [PATCH net-next V1 2/2] net/mlx4_en: Copy exact header to SKB linear part Amir Vadai
  1 sibling, 1 reply; 7+ messages in thread
From: Amir Vadai @ 2014-07-28 11:28 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, Alexander Duyck, netdev, Amir Vadai, Or Gerlitz,
	Yevgeny Petrilin, Ido Shamay, Eric Dumazet

From: Eric Dumazet <eric.dumazet@gmail.com>

This commit is based on Eric Dumazet's suggestion.
Use flow dissector to calculate header length.
Tested the following with a mlx4, and it indeed speeds up GRE traffic,
as GRO packets can now contain 17 MSS instead of 8.
(Pulling payload means GRO had to use 2 'frags' per MSS)
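
As a rough illustration of the numbers above (an editorial sketch, not part
of the patch): assuming an skb has 17 frag slots, which is what MAX_SKB_FRAGS
works out to with 4 KiB pages, a GRO packet can aggregate 17 segments when
each MSS occupies one frag, but only 8 when each MSS needs two. The constant
SKETCH_MAX_SKB_FRAGS and the helper gro_max_segments() are hypothetical names
used only for this example.

```c
/* Assumption: frag slots per skb, as with 4 KiB pages (65536 / 4096 + 1). */
#define SKETCH_MAX_SKB_FRAGS 17u

/*
 * Hypothetical helper: how many MSS-sized segments fit into one GRO
 * packet when every segment consumes frags_per_mss frag slots.
 */
static unsigned int gro_max_segments(unsigned int frags_per_mss)
{
	if (frags_per_mss == 0)
		return 0;
	return SKETCH_MAX_SKB_FRAGS / frags_per_mss;
}
```

Calling gro_max_segments(2) models the case where payload was pulled into
the head, and gro_max_segments(1) the case where only the exact header was
pulled.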

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 include/linux/skbuff.h    |  1 +
 net/core/flow_dissector.c | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b613557..1f9af4d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3133,6 +3133,7 @@ bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off);
 int skb_checksum_setup(struct sk_buff *skb, bool recalculate);
 
 u32 __skb_get_poff(const struct sk_buff *skb);
+u32 eth_frame_headlen(void *data, unsigned int len);
 
 /**
  * skb_head_is_locked - Determine if the skb->head is locked down
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 5f362c1..c3afd27 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -343,6 +343,30 @@ u32 __skb_get_poff(const struct sk_buff *skb)
 	return poff;
 }
 
+
+/* Helper to find length of headers in an ethernet frame.
+ * This can help drivers to pull exact amount of bytes into
+ * skb->head to get optimal GRO performance.
+ * TODO: Could also return rxhash while we do a complete flow dissection.
+ */
+u32 eth_frame_headlen(void *data, unsigned int len)
+{
+	const struct ethhdr *eth = data;
+	struct sk_buff skb;
+
+	if (unlikely(len < ETH_HLEN))
+		return len;
+
+	skb.protocol = eth->h_proto;
+	skb.head = data + ETH_HLEN;
+	skb.data = skb.head;
+	skb_reset_network_header(&skb);
+	skb.len = len - ETH_HLEN;
+	skb.data_len = 0;
+	return __skb_get_poff(&skb) + ETH_HLEN;
+}
+EXPORT_SYMBOL(eth_frame_headlen);
+
 static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 {
 #ifdef CONFIG_XPS
-- 
1.8.3.4

* [PATCH net-next V1 2/2] net/mlx4_en: Copy exact header to SKB linear part
  2014-07-28 11:27 [PATCH net-next V1 0/2] Helper to find length of headers in an ethernet frame Amir Vadai
  2014-07-28 11:28 ` [PATCH net-next V1 1/2] net: Header length compution function Amir Vadai
@ 2014-07-28 11:28 ` Amir Vadai
  1 sibling, 0 replies; 7+ messages in thread
From: Amir Vadai @ 2014-07-28 11:28 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, Alexander Duyck, netdev, Amir Vadai, Or Gerlitz,
	Yevgeny Petrilin, Ido Shamay

Based on a patch from Eric Dumazet.
When copying the received packet header to the linear section of the SKB,
copy the exact header (best effort) rather than the maximum possible header
size, using the new network helper function eth_frame_headlen().
It returns the total length of the headers up to the last header it
recognizes.
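
The bookkeeping this change performs can be sketched in userspace as follows.
This is only an illustration under stated assumptions: struct sketch_skb,
struct sketch_frag and sketch_split_header() are hypothetical stand-ins for
the skb linear area, its first page fragment and the driver logic, and the
bounds checks are additions for the example rather than something the driver
is known to do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the first page fragment of an skb. */
struct sketch_frag {
	size_t page_offset;	/* where the fragment data starts in the page */
	size_t size;		/* number of bytes the fragment covers */
};

/* Hypothetical stand-in for the parts of an skb touched by the split. */
struct sketch_skb {
	unsigned char linear[256];	/* linear (head) buffer */
	size_t tail;			/* bytes used in the linear buffer */
	size_t data_len;		/* bytes held in fragments */
	struct sketch_frag frag0;	/* first fragment */
};

/*
 * Copy exactly hlen header bytes from va into the linear area and make
 * the first fragment start right after them.
 * Returns 0 on success, -1 if hlen does not fit.
 */
static int sketch_split_header(struct sketch_skb *skb, const void *va,
			       size_t length, size_t hlen)
{
	if (hlen > length || hlen > skb->frag0.size ||
	    hlen > sizeof(skb->linear) - skb->tail)
		return -1;

	/* Copy headers into the linear buffer */
	memcpy(skb->linear + skb->tail, va, hlen);
	skb->tail += hlen;

	/* Skip headers in the first fragment and shrink it accordingly */
	skb->frag0.page_offset += hlen;
	skb->frag0.size -= hlen;

	/* Everything after the headers stays in fragments */
	skb->data_len = length - hlen;
	return 0;
}
```

With a fixed copy size the same arithmetic applies, but payload bytes may end
up in the linear area whenever the real headers are shorter than that size.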

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 9c909d2..ca2cfbc 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -588,6 +588,8 @@ static struct sk_buff *mlx4_en_rx_skb(struct mlx4_en_priv *priv,
 		skb_copy_to_linear_data(skb, va, length);
 		skb->tail += length;
 	} else {
+		unsigned int hlen;
+
 		/* Move relevant fragments to skb */
 		used_frags = mlx4_en_complete_rx_desc(priv, rx_desc, frags,
 							skb, length);
@@ -597,16 +599,18 @@ static struct sk_buff *mlx4_en_rx_skb(struct mlx4_en_priv *priv,
 		}
 		skb_shinfo(skb)->nr_frags = used_frags;
 
+		hlen = eth_frame_headlen(va, length);
+
 		/* Copy headers into the skb linear buffer */
-		memcpy(skb->data, va, HEADER_COPY_SIZE);
-		skb->tail += HEADER_COPY_SIZE;
+		memcpy(skb->data, va, hlen);
+		skb->tail += hlen;
 
 		/* Skip headers in first fragment */
-		skb_shinfo(skb)->frags[0].page_offset += HEADER_COPY_SIZE;
+		skb_shinfo(skb)->frags[0].page_offset += hlen;
 
 		/* Adjust size of first fragment */
-		skb_frag_size_sub(&skb_shinfo(skb)->frags[0], HEADER_COPY_SIZE);
-		skb->data_len = length - HEADER_COPY_SIZE;
+		skb_frag_size_sub(&skb_shinfo(skb)->frags[0], hlen);
+		skb->data_len = length - hlen;
 	}
 	return skb;
 }
-- 
1.8.3.4

* Re: [PATCH net-next V1 1/2] net: Header length compution function
  2014-07-28 11:28 ` [PATCH net-next V1 1/2] net: Header length compution function Amir Vadai
@ 2014-07-28 14:50   ` Alexander Duyck
  2014-07-28 21:26     ` Cong Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2014-07-28 14:50 UTC (permalink / raw)
  To: Amir Vadai, David S. Miller
  Cc: Eric Dumazet, netdev, Or Gerlitz, Yevgeny Petrilin, Ido Shamay,
	Eric Dumazet

On 07/28/2014 04:28 AM, Amir Vadai wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> 
> This commit is based on Eric Dumazet suggestion.
> Use flow dissector to calculate header length.
> Tested the following with a mlx4, and it indeed speeds up GRE traffic,
> as GRO packets can now contain 17 MSS instead of 8.
> (Pulling payload means GRO had to use 2 'frags' per MSS)
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Amir Vadai <amirv@mellanox.com>
> ---
>  include/linux/skbuff.h    |  1 +
>  net/core/flow_dissector.c | 24 ++++++++++++++++++++++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index b613557..1f9af4d 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -3133,6 +3133,7 @@ bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off);
>  int skb_checksum_setup(struct sk_buff *skb, bool recalculate);
>  
>  u32 __skb_get_poff(const struct sk_buff *skb);
> +u32 eth_frame_headlen(void *data, unsigned int len);
>  
>  /**
>   * skb_head_is_locked - Determine if the skb->head is locked down
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 5f362c1..c3afd27 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -343,6 +343,30 @@ u32 __skb_get_poff(const struct sk_buff *skb)
>  	return poff;
>  }
>  
> +
> +/* Helper to find length of headers in an ethernet frame.
> + * This can help drivers to pull exact amount of bytes into
> + * skb->head to get optimal GRO performance.
> + * TODO: Could also return rxhash while we do a complete flow dissection.
> + */
> +u32 eth_frame_headlen(void *data, unsigned int len)
> +{
> +	const struct ethhdr *eth = data;
> +	struct sk_buff skb;
> +
> +	if (unlikely(len < ETH_HLEN))
> +		return len;
> +
> +	skb.protocol = eth->h_proto;
> +	skb.head = data + ETH_HLEN;
> +	skb.data = skb.head;
> +	skb_reset_network_header(&skb);
> +	skb.len = len - ETH_HLEN;
> +	skb.data_len = 0;
> +	return __skb_get_poff(&skb) + ETH_HLEN;
> +}

I'm still not a big fan of allocating an sk_buff on the stack.  Seems
like it isn't maintainable and really opens things up to possible issues
if someone ever extends the __skb_get_poff call.  But I'm not going to
force the issue since for now this isn't impacting igb or ixgbe.

Thanks,

Alex

* Re: [PATCH net-next V1 1/2] net: Header length compution function
  2014-07-28 14:50   ` Alexander Duyck
@ 2014-07-28 21:26     ` Cong Wang
  2014-07-28 22:42       ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2014-07-28 21:26 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Amir Vadai, David S. Miller, Eric Dumazet, netdev, Or Gerlitz,
	Yevgeny Petrilin, Ido Shamay, Eric Dumazet

On Mon, Jul 28, 2014 at 7:50 AM, Alexander Duyck
<alexander.h.duyck@intel.com> wrote:
> On 07/28/2014 04:28 AM, Amir Vadai wrote:
>> +u32 eth_frame_headlen(void *data, unsigned int len)
>> +{
>> +     const struct ethhdr *eth = data;
>> +     struct sk_buff skb;
>> +
>> +     if (unlikely(len < ETH_HLEN))
>> +             return len;
>> +
>> +     skb.protocol = eth->h_proto;
>> +     skb.head = data + ETH_HLEN;
>> +     skb.data = skb.head;
>> +     skb_reset_network_header(&skb);
>> +     skb.len = len - ETH_HLEN;
>> +     skb.data_len = 0;
>> +     return __skb_get_poff(&skb) + ETH_HLEN;
>> +}
>
> I'm still not a big fan of allocating an sk_buff on the stack.  Seems
> like it isn't maintainable and really opens things up to possible issues
> if someone ever extends the __skb_get_poff call.  But I'm not going to
> force the issue since for now this isn't impacting igb or ixgbe.
>

+1

I think you can refactor the code to pass all these inputs as
arguments instead of a whole skbuff.
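
A minimal userspace sketch of that idea, assuming a purely linear buffer and
only Ethernet, one optional 802.1Q tag, IPv4, and TCP or UDP. It is not the
flow dissector: sketch_frame_headlen(), sk_be16() and the SK_* constants are
hypothetical names for this example, and the real code covers more protocols
as well as fragmented SKBs.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_ETH_HLEN	14u
#define SK_VLAN_HLEN	4u
#define SK_ETH_P_IP	0x0800u
#define SK_ETH_P_8021Q	0x8100u
#define SK_IPPROTO_TCP	6u
#define SK_IPPROTO_UDP	17u

/* Read a big-endian 16-bit value from a byte pointer. */
static uint16_t sk_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Best-effort header length of an Ethernet frame held in a linear
 * buffer. Stops at the last header it could parse and never returns
 * more than len.
 */
static size_t sketch_frame_headlen(const void *data, size_t len)
{
	const uint8_t *p = data;
	size_t off = SK_ETH_HLEN;
	size_t ihl, doff;
	uint16_t proto, frag_off;
	uint8_t l4;

	if (len < SK_ETH_HLEN)
		return len;

	proto = sk_be16(p + 12);
	if (proto == SK_ETH_P_8021Q) {
		if (len < off + SK_VLAN_HLEN)
			return off;
		proto = sk_be16(p + off + 2);	/* encapsulated ethertype */
		off += SK_VLAN_HLEN;
	}
	if (proto != SK_ETH_P_IP)
		return off;

	/* IPv4: need a full minimal header and a sane version/IHL */
	if (len < off + 20 || (p[off] >> 4) != 4)
		return off;
	ihl = (size_t)(p[off] & 0x0f) * 4;
	if (ihl < 20 || len < off + ihl)
		return off;
	frag_off = sk_be16(p + off + 6) & 0x1fff;
	l4 = p[off + 9];
	off += ihl;

	/* Non-first IP fragments carry no transport header */
	if (frag_off)
		return off;

	if (l4 == SK_IPPROTO_TCP) {
		if (len < off + 20)
			return off;
		doff = (size_t)(p[off + 12] >> 4) * 4;
		if (doff < 20 || len < off + doff)
			return off;
		return off + doff;
	}
	if (l4 == SK_IPPROTO_UDP)
		return len < off + 8 ? off : off + 8;

	return off;
}
```

The function takes only the buffer pointer and its length, so no on-stack
sk_buff is involved; the cost is a second parser that has to be kept in step
with the flow dissector.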

* Re: [PATCH net-next V1 1/2] net: Header length compution function
  2014-07-28 21:26     ` Cong Wang
@ 2014-07-28 22:42       ` David Miller
  2014-07-28 23:08         ` Alexander Duyck
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2014-07-28 22:42 UTC (permalink / raw)
  To: cwang
  Cc: alexander.h.duyck, amirv, edumazet, netdev, ogerlitz, yevgenyp,
	idos, eric.dumazet

From: Cong Wang <cwang@twopensource.com>
Date: Mon, 28 Jul 2014 14:26:08 -0700

> On Mon, Jul 28, 2014 at 7:50 AM, Alexander Duyck
> <alexander.h.duyck@intel.com> wrote:
>> On 07/28/2014 04:28 AM, Amir Vadai wrote:
>>> +u32 eth_frame_headlen(void *data, unsigned int len)
>>> +{
>>> +     const struct ethhdr *eth = data;
>>> +     struct sk_buff skb;
>>> +
>>> +     if (unlikely(len < ETH_HLEN))
>>> +             return len;
>>> +
>>> +     skb.protocol = eth->h_proto;
>>> +     skb.head = data + ETH_HLEN;
>>> +     skb.data = skb.head;
>>> +     skb_reset_network_header(&skb);
>>> +     skb.len = len - ETH_HLEN;
>>> +     skb.data_len = 0;
>>> +     return __skb_get_poff(&skb) + ETH_HLEN;
>>> +}
>>
>> I'm still not a big fan of allocating an sk_buff on the stack.  Seems
>> like it isn't maintainable and really opens things up to possible issues
>> if someone ever extends the __skb_get_poff call.  But I'm not going to
>> force the issue since for now this isn't impacting igb or ixgbe.
>>
> 
> +1
> 
> I think you can refactor the code to pass all these input as
> arguments instead of a whole skbuff.

I was going to say the same thing, but if you take a look it's not so
simple.

The code currently handles fragmented SKBs just fine, so you'd have to
add a separate code path for purely linear buffers, which means code
duplication.

I'm still not sure what's better, to be honest.  Currently I'm leaning
towards allowing the version in this patch set.  Even though it's a bit
risky, this is in the fast path, so it perhaps warrants such tricks for
performance's sake.

* Re: [PATCH net-next V1 1/2] net: Header length compution function
  2014-07-28 22:42       ` David Miller
@ 2014-07-28 23:08         ` Alexander Duyck
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Duyck @ 2014-07-28 23:08 UTC (permalink / raw)
  To: David Miller, cwang
  Cc: amirv, edumazet, netdev, ogerlitz, yevgenyp, idos, eric.dumazet

On 07/28/2014 03:42 PM, David Miller wrote:
> From: Cong Wang <cwang@twopensource.com>
> Date: Mon, 28 Jul 2014 14:26:08 -0700
> 
>> On Mon, Jul 28, 2014 at 7:50 AM, Alexander Duyck
>> <alexander.h.duyck@intel.com> wrote:
>>> On 07/28/2014 04:28 AM, Amir Vadai wrote:
>>>> +u32 eth_frame_headlen(void *data, unsigned int len)
>>>> +{
>>>> +     const struct ethhdr *eth = data;
>>>> +     struct sk_buff skb;
>>>> +
>>>> +     if (unlikely(len < ETH_HLEN))
>>>> +             return len;
>>>> +
>>>> +     skb.protocol = eth->h_proto;
>>>> +     skb.head = data + ETH_HLEN;
>>>> +     skb.data = skb.head;
>>>> +     skb_reset_network_header(&skb);
>>>> +     skb.len = len - ETH_HLEN;
>>>> +     skb.data_len = 0;
>>>> +     return __skb_get_poff(&skb) + ETH_HLEN;
>>>> +}
>>>
>>> I'm still not a big fan of allocating an sk_buff on the stack.  Seems
>>> like it isn't maintainable and really opens things up to possible issues
>>> if someone ever extends the __skb_get_poff call.  But I'm not going to
>>> force the issue since for now this isn't impacting igb or ixgbe.
>>>
>>
>> +1
>>
>> I think you can refactor the code to pass all these input as
>> arguments instead of a whole skbuff.
> 
> I was going to say the same thing, but if you take a look it's not so
> simple.
> 
> The code currently handles fragmented SKBs just fine, and you'd
> therefore have to make a seperate code path for purely linear buffers,
> and thus code duplication.
> 
> I'm still not sure what's better, to be honest.  Currently I'm leaning
> towards allowing the version in this patch set, even though it's a bit
> risky this is in the fast path so perhaps warrants such tricks for
> performance's sake.
> 

I think if anything it might be nice to add some warnings to
__skb_get_poff and skb_flow_dissect pointing back to this code, so that
anyone thinking of optimizing them is aware that there is a caller that
provides partially initialized and incomplete SKBs.

Thanks,

Alex
