All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
@ 2016-01-31 21:37 Tom Herbert
  2016-01-31 21:47 ` kbuild test robot
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Tom Herbert @ 2016-01-31 21:37 UTC (permalink / raw)
  To: davem, netdev; +Cc: sowmini.varadhan, kernel-team

Call get_unaligned_be32 when we access 32-bit fields in
__skb_flow_dissect. At the beginning check for unlikely case of
1-byte aligned packet.

Note that flow_dissector may be asked to parse packet unaligned
fields in two instances:

1) Packet from a driver which is aligned to Ethernet header
   (2-byte alignment)
2) Parsing inner headers of a received GRE-TEB packet

Testing: Ran super_netperf tests did not see a regression. This was on
x86 which does not have problems with unaligned data.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 net/core/flow_dissector.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index d79699c..1c64a1a 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -95,7 +95,7 @@ __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
 		ports = __skb_header_pointer(skb, thoff + poff,
 					     sizeof(_ports), data, hlen, &_ports);
 		if (ports)
-			return *ports;
+			return get_unaligned_be32(ports);
 	}
 
 	return 0;
@@ -116,6 +116,10 @@ EXPORT_SYMBOL(__skb_flow_get_ports);
  * by flow_dissector from either the skbuff or a raw buffer specified by the
  * rest parameters.
  *
+ * This function does not assume 4-byte alignment, but it does assume 2-byte
+ * alignment (false is returned for 1-byte alignment). get_unaligned_be32
+ * is called to get thirty-two values out of the packet.
+ *
  * Caller must take care of zeroing target container memory.
  */
 bool __skb_flow_dissect(const struct sk_buff *skb,
@@ -140,6 +144,11 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 		hlen = skb_headlen(skb);
 	}
 
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+	if (WARN_ON(((u64)data & 0x1)))
+		return false;
+#endif
+
 	/* It is ensured by skb_flow_dissector_init() that control key will
 	 * be always present.
 	 */
@@ -230,7 +239,7 @@ ipv6:
 			key_control->addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
 		}
 
-		flow_label = ip6_flowlabel(iph);
+		flow_label = get_unaligned_be32(iph) & IPV6_FLOWLABEL_MASK;
 		if (flow_label) {
 			if (dissector_uses_key(flow_dissector,
 					       FLOW_DISSECTOR_KEY_FLOW_LABEL)) {
@@ -303,7 +312,7 @@ ipv6:
 			key_addrs = skb_flow_dissector_target(flow_dissector,
 							      FLOW_DISSECTOR_KEY_TIPC_ADDRS,
 							      target_container);
-			key_addrs->tipcaddrs.srcnode = hdr->srcnode;
+			key_addrs->tipcaddrs.srcnode = get_unaligned_be32(&hdr->srcnode);
 			key_control->addr_type = FLOW_DISSECTOR_KEY_TIPC_ADDRS;
 		}
 		goto out_good;
@@ -325,7 +334,7 @@ mpls:
 				key_keyid = skb_flow_dissector_target(flow_dissector,
 								      FLOW_DISSECTOR_KEY_MPLS_ENTROPY,
 								      target_container);
-				key_keyid->keyid = hdr[1].entry &
+				key_keyid->keyid = get_unaligned_be32(&hdr[1].entry) &
 					htonl(MPLS_LS_LABEL_MASK);
 			}
 
@@ -379,7 +388,7 @@ ip_proto_again:
 				key_keyid = skb_flow_dissector_target(flow_dissector,
 								      FLOW_DISSECTOR_KEY_GRE_KEYID,
 								      target_container);
-				key_keyid->keyid = *keyid;
+				key_keyid->keyid = get_unaligned_be32(keyid);
 			}
 			nhoff += 4;
 		}
-- 
2.4.6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-01-31 21:37 [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers Tom Herbert
@ 2016-01-31 21:47 ` kbuild test robot
  2016-01-31 22:06 ` Florian Westphal
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: kbuild test robot @ 2016-01-31 21:47 UTC (permalink / raw)
  To: Tom Herbert; +Cc: kbuild-all, davem, netdev, sowmini.varadhan, kernel-team

[-- Attachment #1: Type: text/plain, Size: 7523 bytes --]

Hi Tom,

[auto build test WARNING on net/master]

url:    https://github.com/0day-ci/linux/commits/Tom-Herbert/net-Allow-flow-dissector-to-handle-non-4-byte-aligned-headers/20160201-053832
config: xtensa-allyesconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=xtensa 

All warnings (new ones prefixed by >>):

   In file included from arch/xtensa/include/generated/asm/bug.h:1:0,
                    from include/linux/bug.h:4,
                    from include/linux/thread_info.h:11,
                    from include/asm-generic/preempt.h:4,
                    from arch/xtensa/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:59,
                    from include/linux/spinlock.h:50,
                    from include/linux/mm_types.h:8,
                    from include/linux/kmemcheck.h:4,
                    from include/linux/skbuff.h:18,
                    from net/core/flow_dissector.c:2:
   net/core/flow_dissector.c: In function '__skb_flow_dissect':
>> net/core/flow_dissector.c:148:15: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
     if (WARN_ON(((u64)data & 0x1)))
                  ^
   include/asm-generic/bug.h:86:25: note: in definition of macro 'WARN_ON'
     int __ret_warn_on = !!(condition);    \
                            ^

vim +148 net/core/flow_dissector.c

     1	#include <linux/kernel.h>
   > 2	#include <linux/skbuff.h>
     3	#include <linux/export.h>
     4	#include <linux/ip.h>
     5	#include <linux/ipv6.h>
     6	#include <linux/if_vlan.h>
     7	#include <net/ip.h>
     8	#include <net/ipv6.h>
     9	#include <linux/igmp.h>
    10	#include <linux/icmp.h>
    11	#include <linux/sctp.h>
    12	#include <linux/dccp.h>
    13	#include <linux/if_tunnel.h>
    14	#include <linux/if_pppox.h>
    15	#include <linux/ppp_defs.h>
    16	#include <linux/stddef.h>
    17	#include <linux/if_ether.h>
    18	#include <linux/mpls.h>
    19	#include <net/flow_dissector.h>
    20	#include <scsi/fc/fc_fcoe.h>
    21	
    22	static bool dissector_uses_key(const struct flow_dissector *flow_dissector,
    23				       enum flow_dissector_key_id key_id)
    24	{
    25		return flow_dissector->used_keys & (1 << key_id);
    26	}
    27	
    28	static void dissector_set_key(struct flow_dissector *flow_dissector,
    29				      enum flow_dissector_key_id key_id)
    30	{
    31		flow_dissector->used_keys |= (1 << key_id);
    32	}
    33	
    34	static void *skb_flow_dissector_target(struct flow_dissector *flow_dissector,
    35					       enum flow_dissector_key_id key_id,
    36					       void *target_container)
    37	{
    38		return ((char *) target_container) + flow_dissector->offset[key_id];
    39	}
    40	
    41	void skb_flow_dissector_init(struct flow_dissector *flow_dissector,
    42				     const struct flow_dissector_key *key,
    43				     unsigned int key_count)
    44	{
    45		unsigned int i;
    46	
    47		memset(flow_dissector, 0, sizeof(*flow_dissector));
    48	
    49		for (i = 0; i < key_count; i++, key++) {
    50			/* User should make sure that every key target offset is withing
    51			 * boundaries of unsigned short.
    52			 */
    53			BUG_ON(key->offset > USHRT_MAX);
    54			BUG_ON(dissector_uses_key(flow_dissector,
    55						  key->key_id));
    56	
    57			dissector_set_key(flow_dissector, key->key_id);
    58			flow_dissector->offset[key->key_id] = key->offset;
    59		}
    60	
    61		/* Ensure that the dissector always includes control and basic key.
    62		 * That way we are able to avoid handling lack of these in fast path.
    63		 */
    64		BUG_ON(!dissector_uses_key(flow_dissector,
    65					   FLOW_DISSECTOR_KEY_CONTROL));
    66		BUG_ON(!dissector_uses_key(flow_dissector,
    67					   FLOW_DISSECTOR_KEY_BASIC));
    68	}
    69	EXPORT_SYMBOL(skb_flow_dissector_init);
    70	
    71	/**
    72	 * __skb_flow_get_ports - extract the upper layer ports and return them
    73	 * @skb: sk_buff to extract the ports from
    74	 * @thoff: transport header offset
    75	 * @ip_proto: protocol for which to get port offset
    76	 * @data: raw buffer pointer to the packet, if NULL use skb->data
    77	 * @hlen: packet header length, if @data is NULL use skb_headlen(skb)
    78	 *
    79	 * The function will try to retrieve the ports at offset thoff + poff where poff
    80	 * is the protocol port offset returned from proto_ports_offset
    81	 */
    82	__be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
    83				    void *data, int hlen)
    84	{
    85		int poff = proto_ports_offset(ip_proto);
    86	
    87		if (!data) {
    88			data = skb->data;
    89			hlen = skb_headlen(skb);
    90		}
    91	
    92		if (poff >= 0) {
    93			__be32 *ports, _ports;
    94	
    95			ports = __skb_header_pointer(skb, thoff + poff,
    96						     sizeof(_ports), data, hlen, &_ports);
    97			if (ports)
    98				return get_unaligned_be32(ports);
    99		}
   100	
   101		return 0;
   102	}
   103	EXPORT_SYMBOL(__skb_flow_get_ports);
   104	
   105	/**
   106	 * __skb_flow_dissect - extract the flow_keys struct and return it
   107	 * @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
   108	 * @flow_dissector: list of keys to dissect
   109	 * @target_container: target structure to put dissected values into
   110	 * @data: raw buffer pointer to the packet, if NULL use skb->data
   111	 * @proto: protocol for which to get the flow, if @data is NULL use skb->protocol
   112	 * @nhoff: network header offset, if @data is NULL use skb_network_offset(skb)
   113	 * @hlen: packet header length, if @data is NULL use skb_headlen(skb)
   114	 *
   115	 * The function will try to retrieve individual keys into target specified
   116	 * by flow_dissector from either the skbuff or a raw buffer specified by the
   117	 * rest parameters.
   118	 *
   119	 * This function does not assume 4-byte alignment, but it does assume 2-byte
   120	 * alignment (false is returned for 1-byte alignment). get_unaligned_be32
   121	 * is called to get thirty-two values out of the packet.
   122	 *
   123	 * Caller must take care of zeroing target container memory.
   124	 */
   125	bool __skb_flow_dissect(const struct sk_buff *skb,
   126				struct flow_dissector *flow_dissector,
   127				void *target_container,
   128				void *data, __be16 proto, int nhoff, int hlen,
   129				unsigned int flags)
   130	{
   131		struct flow_dissector_key_control *key_control;
   132		struct flow_dissector_key_basic *key_basic;
   133		struct flow_dissector_key_addrs *key_addrs;
   134		struct flow_dissector_key_ports *key_ports;
   135		struct flow_dissector_key_tags *key_tags;
   136		struct flow_dissector_key_keyid *key_keyid;
   137		u8 ip_proto = 0;
   138		bool ret = false;
   139	
   140		if (!data) {
   141			data = skb->data;
   142			proto = skb->protocol;
   143			nhoff = skb_network_offset(skb);
   144			hlen = skb_headlen(skb);
   145		}
   146	
   147	#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 > 148		if (WARN_ON(((u64)data & 0x1)))
   149			return false;
   150	#endif
   151	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 44040 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-01-31 21:37 [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers Tom Herbert
  2016-01-31 21:47 ` kbuild test robot
@ 2016-01-31 22:06 ` Florian Westphal
  2016-02-01  0:24 ` Eric Dumazet
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2016-01-31 22:06 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, sowmini.varadhan, kernel-team

Tom Herbert <tom@herbertland.com> wrote:
> Call get_unaligned_be32 when we access 32-bit fields in
> __skb_flow_dissect. At the beginning check for unlikely case of
> 1-byte aligned packet.
> 
> Note that flow_dissector may be asked to parse packet unaligned
> fields in two instances:
> 
> 1) Packet from a driver which is aligned to Ethernet header
>    (2-byte alignment)
> 2) Parsing inner headers of a received GRE-TEB packet
> 
> Testing: Ran super_netperf tests did not see a regression. This was on
> x86 which does not have problems with unaligned data.
> 
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
>  net/core/flow_dissector.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index d79699c..1c64a1a 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -95,7 +95,7 @@ __be32 __skb_flow_get_ports(const struct sk_buff *skb, int thoff, u8 ip_proto,
>  		ports = __skb_header_pointer(skb, thoff + poff,
>  					     sizeof(_ports), data, hlen, &_ports);
>  		if (ports)
> -			return *ports;
> +			return get_unaligned_be32(ports);

This adds an ntohl(), is that intended (same for other places)?

> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +	if (WARN_ON(((u64)data & 0x1)))
> +		return false;
> +#endif

WARN_ON_ONCE ?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-01-31 21:37 [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers Tom Herbert
  2016-01-31 21:47 ` kbuild test robot
  2016-01-31 22:06 ` Florian Westphal
@ 2016-02-01  0:24 ` Eric Dumazet
  2016-02-01  0:39   ` Florian Fainelli
  2016-02-01  0:43   ` Sowmini Varadhan
  2016-02-01 12:32 ` Sergei Shtylyov
  2016-02-02  0:31 ` Sowmini Varadhan
  4 siblings, 2 replies; 18+ messages in thread
From: Eric Dumazet @ 2016-02-01  0:24 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, sowmini.varadhan, kernel-team

On Sun, 2016-01-31 at 13:37 -0800, Tom Herbert wrote:
> Call get_unaligned_be32 when we access 32-bit fields in
> __skb_flow_dissect. At the beginning check for unlikely case of
> 1-byte aligned packet.
> 
> Note that flow_dissector may be asked to parse packet unaligned
> fields in two instances:
> 
> 1) Packet from a driver which is aligned to Ethernet header
>    (2-byte alignment)
> 2) Parsing inner headers of a received GRE-TEB packet
> 
> Testing: Ran super_netperf tests did not see a regression. This was on
> x86 which does not have problems with unaligned data.

But this test is absolutely useless, what about testing arches that
actually care ?

I am told all these MIPS based boxes have already not enough cpu power.

It is sad, because none of them use the drivers that might call flow
dissection (mlx4 and 4 intel drivers)

So I would rather fix the cases where flow dissection called from
eth_get_headlen() with non aligned stuff.

And maybe restrict GRE-TEB to platforms with
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y, since otherwise we need to add
these unaligned macros in thousands of places in our stacks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-01  0:24 ` Eric Dumazet
@ 2016-02-01  0:39   ` Florian Fainelli
  2016-02-01 15:20     ` Nicolas Dichtel
  2016-02-01  0:43   ` Sowmini Varadhan
  1 sibling, 1 reply; 18+ messages in thread
From: Florian Fainelli @ 2016-02-01  0:39 UTC (permalink / raw)
  To: Eric Dumazet, Tom Herbert; +Cc: davem, netdev, sowmini.varadhan, kernel-team

Le 31/01/2016 16:24, Eric Dumazet a écrit :
> On Sun, 2016-01-31 at 13:37 -0800, Tom Herbert wrote:
>> Call get_unaligned_be32 when we access 32-bit fields in
>> __skb_flow_dissect. At the beginning check for unlikely case of
>> 1-byte aligned packet.
>>
>> Note that flow_dissector may be asked to parse packet unaligned
>> fields in two instances:
>>
>> 1) Packet from a driver which is aligned to Ethernet header
>>    (2-byte alignment)
>> 2) Parsing inner headers of a received GRE-TEB packet
>>
>> Testing: Ran super_netperf tests did not see a regression. This was on
>> x86 which does not have problems with unaligned data.
> 
> But this test is absolutely useless, what about testing arches that
> actually care ?
> 
> I am told all these MIPS based boxes have already not enough cpu power.

How about the Cavium OCTEON family and Broadcom/Netlogic XLR/XLP, those
are massively multi-core and MIPS64 capable, even though they may not
always run a Linux networking stack, some do.

There are also plenty of ARMv7/ARMv8 devices out there that would
benefit from proper alignment some might end-up using mlx4/5 and intel
cards.

> 
> It is sad, because none of them use the drivers that might call flow
> dissection (mlx4 and 4 intel drivers)
> 
> So I would rather fix the cases where flow dissection called from
> eth_get_headlen() with non aligned stuff.
> 
> And maybe restrict GRE-TEB to platforms with
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y, since otherwise we need to add
> these unaligned macros in thousands of places in our stacks.
> 
> 
> 


-- 
Florian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-01  0:24 ` Eric Dumazet
  2016-02-01  0:39   ` Florian Fainelli
@ 2016-02-01  0:43   ` Sowmini Varadhan
  1 sibling, 0 replies; 18+ messages in thread
From: Sowmini Varadhan @ 2016-02-01  0:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Tom Herbert, davem, netdev, kernel-team

On (01/31/16 16:24), Eric Dumazet wrote:
> 
> But this test is absolutely useless, what about testing arches that
> actually care ?
> 

Yes, I plan to help test this out tomorrow, Tom suggested setting up
gre-teb between x86 and sparc. 

--Sowmini

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-01-31 21:37 [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers Tom Herbert
                   ` (2 preceding siblings ...)
  2016-02-01  0:24 ` Eric Dumazet
@ 2016-02-01 12:32 ` Sergei Shtylyov
  2016-02-02  0:31 ` Sowmini Varadhan
  4 siblings, 0 replies; 18+ messages in thread
From: Sergei Shtylyov @ 2016-02-01 12:32 UTC (permalink / raw)
  To: Tom Herbert, davem, netdev; +Cc: sowmini.varadhan, kernel-team

Hello.

On 2/1/2016 12:37 AM, Tom Herbert wrote:

> Call get_unaligned_be32 when we access 32-bit fields in
> __skb_flow_dissect. At the beginning check for unlikely case of
> 1-byte aligned packet.
>
> Note that flow_dissector may be asked to parse packet unaligned
> fields in two instances:
>
> 1) Packet from a driver which is aligned to Ethernet header
>     (2-byte alignment)
> 2) Parsing inner headers of a received GRE-TEB packet
>
> Testing: Ran super_netperf tests did not see a regression. This was on
> x86 which does not have problems with unaligned data.
>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
>   net/core/flow_dissector.c | 19 ++++++++++++++-----
>   1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index d79699c..1c64a1a 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
[...]
> @@ -116,6 +116,10 @@ EXPORT_SYMBOL(__skb_flow_get_ports);
>    * by flow_dissector from either the skbuff or a raw buffer specified by the
>    * rest parameters.
>    *
> + * This function does not assume 4-byte alignment, but it does assume 2-byte
> + * alignment (false is returned for 1-byte alignment). get_unaligned_be32
> + * is called to get thirty-two values out of the packet.

    32-bit, maybe?

[...]

MBR, Sergei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-01  0:39   ` Florian Fainelli
@ 2016-02-01 15:20     ` Nicolas Dichtel
  2016-02-01 16:01       ` Sowmini Varadhan
  0 siblings, 1 reply; 18+ messages in thread
From: Nicolas Dichtel @ 2016-02-01 15:20 UTC (permalink / raw)
  To: Florian Fainelli, Eric Dumazet, Tom Herbert
  Cc: davem, netdev, sowmini.varadhan, kernel-team

Le 01/02/2016 01:39, Florian Fainelli a écrit :
> Le 31/01/2016 16:24, Eric Dumazet a écrit :
>> On Sun, 2016-01-31 at 13:37 -0800, Tom Herbert wrote:
>>> Call get_unaligned_be32 when we access 32-bit fields in
>>> __skb_flow_dissect. At the beginning check for unlikely case of
>>> 1-byte aligned packet.
>>>
>>> Note that flow_dissector may be asked to parse packet unaligned
>>> fields in two instances:
>>>
>>> 1) Packet from a driver which is aligned to Ethernet header
>>>     (2-byte alignment)
>>> 2) Parsing inner headers of a received GRE-TEB packet
>>>
>>> Testing: Ran super_netperf tests did not see a regression. This was on
>>> x86 which does not have problems with unaligned data.
>>
>> But this test is absolutely useless, what about testing arches that
>> actually care ?
>>
>> I am told all these MIPS based boxes have already not enough cpu power.
>
> How about the Cavium OCTEON family and Broadcom/Netlogic XLR/XLP, those
> are massively multi-core and MIPS64 capable, even though they may not
> always run a Linux networking stack, some do.
>
> There are also plenty of ARMv7/ARMv8 devices out there that would
> benefit from proper alignment some might end-up using mlx4/5 and intel
> cards.
There is also the tile architecture, up to 76 cores on the board I've seen. It
requires an alignment on 8!
I wonder how this case may be properly handled. A simple ipip scenario fails.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-01 15:20     ` Nicolas Dichtel
@ 2016-02-01 16:01       ` Sowmini Varadhan
  0 siblings, 0 replies; 18+ messages in thread
From: Sowmini Varadhan @ 2016-02-01 16:01 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: Florian Fainelli, Eric Dumazet, Tom Herbert, davem, netdev, kernel-team

On (02/01/16 16:20), Nicolas Dichtel wrote:
> There is also the tile architecture, up to 76 cores on the board I've seen. It
> requires an alignment on 8!
> I wonder how this case may be properly handled. A simple ipip scenario fails.

Yes, I'm also observing the same thing. Simply setting up gretap like this
results in

# ip link add gretap2 type gretap local myaddr remote hisaddr
# ifconfig gretap2 up

Kernel unaligned access at TPC[969d88] iptunnel_xmit+0xa4/0x1c0
Kernel unaligned access at TPC[969db0] iptunnel_xmit+0xcc/0x1c0
Kernel unaligned access at TPC[969dc4] iptunnel_xmit+0xe0/0x1c0
Kernel unaligned access at TPC[969dc8] iptunnel_xmit+0xe4/0x1c0
Kernel unaligned access at TPC[9191fc] __ip_select_ident+0x80/0x13c

and afacit, the unaligned ip header is triggered by an inner ipv6 packet
that is usually coming from mld or dad.

(and fwiw, the basic iperf/ipv6 test is fine with Tom's patch, need
to try out some of the other permutations without rps etc)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-01-31 21:37 [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers Tom Herbert
                   ` (3 preceding siblings ...)
  2016-02-01 12:32 ` Sergei Shtylyov
@ 2016-02-02  0:31 ` Sowmini Varadhan
  2016-02-02  0:46   ` Tom Herbert
                     ` (2 more replies)
  4 siblings, 3 replies; 18+ messages in thread
From: Sowmini Varadhan @ 2016-02-02  0:31 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, kernel-team

On (01/31/16 13:37), Tom Herbert wrote:
> 
> Call get_unaligned_be32 when we access 32-bit fields in
> __skb_flow_dissect. At the beginning check for unlikely case of
> 1-byte aligned packet.
> 
> Note that flow_dissector may be asked to parse packet unaligned
> fields in two instances:

I tested this out with the following setup (hostnames in quotes,
network connectivity in the line above)

                                      gre-tunnel 
     (eth1) --- (eth1)      (eth0) ... {routed} ...  (eth0) 
   "sparc44"         "sparc45"                      "x86-1"


eth1 connections are point-to-point ethernets.
The GRE tunnel is configured with some non-zero gre key.

So sparc45 bridges sparc44 to x86-1 and has

  sparc45# brctl show br0
  bridge name     bridge id               STP enabled     interfaces
  br0             8000.0010e0570711       no              eth1
                                                          gretap2


I tried various pings etc across the nodes, and made sure that
(excluding the mpls change)  all of the changes in Tom's patch
worked fine. To do this, I had to enable rps and disable rxhash/ntuple
in my intel driver. 

And then I noticed that the iph->saddr access does indeed result in
an unaligned access (because this time it gets called with
flow_keys_dissector). The other thing that generated unaligned access
was the attempt to do this 
               if (proto == htons(ETH_P_TEB)) {
			:
                        eth = __skb_header_pointer(skb, nhoff,
                                                   sizeof(_eth),
                                                   data, hlen, &_eth);
			:
                        proto = eth->h_proto;
                }
which I fixed thus:

@@ -394,7 +407,7 @@ ip_proto_again:
                                                   data, hlen, &_eth);
                        if (!eth)
                                goto out_bad;
-                       proto = eth->h_proto;
+                       proto = get_unaligned_be16(&eth->h_proto);
                        nhoff += sizeof(*eth);
                }

Having said all this, the larger problem remains:

There are a number of unaligned access on the xmit path, I've listed
a few below. A lot of it is coming of gre_tap_xmit not setting up 
offsets properly, but there are others.
 
In general, since the actual path taken by the packet (i.e., 
what driver gets to process this next, what types/sizes of headers 
will get added, etc) is determined dynamically, it's not enough to just
do a one-time skb_reserve (or page_offset) for NET_IP_ALIGN.

Kernel unaligned access at TPC[9150dc] ipv4_neigh_lookup+0x38/0x170
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E   4.4.0-dissect+ #19
Call Trace:
 [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
 [0000000000427ef8] sun4v_do_mna+0x58/0x9c
 [0000000000406d30] sun4v_mna+0x5c/0x68
 [00000000009150dc] ipv4_neigh_lookup+0x38/0x170
 [00000000008ead10] neigh_update+0x450/0x5cc
 [00000000008eb8b8] neigh_event_ns+0x84/0x94
 [00000000009518f8] arp_process+0x470/0x854
 [0000000000951e78] arp_rcv+0x19c/0x1dc
 [00000000008d647c] __netif_receive_skb_core+0xa24/0xa58
 [00000000008d6510] __netif_receive_skb+0x60/0x74
 [00000000008d65dc] process_backlog+0xb8/0x168
 [00000000008dc9f8] napi_poll+0x74/0x1c8
 [00000000008dcc20] net_rx_action+0xd4/0x1d4
 [00000000004633a4] __do_softirq+0x11c/0x28c
 [000000000042abb0] do_softirq_own_stack+0x34/0x48
 [0000000000463050] irq_exit+0x48/0x60


Kernel unaligned access at TPC[103e720c] build_header+0x64/0xc4 [ip_gre]
CPU: 44 PID: 0 Comm: swapper/44 Tainted: G            E   4.4.0-dissect+ #19
Call Trace:
 [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
 [0000000000427ef8] sun4v_do_mna+0x58/0x9c
 [0000000000406d30] sun4v_mna+0x5c/0x68
 [00000000103e720c] build_header+0x64/0xc4 [ip_gre]
 [00000000103e76a4] __gre_xmit+0x40/0x7c [ip_gre]
 [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
 [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
 [0000000000904d30] sch_direct_xmit+0x74/0x1dc
 [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
 [00000000008df45c] dev_queue_xmit+0x10/0x24
 [0000000010207654] ip6_finish_output2+0x5b4/0x6b0 [ipv6]
 [0000000010207954] ip6_finish_output+0x204/0x218 [ipv6]
 [0000000010207afc] ip6_output+0x194/0x1a8 [ipv6]
 [000000001021f97c] ndisc_send_skb+0x390/0x438 [ipv6]
 [000000001021ff14] ndisc_send_rs+0xe0/0xfc [ipv6]
 [000000001020b190] addrconf_rs_timer+0xc4/0x124 [ipv6]


Kernel unaligned access at TPC[969ee0] iptunnel_xmit+0xcc/0x1c0
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E   4.4.0-dissect+ #19
Call Trace:
 [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
 [0000000000427ef8] sun4v_do_mna+0x58/0x9c
 [0000000000406d30] sun4v_mna+0x5c/0x68
 [0000000000969ee0] iptunnel_xmit+0xcc/0x1c0
 [00000000103dc9dc] ip_tunnel_xmit+0xb78/0xc2c [ip_tunnel]
 [00000000103e76d0] __gre_xmit+0x6c/0x7c [ip_gre]
 [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
 [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
 [0000000000904d30] sch_direct_xmit+0x74/0x1dc
 [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
 [00000000008df45c] dev_queue_xmit+0x10/0x24
 [00000000104034b8] br_dev_queue_push_xmit+0x1d8/0x1fc [bridge]
 [000000001040355c] br_forward_finish+0x80/0x94 [bridge]
 [0000000010403684] __br_forward+0x114/0x124 [bridge]
 [0000000010403700] br_forward+0x6c/0x94 [bridge]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-02  0:31 ` Sowmini Varadhan
@ 2016-02-02  0:46   ` Tom Herbert
  2016-02-02  3:56   ` Alexander Duyck
  2016-02-03 17:07   ` Tom Herbert
  2 siblings, 0 replies; 18+ messages in thread
From: Tom Herbert @ 2016-02-02  0:46 UTC (permalink / raw)
  To: Sowmini Varadhan
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Mon, Feb 1, 2016 at 4:31 PM, Sowmini Varadhan
<sowmini.varadhan@oracle.com> wrote:
> On (01/31/16 13:37), Tom Herbert wrote:
>>
>> Call get_unaligned_be32 when we access 32-bit fields in
>> __skb_flow_dissect. At the beginning check for unlikely case of
>> 1-byte aligned packet.
>>
>> Note that flow_dissector may be asked to parse packet unaligned
>> fields in two instances:
>
> I tested this out with the following setup (hostnames in quotes,
> network connectivity in the line above)
>
>                                       gre-tunnel
>      (eth1) --- (eth1)      (eth0) ... {routed} ...  (eth0)
>    "sparc44"         "sparc45"                      "x86-1"
>
>
> eth1 connections are point-to-point ethernets.
> The GRE tunnel is configured with some non-zero gre key.
>
> So sparc45 bridges sparc44 to x86-1 and has
>
>   sparc45# brctl show br0
>   bridge name     bridge id               STP enabled     interfaces
>   br0             8000.0010e0570711       no              eth1
>                                                           gretap2
>
>
> I tried various pings etc across the nodes, and made sure that
> (excluding the mpls change)  all of the changes in Tom's patch
> worked fine. To do this, I had to enable rps and disable rxhash/ntuple
> in my intel driver.
>
> And then I noticed that the iph->saddr access does indeed result in
> an unaligned access (because this time it gets called with
> flow_keys_dissector). The other thing that generated unaligned access
> was the attempt to do this
>                if (proto == htons(ETH_P_TEB)) {
>                         :
>                         eth = __skb_header_pointer(skb, nhoff,
>                                                    sizeof(_eth),
>                                                    data, hlen, &_eth);
>                         :
>                         proto = eth->h_proto;
>                 }
> which I fixed thus:
>
> @@ -394,7 +407,7 @@ ip_proto_again:
>                                                    data, hlen, &_eth);
>                         if (!eth)
>                                 goto out_bad;
> -                       proto = eth->h_proto;
> +                       proto = get_unaligned_be16(&eth->h_proto);
>                         nhoff += sizeof(*eth);
>                 }
>
That's bad, it would seem imply that eth->h_proto has an odd byte
alignment somehow.

> Having said all this, the larger problem remains:
>
> There are a number of unaligned access on the xmit path, I've listed
> a few below. A lot of it is coming of gre_tap_xmit not setting up
> offsets properly, but there are others.
>
It's going to be a problem both the xmit path and rx path for GRE,
VXLAN, and Geneve (and any other encapsulation carries an Ethernet
header not aligned to two bytes). Unfortunately, this going to be a
header one to fix I think...

> In general, since the actual path taken by the packet (i.e.,
> what driver gets to process this next, what types/sizes of headers
> will get added, etc) is determined dynamically, it's not enough to just
> do a one-time skb_reserve (or page_offset) for NET_IP_ALIGN.
>
> Kernel unaligned access at TPC[9150dc] ipv4_neigh_lookup+0x38/0x170
> CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [00000000009150dc] ipv4_neigh_lookup+0x38/0x170
>  [00000000008ead10] neigh_update+0x450/0x5cc
>  [00000000008eb8b8] neigh_event_ns+0x84/0x94
>  [00000000009518f8] arp_process+0x470/0x854
>  [0000000000951e78] arp_rcv+0x19c/0x1dc
>  [00000000008d647c] __netif_receive_skb_core+0xa24/0xa58
>  [00000000008d6510] __netif_receive_skb+0x60/0x74
>  [00000000008d65dc] process_backlog+0xb8/0x168
>  [00000000008dc9f8] napi_poll+0x74/0x1c8
>  [00000000008dcc20] net_rx_action+0xd4/0x1d4
>  [00000000004633a4] __do_softirq+0x11c/0x28c
>  [000000000042abb0] do_softirq_own_stack+0x34/0x48
>  [0000000000463050] irq_exit+0x48/0x60
>
>
> Kernel unaligned access at TPC[103e720c] build_header+0x64/0xc4 [ip_gre]
> CPU: 44 PID: 0 Comm: swapper/44 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [00000000103e720c] build_header+0x64/0xc4 [ip_gre]
>  [00000000103e76a4] __gre_xmit+0x40/0x7c [ip_gre]
>  [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
>  [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
>  [0000000000904d30] sch_direct_xmit+0x74/0x1dc
>  [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
>  [00000000008df45c] dev_queue_xmit+0x10/0x24
>  [0000000010207654] ip6_finish_output2+0x5b4/0x6b0 [ipv6]
>  [0000000010207954] ip6_finish_output+0x204/0x218 [ipv6]
>  [0000000010207afc] ip6_output+0x194/0x1a8 [ipv6]
>  [000000001021f97c] ndisc_send_skb+0x390/0x438 [ipv6]
>  [000000001021ff14] ndisc_send_rs+0xe0/0xfc [ipv6]
>  [000000001020b190] addrconf_rs_timer+0xc4/0x124 [ipv6]
>
>
> Kernel unaligned access at TPC[969ee0] iptunnel_xmit+0xcc/0x1c0
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [0000000000969ee0] iptunnel_xmit+0xcc/0x1c0
>  [00000000103dc9dc] ip_tunnel_xmit+0xb78/0xc2c [ip_tunnel]
>  [00000000103e76d0] __gre_xmit+0x6c/0x7c [ip_gre]
>  [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
>  [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
>  [0000000000904d30] sch_direct_xmit+0x74/0x1dc
>  [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
>  [00000000008df45c] dev_queue_xmit+0x10/0x24
>  [00000000104034b8] br_dev_queue_push_xmit+0x1d8/0x1fc [bridge]
>  [000000001040355c] br_forward_finish+0x80/0x94 [bridge]
>  [0000000010403684] __br_forward+0x114/0x124 [bridge]
>  [0000000010403700] br_forward+0x6c/0x94 [bridge]
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-02  0:31 ` Sowmini Varadhan
  2016-02-02  0:46   ` Tom Herbert
@ 2016-02-02  3:56   ` Alexander Duyck
  2016-02-02 13:41     ` Hannes Frederic Sowa
  2016-02-02 18:35     ` Sowmini Varadhan
  2016-02-03 17:07   ` Tom Herbert
  2 siblings, 2 replies; 18+ messages in thread
From: Alexander Duyck @ 2016-02-02  3:56 UTC (permalink / raw)
  To: Sowmini Varadhan; +Cc: Tom Herbert, David Miller, Netdev, Kernel Team

On Mon, Feb 1, 2016 at 4:31 PM, Sowmini Varadhan
<sowmini.varadhan@oracle.com> wrote:
> On (01/31/16 13:37), Tom Herbert wrote:
>>
>> Call get_unaligned_be32 when we access 32-bit fields in
>> __skb_flow_dissect. At the beginning check for unlikely case of
>> 1-byte aligned packet.
>>
>> Note that flow_dissector may be asked to parse packet unaligned
>> fields in two instances:
>
> I tested this out with the following setup (hostnames in quotes,
> network connectivity in the line above)
>
>                                       gre-tunnel
>      (eth1) --- (eth1)      (eth0) ... {routed} ...  (eth0)
>    "sparc44"         "sparc45"                      "x86-1"
>
>
> eth1 connections are point-to-point ethernets.
> The GRE tunnel is configured with some non-zero gre key.
>
> So sparc45 bridges sparc44 to x86-1 and has
>
>   sparc45# brctl show br0
>   bridge name     bridge id               STP enabled     interfaces
>   br0             8000.0010e0570711       no              eth1
>                                                           gretap2
>
>
> I tried various pings etc across the nodes, and made sure that
> (excluding the mpls change)  all of the changes in Tom's patch
> worked fine. To do this, I had to enable rps and disable rxhash/ntuple
> in my intel driver.
>
> And then I noticed that the iph->saddr access does indeed result in
> an unaligned access (because this time it gets called with

This part isn't surprising.  Pretty much anything that has an access
width greater than 16 bits will be unaligned.

> flow_keys_dissector). The other thing that generated unaligned access
> was the attempt to do this
>                if (proto == htons(ETH_P_TEB)) {
>                         :
>                         eth = __skb_header_pointer(skb, nhoff,
>                                                    sizeof(_eth),
>                                                    data, hlen, &_eth);
>                         :
>                         proto = eth->h_proto;
>                 }
> which I fixed thus:
>
> @@ -394,7 +407,7 @@ ip_proto_again:
>                                                    data, hlen, &_eth);
>                         if (!eth)
>                                 goto out_bad;
> -                       proto = eth->h_proto;
> +                       proto = get_unaligned_be16(&eth->h_proto);
>                         nhoff += sizeof(*eth);
>                 }

This piece doesn't make any sense to me.  It is already only 2 bytes
wide.  I'm not sure why we should be seeing this trigger an unaligned
access.  Are you sure it wasn't something like the keyid causing the
issue?  I'd be interested in seeing what the compiler did here that it
is triggering the problem.

> Having said all this, the larger problem remains:
>
> There are a number of unaligned access on the xmit path, I've listed
> a few below. A lot of it is coming of gre_tap_xmit not setting up
> offsets properly, but there are others.
>
> In general, since the actual path taken by the packet (i.e.,
> what driver gets to process this next, what types/sizes of headers
> will get added, etc) is determined dynamically, it's not enough to just
> do a one-time skb_reserve (or page_offset) for NET_IP_ALIGN.

Right.  As long as the headers are linked we are in a bad spot on this
issue since it forces either the inner or outer headers to be
unaligned.

> Kernel unaligned access at TPC[9150dc] ipv4_neigh_lookup+0x38/0x170
> CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [00000000009150dc] ipv4_neigh_lookup+0x38/0x170
>  [00000000008ead10] neigh_update+0x450/0x5cc
>  [00000000008eb8b8] neigh_event_ns+0x84/0x94
>  [00000000009518f8] arp_process+0x470/0x854
>  [0000000000951e78] arp_rcv+0x19c/0x1dc
>  [00000000008d647c] __netif_receive_skb_core+0xa24/0xa58
>  [00000000008d6510] __netif_receive_skb+0x60/0x74
>  [00000000008d65dc] process_backlog+0xb8/0x168
>  [00000000008dc9f8] napi_poll+0x74/0x1c8
>  [00000000008dcc20] net_rx_action+0xd4/0x1d4
>  [00000000004633a4] __do_softirq+0x11c/0x28c
>  [000000000042abb0] do_softirq_own_stack+0x34/0x48
>  [0000000000463050] irq_exit+0x48/0x60
>
>
> Kernel unaligned access at TPC[103e720c] build_header+0x64/0xc4 [ip_gre]
> CPU: 44 PID: 0 Comm: swapper/44 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [00000000103e720c] build_header+0x64/0xc4 [ip_gre]
>  [00000000103e76a4] __gre_xmit+0x40/0x7c [ip_gre]
>  [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
>  [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
>  [0000000000904d30] sch_direct_xmit+0x74/0x1dc
>  [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
>  [00000000008df45c] dev_queue_xmit+0x10/0x24
>  [0000000010207654] ip6_finish_output2+0x5b4/0x6b0 [ipv6]
>  [0000000010207954] ip6_finish_output+0x204/0x218 [ipv6]
>  [0000000010207afc] ip6_output+0x194/0x1a8 [ipv6]
>  [000000001021f97c] ndisc_send_skb+0x390/0x438 [ipv6]
>  [000000001021ff14] ndisc_send_rs+0xe0/0xfc [ipv6]
>  [000000001020b190] addrconf_rs_timer+0xc4/0x124 [ipv6]
>
>
> Kernel unaligned access at TPC[969ee0] iptunnel_xmit+0xcc/0x1c0
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [0000000000969ee0] iptunnel_xmit+0xcc/0x1c0
>  [00000000103dc9dc] ip_tunnel_xmit+0xb78/0xc2c [ip_tunnel]
>  [00000000103e76d0] __gre_xmit+0x6c/0x7c [ip_gre]
>  [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
>  [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
>  [0000000000904d30] sch_direct_xmit+0x74/0x1dc
>  [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
>  [00000000008df45c] dev_queue_xmit+0x10/0x24
>  [00000000104034b8] br_dev_queue_push_xmit+0x1d8/0x1fc [bridge]
>  [000000001040355c] br_forward_finish+0x80/0x94 [bridge]
>  [0000000010403684] __br_forward+0x114/0x124 [bridge]
>  [0000000010403700] br_forward+0x6c/0x94 [bridge]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-02  3:56   ` Alexander Duyck
@ 2016-02-02 13:41     ` Hannes Frederic Sowa
  2016-02-02 18:35     ` Sowmini Varadhan
  1 sibling, 0 replies; 18+ messages in thread
From: Hannes Frederic Sowa @ 2016-02-02 13:41 UTC (permalink / raw)
  To: Alexander Duyck, Sowmini Varadhan
  Cc: Tom Herbert, David Miller, Netdev, Kernel Team

On 02.02.2016 04:56, Alexander Duyck wrote:
>> @@ -394,7 +407,7 @@ ip_proto_again:
>>                                                     data, hlen, &_eth);
>>                          if (!eth)
>>                                  goto out_bad;
>> -                       proto = eth->h_proto;
>> +                       proto = get_unaligned_be16(&eth->h_proto);
>>                          nhoff += sizeof(*eth);
>>                  }
>
> This piece doesn't make any sense to me.  It is already only 2 bytes
> wide.  I'm not sure why we should be seeing this trigger an unaligned
> access.  Are you sure it wasn't something like the keyid causing the
> issue?  I'd be interested in seeing what the compiler did here that it
> is triggering the problem.
>

Correct, and the __packed attribute of struct ethhdr already causes all 
members to have an assumed-alignment of '1', so gcc doesn't create any 
16 bit width accesses anyway.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-02  3:56   ` Alexander Duyck
  2016-02-02 13:41     ` Hannes Frederic Sowa
@ 2016-02-02 18:35     ` Sowmini Varadhan
  1 sibling, 0 replies; 18+ messages in thread
From: Sowmini Varadhan @ 2016-02-02 18:35 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Tom Herbert, David Miller, Netdev, Kernel Team

On (02/01/16 19:56), Alexander Duyck wrote:
> > @@ -394,7 +407,7 @@ ip_proto_again:
> >                                                    data, hlen, &_eth);
> >                         if (!eth)
> >                                 goto out_bad;
> > -                       proto = eth->h_proto;
> > +                       proto = get_unaligned_be16(&eth->h_proto);
> >                         nhoff += sizeof(*eth);
> >                 }
> 
> This piece doesn't make any sense to me.  It is already only 2 bytes
> wide.  I'm not sure why we should be seeing this trigger an unaligned
> access.  Are you sure it wasn't something like the keyid causing the
> issue?  I'd be interested in seeing what the compiler did here that it
> is triggering the problem.

You're right- I was getting blinded by all the unaligned-access
messages swimming by and making a mistake. It was actually the 

                memcpy(&key_addrs->v4addrs, &iph->saddr,
                       sizeof(key_addrs->v4addrs));

The assembler code is this:

   0x8d3298 <__skb_flow_dissect+500>:   ld  [ %l5 + 0xc ], %g3
   0x8d329c <__skb_flow_dissect+504>:   add  %i2, %g1, %g2
   0x8d32a0 <__skb_flow_dissect+508>:   st  %g3, [ %i2 + %g1 ]
   0x8d32a4 <__skb_flow_dissect+512>:   ld  [ %l5 + 0x10 ], %g1
   0x8d32a8 <__skb_flow_dissect+516>:   st  %g1, [ %g2 + 4 ]
   0x8d32ac <__skb_flow_dissect+520>:   mov  2, %g1

I get unaligned access traps at __skb_flow_dissect+500 and
__skb_flow_dissect+512 (corresponding to saddr and daddr), once for
each interface (gretap/eth0 and eth1). 

--Sowmini

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-02  0:31 ` Sowmini Varadhan
  2016-02-02  0:46   ` Tom Herbert
  2016-02-02  3:56   ` Alexander Duyck
@ 2016-02-03 17:07   ` Tom Herbert
  2016-02-03 17:31     ` Sowmini Varadhan
  2 siblings, 1 reply; 18+ messages in thread
From: Tom Herbert @ 2016-02-03 17:07 UTC (permalink / raw)
  To: Sowmini Varadhan
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Mon, Feb 1, 2016 at 4:31 PM, Sowmini Varadhan
<sowmini.varadhan@oracle.com> wrote:
> On (01/31/16 13:37), Tom Herbert wrote:
>>
>> Call get_unaligned_be32 when we access 32-bit fields in
>> __skb_flow_dissect. At the beginning check for unlikely case of
>> 1-byte aligned packet.
>>
>> Note that flow_dissector may be asked to parse packet unaligned
>> fields in two instances:
>
> I tested this out with the following setup (hostnames in quotes,
> network connectivity in the line above)
>
>                                       gre-tunnel
>      (eth1) --- (eth1)      (eth0) ... {routed} ...  (eth0)
>    "sparc44"         "sparc45"                      "x86-1"
>
>
> eth1 connections are point-to-point ethernets.
> The GRE tunnel is configured with some non-zero gre key.
>
> So sparc45 bridges sparc44 to x86-1 and has
>
>   sparc45# brctl show br0
>   bridge name     bridge id               STP enabled     interfaces
>   br0             8000.0010e0570711       no              eth1
>                                                           gretap2
>
>
> I tried various pings etc across the nodes, and made sure that
> (excluding the mpls change)  all of the changes in Tom's patch
> worked fine. To do this, I had to enable rps and disable rxhash/ntuple
> in my intel driver.
>
> And then I noticed that the iph->saddr access does indeed result in
> an unaligned access (because this time it gets called with
> flow_keys_dissector). The other thing that generated unaligned access
> was the attempt to do this
>                if (proto == htons(ETH_P_TEB)) {
>                         :
>                         eth = __skb_header_pointer(skb, nhoff,
>                                                    sizeof(_eth),
>                                                    data, hlen, &_eth);
>                         :
>                         proto = eth->h_proto;
>                 }
> which I fixed thus:
>
> @@ -394,7 +407,7 @@ ip_proto_again:
>                                                    data, hlen, &_eth);
>                         if (!eth)
>                                 goto out_bad;
> -                       proto = eth->h_proto;
> +                       proto = get_unaligned_be16(&eth->h_proto);
>                         nhoff += sizeof(*eth);
>                 }
>
> Having said all this, the larger problem remains:
>
> There are a number of unaligned access on the xmit path, I've listed
> a few below. A lot of it is coming of gre_tap_xmit not setting up
> offsets properly, but there are others.
>
> In general, since the actual path taken by the packet (i.e.,
> what driver gets to process this next, what types/sizes of headers
> will get added, etc) is determined dynamically, it's not enough to just
> do a one-time skb_reserve (or page_offset) for NET_IP_ALIGN.
>
> Kernel unaligned access at TPC[9150dc] ipv4_neigh_lookup+0x38/0x170

Sowmini,

This doesn't look like a hard crash to me. Instead of trying to fix
all the alignment issues for Sparc, can we just take the trap, fix up
the load, and continue without any further fuss? Performance might
suffer, but it doesn't seem like the bad alignments are happening in
critical paths.

Tom


> CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [00000000009150dc] ipv4_neigh_lookup+0x38/0x170
>  [00000000008ead10] neigh_update+0x450/0x5cc
>  [00000000008eb8b8] neigh_event_ns+0x84/0x94
>  [00000000009518f8] arp_process+0x470/0x854
>  [0000000000951e78] arp_rcv+0x19c/0x1dc
>  [00000000008d647c] __netif_receive_skb_core+0xa24/0xa58
>  [00000000008d6510] __netif_receive_skb+0x60/0x74
>  [00000000008d65dc] process_backlog+0xb8/0x168
>  [00000000008dc9f8] napi_poll+0x74/0x1c8
>  [00000000008dcc20] net_rx_action+0xd4/0x1d4
>  [00000000004633a4] __do_softirq+0x11c/0x28c
>  [000000000042abb0] do_softirq_own_stack+0x34/0x48
>  [0000000000463050] irq_exit+0x48/0x60
>
>
> Kernel unaligned access at TPC[103e720c] build_header+0x64/0xc4 [ip_gre]
> CPU: 44 PID: 0 Comm: swapper/44 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [00000000103e720c] build_header+0x64/0xc4 [ip_gre]
>  [00000000103e76a4] __gre_xmit+0x40/0x7c [ip_gre]
>  [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
>  [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
>  [0000000000904d30] sch_direct_xmit+0x74/0x1dc
>  [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
>  [00000000008df45c] dev_queue_xmit+0x10/0x24
>  [0000000010207654] ip6_finish_output2+0x5b4/0x6b0 [ipv6]
>  [0000000010207954] ip6_finish_output+0x204/0x218 [ipv6]
>  [0000000010207afc] ip6_output+0x194/0x1a8 [ipv6]
>  [000000001021f97c] ndisc_send_skb+0x390/0x438 [ipv6]
>  [000000001021ff14] ndisc_send_rs+0xe0/0xfc [ipv6]
>  [000000001020b190] addrconf_rs_timer+0xc4/0x124 [ipv6]
>
>
> Kernel unaligned access at TPC[969ee0] iptunnel_xmit+0xcc/0x1c0
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E   4.4.0-dissect+ #19
> Call Trace:
>  [0000000000432f4c] kernel_unaligned_trap+0xfc/0x5ac
>  [0000000000427ef8] sun4v_do_mna+0x58/0x9c
>  [0000000000406d30] sun4v_mna+0x5c/0x68
>  [0000000000969ee0] iptunnel_xmit+0xcc/0x1c0
>  [00000000103dc9dc] ip_tunnel_xmit+0xb78/0xc2c [ip_tunnel]
>  [00000000103e76d0] __gre_xmit+0x6c/0x7c [ip_gre]
>  [00000000103e79e8] gre_tap_xmit+0xfc/0x12c [ip_gre]
>  [00000000008de6dc] dev_hard_start_xmit+0x304/0x3ec
>  [0000000000904d30] sch_direct_xmit+0x74/0x1dc
>  [00000000008df1a8] __dev_queue_xmit+0x568/0x7e8
>  [00000000008df45c] dev_queue_xmit+0x10/0x24
>  [00000000104034b8] br_dev_queue_push_xmit+0x1d8/0x1fc [bridge]
>  [000000001040355c] br_forward_finish+0x80/0x94 [bridge]
>  [0000000010403684] __br_forward+0x114/0x124 [bridge]
>  [0000000010403700] br_forward+0x6c/0x94 [bridge]
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-03 17:07   ` Tom Herbert
@ 2016-02-03 17:31     ` Sowmini Varadhan
  2016-02-03 17:51       ` Tom Herbert
  0 siblings, 1 reply; 18+ messages in thread
From: Sowmini Varadhan @ 2016-02-03 17:31 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On (02/03/16 09:07), Tom Herbert wrote:
> > Kernel unaligned access at TPC[9150dc] ipv4_neigh_lookup+0x38/0x170
> 
> Sowmini,
> 
> This doesn't look like a hard crash to me. Instead of trying to fix
> all the alignment issues for Sparc, can we just take the trap, fix up
> the load, and continue without any further fuss? Performance might
> suffer, but it doesn't seem like the bad alignments are happening in
> critical paths.
> 

None of these things is a hard crash, but they are 

(a) quite noisy 
(b) I'm able to generate alignment falls merely by configuring
    tunnels, and it gets worse when I disable RSS and use RFS/RPS
    instead. So "critical path" might need some definition.
(c) a perf risk for other platforms as well, even when they dont
    complain noisily about it.

--Sowmini

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-03 17:31     ` Sowmini Varadhan
@ 2016-02-03 17:51       ` Tom Herbert
  2016-02-03 17:59         ` Sowmini Varadhan
  0 siblings, 1 reply; 18+ messages in thread
From: Tom Herbert @ 2016-02-03 17:51 UTC (permalink / raw)
  To: Sowmini Varadhan
  Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On Wed, Feb 3, 2016 at 9:31 AM, Sowmini Varadhan
<sowmini.varadhan@oracle.com> wrote:
> On (02/03/16 09:07), Tom Herbert wrote:
>> > Kernel unaligned access at TPC[9150dc] ipv4_neigh_lookup+0x38/0x170
>>
>> Sowmini,
>>
>> This doesn't look like a hard crash to me. Instead of trying to fix
>> all the alignment issues for Sparc, can we just take the trap, fix up
>> the load, and continue without any further fuss? Performance might
>> suffer, but it doesn't seem like the bad alignments are happening in
>> critical paths.
>>
>
> None of these things is a hard crash, but they are
>
> (a) quite noisy

Try disabling the crash dump. That will improve performance.

> (b) I'm able to generate alignment falls merely by configuring
>     tunnels, and it gets worse when I disable RSS and use RFS/RPS
>     instead. So "critical path" might need some definition.

But as we said it's only for tunnels that specifically encapsulate an
ethernet header with aligning it. Many other encapsulations (e.g.
IPIP, GUE, EtherIP,IP/GRE) should be fine. We could take this to IETF
and point out that alignment is still relevant in protocol
development. We can't fix this for GRE or VXLAN at this point, but
maybe there's still hope for VXLAN-GPE or Geneve...

> (c) a perf risk for other platforms as well, even when they dont
>     complain noisily about it.
>
Right, but there is a big difference between a performance degradation
and a hard failure. It would at least be nice to know what the
performance hit actually is, if it's acceptable then this would be a
far simpler and much less invasive fix than the alternatives.

Thanks,
Tom

> --Sowmini

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers
  2016-02-03 17:51       ` Tom Herbert
@ 2016-02-03 17:59         ` Sowmini Varadhan
  0 siblings, 0 replies; 18+ messages in thread
From: Sowmini Varadhan @ 2016-02-03 17:59 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David S. Miller, Linux Kernel Network Developers, Kernel Team

On (02/03/16 09:51), Tom Herbert wrote:
> > (a) quite noisy
> 
> Try disabling the crash dump. That will improve performance.

huh?? 

there is no crash dump involved. If you meant "disable dump_stack()"
sure, I am aware of that, and that is the default behavior of 
log_unaligned(). I was just trying to be helpful and provide
stack traces (I dropped out quite a few, which come from mld, 
ip_fast_csum() etc, which log_unaligned rate-limits and suppresses
by default, btw)

(Removing the batteries from my fire-alarm doesnt make the fire 
go away :-))

> But as we said it's only for tunnels that specifically encapsulate an
> ethernet header with aligning it. Many other encapsulations (e.g.
> IPIP, GUE, EtherIP,IP/GRE) should be fine. We could take this to IETF
> and point out that alignment is still relevant in protocol
> development. We can't fix this for GRE or VXLAN at this point, but
> maybe there's still hope for VXLAN-GPE or Geneve...

good point about taking to ietf, but the list above is not accurate.
IP/GRE itself generated a few log_unaligned() warnings for me, I'd
have to sift through it carefully - need some time for that.. 

> Right, but there is a big difference between a performance degradation
> and a hard failure. It would at least be nice to know what the
> performance hit actually is, if it's acceptable then this would be a
> far simpler and much less invasive fix than the alternatives.

--Sowmini

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-02-03 18:00 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-31 21:37 [PATCH net] net: Allow flow dissector to handle non 4-byte aligned headers Tom Herbert
2016-01-31 21:47 ` kbuild test robot
2016-01-31 22:06 ` Florian Westphal
2016-02-01  0:24 ` Eric Dumazet
2016-02-01  0:39   ` Florian Fainelli
2016-02-01 15:20     ` Nicolas Dichtel
2016-02-01 16:01       ` Sowmini Varadhan
2016-02-01  0:43   ` Sowmini Varadhan
2016-02-01 12:32 ` Sergei Shtylyov
2016-02-02  0:31 ` Sowmini Varadhan
2016-02-02  0:46   ` Tom Herbert
2016-02-02  3:56   ` Alexander Duyck
2016-02-02 13:41     ` Hannes Frederic Sowa
2016-02-02 18:35     ` Sowmini Varadhan
2016-02-03 17:07   ` Tom Herbert
2016-02-03 17:31     ` Sowmini Varadhan
2016-02-03 17:51       ` Tom Herbert
2016-02-03 17:59         ` Sowmini Varadhan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.