All of lore.kernel.org
 help / color / mirror / Atom feed
* alignment faults in 3.6
@ 2012-10-04 23:10 Rob Herring
  2012-10-05  0:58 ` Michael Hope
  2012-10-05  7:29 ` Russell King - ARM Linux
  0 siblings, 2 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-04 23:10 UTC (permalink / raw)
  To: linux-arm-kernel

I've been scratching my head with a "scheduling while atomic" bug I
started seeing on 3.6. I can easily reproduce this problem when doing a
wget on my system. It ultimately seems to be a combination of factors.
The "scheduling while atomic" bug is triggered in do_alignment which
gets triggered by this code in net/ipv4/af_inet.c, line 1356:

id = ntohl(*(__be32 *)&iph->id);
flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
id >>= 16;

This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
4.6.3-1ubuntu5)":

c02ac020:       e8920840        ldm     r2, {r6, fp}
c02ac024:       e6bfbf3b        rev     fp, fp
c02ac028:       e6bf6f36        rev     r6, r6
c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
c02ac030:       e0266008        eor     r6, r6, r8
c02ac034:       e18c6006        orr     r6, ip, r6

which generates alignment faults on the ldm. These are silent until this
commit is applied:

commit ad72907acd2943304c292ae36960bb66e6dc23c9
Author: Will Deacon <will.deacon@arm.com>
Date:   Fri Sep 7 18:24:10 2012 +0100

    ARM: 7528/1: uaccess: annotate [__]{get,put}_user functions with
might_fault()

    The user access functions may generate a fault, resulting in invocation
    of a handler that may sleep.

    This patch annotates the accessors with might_fault() so that we print a
    warning if they are invoked from atomic context and help lockdep keep
    track of mmap_sem.


I would think the scheduling while atomic messages are harmless in this
case. However, in addition to spewing out BUG messages this commit also
seems to eventually cause a kernel panic in __napi_complete. That panic
seems to go away if I put barrier() between the 2 accesses above which
eliminates the alignment faults. I haven't figured that part out yet.

There's at least a couple of problems here:

This seems like an overly aggressive compiler optimization considering
unaligned accesses are not supported by ldm/stm.

The alignment fault handler should handle kernel address faults atomically.

I need to fix alignment of the network rx buffers if possible.

Rob

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-04 23:10 alignment faults in 3.6 Rob Herring
@ 2012-10-05  0:58 ` Michael Hope
  2012-10-05  1:26   ` Mans Rullgard
  2012-10-05 16:08   ` Rob Herring
  2012-10-05  7:29 ` Russell King - ARM Linux
  1 sibling, 2 replies; 133+ messages in thread
From: Michael Hope @ 2012-10-05  0:58 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 12:10, Rob Herring <robherring2@gmail.com> wrote:
> I've been scratching my head with a "scheduling while atomic" bug I
> started seeing on 3.6. I can easily reproduce this problem when doing a
> wget on my system. It ultimately seems to be a combination of factors.
> The "scheduling while atomic" bug is triggered in do_alignment which
> gets triggered by this code in net/ipv4/af_inet.c, line 1356:
>
> id = ntohl(*(__be32 *)&iph->id);
> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
> id >>= 16;
>
> This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
> 4.6.3-1ubuntu5)":
>
> c02ac020:       e8920840        ldm     r2, {r6, fp}
> c02ac024:       e6bfbf3b        rev     fp, fp
> c02ac028:       e6bf6f36        rev     r6, r6
> c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
> c02ac030:       e0266008        eor     r6, r6, r8
> c02ac034:       e18c6006        orr     r6, ip, r6
>
> which generates alignment faults on the ldm. These are silent until this
> commit is applied:

Hi Rob.  I assume that iph is something like:

struct foo {
    u32 x;
    char id[8];
};

struct foo *iph;

GCC merged the two adjacent loads of x and id into one ldm.  This is
an ARM specific optimisation done in load_multiple_sequence() and
enabled with -fpeephole2.

I think the assembly is correct - GCC knows that iph is aligned and
knows the offsets of both x and id.  Happy to be corrected if I'm
wrong, but I think the assembly is valid given the C code.

-- Michael

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  0:58 ` Michael Hope
@ 2012-10-05  1:26   ` Mans Rullgard
  2012-10-05  1:56     ` Rob Herring
  2012-10-05 16:08   ` Rob Herring
  1 sibling, 1 reply; 133+ messages in thread
From: Mans Rullgard @ 2012-10-05  1:26 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 01:58, Michael Hope <michael.hope@linaro.org> wrote:
> On 5 October 2012 12:10, Rob Herring <robherring2@gmail.com> wrote:
>> I've been scratching my head with a "scheduling while atomic" bug I
>> started seeing on 3.6. I can easily reproduce this problem when doing a
>> wget on my system. It ultimately seems to be a combination of factors.
>> The "scheduling while atomic" bug is triggered in do_alignment which
>> gets triggered by this code in net/ipv4/af_inet.c, line 1356:
>>
>> id = ntohl(*(__be32 *)&iph->id);
>> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
>> id >>= 16;
>>
>> This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
>> 4.6.3-1ubuntu5)":
>>
>> c02ac020:       e8920840        ldm     r2, {r6, fp}
>> c02ac024:       e6bfbf3b        rev     fp, fp
>> c02ac028:       e6bf6f36        rev     r6, r6
>> c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
>> c02ac030:       e0266008        eor     r6, r6, r8
>> c02ac034:       e18c6006        orr     r6, ip, r6
>>
>> which generates alignment faults on the ldm. These are silent until this
>> commit is applied:
>
> Hi Rob.  I assume that iph is something like:
>
> struct foo {
>     u32 x;
>     char id[8];
> };
>
> struct foo *iph;
>
> GCC merged the two adjacent loads of x and id into one ldm.  This is
> an ARM specific optimisation done in load_multiple_sequence() and
> enabled with -fpeephole2.
>
> I think the assembly is correct - GCC knows that iph is aligned and
> knows the offsets of both x and id.  Happy to be corrected if I'm
> wrong, but I think the assembly is valid given the C code.

The struct looks like this:

struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
	__u8	ihl:4,
		version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
	__u8	version:4,
  		ihl:4;
#else
#error	"Please fix <asm/byteorder.h>"
#endif
	__u8	tos;
	__be16	tot_len;
	__be16	id;
	__be16	frag_off;
	__u8	ttl;
	__u8	protocol;
	__sum16	check;
	__be32	saddr;
	__be32	daddr;
	/*The options start here. */
};

In a normal build (there's some magic for special checkers) __be32 is a plain
__u32 so the struct should be at least 4-byte aligned.  If somehow it is not,
that is the real bug.

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  1:26   ` Mans Rullgard
@ 2012-10-05  1:56     ` Rob Herring
  2012-10-05  2:25       ` Mans Rullgard
  0 siblings, 1 reply; 133+ messages in thread
From: Rob Herring @ 2012-10-05  1:56 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/04/2012 08:26 PM, Mans Rullgard wrote:
> On 5 October 2012 01:58, Michael Hope <michael.hope@linaro.org> wrote:
>> On 5 October 2012 12:10, Rob Herring <robherring2@gmail.com> wrote:
>>> I've been scratching my head with a "scheduling while atomic" bug I
>>> started seeing on 3.6. I can easily reproduce this problem when doing a
>>> wget on my system. It ultimately seems to be a combination of factors.
>>> The "scheduling while atomic" bug is triggered in do_alignment which
>>> gets triggered by this code in net/ipv4/af_inet.c, line 1356:
>>>
>>> id = ntohl(*(__be32 *)&iph->id);
>>> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
>>> id >>= 16;
>>>
>>> This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
>>> 4.6.3-1ubuntu5)":
>>>
>>> c02ac020:       e8920840        ldm     r2, {r6, fp}
>>> c02ac024:       e6bfbf3b        rev     fp, fp
>>> c02ac028:       e6bf6f36        rev     r6, r6
>>> c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
>>> c02ac030:       e0266008        eor     r6, r6, r8
>>> c02ac034:       e18c6006        orr     r6, ip, r6
>>>
>>> which generates alignment faults on the ldm. These are silent until this
>>> commit is applied:
>>
>> Hi Rob.  I assume that iph is something like:
>>
>> struct foo {
>>     u32 x;
>>     char id[8];
>> };
>>
>> struct foo *iph;
>>
>> GCC merged the two adjacent loads of x and id into one ldm.  This is
>> an ARM specific optimisation done in load_multiple_sequence() and
>> enabled with -fpeephole2.
>>
>> I think the assembly is correct - GCC knows that iph is aligned and
>> knows the offsets of both x and id.  Happy to be corrected if I'm
>> wrong, but I think the assembly is valid given the C code.
> 
> The struct looks like this:
> 
> struct iphdr {
> #if defined(__LITTLE_ENDIAN_BITFIELD)
> 	__u8	ihl:4,
> 		version:4;
> #elif defined (__BIG_ENDIAN_BITFIELD)
> 	__u8	version:4,
>   		ihl:4;
> #else
> #error	"Please fix <asm/byteorder.h>"
> #endif
> 	__u8	tos;
> 	__be16	tot_len;
> 	__be16	id;
> 	__be16	frag_off;
> 	__u8	ttl;
> 	__u8	protocol;
> 	__sum16	check;
> 	__be32	saddr;
> 	__be32	daddr;
> 	/*The options start here. */
> };
> 
> In a normal build (there's some magic for special checkers) __be32 is a plain
> __u32 so the struct should be at least 4-byte aligned.  If somehow it is not,
> that is the real bug.

This struct is the IP header, so a struct ptr is just set to the
beginning of the received data. Since ethernet headers are 14 bytes,
often the IP header is not aligned unless the NIC can place the frame at
a 2 byte offset (which is something I need to investigate). So this
function cannot make any assumptions about the alignment. Does the ABI
define structs have some minimum alignment? Does the struct need to be
declared as packed or something?

Rob

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  1:56     ` Rob Herring
@ 2012-10-05  2:25       ` Mans Rullgard
  2012-10-05  3:04         ` Rob Herring
  2012-10-05  7:12         ` Russell King - ARM Linux
  0 siblings, 2 replies; 133+ messages in thread
From: Mans Rullgard @ 2012-10-05  2:25 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
> On 10/04/2012 08:26 PM, Mans Rullgard wrote:
>> On 5 October 2012 01:58, Michael Hope <michael.hope@linaro.org> wrote:
>>> On 5 October 2012 12:10, Rob Herring <robherring2@gmail.com> wrote:
>>>> I've been scratching my head with a "scheduling while atomic" bug I
>>>> started seeing on 3.6. I can easily reproduce this problem when doing a
>>>> wget on my system. It ultimately seems to be a combination of factors.
>>>> The "scheduling while atomic" bug is triggered in do_alignment which
>>>> gets triggered by this code in net/ipv4/af_inet.c, line 1356:
>>>>
>>>> id = ntohl(*(__be32 *)&iph->id);
>>>> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
>>>> id >>= 16;
>>>>
>>>> This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
>>>> 4.6.3-1ubuntu5)":
>>>>
>>>> c02ac020:       e8920840        ldm     r2, {r6, fp}
>>>> c02ac024:       e6bfbf3b        rev     fp, fp
>>>> c02ac028:       e6bf6f36        rev     r6, r6
>>>> c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
>>>> c02ac030:       e0266008        eor     r6, r6, r8
>>>> c02ac034:       e18c6006        orr     r6, ip, r6
>>>>
>>>> which generates alignment faults on the ldm. These are silent until this
>>>> commit is applied:
>>>
>>> Hi Rob.  I assume that iph is something like:
>>>
>>> struct foo {
>>>     u32 x;
>>>     char id[8];
>>> };
>>>
>>> struct foo *iph;
>>>
>>> GCC merged the two adjacent loads of x and id into one ldm.  This is
>>> an ARM specific optimisation done in load_multiple_sequence() and
>>> enabled with -fpeephole2.
>>>
>>> I think the assembly is correct - GCC knows that iph is aligned and
>>> knows the offsets of both x and id.  Happy to be corrected if I'm
>>> wrong, but I think the assembly is valid given the C code.
>>
>> The struct looks like this:
>>
>> struct iphdr {
>> #if defined(__LITTLE_ENDIAN_BITFIELD)
>>       __u8    ihl:4,
>>               version:4;
>> #elif defined (__BIG_ENDIAN_BITFIELD)
>>       __u8    version:4,
>>               ihl:4;
>> #else
>> #error        "Please fix <asm/byteorder.h>"
>> #endif
>>       __u8    tos;
>>       __be16  tot_len;
>>       __be16  id;
>>       __be16  frag_off;
>>       __u8    ttl;
>>       __u8    protocol;
>>       __sum16 check;
>>       __be32  saddr;
>>       __be32  daddr;
>>       /*The options start here. */
>> };
>>
>> In a normal build (there's some magic for special checkers) __be32 is a plain
>> __u32 so the struct should be at least 4-byte aligned.  If somehow it is not,
>> that is the real bug.
>
> This struct is the IP header, so a struct ptr is just set to the
> beginning of the received data. Since ethernet headers are 14 bytes,
> often the IP header is not aligned unless the NIC can place the frame at
> a 2 byte offset (which is something I need to investigate). So this
> function cannot make any assumptions about the alignment. Does the ABI
> define structs have some minimum alignment? Does the struct need to be
> declared as packed or something?

The ABI defines the alignment of structs as the maximum alignment of its
members.  Since this struct contains 32-bit members, the alignment for the
whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
might be unaligned (in addition to removing any holes within).

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  2:25       ` Mans Rullgard
@ 2012-10-05  3:04         ` Rob Herring
  2012-10-05  5:37           ` Khem Raj
  2012-10-05  7:12         ` Russell King - ARM Linux
  1 sibling, 1 reply; 133+ messages in thread
From: Rob Herring @ 2012-10-05  3:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/04/2012 09:25 PM, Mans Rullgard wrote:
> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>> On 10/04/2012 08:26 PM, Mans Rullgard wrote:
>>> On 5 October 2012 01:58, Michael Hope <michael.hope@linaro.org> wrote:
>>>> On 5 October 2012 12:10, Rob Herring <robherring2@gmail.com> wrote:
>>>>> I've been scratching my head with a "scheduling while atomic" bug I
>>>>> started seeing on 3.6. I can easily reproduce this problem when doing a
>>>>> wget on my system. It ultimately seems to be a combination of factors.
>>>>> The "scheduling while atomic" bug is triggered in do_alignment which
>>>>> gets triggered by this code in net/ipv4/af_inet.c, line 1356:
>>>>>
>>>>> id = ntohl(*(__be32 *)&iph->id);
>>>>> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
>>>>> id >>= 16;
>>>>>
>>>>> This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
>>>>> 4.6.3-1ubuntu5)":
>>>>>
>>>>> c02ac020:       e8920840        ldm     r2, {r6, fp}
>>>>> c02ac024:       e6bfbf3b        rev     fp, fp
>>>>> c02ac028:       e6bf6f36        rev     r6, r6
>>>>> c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
>>>>> c02ac030:       e0266008        eor     r6, r6, r8
>>>>> c02ac034:       e18c6006        orr     r6, ip, r6
>>>>>
>>>>> which generates alignment faults on the ldm. These are silent until this
>>>>> commit is applied:
>>>>
>>>> Hi Rob.  I assume that iph is something like:
>>>>
>>>> struct foo {
>>>>     u32 x;
>>>>     char id[8];
>>>> };
>>>>
>>>> struct foo *iph;
>>>>
>>>> GCC merged the two adjacent loads of x and id into one ldm.  This is
>>>> an ARM specific optimisation done in load_multiple_sequence() and
>>>> enabled with -fpeephole2.
>>>>
>>>> I think the assembly is correct - GCC knows that iph is aligned and
>>>> knows the offsets of both x and id.  Happy to be corrected if I'm
>>>> wrong, but I think the assembly is valid given the C code.
>>>
>>> The struct looks like this:
>>>
>>> struct iphdr {
>>> #if defined(__LITTLE_ENDIAN_BITFIELD)
>>>       __u8    ihl:4,
>>>               version:4;
>>> #elif defined (__BIG_ENDIAN_BITFIELD)
>>>       __u8    version:4,
>>>               ihl:4;
>>> #else
>>> #error        "Please fix <asm/byteorder.h>"
>>> #endif
>>>       __u8    tos;
>>>       __be16  tot_len;
>>>       __be16  id;
>>>       __be16  frag_off;
>>>       __u8    ttl;
>>>       __u8    protocol;
>>>       __sum16 check;
>>>       __be32  saddr;
>>>       __be32  daddr;
>>>       /*The options start here. */
>>> };
>>>
>>> In a normal build (there's some magic for special checkers) __be32 is a plain
>>> __u32 so the struct should be at least 4-byte aligned.  If somehow it is not,
>>> that is the real bug.
>>
>> This struct is the IP header, so a struct ptr is just set to the
>> beginning of the received data. Since ethernet headers are 14 bytes,
>> often the IP header is not aligned unless the NIC can place the frame at
>> a 2 byte offset (which is something I need to investigate). So this
>> function cannot make any assumptions about the alignment. Does the ABI
>> define structs have some minimum alignment? Does the struct need to be
>> declared as packed or something?
> 
> The ABI defines the alignment of structs as the maximum alignment of its
> members.  Since this struct contains 32-bit members, the alignment for the
> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
> might be unaligned (in addition to removing any holes within).

Unfortunately, declaring the struct or __be32* cast as packed have no
effect. I still get an ldm emitted.

Rob

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  3:04         ` Rob Herring
@ 2012-10-05  5:37           ` Khem Raj
  0 siblings, 0 replies; 133+ messages in thread
From: Khem Raj @ 2012-10-05  5:37 UTC (permalink / raw)
  To: linux-arm-kernel


On Oct 4, 2012, at 8:04 PM, Rob Herring <robherring2@gmail.com> wrote:

> On 10/04/2012 09:25 PM, Mans Rullgard wrote:
>> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>>> On 10/04/2012 08:26 PM, Mans Rullgard wrote:
>>>> On 5 October 2012 01:58, Michael Hope <michael.hope@linaro.org> wrote:
>>>>> On 5 October 2012 12:10, Rob Herring <robherring2@gmail.com> wrote:
>>>>>> I've been scratching my head with a "scheduling while atomic" bug I
>>>>>> started seeing on 3.6. I can easily reproduce this problem when doing a
>>>>>> wget on my system. It ultimately seems to be a combination of factors.
>>>>>> The "scheduling while atomic" bug is triggered in do_alignment which
>>>>>> gets triggered by this code in net/ipv4/af_inet.c, line 1356:
>>>>>> 
>>>>>> id = ntohl(*(__be32 *)&iph->id);
>>>>>> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
>>>>>> id >>= 16;
>>>>>> 
>>>>>> This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
>>>>>> 4.6.3-1ubuntu5)":
>>>>>> 
>>>>>> c02ac020:       e8920840        ldm     r2, {r6, fp}
>>>>>> c02ac024:       e6bfbf3b        rev     fp, fp
>>>>>> c02ac028:       e6bf6f36        rev     r6, r6
>>>>>> c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
>>>>>> c02ac030:       e0266008        eor     r6, r6, r8
>>>>>> c02ac034:       e18c6006        orr     r6, ip, r6
>>>>>> 
>>>>>> which generates alignment faults on the ldm. These are silent until this
>>>>>> commit is applied:
>>>>> 
>>>>> Hi Rob.  I assume that iph is something like:
>>>>> 
>>>>> struct foo {
>>>>>    u32 x;
>>>>>    char id[8];
>>>>> };
>>>>> 
>>>>> struct foo *iph;
>>>>> 
>>>>> GCC merged the two adjacent loads of x and id into one ldm.  This is
>>>>> an ARM specific optimisation done in load_multiple_sequence() and
>>>>> enabled with -fpeephole2.
>>>>> 
>>>>> I think the assembly is correct - GCC knows that iph is aligned and
>>>>> knows the offsets of both x and id.  Happy to be corrected if I'm
>>>>> wrong, but I think the assembly is valid given the C code.
>>>> 
>>>> The struct looks like this:
>>>> 
>>>> struct iphdr {
>>>> #if defined(__LITTLE_ENDIAN_BITFIELD)
>>>>      __u8    ihl:4,
>>>>              version:4;
>>>> #elif defined (__BIG_ENDIAN_BITFIELD)
>>>>      __u8    version:4,
>>>>              ihl:4;
>>>> #else
>>>> #error        "Please fix <asm/byteorder.h>"
>>>> #endif
>>>>      __u8    tos;
>>>>      __be16  tot_len;
>>>>      __be16  id;
>>>>      __be16  frag_off;
>>>>      __u8    ttl;
>>>>      __u8    protocol;
>>>>      __sum16 check;
>>>>      __be32  saddr;
>>>>      __be32  daddr;
>>>>      /*The options start here. */
>>>> };
>>>> 
>>>> In a normal build (there's some magic for special checkers) __be32 is a plain
>>>> __u32 so the struct should be at least 4-byte aligned.  If somehow it is not,
>>>> that is the real bug.
>>> 
>>> This struct is the IP header, so a struct ptr is just set to the
>>> beginning of the received data. Since ethernet headers are 14 bytes,
>>> often the IP header is not aligned unless the NIC can place the frame at
>>> a 2 byte offset (which is something I need to investigate). So this
>>> function cannot make any assumptions about the alignment. Does the ABI
>>> define structs have some minimum alignment? Does the struct need to be
>>> declared as packed or something?
>> 
>> The ABI defines the alignment of structs as the maximum alignment of its
>> members.  Since this struct contains 32-bit members, the alignment for the
>> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
>> might be unaligned (in addition to removing any holes within).
> 
> Unfortunately, declaring the struct or __be32* cast as packed have no
> effect. I still get an ldm emitted.

what is value of r2 ? and can you pastebin the .i file somewhere ?

> 
> Rob
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  2:25       ` Mans Rullgard
  2012-10-05  3:04         ` Rob Herring
@ 2012-10-05  7:12         ` Russell King - ARM Linux
  2012-10-05  8:20           ` Mans Rullgard
  1 sibling, 1 reply; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05  7:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
> > This struct is the IP header, so a struct ptr is just set to the
> > beginning of the received data. Since ethernet headers are 14 bytes,
> > often the IP header is not aligned unless the NIC can place the frame at
> > a 2 byte offset (which is something I need to investigate). So this
> > function cannot make any assumptions about the alignment. Does the ABI
> > define structs have some minimum alignment? Does the struct need to be
> > declared as packed or something?
> 
> The ABI defines the alignment of structs as the maximum alignment of its
> members.  Since this struct contains 32-bit members, the alignment for the
> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
> might be unaligned (in addition to removing any holes within).

This has come up before in the past.

The Linux network folk will _not_ allow - in any shape or form - for
this struct to be marked packed (it's the struct which needs to be
marked packed) because by doing so, it causes GCC to issue byte loads/
stores on architectures where there isn't a problem, and that decreases
the performance of the Linux IP stack unnecessarily.

I don't think there's going to be a satisfactory answer to this issue.
I think we're going to be stuck between GCC people saying that the
kernel is buggy, and the network people refusing to fix this as they
have done in the past.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-04 23:10 alignment faults in 3.6 Rob Herring
  2012-10-05  0:58 ` Michael Hope
@ 2012-10-05  7:29 ` Russell King - ARM Linux
  2012-10-05 10:51   ` Russell King - ARM Linux
  1 sibling, 1 reply; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05  7:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Oct 04, 2012 at 06:10:26PM -0500, Rob Herring wrote:
> I would think the scheduling while atomic messages are harmless in this
> case. However, in addition to spewing out BUG messages this commit also
> seems to eventually cause a kernel panic in __napi_complete. That panic
> seems to go away if I put barrier() between the 2 accesses above which
> eliminates the alignment faults. I haven't figured that part out yet.
> 
> There's at least a couple of problems here:
> 
> This seems like an overly aggressive compiler optimization considering
> unaligned accesses are not supported by ldm/stm.
> 
> The alignment fault handler should handle kernel address faults atomically.

This is bad news.  do_alignment() can be called in almost any kernel
context, and it must work.  die() and oops dumps - specifically dump_mem()
and dump_instr() will suffer from exactly the same problem.

Will, can you take a look please?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  7:12         ` Russell King - ARM Linux
@ 2012-10-05  8:20           ` Mans Rullgard
  2012-10-05  8:24             ` Russell King - ARM Linux
  0 siblings, 1 reply; 133+ messages in thread
From: Mans Rullgard @ 2012-10-05  8:20 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 08:12, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
>> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>> > This struct is the IP header, so a struct ptr is just set to the
>> > beginning of the received data. Since ethernet headers are 14 bytes,
>> > often the IP header is not aligned unless the NIC can place the frame at
>> > a 2 byte offset (which is something I need to investigate). So this
>> > function cannot make any assumptions about the alignment. Does the ABI
>> > define structs have some minimum alignment? Does the struct need to be
>> > declared as packed or something?
>>
>> The ABI defines the alignment of structs as the maximum alignment of its
>> members.  Since this struct contains 32-bit members, the alignment for the
>> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
>> might be unaligned (in addition to removing any holes within).
>
> This has come up before in the past.
>
> The Linux network folk will _not_ allow - in any shape or form - for
> this struct to be marked packed (it's the struct which needs to be
> marked packed) because by doing so, it causes GCC to issue byte loads/
> stores on architectures where there isn't a problem, and that decreases
> the performance of the Linux IP stack unnecessarily.

Which architectures?  I have never seen anything like that.

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  8:20           ` Mans Rullgard
@ 2012-10-05  8:24             ` Russell King - ARM Linux
  2012-10-05  8:33               ` Mans Rullgard
  2012-10-05 12:24               ` Rob Herring
  0 siblings, 2 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05  8:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
> On 5 October 2012 08:12, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
> >> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
> >> > This struct is the IP header, so a struct ptr is just set to the
> >> > beginning of the received data. Since ethernet headers are 14 bytes,
> >> > often the IP header is not aligned unless the NIC can place the frame at
> >> > a 2 byte offset (which is something I need to investigate). So this
> >> > function cannot make any assumptions about the alignment. Does the ABI
> >> > define structs have some minimum alignment? Does the struct need to be
> >> > declared as packed or something?
> >>
> >> The ABI defines the alignment of structs as the maximum alignment of its
> >> members.  Since this struct contains 32-bit members, the alignment for the
> >> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
> >> might be unaligned (in addition to removing any holes within).
> >
> > This has come up before in the past.
> >
> > The Linux network folk will _not_ allow - in any shape or form - for
> > this struct to be marked packed (it's the struct which needs to be
> > marked packed) because by doing so, it causes GCC to issue byte loads/
> > stores on architectures where there isn't a problem, and that decreases
> > the performance of the Linux IP stack unnecessarily.
> 
> Which architectures?  I have never seen anything like that.

Does it matter?  I'm just relaying the argument against adding __packed
which was used before we were forced (by the networking folk) to implement
the alignment fault handler.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  8:24             ` Russell King - ARM Linux
@ 2012-10-05  8:33               ` Mans Rullgard
  2012-10-05  8:33                 ` Russell King - ARM Linux
  2012-10-05 12:24               ` Rob Herring
  1 sibling, 1 reply; 133+ messages in thread
From: Mans Rullgard @ 2012-10-05  8:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 09:24, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
>> On 5 October 2012 08:12, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
>> >> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>> >> > This struct is the IP header, so a struct ptr is just set to the
>> >> > beginning of the received data. Since ethernet headers are 14 bytes,
>> >> > often the IP header is not aligned unless the NIC can place the frame at
>> >> > a 2 byte offset (which is something I need to investigate). So this
>> >> > function cannot make any assumptions about the alignment. Does the ABI
>> >> > define structs have some minimum alignment? Does the struct need to be
>> >> > declared as packed or something?
>> >>
>> >> The ABI defines the alignment of structs as the maximum alignment of its
>> >> members.  Since this struct contains 32-bit members, the alignment for the
>> >> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
>> >> might be unaligned (in addition to removing any holes within).
>> >
>> > This has come up before in the past.
>> >
>> > The Linux network folk will _not_ allow - in any shape or form - for
>> > this struct to be marked packed (it's the struct which needs to be
>> > marked packed) because by doing so, it causes GCC to issue byte loads/
>> > stores on architectures where there isn't a problem, and that decreases
>> > the performance of the Linux IP stack unnecessarily.
>>
>> Which architectures?  I have never seen anything like that.
>
> Does it matter?

It matters if we want to fix it.

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  8:33               ` Mans Rullgard
@ 2012-10-05  8:33                 ` Russell King - ARM Linux
  2012-10-05  8:37                   ` Mans Rullgard
  0 siblings, 1 reply; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05  8:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 05, 2012 at 09:33:04AM +0100, Mans Rullgard wrote:
> On 5 October 2012 09:24, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
> >> On 5 October 2012 08:12, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >> > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
> >> >> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
> >> >> > This struct is the IP header, so a struct ptr is just set to the
> >> >> > beginning of the received data. Since ethernet headers are 14 bytes,
> >> >> > often the IP header is not aligned unless the NIC can place the frame at
> >> >> > a 2 byte offset (which is something I need to investigate). So this
> >> >> > function cannot make any assumptions about the alignment. Does the ABI
> >> >> > define structs have some minimum alignment? Does the struct need to be
> >> >> > declared as packed or something?
> >> >>
> >> >> The ABI defines the alignment of structs as the maximum alignment of its
> >> >> members.  Since this struct contains 32-bit members, the alignment for the
> >> >> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
> >> >> might be unaligned (in addition to removing any holes within).
> >> >
> >> > This has come up before in the past.
> >> >
> >> > The Linux network folk will _not_ allow - in any shape or form - for
> >> > this struct to be marked packed (it's the struct which needs to be
> >> > marked packed) because by doing so, it causes GCC to issue byte loads/
> >> > stores on architectures where there isn't a problem, and that decreases
> >> > the performance of the Linux IP stack unnecessarily.
> >>
> >> Which architectures?  I have never seen anything like that.
> >
> > Does it matter?
> 
> It matters if we want to fix it.

I can't help you there.  Ask the networking people.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  8:33                 ` Russell King - ARM Linux
@ 2012-10-05  8:37                   ` Mans Rullgard
  2012-10-05  8:50                     ` Russell King - ARM Linux
  2012-10-05 13:49                     ` Mikael Pettersson
  0 siblings, 2 replies; 133+ messages in thread
From: Mans Rullgard @ 2012-10-05  8:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 09:33, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Oct 05, 2012 at 09:33:04AM +0100, Mans Rullgard wrote:
>> On 5 October 2012 09:24, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
>> >> On 5 October 2012 08:12, Russell King - ARM Linux
>> >> <linux@arm.linux.org.uk> wrote:
>> >> > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
>> >> >> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>> >> >> > This struct is the IP header, so a struct ptr is just set to the
>> >> >> > beginning of the received data. Since ethernet headers are 14 bytes,
>> >> >> > often the IP header is not aligned unless the NIC can place the frame at
>> >> >> > a 2 byte offset (which is something I need to investigate). So this
>> >> >> > function cannot make any assumptions about the alignment. Does the ABI
>> >> >> > define structs have some minimum alignment? Does the struct need to be
>> >> >> > declared as packed or something?
>> >> >>
>> >> >> The ABI defines the alignment of structs as the maximum alignment of its
>> >> >> members.  Since this struct contains 32-bit members, the alignment for the
>> >> >> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
>> >> >> might be unaligned (in addition to removing any holes within).
>> >> >
>> >> > This has come up before in the past.
>> >> >
>> >> > The Linux network folk will _not_ allow - in any shape or form - for
>> >> > this struct to be marked packed (it's the struct which needs to be
>> >> > marked packed) because by doing so, it causes GCC to issue byte loads/
>> >> > stores on architectures where there isn't a problem, and that decreases
>> >> > the performance of the Linux IP stack unnecessarily.
>> >>
>> >> Which architectures?  I have never seen anything like that.
>> >
>> > Does it matter?
>>
>> It matters if we want to fix it.
>
> I can't help you there.  Ask the networking people.

I'm asking anyone who is listening.  Since you've (apparently) seen the claim
being made, perhaps you can point to an email or so with more information.
I'd hate to think you're just making up excuses.

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  8:37                   ` Mans Rullgard
@ 2012-10-05  8:50                     ` Russell King - ARM Linux
  2012-10-05 13:49                     ` Mikael Pettersson
  1 sibling, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05  8:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 05, 2012 at 09:37:38AM +0100, Mans Rullgard wrote:
> On 5 October 2012 09:33, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Oct 05, 2012 at 09:33:04AM +0100, Mans Rullgard wrote:
> >> On 5 October 2012 09:24, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >> > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
> >> >> On 5 October 2012 08:12, Russell King - ARM Linux
> >> >> <linux@arm.linux.org.uk> wrote:
> >> >> > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
> >> >> >> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
> >> >> >> > This struct is the IP header, so a struct ptr is just set to the
> >> >> >> > beginning of the received data. Since ethernet headers are 14 bytes,
> >> >> >> > often the IP header is not aligned unless the NIC can place the frame at
> >> >> >> > a 2 byte offset (which is something I need to investigate). So this
> >> >> >> > function cannot make any assumptions about the alignment. Does the ABI
> >> >> >> > define structs have some minimum alignment? Does the struct need to be
> >> >> >> > declared as packed or something?
> >> >> >>
> >> >> >> The ABI defines the alignment of structs as the maximum alignment of its
> >> >> >> members.  Since this struct contains 32-bit members, the alignment for the
> >> >> >> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
> >> >> >> might be unaligned (in addition to removing any holes within).
> >> >> >
> >> >> > This has come up before in the past.
> >> >> >
> >> >> > The Linux network folk will _not_ allow - in any shape or form - for
> >> >> > this struct to be marked packed (it's the struct which needs to be
> >> >> > marked packed) because by doing so, it causes GCC to issue byte loads/
> >> >> > stores on architectures where there isn't a problem, and that decreases
> >> >> > the performance of the Linux IP stack unnecessarily.
> >> >>
> >> >> Which architectures?  I have never seen anything like that.
> >> >
> >> > Does it matter?
> >>
> >> It matters if we want to fix it.
> >
> > I can't help you there.  Ask the networking people.
> 
> I'm asking anyone who is listening.  Since you've (apparently) seen the claim
> being made, perhaps you can point to an email or so with more information.
> I'd hate to think you're just making up excuses.

Sorry, I'm just not going to bother - especially given your attitude of
"you're just making up excuses" - you just accused me of being a liar.
Thanks for that.

We have the merge window open at the moment, and I've only just got back
from being on holiday for the first half of that - so I have new tree
conflicts to deal with, and need to finish preparing my tree for Linus.
I'm not going to waste the precious time that the merge window is open
for searching lots of mail archives for this when others can do the same,
or even try sending the Linux networking people a patch adding __packed
to their structures and seeing the response.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  7:29 ` Russell King - ARM Linux
@ 2012-10-05 10:51   ` Russell King - ARM Linux
  2012-10-23 16:30     ` Jon Masters
  0 siblings, 1 reply; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05 10:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 05, 2012 at 08:29:14AM +0100, Russell King - ARM Linux wrote:
> On Thu, Oct 04, 2012 at 06:10:26PM -0500, Rob Herring wrote:
> > I would think the scheduling while atomic messages are harmless in this
> > case. However, in addition to spewing out BUG messages this commit also
> > seems to eventually cause a kernel panic in __napi_complete. That panic
> > seems to go away if I put barrier() between the 2 accesses above which
> > eliminates the alignment faults. I haven't figured that part out yet.
> > 
> > There's at least a couple of problems here:
> > 
> > This seems like an overly aggressive compiler optimization considering
> > unaligned accesses are not supported by ldm/stm.
> > 
> > The alignment fault handler should handle kernel address faults atomically.
> 
> This is bad news.  do_alignment() can be called in almost any kernel
> context, and it must work.  die() and oops dumps - specifically dump_mem()
> and dump_instr() will suffer from exactly the same problem.

Okay, this should fix the issue...  I've only compile tested it so far.
Rob, as you have a way to trigger this easily, can you give this patch
a go and let me know if it solves your problem?  Thanks.

 arch/arm/kernel/traps.c |   34 +++++++---------------------------
 arch/arm/mm/alignment.c |   11 ++++-------
 2 files changed, 11 insertions(+), 34 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b0179b8..62f429e 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -89,17 +89,8 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		     unsigned long top)
 {
 	unsigned long first;
-	mm_segment_t fs;
 	int i;
 
-	/*
-	 * We need to switch to kernel mode so that we can use __get_user
-	 * to safely read from kernel space.  Note that we now dump the
-	 * code first, just in case the backtrace kills us.
-	 */
-	fs = get_fs();
-	set_fs(KERNEL_DS);
-
 	printk("%s%s(0x%08lx to 0x%08lx)\n", lvl, str, bottom, top);
 
 	for (first = bottom & ~31; first < top; first += 32) {
@@ -112,7 +103,7 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		for (p = first, i = 0; i < 8 && p < top; i++, p += 4) {
 			if (p >= bottom && p < top) {
 				unsigned long val;
-				if (__get_user(val, (unsigned long *)p) == 0)
+				if (probe_kernel_address(p, val) == 0)
 					sprintf(str + i * 9, " %08lx", val);
 				else
 					sprintf(str + i * 9, " ????????");
@@ -120,8 +111,6 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		}
 		printk("%s%04lx:%s\n", lvl, first & 0xffff, str);
 	}
-
-	set_fs(fs);
 }
 
 static void dump_instr(const char *lvl, struct pt_regs *regs)
@@ -129,25 +118,18 @@ static void dump_instr(const char *lvl, struct pt_regs *regs)
 	unsigned long addr = instruction_pointer(regs);
 	const int thumb = thumb_mode(regs);
 	const int width = thumb ? 4 : 8;
-	mm_segment_t fs;
 	char str[sizeof("00000000 ") * 5 + 2 + 1], *p = str;
 	int i;
 
-	/*
-	 * We need to switch to kernel mode so that we can use __get_user
-	 * to safely read from kernel space.  Note that we now dump the
-	 * code first, just in case the backtrace kills us.
-	 */
-	fs = get_fs();
-	set_fs(KERNEL_DS);
-
 	for (i = -4; i < 1 + !!thumb; i++) {
 		unsigned int val, bad;
 
-		if (thumb)
-			bad = __get_user(val, &((u16 *)addr)[i]);
-		else
-			bad = __get_user(val, &((u32 *)addr)[i]);
+		if (thumb) {
+			u16 instr;
+			bad = probe_kernel_address(addr, instr);
+			val = instr;
+		} else
+			bad = probe_kernel_address(addr, val);
 
 		if (!bad)
 			p += sprintf(p, i == 0 ? "(%0*x) " : "%0*x ",
@@ -158,8 +140,6 @@ static void dump_instr(const char *lvl, struct pt_regs *regs)
 		}
 	}
 	printk("%sCode: %s\n", lvl, str);
-
-	set_fs(fs);
 }
 
 #ifdef CONFIG_ARM_UNWIND
diff --git a/arch/arm/mm/alignment.c b/arch/arm/mm/alignment.c
index b9f60eb..f8f14fc 100644
--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -749,7 +749,6 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 	unsigned long instr = 0, instrptr;
 	int (*handler)(unsigned long addr, unsigned long instr, struct pt_regs *regs);
 	unsigned int type;
-	mm_segment_t fs;
 	unsigned int fault;
 	u16 tinstr = 0;
 	int isize = 4;
@@ -760,16 +759,15 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 
 	instrptr = instruction_pointer(regs);
 
-	fs = get_fs();
-	set_fs(KERNEL_DS);
 	if (thumb_mode(regs)) {
-		fault = __get_user(tinstr, (u16 *)(instrptr & ~1));
+		unsigned long ptr = instrptr;
+		fault = probe_kernel_address(ptr, tinstr);
 		if (!fault) {
 			if (cpu_architecture() >= CPU_ARCH_ARMv7 &&
 			    IS_T32(tinstr)) {
 				/* Thumb-2 32-bit */
 				u16 tinst2 = 0;
-				fault = __get_user(tinst2, (u16 *)(instrptr+2));
+				fault = probe_kernel_address(ptr + 2, tinst2);
 				instr = (tinstr << 16) | tinst2;
 				thumb2_32b = 1;
 			} else {
@@ -778,8 +776,7 @@ do_alignment(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 			}
 		}
 	} else
-		fault = __get_user(instr, (u32 *)instrptr);
-	set_fs(fs);
+		fault = probe_kernel_address(instrptr, instr);
 
 	if (fault) {
 		type = TYPE_FAULT;

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  8:24             ` Russell King - ARM Linux
  2012-10-05  8:33               ` Mans Rullgard
@ 2012-10-05 12:24               ` Rob Herring
  2012-10-05 13:51                 ` Mikael Pettersson
  2012-10-05 14:05                 ` Russell King - ARM Linux
  1 sibling, 2 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-05 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
> On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
>> On 5 October 2012 08:12, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
>>>> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>>>>> This struct is the IP header, so a struct ptr is just set to the
>>>>> beginning of the received data. Since ethernet headers are 14 bytes,
>>>>> often the IP header is not aligned unless the NIC can place the frame at
>>>>> a 2 byte offset (which is something I need to investigate). So this
>>>>> function cannot make any assumptions about the alignment. Does the ABI
>>>>> define structs have some minimum alignment? Does the struct need to be
>>>>> declared as packed or something?
>>>>
>>>> The ABI defines the alignment of structs as the maximum alignment of its
>>>> members.  Since this struct contains 32-bit members, the alignment for the
>>>> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
>>>> might be unaligned (in addition to removing any holes within).
>>>
>>> This has come up before in the past.
>>>
>>> The Linux network folk will _not_ allow - in any shape or form - for
>>> this struct to be marked packed (it's the struct which needs to be
>>> marked packed) because by doing so, it causes GCC to issue byte loads/
>>> stores on architectures where there isn't a problem, and that decreases
>>> the performance of the Linux IP stack unnecessarily.
>>
>> Which architectures?  I have never seen anything like that.
> 
> Does it matter?  I'm just relaying the argument against adding __packed
> which was used before we were forced (by the networking folk) to implement
> the alignment fault handler.

It doesn't really matter what will be accepted or not as adding __packed
to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
The only way I've found to eliminate the alignment fault is adding a
barrier between the 2 loads. That seems like a compiler issue to me if
there is not a better fix.

Rob

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  8:37                   ` Mans Rullgard
  2012-10-05  8:50                     ` Russell King - ARM Linux
@ 2012-10-05 13:49                     ` Mikael Pettersson
  1 sibling, 0 replies; 133+ messages in thread
From: Mikael Pettersson @ 2012-10-05 13:49 UTC (permalink / raw)
  To: linux-arm-kernel

Mans Rullgard writes:
 > On 5 October 2012 09:33, Russell King - ARM Linux
 > <linux@arm.linux.org.uk> wrote:
 > > On Fri, Oct 05, 2012 at 09:33:04AM +0100, Mans Rullgard wrote:
 > >> On 5 October 2012 09:24, Russell King - ARM Linux
 > >> <linux@arm.linux.org.uk> wrote:
 > >> > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
 > >> >> On 5 October 2012 08:12, Russell King - ARM Linux
 > >> >> <linux@arm.linux.org.uk> wrote:
 > >> >> > On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
 > >> >> >> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
 > >> >> >> > This struct is the IP header, so a struct ptr is just set to the
 > >> >> >> > beginning of the received data. Since ethernet headers are 14 bytes,
 > >> >> >> > often the IP header is not aligned unless the NIC can place the frame at
 > >> >> >> > a 2 byte offset (which is something I need to investigate). So this
 > >> >> >> > function cannot make any assumptions about the alignment. Does the ABI
 > >> >> >> > define structs have some minimum alignment? Does the struct need to be
 > >> >> >> > declared as packed or something?
 > >> >> >>
 > >> >> >> The ABI defines the alignment of structs as the maximum alignment of its
 > >> >> >> members.  Since this struct contains 32-bit members, the alignment for the
 > >> >> >> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
 > >> >> >> might be unaligned (in addition to removing any holes within).
 > >> >> >
 > >> >> > This has come up before in the past.
 > >> >> >
 > >> >> > The Linux network folk will _not_ allow - in any shape or form - for
 > >> >> > this struct to be marked packed (it's the struct which needs to be
 > >> >> > marked packed) because by doing so, it causes GCC to issue byte loads/
 > >> >> > stores on architectures where there isn't a problem, and that decreases
 > >> >> > the performance of the Linux IP stack unnecessarily.
 > >> >>
 > >> >> Which architectures?  I have never seen anything like that.
 > >> >
 > >> > Does it matter?
 > >>
 > >> It matters if we want to fix it.
 > >
 > > I can't help you there.  Ask the networking people.
 > 
 > I'm asking anyone who is listening.  Since you've (apparently) seen the claim
 > being made, perhaps you can point to an email or so with more information.
 > I'd hate to think you're just making up excuses.

I'm pretty sure I've seen DaveM make this statement (not allowing the packed
attribute) and also stating that alignment faults in the kernel itself MUST be
handled silently and correctly.  (And DaveM is of course also the maintainer
of SPARC, another machine with strict-alignment requirements.)

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 12:24               ` Rob Herring
@ 2012-10-05 13:51                 ` Mikael Pettersson
  2012-10-05 16:01                   ` Rob Herring
  2012-10-05 14:05                 ` Russell King - ARM Linux
  1 sibling, 1 reply; 133+ messages in thread
From: Mikael Pettersson @ 2012-10-05 13:51 UTC (permalink / raw)
  To: linux-arm-kernel

Rob Herring writes:
 > On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
 > > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
 > >> On 5 October 2012 08:12, Russell King - ARM Linux
 > >> <linux@arm.linux.org.uk> wrote:
 > >>> On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
 > >>>> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
 > >>>>> This struct is the IP header, so a struct ptr is just set to the
 > >>>>> beginning of the received data. Since ethernet headers are 14 bytes,
 > >>>>> often the IP header is not aligned unless the NIC can place the frame at
 > >>>>> a 2 byte offset (which is something I need to investigate). So this
 > >>>>> function cannot make any assumptions about the alignment. Does the ABI
 > >>>>> define structs have some minimum alignment? Does the struct need to be
 > >>>>> declared as packed or something?
 > >>>>
 > >>>> The ABI defines the alignment of structs as the maximum alignment of its
 > >>>> members.  Since this struct contains 32-bit members, the alignment for the
 > >>>> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
 > >>>> might be unaligned (in addition to removing any holes within).
 > >>>
 > >>> This has come up before in the past.
 > >>>
 > >>> The Linux network folk will _not_ allow - in any shape or form - for
 > >>> this struct to be marked packed (it's the struct which needs to be
 > >>> marked packed) because by doing so, it causes GCC to issue byte loads/
 > >>> stores on architectures where there isn't a problem, and that decreases
 > >>> the performance of the Linux IP stack unnecessarily.
 > >>
 > >> Which architectures?  I have never seen anything like that.
 > > 
 > > Does it matter?  I'm just relaying the argument against adding __packed
 > > which was used before we were forced (by the networking folk) to implement
 > > the alignment fault handler.
 > 
 > It doesn't really matter what will be accepted or not as adding __packed
 > to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
 > The only way I've found to eliminate the alignment fault is adding a
 > barrier between the 2 loads. That seems like a compiler issue to me if
 > there is not a better fix.

If you suspect a GCC bug, please prepare a standalone user-space test case
and submit it to GCC's bugzilla (I can do the latter if you absolutely do not
want to).  It wouldn't be the first alignment-related GCC bug...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 12:24               ` Rob Herring
  2012-10-05 13:51                 ` Mikael Pettersson
@ 2012-10-05 14:05                 ` Russell King - ARM Linux
  2012-10-05 14:33                   ` Rob Herring
  1 sibling, 1 reply; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05 14:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
> > Does it matter?  I'm just relaying the argument against adding __packed
> > which was used before we were forced (by the networking folk) to implement
> > the alignment fault handler.
> 
> It doesn't really matter what will be accepted or not as adding __packed
> to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
> The only way I've found to eliminate the alignment fault is adding a
> barrier between the 2 loads. That seems like a compiler issue to me if
> there is not a better fix.

Even so, please test the patch I've sent you in the sub-thread - that
needs testing whether or not GCC is at fault.  Will's patch to add the
warnings _has_ uncovered a potential issue with the use of __get_user()
in some parts of the ARM specific kernel, and I really need you to test
that while you're experiencing this problem.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 14:05                 ` Russell King - ARM Linux
@ 2012-10-05 14:33                   ` Rob Herring
  2012-10-11  0:59                     ` Jon Masters
  0 siblings, 1 reply; 133+ messages in thread
From: Rob Herring @ 2012-10-05 14:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>>> Does it matter?  I'm just relaying the argument against adding __packed
>>> which was used before we were forced (by the networking folk) to implement
>>> the alignment fault handler.
>>
>> It doesn't really matter what will be accepted or not as adding __packed
>> to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
>> The only way I've found to eliminate the alignment fault is adding a
>> barrier between the 2 loads. That seems like a compiler issue to me if
>> there is not a better fix.
> 
> Even so, please test the patch I've sent you in the sub-thread - that
> needs testing whether or not GCC is at fault.  Will's patch to add the
> warnings _has_ uncovered a potential issue with the use of __get_user()
> in some parts of the ARM specific kernel, and I really need you to test
> that while you're experiencing this problem.

I've tested your patch and it appears to fix things. Thanks!

Now on to getting rid of faults on practically every single received IP
packet:

Multi:          9871002

RX packets:9872010 errors:0 dropped:0 overruns:0 frame:0

Rob

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 13:51                 ` Mikael Pettersson
@ 2012-10-05 16:01                   ` Rob Herring
  2012-10-05 22:37                     ` Mans Rullgard
                                       ` (2 more replies)
  0 siblings, 3 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-05 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/05/2012 08:51 AM, Mikael Pettersson wrote:
> Rob Herring writes:
>  > On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>  > > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
>  > >> On 5 October 2012 08:12, Russell King - ARM Linux
>  > >> <linux@arm.linux.org.uk> wrote:
>  > >>> On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
>  > >>>> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>  > >>>>> This struct is the IP header, so a struct ptr is just set to the
>  > >>>>> beginning of the received data. Since ethernet headers are 14 bytes,
>  > >>>>> often the IP header is not aligned unless the NIC can place the frame at
>  > >>>>> a 2 byte offset (which is something I need to investigate). So this
>  > >>>>> function cannot make any assumptions about the alignment. Does the ABI
>  > >>>>> define structs have some minimum alignment? Does the struct need to be
>  > >>>>> declared as packed or something?
>  > >>>>
>  > >>>> The ABI defines the alignment of structs as the maximum alignment of its
>  > >>>> members.  Since this struct contains 32-bit members, the alignment for the
>  > >>>> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
>  > >>>> might be unaligned (in addition to removing any holes within).
>  > >>>
>  > >>> This has come up before in the past.
>  > >>>
>  > >>> The Linux network folk will _not_ allow - in any shape or form - for
>  > >>> this struct to be marked packed (it's the struct which needs to be
>  > >>> marked packed) because by doing so, it causes GCC to issue byte loads/
>  > >>> stores on architectures where there isn't a problem, and that decreases
>  > >>> the performance of the Linux IP stack unnecessarily.
>  > >>
>  > >> Which architectures?  I have never seen anything like that.
>  > > 
>  > > Does it matter?  I'm just relaying the argument against adding __packed
>  > > which was used before we were forced (by the networking folk) to implement
>  > > the alignment fault handler.
>  > 
>  > It doesn't really matter what will be accepted or not as adding __packed
>  > to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
>  > The only way I've found to eliminate the alignment fault is adding a
>  > barrier between the 2 loads. That seems like a compiler issue to me if
>  > there is not a better fix.
> 
> If you suspect a GCC bug, please prepare a standalone user-space test case
> and submit it to GCC's bugzilla (I can do the latter if you absolutely do not
> want to).  It wouldn't be the first alignment-related GCC bug...
> 

Here's a testcase. Compiled on ubuntu precise with
"arm-linux-gnueabihf-gcc -O2 -marm -march=armv7-a test.c".

typedef unsigned short u16;
typedef unsigned short __sum16;
typedef unsigned int __u32;
typedef unsigned char __u8;
typedef __u32 __be32;
typedef u16 __be16;

struct iphdr {
	__u8	ihl:4,
		version:4;
	__u8	tos;
	__be16	tot_len;
	__be16	id;
	__be16	frag_off;
	__u8	ttl;
	__u8	protocol;
	__sum16	check;
	__be32	saddr;
	__be32	daddr;
	/*The options start here. */
};

#define ntohl(x) __swab32((__u32)(__be32)(x))
#define IP_DF		0x4000		/* Flag: "Don't Fragment"	*/

static inline __attribute__((const)) __u32 __swab32(__u32 x)
{
	__asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
	return x;
}

int main(void * buffer, unsigned int *p_id)
{
	unsigned int id;
	int flush = 1;
	const struct iphdr *iph = buffer;
	__u32 len = *p_id;
	
	id = ntohl(*(__be32 *)&iph->id);
	flush = (u16)((ntohl(*(__be32 *)iph) ^ len) | (id ^ IP_DF));
	id >>= 16;
	
	*p_id = id;
	return flush;
}

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05  0:58 ` Michael Hope
  2012-10-05  1:26   ` Mans Rullgard
@ 2012-10-05 16:08   ` Rob Herring
  1 sibling, 0 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-05 16:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/04/2012 07:58 PM, Michael Hope wrote:
> On 5 October 2012 12:10, Rob Herring <robherring2@gmail.com> wrote:
>> I've been scratching my head with a "scheduling while atomic" bug I
>> started seeing on 3.6. I can easily reproduce this problem when doing a
>> wget on my system. It ultimately seems to be a combination of factors.
>> The "scheduling while atomic" bug is triggered in do_alignment which
>> gets triggered by this code in net/ipv4/af_inet.c, line 1356:
>>
>> id = ntohl(*(__be32 *)&iph->id);
>> flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
>> id >>= 16;
>>
>> This code compiles into this using "gcc version 4.6.3 (Ubuntu/Linaro
>> 4.6.3-1ubuntu5)":
>>
>> c02ac020:       e8920840        ldm     r2, {r6, fp}
>> c02ac024:       e6bfbf3b        rev     fp, fp
>> c02ac028:       e6bf6f36        rev     r6, r6
>> c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
>> c02ac030:       e0266008        eor     r6, r6, r8
>> c02ac034:       e18c6006        orr     r6, ip, r6
>>
>> which generates alignment faults on the ldm. These are silent until this
>> commit is applied:
> 
> Hi Rob.  I assume that iph is something like:
> 
> struct foo {
>     u32 x;
>     char id[8];
> };
> 
> struct foo *iph;
> 
> GCC merged the two adjacent loads of x and id into one ldm.  This is
> an ARM specific optimisation done in load_multiple_sequence() and
> enabled with -fpeephole2.

I'm probably on some watch list now after searching for peephole...

It's not clear what all turning this off would affect. Is it just struct
or array loading? Or does this turn off lots of different optimizations?

Rob

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 16:01                   ` Rob Herring
@ 2012-10-05 22:37                     ` Mans Rullgard
  2012-10-05 22:42                       ` Russell King - ARM Linux
  2012-10-06 10:58                     ` Mikael Pettersson
  2012-10-09 14:05                     ` Scott Bambrough
  2 siblings, 1 reply; 133+ messages in thread
From: Mans Rullgard @ 2012-10-05 22:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 17:01, Rob Herring <robherring2@gmail.com> wrote:
> Here's a testcase. Compiled on ubuntu precise with
> "arm-linux-gnueabihf-gcc -O2 -marm -march=armv7-a test.c".
>
> typedef unsigned short u16;
> typedef unsigned short __sum16;
> typedef unsigned int __u32;
> typedef unsigned char __u8;
> typedef __u32 __be32;
> typedef u16 __be16;
>
> struct iphdr {
>         __u8    ihl:4,
>                 version:4;
>         __u8    tos;
>         __be16  tot_len;
>         __be16  id;
>         __be16  frag_off;
>         __u8    ttl;
>         __u8    protocol;
>         __sum16 check;
>         __be32  saddr;
>         __be32  daddr;
>         /*The options start here. */
> };
>
> #define ntohl(x) __swab32((__u32)(__be32)(x))
> #define IP_DF           0x4000          /* Flag: "Don't Fragment"       */
>
> static inline __attribute__((const)) __u32 __swab32(__u32 x)
> {
>         __asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
>         return x;
> }
>
> int main(void * buffer, unsigned int *p_id)
> {
>         unsigned int id;
>         int flush = 1;
>         const struct iphdr *iph = buffer;
>         __u32 len = *p_id;
>
>         id = ntohl(*(__be32 *)&iph->id);
>         flush = (u16)((ntohl(*(__be32 *)iph) ^ len) | (id ^ IP_DF));
>         id >>= 16;
>
>         *p_id = id;
>         return flush;
> }

The problem is the (__be32 *) casts.  This is a normal pointer to a 32-bit,
which is assumed to be aligned, and the cast overrides the packed attribute
from the struct.  Dereferencing these cast expressions must be done with the
macros from asm/unaligned.h

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 22:37                     ` Mans Rullgard
@ 2012-10-05 22:42                       ` Russell King - ARM Linux
  2012-10-06  1:41                         ` Nicolas Pitre
  2012-10-06 16:04                         ` Mans Rullgard
  0 siblings, 2 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-05 22:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 05, 2012 at 11:37:40PM +0100, Mans Rullgard wrote:
> The problem is the (__be32 *) casts.  This is a normal pointer to a 32-bit,
> which is assumed to be aligned, and the cast overrides the packed attribute
> from the struct.  Dereferencing these cast expressions must be done with the
> macros from asm/unaligned.h

Again, not going to happen.  DaveM is on record for saying as much, but
I guess you're going to reject that as well, so I'm not sure why I'm
even bothering to reply.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 22:42                       ` Russell King - ARM Linux
@ 2012-10-06  1:41                         ` Nicolas Pitre
  2012-10-06 16:04                         ` Mans Rullgard
  1 sibling, 0 replies; 133+ messages in thread
From: Nicolas Pitre @ 2012-10-06  1:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 5 Oct 2012, Russell King - ARM Linux wrote:

> On Fri, Oct 05, 2012 at 11:37:40PM +0100, Mans Rullgard wrote:
> > The problem is the (__be32 *) casts.  This is a normal pointer to a 32-bit,
> > which is assumed to be aligned, and the cast overrides the packed attribute
> > from the struct.  Dereferencing these cast expressions must be done with the
> > macros from asm/unaligned.h
> 
> Again, not going to happen.  DaveM is on record for saying as much, but
> I guess you're going to reject that as well, so I'm not sure why I'm
> even bothering to reply.

May I suggest to send a recap of this thread to netdev at vger.kernel.org 
and see what davem has to say this time?  Otherwise we are just going in 
circle.


Nicolas

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 16:01                   ` Rob Herring
  2012-10-05 22:37                     ` Mans Rullgard
@ 2012-10-06 10:58                     ` Mikael Pettersson
  2012-10-09 14:05                     ` Scott Bambrough
  2 siblings, 0 replies; 133+ messages in thread
From: Mikael Pettersson @ 2012-10-06 10:58 UTC (permalink / raw)
  To: linux-arm-kernel

Rob Herring writes:
 > On 10/05/2012 08:51 AM, Mikael Pettersson wrote:
 > > Rob Herring writes:
 > >  > On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
 > >  > > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
 > >  > >> On 5 October 2012 08:12, Russell King - ARM Linux
 > >  > >> <linux@arm.linux.org.uk> wrote:
 > >  > >>> On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
 > >  > >>>> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
 > >  > >>>>> This struct is the IP header, so a struct ptr is just set to the
 > >  > >>>>> beginning of the received data. Since ethernet headers are 14 bytes,
 > >  > >>>>> often the IP header is not aligned unless the NIC can place the frame at
 > >  > >>>>> a 2 byte offset (which is something I need to investigate). So this
 > >  > >>>>> function cannot make any assumptions about the alignment. Does the ABI
 > >  > >>>>> define structs have some minimum alignment? Does the struct need to be
 > >  > >>>>> declared as packed or something?
 > >  > >>>>
 > >  > >>>> The ABI defines the alignment of structs as the maximum alignment of its
 > >  > >>>> members.  Since this struct contains 32-bit members, the alignment for the
 > >  > >>>> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
 > >  > >>>> might be unaligned (in addition to removing any holes within).
 > >  > >>>
 > >  > >>> This has come up before in the past.
 > >  > >>>
 > >  > >>> The Linux network folk will _not_ allow - in any shape or form - for
 > >  > >>> this struct to be marked packed (it's the struct which needs to be
 > >  > >>> marked packed) because by doing so, it causes GCC to issue byte loads/
 > >  > >>> stores on architectures where there isn't a problem, and that decreases
 > >  > >>> the performance of the Linux IP stack unnecessarily.
 > >  > >>
 > >  > >> Which architectures?  I have never seen anything like that.
 > >  > > 
 > >  > > Does it matter?  I'm just relaying the argument against adding __packed
 > >  > > which was used before we were forced (by the networking folk) to implement
 > >  > > the alignment fault handler.
 > >  > 
 > >  > It doesn't really matter what will be accepted or not as adding __packed
 > >  > to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
 > >  > The only way I've found to eliminate the alignment fault is adding a
 > >  > barrier between the 2 loads. That seems like a compiler issue to me if
 > >  > there is not a better fix.
 > > 
 > > If you suspect a GCC bug, please prepare a standalone user-space test case
 > > and submit it to GCC's bugzilla (I can do the latter if you absolutely do not
 > > want to).  It wouldn't be the first alignment-related GCC bug...
 > > 
 > 
 > Here's a testcase. Compiled on ubuntu precise with
 > "arm-linux-gnueabihf-gcc -O2 -marm -march=armv7-a test.c".
 > 
 > typedef unsigned short u16;
 > typedef unsigned short __sum16;
 > typedef unsigned int __u32;
 > typedef unsigned char __u8;
 > typedef __u32 __be32;
 > typedef u16 __be16;
 > 
 > struct iphdr {
 > 	__u8	ihl:4,
 > 		version:4;
 > 	__u8	tos;
 > 	__be16	tot_len;
 > 	__be16	id;
 > 	__be16	frag_off;
 > 	__u8	ttl;
 > 	__u8	protocol;
 > 	__sum16	check;
 > 	__be32	saddr;
 > 	__be32	daddr;
 > 	/*The options start here. */
 > };
 > 
 > #define ntohl(x) __swab32((__u32)(__be32)(x))
 > #define IP_DF		0x4000		/* Flag: "Don't Fragment"	*/
 > 
 > static inline __attribute__((const)) __u32 __swab32(__u32 x)
 > {
 > 	__asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
 > 	return x;
 > }
 > 
 > int main(void * buffer, unsigned int *p_id)
 > {
 > 	unsigned int id;
 > 	int flush = 1;
 > 	const struct iphdr *iph = buffer;
 > 	__u32 len = *p_id;
 > 	
 > 	id = ntohl(*(__be32 *)&iph->id);
 > 	flush = (u16)((ntohl(*(__be32 *)iph) ^ len) | (id ^ IP_DF));
 > 	id >>= 16;
 > 	
 > 	*p_id = id;
 > 	return flush;
 > }
 > 

I was referring to your statement that adding __packed to the types involved
didn't prevent GCC from emitting aligned memory accesses. The test case above
only shows that if the source code lies to GCC then things break...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 22:42                       ` Russell King - ARM Linux
  2012-10-06  1:41                         ` Nicolas Pitre
@ 2012-10-06 16:04                         ` Mans Rullgard
  2012-10-06 16:19                           ` Nicolas Pitre
  2012-10-06 16:31                           ` Russell King - ARM Linux
  1 sibling, 2 replies; 133+ messages in thread
From: Mans Rullgard @ 2012-10-06 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 October 2012 23:42, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Oct 05, 2012 at 11:37:40PM +0100, Mans Rullgard wrote:
>> The problem is the (__be32 *) casts.  This is a normal pointer to a 32-bit,
>> which is assumed to be aligned, and the cast overrides the packed attribute
>> from the struct.  Dereferencing these cast expressions must be done with the
>> macros from asm/unaligned.h
>
> Again, not going to happen.

There are only two options for fixing this:

1. Ensure the struct is always aligned.
2. Declare it packed (and fix casts).

Refusing to do either leaves us with a broken kernel.  Is that what you want?

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-06 16:04                         ` Mans Rullgard
@ 2012-10-06 16:19                           ` Nicolas Pitre
  2012-10-06 16:31                           ` Russell King - ARM Linux
  1 sibling, 0 replies; 133+ messages in thread
From: Nicolas Pitre @ 2012-10-06 16:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, 6 Oct 2012, Mans Rullgard wrote:

> On 5 October 2012 23:42, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Oct 05, 2012 at 11:37:40PM +0100, Mans Rullgard wrote:
> >> The problem is the (__be32 *) casts.  This is a normal pointer to a 32-bit,
> >> which is assumed to be aligned, and the cast overrides the packed attribute
> >> from the struct.  Dereferencing these cast expressions must be done with the
> >> macros from asm/unaligned.h
> >
> > Again, not going to happen.
> 
> There are only two options for fixing this:
> 
> 1. Ensure the struct is always aligned.
> 2. Declare it packed (and fix casts).
> 
> Refusing to do either leaves us with a broken kernel.  Is that what you want?

Once again, please bring this up with davem and also CC 
netdev at vger.kernel.org as he's the point of contention here.


Nicolas

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-06 16:04                         ` Mans Rullgard
  2012-10-06 16:19                           ` Nicolas Pitre
@ 2012-10-06 16:31                           ` Russell King - ARM Linux
  1 sibling, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-06 16:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Oct 06, 2012 at 05:04:33PM +0100, Mans Rullgard wrote:
> On 5 October 2012 23:42, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Oct 05, 2012 at 11:37:40PM +0100, Mans Rullgard wrote:
> >> The problem is the (__be32 *) casts.  This is a normal pointer to a 32-bit,
> >> which is assumed to be aligned, and the cast overrides the packed attribute
> >> from the struct.  Dereferencing these cast expressions must be done with the
> >> macros from asm/unaligned.h
> >
> > Again, not going to happen.
> 
> There are only two options for fixing this:
> 
> 1. Ensure the struct is always aligned.
> 2. Declare it packed (and fix casts).
> 
> Refusing to do either leaves us with a broken kernel.  Is that what you want?

How about you start reading the emails that you receive rather than
seemingly insisting that I somehow fix the Linux networking stack -
which would involve me talking to someone who has publically tried
to oust me from being ARM maintainer, and whom would probably reject
any attempt to fix this _because_ I was the person involved in proposing
the fix?

It's far better that someone else sorts this out, they are much more
likely to get a favourable response.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 16:01                   ` Rob Herring
  2012-10-05 22:37                     ` Mans Rullgard
  2012-10-06 10:58                     ` Mikael Pettersson
@ 2012-10-09 14:05                     ` Scott Bambrough
  2012-10-09 14:18                       ` Mans Rullgard
  2 siblings, 1 reply; 133+ messages in thread
From: Scott Bambrough @ 2012-10-09 14:05 UTC (permalink / raw)
  To: linux-arm-kernel

On 12-10-05 12:01 PM, Rob Herring wrote:
> On 10/05/2012 08:51 AM, Mikael Pettersson wrote:
>> Rob Herring writes:
>>   > On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>>   > > On Fri, Oct 05, 2012 at 09:20:56AM +0100, Mans Rullgard wrote:
>>   > >> On 5 October 2012 08:12, Russell King - ARM Linux
>>   > >> <linux@arm.linux.org.uk> wrote:
>>   > >>> On Fri, Oct 05, 2012 at 03:25:16AM +0100, Mans Rullgard wrote:
>>   > >>>> On 5 October 2012 02:56, Rob Herring <robherring2@gmail.com> wrote:
>>   > >>>>> This struct is the IP header, so a struct ptr is just set to the
>>   > >>>>> beginning of the received data. Since ethernet headers are 14 bytes,
>>   > >>>>> often the IP header is not aligned unless the NIC can place the frame at
>>   > >>>>> a 2 byte offset (which is something I need to investigate). So this
>>   > >>>>> function cannot make any assumptions about the alignment. Does the ABI
>>   > >>>>> define structs have some minimum alignment? Does the struct need to be
>>   > >>>>> declared as packed or something?
>>   > >>>>
>>   > >>>> The ABI defines the alignment of structs as the maximum alignment of its
>>   > >>>> members.  Since this struct contains 32-bit members, the alignment for the
>>   > >>>> whole struct becomes 32 bits as well.  Declaring it as packed tells gcc it
>>   > >>>> might be unaligned (in addition to removing any holes within).
>>   > >>>
>>   > >>> This has come up before in the past.
>>   > >>>
>>   > >>> The Linux network folk will _not_ allow - in any shape or form - for
>>   > >>> this struct to be marked packed (it's the struct which needs to be
>>   > >>> marked packed) because by doing so, it causes GCC to issue byte loads/
>>   > >>> stores on architectures where there isn't a problem, and that decreases
>>   > >>> the performance of the Linux IP stack unnecessarily.
>>   > >>
>>   > >> Which architectures?  I have never seen anything like that.
>>   > >
>>   > > Does it matter?  I'm just relaying the argument against adding __packed
>>   > > which was used before we were forced (by the networking folk) to implement
>>   > > the alignment fault handler.
>>   >
>>   > It doesn't really matter what will be accepted or not as adding __packed
>>   > to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
>>   > The only way I've found to eliminate the alignment fault is adding a
>>   > barrier between the 2 loads. That seems like a compiler issue to me if
>>   > there is not a better fix.
>>
>> If you suspect a GCC bug, please prepare a standalone user-space test case
>> and submit it to GCC's bugzilla (I can do the latter if you absolutely do not
>> want to).  It wouldn't be the first alignment-related GCC bug...
>>
>
> Here's a testcase. Compiled on ubuntu precise with
> "arm-linux-gnueabihf-gcc -O2 -marm -march=armv7-a test.c".
>
> typedef unsigned short u16;
> typedef unsigned short __sum16;
> typedef unsigned int __u32;
> typedef unsigned char __u8;
> typedef __u32 __be32;
> typedef u16 __be16;
>
> struct iphdr {
> 	__u8	ihl:4,
> 		version:4;
> 	__u8	tos;
> 	__be16	tot_len;
> 	__be16	id;
> 	__be16	frag_off;
> 	__u8	ttl;
> 	__u8	protocol;
> 	__sum16	check;
> 	__be32	saddr;
> 	__be32	daddr;
> 	/*The options start here. */
> };

I was reading this thread with some interest.  AFAIK, with the default 
alignment rules the above struct is packed; there will be no holes in it.

>
> #define ntohl(x) __swab32((__u32)(__be32)(x))
> #define IP_DF		0x4000		/* Flag: "Don't Fragment"	*/
>
> static inline __attribute__((const)) __u32 __swab32(__u32 x)
> {
> 	__asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
> 	return x;
> }
>
> int main(void * buffer, unsigned int *p_id)
> {
> 	unsigned int id;
> 	int flush = 1;
> 	const struct iphdr *iph = buffer;
> 	__u32 len = *p_id;
> 	
> 	id = ntohl(*(__be32 *)&iph->id);

The above statement is the problem.  I think it is poorly written 
networking code.  It takes the address of a 16 bit quantity (aligned on 
a halfword address), attempts to do a type conversion using pointers, 
then dereference it.  I would have thought:

id = ntohs(iph->id);

would have been enough.

Scott

-- 
Scott Bambrough
Technical Director, Member Services
Linaro Ltd.
email: scott.bambrough at linaro.org
irc: scottb (freenode, irc.linaro.org)
web: http://www.linaro.org
Linaro: The future of Linux on ARM.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-09 14:05                     ` Scott Bambrough
@ 2012-10-09 14:18                       ` Mans Rullgard
  0 siblings, 0 replies; 133+ messages in thread
From: Mans Rullgard @ 2012-10-09 14:18 UTC (permalink / raw)
  To: linux-arm-kernel

On 9 October 2012 15:05, Scott Bambrough <scott.bambrough@linaro.org> wrote:
> On 12-10-05 12:01 PM, Rob Herring wrote:
>> Here's a testcase. Compiled on ubuntu precise with
>> "arm-linux-gnueabihf-gcc -O2 -marm -march=armv7-a test.c".
>>
>> typedef unsigned short u16;
>> typedef unsigned short __sum16;
>> typedef unsigned int __u32;
>> typedef unsigned char __u8;
>> typedef __u32 __be32;
>> typedef u16 __be16;
>>
>> struct iphdr {
>>         __u8    ihl:4,
>>                 version:4;
>>         __u8    tos;
>>         __be16  tot_len;
>>         __be16  id;
>>         __be16  frag_off;
>>         __u8    ttl;
>>         __u8    protocol;
>>         __sum16 check;
>>         __be32  saddr;
>>         __be32  daddr;
>>         /*The options start here. */
>> };
>
>
> I was reading this thread with some interest.  AFAIK, with the default
> alignment rules the above struct is packed; there will be no holes in it.

Correct.  The problem here is that something is passing around a pointer to
such a struct which is not 4-byte aligned as required by the ABI rules.

>> #define ntohl(x) __swab32((__u32)(__be32)(x))
>> #define IP_DF           0x4000          /* Flag: "Don't Fragment"       */
>>
>> static inline __attribute__((const)) __u32 __swab32(__u32 x)
>> {
>>         __asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
>>         return x;
>> }
>>
>> int main(void * buffer, unsigned int *p_id)
>> {
>>         unsigned int id;
>>         int flush = 1;
>>         const struct iphdr *iph = buffer;
>>         __u32 len = *p_id;
>>
>>         id = ntohl(*(__be32 *)&iph->id);
>
>
> The above statement is the problem.  I think it is poorly written networking
> code.  It takes the address of a 16 bit quantity (aligned on a halfword
> address),

The 'id' member starts 4 bytes into the struct, so if the struct is properly
aligned, there will be no fault here.  The problem is that networking code does
not always align these structs correctly.

> attempts to do a type conversion using pointers, then dereference
> it.  I would have thought:
>
> id = ntohs(iph->id);
>
> would have been enough.

I'm assuming the is intentionally merging two 16-bit fields into one 32-bit
value.  What you suggest would do something rather different.

-- 
Mans Rullgard / mru

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 14:33                   ` Rob Herring
@ 2012-10-11  0:59                     ` Jon Masters
  2012-10-11  2:27                         ` Måns Rullgård
  0 siblings, 1 reply; 133+ messages in thread
From: Jon Masters @ 2012-10-11  0:59 UTC (permalink / raw)
  To: linux-arm-kernel

Hi everyone,

On 10/05/2012 10:33 AM, Rob Herring wrote:
> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>>>> Does it matter?  I'm just relaying the argument against adding __packed
>>>> which was used before we were forced (by the networking folk) to implement
>>>> the alignment fault handler.
>>>
>>> It doesn't really matter what will be accepted or not as adding __packed
>>> to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
>>> The only way I've found to eliminate the alignment fault is adding a
>>> barrier between the 2 loads. That seems like a compiler issue to me if
>>> there is not a better fix.
>>
>> Even so, please test the patch I've sent you in the sub-thread - that
>> needs testing whether or not GCC is at fault.  Will's patch to add the
>> warnings _has_ uncovered a potential issue with the use of __get_user()
>> in some parts of the ARM specific kernel, and I really need you to test
>> that while you're experiencing this problem.
> 
> I've tested your patch and it appears to fix things. Thanks!

Ok. I'm looking for a short term solution in the Fedora kernel because
we've been bitten by this bug (I've been following this thread). I
considered just reverting Will's patch, but that only sweeps the issue
under the rug back to where we were in our 3.4 kernel. So, should we
pull in rmk's fix? It seems there are two problems here:

1). Missaligned access fault handling atomicity in general
2). Assumptions around struct alignment that turn out not to be true at
runtime due to the way that the structs are actually then aligned.

> Now on to getting rid of faults on practically every single received IP
> packet:
> 
> Multi:          9871002
> 
> RX packets:9872010 errors:0 dropped:0 overruns:0 frame:0

This will still be a problem, indeed. At least we can be aware we're
taking a large number of faults and hope for a netdev solution.

Jon.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11  0:59                     ` Jon Masters
@ 2012-10-11  2:27                         ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11  2:27 UTC (permalink / raw)
  To: Jon Masters; +Cc: netdev, linux-arm-kernel

Jon Masters <jonathan@jonmasters.org> writes:

> Hi everyone,
>
> On 10/05/2012 10:33 AM, Rob Herring wrote:
>> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
>>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>>>>> Does it matter?  I'm just relaying the argument against adding __packed
>>>>> which was used before we were forced (by the networking folk) to implement
>>>>> the alignment fault handler.
>>>>
>>>> It doesn't really matter what will be accepted or not as adding __packed
>>>> to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
>>>> The only way I've found to eliminate the alignment fault is adding a
>>>> barrier between the 2 loads. That seems like a compiler issue to me if
>>>> there is not a better fix.

This turns out to be caused by pointers being typecast to normal
(aligned) types.

>>> Even so, please test the patch I've sent you in the sub-thread - that
>>> needs testing whether or not GCC is at fault.  Will's patch to add the
>>> warnings _has_ uncovered a potential issue with the use of __get_user()
>>> in some parts of the ARM specific kernel, and I really need you to test
>>> that while you're experiencing this problem.
>> 
>> I've tested your patch and it appears to fix things. Thanks!

[...]

>> Now on to getting rid of faults on practically every single received IP
>> packet:
>> 
>> Multi:          9871002
>> 
>> RX packets:9872010 errors:0 dropped:0 overruns:0 frame:0
>
> This will still be a problem, indeed. At least we can be aware we're
> taking a large number of faults and hope for a netdev solution.

There are exactly two possible solutions:

1. Change the networking code so those structs are always aligned.  This
   might not be (easily) possible.
2. Mark the structs __packed and fix any typecasts like the ones seen in
   this thread.  This will have an adverse effect in cases where the
   structs are in fact aligned.

Both solutions lie squarely in the networking code.  It's time to
involve that list, or we'll never get anywhere.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11  2:27                         ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11  2:27 UTC (permalink / raw)
  To: linux-arm-kernel

Jon Masters <jonathan@jonmasters.org> writes:

> Hi everyone,
>
> On 10/05/2012 10:33 AM, Rob Herring wrote:
>> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
>>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>>>>> Does it matter?  I'm just relaying the argument against adding __packed
>>>>> which was used before we were forced (by the networking folk) to implement
>>>>> the alignment fault handler.
>>>>
>>>> It doesn't really matter what will be accepted or not as adding __packed
>>>> to struct iphdr doesn't fix the problem anyway. gcc still emits a ldm.
>>>> The only way I've found to eliminate the alignment fault is adding a
>>>> barrier between the 2 loads. That seems like a compiler issue to me if
>>>> there is not a better fix.

This turns out to be caused by pointers being typecast to normal
(aligned) types.

>>> Even so, please test the patch I've sent you in the sub-thread - that
>>> needs testing whether or not GCC is at fault.  Will's patch to add the
>>> warnings _has_ uncovered a potential issue with the use of __get_user()
>>> in some parts of the ARM specific kernel, and I really need you to test
>>> that while you're experiencing this problem.
>> 
>> I've tested your patch and it appears to fix things. Thanks!

[...]

>> Now on to getting rid of faults on practically every single received IP
>> packet:
>> 
>> Multi:          9871002
>> 
>> RX packets:9872010 errors:0 dropped:0 overruns:0 frame:0
>
> This will still be a problem, indeed. At least we can be aware we're
> taking a large number of faults and hope for a netdev solution.

There are exactly two possible solutions:

1. Change the networking code so those structs are always aligned.  This
   might not be (easily) possible.
2. Mark the structs __packed and fix any typecasts like the ones seen in
   this thread.  This will have an adverse effect in cases where the
   structs are in fact aligned.

Both solutions lie squarely in the networking code.  It's time to
involve that list, or we'll never get anywhere.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11  2:27                         ` Måns Rullgård
@ 2012-10-11  2:34                           ` Jon Masters
  -1 siblings, 0 replies; 133+ messages in thread
From: Jon Masters @ 2012-10-11  2:34 UTC (permalink / raw)
  To: Måns Rullgård; +Cc: netdev, linux-arm-kernel

On 10/10/2012 10:27 PM, Måns Rullgård wrote:

> There are exactly two possible solutions:
> 
> 1. Change the networking code so those structs are always aligned.  This
>    might not be (easily) possible.
> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>    this thread.  This will have an adverse effect in cases where the
>    structs are in fact aligned.
> 
> Both solutions lie squarely in the networking code.  It's time to
> involve that list, or we'll never get anywhere.

Sure, please do let's figure out the plan. But my question is tangential
- I am after input from rmk on whether that patch he posted to fix the
atomicity of missaligned faults is going to be something we should plan
to be pulling into 3.6 for Fedora (after there's an official version) to
correct the might_fault warnings, etc.

Meanwhile, a separate fix of some kind is still likely to be needed
because we don't want to take a large number of alignment traps.

Jon.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11  2:34                           ` Jon Masters
  0 siblings, 0 replies; 133+ messages in thread
From: Jon Masters @ 2012-10-11  2:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/10/2012 10:27 PM, M?ns Rullg?rd wrote:

> There are exactly two possible solutions:
> 
> 1. Change the networking code so those structs are always aligned.  This
>    might not be (easily) possible.
> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>    this thread.  This will have an adverse effect in cases where the
>    structs are in fact aligned.
> 
> Both solutions lie squarely in the networking code.  It's time to
> involve that list, or we'll never get anywhere.

Sure, please do let's figure out the plan. But my question is tangential
- I am after input from rmk on whether that patch he posted to fix the
atomicity of missaligned faults is going to be something we should plan
to be pulling into 3.6 for Fedora (after there's an official version) to
correct the might_fault warnings, etc.

Meanwhile, a separate fix of some kind is still likely to be needed
because we don't want to take a large number of alignment traps.

Jon.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: alignment faults in 3.6
  2012-10-11  2:27                         ` Måns Rullgård
@ 2012-10-11  8:21                           ` David Laight
  -1 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-11  8:21 UTC (permalink / raw)
  To: Måns Rullgård, Jon Masters; +Cc: linux-arm-kernel, netdev



> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Måns Rullgård
> Sent: 11 October 2012 03:27
> To: Jon Masters
> Cc: linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org
> Subject: Re: alignment faults in 3.6
> 
> Jon Masters <jonathan@jonmasters.org> writes:
> 
> > Hi everyone,
> >
> > On 10/05/2012 10:33 AM, Rob Herring wrote:
> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
> >>>>> which was used before we were forced (by the networking folk) to implement
> >>>>> the alignment fault handler.
> >>>>
> >>>> It doesn't really matter what will be accepted or not as adding __packed
> >>>> to struct iphdr doesn't fix the problem anyway. 
...
> There are exactly two possible solutions:
> 
> 1. Change the networking code so those structs are always aligned.  This
>    might not be (easily) possible.
> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>    this thread.  This will have an adverse effect in cases where the
>    structs are in fact aligned.
> 
> Both solutions lie squarely in the networking code.  It's time to
> involve that list, or we'll never get anywhere.

It might be enough to use __attribute__((aligned(2))) on some structure
members (actually does 'ldm' need 8 byte alignment?? - in which case
aligned(4) is enough).

One on my bugbears is hardware that will once receive ethernet frames
onto a 4n boundary - it needs to be 4n+2. two bytes of 'junk' will do.

	David

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11  8:21                           ` David Laight
  0 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-11  8:21 UTC (permalink / raw)
  To: linux-arm-kernel



> -----Original Message-----
> From: netdev-owner at vger.kernel.org [mailto:netdev-owner at vger.kernel.org] On Behalf Of M?ns Rullg?rd
> Sent: 11 October 2012 03:27
> To: Jon Masters
> Cc: linux-arm-kernel at lists.infradead.org; netdev at vger.kernel.org
> Subject: Re: alignment faults in 3.6
> 
> Jon Masters <jonathan@jonmasters.org> writes:
> 
> > Hi everyone,
> >
> > On 10/05/2012 10:33 AM, Rob Herring wrote:
> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
> >>>>> which was used before we were forced (by the networking folk) to implement
> >>>>> the alignment fault handler.
> >>>>
> >>>> It doesn't really matter what will be accepted or not as adding __packed
> >>>> to struct iphdr doesn't fix the problem anyway. 
...
> There are exactly two possible solutions:
> 
> 1. Change the networking code so those structs are always aligned.  This
>    might not be (easily) possible.
> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>    this thread.  This will have an adverse effect in cases where the
>    structs are in fact aligned.
> 
> Both solutions lie squarely in the networking code.  It's time to
> involve that list, or we'll never get anywhere.

It might be enough to use __attribute__((aligned(2))) on some structure
members (actually does 'ldm' need 8 byte alignment?? - in which case
aligned(4) is enough).

One on my bugbears is hardware that will once receive ethernet frames
onto a 4n boundary - it needs to be 4n+2. two bytes of 'junk' will do.

	David

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11  8:21                           ` David Laight
@ 2012-10-11  8:53                             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-11  8:53 UTC (permalink / raw)
  To: David Laight
  Cc: Måns Rullgård, Jon Masters, netdev, linux-arm-kernel

On Thu, Oct 11, 2012 at 09:21:35AM +0100, David Laight wrote:
> It might be enough to use __attribute__((aligned(2))) on some structure
> members (actually does 'ldm' need 8 byte alignment?? - in which case
> aligned(4) is enough).

No, ldm just needs 4 byte alignment, the same as normal word loads/stores
not to fault.  The only instructions which needs 8 byte alignment not to
fault are the double-word load/stores.

> One on my bugbears is hardware that will once receive ethernet frames
> onto a 4n boundary - it needs to be 4n+2. two bytes of 'junk' will do.

Indeed, but remember that is just a mere optimisation for IPv4.  What
may be true of IPv4 is not necessarily true of other protocols, though
IPv4 is currently the dominant protocol today.  IPv6 follows the same
alignment rules as IPv4, so it's unlikely to change.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11  8:53                             ` Russell King - ARM Linux
  0 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-11  8:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Oct 11, 2012 at 09:21:35AM +0100, David Laight wrote:
> It might be enough to use __attribute__((aligned(2))) on some structure
> members (actually does 'ldm' need 8 byte alignment?? - in which case
> aligned(4) is enough).

No, ldm just needs 4 byte alignment, the same as normal word loads/stores
not to fault.  The only instructions which needs 8 byte alignment not to
fault are the double-word load/stores.

> One on my bugbears is hardware that will once receive ethernet frames
> onto a 4n boundary - it needs to be 4n+2. two bytes of 'junk' will do.

Indeed, but remember that is just a mere optimisation for IPv4.  What
may be true of IPv4 is not necessarily true of other protocols, though
IPv4 is currently the dominant protocol today.  IPv6 follows the same
alignment rules as IPv4, so it's unlikely to change.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11  8:21                           ` David Laight
@ 2012-10-11  9:45                             ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11  9:45 UTC (permalink / raw)
  To: David Laight
  Cc: Måns Rullgård, Jon Masters, linux-arm-kernel, netdev

"David Laight" <David.Laight@ACULAB.COM> writes:

>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Måns Rullgård
>> Sent: 11 October 2012 03:27
>> To: Jon Masters
>> Cc: linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org
>> Subject: Re: alignment faults in 3.6
>> 
>> Jon Masters <jonathan@jonmasters.org> writes:
>> 
>> > Hi everyone,
>> >
>> > On 10/05/2012 10:33 AM, Rob Herring wrote:
>> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
>> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
>> >>>>> which was used before we were forced (by the networking folk) to implement
>> >>>>> the alignment fault handler.
>> >>>>
>> >>>> It doesn't really matter what will be accepted or not as adding __packed
>> >>>> to struct iphdr doesn't fix the problem anyway. 
> ...
>> There are exactly two possible solutions:
>> 
>> 1. Change the networking code so those structs are always aligned.  This
>>    might not be (easily) possible.
>> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>>    this thread.  This will have an adverse effect in cases where the
>>    structs are in fact aligned.
>> 
>> Both solutions lie squarely in the networking code.  It's time to
>> involve that list, or we'll never get anywhere.
>
> It might be enough to use __attribute__((aligned(2))) on some structure
> members (actually does 'ldm' need 8 byte alignment?? - in which case
> aligned(4) is enough).

The aligned attribute can only increase alignment.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11  9:45                             ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11  9:45 UTC (permalink / raw)
  To: linux-arm-kernel

"David Laight" <David.Laight@ACULAB.COM> writes:

>> -----Original Message-----
>> From: netdev-owner at vger.kernel.org [mailto:netdev-owner at vger.kernel.org] On Behalf Of M?ns Rullg?rd
>> Sent: 11 October 2012 03:27
>> To: Jon Masters
>> Cc: linux-arm-kernel at lists.infradead.org; netdev at vger.kernel.org
>> Subject: Re: alignment faults in 3.6
>> 
>> Jon Masters <jonathan@jonmasters.org> writes:
>> 
>> > Hi everyone,
>> >
>> > On 10/05/2012 10:33 AM, Rob Herring wrote:
>> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
>> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
>> >>>>> which was used before we were forced (by the networking folk) to implement
>> >>>>> the alignment fault handler.
>> >>>>
>> >>>> It doesn't really matter what will be accepted or not as adding __packed
>> >>>> to struct iphdr doesn't fix the problem anyway. 
> ...
>> There are exactly two possible solutions:
>> 
>> 1. Change the networking code so those structs are always aligned.  This
>>    might not be (easily) possible.
>> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>>    this thread.  This will have an adverse effect in cases where the
>>    structs are in fact aligned.
>> 
>> Both solutions lie squarely in the networking code.  It's time to
>> involve that list, or we'll never get anywhere.
>
> It might be enough to use __attribute__((aligned(2))) on some structure
> members (actually does 'ldm' need 8 byte alignment?? - in which case
> aligned(4) is enough).

The aligned attribute can only increase alignment.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11  9:45                             ` Måns Rullgård
@ 2012-10-11 10:00                               ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 10:00 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: David Laight, Jon Masters, linux-arm-kernel, netdev

On Thu, 2012-10-11 at 10:45 +0100, Måns Rullgård wrote:
> "David Laight" <David.Laight@ACULAB.COM> writes:
> 
> >> -----Original Message-----
> >> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Måns Rullgård
> >> Sent: 11 October 2012 03:27
> >> To: Jon Masters
> >> Cc: linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org
> >> Subject: Re: alignment faults in 3.6
> >> 
> >> Jon Masters <jonathan@jonmasters.org> writes:
> >> 
> >> > Hi everyone,
> >> >
> >> > On 10/05/2012 10:33 AM, Rob Herring wrote:
> >> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
> >> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
> >> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
> >> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
> >> >>>>> which was used before we were forced (by the networking folk) to implement
> >> >>>>> the alignment fault handler.
> >> >>>>
> >> >>>> It doesn't really matter what will be accepted or not as adding __packed
> >> >>>> to struct iphdr doesn't fix the problem anyway. 
> > ...
> >> There are exactly two possible solutions:
> >> 
> >> 1. Change the networking code so those structs are always aligned.  This
> >>    might not be (easily) possible.
> >> 2. Mark the structs __packed and fix any typecasts like the ones seen in
> >>    this thread.  This will have an adverse effect in cases where the
> >>    structs are in fact aligned.
> >> 
> >> Both solutions lie squarely in the networking code.  It's time to
> >> involve that list, or we'll never get anywhere.
> >
> > It might be enough to use __attribute__((aligned(2))) on some structure
> > members (actually does 'ldm' need 8 byte alignment?? - in which case
> > aligned(4) is enough).
> 
> The aligned attribute can only increase alignment.
> 

I have no idea what is the problem, 

-ENOTENOUGHCONTEXT

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:00                               ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 10:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 10:45 +0100, M?ns Rullg?rd wrote:
> "David Laight" <David.Laight@ACULAB.COM> writes:
> 
> >> -----Original Message-----
> >> From: netdev-owner at vger.kernel.org [mailto:netdev-owner at vger.kernel.org] On Behalf Of M?ns Rullg?rd
> >> Sent: 11 October 2012 03:27
> >> To: Jon Masters
> >> Cc: linux-arm-kernel at lists.infradead.org; netdev at vger.kernel.org
> >> Subject: Re: alignment faults in 3.6
> >> 
> >> Jon Masters <jonathan@jonmasters.org> writes:
> >> 
> >> > Hi everyone,
> >> >
> >> > On 10/05/2012 10:33 AM, Rob Herring wrote:
> >> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
> >> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
> >> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
> >> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
> >> >>>>> which was used before we were forced (by the networking folk) to implement
> >> >>>>> the alignment fault handler.
> >> >>>>
> >> >>>> It doesn't really matter what will be accepted or not as adding __packed
> >> >>>> to struct iphdr doesn't fix the problem anyway. 
> > ...
> >> There are exactly two possible solutions:
> >> 
> >> 1. Change the networking code so those structs are always aligned.  This
> >>    might not be (easily) possible.
> >> 2. Mark the structs __packed and fix any typecasts like the ones seen in
> >>    this thread.  This will have an adverse effect in cases where the
> >>    structs are in fact aligned.
> >> 
> >> Both solutions lie squarely in the networking code.  It's time to
> >> involve that list, or we'll never get anywhere.
> >
> > It might be enough to use __attribute__((aligned(2))) on some structure
> > members (actually does 'ldm' need 8 byte alignment?? - in which case
> > aligned(4) is enough).
> 
> The aligned attribute can only increase alignment.
> 

I have no idea what is the problem, 

-ENOTENOUGHCONTEXT

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: alignment faults in 3.6
  2012-10-11  9:45                             ` Måns Rullgård
@ 2012-10-11 10:16                               ` David Laight
  -1 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-11 10:16 UTC (permalink / raw)
  To: Måns Rullgård; +Cc: Jon Masters, linux-arm-kernel, netdev

> > It might be enough to use __attribute__((aligned(2))) on some structure
> > members (actually does 'ldm' need 8 byte alignment?? - in which case
> > aligned(4) is enough).
> 
> The aligned attribute can only increase alignment.

Not true.
If you have:

struct foo {
    short   a;
    int     b __attribute__((aligned(2)));
    short   c;
};

You'll find sizeof (struct foo) is 8, and that, on arm/sparc etc,
gcc will generate 2 16bit accesses (and shifts) for foo.b;

	David

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:16                               ` David Laight
  0 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-11 10:16 UTC (permalink / raw)
  To: linux-arm-kernel

> > It might be enough to use __attribute__((aligned(2))) on some structure
> > members (actually does 'ldm' need 8 byte alignment?? - in which case
> > aligned(4) is enough).
> 
> The aligned attribute can only increase alignment.

Not true.
If you have:

struct foo {
    short   a;
    int     b __attribute__((aligned(2)));
    short   c;
};

You'll find sizeof (struct foo) is 8, and that, on arm/sparc etc,
gcc will generate 2 16bit accesses (and shifts) for foo.b;

	David

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:00                               ` Eric Dumazet
@ 2012-10-11 10:20                                 ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 10:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Måns Rullgård, David Laight, Jon Masters,
	linux-arm-kernel, netdev

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Thu, 2012-10-11 at 10:45 +0100, Måns Rullgård wrote:
>> "David Laight" <David.Laight@ACULAB.COM> writes:
>> 
>> >> -----Original Message-----
>> >> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Måns Rullgård
>> >> Sent: 11 October 2012 03:27
>> >> To: Jon Masters
>> >> Cc: linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org
>> >> Subject: Re: alignment faults in 3.6
>> >> 
>> >> Jon Masters <jonathan@jonmasters.org> writes:
>> >> 
>> >> > Hi everyone,
>> >> >
>> >> > On 10/05/2012 10:33 AM, Rob Herring wrote:
>> >> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
>> >> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>> >> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>> >> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
>> >> >>>>> which was used before we were forced (by the networking folk) to implement
>> >> >>>>> the alignment fault handler.
>> >> >>>>
>> >> >>>> It doesn't really matter what will be accepted or not as adding __packed
>> >> >>>> to struct iphdr doesn't fix the problem anyway. 
>> > ...
>> >> There are exactly two possible solutions:
>> >> 
>> >> 1. Change the networking code so those structs are always aligned.  This
>> >>    might not be (easily) possible.
>> >> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>> >>    this thread.  This will have an adverse effect in cases where the
>> >>    structs are in fact aligned.
>> >> 
>> >> Both solutions lie squarely in the networking code.  It's time to
>> >> involve that list, or we'll never get anywhere.
>> >
>> > It might be enough to use __attribute__((aligned(2))) on some structure
>> > members (actually does 'ldm' need 8 byte alignment?? - in which case
>> > aligned(4) is enough).
>> 
>> The aligned attribute can only increase alignment.
>
> I have no idea what is the problem, 
>
> -ENOTENOUGHCONTEXT

The thread starts here:
http://marc.info/?l=linux-arm-kernel&m=134939228120020

Summary: a pointer to "struct iphdr" is not 4-byte aligned as required
by the ARM ABI rules, and this causes traps to the unaligned access
fault handler.  A recent change makes the kernel print "scheduling while
atomic" warnings on some of these traps, which may or may not be
benign.  Either way, this is bad for performance and should be fixed one
way or another.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:20                                 ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 10:20 UTC (permalink / raw)
  To: linux-arm-kernel

Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Thu, 2012-10-11 at 10:45 +0100, M?ns Rullg?rd wrote:
>> "David Laight" <David.Laight@ACULAB.COM> writes:
>> 
>> >> -----Original Message-----
>> >> From: netdev-owner at vger.kernel.org [mailto:netdev-owner at vger.kernel.org] On Behalf Of M?ns Rullg?rd
>> >> Sent: 11 October 2012 03:27
>> >> To: Jon Masters
>> >> Cc: linux-arm-kernel at lists.infradead.org; netdev at vger.kernel.org
>> >> Subject: Re: alignment faults in 3.6
>> >> 
>> >> Jon Masters <jonathan@jonmasters.org> writes:
>> >> 
>> >> > Hi everyone,
>> >> >
>> >> > On 10/05/2012 10:33 AM, Rob Herring wrote:
>> >> >> On 10/05/2012 09:05 AM, Russell King - ARM Linux wrote:
>> >> >>> On Fri, Oct 05, 2012 at 07:24:44AM -0500, Rob Herring wrote:
>> >> >>>> On 10/05/2012 03:24 AM, Russell King - ARM Linux wrote:
>> >> >>>>> Does it matter?  I'm just relaying the argument against adding __packed
>> >> >>>>> which was used before we were forced (by the networking folk) to implement
>> >> >>>>> the alignment fault handler.
>> >> >>>>
>> >> >>>> It doesn't really matter what will be accepted or not as adding __packed
>> >> >>>> to struct iphdr doesn't fix the problem anyway. 
>> > ...
>> >> There are exactly two possible solutions:
>> >> 
>> >> 1. Change the networking code so those structs are always aligned.  This
>> >>    might not be (easily) possible.
>> >> 2. Mark the structs __packed and fix any typecasts like the ones seen in
>> >>    this thread.  This will have an adverse effect in cases where the
>> >>    structs are in fact aligned.
>> >> 
>> >> Both solutions lie squarely in the networking code.  It's time to
>> >> involve that list, or we'll never get anywhere.
>> >
>> > It might be enough to use __attribute__((aligned(2))) on some structure
>> > members (actually does 'ldm' need 8 byte alignment?? - in which case
>> > aligned(4) is enough).
>> 
>> The aligned attribute can only increase alignment.
>
> I have no idea what is the problem, 
>
> -ENOTENOUGHCONTEXT

The thread starts here:
http://marc.info/?l=linux-arm-kernel&m=134939228120020

Summary: a pointer to "struct iphdr" is not 4-byte aligned as required
by the ARM ABI rules, and this causes traps to the unaligned access
fault handler.  A recent change makes the kernel print "scheduling while
atomic" warnings on some of these traps, which may or may not be
benign.  Either way, this is bad for performance and should be fixed one
way or another.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:00                               ` Eric Dumazet
@ 2012-10-11 10:22                                 ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 10:22 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: David Laight, Jon Masters, linux-arm-kernel, netdev

On Thu, 2012-10-11 at 12:00 +0200, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 10:45 +0100, Måns Rullgård wrote:

> 
> I have no idea what is the problem, 
> 
> -ENOTENOUGHCONTEXT
> 
> 

I took a look, and I dont see why/how gcc could use a ldm instruction

Doing so assumed the alignment of the structure was 8 bytes, but its
not.

Networking stack mandates that IP headers are aligned on 4 bytes
boundaries, not 8 bytes.

(Some arches like x86 dont care, so we might have some bugs in some
drivers, but not in the GRO code)

Some drivers are not aware of the NET_IP_ALIGN stuff.
They should be fixed, or else you have alignment faults.

struct iphdr {
 	__u8	ihl:4,
		version:4;
	__u8	tos;
	__be16	tot_len;
	__be16	id;
	__be16	frag_off;
	__u8	ttl;
	__u8	protocol;
	__sum16	check;
	__be32	saddr;
	__be32	daddr;
	/*The options start here. */
};

The alignment of iphdr is 4, not 8

doing id = ntohl(*(__be32 *)&iph->id); is valid

doing flush = (u16)((ntohl(*(__be32 *)iph) ^ len) | (id ^ IP_DF)); is valid as well.

If arm compiler decided to use a 8bytes load, thats a compiler bug.

(unless compiler was specifically told that alignment did not matter)

c02ac020:       e8920840        ldm     r2, {r6, fp}    // HERE
c02ac024:       e6bfbf3b        rev     fp, fp
c02ac028:       e6bf6f36        rev     r6, r6
c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
c02ac030:       e0266008        eor     r6, r6, r8
c02ac034:       e18c6006        orr     r6, ip, r6

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:22                                 ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 10:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 12:00 +0200, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 10:45 +0100, M?ns Rullg?rd wrote:

> 
> I have no idea what is the problem, 
> 
> -ENOTENOUGHCONTEXT
> 
> 

I took a look, and I dont see why/how gcc could use a ldm instruction

Doing so assumed the alignment of the structure was 8 bytes, but its
not.

Networking stack mandates that IP headers are aligned on 4 bytes
boundaries, not 8 bytes.

(Some arches like x86 dont care, so we might have some bugs in some
drivers, but not in the GRO code)

Some drivers are not aware of the NET_IP_ALIGN stuff.
They should be fixed, or else you have alignment faults.

struct iphdr {
 	__u8	ihl:4,
		version:4;
	__u8	tos;
	__be16	tot_len;
	__be16	id;
	__be16	frag_off;
	__u8	ttl;
	__u8	protocol;
	__sum16	check;
	__be32	saddr;
	__be32	daddr;
	/*The options start here. */
};

The alignment of iphdr is 4, not 8

doing id = ntohl(*(__be32 *)&iph->id); is valid

doing flush = (u16)((ntohl(*(__be32 *)iph) ^ len) | (id ^ IP_DF)); is valid as well.

If arm compiler decided to use a 8bytes load, thats a compiler bug.

(unless compiler was specifically told that alignment did not matter)

c02ac020:       e8920840        ldm     r2, {r6, fp}    // HERE
c02ac024:       e6bfbf3b        rev     fp, fp
c02ac028:       e6bf6f36        rev     r6, r6
c02ac02c:       e22bc901        eor     ip, fp, #16384  ; 0x4000
c02ac030:       e0266008        eor     r6, r6, r8
c02ac034:       e18c6006        orr     r6, ip, r6

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:22                                 ` Eric Dumazet
@ 2012-10-11 10:32                                   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-11 10:32 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Måns Rullgård, Jon Masters, netdev, David Laight,
	linux-arm-kernel

On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
> I took a look, and I dont see why/how gcc could use a ldm instruction
> 
> Doing so assumed the alignment of the structure was 8 bytes, but its
> not.
> 
> Networking stack mandates that IP headers are aligned on 4 bytes
> boundaries, not 8 bytes.

Err, no.  ldm is "load multiple" not "load double".  It loads multiple
32-bit registers, and its requirement for non-faulting behaviour is for
the pointer to be 4 byte aligned.  However, "load double" requires 8
byte alignment.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:32                                   ` Russell King - ARM Linux
  0 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-11 10:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
> I took a look, and I dont see why/how gcc could use a ldm instruction
> 
> Doing so assumed the alignment of the structure was 8 bytes, but its
> not.
> 
> Networking stack mandates that IP headers are aligned on 4 bytes
> boundaries, not 8 bytes.

Err, no.  ldm is "load multiple" not "load double".  It loads multiple
32-bit registers, and its requirement for non-faulting behaviour is for
the pointer to be 4 byte aligned.  However, "load double" requires 8
byte alignment.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:16                               ` David Laight
@ 2012-10-11 10:46                                 ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 10:46 UTC (permalink / raw)
  To: David Laight
  Cc: Måns Rullgård, Jon Masters, linux-arm-kernel, netdev

"David Laight" <David.Laight@ACULAB.COM> writes:

>> > It might be enough to use __attribute__((aligned(2))) on some structure
>> > members (actually does 'ldm' need 8 byte alignment?? - in which case
>> > aligned(4) is enough).
>> 
>> The aligned attribute can only increase alignment.
>
> Not true.
> If you have:
>
> struct foo {
>     short   a;
>     int     b __attribute__((aligned(2)));
>     short   c;
> };
>
> You'll find sizeof (struct foo) is 8, and that, on arm/sparc etc,
> gcc will generate 2 16bit accesses (and shifts) for foo.b;

That is not what the manual says.  Quoting:

     When used on a struct, or struct member, the `aligned' attribute
     can only increase the alignment

My gcc agrees with the manual:

$ cat foo.c
struct foo {
    short a;
    int b __attribute__((aligned(2)));
    short c;
};

int foo(struct foo *f)
{
    return f->b;
}
$ arm-unknown-linux-gnueabi-gcc-4.7.2 -O2 -o- -S foo.c
        .text
        .align  2
        .global foo
        .type   foo, %function
foo:
        ldr     r0, [r0, #4]
        bx      lr

(compiler output edited for clarity)

This clearly says the 'b' member resides at an offset of 4 bytes into
the struct.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:46                                 ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 10:46 UTC (permalink / raw)
  To: linux-arm-kernel

"David Laight" <David.Laight@ACULAB.COM> writes:

>> > It might be enough to use __attribute__((aligned(2))) on some structure
>> > members (actually does 'ldm' need 8 byte alignment?? - in which case
>> > aligned(4) is enough).
>> 
>> The aligned attribute can only increase alignment.
>
> Not true.
> If you have:
>
> struct foo {
>     short   a;
>     int     b __attribute__((aligned(2)));
>     short   c;
> };
>
> You'll find sizeof (struct foo) is 8, and that, on arm/sparc etc,
> gcc will generate 2 16bit accesses (and shifts) for foo.b;

That is not what the manual says.  Quoting:

     When used on a struct, or struct member, the `aligned' attribute
     can only increase the alignment

My gcc agrees with the manual:

$ cat foo.c
struct foo {
    short a;
    int b __attribute__((aligned(2)));
    short c;
};

int foo(struct foo *f)
{
    return f->b;
}
$ arm-unknown-linux-gnueabi-gcc-4.7.2 -O2 -o- -S foo.c
        .text
        .align  2
        .global foo
        .type   foo, %function
foo:
        ldr     r0, [r0, #4]
        bx      lr

(compiler output edited for clarity)

This clearly says the 'b' member resides at an offset of 4 bytes into
the struct.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:32                                   ` Russell King - ARM Linux
@ 2012-10-11 10:49                                     ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 10:49 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Måns Rullgård, Jon Masters, netdev, David Laight,
	linux-arm-kernel

On Thu, 2012-10-11 at 11:32 +0100, Russell King - ARM Linux wrote:
> On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
> > I took a look, and I dont see why/how gcc could use a ldm instruction
> > 
> > Doing so assumed the alignment of the structure was 8 bytes, but its
> > not.
> > 
> > Networking stack mandates that IP headers are aligned on 4 bytes
> > boundaries, not 8 bytes.
> 
> Err, no.  ldm is "load multiple" not "load double".  It loads multiple
> 32-bit registers, and its requirement for non-faulting behaviour is for
> the pointer to be 4 byte aligned.  However, "load double" requires 8
> byte alignment.

So if you have an alignment fault, thats because IP header is not
aligned on 4 bytes ?

If so a driver is buggy and must be fixed.

Please send us full stack trace

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:49                                     ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 10:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 11:32 +0100, Russell King - ARM Linux wrote:
> On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
> > I took a look, and I dont see why/how gcc could use a ldm instruction
> > 
> > Doing so assumed the alignment of the structure was 8 bytes, but its
> > not.
> > 
> > Networking stack mandates that IP headers are aligned on 4 bytes
> > boundaries, not 8 bytes.
> 
> Err, no.  ldm is "load multiple" not "load double".  It loads multiple
> 32-bit registers, and its requirement for non-faulting behaviour is for
> the pointer to be 4 byte aligned.  However, "load double" requires 8
> byte alignment.

So if you have an alignment fault, thats because IP header is not
aligned on 4 bytes ?

If so a driver is buggy and must be fixed.

Please send us full stack trace

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:49                                     ` Eric Dumazet
@ 2012-10-11 10:56                                       ` Maxime Bizon
  -1 siblings, 0 replies; 133+ messages in thread
From: Maxime Bizon @ 2012-10-11 10:56 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Russell King - ARM Linux, Måns Rullgård, Jon Masters,
	netdev, David Laight, linux-arm-kernel


On Thu, 2012-10-11 at 12:49 +0200, Eric Dumazet wrote:


> So if you have an alignment fault, thats because IP header is not
> aligned on 4 bytes ?
> 
> If so a driver is buggy and must be fixed.

So a driver that does not align the ip header is buggy ?

I always thought it was ok not to do so (with a potential performance
penalty).

I have some MIPS hardware that is not able to DMA on anything but 32bits
aligned addresses (bcm63xx). I tried once to add a memcpy instead of
taking unaligned faults and the result was *much slower* on a ipv4
forwarding test (which is what the hardware is used for).

-- 
Maxime

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 10:56                                       ` Maxime Bizon
  0 siblings, 0 replies; 133+ messages in thread
From: Maxime Bizon @ 2012-10-11 10:56 UTC (permalink / raw)
  To: linux-arm-kernel


On Thu, 2012-10-11 at 12:49 +0200, Eric Dumazet wrote:


> So if you have an alignment fault, thats because IP header is not
> aligned on 4 bytes ?
> 
> If so a driver is buggy and must be fixed.

So a driver that does not align the ip header is buggy ?

I always thought it was ok not to do so (with a potential performance
penalty).

I have some MIPS hardware that is not able to DMA on anything but 32bits
aligned addresses (bcm63xx). I tried once to add a memcpy instead of
taking unaligned faults and the result was *much slower* on a ipv4
forwarding test (which is what the hardware is used for).

-- 
Maxime

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:56                                       ` Maxime Bizon
@ 2012-10-11 11:28                                         ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 11:28 UTC (permalink / raw)
  To: mbizon
  Cc: Russell King - ARM Linux, Måns Rullgård, Jon Masters,
	netdev, David Laight, linux-arm-kernel

On Thu, 2012-10-11 at 12:56 +0200, Maxime Bizon wrote:
> On Thu, 2012-10-11 at 12:49 +0200, Eric Dumazet wrote:
> 
> 
> > So if you have an alignment fault, thats because IP header is not
> > aligned on 4 bytes ?
> > 
> > If so a driver is buggy and must be fixed.
> 
> So a driver that does not align the ip header is buggy ?
> 

exactly.

> I always thought it was ok not to do so (with a potential performance
> penalty).
> 

Apparently not for some arches.

> I have some MIPS hardware that is not able to DMA on anything but 32bits
> aligned addresses (bcm63xx). I tried once to add a memcpy instead of
> taking unaligned faults and the result was *much slower* on a ipv4
> forwarding test (which is what the hardware is used for).
> 

You probably are aware that a driver can use : 

- a fragment to hold the frame given by the hardware, with whatever
alignment is needed by the hardware.

Then allocate an skb with enough room (128 bytes) to pull the headers as
needed later.

skb = netdev_alloc_skb_ip_align(dev, 128);

Attach the fragment to the skb, and feed stack with this skb.
(pull the ethernet header before calling eth_type_trans()

Most modern drivers exactly use this strategy and get nice performance.

Of course, if hardware doesnt need a strange alignment, we can avoid the
fragment and directly use 

skb = netdev_alloc_skb_ip_align(dev, 1536);

So all drivers can be fixed if needed.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 11:28                                         ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 11:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 12:56 +0200, Maxime Bizon wrote:
> On Thu, 2012-10-11 at 12:49 +0200, Eric Dumazet wrote:
> 
> 
> > So if you have an alignment fault, thats because IP header is not
> > aligned on 4 bytes ?
> > 
> > If so a driver is buggy and must be fixed.
> 
> So a driver that does not align the ip header is buggy ?
> 

exactly.

> I always thought it was ok not to do so (with a potential performance
> penalty).
> 

Apparently not for some arches.

> I have some MIPS hardware that is not able to DMA on anything but 32bits
> aligned addresses (bcm63xx). I tried once to add a memcpy instead of
> taking unaligned faults and the result was *much slower* on a ipv4
> forwarding test (which is what the hardware is used for).
> 

You probably are aware that a driver can use : 

- a fragment to hold the frame given by the hardware, with whatever
alignment is needed by the hardware.

Then allocate an skb with enough room (128 bytes) to pull the headers as
needed later.

skb = netdev_alloc_skb_ip_align(dev, 128);

Attach the fragment to the skb, and feed stack with this skb.
(pull the ethernet header before calling eth_type_trans()

Most modern drivers exactly use this strategy and get nice performance.

Of course, if hardware doesnt need a strange alignment, we can avoid the
fragment and directly use 

skb = netdev_alloc_skb_ip_align(dev, 1536);

So all drivers can be fixed if needed.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 11:28                                         ` Eric Dumazet
@ 2012-10-11 11:47                                           ` Maxime Bizon
  -1 siblings, 0 replies; 133+ messages in thread
From: Maxime Bizon @ 2012-10-11 11:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Måns Rullgård, Russell King - ARM Linux, netdev,
	David Laight, Jon Masters, linux-arm-kernel


On Thu, 2012-10-11 at 13:28 +0200, Eric Dumazet wrote:

> You probably are aware that a driver can use : 
> 
> - a fragment to hold the frame given by the hardware, with whatever
> alignment is needed by the hardware.
> 
> Then allocate an skb with enough room (128 bytes) to pull the headers as
> needed later.
> 
> skb = netdev_alloc_skb_ip_align(dev, 128);

What happen at tx time, supposing that same hardware cannot do SG ?

Aren't we going to memcpy the data into the head ?

-- 
Maxime

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 11:47                                           ` Maxime Bizon
  0 siblings, 0 replies; 133+ messages in thread
From: Maxime Bizon @ 2012-10-11 11:47 UTC (permalink / raw)
  To: linux-arm-kernel


On Thu, 2012-10-11 at 13:28 +0200, Eric Dumazet wrote:

> You probably are aware that a driver can use : 
> 
> - a fragment to hold the frame given by the hardware, with whatever
> alignment is needed by the hardware.
> 
> Then allocate an skb with enough room (128 bytes) to pull the headers as
> needed later.
> 
> skb = netdev_alloc_skb_ip_align(dev, 128);

What happen at tx time, supposing that same hardware cannot do SG ?

Aren't we going to memcpy the data into the head ?

-- 
Maxime

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 11:47                                           ` Maxime Bizon
@ 2012-10-11 11:54                                             ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 11:54 UTC (permalink / raw)
  To: mbizon
  Cc: Måns Rullgård, Russell King - ARM Linux, netdev,
	David Laight, Jon Masters, linux-arm-kernel

On Thu, 2012-10-11 at 13:47 +0200, Maxime Bizon wrote:
> On Thu, 2012-10-11 at 13:28 +0200, Eric Dumazet wrote:
> 
> > You probably are aware that a driver can use : 
> > 
> > - a fragment to hold the frame given by the hardware, with whatever
> > alignment is needed by the hardware.
> > 
> > Then allocate an skb with enough room (128 bytes) to pull the headers as
> > needed later.
> > 
> > skb = netdev_alloc_skb_ip_align(dev, 128);
> 
> What happen at tx time, supposing that same hardware cannot do SG ?
> 
> Aren't we going to memcpy the data into the head ?
> 

Of course, if you use a forwarding setup, and the tx driver is not SG
capable, performance will be bad (You have to copy the data into a
single skb (linearize the skb)) 

But in 2012, having to use hardware without SG for a router sounds a bad
joke (if cpu speed is _also_ too low)

Adding get_unaligned() everywhere in linux network stacks is not an
option.

We actually want to be able to read the code and fix the bugs, not only
run it on a cheap low end router.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 11:54                                             ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 11:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 13:47 +0200, Maxime Bizon wrote:
> On Thu, 2012-10-11 at 13:28 +0200, Eric Dumazet wrote:
> 
> > You probably are aware that a driver can use : 
> > 
> > - a fragment to hold the frame given by the hardware, with whatever
> > alignment is needed by the hardware.
> > 
> > Then allocate an skb with enough room (128 bytes) to pull the headers as
> > needed later.
> > 
> > skb = netdev_alloc_skb_ip_align(dev, 128);
> 
> What happen at tx time, supposing that same hardware cannot do SG ?
> 
> Aren't we going to memcpy the data into the head ?
> 

Of course, if you use a forwarding setup, and the tx driver is not SG
capable, performance will be bad (You have to copy the data into a
single skb (linearize the skb)) 

But in 2012, having to use hardware without SG for a router sounds a bad
joke (if cpu speed is _also_ too low)

Adding get_unaligned() everywhere in linux network stacks is not an
option.

We actually want to be able to read the code and fix the bugs, not only
run it on a cheap low end router.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 11:54                                             ` Eric Dumazet
@ 2012-10-11 12:00                                               ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 12:00 UTC (permalink / raw)
  To: mbizon
  Cc: Måns Rullgård, Russell King - ARM Linux, netdev,
	David Laight, Jon Masters, linux-arm-kernel

On Thu, 2012-10-11 at 13:54 +0200, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 13:47 +0200, Maxime Bizon wrote:
> > On Thu, 2012-10-11 at 13:28 +0200, Eric Dumazet wrote:
> > 
> > > You probably are aware that a driver can use : 
> > > 
> > > - a fragment to hold the frame given by the hardware, with whatever
> > > alignment is needed by the hardware.
> > > 
> > > Then allocate an skb with enough room (128 bytes) to pull the headers as
> > > needed later.
> > > 
> > > skb = netdev_alloc_skb_ip_align(dev, 128);
> > 
> > What happen at tx time, supposing that same hardware cannot do SG ?
> > 
> > Aren't we going to memcpy the data into the head ?
> > 
> 
> Of course, if you use a forwarding setup, and the tx driver is not SG
> capable, performance will be bad (You have to copy the data into a
> single skb (linearize the skb)) 

By the way, if said driver also has alignments issues, it will probably
copy the packet given by the stack. (Think you added an IPIP
encapsulation)

It would be better for such driver to still pretend it can do SG,
and it does the copy itself, avoiding the prior copies in core stack.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 12:00                                               ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 12:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 13:54 +0200, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 13:47 +0200, Maxime Bizon wrote:
> > On Thu, 2012-10-11 at 13:28 +0200, Eric Dumazet wrote:
> > 
> > > You probably are aware that a driver can use : 
> > > 
> > > - a fragment to hold the frame given by the hardware, with whatever
> > > alignment is needed by the hardware.
> > > 
> > > Then allocate an skb with enough room (128 bytes) to pull the headers as
> > > needed later.
> > > 
> > > skb = netdev_alloc_skb_ip_align(dev, 128);
> > 
> > What happen at tx time, supposing that same hardware cannot do SG ?
> > 
> > Aren't we going to memcpy the data into the head ?
> > 
> 
> Of course, if you use a forwarding setup, and the tx driver is not SG
> capable, performance will be bad (You have to copy the data into a
> single skb (linearize the skb)) 

By the way, if said driver also has alignments issues, it will probably
copy the packet given by the stack. (Think you added an IPIP
encapsulation)

It would be better for such driver to still pretend it can do SG,
and it does the copy itself, avoiding the prior copies in core stack.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:49                                     ` Eric Dumazet
@ 2012-10-11 12:28                                       ` Arnd Bergmann
  -1 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2012-10-11 12:28 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Eric Dumazet, Russell King - ARM Linux, Jon Masters, netdev,
	Måns Rullgård, David Laight, Rob Herring

On Thursday 11 October 2012, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 11:32 +0100, Russell King - ARM Linux wrote:
> > On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
> > > I took a look, and I dont see why/how gcc could use a ldm instruction
> > > 
> > > Doing so assumed the alignment of the structure was 8 bytes, but its
> > > not.
> > > 
> > > Networking stack mandates that IP headers are aligned on 4 bytes
> > > boundaries, not 8 bytes.
> > 
> > Err, no.  ldm is "load multiple" not "load double".  It loads multiple
> > 32-bit registers, and its requirement for non-faulting behaviour is for
> > the pointer to be 4 byte aligned.  However, "load double" requires 8
> > byte alignment.
> 
> So if you have an alignment fault, thats because IP header is not
> aligned on 4 bytes ?
> 
> If so a driver is buggy and must be fixed.
> 
> Please send us full stack trace

Rob Herring as the original reporter has dropped off the Cc list, adding
him back.

I assume that the calxeda xgmac driver is the culprit then. It uses
netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
xgmac_rx_refill but it is not clear whether it does so intentionally
or by accident.

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 12:28                                       ` Arnd Bergmann
  0 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2012-10-11 12:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 11 October 2012, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 11:32 +0100, Russell King - ARM Linux wrote:
> > On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
> > > I took a look, and I dont see why/how gcc could use a ldm instruction
> > > 
> > > Doing so assumed the alignment of the structure was 8 bytes, but its
> > > not.
> > > 
> > > Networking stack mandates that IP headers are aligned on 4 bytes
> > > boundaries, not 8 bytes.
> > 
> > Err, no.  ldm is "load multiple" not "load double".  It loads multiple
> > 32-bit registers, and its requirement for non-faulting behaviour is for
> > the pointer to be 4 byte aligned.  However, "load double" requires 8
> > byte alignment.
> 
> So if you have an alignment fault, thats because IP header is not
> aligned on 4 bytes ?
> 
> If so a driver is buggy and must be fixed.
> 
> Please send us full stack trace

Rob Herring as the original reporter has dropped off the Cc list, adding
him back.

I assume that the calxeda xgmac driver is the culprit then. It uses
netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
xgmac_rx_refill but it is not clear whether it does so intentionally
or by accident.

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 12:28                                       ` Arnd Bergmann
@ 2012-10-11 12:40                                         ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 12:40 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Russell King - ARM Linux, Jon Masters, netdev,
	Måns Rullgård, David Laight, Rob Herring

On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:

> 
> Rob Herring as the original reporter has dropped off the Cc list, adding
> him back.
> 
> I assume that the calxeda xgmac driver is the culprit then. It uses
> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
> xgmac_rx_refill but it is not clear whether it does so intentionally
> or by accident.

Thanks Arnd

It seems an accident, since driver doesnt check skb->data alignment at
all (this can change with SLAB debug on/off)

It also incorrectly adds 64 bytes to bfsize, there is no need for this.

(or if its needed, a comment would be nice, because on prior kernels,
this makes skb->head allocations uses kmalloc-4096 instead of
kmalloc-2048 slab cache... With 3.7 its less an issue now we use order-3
pages to deliver fragments for rx skbs

So the following patch should fix the alignment, and makes driver uses
half memory than before for stable kernels

diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index 16814b3..a895e18 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -671,7 +671,8 @@ static void xgmac_rx_refill(struct xgmac_priv *priv)
 		p = priv->dma_rx + entry;
 
 		if (priv->rx_skbuff[entry] == NULL) {
-			skb = netdev_alloc_skb(priv->dev, priv->dma_buf_sz);
+			skb = netdev_alloc_skb_ip_align(priv->dev,
+							priv->dma_buf_sz);
 			if (unlikely(skb == NULL))
 				break;
 
@@ -703,7 +704,7 @@ static int xgmac_dma_desc_rings_init(struct net_device *dev)
 	/* Set the Buffer size according to the MTU;
 	 * indeed, in case of jumbo we need to bump-up the buffer sizes.
 	 */
-	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN + 64,
+	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN,
 		       64);
 
 	netdev_dbg(priv->dev, "mtu [%d] bfsize [%d]\n", dev->mtu, bfsize);

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 12:40                                         ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 12:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:

> 
> Rob Herring as the original reporter has dropped off the Cc list, adding
> him back.
> 
> I assume that the calxeda xgmac driver is the culprit then. It uses
> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
> xgmac_rx_refill but it is not clear whether it does so intentionally
> or by accident.

Thanks Arnd

It seems an accident, since driver doesnt check skb->data alignment at
all (this can change with SLAB debug on/off)

It also incorrectly adds 64 bytes to bfsize, there is no need for this.

(or if its needed, a comment would be nice, because on prior kernels,
this makes skb->head allocations uses kmalloc-4096 instead of
kmalloc-2048 slab cache... With 3.7 its less an issue now we use order-3
pages to deliver fragments for rx skbs

So the following patch should fix the alignment, and makes driver uses
half memory than before for stable kernels

diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
index 16814b3..a895e18 100644
--- a/drivers/net/ethernet/calxeda/xgmac.c
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -671,7 +671,8 @@ static void xgmac_rx_refill(struct xgmac_priv *priv)
 		p = priv->dma_rx + entry;
 
 		if (priv->rx_skbuff[entry] == NULL) {
-			skb = netdev_alloc_skb(priv->dev, priv->dma_buf_sz);
+			skb = netdev_alloc_skb_ip_align(priv->dev,
+							priv->dma_buf_sz);
 			if (unlikely(skb == NULL))
 				break;
 
@@ -703,7 +704,7 @@ static int xgmac_dma_desc_rings_init(struct net_device *dev)
 	/* Set the Buffer size according to the MTU;
 	 * indeed, in case of jumbo we need to bump-up the buffer sizes.
 	 */
-	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN + 64,
+	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN,
 		       64);
 
 	netdev_dbg(priv->dev, "mtu [%d] bfsize [%d]\n", dev->mtu, bfsize);

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 11:54                                             ` Eric Dumazet
@ 2012-10-11 12:51                                               ` Maxime Bizon
  -1 siblings, 0 replies; 133+ messages in thread
From: Maxime Bizon @ 2012-10-11 12:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Måns Rullgård, Russell King - ARM Linux, netdev,
	David Laight, Jon Masters, linux-arm-kernel


On Thu, 2012-10-11 at 13:54 +0200, Eric Dumazet wrote:

> Of course, if you use a forwarding setup, and the tx driver is not SG
> capable, performance will be bad (You have to copy the data into a
> single skb (linearize the skb)) 
> 
> But in 2012, having to use hardware without SG for a router sounds a bad
> joke (if cpu speed is _also_ too low)

Hey I cannot go back in time, when that hardware was built in 2004 (mips
@250Mhz), it was considered good, and we did manufacture a lot of it, so
it's still maintained.

People run recent kernels on older hardware because they are *encouraged
to do so*.

I fought inside my company to be good kernel citizen, not using
proprietary BSP, rewrite & mainline the drivers, because that was the
community promise: mainline it, we will support it for you, you will get
the latest kernel features for free.


That worked, but with some drawbacks:

 - kernel footprint grew that much (we started from 2.4) that it does
not fit in device flash anymore

 - performance took a hit each time we upgrade, mostly because of cache
footprint growth.

 - as kernel footprint grew, available RAM for conntrack & route cache
entries was smaller each time


But I had to stop upgrading after 2.6.20. Everything below is not
anybody's fault. Bloat is unavoidable for software project that big.

I'm perfectly ok with that, but I don't want to be ridiculed for running
mainline kernel on old hardware.


> Adding get_unaligned() everywhere in linux network stacks is not an
> option.
> 
> We actually want to be able to read the code and fix the bugs, not only
> run it on a cheap low end router.

That was not a request, I just needed a clarification. 

Documentation/unaligned-memory-access.txt does not say it's a big no-no,
it says you can give unaligned pointers to the networking stack if you
arch can do unaligned access (with an "efficiency" notion).

MIPS and ARM have a software handler for this, and performance wise in
my case it's better to take the faults, a driver writer may think a
benchmark will dictate what to do.

-- 
Maxime

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 12:51                                               ` Maxime Bizon
  0 siblings, 0 replies; 133+ messages in thread
From: Maxime Bizon @ 2012-10-11 12:51 UTC (permalink / raw)
  To: linux-arm-kernel


On Thu, 2012-10-11 at 13:54 +0200, Eric Dumazet wrote:

> Of course, if you use a forwarding setup, and the tx driver is not SG
> capable, performance will be bad (You have to copy the data into a
> single skb (linearize the skb)) 
> 
> But in 2012, having to use hardware without SG for a router sounds a bad
> joke (if cpu speed is _also_ too low)

Hey I cannot go back in time, when that hardware was built in 2004 (mips
@250Mhz), it was considered good, and we did manufacture a lot of it, so
it's still maintained.

People run recent kernels on older hardware because they are *encouraged
to do so*.

I fought inside my company to be good kernel citizen, not using
proprietary BSP, rewrite & mainline the drivers, because that was the
community promise: mainline it, we will support it for you, you will get
the latest kernel features for free.


That worked, but with some drawbacks:

 - kernel footprint grew that much (we started from 2.4) that it does
not fit in device flash anymore

 - performance took a hit each time we upgrade, mostly because of cache
footprint growth.

 - as kernel footprint grew, available RAM for conntrack & route cache
entries was smaller each time


But I had to stop upgrading after 2.6.20. Everything below is not
anybody's fault. Bloat is unavoidable for software project that big.

I'm perfectly ok with that, but I don't want to be ridiculed for running
mainline kernel on old hardware.


> Adding get_unaligned() everywhere in linux network stacks is not an
> option.
> 
> We actually want to be able to read the code and fix the bugs, not only
> run it on a cheap low end router.

That was not a request, I just needed a clarification. 

Documentation/unaligned-memory-access.txt does not say it's a big no-no,
it says you can give unaligned pointers to the networking stack if you
arch can do unaligned access (with an "efficiency" notion).

MIPS and ARM have a software handler for this, and performance wise in
my case it's better to take the faults, a driver writer may think a
benchmark will dictate what to do.

-- 
Maxime

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 12:51                                               ` Maxime Bizon
@ 2012-10-11 12:59                                                 ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 12:59 UTC (permalink / raw)
  To: mbizon
  Cc: Måns Rullgård, Russell King - ARM Linux, netdev,
	David Laight, Jon Masters, linux-arm-kernel

On Thu, 2012-10-11 at 14:51 +0200, Maxime Bizon wrote:

> Hey I cannot go back in time, when that hardware was built in 2004 (mips
> @250Mhz), it was considered good, and we did manufacture a lot of it, so
> it's still maintained.
> 
> People run recent kernels on older hardware because they are *encouraged
> to do so*.
> 
> I fought inside my company to be good kernel citizen, not using
> proprietary BSP, rewrite & mainline the drivers, because that was the
> community promise: mainline it, we will support it for you, you will get
> the latest kernel features for free.
> 
> 
> That worked, but with some drawbacks:
> 
>  - kernel footprint grew that much (we started from 2.4) that it does
> not fit in device flash anymore
> 
>  - performance took a hit each time we upgrade, mostly because of cache
> footprint growth.
> 
>  - as kernel footprint grew, available RAM for conntrack & route cache
> entries was smaller each time
> 
> 
> But I had to stop upgrading after 2.6.20. Everything below is not
> anybody's fault. Bloat is unavoidable for software project that big.
> 
> I'm perfectly ok with that, but I don't want to be ridiculed for running
> mainline kernel on old hardware.

Hmm, I am sorry if you felt that, it was not my intent.

> 
> 
> > Adding get_unaligned() everywhere in linux network stacks is not an
> > option.
> > 
> > We actually want to be able to read the code and fix the bugs, not only
> > run it on a cheap low end router.
> 
> That was not a request, I just needed a clarification. 
> 
> Documentation/unaligned-memory-access.txt does not say it's a big no-no,
> it says you can give unaligned pointers to the networking stack if you
> arch can do unaligned access (with an "efficiency" notion).
> 
> MIPS and ARM have a software handler for this, and performance wise in
> my case it's better to take the faults, a driver writer may think a
> benchmark will dictate what to do.
> 

Sure, but all this discussion started because one arch apparently did
not like these mis alignments, and some people complained that network
guys would not add _needed_ get_unaligned_xxx() wrappers ...

So far, linux is 20 years old, I dont think we are going to add wrappers
right now. Machines that could have cared are dying anyway.

Please note we still support NET_IP_ALIGN, even if its 0 on x86.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 12:59                                                 ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 12:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 14:51 +0200, Maxime Bizon wrote:

> Hey I cannot go back in time, when that hardware was built in 2004 (mips
> @250Mhz), it was considered good, and we did manufacture a lot of it, so
> it's still maintained.
> 
> People run recent kernels on older hardware because they are *encouraged
> to do so*.
> 
> I fought inside my company to be good kernel citizen, not using
> proprietary BSP, rewrite & mainline the drivers, because that was the
> community promise: mainline it, we will support it for you, you will get
> the latest kernel features for free.
> 
> 
> That worked, but with some drawbacks:
> 
>  - kernel footprint grew that much (we started from 2.4) that it does
> not fit in device flash anymore
> 
>  - performance took a hit each time we upgrade, mostly because of cache
> footprint growth.
> 
>  - as kernel footprint grew, available RAM for conntrack & route cache
> entries was smaller each time
> 
> 
> But I had to stop upgrading after 2.6.20. Everything below is not
> anybody's fault. Bloat is unavoidable for software project that big.
> 
> I'm perfectly ok with that, but I don't want to be ridiculed for running
> mainline kernel on old hardware.

Hmm, I am sorry if you felt that, it was not my intent.

> 
> 
> > Adding get_unaligned() everywhere in linux network stacks is not an
> > option.
> > 
> > We actually want to be able to read the code and fix the bugs, not only
> > run it on a cheap low end router.
> 
> That was not a request, I just needed a clarification. 
> 
> Documentation/unaligned-memory-access.txt does not say it's a big no-no,
> it says you can give unaligned pointers to the networking stack if you
> arch can do unaligned access (with an "efficiency" notion).
> 
> MIPS and ARM have a software handler for this, and performance wise in
> my case it's better to take the faults, a driver writer may think a
> benchmark will dictate what to do.
> 

Sure, but all this discussion started because one arch apparently did
not like these mis alignments, and some people complained that network
guys would not add _needed_ get_unaligned_xxx() wrappers ...

So far, linux is 20 years old, I dont think we are going to add wrappers
right now. Machines that could have cared are dying anyway.

Please note we still support NET_IP_ALIGN, even if its 0 on x86.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 12:40                                         ` Eric Dumazet
@ 2012-10-11 13:20                                           ` Rob Herring
  -1 siblings, 0 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-11 13:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Arnd Bergmann, linux-arm-kernel, Russell King - ARM Linux,
	Jon Masters, netdev, Måns Rullgård, David Laight

On 10/11/2012 07:40 AM, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
> 
>>
>> Rob Herring as the original reporter has dropped off the Cc list, adding
>> him back.
>>
>> I assume that the calxeda xgmac driver is the culprit then. It uses
>> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
>> xgmac_rx_refill but it is not clear whether it does so intentionally
>> or by accident.

This in fact does work and eliminates the unaligned traps. However, not
all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
this is a questionable optimization by the compiler. We're saving 1 load
instruction here for data that is likely already in the cache. It may be
legal per the ABI, but the downside of this optimization is much greater
than the upside.

> 
> Thanks Arnd
> 
> It seems an accident, since driver doesnt check skb->data alignment at
> all (this can change with SLAB debug on/off)
> 
> It also incorrectly adds 64 bytes to bfsize, there is no need for this.

I'm pretty sure this was needed as the h/w writes out full bursts of
data, but I'll go back and check.

Rob

> (or if its needed, a comment would be nice, because on prior kernels,
> this makes skb->head allocations uses kmalloc-4096 instead of
> kmalloc-2048 slab cache... With 3.7 its less an issue now we use order-3
> pages to deliver fragments for rx skbs
>
> So the following patch should fix the alignment, and makes driver uses
> half memory than before for stable kernels
> 
> diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
> index 16814b3..a895e18 100644
> --- a/drivers/net/ethernet/calxeda/xgmac.c
> +++ b/drivers/net/ethernet/calxeda/xgmac.c
> @@ -671,7 +671,8 @@ static void xgmac_rx_refill(struct xgmac_priv *priv)
>  		p = priv->dma_rx + entry;
>  
>  		if (priv->rx_skbuff[entry] == NULL) {
> -			skb = netdev_alloc_skb(priv->dev, priv->dma_buf_sz);
> +			skb = netdev_alloc_skb_ip_align(priv->dev,
> +							priv->dma_buf_sz);
>  			if (unlikely(skb == NULL))
>  				break;
>  
> @@ -703,7 +704,7 @@ static int xgmac_dma_desc_rings_init(struct net_device *dev)
>  	/* Set the Buffer size according to the MTU;
>  	 * indeed, in case of jumbo we need to bump-up the buffer sizes.
>  	 */
> -	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN + 64,
> +	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN,
>  		       64);
>  
>  	netdev_dbg(priv->dev, "mtu [%d] bfsize [%d]\n", dev->mtu, bfsize);
> 
> 

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 13:20                                           ` Rob Herring
  0 siblings, 0 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-11 13:20 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/11/2012 07:40 AM, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
> 
>>
>> Rob Herring as the original reporter has dropped off the Cc list, adding
>> him back.
>>
>> I assume that the calxeda xgmac driver is the culprit then. It uses
>> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
>> xgmac_rx_refill but it is not clear whether it does so intentionally
>> or by accident.

This in fact does work and eliminates the unaligned traps. However, not
all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
this is a questionable optimization by the compiler. We're saving 1 load
instruction here for data that is likely already in the cache. It may be
legal per the ABI, but the downside of this optimization is much greater
than the upside.

> 
> Thanks Arnd
> 
> It seems an accident, since driver doesnt check skb->data alignment at
> all (this can change with SLAB debug on/off)
> 
> It also incorrectly adds 64 bytes to bfsize, there is no need for this.

I'm pretty sure this was needed as the h/w writes out full bursts of
data, but I'll go back and check.

Rob

> (or if its needed, a comment would be nice, because on prior kernels,
> this makes skb->head allocations uses kmalloc-4096 instead of
> kmalloc-2048 slab cache... With 3.7 its less an issue now we use order-3
> pages to deliver fragments for rx skbs
>
> So the following patch should fix the alignment, and makes driver uses
> half memory than before for stable kernels
> 
> diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
> index 16814b3..a895e18 100644
> --- a/drivers/net/ethernet/calxeda/xgmac.c
> +++ b/drivers/net/ethernet/calxeda/xgmac.c
> @@ -671,7 +671,8 @@ static void xgmac_rx_refill(struct xgmac_priv *priv)
>  		p = priv->dma_rx + entry;
>  
>  		if (priv->rx_skbuff[entry] == NULL) {
> -			skb = netdev_alloc_skb(priv->dev, priv->dma_buf_sz);
> +			skb = netdev_alloc_skb_ip_align(priv->dev,
> +							priv->dma_buf_sz);
>  			if (unlikely(skb == NULL))
>  				break;
>  
> @@ -703,7 +704,7 @@ static int xgmac_dma_desc_rings_init(struct net_device *dev)
>  	/* Set the Buffer size according to the MTU;
>  	 * indeed, in case of jumbo we need to bump-up the buffer sizes.
>  	 */
> -	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN + 64,
> +	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN,
>  		       64);
>  
>  	netdev_dbg(priv->dev, "mtu [%d] bfsize [%d]\n", dev->mtu, bfsize);
> 
> 

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 13:20                                           ` Rob Herring
@ 2012-10-11 13:32                                             ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 13:32 UTC (permalink / raw)
  To: Rob Herring
  Cc: Eric Dumazet, Arnd Bergmann, linux-arm-kernel,
	Russell King - ARM Linux, Jon Masters, netdev,
	Måns Rullgård, David Laight

Rob Herring <robherring2@gmail.com> writes:

> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
>> On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
>> 
>>>
>>> Rob Herring as the original reporter has dropped off the Cc list, adding
>>> him back.
>>>
>>> I assume that the calxeda xgmac driver is the culprit then. It uses
>>> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
>>> xgmac_rx_refill but it is not clear whether it does so intentionally
>>> or by accident.
>
> This in fact does work and eliminates the unaligned traps. However, not
> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
> this is a questionable optimization by the compiler. We're saving 1 load
> instruction here for data that is likely already in the cache. It may be
> legal per the ABI, but the downside of this optimization is much greater
> than the upside.

The compiler is working *exactly* as it should.  Merging the loads saves
cycles *and* code size.  Many of these added up can make a real difference.

When writing code, you must follow all the rules, whether you like them
or not.  Without rules, the compiler would be very limited in the
optimisations it could perform.

Unfortunately, new optimisations occasionally uncover broken code
violating some constraint or other.  When this happens, the correct
course of action is to fix the code, not cripple the compiler.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 13:32                                             ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 13:32 UTC (permalink / raw)
  To: linux-arm-kernel

Rob Herring <robherring2@gmail.com> writes:

> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
>> On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
>> 
>>>
>>> Rob Herring as the original reporter has dropped off the Cc list, adding
>>> him back.
>>>
>>> I assume that the calxeda xgmac driver is the culprit then. It uses
>>> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
>>> xgmac_rx_refill but it is not clear whether it does so intentionally
>>> or by accident.
>
> This in fact does work and eliminates the unaligned traps. However, not
> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
> this is a questionable optimization by the compiler. We're saving 1 load
> instruction here for data that is likely already in the cache. It may be
> legal per the ABI, but the downside of this optimization is much greater
> than the upside.

The compiler is working *exactly* as it should.  Merging the loads saves
cycles *and* code size.  Many of these added up can make a real difference.

When writing code, you must follow all the rules, whether you like them
or not.  Without rules, the compiler would be very limited in the
optimisations it could perform.

Unfortunately, new optimisations occasionally uncover broken code
violating some constraint or other.  When this happens, the correct
course of action is to fix the code, not cripple the compiler.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 13:20                                           ` Rob Herring
@ 2012-10-11 13:35                                             ` Arnd Bergmann
  -1 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2012-10-11 13:35 UTC (permalink / raw)
  To: Rob Herring
  Cc: Eric Dumazet, linux-arm-kernel, Russell King - ARM Linux,
	Jon Masters, netdev, Måns Rullgård, David Laight

On Thursday 11 October 2012, Rob Herring wrote:
> 
> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
> > On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
> > 
> >>
> >> Rob Herring as the original reporter has dropped off the Cc list, adding
> >> him back.
> >>
> >> I assume that the calxeda xgmac driver is the culprit then. It uses
> >> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
> >> xgmac_rx_refill but it is not clear whether it does so intentionally
> >> or by accident.
> 
> This in fact does work and eliminates the unaligned traps. However, not
> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
> this is a questionable optimization by the compiler. We're saving 1 load
> instruction here for data that is likely already in the cache. It may be
> legal per the ABI, but the downside of this optimization is much greater
> than the upside.
> 

What about the other approach that Eric suggested for such hardware
in http://www.spinics.net/lists/arm-kernel/msg200206.html ?

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 13:35                                             ` Arnd Bergmann
  0 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2012-10-11 13:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 11 October 2012, Rob Herring wrote:
> 
> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
> > On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
> > 
> >>
> >> Rob Herring as the original reporter has dropped off the Cc list, adding
> >> him back.
> >>
> >> I assume that the calxeda xgmac driver is the culprit then. It uses
> >> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
> >> xgmac_rx_refill but it is not clear whether it does so intentionally
> >> or by accident.
> 
> This in fact does work and eliminates the unaligned traps. However, not
> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
> this is a questionable optimization by the compiler. We're saving 1 load
> instruction here for data that is likely already in the cache. It may be
> legal per the ABI, but the downside of this optimization is much greater
> than the upside.
> 

What about the other approach that Eric suggested for such hardware
in http://www.spinics.net/lists/arm-kernel/msg200206.html ?

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 13:20                                           ` Rob Herring
@ 2012-10-11 13:47                                             ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 13:47 UTC (permalink / raw)
  To: Rob Herring
  Cc: Arnd Bergmann, linux-arm-kernel, Russell King - ARM Linux,
	Jon Masters, netdev, Måns Rullgård, David Laight

On Thu, 2012-10-11 at 08:20 -0500, Rob Herring wrote:
> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
> > On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
> > 
> >>
> >> Rob Herring as the original reporter has dropped off the Cc list, adding
> >> him back.
> >>
> >> I assume that the calxeda xgmac driver is the culprit then. It uses
> >> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
> >> xgmac_rx_refill but it is not clear whether it does so intentionally
> >> or by accident.
> 
> This in fact does work and eliminates the unaligned traps. However, not
> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
> this is a questionable optimization by the compiler. We're saving 1 load
> instruction here for data that is likely already in the cache. It may be
> legal per the ABI, but the downside of this optimization is much greater
> than the upside.

Compiler is asked to perform a 32bit load, it does it.

There is no questionable optimization here. Really.
Please stop pretending this, this makes no sense.

As I said, if some h/w cannot do IP aligned DMA, driver can use a
workaround, or a plain memmove() (some drivers seems to do this to work
around this h/w limitation, just grep for memmove() in drivers/net)

> 
> > 
> > Thanks Arnd
> > 
> > It seems an accident, since driver doesnt check skb->data alignment at
> > all (this can change with SLAB debug on/off)
> > 
> > It also incorrectly adds 64 bytes to bfsize, there is no need for this.
> 
> I'm pretty sure this was needed as the h/w writes out full bursts of
> data, but I'll go back and check.

Maybe the ALIGN() was needed then. But the 64 + NE_IP_ALIGN sounds like
the head room that we allocate/reserve in netdev_alloc_skb_ip_align()

So you allocate this extra room twice.

Thanks

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 13:47                                             ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 13:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 08:20 -0500, Rob Herring wrote:
> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
> > On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
> > 
> >>
> >> Rob Herring as the original reporter has dropped off the Cc list, adding
> >> him back.
> >>
> >> I assume that the calxeda xgmac driver is the culprit then. It uses
> >> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
> >> xgmac_rx_refill but it is not clear whether it does so intentionally
> >> or by accident.
> 
> This in fact does work and eliminates the unaligned traps. However, not
> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
> this is a questionable optimization by the compiler. We're saving 1 load
> instruction here for data that is likely already in the cache. It may be
> legal per the ABI, but the downside of this optimization is much greater
> than the upside.

Compiler is asked to perform a 32bit load, it does it.

There is no questionable optimization here. Really.
Please stop pretending this, this makes no sense.

As I said, if some h/w cannot do IP aligned DMA, driver can use a
workaround, or a plain memmove() (some drivers seems to do this to work
around this h/w limitation, just grep for memmove() in drivers/net)

> 
> > 
> > Thanks Arnd
> > 
> > It seems an accident, since driver doesnt check skb->data alignment at
> > all (this can change with SLAB debug on/off)
> > 
> > It also incorrectly adds 64 bytes to bfsize, there is no need for this.
> 
> I'm pretty sure this was needed as the h/w writes out full bursts of
> data, but I'll go back and check.

Maybe the ALIGN() was needed then. But the 64 + NE_IP_ALIGN sounds like
the head room that we allocate/reserve in netdev_alloc_skb_ip_align()

So you allocate this extra room twice.

Thanks

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 13:47                                             ` Eric Dumazet
@ 2012-10-11 15:23                                               ` Rob Herring
  -1 siblings, 0 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-11 15:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Arnd Bergmann, linux-arm-kernel, Russell King - ARM Linux,
	Jon Masters, netdev, Måns Rullgård, David Laight

On 10/11/2012 08:47 AM, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 08:20 -0500, Rob Herring wrote:
>> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
>>> On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
>>>
>>>>
>>>> Rob Herring as the original reporter has dropped off the Cc list, adding
>>>> him back.
>>>>
>>>> I assume that the calxeda xgmac driver is the culprit then. It uses
>>>> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
>>>> xgmac_rx_refill but it is not clear whether it does so intentionally
>>>> or by accident.
>>
>> This in fact does work and eliminates the unaligned traps. However, not
>> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
>> this is a questionable optimization by the compiler. We're saving 1 load
>> instruction here for data that is likely already in the cache. It may be
>> legal per the ABI, but the downside of this optimization is much greater
>> than the upside.
> 
> Compiler is asked to perform a 32bit load, it does it.

Not exactly. It is asked to to perform 2 32-bit loads which are combined
into a single ldm (load multiple) which cannot handle unaligned
accesses. Here's a simple example that does the same thing:

void test(char * buf)
{
	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
}

So I guess the only ABI legal unaligned access is in a packed struct.

> There is no questionable optimization here. Really.
> Please stop pretending this, this makes no sense.

I'm not the one calling the networking stack bad code.

I can fix my h/w, so I'll stop caring about this. Others can all get
bitten by this new behavior in gcc 4.7.

Rob

> As I said, if some h/w cannot do IP aligned DMA, driver can use a
> workaround, or a plain memmove() (some drivers seems to do this to work
> around this h/w limitation, just grep for memmove() in drivers/net)
> 
>>
>>>
>>> Thanks Arnd
>>>
>>> It seems an accident, since driver doesnt check skb->data alignment at
>>> all (this can change with SLAB debug on/off)
>>>
>>> It also incorrectly adds 64 bytes to bfsize, there is no need for this.
>>
>> I'm pretty sure this was needed as the h/w writes out full bursts of
>> data, but I'll go back and check.
> 
> Maybe the ALIGN() was needed then. But the 64 + NE_IP_ALIGN sounds like
> the head room that we allocate/reserve in netdev_alloc_skb_ip_align()
> 
> So you allocate this extra room twice.
> 
> Thanks
> 
> 

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 15:23                                               ` Rob Herring
  0 siblings, 0 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-11 15:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/11/2012 08:47 AM, Eric Dumazet wrote:
> On Thu, 2012-10-11 at 08:20 -0500, Rob Herring wrote:
>> On 10/11/2012 07:40 AM, Eric Dumazet wrote:
>>> On Thu, 2012-10-11 at 12:28 +0000, Arnd Bergmann wrote:
>>>
>>>>
>>>> Rob Herring as the original reporter has dropped off the Cc list, adding
>>>> him back.
>>>>
>>>> I assume that the calxeda xgmac driver is the culprit then. It uses
>>>> netdev_alloc_skb() rather than netdev_alloc_skb_ip_align() in
>>>> xgmac_rx_refill but it is not clear whether it does so intentionally
>>>> or by accident.
>>
>> This in fact does work and eliminates the unaligned traps. However, not
>> all h/w can do IP aligned DMA (i.MX FEC for example), so I still think
>> this is a questionable optimization by the compiler. We're saving 1 load
>> instruction here for data that is likely already in the cache. It may be
>> legal per the ABI, but the downside of this optimization is much greater
>> than the upside.
> 
> Compiler is asked to perform a 32bit load, it does it.

Not exactly. It is asked to to perform 2 32-bit loads which are combined
into a single ldm (load multiple) which cannot handle unaligned
accesses. Here's a simple example that does the same thing:

void test(char * buf)
{
	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
}

So I guess the only ABI legal unaligned access is in a packed struct.

> There is no questionable optimization here. Really.
> Please stop pretending this, this makes no sense.

I'm not the one calling the networking stack bad code.

I can fix my h/w, so I'll stop caring about this. Others can all get
bitten by this new behavior in gcc 4.7.

Rob

> As I said, if some h/w cannot do IP aligned DMA, driver can use a
> workaround, or a plain memmove() (some drivers seems to do this to work
> around this h/w limitation, just grep for memmove() in drivers/net)
> 
>>
>>>
>>> Thanks Arnd
>>>
>>> It seems an accident, since driver doesnt check skb->data alignment at
>>> all (this can change with SLAB debug on/off)
>>>
>>> It also incorrectly adds 64 bytes to bfsize, there is no need for this.
>>
>> I'm pretty sure this was needed as the h/w writes out full bursts of
>> data, but I'll go back and check.
> 
> Maybe the ALIGN() was needed then. But the 64 + NE_IP_ALIGN sounds like
> the head room that we allocate/reserve in netdev_alloc_skb_ip_align()
> 
> So you allocate this extra room twice.
> 
> Thanks
> 
> 

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: alignment faults in 3.6
  2012-10-11 15:23                                               ` Rob Herring
@ 2012-10-11 15:39                                                 ` David Laight
  -1 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-11 15:39 UTC (permalink / raw)
  To: Rob Herring, Eric Dumazet
  Cc: Arnd Bergmann, linux-arm-kernel, Russell King - ARM Linux,
	Jon Masters, netdev, Måns Rullgård

 
> Not exactly. It is asked to to perform 2 32-bit loads which are combined
> into a single ldm (load multiple) which cannot handle unaligned
> accesses. Here's a simple example that does the same thing:
> 
> void test(char * buf)
> {
> 	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
> }

Have you actually looked at what an ARM processor traditionally did
with misaligned memory reads?
While useful, it probably wasn't what was intended.

Actually, and IIRC, some very recent ARM cpus will do the 'expected'
thing for single-word loads from misaligned addesses.
However they almost certainly won't for ldm/stm.

The 'ldm' optimisation for adjacent memory loads is also dubious.
On at least some ARMs it is very slow (might only be strongarms).

> So I guess the only ABI legal unaligned access is in a packed struct.

Correct. And you mustn't try casting the address, the compiler is
allowed to remember where it came from.
(This causes a lot of grief...)

If you are targeting the ARM cpu that can do misaligned transfers,
then gcc should generate single instructions for misaligned structure
members, and never do the 'ldm' optimisations.

But, the IP header is expected to be aligned.

	David


^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 15:39                                                 ` David Laight
  0 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-11 15:39 UTC (permalink / raw)
  To: linux-arm-kernel

 
> Not exactly. It is asked to to perform 2 32-bit loads which are combined
> into a single ldm (load multiple) which cannot handle unaligned
> accesses. Here's a simple example that does the same thing:
> 
> void test(char * buf)
> {
> 	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
> }

Have you actually looked at what an ARM processor traditionally did
with misaligned memory reads?
While useful, it probably wasn't what was intended.

Actually, and IIRC, some very recent ARM cpus will do the 'expected'
thing for single-word loads from misaligned addesses.
However they almost certainly won't for ldm/stm.

The 'ldm' optimisation for adjacent memory loads is also dubious.
On at least some ARMs it is very slow (might only be strongarms).

> So I guess the only ABI legal unaligned access is in a packed struct.

Correct. And you mustn't try casting the address, the compiler is
allowed to remember where it came from.
(This causes a lot of grief...)

If you are targeting the ARM cpu that can do misaligned transfers,
then gcc should generate single instructions for misaligned structure
members, and never do the 'ldm' optimisations.

But, the IP header is expected to be aligned.

	David

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 15:23                                               ` Rob Herring
@ 2012-10-11 16:15                                                 ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 16:15 UTC (permalink / raw)
  To: Rob Herring
  Cc: Arnd Bergmann, linux-arm-kernel, Russell King - ARM Linux,
	Jon Masters, netdev, Måns Rullgård, David Laight

On Thu, 2012-10-11 at 10:23 -0500, Rob Herring wrote:
> On 10/11/2012 08:47 AM, Eric Dumazet wrote:

> > Compiler is asked to perform a 32bit load, it does it.
> 
> Not exactly. It is asked to to perform 2 32-bit loads which are combined
> into a single ldm (load multiple) which cannot handle unaligned
> accesses. Here's a simple example that does the same thing:


Thats simply not true. You are severely mistaken.

ldm does a load of seeral 32bit words.

And the compiler would not use it if the alignment was not matching the
prereq (alignment >= 4)



> 
> void test(char * buf)
> {
> 	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
> }

But you completely miss the fact that network doesnt pass a "char *buf"
but a "be32 *buf". Your example is not relevant at all.

So the compiler is absolutely right, and network stack is _right_ too.

The prereq is that IP header are aligned to 4 bytes boundary.

Denying this fact is not going to help you


> 
> So I guess the only ABI legal unaligned access is in a packed struct.
> 
> > There is no questionable optimization here. Really.
> > Please stop pretending this, this makes no sense.
> 
> I'm not the one calling the networking stack bad code.

Once you understand the issues, you can explain us where is the bad
code. But given you say "Bug is in compiler, and/or network stack, but
my driver is fine", its not very wise.

For the moment, the bug is in your driver.

> 
> I can fix my h/w, so I'll stop caring about this. Others can all get
> bitten by this new behavior in gcc 4.7.

Again compiler is fine. How should we say that so that you stop your
rants ?

Stop trying to find an excuse, dont try to fool us, this is really
embarrassing. Just fix the driver, there is no shame to fix an error.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 16:15                                                 ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-11 16:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2012-10-11 at 10:23 -0500, Rob Herring wrote:
> On 10/11/2012 08:47 AM, Eric Dumazet wrote:

> > Compiler is asked to perform a 32bit load, it does it.
> 
> Not exactly. It is asked to to perform 2 32-bit loads which are combined
> into a single ldm (load multiple) which cannot handle unaligned
> accesses. Here's a simple example that does the same thing:


Thats simply not true. You are severely mistaken.

ldm does a load of seeral 32bit words.

And the compiler would not use it if the alignment was not matching the
prereq (alignment >= 4)



> 
> void test(char * buf)
> {
> 	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
> }

But you completely miss the fact that network doesnt pass a "char *buf"
but a "be32 *buf". Your example is not relevant at all.

So the compiler is absolutely right, and network stack is _right_ too.

The prereq is that IP header are aligned to 4 bytes boundary.

Denying this fact is not going to help you


> 
> So I guess the only ABI legal unaligned access is in a packed struct.
> 
> > There is no questionable optimization here. Really.
> > Please stop pretending this, this makes no sense.
> 
> I'm not the one calling the networking stack bad code.

Once you understand the issues, you can explain us where is the bad
code. But given you say "Bug is in compiler, and/or network stack, but
my driver is fine", its not very wise.

For the moment, the bug is in your driver.

> 
> I can fix my h/w, so I'll stop caring about this. Others can all get
> bitten by this new behavior in gcc 4.7.

Again compiler is fine. How should we say that so that you stop your
rants ?

Stop trying to find an excuse, dont try to fool us, this is really
embarrassing. Just fix the driver, there is no shame to fix an error.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 15:39                                                 ` David Laight
@ 2012-10-11 16:18                                                   ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 16:18 UTC (permalink / raw)
  To: David Laight
  Cc: Rob Herring, Eric Dumazet, Arnd Bergmann, linux-arm-kernel,
	Russell King - ARM Linux, Jon Masters, netdev,
	Måns Rullgård

"David Laight" <David.Laight@ACULAB.COM> writes:

>> Not exactly. It is asked to to perform 2 32-bit loads which are combined
>> into a single ldm (load multiple) which cannot handle unaligned
>> accesses. Here's a simple example that does the same thing:
>> 
>> void test(char * buf)
>> {
>> 	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
>> }
>
> Have you actually looked at what an ARM processor traditionally did
> with misaligned memory reads?
> While useful, it probably wasn't what was intended.
>
> Actually, and IIRC, some very recent ARM cpus will do the 'expected'
> thing for single-word loads from misaligned addesses.

What various CPUs do with unaligned accesses is not the issue here.  The
casts in the code above act as a promise to the compiler that the
address is in fact properly align for the pointer type.

> However they almost certainly won't for ldm/stm.
>
> The 'ldm' optimisation for adjacent memory loads is also dubious.

There is nothing whatsoever dubious about the compiler using the most
efficient instruction sequence to accomplish what the code asks for.

> On at least some ARMs it is very slow (might only be strongarms).

The compiler will pick instructions suitable for the CPU you specify.

>> So I guess the only ABI legal unaligned access is in a packed struct.
>
> Correct. And you mustn't try casting the address, the compiler is
> allowed to remember where it came from.
> (This causes a lot of grief...)

It is only a problem when you try to outsmart the compiler.

> If you are targeting the ARM cpu that can do misaligned transfers,
> then gcc should generate single instructions for misaligned structure
> members, and never do the 'ldm' optimisations.

That is exactly how gcc works.

> But, the IP header is expected to be aligned.

Everything tells the compiler the struct is perfectly aligned.  When the
buggy driver passes a misaligned pointer, bad things happen.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 16:18                                                   ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-11 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

"David Laight" <David.Laight@ACULAB.COM> writes:

>> Not exactly. It is asked to to perform 2 32-bit loads which are combined
>> into a single ldm (load multiple) which cannot handle unaligned
>> accesses. Here's a simple example that does the same thing:
>> 
>> void test(char * buf)
>> {
>> 	printf("%d, %d\n", *((unsigned int *)&buf[0]), *((unsigned int *)&buf[4]));
>> }
>
> Have you actually looked at what an ARM processor traditionally did
> with misaligned memory reads?
> While useful, it probably wasn't what was intended.
>
> Actually, and IIRC, some very recent ARM cpus will do the 'expected'
> thing for single-word loads from misaligned addesses.

What various CPUs do with unaligned accesses is not the issue here.  The
casts in the code above act as a promise to the compiler that the
address is in fact properly align for the pointer type.

> However they almost certainly won't for ldm/stm.
>
> The 'ldm' optimisation for adjacent memory loads is also dubious.

There is nothing whatsoever dubious about the compiler using the most
efficient instruction sequence to accomplish what the code asks for.

> On at least some ARMs it is very slow (might only be strongarms).

The compiler will pick instructions suitable for the CPU you specify.

>> So I guess the only ABI legal unaligned access is in a packed struct.
>
> Correct. And you mustn't try casting the address, the compiler is
> allowed to remember where it came from.
> (This causes a lot of grief...)

It is only a problem when you try to outsmart the compiler.

> If you are targeting the ARM cpu that can do misaligned transfers,
> then gcc should generate single instructions for misaligned structure
> members, and never do the 'ldm' optimisations.

That is exactly how gcc works.

> But, the IP header is expected to be aligned.

Everything tells the compiler the struct is perfectly aligned.  When the
buggy driver passes a misaligned pointer, bad things happen.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 10:32                                   ` Russell King - ARM Linux
@ 2012-10-11 16:59                                     ` Catalin Marinas
  -1 siblings, 0 replies; 133+ messages in thread
From: Catalin Marinas @ 2012-10-11 16:59 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Eric Dumazet, Jon Masters, netdev, Måns Rullgård,
	David Laight, linux-arm-kernel

On 11 October 2012 11:32, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
>> I took a look, and I dont see why/how gcc could use a ldm instruction
>>
>> Doing so assumed the alignment of the structure was 8 bytes, but its
>> not.
>>
>> Networking stack mandates that IP headers are aligned on 4 bytes
>> boundaries, not 8 bytes.
>
> Err, no.  ldm is "load multiple" not "load double".  It loads multiple
> 32-bit registers, and its requirement for non-faulting behaviour is for
> the pointer to be 4 byte aligned.  However, "load double" requires 8
> byte alignment.

It got better with ARMv6 where LDRD/STRD only require 4 byte alignment
(the only 8 byte alignment is required by LDREXD/STREXD). But on ARMv5
LDRD/STRD 8 byte alignment is indeed required (otherwise
unpredictable).

-- 
Catalin

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-11 16:59                                     ` Catalin Marinas
  0 siblings, 0 replies; 133+ messages in thread
From: Catalin Marinas @ 2012-10-11 16:59 UTC (permalink / raw)
  To: linux-arm-kernel

On 11 October 2012 11:32, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Oct 11, 2012 at 12:22:06PM +0200, Eric Dumazet wrote:
>> I took a look, and I dont see why/how gcc could use a ldm instruction
>>
>> Doing so assumed the alignment of the structure was 8 bytes, but its
>> not.
>>
>> Networking stack mandates that IP headers are aligned on 4 bytes
>> boundaries, not 8 bytes.
>
> Err, no.  ldm is "load multiple" not "load double".  It loads multiple
> 32-bit registers, and its requirement for non-faulting behaviour is for
> the pointer to be 4 byte aligned.  However, "load double" requires 8
> byte alignment.

It got better with ARMv6 where LDRD/STRD only require 4 byte alignment
(the only 8 byte alignment is required by LDREXD/STREXD). But on ARMv5
LDRD/STRD 8 byte alignment is indeed required (otherwise
unpredictable).

-- 
Catalin

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-11 16:18                                                   ` Måns Rullgård
@ 2012-10-12  8:11                                                     ` Arnd Bergmann
  -1 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2012-10-12  8:11 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Måns Rullgård, David Laight, Russell King - ARM Linux,
	Eric Dumazet, netdev, Jon Masters

On Thursday 11 October 2012, Måns Rullgård wrote:
> > But, the IP header is expected to be aligned.
> 
> Everything tells the compiler the struct is perfectly aligned.  When the
> buggy driver passes a misaligned pointer, bad things happen.

Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
then? If all alignment faults in the kernel are caused by broken drivers,
that would at least give us some hope of finding those drivers while at the
same time not causing much overhead in the case where we need to do the
fixup in the meantime.

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12  8:11                                                     ` Arnd Bergmann
  0 siblings, 0 replies; 133+ messages in thread
From: Arnd Bergmann @ 2012-10-12  8:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 11 October 2012, M?ns Rullg?rd wrote:
> > But, the IP header is expected to be aligned.
> 
> Everything tells the compiler the struct is perfectly aligned.  When the
> buggy driver passes a misaligned pointer, bad things happen.

Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
then? If all alignment faults in the kernel are caused by broken drivers,
that would at least give us some hope of finding those drivers while at the
same time not causing much overhead in the case where we need to do the
fixup in the meantime.

	Arnd

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12  8:11                                                     ` Arnd Bergmann
@ 2012-10-12  9:03                                                       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12  9:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Måns Rullgård, David Laight,
	Eric Dumazet, netdev, Jon Masters

On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
> On Thursday 11 October 2012, Måns Rullgård wrote:
> > > But, the IP header is expected to be aligned.
> > 
> > Everything tells the compiler the struct is perfectly aligned.  When the
> > buggy driver passes a misaligned pointer, bad things happen.
> 
> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
> then? If all alignment faults in the kernel are caused by broken drivers,
> that would at least give us some hope of finding those drivers while at the
> same time not causing much overhead in the case where we need to do the
> fixup in the meantime.

No.  It is my understanding that various IP option processing can also
cause the alignment fault handler to be invoked, even when the packet is
properly aligned, and then there's jffs2/mtd which also relies upon
alignment faults being fixed up.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12  9:03                                                       ` Russell King - ARM Linux
  0 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12  9:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
> On Thursday 11 October 2012, M?ns Rullg?rd wrote:
> > > But, the IP header is expected to be aligned.
> > 
> > Everything tells the compiler the struct is perfectly aligned.  When the
> > buggy driver passes a misaligned pointer, bad things happen.
> 
> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
> then? If all alignment faults in the kernel are caused by broken drivers,
> that would at least give us some hope of finding those drivers while at the
> same time not causing much overhead in the case where we need to do the
> fixup in the meantime.

No.  It is my understanding that various IP option processing can also
cause the alignment fault handler to be invoked, even when the packet is
properly aligned, and then there's jffs2/mtd which also relies upon
alignment faults being fixed up.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12  9:03                                                       ` Russell King - ARM Linux
@ 2012-10-12 10:04                                                         ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-12 10:04 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, linux-arm-kernel, Måns Rullgård,
	David Laight, netdev, Jon Masters

On Fri, 2012-10-12 at 10:03 +0100, Russell King - ARM Linux wrote:

> No.  It is my understanding that various IP option processing can also
> cause the alignment fault handler to be invoked, even when the packet is
> properly aligned, and then there's jffs2/mtd which also relies upon
> alignment faults being fixed up.

Oh well.

We normally make sure we dont have alignment faults on arches that dont
have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS (or a non null NET_IP_ALIGN)

So if you find an offender, please report a bug, because I can guarantee
you we will _fix_ it.

One example of a fix was the following subtle one.

commit 117632e64d2a5f464e491fe221d7169a3814a77b
tcp: take care of misalignments
    
We discovered that TCP stack could retransmit misaligned skbs if a
malicious peer acknowledged sub MSS frame. This currently can happen
only if output interface is non SG enabled : If SG is enabled, tcp
builds headless skbs (all payload is included in fragments), so the tcp
trimming process only removes parts of skb fragments, header stay
aligned.
    
Some arches cant handle misalignments, so force a head reallocation and
shrink headroom to MAX_TCP_HEADER.

Dont care about misaligments on x86 and PPC (or other arches setting
NET_IP_ALIGN to 0)

This patch introduces __pskb_copy() which can specify the headroom of
new head, and pskb_copy() becomes a wrapper on top of __pskb_copy()

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 10:04                                                         ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-12 10:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2012-10-12 at 10:03 +0100, Russell King - ARM Linux wrote:

> No.  It is my understanding that various IP option processing can also
> cause the alignment fault handler to be invoked, even when the packet is
> properly aligned, and then there's jffs2/mtd which also relies upon
> alignment faults being fixed up.

Oh well.

We normally make sure we dont have alignment faults on arches that dont
have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS (or a non null NET_IP_ALIGN)

So if you find an offender, please report a bug, because I can guarantee
you we will _fix_ it.

One example of a fix was the following subtle one.

commit 117632e64d2a5f464e491fe221d7169a3814a77b
tcp: take care of misalignments
    
We discovered that TCP stack could retransmit misaligned skbs if a
malicious peer acknowledged sub MSS frame. This currently can happen
only if output interface is non SG enabled : If SG is enabled, tcp
builds headless skbs (all payload is included in fragments), so the tcp
trimming process only removes parts of skb fragments, header stay
aligned.
    
Some arches cant handle misalignments, so force a head reallocation and
shrink headroom to MAX_TCP_HEADER.

Dont care about misaligments on x86 and PPC (or other arches setting
NET_IP_ALIGN to 0)

This patch introduces __pskb_copy() which can specify the headroom of
new head, and pskb_copy() becomes a wrapper on top of __pskb_copy()

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12  9:03                                                       ` Russell King - ARM Linux
@ 2012-10-12 11:00                                                         ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-12 11:00 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, linux-arm-kernel, Måns Rullgård,
	David Laight, Eric Dumazet, netdev, Jon Masters

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
>> On Thursday 11 October 2012, Måns Rullgård wrote:
>> > > But, the IP header is expected to be aligned.
>> > 
>> > Everything tells the compiler the struct is perfectly aligned.  When the
>> > buggy driver passes a misaligned pointer, bad things happen.
>> 
>> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
>> then?

I think that's an excellent idea.

>> If all alignment faults in the kernel are caused by broken drivers,
>> that would at least give us some hope of finding those drivers while
>> at the same time not causing much overhead in the case where we need
>> to do the fixup in the meantime.
>
> No.  It is my understanding that various IP option processing can also
> cause the alignment fault handler to be invoked, even when the packet is
> properly aligned, and then there's jffs2/mtd which also relies upon
> alignment faults being fixed up.

As far as I'm concerned, this is all hearsay, and I've only ever heard
it from you.  Why can't you let those who care fix these bugs instead?

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 11:00                                                         ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-12 11:00 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
>> On Thursday 11 October 2012, M?ns Rullg?rd wrote:
>> > > But, the IP header is expected to be aligned.
>> > 
>> > Everything tells the compiler the struct is perfectly aligned.  When the
>> > buggy driver passes a misaligned pointer, bad things happen.
>> 
>> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
>> then?

I think that's an excellent idea.

>> If all alignment faults in the kernel are caused by broken drivers,
>> that would at least give us some hope of finding those drivers while
>> at the same time not causing much overhead in the case where we need
>> to do the fixup in the meantime.
>
> No.  It is my understanding that various IP option processing can also
> cause the alignment fault handler to be invoked, even when the packet is
> properly aligned, and then there's jffs2/mtd which also relies upon
> alignment faults being fixed up.

As far as I'm concerned, this is all hearsay, and I've only ever heard
it from you.  Why can't you let those who care fix these bugs instead?

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 11:00                                                         ` Måns Rullgård
@ 2012-10-12 11:07                                                           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 11:07 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: Arnd Bergmann, linux-arm-kernel, David Laight, Eric Dumazet,
	netdev, Jon Masters

On Fri, Oct 12, 2012 at 12:00:03PM +0100, Måns Rullgård wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
> >> On Thursday 11 October 2012, Måns Rullgård wrote:
> >> > > But, the IP header is expected to be aligned.
> >> > 
> >> > Everything tells the compiler the struct is perfectly aligned.  When the
> >> > buggy driver passes a misaligned pointer, bad things happen.
> >> 
> >> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
> >> then?
> 
> I think that's an excellent idea.

Well, I get the last word here and it's no.

> >> If all alignment faults in the kernel are caused by broken drivers,
> >> that would at least give us some hope of finding those drivers while
> >> at the same time not causing much overhead in the case where we need
> >> to do the fixup in the meantime.
> >
> > No.  It is my understanding that various IP option processing can also
> > cause the alignment fault handler to be invoked, even when the packet is
> > properly aligned, and then there's jffs2/mtd which also relies upon
> > alignment faults being fixed up.
> 
> As far as I'm concerned, this is all hearsay, and I've only ever heard
> it from you.  Why can't you let those who care fix these bugs instead?

You know, I'm giving you the benefit of my _knowledge_ which has been
built over the course of the last 20 years.  I've been in these
discussions with networking people before.  I ended up having to develop
the alignment fault handler because of those discussions.  And oh look,
Eric confirmed that the networking code isn't going to get "fixed" as
you were demanding, which is exactly what I said.

I've been in discussions with MTD people over these issues before, I've
discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
these things.

You can call it hearsay if you wish, but it seems to be more accurate
than your wild outlandish and pathetic statements.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 11:07                                                           ` Russell King - ARM Linux
  0 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 11:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 12, 2012 at 12:00:03PM +0100, M?ns Rullg?rd wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
> >> On Thursday 11 October 2012, M?ns Rullg?rd wrote:
> >> > > But, the IP header is expected to be aligned.
> >> > 
> >> > Everything tells the compiler the struct is perfectly aligned.  When the
> >> > buggy driver passes a misaligned pointer, bad things happen.
> >> 
> >> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
> >> then?
> 
> I think that's an excellent idea.

Well, I get the last word here and it's no.

> >> If all alignment faults in the kernel are caused by broken drivers,
> >> that would at least give us some hope of finding those drivers while
> >> at the same time not causing much overhead in the case where we need
> >> to do the fixup in the meantime.
> >
> > No.  It is my understanding that various IP option processing can also
> > cause the alignment fault handler to be invoked, even when the packet is
> > properly aligned, and then there's jffs2/mtd which also relies upon
> > alignment faults being fixed up.
> 
> As far as I'm concerned, this is all hearsay, and I've only ever heard
> it from you.  Why can't you let those who care fix these bugs instead?

You know, I'm giving you the benefit of my _knowledge_ which has been
built over the course of the last 20 years.  I've been in these
discussions with networking people before.  I ended up having to develop
the alignment fault handler because of those discussions.  And oh look,
Eric confirmed that the networking code isn't going to get "fixed" as
you were demanding, which is exactly what I said.

I've been in discussions with MTD people over these issues before, I've
discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
these things.

You can call it hearsay if you wish, but it seems to be more accurate
than your wild outlandish and pathetic statements.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 11:07                                                           ` Russell King - ARM Linux
@ 2012-10-12 11:18                                                             ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-12 11:18 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Måns Rullgård, Arnd Bergmann, linux-arm-kernel,
	David Laight, Eric Dumazet, netdev, Jon Masters

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 12:00:03PM +0100, Måns Rullgård wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> 
>> > On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
>> >> On Thursday 11 October 2012, Måns Rullgård wrote:
>> >> > > But, the IP header is expected to be aligned.
>> >> > 
>> >> > Everything tells the compiler the struct is perfectly aligned.  When the
>> >> > buggy driver passes a misaligned pointer, bad things happen.
>> >> 
>> >> Would it be appropriate to add a WARN_ON_ONCE() in the alignment
>> >> fault path then?
>> 
>> I think that's an excellent idea.
>
> Well, I get the last word here and it's no.

Sadly, yes.

>> >> If all alignment faults in the kernel are caused by broken drivers,
>> >> that would at least give us some hope of finding those drivers while
>> >> at the same time not causing much overhead in the case where we need
>> >> to do the fixup in the meantime.
>> >
>> > No.  It is my understanding that various IP option processing can also
>> > cause the alignment fault handler to be invoked, even when the packet is
>> > properly aligned, and then there's jffs2/mtd which also relies upon
>> > alignment faults being fixed up.
>> 
>> As far as I'm concerned, this is all hearsay, and I've only ever heard
>> it from you.  Why can't you let those who care fix these bugs instead?
>
> You know, I'm giving you the benefit of my _knowledge_ which has been
> built over the course of the last 20 years.

How proud you sound.  Now could you say something of substance instead?

> I've been in these discussions with networking people before.  I ended
> up having to develop the alignment fault handler because of those
> discussions.  And oh look, Eric confirmed that the networking code
> isn't going to get "fixed" as you were demanding, which is exactly
> what I said.

Funny, I saw him say the exact opposite:

  So if you find an offender, please report a bug, because I can
  guarantee you we will _fix_ it.

> I've been in discussions with MTD people over these issues before, I've
> discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
> these things.

In the same way you "know" the networking people won't fix their code,
despite them _clearly_ stating the opposite?

> You can call it hearsay if you wish, but it seems to be more accurate
> than your wild outlandish and pathetic statements.

So you're resorting to name-calling.  Not taking that bait.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 11:18                                                             ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-12 11:18 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 12:00:03PM +0100, M?ns Rullg?rd wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> 
>> > On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
>> >> On Thursday 11 October 2012, M?ns Rullg?rd wrote:
>> >> > > But, the IP header is expected to be aligned.
>> >> > 
>> >> > Everything tells the compiler the struct is perfectly aligned.  When the
>> >> > buggy driver passes a misaligned pointer, bad things happen.
>> >> 
>> >> Would it be appropriate to add a WARN_ON_ONCE() in the alignment
>> >> fault path then?
>> 
>> I think that's an excellent idea.
>
> Well, I get the last word here and it's no.

Sadly, yes.

>> >> If all alignment faults in the kernel are caused by broken drivers,
>> >> that would at least give us some hope of finding those drivers while
>> >> at the same time not causing much overhead in the case where we need
>> >> to do the fixup in the meantime.
>> >
>> > No.  It is my understanding that various IP option processing can also
>> > cause the alignment fault handler to be invoked, even when the packet is
>> > properly aligned, and then there's jffs2/mtd which also relies upon
>> > alignment faults being fixed up.
>> 
>> As far as I'm concerned, this is all hearsay, and I've only ever heard
>> it from you.  Why can't you let those who care fix these bugs instead?
>
> You know, I'm giving you the benefit of my _knowledge_ which has been
> built over the course of the last 20 years.

How proud you sound.  Now could you say something of substance instead?

> I've been in these discussions with networking people before.  I ended
> up having to develop the alignment fault handler because of those
> discussions.  And oh look, Eric confirmed that the networking code
> isn't going to get "fixed" as you were demanding, which is exactly
> what I said.

Funny, I saw him say the exact opposite:

  So if you find an offender, please report a bug, because I can
  guarantee you we will _fix_ it.

> I've been in discussions with MTD people over these issues before, I've
> discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
> these things.

In the same way you "know" the networking people won't fix their code,
despite them _clearly_ stating the opposite?

> You can call it hearsay if you wish, but it seems to be more accurate
> than your wild outlandish and pathetic statements.

So you're resorting to name-calling.  Not taking that bait.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 11:07                                                           ` Russell King - ARM Linux
@ 2012-10-12 11:19                                                             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 11:19 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: Eric Dumazet, Arnd Bergmann, netdev, David Laight, Jon Masters,
	linux-arm-kernel

On Fri, Oct 12, 2012 at 12:07:50PM +0100, Russell King - ARM Linux wrote:
> On Fri, Oct 12, 2012 at 12:00:03PM +0100, Måns Rullgård wrote:
> > Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> > 
> > > On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
> > >> On Thursday 11 October 2012, Måns Rullgård wrote:
> > >> > > But, the IP header is expected to be aligned.
> > >> > 
> > >> > Everything tells the compiler the struct is perfectly aligned.  When the
> > >> > buggy driver passes a misaligned pointer, bad things happen.
> > >> 
> > >> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
> > >> then?
> > 
> > I think that's an excellent idea.
> 
> Well, I get the last word here and it's no.
> 
> > >> If all alignment faults in the kernel are caused by broken drivers,
> > >> that would at least give us some hope of finding those drivers while
> > >> at the same time not causing much overhead in the case where we need
> > >> to do the fixup in the meantime.
> > >
> > > No.  It is my understanding that various IP option processing can also
> > > cause the alignment fault handler to be invoked, even when the packet is
> > > properly aligned, and then there's jffs2/mtd which also relies upon
> > > alignment faults being fixed up.
> > 
> > As far as I'm concerned, this is all hearsay, and I've only ever heard
> > it from you.  Why can't you let those who care fix these bugs instead?
> 
> You know, I'm giving you the benefit of my _knowledge_ which has been
> built over the course of the last 20 years.  I've been in these
> discussions with networking people before.  I ended up having to develop
> the alignment fault handler because of those discussions.

Correction: San Mehat at Corel Inc had to develop it, but I was involved
in those discussions over it nevertheless, as I was the person who merged
the code and then subsequently maintained it.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 11:19                                                             ` Russell King - ARM Linux
  0 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 11:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 12, 2012 at 12:07:50PM +0100, Russell King - ARM Linux wrote:
> On Fri, Oct 12, 2012 at 12:00:03PM +0100, M?ns Rullg?rd wrote:
> > Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> > 
> > > On Fri, Oct 12, 2012 at 08:11:42AM +0000, Arnd Bergmann wrote:
> > >> On Thursday 11 October 2012, M?ns Rullg?rd wrote:
> > >> > > But, the IP header is expected to be aligned.
> > >> > 
> > >> > Everything tells the compiler the struct is perfectly aligned.  When the
> > >> > buggy driver passes a misaligned pointer, bad things happen.
> > >> 
> > >> Would it be appropriate to add a WARN_ON_ONCE() in the alignment fault path
> > >> then?
> > 
> > I think that's an excellent idea.
> 
> Well, I get the last word here and it's no.
> 
> > >> If all alignment faults in the kernel are caused by broken drivers,
> > >> that would at least give us some hope of finding those drivers while
> > >> at the same time not causing much overhead in the case where we need
> > >> to do the fixup in the meantime.
> > >
> > > No.  It is my understanding that various IP option processing can also
> > > cause the alignment fault handler to be invoked, even when the packet is
> > > properly aligned, and then there's jffs2/mtd which also relies upon
> > > alignment faults being fixed up.
> > 
> > As far as I'm concerned, this is all hearsay, and I've only ever heard
> > it from you.  Why can't you let those who care fix these bugs instead?
> 
> You know, I'm giving you the benefit of my _knowledge_ which has been
> built over the course of the last 20 years.  I've been in these
> discussions with networking people before.  I ended up having to develop
> the alignment fault handler because of those discussions.

Correction: San Mehat at Corel Inc had to develop it, but I was involved
in those discussions over it nevertheless, as I was the person who merged
the code and then subsequently maintained it.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 11:18                                                             ` Måns Rullgård
@ 2012-10-12 11:44                                                               ` Russell King - ARM Linux
  -1 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 11:44 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: Arnd Bergmann, linux-arm-kernel, David Laight, Eric Dumazet,
	netdev, Jon Masters

On Fri, Oct 12, 2012 at 12:18:08PM +0100, Måns Rullgård wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> > Well, I get the last word here and it's no.
> 
> Sadly, yes.

It's not "sadly" - it's a matter of fact that the kernel does from time
to time generate misaligned accesses and they are _not_ bugs.  If they
were bugs, then the code to fix up misaligned accesses would not have
been developed, and we'd instead take the fault and panic or something
like that.

> >> >> If all alignment faults in the kernel are caused by broken drivers,
> >> >> that would at least give us some hope of finding those drivers while
> >> >> at the same time not causing much overhead in the case where we need
> >> >> to do the fixup in the meantime.
> >> >
> >> > No.  It is my understanding that various IP option processing can also
> >> > cause the alignment fault handler to be invoked, even when the packet is
> >> > properly aligned, and then there's jffs2/mtd which also relies upon
> >> > alignment faults being fixed up.
> >> 
> >> As far as I'm concerned, this is all hearsay, and I've only ever heard
> >> it from you.  Why can't you let those who care fix these bugs instead?
> >
> > You know, I'm giving you the benefit of my _knowledge_ which has been
> > built over the course of the last 20 years.
> 
> How proud you sound.  Now could you say something of substance instead?

You're proving yourself to be idiot?  There, that's substance.

> > I've been in these discussions with networking people before.  I ended
> > up having to develop the alignment fault handler because of those
> > discussions.  And oh look, Eric confirmed that the networking code
> > isn't going to get "fixed" as you were demanding, which is exactly
> > what I said.
> 
> Funny, I saw him say the exact opposite:
> 
>   So if you find an offender, please report a bug, because I can
>   guarantee you we will _fix_ it.

No, let's go back.

- You were demanding that the ipv4 header structure should be packed.
  I said that wasn't going to happen because the networking people
  wouldn't allow it, and it seems that's been proven correct.
- You were demanding that the ipv4 code used the unaligned accessors.
  I said that networking people wouldn't allow it either, and that's
  also been proven correct.

Both these points have been proven correct because Eric has said that the
core networking code is _not_ going to be changed to suit this.

What Eric _has_ said is that networking people consider packets supplied
to the networking layer where the IPv4 header is not aligned on architectures
where misaligned accesses are a problem to be a bug in the network driver,
not the network code, and proposed a solution.

That's entirely different from all your claims that the core networking
code needs fixing to avoid these misaligned accesses.

> > I've been in discussions with MTD people over these issues before, I've
> > discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
> > these things.
> 
> In the same way you "know" the networking people won't fix their code,
> despite them _clearly_ stating the opposite?

I'll tell you exactly how I *KNOW* this.  The issue came up because of
noMMU, which does not have any way to fix up alignment faults.  JFFS2
passes randomly aligned buffers to the MTD drivers, and the MTD drivers
assume that they're aligned and they do word accesses on them.

See the thread http://lists.arm.linux.org.uk/lurker/thread/20021204.191632.4473796b.en.html

See: http://lists.arm.linux.org.uk/lurker/message/20020225.195925.02bdbd47.en.html
and: http://lists.arm.linux.org.uk/lurker/message/20020313.150932.081a7592.en.html

There's several other threads where it's also discussed.

And while you're there, note the date.  There is nothing recent about this
issue.  It's well known, and well understood by those who have a grasp of
the issues that alignment faults are a part of normal operation by the
ARM kernel - and is one of the penalties of being tied into architecture
independent code.

Compromises have to be sought, and that's the compromise we get to live
with.

> > You can call it hearsay if you wish, but it seems to be more accurate
> > than your wild outlandish and pathetic statements.
> 
> So you're resorting to name-calling.  Not taking that bait.

Sorry?  So what you're saying is that it's fine for you to call my
comments hearsay, but I'm not allowed to express a view on your comments.
How arrogant of you.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 11:44                                                               ` Russell King - ARM Linux
  0 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 11:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 12, 2012 at 12:18:08PM +0100, M?ns Rullg?rd wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> > Well, I get the last word here and it's no.
> 
> Sadly, yes.

It's not "sadly" - it's a matter of fact that the kernel does from time
to time generate misaligned accesses and they are _not_ bugs.  If they
were bugs, then the code to fix up misaligned accesses would not have
been developed, and we'd instead take the fault and panic or something
like that.

> >> >> If all alignment faults in the kernel are caused by broken drivers,
> >> >> that would at least give us some hope of finding those drivers while
> >> >> at the same time not causing much overhead in the case where we need
> >> >> to do the fixup in the meantime.
> >> >
> >> > No.  It is my understanding that various IP option processing can also
> >> > cause the alignment fault handler to be invoked, even when the packet is
> >> > properly aligned, and then there's jffs2/mtd which also relies upon
> >> > alignment faults being fixed up.
> >> 
> >> As far as I'm concerned, this is all hearsay, and I've only ever heard
> >> it from you.  Why can't you let those who care fix these bugs instead?
> >
> > You know, I'm giving you the benefit of my _knowledge_ which has been
> > built over the course of the last 20 years.
> 
> How proud you sound.  Now could you say something of substance instead?

You're proving yourself to be idiot?  There, that's substance.

> > I've been in these discussions with networking people before.  I ended
> > up having to develop the alignment fault handler because of those
> > discussions.  And oh look, Eric confirmed that the networking code
> > isn't going to get "fixed" as you were demanding, which is exactly
> > what I said.
> 
> Funny, I saw him say the exact opposite:
> 
>   So if you find an offender, please report a bug, because I can
>   guarantee you we will _fix_ it.

No, let's go back.

- You were demanding that the ipv4 header structure should be packed.
  I said that wasn't going to happen because the networking people
  wouldn't allow it, and it seems that's been proven correct.
- You were demanding that the ipv4 code used the unaligned accessors.
  I said that networking people wouldn't allow it either, and that's
  also been proven correct.

Both these points have been proven correct because Eric has said that the
core networking code is _not_ going to be changed to suit this.

What Eric _has_ said is that networking people consider packets supplied
to the networking layer where the IPv4 header is not aligned on architectures
where misaligned accesses are a problem to be a bug in the network driver,
not the network code, and proposed a solution.

That's entirely different from all your claims that the core networking
code needs fixing to avoid these misaligned accesses.

> > I've been in discussions with MTD people over these issues before, I've
> > discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
> > these things.
> 
> In the same way you "know" the networking people won't fix their code,
> despite them _clearly_ stating the opposite?

I'll tell you exactly how I *KNOW* this.  The issue came up because of
noMMU, which does not have any way to fix up alignment faults.  JFFS2
passes randomly aligned buffers to the MTD drivers, and the MTD drivers
assume that they're aligned and they do word accesses on them.

See the thread http://lists.arm.linux.org.uk/lurker/thread/20021204.191632.4473796b.en.html

See: http://lists.arm.linux.org.uk/lurker/message/20020225.195925.02bdbd47.en.html
and: http://lists.arm.linux.org.uk/lurker/message/20020313.150932.081a7592.en.html

There's several other threads where it's also discussed.

And while you're there, note the date.  There is nothing recent about this
issue.  It's well known, and well understood by those who have a grasp of
the issues that alignment faults are a part of normal operation by the
ARM kernel - and is one of the penalties of being tied into architecture
independent code.

Compromises have to be sought, and that's the compromise we get to live
with.

> > You can call it hearsay if you wish, but it seems to be more accurate
> > than your wild outlandish and pathetic statements.
> 
> So you're resorting to name-calling.  Not taking that bait.

Sorry?  So what you're saying is that it's fine for you to call my
comments hearsay, but I'm not allowed to express a view on your comments.
How arrogant of you.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 11:44                                                               ` Russell King - ARM Linux
@ 2012-10-12 12:08                                                                 ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-12 12:08 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Måns Rullgård, Arnd Bergmann, linux-arm-kernel,
	David Laight, netdev, Jon Masters

On Fri, 2012-10-12 at 12:44 +0100, Russell King - ARM Linux wrote:

> - You were demanding that the ipv4 header structure should be packed.
>   I said that wasn't going to happen because the networking people
>   wouldn't allow it, and it seems that's been proven correct.
> - You were demanding that the ipv4 code used the unaligned accessors.
>   I said that networking people wouldn't allow it either, and that's
>   also been proven correct.
> 
> Both these points have been proven correct because Eric has said that the
> core networking code is _not_ going to be changed to suit this.
> 
> What Eric _has_ said is that networking people consider packets supplied
> to the networking layer where the IPv4 header is not aligned on architectures
> where misaligned accesses are a problem to be a bug in the network driver,
> not the network code, and proposed a solution.
> 
> That's entirely different from all your claims that the core networking
> code needs fixing to avoid these misaligned accesses.

It seems we agree then, but your words were a bit misleading (at least
for me)

So yes, we built network stack with the prereq that IP headers are
aligned, but unfortunately many developers use x86 which doesnt care, so
its possible some bugs are added.

But its not by intent, only by accident.

Thanks

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 12:08                                                                 ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-12 12:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2012-10-12 at 12:44 +0100, Russell King - ARM Linux wrote:

> - You were demanding that the ipv4 header structure should be packed.
>   I said that wasn't going to happen because the networking people
>   wouldn't allow it, and it seems that's been proven correct.
> - You were demanding that the ipv4 code used the unaligned accessors.
>   I said that networking people wouldn't allow it either, and that's
>   also been proven correct.
> 
> Both these points have been proven correct because Eric has said that the
> core networking code is _not_ going to be changed to suit this.
> 
> What Eric _has_ said is that networking people consider packets supplied
> to the networking layer where the IPv4 header is not aligned on architectures
> where misaligned accesses are a problem to be a bug in the network driver,
> not the network code, and proposed a solution.
> 
> That's entirely different from all your claims that the core networking
> code needs fixing to avoid these misaligned accesses.

It seems we agree then, but your words were a bit misleading (at least
for me)

So yes, we built network stack with the prereq that IP headers are
aligned, but unfortunately many developers use x86 which doesnt care, so
its possible some bugs are added.

But its not by intent, only by accident.

Thanks

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 11:44                                                               ` Russell King - ARM Linux
@ 2012-10-12 12:16                                                                 ` Måns Rullgård
  -1 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-12 12:16 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Måns Rullgård, Arnd Bergmann, linux-arm-kernel,
	David Laight, Eric Dumazet, netdev, Jon Masters

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 12:18:08PM +0100, Måns Rullgård wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> > Well, I get the last word here and it's no.
>> 
>> Sadly, yes.
>
> It's not "sadly" - it's a matter of fact that the kernel does from time
> to time generate misaligned accesses and they are _not_ bugs.  If they
> were bugs, then the code to fix up misaligned accesses would not have
> been developed, and we'd instead take the fault and panic or something
> like that.
>
>> >> >> If all alignment faults in the kernel are caused by broken drivers,
>> >> >> that would at least give us some hope of finding those drivers while
>> >> >> at the same time not causing much overhead in the case where we need
>> >> >> to do the fixup in the meantime.
>> >> >
>> >> > No.  It is my understanding that various IP option processing can also
>> >> > cause the alignment fault handler to be invoked, even when the packet is
>> >> > properly aligned, and then there's jffs2/mtd which also relies upon
>> >> > alignment faults being fixed up.
>> >> 
>> >> As far as I'm concerned, this is all hearsay, and I've only ever heard
>> >> it from you.  Why can't you let those who care fix these bugs instead?
>> >
>> > You know, I'm giving you the benefit of my _knowledge_ which has been
>> > built over the course of the last 20 years.
>> 
>> How proud you sound.  Now could you say something of substance instead?
>
> You're proving yourself to be idiot?  There, that's substance.
>
>> > I've been in these discussions with networking people before.  I ended
>> > up having to develop the alignment fault handler because of those
>> > discussions.  And oh look, Eric confirmed that the networking code
>> > isn't going to get "fixed" as you were demanding, which is exactly
>> > what I said.
>> 
>> Funny, I saw him say the exact opposite:
>> 
>>   So if you find an offender, please report a bug, because I can
>>   guarantee you we will _fix_ it.
>
> No, let's go back.
>
> - You were demanding that the ipv4 header structure should be packed.

I demanded no such thing.

>   I said that wasn't going to happen because the networking people
>   wouldn't allow it, and it seems that's been proven correct.
> - You were demanding that the ipv4 code used the unaligned accessors.

Again, I made no such demand.

>   I said that networking people wouldn't allow it either, and that's
>   also been proven correct.

I did not, in fact, _demand_ anything at all.  What I did say was that
to fix the problem of unaligned access traps the OP was having, the
(driver) code supplying the unaligned pointer should be fixed, or *if
that is not possible* mark the structs __packed.  As it turns out,
fixing the driver is easy, so there is no need to change the structs or
how they are accessed.

> Both these points have been proven correct because Eric has said that the
> core networking code is _not_ going to be changed to suit this.
>
> What Eric _has_ said is that networking people consider packets supplied
> to the networking layer where the IPv4 header is not aligned on architectures
> where misaligned accesses are a problem to be a bug in the network driver,
> not the network code, and proposed a solution.
>
> That's entirely different from all your claims that the core networking
> code needs fixing to avoid these misaligned accesses.

I never said that.  I said whatever code is responsible for the pointer
should be fixed.  That code turns out to be the driver.

>> > I've been in discussions with MTD people over these issues before, I've
>> > discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
>> > these things.
>> 
>> In the same way you "know" the networking people won't fix their code,
>> despite them _clearly_ stating the opposite?
>
> I'll tell you exactly how I *KNOW* this.  The issue came up because of
> noMMU, which does not have any way to fix up alignment faults.  JFFS2
> passes randomly aligned buffers to the MTD drivers, and the MTD drivers
> assume that they're aligned and they do word accesses on them.

So jffs2 is broken.  Why can't it be fixed?

> See the thread http://lists.arm.linux.org.uk/lurker/thread/20021204.191632.4473796b.en.html

That thread is about detecting misaligned accesses and what to do with
them if they do occur.  I don't see anyone successfully arguing against
fixing the code causing them.

> See: http://lists.arm.linux.org.uk/lurker/message/20020225.195925.02bdbd47.en.html

Yes, there seems to be a problem in jffs2.  Or at least there was one 10
years ago.

> and: http://lists.arm.linux.org.uk/lurker/message/20020313.150932.081a7592.en.html

Yes, disabling the alignment trap on armv5 is a bad idea.  Nobody has
argued otherwise.

> There's several other threads where it's also discussed.
>
> And while you're there, note the date.  There is nothing recent about this
> issue.  It's well known, and well understood by those who have a grasp of
> the issues that alignment faults are a part of normal operation by the
> ARM kernel - and is one of the penalties of being tied into architecture
> independent code.
>
> Compromises have to be sought, and that's the compromise we get to live
> with.

So because of some ancient history with jffs2, we should deny
present-day developers the tools to quickly identify problems in their
code?  I just _love_ such compromises.

>> > You can call it hearsay if you wish, but it seems to be more accurate
>> > than your wild outlandish and pathetic statements.
>> 
>> So you're resorting to name-calling.  Not taking that bait.
>
> Sorry?  So what you're saying is that it's fine for you to call my
> comments hearsay, but I'm not allowed to express a view on your comments.
> How arrogant of you.

Go ahead, pile it on.  I'm not falling into this trap.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 12:16                                                                 ` Måns Rullgård
  0 siblings, 0 replies; 133+ messages in thread
From: Måns Rullgård @ 2012-10-12 12:16 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 12:18:08PM +0100, M?ns Rullg?rd wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> > Well, I get the last word here and it's no.
>> 
>> Sadly, yes.
>
> It's not "sadly" - it's a matter of fact that the kernel does from time
> to time generate misaligned accesses and they are _not_ bugs.  If they
> were bugs, then the code to fix up misaligned accesses would not have
> been developed, and we'd instead take the fault and panic or something
> like that.
>
>> >> >> If all alignment faults in the kernel are caused by broken drivers,
>> >> >> that would at least give us some hope of finding those drivers while
>> >> >> at the same time not causing much overhead in the case where we need
>> >> >> to do the fixup in the meantime.
>> >> >
>> >> > No.  It is my understanding that various IP option processing can also
>> >> > cause the alignment fault handler to be invoked, even when the packet is
>> >> > properly aligned, and then there's jffs2/mtd which also relies upon
>> >> > alignment faults being fixed up.
>> >> 
>> >> As far as I'm concerned, this is all hearsay, and I've only ever heard
>> >> it from you.  Why can't you let those who care fix these bugs instead?
>> >
>> > You know, I'm giving you the benefit of my _knowledge_ which has been
>> > built over the course of the last 20 years.
>> 
>> How proud you sound.  Now could you say something of substance instead?
>
> You're proving yourself to be idiot?  There, that's substance.
>
>> > I've been in these discussions with networking people before.  I ended
>> > up having to develop the alignment fault handler because of those
>> > discussions.  And oh look, Eric confirmed that the networking code
>> > isn't going to get "fixed" as you were demanding, which is exactly
>> > what I said.
>> 
>> Funny, I saw him say the exact opposite:
>> 
>>   So if you find an offender, please report a bug, because I can
>>   guarantee you we will _fix_ it.
>
> No, let's go back.
>
> - You were demanding that the ipv4 header structure should be packed.

I demanded no such thing.

>   I said that wasn't going to happen because the networking people
>   wouldn't allow it, and it seems that's been proven correct.
> - You were demanding that the ipv4 code used the unaligned accessors.

Again, I made no such demand.

>   I said that networking people wouldn't allow it either, and that's
>   also been proven correct.

I did not, in fact, _demand_ anything at all.  What I did say was that
to fix the problem of unaligned access traps the OP was having, the
(driver) code supplying the unaligned pointer should be fixed, or *if
that is not possible* mark the structs __packed.  As it turns out,
fixing the driver is easy, so there is no need to change the structs or
how they are accessed.

> Both these points have been proven correct because Eric has said that the
> core networking code is _not_ going to be changed to suit this.
>
> What Eric _has_ said is that networking people consider packets supplied
> to the networking layer where the IPv4 header is not aligned on architectures
> where misaligned accesses are a problem to be a bug in the network driver,
> not the network code, and proposed a solution.
>
> That's entirely different from all your claims that the core networking
> code needs fixing to avoid these misaligned accesses.

I never said that.  I said whatever code is responsible for the pointer
should be fixed.  That code turns out to be the driver.

>> > I've been in discussions with MTD people over these issues before, I've
>> > discussed this with David Woodhouse when it came up in JFFS2.  I *KNOW*
>> > these things.
>> 
>> In the same way you "know" the networking people won't fix their code,
>> despite them _clearly_ stating the opposite?
>
> I'll tell you exactly how I *KNOW* this.  The issue came up because of
> noMMU, which does not have any way to fix up alignment faults.  JFFS2
> passes randomly aligned buffers to the MTD drivers, and the MTD drivers
> assume that they're aligned and they do word accesses on them.

So jffs2 is broken.  Why can't it be fixed?

> See the thread http://lists.arm.linux.org.uk/lurker/thread/20021204.191632.4473796b.en.html

That thread is about detecting misaligned accesses and what to do with
them if they do occur.  I don't see anyone successfully arguing against
fixing the code causing them.

> See: http://lists.arm.linux.org.uk/lurker/message/20020225.195925.02bdbd47.en.html

Yes, there seems to be a problem in jffs2.  Or at least there was one 10
years ago.

> and: http://lists.arm.linux.org.uk/lurker/message/20020313.150932.081a7592.en.html

Yes, disabling the alignment trap on armv5 is a bad idea.  Nobody has
argued otherwise.

> There's several other threads where it's also discussed.
>
> And while you're there, note the date.  There is nothing recent about this
> issue.  It's well known, and well understood by those who have a grasp of
> the issues that alignment faults are a part of normal operation by the
> ARM kernel - and is one of the penalties of being tied into architecture
> independent code.
>
> Compromises have to be sought, and that's the compromise we get to live
> with.

So because of some ancient history with jffs2, we should deny
present-day developers the tools to quickly identify problems in their
code?  I just _love_ such compromises.

>> > You can call it hearsay if you wish, but it seems to be more accurate
>> > than your wild outlandish and pathetic statements.
>> 
>> So you're resorting to name-calling.  Not taking that bait.
>
> Sorry?  So what you're saying is that it's fine for you to call my
> comments hearsay, but I'm not allowed to express a view on your comments.
> How arrogant of you.

Go ahead, pile it on.  I'm not falling into this trap.

-- 
M?ns Rullg?rd
mans at mansr.com

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 10:04                                                         ` Eric Dumazet
@ 2012-10-12 12:24                                                           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 12:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Arnd Bergmann, linux-arm-kernel, Måns Rullgård,
	David Laight, netdev, Jon Masters

On Fri, Oct 12, 2012 at 12:04:23PM +0200, Eric Dumazet wrote:
> On Fri, 2012-10-12 at 10:03 +0100, Russell King - ARM Linux wrote:
> 
> > No.  It is my understanding that various IP option processing can also
> > cause the alignment fault handler to be invoked, even when the packet is
> > properly aligned, and then there's jffs2/mtd which also relies upon
> > alignment faults being fixed up.
> 
> Oh well.
> 
> We normally make sure we dont have alignment faults on arches that dont
> have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS (or a non null NET_IP_ALIGN)
> 
> So if you find an offender, please report a bug, because I can guarantee
> you we will _fix_ it.

I think one change I will make to the ARM alignment fixup is to get it
to record the last PC where a misaligned kernel fault occurred, and
report it via our statistics procfs file.  That should allow us to
track down where some of these occur.

They aren't anywhere near regular though - looking at the statistics, my
firewall seems to do an average of around 2-3 a day, and a web server
around 7-8 a day.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 12:24                                                           ` Russell King - ARM Linux
  0 siblings, 0 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-12 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 12, 2012 at 12:04:23PM +0200, Eric Dumazet wrote:
> On Fri, 2012-10-12 at 10:03 +0100, Russell King - ARM Linux wrote:
> 
> > No.  It is my understanding that various IP option processing can also
> > cause the alignment fault handler to be invoked, even when the packet is
> > properly aligned, and then there's jffs2/mtd which also relies upon
> > alignment faults being fixed up.
> 
> Oh well.
> 
> We normally make sure we dont have alignment faults on arches that dont
> have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS (or a non null NET_IP_ALIGN)
> 
> So if you find an offender, please report a bug, because I can guarantee
> you we will _fix_ it.

I think one change I will make to the ARM alignment fixup is to get it
to record the last PC where a misaligned kernel fault occurred, and
report it via our statistics procfs file.  That should allow us to
track down where some of these occur.

They aren't anywhere near regular though - looking at the statistics, my
firewall seems to do an average of around 2-3 a day, and a web server
around 7-8 a day.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 12:08                                                                 ` Eric Dumazet
@ 2012-10-12 14:22                                                                   ` Benjamin LaHaise
  -1 siblings, 0 replies; 133+ messages in thread
From: Benjamin LaHaise @ 2012-10-12 14:22 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Russell King - ARM Linux, Måns Rullgård, Arnd Bergmann,
	linux-arm-kernel, David Laight, netdev, Jon Masters

On Fri, Oct 12, 2012 at 02:08:12PM +0200, Eric Dumazet wrote:
> So yes, we built network stack with the prereq that IP headers are
> aligned, but unfortunately many developers use x86 which doesnt care, so
> its possible some bugs are added.

x86 does have an alignment check flag that can be set in the flags register.  
Somehow, I doubt anyone would be willing to walk through all the noise the 
faults would likely trigger.

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 14:22                                                                   ` Benjamin LaHaise
  0 siblings, 0 replies; 133+ messages in thread
From: Benjamin LaHaise @ 2012-10-12 14:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 12, 2012 at 02:08:12PM +0200, Eric Dumazet wrote:
> So yes, we built network stack with the prereq that IP headers are
> aligned, but unfortunately many developers use x86 which doesnt care, so
> its possible some bugs are added.

x86 does have an alignment check flag that can be set in the flags register.  
Somehow, I doubt anyone would be willing to walk through all the noise the 
faults would likely trigger.

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: alignment faults in 3.6
  2012-10-12 14:22                                                                   ` Benjamin LaHaise
@ 2012-10-12 14:36                                                                     ` David Laight
  -1 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-12 14:36 UTC (permalink / raw)
  To: Benjamin LaHaise, Eric Dumazet
  Cc: Russell King - ARM Linux, Måns Rullgård, Arnd Bergmann,
	linux-arm-kernel, netdev, Jon Masters

> x86 does have an alignment check flag that can be set in the flags register.
> Somehow, I doubt anyone would be willing to walk through all the noise the
> faults would likely trigger.

Someone has tried to set that (in userspace) on NetBSD.
The fault reporting has been fixed, but really nothing
works since optimised parts of libc deliberately do
misaligned transfers.

	David

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 14:36                                                                     ` David Laight
  0 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-12 14:36 UTC (permalink / raw)
  To: linux-arm-kernel

> x86 does have an alignment check flag that can be set in the flags register.
> Somehow, I doubt anyone would be willing to walk through all the noise the
> faults would likely trigger.

Someone has tried to set that (in userspace) on NetBSD.
The fault reporting has been fixed, but really nothing
works since optimised parts of libc deliberately do
misaligned transfers.

	David

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 14:22                                                                   ` Benjamin LaHaise
@ 2012-10-12 14:48                                                                     ` Eric Dumazet
  -1 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-12 14:48 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Russell King - ARM Linux, Måns Rullgård, Arnd Bergmann,
	linux-arm-kernel, David Laight, netdev, Jon Masters

On Fri, 2012-10-12 at 10:22 -0400, Benjamin LaHaise wrote:
> On Fri, Oct 12, 2012 at 02:08:12PM +0200, Eric Dumazet wrote:
> > So yes, we built network stack with the prereq that IP headers are
> > aligned, but unfortunately many developers use x86 which doesnt care, so
> > its possible some bugs are added.
> 
> x86 does have an alignment check flag that can be set in the flags register.  
> Somehow, I doubt anyone would be willing to walk through all the noise the 
> faults would likely trigger.

If this can be mapped to an event that can be used by perf tool, that
might be useful ?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 14:48                                                                     ` Eric Dumazet
  0 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2012-10-12 14:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2012-10-12 at 10:22 -0400, Benjamin LaHaise wrote:
> On Fri, Oct 12, 2012 at 02:08:12PM +0200, Eric Dumazet wrote:
> > So yes, we built network stack with the prereq that IP headers are
> > aligned, but unfortunately many developers use x86 which doesnt care, so
> > its possible some bugs are added.
> 
> x86 does have an alignment check flag that can be set in the flags register.  
> Somehow, I doubt anyone would be willing to walk through all the noise the 
> faults would likely trigger.

If this can be mapped to an event that can be used by perf tool, that
might be useful ?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 14:48                                                                     ` Eric Dumazet
@ 2012-10-12 15:00                                                                       ` Benjamin LaHaise
  -1 siblings, 0 replies; 133+ messages in thread
From: Benjamin LaHaise @ 2012-10-12 15:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Russell King - ARM Linux, Måns Rullgård, Arnd Bergmann,
	linux-arm-kernel, David Laight, netdev, Jon Masters

On Fri, Oct 12, 2012 at 04:48:20PM +0200, Eric Dumazet wrote:
> > Somehow, I doubt anyone would be willing to walk through all the noise the 
> > faults would likely trigger.
> 
> If this can be mapped to an event that can be used by perf tool, that
> might be useful ?

There are performance counters for the various different types of alignment 
faults supported by perf. Modern x86 makes the vast majority of unaligned 
accesses very low overhead -- the only ones that really hurt are those 
straddling different vm pages, but even those have little cost compared to 
obsolete microarchitectures (*cough* P4 *cough*).

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 15:00                                                                       ` Benjamin LaHaise
  0 siblings, 0 replies; 133+ messages in thread
From: Benjamin LaHaise @ 2012-10-12 15:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Oct 12, 2012 at 04:48:20PM +0200, Eric Dumazet wrote:
> > Somehow, I doubt anyone would be willing to walk through all the noise the 
> > faults would likely trigger.
> 
> If this can be mapped to an event that can be used by perf tool, that
> might be useful ?

There are performance counters for the various different types of alignment 
faults supported by perf. Modern x86 makes the vast majority of unaligned 
accesses very low overhead -- the only ones that really hurt are those 
straddling different vm pages, but even those have little cost compared to 
obsolete microarchitectures (*cough* P4 *cough*).

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: alignment faults in 3.6
  2012-10-12 14:48                                                                     ` Eric Dumazet
@ 2012-10-12 15:04                                                                       ` Ben Hutchings
  -1 siblings, 0 replies; 133+ messages in thread
From: Ben Hutchings @ 2012-10-12 15:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Benjamin LaHaise, Russell King - ARM Linux,
	Måns Rullgård, Arnd Bergmann, linux-arm-kernel,
	David Laight, netdev, Jon Masters

On Fri, 2012-10-12 at 16:48 +0200, Eric Dumazet wrote:
> On Fri, 2012-10-12 at 10:22 -0400, Benjamin LaHaise wrote:
> > On Fri, Oct 12, 2012 at 02:08:12PM +0200, Eric Dumazet wrote:
> > > So yes, we built network stack with the prereq that IP headers are
> > > aligned, but unfortunately many developers use x86 which doesnt care, so
> > > its possible some bugs are added.
> > 
> > x86 does have an alignment check flag that can be set in the flags register.  
> > Somehow, I doubt anyone would be willing to walk through all the noise the 
> > faults would likely trigger.
> 
> If this can be mapped to an event that can be used by perf tool, that
> might be useful ?

AC was so useless that it has now been reallocated to use by SMAP
<https://lwn.net/Articles/517251/>.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 15:04                                                                       ` Ben Hutchings
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Hutchings @ 2012-10-12 15:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2012-10-12 at 16:48 +0200, Eric Dumazet wrote:
> On Fri, 2012-10-12 at 10:22 -0400, Benjamin LaHaise wrote:
> > On Fri, Oct 12, 2012 at 02:08:12PM +0200, Eric Dumazet wrote:
> > > So yes, we built network stack with the prereq that IP headers are
> > > aligned, but unfortunately many developers use x86 which doesnt care, so
> > > its possible some bugs are added.
> > 
> > x86 does have an alignment check flag that can be set in the flags register.  
> > Somehow, I doubt anyone would be willing to walk through all the noise the 
> > faults would likely trigger.
> 
> If this can be mapped to an event that can be used by perf tool, that
> might be useful ?

AC was so useless that it has now been reallocated to use by SMAP
<https://lwn.net/Articles/517251/>.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: alignment faults in 3.6
  2012-10-12 15:04                                                                       ` Ben Hutchings
@ 2012-10-12 15:47                                                                         ` David Laight
  -1 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-12 15:47 UTC (permalink / raw)
  To: Ben Hutchings, Eric Dumazet
  Cc: Benjamin LaHaise, Russell King - ARM Linux,
	Måns Rullgård, Arnd Bergmann, linux-arm-kernel, netdev,
	Jon Masters

> AC was so useless that it has now been reallocated to use by SMAP
> <https://lwn.net/Articles/517251/>.

That is a long time coming!
Wonder when it will appear in any cpus.

How am I going to get root access when I can't get the kernel
to execute code at address 0 any more :-)


^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 15:47                                                                         ` David Laight
  0 siblings, 0 replies; 133+ messages in thread
From: David Laight @ 2012-10-12 15:47 UTC (permalink / raw)
  To: linux-arm-kernel

> AC was so useless that it has now been reallocated to use by SMAP
> <https://lwn.net/Articles/517251/>.

That is a long time coming!
Wonder when it will appear in any cpus.

How am I going to get root access when I can't get the kernel
to execute code at address 0 any more :-)

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: alignment faults in 3.6
  2012-10-12 15:47                                                                         ` David Laight
@ 2012-10-12 16:13                                                                           ` Ben Hutchings
  -1 siblings, 0 replies; 133+ messages in thread
From: Ben Hutchings @ 2012-10-12 16:13 UTC (permalink / raw)
  To: David Laight
  Cc: Eric Dumazet, Benjamin LaHaise, Russell King - ARM Linux,
	Måns Rullgård, Arnd Bergmann, linux-arm-kernel, netdev,
	Jon Masters

On Fri, 2012-10-12 at 16:47 +0100, David Laight wrote:
> > AC was so useless that it has now been reallocated to use by SMAP
> > <https://lwn.net/Articles/517251/>.
> 
> That is a long time coming!
> Wonder when it will appear in any cpus.
> 
> How am I going to get root access when I can't get the kernel
> to execute code at address 0 any more :-)

Moving further off the topic: that is supposed to be prevented by a
separate feature, SMEP, which I think is available in current Intel CPUs
(Ivy Bridge).  Also, unprivileged users are generally not permitted to
mmap() at address 0.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
@ 2012-10-12 16:13                                                                           ` Ben Hutchings
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Hutchings @ 2012-10-12 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2012-10-12 at 16:47 +0100, David Laight wrote:
> > AC was so useless that it has now been reallocated to use by SMAP
> > <https://lwn.net/Articles/517251/>.
> 
> That is a long time coming!
> Wonder when it will appear in any cpus.
> 
> How am I going to get root access when I can't get the kernel
> to execute code at address 0 any more :-)

Moving further off the topic: that is supposed to be prevented by a
separate feature, SMEP, which I think is available in current Intel CPUs
(Ivy Bridge).  Also, unprivileged users are generally not permitted to
mmap() at address 0.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-05 10:51   ` Russell King - ARM Linux
@ 2012-10-23 16:30     ` Jon Masters
  2012-10-23 16:58       ` Russell King - ARM Linux
  0 siblings, 1 reply; 133+ messages in thread
From: Jon Masters @ 2012-10-23 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/05/2012 06:51 AM, Russell King - ARM Linux wrote:

<snip switch out the might_fault __get_user for atomic
probe_kernel_address related discussion>

> Okay, this should fix the issue...  I've only compile tested it so far.
> Rob, as you have a way to trigger this easily, can you give this patch
> a go and let me know if it solves your problem?  Thanks.

Russell, can you let me know the status of this patch? I don't see it in
your tree, and I think I might have missed a followup patch from you? It
seems to solve a real problem with in-kernel faults during the dump_mem
in general, even beyond just an alignment fault.

Jon.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-23 16:30     ` Jon Masters
@ 2012-10-23 16:58       ` Russell King - ARM Linux
  2012-10-23 17:15         ` Jon Masters
  2012-10-23 19:14         ` Rob Herring
  0 siblings, 2 replies; 133+ messages in thread
From: Russell King - ARM Linux @ 2012-10-23 16:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Oct 23, 2012 at 12:30:14PM -0400, Jon Masters wrote:
> On 10/05/2012 06:51 AM, Russell King - ARM Linux wrote:
> 
> <snip switch out the might_fault __get_user for atomic
> probe_kernel_address related discussion>
> 
> > Okay, this should fix the issue...  I've only compile tested it so far.
> > Rob, as you have a way to trigger this easily, can you give this patch
> > a go and let me know if it solves your problem?  Thanks.
> 
> Russell, can you let me know the status of this patch? I don't see it in
> your tree, and I think I might have missed a followup patch from you? It
> seems to solve a real problem with in-kernel faults during the dump_mem
> in general, even beyond just an alignment fault.

I was waiting for someone to say "I saw a problem and it works"...
Would be nice to have a Tested-by tag against it...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-23 16:58       ` Russell King - ARM Linux
@ 2012-10-23 17:15         ` Jon Masters
  2012-10-23 19:14         ` Rob Herring
  1 sibling, 0 replies; 133+ messages in thread
From: Jon Masters @ 2012-10-23 17:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/23/2012 12:58 PM, Russell King - ARM Linux wrote:
> On Tue, Oct 23, 2012 at 12:30:14PM -0400, Jon Masters wrote:
>> On 10/05/2012 06:51 AM, Russell King - ARM Linux wrote:
>>
>> <snip switch out the might_fault __get_user for atomic
>> probe_kernel_address related discussion>
>>
>>> Okay, this should fix the issue...  I've only compile tested it so far.
>>> Rob, as you have a way to trigger this easily, can you give this patch
>>> a go and let me know if it solves your problem?  Thanks.
>>
>> Russell, can you let me know the status of this patch? I don't see it in
>> your tree, and I think I might have missed a followup patch from you? It
>> seems to solve a real problem with in-kernel faults during the dump_mem
>> in general, even beyond just an alignment fault.
> 
> I was waiting for someone to say "I saw a problem and it works"...
> Would be nice to have a Tested-by tag against it...

I've poked at the patch, made one against Fedora, and asked someone in
my team to do a scratch build and test on our local highbank/other
systems. We'll come back with that tag for ya asap, thanks!

Jon.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* alignment faults in 3.6
  2012-10-23 16:58       ` Russell King - ARM Linux
  2012-10-23 17:15         ` Jon Masters
@ 2012-10-23 19:14         ` Rob Herring
  1 sibling, 0 replies; 133+ messages in thread
From: Rob Herring @ 2012-10-23 19:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/23/2012 11:58 AM, Russell King - ARM Linux wrote:
> On Tue, Oct 23, 2012 at 12:30:14PM -0400, Jon Masters wrote:
>> On 10/05/2012 06:51 AM, Russell King - ARM Linux wrote:
>>
>> <snip switch out the might_fault __get_user for atomic
>> probe_kernel_address related discussion>
>>
>>> Okay, this should fix the issue...  I've only compile tested it so far.
>>> Rob, as you have a way to trigger this easily, can you give this patch
>>> a go and let me know if it solves your problem?  Thanks.
>>
>> Russell, can you let me know the status of this patch? I don't see it in
>> your tree, and I think I might have missed a followup patch from you? It
>> seems to solve a real problem with in-kernel faults during the dump_mem
>> in general, even beyond just an alignment fault.
> 
> I was waiting for someone to say "I saw a problem and it works"...
> Would be nice to have a Tested-by tag against it...

I thought I had said it does work and fixes the problem. Anyway,

Tested-by: Rob Herring <rob.herring@calxeda.com>

Rob

^ permalink raw reply	[flat|nested] 133+ messages in thread

end of thread, other threads:[~2012-10-23 19:14 UTC | newest]

Thread overview: 133+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-04 23:10 alignment faults in 3.6 Rob Herring
2012-10-05  0:58 ` Michael Hope
2012-10-05  1:26   ` Mans Rullgard
2012-10-05  1:56     ` Rob Herring
2012-10-05  2:25       ` Mans Rullgard
2012-10-05  3:04         ` Rob Herring
2012-10-05  5:37           ` Khem Raj
2012-10-05  7:12         ` Russell King - ARM Linux
2012-10-05  8:20           ` Mans Rullgard
2012-10-05  8:24             ` Russell King - ARM Linux
2012-10-05  8:33               ` Mans Rullgard
2012-10-05  8:33                 ` Russell King - ARM Linux
2012-10-05  8:37                   ` Mans Rullgard
2012-10-05  8:50                     ` Russell King - ARM Linux
2012-10-05 13:49                     ` Mikael Pettersson
2012-10-05 12:24               ` Rob Herring
2012-10-05 13:51                 ` Mikael Pettersson
2012-10-05 16:01                   ` Rob Herring
2012-10-05 22:37                     ` Mans Rullgard
2012-10-05 22:42                       ` Russell King - ARM Linux
2012-10-06  1:41                         ` Nicolas Pitre
2012-10-06 16:04                         ` Mans Rullgard
2012-10-06 16:19                           ` Nicolas Pitre
2012-10-06 16:31                           ` Russell King - ARM Linux
2012-10-06 10:58                     ` Mikael Pettersson
2012-10-09 14:05                     ` Scott Bambrough
2012-10-09 14:18                       ` Mans Rullgard
2012-10-05 14:05                 ` Russell King - ARM Linux
2012-10-05 14:33                   ` Rob Herring
2012-10-11  0:59                     ` Jon Masters
2012-10-11  2:27                       ` Måns Rullgård
2012-10-11  2:27                         ` Måns Rullgård
2012-10-11  2:34                         ` Jon Masters
2012-10-11  2:34                           ` Jon Masters
2012-10-11  8:21                         ` David Laight
2012-10-11  8:21                           ` David Laight
2012-10-11  8:53                           ` Russell King - ARM Linux
2012-10-11  8:53                             ` Russell King - ARM Linux
2012-10-11  9:45                           ` Måns Rullgård
2012-10-11  9:45                             ` Måns Rullgård
2012-10-11 10:00                             ` Eric Dumazet
2012-10-11 10:00                               ` Eric Dumazet
2012-10-11 10:20                               ` Måns Rullgård
2012-10-11 10:20                                 ` Måns Rullgård
2012-10-11 10:22                               ` Eric Dumazet
2012-10-11 10:22                                 ` Eric Dumazet
2012-10-11 10:32                                 ` Russell King - ARM Linux
2012-10-11 10:32                                   ` Russell King - ARM Linux
2012-10-11 10:49                                   ` Eric Dumazet
2012-10-11 10:49                                     ` Eric Dumazet
2012-10-11 10:56                                     ` Maxime Bizon
2012-10-11 10:56                                       ` Maxime Bizon
2012-10-11 11:28                                       ` Eric Dumazet
2012-10-11 11:28                                         ` Eric Dumazet
2012-10-11 11:47                                         ` Maxime Bizon
2012-10-11 11:47                                           ` Maxime Bizon
2012-10-11 11:54                                           ` Eric Dumazet
2012-10-11 11:54                                             ` Eric Dumazet
2012-10-11 12:00                                             ` Eric Dumazet
2012-10-11 12:00                                               ` Eric Dumazet
2012-10-11 12:51                                             ` Maxime Bizon
2012-10-11 12:51                                               ` Maxime Bizon
2012-10-11 12:59                                               ` Eric Dumazet
2012-10-11 12:59                                                 ` Eric Dumazet
2012-10-11 12:28                                     ` Arnd Bergmann
2012-10-11 12:28                                       ` Arnd Bergmann
2012-10-11 12:40                                       ` Eric Dumazet
2012-10-11 12:40                                         ` Eric Dumazet
2012-10-11 13:20                                         ` Rob Herring
2012-10-11 13:20                                           ` Rob Herring
2012-10-11 13:32                                           ` Måns Rullgård
2012-10-11 13:32                                             ` Måns Rullgård
2012-10-11 13:35                                           ` Arnd Bergmann
2012-10-11 13:35                                             ` Arnd Bergmann
2012-10-11 13:47                                           ` Eric Dumazet
2012-10-11 13:47                                             ` Eric Dumazet
2012-10-11 15:23                                             ` Rob Herring
2012-10-11 15:23                                               ` Rob Herring
2012-10-11 15:39                                               ` David Laight
2012-10-11 15:39                                                 ` David Laight
2012-10-11 16:18                                                 ` Måns Rullgård
2012-10-11 16:18                                                   ` Måns Rullgård
2012-10-12  8:11                                                   ` Arnd Bergmann
2012-10-12  8:11                                                     ` Arnd Bergmann
2012-10-12  9:03                                                     ` Russell King - ARM Linux
2012-10-12  9:03                                                       ` Russell King - ARM Linux
2012-10-12 10:04                                                       ` Eric Dumazet
2012-10-12 10:04                                                         ` Eric Dumazet
2012-10-12 12:24                                                         ` Russell King - ARM Linux
2012-10-12 12:24                                                           ` Russell King - ARM Linux
2012-10-12 11:00                                                       ` Måns Rullgård
2012-10-12 11:00                                                         ` Måns Rullgård
2012-10-12 11:07                                                         ` Russell King - ARM Linux
2012-10-12 11:07                                                           ` Russell King - ARM Linux
2012-10-12 11:18                                                           ` Måns Rullgård
2012-10-12 11:18                                                             ` Måns Rullgård
2012-10-12 11:44                                                             ` Russell King - ARM Linux
2012-10-12 11:44                                                               ` Russell King - ARM Linux
2012-10-12 12:08                                                               ` Eric Dumazet
2012-10-12 12:08                                                                 ` Eric Dumazet
2012-10-12 14:22                                                                 ` Benjamin LaHaise
2012-10-12 14:22                                                                   ` Benjamin LaHaise
2012-10-12 14:36                                                                   ` David Laight
2012-10-12 14:36                                                                     ` David Laight
2012-10-12 14:48                                                                   ` Eric Dumazet
2012-10-12 14:48                                                                     ` Eric Dumazet
2012-10-12 15:00                                                                     ` Benjamin LaHaise
2012-10-12 15:00                                                                       ` Benjamin LaHaise
2012-10-12 15:04                                                                     ` Ben Hutchings
2012-10-12 15:04                                                                       ` Ben Hutchings
2012-10-12 15:47                                                                       ` David Laight
2012-10-12 15:47                                                                         ` David Laight
2012-10-12 16:13                                                                         ` Ben Hutchings
2012-10-12 16:13                                                                           ` Ben Hutchings
2012-10-12 12:16                                                               ` Måns Rullgård
2012-10-12 12:16                                                                 ` Måns Rullgård
2012-10-12 11:19                                                           ` Russell King - ARM Linux
2012-10-12 11:19                                                             ` Russell King - ARM Linux
2012-10-11 16:15                                               ` Eric Dumazet
2012-10-11 16:15                                                 ` Eric Dumazet
2012-10-11 16:59                                   ` Catalin Marinas
2012-10-11 16:59                                     ` Catalin Marinas
2012-10-11 10:16                             ` David Laight
2012-10-11 10:16                               ` David Laight
2012-10-11 10:46                               ` Måns Rullgård
2012-10-11 10:46                                 ` Måns Rullgård
2012-10-05 16:08   ` Rob Herring
2012-10-05  7:29 ` Russell King - ARM Linux
2012-10-05 10:51   ` Russell King - ARM Linux
2012-10-23 16:30     ` Jon Masters
2012-10-23 16:58       ` Russell King - ARM Linux
2012-10-23 17:15         ` Jon Masters
2012-10-23 19:14         ` Rob Herring

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.