* [Xenomai] Unexpected mode switches when reordering C++ lines
@ 2017-09-27 13:14 Andreas Glatz
  2017-09-27 13:26 ` Andreas Glatz
  2017-09-27 13:53 ` Philippe Gerum
  0 siblings, 2 replies; 13+ messages in thread
From: Andreas Glatz @ 2017-09-27 13:14 UTC (permalink / raw)
  To: xenomai

Hi

I've got two versions of the same piece of code, the only difference
is that lines are reordered:

#if 1
   u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
   sPhasor.angle = interpretU32AsFloat(u32Tmp);
   u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
   sPhasor.mag   = interpretU32AsFloat(u32Tmp);
#else
   u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
   sPhasor.mag   = interpretU32AsFloat(u32Tmp);
   u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
   sPhasor.angle = interpretU32AsFloat(u32Tmp);
#endif

In case '#if 1' sPhasor.angle is assigned before sPhasor.mag, and
everything works fine.

In case '#if 0' it's the other way round, and I get mode switches
caused by the first (but not second) ntohl().

Looking at the assembly, there is a clear difference:

#if 1
   0x2ac5bcb8 <+304>:   ldr     r3, [r5]
   0x2ac5bcbc <+308>:   ldr     r2, [r5, #4]
   0x2ac5bcc0 <+312>:   rev     r3, r3
   0x2ac5bcc4 <+316>:   rev     r2, r2
[...]
#else
     0x2abc3cb8 <+304>:   ldm     r5, {r2, r3}
=> 0x2abc3cbc <+308>:   rev     r2, r2
     0x2abc3cc0 <+312>:   rev     r3, r3
[...]
#endif

Here the arrow indicates where the mode switch occurs, which is at the
'rev' (for reverse byte order instruction), hence ntohl(). It seems
that the assembler optimization that collates the two ' ldr r3, [r5]'
and 'ldr r2, [r5, #4]' into ' ldm r5, {r2, r3}' could have something
to do with the mode switches...

Any clues what causes the mode switches and how to avoid them?

Cheers,

Andreas


Setup:
i.MX6 dual (ARM Cortex A9)
$> cat /proc/version
Linux version 3.0.43-tpcom_run2-PD13.2.4 (andreas@gorse) (gcc version
4.6.2 (OSELAS.Toolchain-2011.11.3 linaro-4.6-2011.11) ) #1 SMP PREEMPT
Tue Sep 26 16:17:30 BST 2017
$> cat /proc/xenomai/version
2.6.4
$> cat /proc/ipipe/version
1.18-13
$> arm-cortexa9-linux-gnueabi-gcc --version
arm-cortexa9-linux-gnueabi-gcc (OSELAS.Toolchain-2011.11.3
linaro-4.6-2011.11) 4.6.2
$> arm-cortexa9-linux-gnueabi-ld --version
GNU ld (GNU Binutils) 2.21.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 13:14 [Xenomai] Unexpected mode switches when reordering C++ lines Andreas Glatz
@ 2017-09-27 13:26 ` Andreas Glatz
  2017-09-27 13:31   ` Philippe Gerum
  2017-09-27 13:53 ` Philippe Gerum
  1 sibling, 1 reply; 13+ messages in thread
From: Andreas Glatz @ 2017-09-27 13:26 UTC (permalink / raw)
  To: xenomai

> I've got two versions of the same piece of code, the only difference
> is that lines are reordered:
>
> #if 1
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
> #else
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
> #endif
>
> In case '#if 1' sPhasor.angle is assigned before sPhasor.mag, and
> everything works fine.
>
> In case '#if 0' it's the other way round, and I get mode switches
> caused by the first (but not second) ntohl().

I forgot to mention that the mode switch is actually an unaligned access
exception, as shown here:
$> cat /proc/xenomai/faults
TRAP         CPU0        CPU1
  0:            0          50    (Data or instruction access)
  1:            0           0    (Section fault)
  2:            0           0    (Generic data abort)
  3:            0           0    (Unknown exception)
  4:            0           0    (Instruction breakpoint)
  5:            0           0    (Floating point exception)
  6:            0           0    (VFP Floating point exception)
  7:            0           0    (Undefined instruction)
  8:            0    14629654    (Unaligned access exception)

>
> Looking at the assembly, there is a clear difference:
>
> #if 1
>    0x2ac5bcb8 <+304>:   ldr     r3, [r5]
>    0x2ac5bcbc <+308>:   ldr     r2, [r5, #4]
>    0x2ac5bcc0 <+312>:   rev     r3, r3
>    0x2ac5bcc4 <+316>:   rev     r2, r2
> [...]
> #else
>      0x2abc3cb8 <+304>:   ldm     r5, {r2, r3}
> => 0x2abc3cbc <+308>:   rev     r2, r2
>      0x2abc3cc0 <+312>:   rev     r3, r3
> [...]
> #endif
>
> Here the arrow indicates where the mode switch occurs, which is at the
> 'rev' (for reverse byte order instruction), hence ntohl(). It seems
> that the assembler optimization that collates the two ' ldr r3, [r5]'
> and 'ldr r2, [r5, #4]' into ' ldm r5, {r2, r3}' could have something
> to do with the mode switches...
>
> Any clues what causes the mode switches and how to avoid them?
>
> Cheers,
>
> Andreas
>
>
> Setup:
> i.MX6 dual (ARM Cortex A9)
> $> cat /proc/version
> Linux version 3.0.43-tpcom_run2-PD13.2.4 (andreas@gorse) (gcc version
> 4.6.2 (OSELAS.Toolchain-2011.11.3 linaro-4.6-2011.11) ) #1 SMP PREEMPT
> Tue Sep 26 16:17:30 BST 2017
> $> cat /proc/xenomai/version
> 2.6.4
> $> cat /proc/ipipe/version
> 1.18-13
> $> arm-cortexa9-linux-gnueabi-gcc --version
> arm-cortexa9-linux-gnueabi-gcc (OSELAS.Toolchain-2011.11.3
> linaro-4.6-2011.11) 4.6.2
> $> arm-cortexa9-linux-gnueabi-ld --version
> GNU ld (GNU Binutils) 2.21.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 13:26 ` Andreas Glatz
@ 2017-09-27 13:31   ` Philippe Gerum
  2017-09-27 13:37     ` Andreas Glatz
  0 siblings, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2017-09-27 13:31 UTC (permalink / raw)
  To: Andreas Glatz, xenomai

On 09/27/2017 03:26 PM, Andreas Glatz wrote:
>> I've got two versions of the same piece of code, the only difference
>> is that lines are reordered:
>>
>> #if 1
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>> #else
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>> #endif
>>
>> In case '#if 1' sPhasor.angle is assigned before sPhasor.mag, and
>> everything works fine.
>>
>> In case '#if 0' it's the other way round, and I get mode switches
>> caused by the first (but not second) ntohl().
> 
> I forgot to mention the mode switch is actually an unaligned access
> exception as shown here:
> $> cat /proc/xenomai/faults
> TRAP         CPU0        CPU1
>   0:            0          50    (Data or instruction access)
>   1:            0           0    (Section fault)
>   2:            0           0    (Generic data abort)
>   3:            0           0    (Unknown exception)
>   4:            0           0    (Instruction breakpoint)
>   5:            0           0    (Floating point exception)
>   6:            0           0    (VFP Floating point exception)
>   7:            0           0    (Undefined instruction)
>   8:            0    14629654    (Unaligned access exception)
> 

I believe that the ldm instruction is causing this, not rev. If so, this
would mean that %r5 contains an unaligned address for a 32bit read.
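
A quick way to confirm this at run time is to test the address before
dereferencing it. The following is only a minimal sketch; the helper
name is_aligned_u32 is hypothetical and not part of the original code:

```cpp
#include <cstdint>

// Hypothetical diagnostic helper (not from the original code): returns
// true when p is suitably aligned for a 32-bit load.
inline bool is_aligned_u32(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint32_t) == 0;
}
```

Logging the result for pDat once, outside the real-time path, should be
enough to tell whether the sender's layout really puts the two words at
a misaligned offset.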

-- 
Philippe.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 13:31   ` Philippe Gerum
@ 2017-09-27 13:37     ` Andreas Glatz
  2017-09-27 14:39       ` Lennart Sorensen
  2017-09-28  6:18       ` dietmar.schindler
  0 siblings, 2 replies; 13+ messages in thread
From: Andreas Glatz @ 2017-09-27 13:37 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On Wed, Sep 27, 2017 at 2:31 PM, Philippe Gerum <rpm@xenomai.org> wrote:
> On 09/27/2017 03:26 PM, Andreas Glatz wrote:
>>> I've got two versions of the same piece of code, the only difference
>>> is that lines are reordered:
>>>
>>> #if 1
>>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>>> #else
>>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>>> #endif
>>>
>>> In case '#if 1' sPhasor.angle is assigned before sPhasor.mag, and
>>> everything works fine.
>>>
>>> In case '#if 0' it's the other way round, and I get mode switches
>>> caused by the first (but not second) ntohl().
>>
>> I forgot to mention the mode switch is actually an unaligned access
>> exception as shown here:
>> $> cat /proc/xenomai/faults
>> TRAP         CPU0        CPU1
>>   0:            0          50    (Data or instruction access)
>>   1:            0           0    (Section fault)
>>   2:            0           0    (Generic data abort)
>>   3:            0           0    (Unknown exception)
>>   4:            0           0    (Instruction breakpoint)
>>   5:            0           0    (Floating point exception)
>>   6:            0           0    (VFP Floating point exception)
>>   7:            0           0    (Undefined instruction)
>>   8:            0    14629654    (Unaligned access exception)
>>
>
> I believe that the ldm instruction is causing this, not rev. If so, this
> would mean that %r5 contains an unaligned address for a 32bit read.
>

It's true according to the ARM doc [1]. But the pDat alignment is
prescribed by the sender and we're at the receiver. So I guess the
question is how to tell the assembler not to use ldm in the first
place?

Thanks

A.

[1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 13:14 [Xenomai] Unexpected mode switches when reordering C++ lines Andreas Glatz
  2017-09-27 13:26 ` Andreas Glatz
@ 2017-09-27 13:53 ` Philippe Gerum
  2017-09-27 17:25   ` Andreas Glatz
  1 sibling, 1 reply; 13+ messages in thread
From: Philippe Gerum @ 2017-09-27 13:53 UTC (permalink / raw)
  To: Andreas Glatz, xenomai

On 09/27/2017 03:14 PM, Andreas Glatz wrote:
> Hi
> 
> I've got two versions of the same piece of code, the only difference
> is that lines are reordered:
> 
> #if 1
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
> #else
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
> #endif
> 
> In case '#if 1' sPhasor.angle is assigned before sPhasor.mag, and
> everything works fine.
> 
> In case '#if 0' it's the other way round, and I get mode switches
> caused by the first (but not second) ntohl().
> 
> Looking at the assembly, there is a clear difference:
> 
> #if 1
>    0x2ac5bcb8 <+304>:   ldr     r3, [r5]
>    0x2ac5bcbc <+308>:   ldr     r2, [r5, #4]
>    0x2ac5bcc0 <+312>:   rev     r3, r3
>    0x2ac5bcc4 <+316>:   rev     r2, r2
> [...]
> #else
>      0x2abc3cb8 <+304>:   ldm     r5, {r2, r3}
> => 0x2abc3cbc <+308>:   rev     r2, r2
>      0x2abc3cc0 <+312>:   rev     r3, r3
> [...]
> #endif
> 
> Here the arrow indicates where the mode switch occurs, which is at the
> 'rev' (for reverse byte order instruction), hence ntohl(). It seems
> that the assembler optimization that collates the two ' ldr r3, [r5]'
> and 'ldr r2, [r5, #4]' into ' ldm r5, {r2, r3}' could have something
> to do with the mode switches...
> 
> Any clues what causes the mode switches and how to avoid them?
> 

The ugly way I'm afraid: load each 32bit word one byte at a time in a
temp variable recomposing the 32bit quantity first, then reshuffling the
temp variable contents to host byte order.
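
A minimal sketch of that byte-wise approach, assuming the data is in
network (big-endian) order; the helper name load_be32 is hypothetical.
Composing the value with shifts yields host byte order directly, so the
recomposition and the reshuffling collapse into one step and no ntohl()
is needed afterwards:

```cpp
#include <cstdint>

// Hypothetical helper: read a big-endian (network order) 32-bit value
// one byte at a time. Byte loads have no alignment requirement, and the
// shifts produce the value in host byte order on any host.
inline std::uint32_t load_be32(const unsigned char *p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}
```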

-- 
Philippe.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 13:37     ` Andreas Glatz
@ 2017-09-27 14:39       ` Lennart Sorensen
  2017-09-27 17:33         ` Andreas Glatz
  2017-09-28  6:18       ` dietmar.schindler
  1 sibling, 1 reply; 13+ messages in thread
From: Lennart Sorensen @ 2017-09-27 14:39 UTC (permalink / raw)
  To: Andreas Glatz; +Cc: xenomai

On Wed, Sep 27, 2017 at 02:37:39PM +0100, Andreas Glatz wrote:
> It's true according to the ARM doc [1]. But the pDat alignment is
> prescribed by the sender and we're at the receiver. So I guess the
> question is how to tell the assembler not to use ldm in the first
> place?

It certainly looks like the code somehow is optimizing two consecutive
32 bit reads into a single 64 bit read, but then the input is not 64 bit
aligned (but is 32 bit aligned) which causes an alignment exception on
the 64 bit read while the 32 bit reads were fine.

One would think the compiler should NOT be doing that.

Now as far as I remember, armv7 (which is what you appear to have)
allows unaligned accesses to normal memory, but not to IO ports or other
special memory.  I recall hitting a case where someone had flagged
some memory as DMA which caused the alignment exception to be hit.
Since the memory really had no reason to be flagged for DMA or anything
else special it was easy to fix.

Is the memory at your pointer allocated with any special flags or is
it device memory?  Because if it isn't, then you shouldn't as far as I
recall be hitting that exception.

Part of the problem seems to be that gcc optimizes with the assumption
that it is always talking to normal memory on arm and hence no alignment
rules apply anymore, but in driver code talking to IO and in cases where
you are flagging memory for DMA and such, gcc's optimizations get you
in trouble if you don't hit it with a stick in the right places.

-- 
Len Sorensen


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 13:53 ` Philippe Gerum
@ 2017-09-27 17:25   ` Andreas Glatz
  2017-09-27 18:15     ` Lennart Sorensen
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Glatz @ 2017-09-27 17:25 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On Wed, Sep 27, 2017 at 2:53 PM, Philippe Gerum <rpm@xenomai.org> wrote:
> On 09/27/2017 03:14 PM, Andreas Glatz wrote:
>> Hi
>>
>> I've got two versions of the same piece of code, the only difference
>> is that lines are reordered:
>>
>> #if 1
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>> #else
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>> #endif
>>
>> In case '#if 1' sPhasor.angle is assigned before sPhasor.mag, and
>> everything works fine.
>>
>> In case '#if 0' it's the other way round, and I get mode switches
>> caused by the first (but not second) ntohl().
>>
>> Looking at the assembly, there is a clear difference:
>>
>> #if 1
>>    0x2ac5bcb8 <+304>:   ldr     r3, [r5]
>>    0x2ac5bcbc <+308>:   ldr     r2, [r5, #4]
>>    0x2ac5bcc0 <+312>:   rev     r3, r3
>>    0x2ac5bcc4 <+316>:   rev     r2, r2
>> [...]
>> #else
>>      0x2abc3cb8 <+304>:   ldm     r5, {r2, r3}
>> => 0x2abc3cbc <+308>:   rev     r2, r2
>>      0x2abc3cc0 <+312>:   rev     r3, r3
>> [...]
>> #endif
>>
>> Here the arrow indicates where the mode switch occurs, which is at the
>> 'rev' (for reverse byte order instruction), hence ntohl(). It seems
>> that the assembler optimization that collates the two ' ldr r3, [r5]'
>> and 'ldr r2, [r5, #4]' into ' ldm r5, {r2, r3}' could have something
>> to do with the mode switches...
>>
>> Any clues what causes the mode switches and how to avoid them?
>>
>
> The ugly way I'm afraid: load each 32bit word one byte at a time in a
> temp variable recomposing the 32bit quantity first, then reshuffling the
> temp variable contents to host byte order.
>

As you say Philippe, I had to memcpy() 2xu32 from the (malloc'd)
memory region into a u32 tmpVal[2] array, and then process these
values one by one with ntohl(), ...
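
Roughly, the workaround looks like the sketch below. The Phasor layout
and the helpers u32_as_float and read_phasor are hypothetical stand-ins
(the real interpretU32AsFloat() is not shown in this thread), and the
float conversion assumes a 32-bit IEEE 754 float:

```cpp
#include <arpa/inet.h>  // ntohl (POSIX)
#include <cstdint>
#include <cstring>

struct Phasor { float mag; float angle; };  // hypothetical layout

// Hypothetical stand-in for interpretU32AsFloat(): copy the bit pattern
// instead of casting pointers, which keeps the aliasing rules intact.
inline float u32_as_float(std::uint32_t v)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    float f;
    std::memcpy(&f, &v, sizeof f);
    return f;
}

// Copy both words out of the possibly unaligned buffer first, then
// convert each one from network to host byte order.
inline Phasor read_phasor(const unsigned char *pDat)
{
    std::uint32_t tmpVal[2];
    std::memcpy(tmpVal, pDat, sizeof tmpVal);  // no alignment assumption

    Phasor p;
    p.mag   = u32_as_float(ntohl(tmpVal[0]));
    p.angle = u32_as_float(ntohl(tmpVal[1]));
    return p;
}
```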

I tried to cast the pDat pointer to u32* initially, but then I still
got the ldm instructions and hence mode switches as the assembler is
very good at optimizing.

I have the feeling the same could come up in other places as well as
we progress with the testing of our application. I guess a 'safe'
solution is to compile the code with -O0 or -O1, because I've never seen
these mode switches using these compiler options. However, an ideal
solution would be to have something to mark the memory region as
'unaligned' so the assembler then only uses ldr instead of ldm
instructions on it... maybe there is this unicorn running around
somewhere :)

Anyways, many thanks for your help.

A.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 14:39       ` Lennart Sorensen
@ 2017-09-27 17:33         ` Andreas Glatz
  2017-09-27 17:57           ` Lennart Sorensen
  2017-09-27 18:49           ` Philippe Gerum
  0 siblings, 2 replies; 13+ messages in thread
From: Andreas Glatz @ 2017-09-27 17:33 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: xenomai

Hi Lennart

On Wed, Sep 27, 2017 at 3:39 PM, Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
> On Wed, Sep 27, 2017 at 02:37:39PM +0100, Andreas Glatz wrote:
>> It's true according to the ARM doc [1]. But the pDat alignment is
>> prescribed by the sender and we're at the receiver. So I guess the
>> question is how to tell the assembler not to use ldm in the first
>> place?
>
> It certainly looks like the code somehow is optimizing two consecutive
> 32 bit reads into a single 64 bit read, but then the input is not 64 bit
> aligned (but is 32 bit aligned) which causes an alignment exception on
> the 64 bit read while the 32 bit reads were fine.
>
> One would think the compiler should NOT be doing that.

Yes, one would think so...

>
> Now as far as I remember, armv7 (which is what you appear to have)
> allows unaligned accesses to normal memory, but not to IO ports or other
> special memory.  I recall hitting a case where someone had flagged
> some memory as DMA which caused the alignment exception to be hit.
> Since the memory really had no reason to be flagged for DMA or anything
> else special it was easy to fix.

I shall try to play with these options then. As explained to Philippe
and hinted by you, one would ideally want a flag to mark a memory
region as 'unaligned' so that the compiler/assembler know that they
should only use instructions that support unaligned access, albeit
being slower.

>
> Is the memory at your pointer allocated with any special flags or is
> it device memory?  Because if it isn't, then you shouldn't as far as I
> recall be hitting that exception.

The memory is malloc'd before the Xenomai thread starts, but after
locking all the pages into memory (the usual Xenomai startup stuff).

>
> Part of the problem seems to be that gcc optimizes with the assumption
> that it is always talking to normal memory on arm and hence no alignment
> rules apply anymore, but in driver code talking to IO and in cases where
> you are flagging memory for DMA and such, gcc's optimizations get you
> in trouble if you don't hit it with a stick in the right places.

Apparently :) Another option would be to write my own access function
for this piece of memory in assembly... then there'd be no room for
wiggling around anymore.

Cheers,

A.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 17:33         ` Andreas Glatz
@ 2017-09-27 17:57           ` Lennart Sorensen
  2017-09-27 18:49           ` Philippe Gerum
  1 sibling, 0 replies; 13+ messages in thread
From: Lennart Sorensen @ 2017-09-27 17:57 UTC (permalink / raw)
  To: Andreas Glatz; +Cc: xenomai

On Wed, Sep 27, 2017 at 06:33:50PM +0100, Andreas Glatz wrote:
> Yes, one would think so...

Well reading a bit, it seems by default gcc assumes all pointers are
sensibly aligned.  Still not sure how that excuses doing a combination
of two reads into one in this case.

Which gcc version are you using?

> I shall try to play with these options then. As explained to Philippe
> and hinted by you, one would ideally want a flag to mark a memory
> region as 'unaligned' so that the compiler/assembler know that they
> should only use instructions that support unaligned access, albeit
> being slower.
> 
> The memory is malloc'd before the Xenomai thread starts, but after
> locking all the pages into memory (the usual Xenomai startup stuff).

So according to
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
only memory marked as normal allows unaligned accesses.  It also says
that the LDM instruction is NOT allowed on unaligned accesses.  So then
the question is why does gcc assume that pointer is always going to
be aligned?  Is this a bug in gcc?

> Apparently :) Other option would be to write my own access function
> for this piece of memory in assembly... then there'd be no room for
> wiggling around anymore.

Any chance that adding -fno-strict-aliasing to the gcc arguments helps?

-- 
Len Sorensen


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 17:25   ` Andreas Glatz
@ 2017-09-27 18:15     ` Lennart Sorensen
  0 siblings, 0 replies; 13+ messages in thread
From: Lennart Sorensen @ 2017-09-27 18:15 UTC (permalink / raw)
  To: Andreas Glatz; +Cc: xenomai

On Wed, Sep 27, 2017 at 06:25:18PM +0100, Andreas Glatz wrote:
> As you say Philippe, I had to memcpy() 2xu32 from the (malloc'd)
> memory region into a u32 tmpVal[2] array, and then process these
> values one by one with ntohl(), ...

Certainly using memcpy is the safe way to deal with unaligned data.

Casting a char* to an int* is just not valid and breaks aliasing rules.

> I tried to cast the pDat pointer to u32* initially, but then I still
> got the ldm instructions and hence mode switches as the assembler is
> very good at optimizing.
>
> I have the feeling the same could come up in other places as well as
> we progress with the testing of our application. I guess a 'safe'
> solution is to compile the code with -O0 or -O1, because I've never seen
> these mode switches using these compiler options. However, an ideal
> solution would be to have something to mark the memory region as
> 'unaligned' so the assembler then only uses ldr instead of ldm
> instructions on it... maybe there is this unicorn running around
> somewhere :)
> 
> Anyways, many thanks for your help.

Is this any help:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53016

A bit of searching is starting to make me think your problem is caused
by using reinterpret_cast rather than a plain static cast.  I think
using that is making the compiler say "Oh this is a proper correct fully
aligned pointer of this type, so I can assume anything I want about it".

-- 
Len Sorensen


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 17:33         ` Andreas Glatz
  2017-09-27 17:57           ` Lennart Sorensen
@ 2017-09-27 18:49           ` Philippe Gerum
  1 sibling, 0 replies; 13+ messages in thread
From: Philippe Gerum @ 2017-09-27 18:49 UTC (permalink / raw)
  To: Andreas Glatz, Lennart Sorensen; +Cc: xenomai

On 09/27/2017 07:33 PM, Andreas Glatz wrote:
> Hi Lennart
> 
> On Wed, Sep 27, 2017 at 3:39 PM, Lennart Sorensen
> <lsorense@csclub.uwaterloo.ca> wrote:
>> On Wed, Sep 27, 2017 at 02:37:39PM +0100, Andreas Glatz wrote:
>>> It's true according to the ARM doc [1]. But the pDat alignment is
>>> prescribed by the sender and we're at the receiver. So I guess the
>>> question is how to tell the assembler not to use ldm in the first
>>> place?
>>
>> It certainly looks like the code somehow is optimizing two consecutive
>> 32 bit reads into a single 64 bit read, but then the input is not 64 bit
>> aligned (but is 32 bit aligned) which causes an alignment exception on
>> the 64 bit read while the 32 bit reads were fine.
>>
>> One would think the compiler should NOT be doing that.
> 
> Yes, one would think so...
> 
>>
>> Now as far as I remember, armv7 (which is what you appear to have)
>> allows unaligned accesses to normal memory, but not to IO ports or other
>> special memory.  I recall hitting a case where someone had flagged
>> some memory as DMA which caused the alignment exception to be hit.
>> Since the memory really had no reason to be flagged for DMA or anything
>> else special it was easy to fix.
> 
> I shall try to play with these options then. As explained to Philippe
> and hinted by you, one would ideally want a flag to mark a memory
> region as 'unaligned' so that the compiler/assembler know that they
> should only use instructions that support unaligned access, albeit
> being slower.
> 
>>
>> Is the memory at your pointer allocated with any special flags or is
>> it device memory?  Because if it isn't, then you shouldn't as far as I
>> recall be hitting that exception.
> 
> The memory is malloc'd before the Xenomai thread starts, but after
> locking all the pages into memory (the usual Xenomai startup stuff).
> 
>>
>> Part of the problem seems to be that gcc optimizes with the assumption
>> that it is always talking to normal memory on arm and hence no alignment
>> rules apply anymore, but in driver code talking to IO and in cases where
>> you are flagging memory for DMA and such, gcc's optimizations get you
>> in trouble if you don't hit it with a stick in the right places.
> 
> Apparently :) Other option would be to write my own access function
> for this piece of memory in assembly... then there'd be no room for
> wiggling around anymore.

There is also the option to go through a packed attribute window for reading the data, e.g.:

#include <linux/types.h>
#include <netinet/in.h>

struct maybe_unaligned {
	__u32 val;
} __attribute__((__packed__));

extern "C" __u32 read32_unaligned(const void *p)
{
	return ntohl(static_cast<const struct maybe_unaligned *>(p)->val);
}

The disassembly looks promising, since ldr can deal with unaligned accesses (which is confirmed by the assembler's annotation):

{rpm@cobalt} arm-linux-gnueabihf-g++ -O2 c.cc -S -o -
...
read32_unaligned:
	.fnstart
.LFB6:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	ldr	r0, [r0]	@ unaligned
	rev	r0, r0
	bx	lr
...
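
Applied to the two phasor words, this could look like the sketch below.
It swaps __u32 for std::uint32_t so that it builds without
<linux/types.h>; RawPhasor and read_raw_phasor are hypothetical names,
and __attribute__((__packed__)) is a GCC/Clang extension:

```cpp
#include <arpa/inet.h>  // ntohl (POSIX)
#include <cstdint>

// Packed wrapper: the member has alignment 1, so the compiler may not
// assume the address is a multiple of four.
struct maybe_unaligned {
    std::uint32_t val;
} __attribute__((__packed__));

inline std::uint32_t read32_unaligned(const void *p)
{
    return ntohl(static_cast<const maybe_unaligned *>(p)->val);
}

// Hypothetical container for the two raw words, already in host order.
struct RawPhasor {
    std::uint32_t mag_bits;
    std::uint32_t angle_bits;
};

inline RawPhasor read_raw_phasor(const unsigned char *pDat)
{
    RawPhasor r;
    r.mag_bits   = read32_unaligned(pDat);      // first word: magnitude
    r.angle_bits = read32_unaligned(pDat + 4);  // second word: angle
    return r;
}
```

Whether the two loads stay separate is worth checking in the
disassembly of the actual toolchain.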

-- 
Philippe.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
  2017-09-27 13:37     ` Andreas Glatz
  2017-09-27 14:39       ` Lennart Sorensen
@ 2017-09-28  6:18       ` dietmar.schindler
  1 sibling, 0 replies; 13+ messages in thread
From: dietmar.schindler @ 2017-09-28  6:18 UTC (permalink / raw)
  Cc: xenomai

> Von: Andreas Glatz
> Gesendet: Mittwoch, 27. September 2017 15:38
>
> ...
> >>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
> >>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
> >>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
> >>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
> ...
>
> It's true according to the ARM doc [1]. But the pDat alignment is
> prescribed by the sender and we're at the receiver. So I guess the
> question is how to tell the assembler not to use ldm in the first
> place?

Try defining `*pDat` with `volatile`.
See http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka16346.html (How do I avoid the compiler generating LDM/STM instructions?)
--
Regards,
Dietmar Schindler


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Xenomai] Unexpected mode switches when reordering C++ lines
@ 2017-09-28 12:36 Andreas Glatz
  0 siblings, 0 replies; 13+ messages in thread
From: Andreas Glatz @ 2017-09-28 12:36 UTC (permalink / raw)
  To: xenomai

>
>> Von: Andreas Glatz
>> Gesendet: Mittwoch, 27. September 2017 15:38
>>
>> ...
>> >>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat));
>> >>>    sPhasor.mag   = interpretU32AsFloat(u32Tmp);
>> >>>    u32Tmp        = ntohl(*reinterpret_cast<u32*>(pDat+4));
>> >>>    sPhasor.angle = interpretU32AsFloat(u32Tmp);
>> ...
>>
>> It's true according to the ARM doc [1]. But the pDat alignment is
>> prescribed by the sender and we're at the receiver. So I guess the
>> question is how to tell the assembler not to use ldm in the first
>> place?
>
> Try defining `*pDat` with `volatile`.
> See http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka16346.html (How do I avoid the compiler generating LDM/STM instructions?)

I think this does the trick and needs only minimal surgery in the
code. Currently I'm running the same optimised version of the code as
yesterday, where essentially the only changes are that I defined pDat
as 'volatile u8*', and all the reinterpret_casts were changed to
contain the volatile keyword as well (otherwise the compiler starts
complaining about missing qualifiers).
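
For the record, the change boils down to something like the sketch
below. The typedefs and the read_word helper are illustrative, not the
actual application code. Note that volatile only stops the compiler
from merging the two accesses; each one is still a 32-bit load, so this
relies on the hardware handling an unaligned ldr and is not portable:

```cpp
#include <arpa/inet.h>  // ntohl (POSIX)
#include <cstdint>

typedef std::uint8_t  u8;   // illustrative typedefs; the real ones are
typedef std::uint32_t u32;  // not shown in this thread

// Each volatile access must be performed as written, so the compiler
// cannot fuse two neighbouring reads into a single ldm.
inline u32 read_word(volatile u8 *pDat)
{
    return ntohl(*reinterpret_cast<volatile u32 *>(pDat));
}
```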

I'll leave it running for a while to see if anything strange happens...

Thanks a lot everyone involved!

A.


^ permalink raw reply	[flat|nested] 13+ messages in thread
