All of lore.kernel.org
 help / color / mirror / Atom feed
* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
@ 2010-12-18 22:27 Alexander Holler
  2010-12-19  7:51 ` Dirk Behme
  2010-12-22  8:02 ` Wolfgang Denk
  0 siblings, 2 replies; 48+ messages in thread
From: Alexander Holler @ 2010-12-18 22:27 UTC (permalink / raw)
  To: u-boot

gcc 4.5.1 seems to ignore (at least some) volatile definitions,
avoid that as done in the kernel.

Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that
gcc version to ignore the volatile type qualifier used e.g. in __arch_getl().
Anyway, using a definition as in the kernel headers avoids such optimizations when
gcc 4.5.1 is used.

Maybe the headers as used in the current linux-kernel should be used,
but to avoid large changes, I've just added a small change to the current headers.

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 arch/arm/include/asm/io.h |   20 ++++++++++++++------
 1 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index ff1518e..5364b78 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -125,13 +125,21 @@ extern inline void __raw_readsl(unsigned int addr, void *data, int longlen)
 #define __raw_readw(a)			__arch_getw(a)
 #define __raw_readl(a)			__arch_getl(a)
 
-#define writeb(v,a)			__arch_putb(v,a)
-#define writew(v,a)			__arch_putw(v,a)
-#define writel(v,a)			__arch_putl(v,a)
+/*
+ * TODO: The kernel offers some more advanced versions of barriers, it might
+ * have some advantages to use them instead of the simple one here.
+ */
+#define dmb()				__asm__ __volatile__ ("" : : : "memory")
+#define __iormb()			dmb()
+#define __iowmb()			dmb()
+
+#define writeb(v,c)			({ __iowmb(); __arch_putb(v,c); })
+#define writew(v,c)			({ __iowmb(); __arch_putw(v,c); })
+#define writel(v,c)			({ __iowmb(); __arch_putl(v,c); })
 
-#define readb(a)			__arch_getb(a)
-#define readw(a)			__arch_getw(a)
-#define readl(a)			__arch_getl(a)
+#define readb(c)			({ u8  __v = __arch_getb(c); __iormb(); __v; })
+#define readw(c)			({ u16 __v = __arch_getw(c); __iormb(); __v; })
+#define readl(c)			({ u32 __v = __arch_getl(c); __iormb(); __v; })
 
 /*
  * The compiler seems to be incapable of optimising constants
-- 
1.7.2.2

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-18 22:27 [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends Alexander Holler
@ 2010-12-19  7:51 ` Dirk Behme
  2010-12-19 10:22   ` Alexander Holler
  2010-12-22  8:02 ` Wolfgang Denk
  1 sibling, 1 reply; 48+ messages in thread
From: Dirk Behme @ 2010-12-19  7:51 UTC (permalink / raw)
  To: u-boot

On 18.12.2010 23:27, Alexander Holler wrote:
> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
> avoid that as done in the kernel.
>
> Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that
> gcc version to ignore the volatile type qualifier used e.g. in __arch_getl().
> Anyway, using a definition as in the kernel headers avoids such optimizations when
> gcc 4.5.1 is used.
>
> Maybe the headers as used in the current linux-kernel should be used,
> but to avoid large changes, I've just added a small change to the current headers.
>
> Signed-off-by: Alexander Holler<holler@ahsoftware.de>
> ---
>   arch/arm/include/asm/io.h |   20 ++++++++++++++------
>   1 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
> index ff1518e..5364b78 100644
> --- a/arch/arm/include/asm/io.h
> +++ b/arch/arm/include/asm/io.h
> @@ -125,13 +125,21 @@ extern inline void __raw_readsl(unsigned int addr, void *data, int longlen)
>   #define __raw_readw(a)			__arch_getw(a)
>   #define __raw_readl(a)			__arch_getl(a)
>
> -#define writeb(v,a)			__arch_putb(v,a)
> -#define writew(v,a)			__arch_putw(v,a)
> -#define writel(v,a)			__arch_putl(v,a)
> +/*
> + * TODO: The kernel offers some more advanced versions of barriers, it might
> + * have some advantages to use them instead of the simple one here.
> + */
> +#define dmb()				__asm__ __volatile__ ("" : : : "memory")
> +#define __iormb()			dmb()
> +#define __iowmb()			dmb()
> +
> +#define writeb(v,c)			({ __iowmb(); __arch_putb(v,c); })
> +#define writew(v,c)			({ __iowmb(); __arch_putw(v,c); })
> +#define writel(v,c)			({ __iowmb(); __arch_putl(v,c); })
>
> -#define readb(a)			__arch_getb(a)
> -#define readw(a)			__arch_getw(a)
> -#define readl(a)			__arch_getl(a)
> +#define readb(c)			({ u8  __v = __arch_getb(c); __iormb(); __v; })
> +#define readw(c)			({ u16 __v = __arch_getw(c); __iormb(); __v; })
> +#define readl(c)			({ u32 __v = __arch_getl(c); __iormb(); __v; })

Using the test code below [1] and then looking at the disassembly from 
the two tool chains gcc version 4.3.3 (Sourcery G++ Lite 2009q1-203) 
versus gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50): Yes, without 
the additional dmb() the gcc 4.5.1 just creates

00000000 <main>:
    0:	e3a00000 	mov	r0, #0
    4:	e12fff1e 	bx	lr

while with the additional dmb() it creates

00000000 <main>:
    0:	e59f300c 	ldr	r3, [pc, #12]	; 14 <main+0x14>
    4:	e5932028 	ldr	r2, [r3, #40]	; 0x28
    8:	e5930028 	ldr	r0, [r3, #40]	; 0x28
    c:	e0620000 	rsb	r0, r2, r0
   10:	e12fff1e 	bx	lr
   14:	48318000

what looks correct. And 4.3.3 does the same code for both readl() 
versions. So:

Acked-by: Dirk Behme <dirk.behme@googlemail.com>

Thanks

Dirk

[1]

arm-none-linux-gnueabi-gcc -Wall -O2 -c foo.c -o foo.o
arm-none-linux-gnueabi-objdump -D foo.o > foo.dis

-- foo.c --
struct gptimer {
	unsigned int tidr;	/* 0x00 r */
	unsigned char res[0xc];
	unsigned int tiocp_cfg;	/* 0x10 rw */
	unsigned int tistat;	/* 0x14 r */
	unsigned int tisr;	/* 0x18 rw */
	unsigned int tier;	/* 0x1c rw */
	unsigned int twer;	/* 0x20 rw */
	unsigned int tclr;	/* 0x24 rw */
	unsigned int tcrr;	/* 0x28 rw */
	unsigned int tldr;	/* 0x2c rw */
	unsigned int ttgr;	/* 0x30 rw */
	unsigned int twpc;	/* 0x34 r*/
	unsigned int tmar;	/* 0x38 rw*/
	unsigned int tcar1;	/* 0x3c r */
	unsigned int tcicr;	/* 0x40 rw */
	unsigned int tcar2;	/* 0x44 r */
};


#define dmb()		  __asm__ __volatile__ ("" : : : "memory")
#define __iormb()	  dmb()

#define __arch_getl(a)	  (*(volatile unsigned int *)(a))
#define readl(a)	  __arch_getl(a)
//#define readl(c)	  ({ unsigned int __v = __arch_getl(c); __iormb(); 
__v; })

int main(void) {

   struct gptimer *gpt1_base = (struct gptimer *)0x48318000;
   unsigned int cdiff, cstart, cend;

   cstart = readl(&gpt1_base->tcrr);

   cend = readl(&gpt1_base->tcrr);

   cdiff = cend - cstart;

   return cdiff;

}
-- foo.c --

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-19  7:51 ` Dirk Behme
@ 2010-12-19 10:22   ` Alexander Holler
  2010-12-19 11:28     ` Albert ARIBAUD
  2010-12-19 18:45     ` John Rigby
  0 siblings, 2 replies; 48+ messages in thread
From: Alexander Holler @ 2010-12-19 10:22 UTC (permalink / raw)
  To: u-boot

Hello,

On 19.12.2010 08:51, Dirk Behme wrote:
> On 18.12.2010 23:27, Alexander Holler wrote:
>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>> avoid that as done in the kernel.
>>

> Acked-by: Dirk Behme <dirk.behme@googlemail.com>

Thanks for the ack, but I have to say, that those barriers are having 
side effects here. Reading NAND now fails on my BeagleBoard. Regardless 
if I use gcc 4.3.5 or gcc 4.5.1, after applying that patch the wrong ID 
of the NAND is read. In nand_get_flash_type() 
(drivers/mtd/nand/nand_base.c) without that patch I will get the following:

*maf_id: 44, dev_id: 186

with the patch the following is read:

*maf_id: 128, dev_id: 85

Which just is wrong.

I haven't looked further up to now, maybe thats just a side effect of 
some wrong clock settings because of different timings through those 
barrieres or whatever.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-19 10:22   ` Alexander Holler
@ 2010-12-19 11:28     ` Albert ARIBAUD
  2010-12-19 16:34       ` Alexander Holler
  2010-12-19 18:45     ` John Rigby
  1 sibling, 1 reply; 48+ messages in thread
From: Albert ARIBAUD @ 2010-12-19 11:28 UTC (permalink / raw)
  To: u-boot

Le 19/12/2010 11:22, Alexander Holler a ?crit :
> Hello,
>
> On 19.12.2010 08:51, Dirk Behme wrote:
>> On 18.12.2010 23:27, Alexander Holler wrote:
>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>> avoid that as done in the kernel.
>>>
>
>> Acked-by: Dirk Behme<dirk.behme@googlemail.com>
>
> Thanks for the ack, but I have to say, that those barriers are having
> side effects here. Reading NAND now fails on my BeagleBoard. Regardless
> if I use gcc 4.3.5 or gcc 4.5.1, after applying that patch the wrong ID
> of the NAND is read. In nand_get_flash_type()
> (drivers/mtd/nand/nand_base.c) without that patch I will get the following:
>
> *maf_id: 44, dev_id: 186
>
> with the patch the following is read:
>
> *maf_id: 128, dev_id: 85
>
> Which just is wrong.
>
> I haven't looked further up to now, maybe thats just a side effect of
> some wrong clock settings because of different timings through those
> barrieres or whatever.

IIUC, the patch is adding barriers to every HW register write and read, 
which makes the compiler stop reordering these, and keep them ordered as 
in the source code. Are you sure that the order as laid out in the 
source is the right one? Maybe you were just luck so far that the 
reordering masked a bug.

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-19 11:28     ` Albert ARIBAUD
@ 2010-12-19 16:34       ` Alexander Holler
  0 siblings, 0 replies; 48+ messages in thread
From: Alexander Holler @ 2010-12-19 16:34 UTC (permalink / raw)
  To: u-boot

Hello,

Am 19.12.2010 12:28, schrieb Albert ARIBAUD:
> Le 19/12/2010 11:22, Alexander Holler a ?crit :
>> Hello,
>>
>> On 19.12.2010 08:51, Dirk Behme wrote:
>>> On 18.12.2010 23:27, Alexander Holler wrote:
>>>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>>>> avoid that as done in the kernel.
>>>>
>>
>>> Acked-by: Dirk Behme<dirk.behme@googlemail.com>
>>
>> Thanks for the ack, but I have to say, that those barriers are having
>> side effects here. Reading NAND now fails on my BeagleBoard. Regardless
>> if I use gcc 4.3.5 or gcc 4.5.1, after applying that patch the wrong ID
>> of the NAND is read. In nand_get_flash_type()
>> (drivers/mtd/nand/nand_base.c) without that patch I will get the following:
>>
>> *maf_id: 44, dev_id: 186
>>
>> with the patch the following is read:
>>
>> *maf_id: 128, dev_id: 85
>>
>> Which just is wrong.
>>
>> I haven't looked further up to now, maybe thats just a side effect of
>> some wrong clock settings because of different timings through those
>> barrieres or whatever.
>
> IIUC, the patch is adding barriers to every HW register write and read,
> which makes the compiler stop reordering these, and keep them ordered as
> in the source code. Are you sure that the order as laid out in the
> source is the right one? Maybe you were just luck so far that the
> reordering masked a bug.

I don't know much about all the stuff the omap3-drivers are doing. E.g. 
I've already wondered why it's necessary to measure a clock frequency in 
the code which got (wrongly) optimized without that patch. I would think 
such isn't necessary because I would assume the drivers are actually 
there to set the clock settings, and not to measure them. ;)

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-19 10:22   ` Alexander Holler
  2010-12-19 11:28     ` Albert ARIBAUD
@ 2010-12-19 18:45     ` John Rigby
  2010-12-19 19:59       ` Alexander Holler
  1 sibling, 1 reply; 48+ messages in thread
From: John Rigby @ 2010-12-19 18:45 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 19, 2010 at 3:22 AM, Alexander Holler <holler@ahsoftware.de> wrote:
> side effects here. Reading NAND now fails on my BeagleBoard. Regardless
> if I use gcc 4.3.5 or gcc 4.5.1, after applying that patch the wrong ID
> of the NAND is read. In nand_get_flash_type()
> (drivers/mtd/nand/nand_base.c) without that patch I will get the following:
>
> *maf_id: 44, dev_id: 186
>
> with the patch the following is read:
>
> *maf_id: 128, dev_id: 85
The nand_get_flash_type routine reads these id's twice and compares
them.  Do your see an error message like this?

second ID read did not match

>
> Which just is wrong.
>
> I haven't looked further up to now, maybe thats just a side effect of
> some wrong clock settings because of different timings through those
> barrieres or whatever.
>
> Regards,
>
> Alexander
> _______________________________________________
> U-Boot mailing list
> U-Boot at lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-19 18:45     ` John Rigby
@ 2010-12-19 19:59       ` Alexander Holler
  2010-12-20  0:39         ` John Rigby
  0 siblings, 1 reply; 48+ messages in thread
From: Alexander Holler @ 2010-12-19 19:59 UTC (permalink / raw)
  To: u-boot

Am 19.12.2010 19:45, schrieb John Rigby:
> On Sun, Dec 19, 2010 at 3:22 AM, Alexander Holler<holler@ahsoftware.de>  wrote:
>> side effects here. Reading NAND now fails on my BeagleBoard. Regardless
>> if I use gcc 4.3.5 or gcc 4.5.1, after applying that patch the wrong ID
>> of the NAND is read. In nand_get_flash_type()
>> (drivers/mtd/nand/nand_base.c) without that patch I will get the following:
>>
>> *maf_id: 44, dev_id: 186
>>
>> with the patch the following is read:
>>
>> *maf_id: 128, dev_id: 85
> The nand_get_flash_type routine reads these id's twice and compares
> them.  Do your see an error message like this?
>
> second ID read did not match
>

No, the output is

without memory barrier in read*/write*:
--------------------------------
U-Boot 2010.12-rc3-00013-g71aab09 (Dec 19 2010 - 01:19:38)

OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
OMAP3 Beagle board + LPDDR/NAND
I2C:   ready
DRAM:  256 MiB
NAND:  256 MiB
MMC:   OMAP SD/MMC: 0
In:    serial
Out:   serial
Err:   serial
Beagle Rev C4
timed out in wait_for_pin: I2C_STAT=0
No EEPROM on expansion board
Die ID #062a000400000000040365fa16019019
Hit any key to stop autoboot:  0
OMAP3 beagleboard.org # nand info

Device 0: nand0, sector size 128 KiB
--------------------------------

with memory barrier in read*/write*:
--------------------------------
U-Boot 2010.12-rc3-00014-gde0126f (Dec 19 2010 - 01:25:11)

OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
OMAP3 Beagle board + LPDDR/NAND
I2C:   ready
DRAM:  256 MiB
NAND:  32 MiB
MMC:   OMAP SD/MMC: 0
NAND read from offset 260000 failed -74
*** Warning - readenv() failed, using default environment

In:    serial
Out:   serial
Err:   serial
Beagle Rev C4
timed out in wait_for_pin: I2C_STAT=0
No EEPROM on expansion board
Die ID #062a000400000000040365fa16019019
Hit any key to stop autoboot:  0
OMAP3 beagleboard.org # nand info

Device 0: nand0, sector size 16 KiB
--------------------------------

And with the memory barrier the kernel gets started but hangs afterwards 
(at least it looks so). I still haven't searched further and can only 
offer speculations like clocks are screwed up, memory setup is broken or 
such. Finding the reason and not just the impact will need some more 
time on my side.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-19 19:59       ` Alexander Holler
@ 2010-12-20  0:39         ` John Rigby
  2010-12-20  0:56           ` Alexander Holler
  0 siblings, 1 reply; 48+ messages in thread
From: John Rigby @ 2010-12-20  0:39 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 19, 2010 at 12:59 PM, Alexander Holler <holler@ahsoftware.de> wrote:
...
> No EEPROM on expansion board
> Die ID #062a000400000000040365fa16019019
> Hit any key to stop autoboot: ?0
> OMAP3 beagleboard.org # nand info
>
> Device 0: nand0, sector size 16 KiB
> --------------------------------
>
I get the same output  without your change.  My gcc is linaro 4.4.5.
I'll do some bisecting and try to find out what is going on.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20  0:39         ` John Rigby
@ 2010-12-20  0:56           ` Alexander Holler
  2010-12-20  4:18             ` John Rigby
  0 siblings, 1 reply; 48+ messages in thread
From: Alexander Holler @ 2010-12-20  0:56 UTC (permalink / raw)
  To: u-boot

Am 20.12.2010 01:39, schrieb John Rigby:
> On Sun, Dec 19, 2010 at 12:59 PM, Alexander Holler<holler@ahsoftware.de>  wrote:
> ...
>> No EEPROM on expansion board
>> Die ID #062a000400000000040365fa16019019
>> Hit any key to stop autoboot:  0
>> OMAP3 beagleboard.org # nand info
>>
>> Device 0: nand0, sector size 16 KiB
>> --------------------------------
>>
> I get the same output  without your change.  My gcc is linaro 4.4.5.
> I'll do some bisecting and try to find out what is going on.

Bisecting won't help you here. Not if the problem was always there 
(which is what I assume).

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20  0:56           ` Alexander Holler
@ 2010-12-20  4:18             ` John Rigby
  2010-12-20  6:07               ` John Rigby
  0 siblings, 1 reply; 48+ messages in thread
From: John Rigby @ 2010-12-20  4:18 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 19, 2010 at 5:56 PM, Alexander Holler <holler@ahsoftware.de> wrote:
> Am 20.12.2010 01:39, schrieb John Rigby:
>>
>> On Sun, Dec 19, 2010 at 12:59 PM, Alexander Holler<holler@ahsoftware.de>
>> ?wrote:
>> ...
>>>
>>> No EEPROM on expansion board
>>> Die ID #062a000400000000040365fa16019019
>>> Hit any key to stop autoboot: ?0
>>> OMAP3 beagleboard.org # nand info
>>>
>>> Device 0: nand0, sector size 16 KiB
>>> --------------------------------
>>>
>> I get the same output ?without your change. ?My gcc is linaro 4.4.5.
>> I'll do some bisecting and try to find out what is going on.
>
> Bisecting won't help you here. Not if the problem was always there (which is
> what I assume
Sorry, I was confused about my results.

If I replace include <asm/io.h> in drivers/mtd/nand/omap_gpmc.c with a
copy of the original called orig_io.h:
#include "orig_io.h"

Nand starts working again.  So the problem seems to be isolated to this file.
>
> Regards,
>
> Alexander
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20  4:18             ` John Rigby
@ 2010-12-20  6:07               ` John Rigby
  2010-12-20  6:49                 ` Dirk Behme
  2010-12-20 17:12                 ` Alexander Holler
  0 siblings, 2 replies; 48+ messages in thread
From: John Rigby @ 2010-12-20  6:07 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 19, 2010 at 9:18 PM, John Rigby <john.rigby@linaro.org> wrote:
> On Sun, Dec 19, 2010 at 5:56 PM, Alexander Holler <holler@ahsoftware.de> wrote:
>> Am 20.12.2010 01:39, schrieb John Rigby:
>>>
>>> On Sun, Dec 19, 2010 at 12:59 PM, Alexander Holler<holler@ahsoftware.de>
>>> ?wrote:
>>> ...
>>>>
>>>> No EEPROM on expansion board
>>>> Die ID #062a000400000000040365fa16019019
>>>> Hit any key to stop autoboot: ?0
>>>> OMAP3 beagleboard.org # nand info
>>>>
>>>> Device 0: nand0, sector size 16 KiB
>>>> --------------------------------
>>>>
>>> I get the same output ?without your change. ?My gcc is linaro 4.4.5.
>>> I'll do some bisecting and try to find out what is going on.
>>
>> Bisecting won't help you here. Not if the problem was always there (which is
>> what I assume
> Sorry, I was confused about my results.
>
> If I replace include <asm/io.h> in drivers/mtd/nand/omap_gpmc.c with a
> copy of the original called orig_io.h:
> #include "orig_io.h"
>
> Nand starts working again. ?So the problem seems to be isolated to this file.
>>
>> Regards,
>>
>> Alexander
>>
>

With your patch and the following hack nand works:


diff --git a/drivers/mtd/nand/omap_gpmc.c b/drivers/mtd/nand/omap_gpmc.c
index 99b9cef..5e94155 100644
--- a/drivers/mtd/nand/omap_gpmc.c
+++ b/drivers/mtd/nand/omap_gpmc.c
@@ -29,6 +29,8 @@
 #include <linux/mtd/nand_ecc.h>
 #include <nand.h>

+#define origwriteb(v,a)                        __arch_putb(v,a)
+
 static uint8_t cs;
 static struct nand_ecclayout hw_nand_oob = GPMC_NAND_HW_ECC_LAYOUT;

@@ -58,7 +60,7 @@ static void omap_nand_hwcontrol(struct mtd_info
*mtd, int32_t cmd,
        }

        if (cmd != NAND_CMD_NONE)
-               writeb(cmd, this->IO_ADDR_W);
+               origwriteb(cmd, this->IO_ADDR_W);
 }

 /*

The working assembly looks like this:

	if (cmd != NAND_CMD_NONE)
80024d28:	e3710001 	cmn	r1, #1
		origwriteb(cmd, this->IO_ADDR_W);
80024d2c:	15933004 	ldrne	r3, [r3, #4]
80024d30:	120110ff 	andne	r1, r1, #255	; 0xff
80024d34:	15c31000 	strbne	r1, [r3]
80024d38:	e8bd8010 	pop	{r4, pc}

The broken assembly looks like this:

	if (cmd != NAND_CMD_NONE)
80024d28:	e3710001 	cmn	r1, #1
80024d2c:	08bd8010 	popeq	{r4, pc}
		writeb(cmd, this->IO_ADDR_W);
80024d30:	e5933004 	ldr	r3, [r3, #4]
80024d34:	e20110ff 	and	r1, r1, #255	; 0xff
80024d38:	e5c31000 	strb	r1, [r3]
80024d3c:	e5d33000 	ldrb	r3, [r3]
80024d40:	e8bd8010 	pop	{r4, pc}

This is with gcc version "(Ubuntu/Linaro 4.4.4-14ubuntu4) 4.4.5".
I'll try a 4.5.2 version next and see what happens.

br,

John

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20  6:07               ` John Rigby
@ 2010-12-20  6:49                 ` Dirk Behme
  2010-12-20  7:37                   ` John Rigby
  2010-12-20 17:12                 ` Alexander Holler
  1 sibling, 1 reply; 48+ messages in thread
From: Dirk Behme @ 2010-12-20  6:49 UTC (permalink / raw)
  To: u-boot

On 20.12.2010 07:07, John Rigby wrote:
> On Sun, Dec 19, 2010 at 9:18 PM, John Rigby<john.rigby@linaro.org>  wrote:
>> On Sun, Dec 19, 2010 at 5:56 PM, Alexander Holler<holler@ahsoftware.de>  wrote:
>>> Am 20.12.2010 01:39, schrieb John Rigby:
>>>>
>>>> On Sun, Dec 19, 2010 at 12:59 PM, Alexander Holler<holler@ahsoftware.de>
>>>>   wrote:
>>>> ...
>>>>>
>>>>> No EEPROM on expansion board
>>>>> Die ID #062a000400000000040365fa16019019
>>>>> Hit any key to stop autoboot:  0
>>>>> OMAP3 beagleboard.org # nand info
>>>>>
>>>>> Device 0: nand0, sector size 16 KiB
>>>>> --------------------------------
>>>>>
>>>> I get the same output  without your change.  My gcc is linaro 4.4.5.
>>>> I'll do some bisecting and try to find out what is going on.
>>>
>>> Bisecting won't help you here. Not if the problem was always there (which is
>>> what I assume
>> Sorry, I was confused about my results.
>>
>> If I replace include<asm/io.h>  in drivers/mtd/nand/omap_gpmc.c with a
>> copy of the original called orig_io.h:
>> #include "orig_io.h"
>>
>> Nand starts working again.  So the problem seems to be isolated to this file.
>>>
>>> Regards,
>>>
>>> Alexander
>>>
>>
>
> With your patch and the following hack nand works:
>
>
> diff --git a/drivers/mtd/nand/omap_gpmc.c b/drivers/mtd/nand/omap_gpmc.c
> index 99b9cef..5e94155 100644
> --- a/drivers/mtd/nand/omap_gpmc.c
> +++ b/drivers/mtd/nand/omap_gpmc.c
> @@ -29,6 +29,8 @@
>   #include<linux/mtd/nand_ecc.h>
>   #include<nand.h>
>
> +#define origwriteb(v,a)                        __arch_putb(v,a)
> +
>   static uint8_t cs;
>   static struct nand_ecclayout hw_nand_oob = GPMC_NAND_HW_ECC_LAYOUT;
>
> @@ -58,7 +60,7 @@ static void omap_nand_hwcontrol(struct mtd_info
> *mtd, int32_t cmd,
>          }
>
>          if (cmd != NAND_CMD_NONE)
> -               writeb(cmd, this->IO_ADDR_W);
> +               origwriteb(cmd, this->IO_ADDR_W);
>   }
>
>   /*
>
> The working assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 		origwriteb(cmd, this->IO_ADDR_W);
> 80024d2c:	15933004 	ldrne	r3, [r3, #4]
> 80024d30:	120110ff 	andne	r1, r1, #255	; 0xff
> 80024d34:	15c31000 	strbne	r1, [r3]
> 80024d38:	e8bd8010 	pop	{r4, pc}
>
> The broken assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 80024d2c:	08bd8010 	popeq	{r4, pc}
> 		writeb(cmd, this->IO_ADDR_W);
> 80024d30:	e5933004 	ldr	r3, [r3, #4]
> 80024d34:	e20110ff 	and	r1, r1, #255	; 0xff
> 80024d38:	e5c31000 	strb	r1, [r3]
> 80024d3c:	e5d33000 	ldrb	r3, [r3]
> 80024d40:	e8bd8010 	pop	{r4, pc}

Hmm. From functionality point of view, the 'broken' assembly below 
should to the same as the working assembly, above. The main difference 
is the 'popeq	{r4, pc}' and the additional 'ldrb	r3, [r3]'. The write 
to the HW 'strb	r1, [r3]' is there, so it should work. Is this 
understanding correct?

If it's correct, the question is, what breaks the below assembly? The 
popeq or the additional ldrb? The popeq looks correct, but why is the 
additional ldrb there?

For some further debugging, I had two ideas: Modifying the resulting 
binary with a hex editor and replacing the ldrb with a nop (if I 
remember correctly the hex code for a nop is ffffffff?). Or modifying 
the the C code and adding a barrier behind the writeb(). E.g.

if (cmd != NAND_CMD_NONE);
         writeb(cmd, this->IO_ADDR_W);
         __asm__ __volatile__ ("dsb" : : : "memory");
}

Best regards

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20  6:49                 ` Dirk Behme
@ 2010-12-20  7:37                   ` John Rigby
  2010-12-20 16:08                     ` John Rigby
  0 siblings, 1 reply; 48+ messages in thread
From: John Rigby @ 2010-12-20  7:37 UTC (permalink / raw)
  To: u-boot

On Sun, Dec 19, 2010 at 11:49 PM, Dirk Behme <dirk.behme@googlemail.com> wrote:
> On 20.12.2010 07:07, John Rigby wrote:
>> The working assembly looks like this:
>>
>> ? ? ? ?if (cmd != NAND_CMD_NONE)
>> 80024d28: ? ? ? e3710001 ? ? ? ?cmn ? ? r1, #1
>> ? ? ? ? ? ? ? ?origwriteb(cmd, this->IO_ADDR_W);
>> 80024d2c: ? ? ? 15933004 ? ? ? ?ldrne ? r3, [r3, #4]
>> 80024d30: ? ? ? 120110ff ? ? ? ?andne ? r1, r1, #255 ? ?; 0xff
>> 80024d34: ? ? ? 15c31000 ? ? ? ?strbne ?r1, [r3]
>> 80024d38: ? ? ? e8bd8010 ? ? ? ?pop ? ? {r4, pc}
>>
>> The broken assembly looks like this:
>>
>> ? ? ? ?if (cmd != NAND_CMD_NONE)
>> 80024d28: ? ? ? e3710001 ? ? ? ?cmn ? ? r1, #1
>> 80024d2c: ? ? ? 08bd8010 ? ? ? ?popeq ? {r4, pc}
>> ? ? ? ? ? ? ? ?writeb(cmd, this->IO_ADDR_W);
>> 80024d30: ? ? ? e5933004 ? ? ? ?ldr ? ? r3, [r3, #4]
>> 80024d34: ? ? ? e20110ff ? ? ? ?and ? ? r1, r1, #255 ? ?; 0xff
>> 80024d38: ? ? ? e5c31000 ? ? ? ?strb ? ?r1, [r3]
>> 80024d3c: ? ? ? e5d33000 ? ? ? ?ldrb ? ?r3, [r3]
>> 80024d40: ? ? ? e8bd8010 ? ? ? ?pop ? ? {r4, pc}
>
> Hmm. From functionality point of view, the 'broken' assembly below should to
> the same as the working assembly, above. The main difference is the 'popeq
> {r4, pc}' and the additional 'ldrb ? ? ?r3, [r3]'. The write to the HW 'strb
> ? ?r1, [r3]' is there, so it should work. Is this understanding correct?
>
> If it's correct, the question is, what breaks the below assembly? The popeq
> or the additional ldrb? The popeq looks correct, but why is the additional
> ldrb there?
>

I can't answer why the ldrb is there but I'm pretty sure it is the
problem.  From the TRM
http://focus.ti.com/lit/ug/spruf98m/spruf98m.pdf:

Only write accesses must be issued to these locations, but the GPMC
does not discard any read access. Accessing a NAND device with nOE and
CLE or ALE asserted (read access) can produce undefined results.

br,
John

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20  7:37                   ` John Rigby
@ 2010-12-20 16:08                     ` John Rigby
  2010-12-20 16:12                       ` John Rigby
  2011-01-17  4:35                       ` Alexander Holler
  0 siblings, 2 replies; 48+ messages in thread
From: John Rigby @ 2010-12-20 16:08 UTC (permalink / raw)
  To: u-boot

Earlier in this thread Alexander said:
> I haven't add the definitions which are using a memory barrier because I haven't found
> a place in the kernel where they were actually enabled
> (CONFIG_ARM_DMA_MEM_BUFFERABLE).

I think this is the problem because it is indeed defined for all v6
and v7 arm platforms.  Here is the config snippet from
arch/arm/mm/Kconfig:

config ARM_DMA_MEM_BUFFERABLE
	bool "Use non-cacheable memory for DMA" if CPU_V6 && !CPU_V7
	depends on !(MACH_REALVIEW_PB1176 || REALVIEW_EB_ARM11MP || \
		     MACH_REALVIEW_PB11MP)
	default y if CPU_V6 || CPU_V7
	help
	  Historically, the kernel has used strongly ordered mappings to
	  provide DMA coherent memory.  With the advent of ARMv7, mapping
	  memory with differing types results in unpredictable behaviour,
	  so on these CPUs, this option is forced on.

	  Multiple mappings with differing attributes is also unpredictable
	  on ARMv6 CPUs, but since they do not have aggressive speculative
	  prefetch, no harm appears to occur.

	  However, drivers may be missing the necessary barriers for ARMv6,
	  and therefore turning this on may result in unpredictable driver
	  behaviour.  Therefore, we offer this as an option.

	  You are recommended say 'Y' here and debug any affected drivers.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20 16:08                     ` John Rigby
@ 2010-12-20 16:12                       ` John Rigby
  2011-01-17  4:35                       ` Alexander Holler
  1 sibling, 0 replies; 48+ messages in thread
From: John Rigby @ 2010-12-20 16:12 UTC (permalink / raw)
  To: u-boot

On Mon, Dec 20, 2010 at 9:08 AM, John Rigby <john.rigby@linaro.org> wrote:
> Earlier in this thread Alexander said:
>> I haven't add the definitions which are using a memory barrier because I haven't found
>> a place in the kernel where they were actually enabled
>> (CONFIG_ARM_DMA_MEM_BUFFERABLE).
>
> I think this is the problem because it is indeed defined for all v6
> and v7 arm platforms. ?Here is the config snippet from
> arch/arm/mm/Kconfig:
>
> config ARM_DMA_MEM_BUFFERABLE
> ? ? ? ?bool "Use non-cacheable memory for DMA" if CPU_V6 && !CPU_V7
> ? ? ? ?depends on !(MACH_REALVIEW_PB1176 || REALVIEW_EB_ARM11MP || \
> ? ? ? ? ? ? ? ? ? ? MACH_REALVIEW_PB11MP)
> ? ? ? ?default y if CPU_V6 || CPU_V7
> ? ? ? ?help
> ? ? ? ? ?Historically, the kernel has used strongly ordered mappings to
> ? ? ? ? ?provide DMA coherent memory. ?With the advent of ARMv7, mapping
> ? ? ? ? ?memory with differing types results in unpredictable behaviour,
> ? ? ? ? ?so on these CPUs, this option is forced on.

On second thought maybe this is noise for us in u-boot without
cacheable mappings?  Sorry for the noise.

br,

John

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20  6:07               ` John Rigby
  2010-12-20  6:49                 ` Dirk Behme
@ 2010-12-20 17:12                 ` Alexander Holler
  2010-12-21  0:25                   ` John Rigby
  1 sibling, 1 reply; 48+ messages in thread
From: Alexander Holler @ 2010-12-20 17:12 UTC (permalink / raw)
  To: u-boot

Hello,

Am 20.12.2010 07:07, schrieb John Rigby:
> With your patch and the following hack nand works:
>
>
> diff --git a/drivers/mtd/nand/omap_gpmc.c b/drivers/mtd/nand/omap_gpmc.c
> index 99b9cef..5e94155 100644
> --- a/drivers/mtd/nand/omap_gpmc.c
> +++ b/drivers/mtd/nand/omap_gpmc.c
> @@ -29,6 +29,8 @@
>   #include<linux/mtd/nand_ecc.h>
>   #include<nand.h>
>
> +#define origwriteb(v,a)                        __arch_putb(v,a)
> +
>   static uint8_t cs;
>   static struct nand_ecclayout hw_nand_oob = GPMC_NAND_HW_ECC_LAYOUT;
>
> @@ -58,7 +60,7 @@ static void omap_nand_hwcontrol(struct mtd_info
> *mtd, int32_t cmd,
>          }
>
>          if (cmd != NAND_CMD_NONE)
> -               writeb(cmd, this->IO_ADDR_W);
> +               origwriteb(cmd, this->IO_ADDR_W);
>   }
>
>   /*
>
> The working assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 		origwriteb(cmd, this->IO_ADDR_W);
> 80024d2c:	15933004 	ldrne	r3, [r3, #4]
> 80024d30:	120110ff 	andne	r1, r1, #255	; 0xff
> 80024d34:	15c31000 	strbne	r1, [r3]
> 80024d38:	e8bd8010 	pop	{r4, pc}
>
> The broken assembly looks like this:
>
> 	if (cmd != NAND_CMD_NONE)
> 80024d28:	e3710001 	cmn	r1, #1
> 80024d2c:	08bd8010 	popeq	{r4, pc}
> 		writeb(cmd, this->IO_ADDR_W);
> 80024d30:	e5933004 	ldr	r3, [r3, #4]
> 80024d34:	e20110ff 	and	r1, r1, #255	; 0xff
> 80024d38:	e5c31000 	strb	r1, [r3]
> 80024d3c:	e5d33000 	ldrb	r3, [r3]
> 80024d40:	e8bd8010 	pop	{r4, pc}
>
> This is with gcc version "(Ubuntu/Linaro 4.4.4-14ubuntu4) 4.4.5".
> I'll try a 4.5.2 version next and see what happens.

There must be more problems. Using gcc 4.5.1, the read*/write*-patch and 
your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source 
to compile u-boot the kernel comes up. Here is the output for the 
non-working u-boot:

----------------
U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)

OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
OMAP3 Beagle board + LPDDR/NAND
I2C:   ready
DRAM:  256 MiB
NAND:  256 MiB
MMC:   OMAP SD/MMC: 0
In:    serial
Out:   serial
Err:   serial
Beagle Rev C4
timed out in wait_for_pin: I2C_STAT=0
No EEPROM on expansion board
Die ID #062a000400000000040365fa16019019
Hit any key to stop autoboot:  0
reading boot.scr

422 bytes read
Running bootscript from mmc ...
## Executing script at 82000000
reading uImage

2419940 bytes read
Booting from mmc ...
## Booting kernel from Legacy Image at 82000000 ...
    Image Name:   Linux-2.6.37-rc5-beagleboard-000
    Image Type:   ARM Linux Kernel Image (uncompressed)
    Data Size:    2419876 Bytes = 2.3 MiB
    Load Address: 80008000
    Entry Point:  80008000
    Verifying Checksum ... OK
    Loading Kernel Image ... OK
OK
----------------

Nothing else.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20 17:12                 ` Alexander Holler
@ 2010-12-21  0:25                   ` John Rigby
  2010-12-21  0:46                     ` John Rigby
  2010-12-21 13:38                     ` Alexander Holler
  0 siblings, 2 replies; 48+ messages in thread
From: John Rigby @ 2010-12-21  0:25 UTC (permalink / raw)
  To: u-boot

On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler <holler@ahsoftware.de> wrote:

> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
> your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source to
> compile u-boot the kernel comes up. Here is the output for the non-working
> u-boot:
>
> ----------------
> U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)
>
> OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
> OMAP3 Beagle board + LPDDR/NAND
> I2C: ? ready
> DRAM: ?256 MiB
> NAND: ?256 MiB
> MMC: ? OMAP SD/MMC: 0
> In: ? ?serial
> Out: ? serial
> Err: ? serial
> Beagle Rev C4
> timed out in wait_for_pin: I2C_STAT=0
> No EEPROM on expansion board
> Die ID #062a000400000000040365fa16019019
> Hit any key to stop autoboot: ?0
> reading boot.scr
>
> 422 bytes read
> Running bootscript from mmc ...
> ## Executing script at 82000000
> reading uImage
>
> 2419940 bytes read
> Booting from mmc ...
> ## Booting kernel from Legacy Image at 82000000 ...
> ? Image Name: ? Linux-2.6.37-rc5-beagleboard-000
> ? Image Type: ? ARM Linux Kernel Image (uncompressed)
> ? Data Size: ? ?2419876 Bytes = 2.3 MiB
> ? Load Address: 80008000
> ? Entry Point: ?80008000
> ? Verifying Checksum ... OK
> ? Loading Kernel Image ... OK
> OK
> ----------------
>
> Nothing else.
>
> Regards,
>
> Alexander
>

Yes, you are correct, I see the same here with 4.5.2.  I noticed that
bd did not have correct values of machine type and boot params:

bd address  = 0x8FF24FE0
arch_number = 0xFF0084FF
boot_params = 0xBB2000FE
DRAM bank   = 0x00000000
-> start    = 0x80000000
-> size     = 0x08000000
DRAM bank   = 0x00000001
-> start    = 0x88000000
-> size     = 0x08000000
baudrate    = 115200 bps
TLB addr    = 0x8FFF0000
relocaddr   = 0x8FF85000
reloc off   = 0x0FF7D000
irq_sp      = 0x8FF24F68
sp start    = 0x8FF24F60
FB base     = 0x00000000

If we then look at board_init in beagle.c the problem is obvious:

800331ac <board_init>:
800331ac:       e92d4008        push    {r3, lr}
800331b0:       ebff5a74        bl      80009b88 <gpmc_init>
800331b4:       e3a00000        mov     r0, #0
800331b8:       e5983000        ldr     r3, [r8]
800331bc:       e5983000        ldr     r3, [r8]
800331c0:       e8bd8008        pop     {r3, pc}

Here is with source mingled in:
800331ac <board_init>:
/*
 * Routine: board_init
 * Description: Early hardware init.
 */
int board_init(void)
{
800331ac:       e92d4008        push    {r3, lr}
        DECLARE_GLOBAL_DATA_PTR;

        gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
800331b0:       ebff5a74        bl      80009b88 <gpmc_init>
        gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;
        /* boot param addr */
        gd->bd->bi_boot_params = (OMAP34XX_SDRC_CS0 + 0x100);

        return 0;
}
800331b4:       e3a00000        mov     r0, #0
{
        DECLARE_GLOBAL_DATA_PTR;

        gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
        /* board id for Linux */
        gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;
800331b8:       e5983000        ldr     r3, [r8]
        /* boot param addr */
        gd->bd->bi_boot_params = (OMAP34XX_SDRC_CS0 + 0x100);
800331bc:       e5983000        ldr     r3, [r8]

        return 0;
}
800331c0:       e8bd8008        pop     {r3, pc}

br,

John

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  0:25                   ` John Rigby
@ 2010-12-21  0:46                     ` John Rigby
  2010-12-21  7:11                       ` Dirk Behme
  2010-12-21 13:38                     ` Alexander Holler
  1 sibling, 1 reply; 48+ messages in thread
From: John Rigby @ 2010-12-21  0:46 UTC (permalink / raw)
  To: u-boot

On Mon, Dec 20, 2010 at 5:25 PM, John Rigby <john.rigby@linaro.org> wrote:
> On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler <holler@ahsoftware.de> wrote:
>
>> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
>> your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source to
>> compile u-boot the kernel comes up. Here is the output for the non-working
>> u-boot:
>>
>> ----------------
>> U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)
>>
>> OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
>> OMAP3 Beagle board + LPDDR/NAND
>> I2C: ? ready
>> DRAM: ?256 MiB
>> NAND: ?256 MiB
>> MMC: ? OMAP SD/MMC: 0
>> In: ? ?serial
>> Out: ? serial
>> Err: ? serial
>> Beagle Rev C4
>> timed out in wait_for_pin: I2C_STAT=0
>> No EEPROM on expansion board
>> Die ID #062a000400000000040365fa16019019
>> Hit any key to stop autoboot: ?0
>> reading boot.scr
>>
>> 422 bytes read
>> Running bootscript from mmc ...
>> ## Executing script at 82000000
>> reading uImage
>>
>> 2419940 bytes read
>> Booting from mmc ...
>> ## Booting kernel from Legacy Image at 82000000 ...
>> ? Image Name: ? Linux-2.6.37-rc5-beagleboard-000
>> ? Image Type: ? ARM Linux Kernel Image (uncompressed)
>> ? Data Size: ? ?2419876 Bytes = 2.3 MiB
>> ? Load Address: 80008000
>> ? Entry Point: ?80008000
>> ? Verifying Checksum ... OK
>> ? Loading Kernel Image ... OK
>> OK
>> ----------------
>>
>> Nothing else.
>>
>> Regards,
>>
>> Alexander
>>
>
> Yes, you are correct, I see the same here with 4.5.2. ?I noticed that
> bd did not have correct values of machine type and boot params:
>
> bd address ?= 0x8FF24FE0
> arch_number = 0xFF0084FF
> boot_params = 0xBB2000FE
> DRAM bank ? = 0x00000000
> -> start ? ?= 0x80000000
> -> size ? ? = 0x08000000
> DRAM bank ? = 0x00000001
> -> start ? ?= 0x88000000
> -> size ? ? = 0x08000000
> baudrate ? ?= 115200 bps
> TLB addr ? ?= 0x8FFF0000
> relocaddr ? = 0x8FF85000
> reloc off ? = 0x0FF7D000
> irq_sp ? ? ?= 0x8FF24F68
> sp start ? ?= 0x8FF24F60
> FB base ? ? = 0x00000000
>
> If we then look at board_init in beagle.c the problem is obvious:
>
> 800331ac <board_init>:
> 800331ac: ? ? ? e92d4008 ? ? ? ?push ? ?{r3, lr}
> 800331b0: ? ? ? ebff5a74 ? ? ? ?bl ? ? ?80009b88 <gpmc_init>
> 800331b4: ? ? ? e3a00000 ? ? ? ?mov ? ? r0, #0
> 800331b8: ? ? ? e5983000 ? ? ? ?ldr ? ? r3, [r8]
> 800331bc: ? ? ? e5983000 ? ? ? ?ldr ? ? r3, [r8]
> 800331c0: ? ? ? e8bd8008 ? ? ? ?pop ? ? {r3, pc}
>

Apparently this is a known issue mentioned in README:

NOTE: DECLARE_GLOBAL_DATA_PTR must be used with file-global scope,
or current versions of GCC may "optimize" the code too much.


With this fix I can boot again:

diff --git a/board/ti/beagle/beagle.c b/board/ti/beagle/beagle.c
index d9b6f01..c066d6e 100644
--- a/board/ti/beagle/beagle.c
+++ b/board/ti/beagle/beagle.c
@@ -51,6 +51,8 @@

 #define BEAGLE_NO_EEPROM               0xffffffff

+DECLARE_GLOBAL_DATA_PTR;
+
 static struct {
        unsigned int device_vendor;
        unsigned char revision;
@@ -66,8 +68,6 @@ static struct {
  */
 int board_init(void)
 {
-       DECLARE_GLOBAL_DATA_PTR;
-
        gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
        /* board id for Linux */
        gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;

Please let me know if you find any other problems.

br,

John

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  0:46                     ` John Rigby
@ 2010-12-21  7:11                       ` Dirk Behme
  2010-12-21  7:21                         ` Albert ARIBAUD
  0 siblings, 1 reply; 48+ messages in thread
From: Dirk Behme @ 2010-12-21  7:11 UTC (permalink / raw)
  To: u-boot

On 21.12.2010 01:46, John Rigby wrote:
> On Mon, Dec 20, 2010 at 5:25 PM, John Rigby<john.rigby@linaro.org>  wrote:
>> On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler<holler@ahsoftware.de>  wrote:
>>
>>> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
>>> your hack, my kernel doesn't boot. Using gcc 4.3.5 and the same source to
>>> compile u-boot the kernel comes up. Here is the output for the non-working
>>> u-boot:
>>>
>>> ----------------
>>> U-Boot 2010.12-rc3-00015-g3ae9687-dirty (Dec 20 2010 - 18:01:41, gcc 4.5.1)
>>>
>>> OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 720 mHz
>>> OMAP3 Beagle board + LPDDR/NAND
>>> I2C:   ready
>>> DRAM:  256 MiB
>>> NAND:  256 MiB
>>> MMC:   OMAP SD/MMC: 0
>>> In:    serial
>>> Out:   serial
>>> Err:   serial
>>> Beagle Rev C4
>>> timed out in wait_for_pin: I2C_STAT=0
>>> No EEPROM on expansion board
>>> Die ID #062a000400000000040365fa16019019
>>> Hit any key to stop autoboot:  0
>>> reading boot.scr
>>>
>>> 422 bytes read
>>> Running bootscript from mmc ...
>>> ## Executing script at 82000000
>>> reading uImage
>>>
>>> 2419940 bytes read
>>> Booting from mmc ...
>>> ## Booting kernel from Legacy Image at 82000000 ...
>>>    Image Name:   Linux-2.6.37-rc5-beagleboard-000
>>>    Image Type:   ARM Linux Kernel Image (uncompressed)
>>>    Data Size:    2419876 Bytes = 2.3 MiB
>>>    Load Address: 80008000
>>>    Entry Point:  80008000
>>>    Verifying Checksum ... OK
>>>    Loading Kernel Image ... OK
>>> OK
>>> ----------------
>>>
>>> Nothing else.
>>>
>>> Regards,
>>>
>>> Alexander
>>>
>>
>> Yes, you are correct, I see the same here with 4.5.2.  I noticed that
>> bd did not have correct values of machine type and boot params:
>>
>> bd address  = 0x8FF24FE0
>> arch_number = 0xFF0084FF
>> boot_params = 0xBB2000FE
>> DRAM bank   = 0x00000000
>> ->  start    = 0x80000000
>> ->  size     = 0x08000000
>> DRAM bank   = 0x00000001
>> ->  start    = 0x88000000
>> ->  size     = 0x08000000
>> baudrate    = 115200 bps
>> TLB addr    = 0x8FFF0000
>> relocaddr   = 0x8FF85000
>> reloc off   = 0x0FF7D000
>> irq_sp      = 0x8FF24F68
>> sp start    = 0x8FF24F60
>> FB base     = 0x00000000
>>
>> If we then look at board_init in beagle.c the problem is obvious:
>>
>> 800331ac<board_init>:
>> 800331ac:       e92d4008        push    {r3, lr}
>> 800331b0:       ebff5a74        bl      80009b88<gpmc_init>
>> 800331b4:       e3a00000        mov     r0, #0
>> 800331b8:       e5983000        ldr     r3, [r8]
>> 800331bc:       e5983000        ldr     r3, [r8]
>> 800331c0:       e8bd8008        pop     {r3, pc}
>>
>
> Apparently this is a known issue mentioned in README:
>
> NOTE: DECLARE_GLOBAL_DATA_PTR must be used with file-global scope,
> or current versions of GCC may "optimize" the code too much.
>
>
> With this fix I can boot again:
>
> diff --git a/board/ti/beagle/beagle.c b/board/ti/beagle/beagle.c
> index d9b6f01..c066d6e 100644
> --- a/board/ti/beagle/beagle.c
> +++ b/board/ti/beagle/beagle.c
> @@ -51,6 +51,8 @@
>
>   #define BEAGLE_NO_EEPROM               0xffffffff
>
> +DECLARE_GLOBAL_DATA_PTR;
> +
>   static struct {
>          unsigned int device_vendor;
>          unsigned char revision;
> @@ -66,8 +68,6 @@ static struct {
>    */
>   int board_init(void)
>   {
> -       DECLARE_GLOBAL_DATA_PTR;
> -
>          gpmc_init(); /* in SRAM or SDRAM, finish GPMC */
>          /* board id for Linux */
>          gd->bd->bi_arch_number = MACH_TYPE_OMAP3_BEAGLE;
>
> Please let me know if you find any other problems.

Just to not loose the overview:

This is fixed by your patch

http://patchwork.ozlabs.org/patch/76250/

?

But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional 
ldrb    r3, [r3]) is still open? Has anybody tried to replace it with 
a nop in the binary to be sure this is the root cause?

Thanks

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  7:11                       ` Dirk Behme
@ 2010-12-21  7:21                         ` Albert ARIBAUD
  2010-12-21  8:05                           ` Dirk Behme
  2010-12-21  8:35                           ` Dirk Behme
  0 siblings, 2 replies; 48+ messages in thread
From: Albert ARIBAUD @ 2010-12-21  7:21 UTC (permalink / raw)
  To: u-boot

Hi Dirk,

Le 21/12/2010 08:11, Dirk Behme a ?crit :

> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
> a nop in the binary to be sure this is the root cause?

Can you try and preprocess the C file for both the broken and working 
cases, then post the preprocessed C extract? Differences at the C level 
may help understanding differences at the asm level.

> Thanks
>
> Dirk

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  7:21                         ` Albert ARIBAUD
@ 2010-12-21  8:05                           ` Dirk Behme
  2010-12-21  8:17                             ` Wolfgang Denk
  2010-12-21  8:35                           ` Dirk Behme
  1 sibling, 1 reply; 48+ messages in thread
From: Dirk Behme @ 2010-12-21  8:05 UTC (permalink / raw)
  To: u-boot

On 21.12.2010 08:21, Albert ARIBAUD wrote:
> Hi Dirk,
>
> Le 21/12/2010 08:11, Dirk Behme a ?crit :
>
>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
>> a nop in the binary to be sure this is the root cause?
>
> Can you try and preprocess the C file for both the broken and working
> cases, then post the preprocessed C extract? Differences at the C level
> may help understanding differences at the asm level.

gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)

Work:
====

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;
  ...
  if (cmd != -1)

   (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
		origwriteb(cmd, this->IO_ADDR_W);
   88:	15933004 	ldrne	r3, [r3, #4]
   8c:	120110ff 	andne	r1, r1, #255	; 0xff
   90:	15c31000 	strbne	r1, [r3]
   94:	e12fff1e 	bx	lr
	...


Broken:
======

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;
...
  if (cmd != -1)
   ({ do { } while (0); (*(volatile unsigned char *)(this->IO_ADDR_W) 
= (cmd)); });
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
		writeb(cmd, this->IO_ADDR_W);
   88:	15933004 	ldrne	r3, [r3, #4]
   8c:	120110ff 	andne	r1, r1, #255	; 0xff
   90:	15c31000 	strbne	r1, [r3]
   94:	15d33000 	ldrbne	r3, [r3]
   98:	e12fff1e 	bx	lr
	...


The issue seems to be the additional 'ldrbne	r3, [r3]' added by the 
compiler in the broken version.

Best regards

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  8:05                           ` Dirk Behme
@ 2010-12-21  8:17                             ` Wolfgang Denk
  2010-12-21  8:37                               ` Dirk Behme
  0 siblings, 1 reply; 48+ messages in thread
From: Wolfgang Denk @ 2010-12-21  8:17 UTC (permalink / raw)
  To: u-boot

Dear Dirk Behme,

In message <4D105FB3.3090801@googlemail.com> you wrote:
>
> Broken:
> ======
...
> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>      uint32_t ctrl)
> {
>   register struct nand_chip *this = mtd->priv;
> ...
>   if (cmd != -1)
>    ({ do { } while (0); (*(volatile unsigned char *)(this->IO_ADDR_W) >
> = (cmd)); });

But this is the old, discarded version of the patch.

I though we had agreed that we have to use the

	__asm__ __volatile__ ("" : : : "memory")

version instead?

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
Program maintenance is an entropy-increasing process,  and  even  its
most skilfull execution only delays the subsidence of the system into
unfixable obsolescence.       - Fred Brooks, "The Mythical Man Month"

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  7:21                         ` Albert ARIBAUD
  2010-12-21  8:05                           ` Dirk Behme
@ 2010-12-21  8:35                           ` Dirk Behme
  2010-12-21  8:46                             ` John Rigby
  1 sibling, 1 reply; 48+ messages in thread
From: Dirk Behme @ 2010-12-21  8:35 UTC (permalink / raw)
  To: u-boot


(Resend with corrected broken example)

On 21.12.2010 08:21, Albert ARIBAUD wrote:
> Hi Dirk,
>
> Le 21/12/2010 08:11, Dirk Behme a ?crit :
>
>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
>> a nop in the binary to be sure this is the root cause?
>
> Can you try and preprocess the C file for both the broken and working
> cases, then post the preprocessed C extract? Differences at the C level
> may help understanding differences at the asm level.

gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)

Work:
====

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;
  ...
  if (cmd != -1)

   (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
		origwriteb(cmd, this->IO_ADDR_W);
   88:	15933004 	ldrne	r3, [r3, #4]
   8c:	120110ff 	andne	r1, r1, #255	; 0xff
   90:	15c31000 	strbne	r1, [r3]
   94:	e12fff1e 	bx	lr
	...


Broken:
======

static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
     uint32_t ctrl)
{
  register struct nand_chip *this = mtd->priv;

...

  if (cmd != -1)
   ({ __asm__ __volatile__ ("" : : : "memory"); (*(volatile unsigned 
char *)(this->IO_ADDR_W) = (cmd)); });
}

	if (cmd != NAND_CMD_NONE)
   84:	e3710001 	cmn	r1, #1
   88:	012fff1e 	bxeq	lr
		writeb(cmd, this->IO_ADDR_W);
   8c:	e5933004 	ldr	r3, [r3, #4]
   90:	e20110ff 	and	r1, r1, #255	; 0xff
   94:	e5c31000 	strb	r1, [r3]
   98:	e5d33000 	ldrb	r3, [r3]
   9c:	e12fff1e 	bx	lr


The issue seems to be the additional 'ldrb	r3, [r3]' added by the 
compiler in the broken version.

Best regards

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  8:17                             ` Wolfgang Denk
@ 2010-12-21  8:37                               ` Dirk Behme
  0 siblings, 0 replies; 48+ messages in thread
From: Dirk Behme @ 2010-12-21  8:37 UTC (permalink / raw)
  To: u-boot

On 21.12.2010 09:17, Wolfgang Denk wrote:
> Dear Dirk Behme,
>
> In message<4D105FB3.3090801@googlemail.com>  you wrote:
>>
>> Broken:
>> ======
> ...
>> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>>       uint32_t ctrl)
>> {
>>    register struct nand_chip *this = mtd->priv;
>> ...
>>    if (cmd != -1)
>>     ({ do { } while (0); (*(volatile unsigned char *)(this->IO_ADDR_W)>
>> = (cmd)); });
>
> But this is the old, discarded version of the patch.
>
> I though we had agreed that we have to use the
>
> 	__asm__ __volatile__ ("" : : : "memory")
>
> version instead?

Yes.

I mixed the patches :(

Sorry for the noise. I just sent the hopefully correct version some 
minutes ago.

Thanks

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  8:35                           ` Dirk Behme
@ 2010-12-21  8:46                             ` John Rigby
  2010-12-21 10:38                               ` Albert ARIBAUD
  0 siblings, 1 reply; 48+ messages in thread
From: John Rigby @ 2010-12-21  8:46 UTC (permalink / raw)
  To: u-boot

On Tue, Dec 21, 2010 at 1:35 AM, Dirk Behme <dirk.behme@googlemail.com> wrote:
>
> (Resend with corrected broken example)
>
> On 21.12.2010 08:21, Albert ARIBAUD wrote:
>> Hi Dirk,
>>
>> Le 21/12/2010 08:11, Dirk Behme a ?crit :
>>
>>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>>> ldrb ? ?r3, [r3]) is still open? Has anybody tried to replace it with
>>> a nop in the binary to be sure this is the root cause?
>>
>> Can you try and preprocess the C file for both the broken and working
>> cases, then post the preprocessed C extract? Differences at the C level
>> may help understanding differences at the asm level.
>
> gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)
>
> Work:
> ====
>
> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
> ? ? uint32_t ctrl)
> {
> ?register struct nand_chip *this = mtd->priv;
> ?...
> ?if (cmd != -1)
>
> ? (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
> }
>
> ? ? ? ?if (cmd != NAND_CMD_NONE)
> ? 84: ?e3710001 ? ? ? ?cmn ? ? r1, #1
> ? ? ? ? ? ? ? ?origwriteb(cmd, this->IO_ADDR_W);
> ? 88: ?15933004 ? ? ? ?ldrne ? r3, [r3, #4]
> ? 8c: ?120110ff ? ? ? ?andne ? r1, r1, #255 ? ?; 0xff
> ? 90: ?15c31000 ? ? ? ?strbne ?r1, [r3]
> ? 94: ?e12fff1e ? ? ? ?bx ? ? ?lr
> ? ? ? ?...
>
>
> Broken:
> ======
>
> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
> ? ? uint32_t ctrl)
> {
> ?register struct nand_chip *this = mtd->priv;
>
> ...
>
> ?if (cmd != -1)
> ? ({ __asm__ __volatile__ ("" : : : "memory"); (*(volatile unsigned
> char *)(this->IO_ADDR_W) = (cmd)); });
> }
>
> ? ? ? ?if (cmd != NAND_CMD_NONE)
> ? 84: ?e3710001 ? ? ? ?cmn ? ? r1, #1
> ? 88: ?012fff1e ? ? ? ?bxeq ? ?lr
> ? ? ? ? ? ? ? ?writeb(cmd, this->IO_ADDR_W);
> ? 8c: ?e5933004 ? ? ? ?ldr ? ? r3, [r3, #4]
> ? 90: ?e20110ff ? ? ? ?and ? ? r1, r1, #255 ? ?; 0xff
> ? 94: ?e5c31000 ? ? ? ?strb ? ?r1, [r3]
> ? 98: ?e5d33000 ? ? ? ?ldrb ? ?r3, [r3]
> ? 9c: ?e12fff1e ? ? ? ?bx ? ? ?lr
>
>
> The issue seems to be the additional 'ldrb ? ? ?r3, [r3]' added by the
> compiler in the broken version.
>

And I at your suggestion tried modifying the binary changing the extra
ldrb to a nop and it works.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  8:46                             ` John Rigby
@ 2010-12-21 10:38                               ` Albert ARIBAUD
  2010-12-21 10:53                                 ` Wolfgang Denk
  0 siblings, 1 reply; 48+ messages in thread
From: Albert ARIBAUD @ 2010-12-21 10:38 UTC (permalink / raw)
  To: u-boot

Le 21/12/2010 09:46, John Rigby a ?crit :
> On Tue, Dec 21, 2010 at 1:35 AM, Dirk Behme<dirk.behme@googlemail.com>  wrote:
>>
>> (Resend with corrected broken example)
>>
>> On 21.12.2010 08:21, Albert ARIBAUD wrote:
>>> Hi Dirk,
>>>
>>> Le 21/12/2010 08:11, Dirk Behme a ?crit :
>>>
>>>> But the issue with drivers/mtd/nand/omap_gpmc.c (i.e. the additional
>>>> ldrb    r3, [r3]) is still open? Has anybody tried to replace it with
>>>> a nop in the binary to be sure this is the root cause?
>>>
>>> Can you try and preprocess the C file for both the broken and working
>>> cases, then post the preprocessed C extract? Differences at the C level
>>> may help understanding differences at the asm level.
>>
>> gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50)
>>
>> Work:
>> ====
>>
>> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>>      uint32_t ctrl)
>> {
>>   register struct nand_chip *this = mtd->priv;
>>   ...
>>   if (cmd != -1)
>>
>>    (*(volatile unsigned char *)(this->IO_ADDR_W) = (cmd));
>> }
>>
>>         if (cmd != NAND_CMD_NONE)
>>    84:  e3710001        cmn     r1, #1
>>                 origwriteb(cmd, this->IO_ADDR_W);
>>    88:  15933004        ldrne   r3, [r3, #4]
>>    8c:  120110ff        andne   r1, r1, #255    ; 0xff
>>    90:  15c31000        strbne  r1, [r3]
>>    94:  e12fff1e        bx      lr
>>         ...
>>
>>
>> Broken:
>> ======
>>
>> static void omap_nand_hwcontrol(struct mtd_info *mtd, int32_t cmd,
>>      uint32_t ctrl)
>> {
>>   register struct nand_chip *this = mtd->priv;
>>
>> ...
>>
>>   if (cmd != -1)
>>    ({ __asm__ __volatile__ ("" : : : "memory"); (*(volatile unsigned
>> char *)(this->IO_ADDR_W) = (cmd)); });
>> }
>>
>>         if (cmd != NAND_CMD_NONE)
>>    84:  e3710001        cmn     r1, #1
>>    88:  012fff1e        bxeq    lr
>>                 writeb(cmd, this->IO_ADDR_W);
>>    8c:  e5933004        ldr     r3, [r3, #4]
>>    90:  e20110ff        and     r1, r1, #255    ; 0xff
>>    94:  e5c31000        strb    r1, [r3]
>>    98:  e5d33000        ldrb    r3, [r3]
>>    9c:  e12fff1e        bx      lr
>>
>>
>> The issue seems to be the additional 'ldrb      r3, [r3]' added by the
>> compiler in the broken version.
>>
>
> And I at your suggestion tried modifying the binary changing the extra
> ldrb to a nop and it works.

Seems like a compiler issue to me, as the preprocessed C source is the 
same for the register access and does not call for a re-read (that is 
what I wanted to see in the preprocessed version), yet the ASM sequence 
does the re-read.

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 10:38                               ` Albert ARIBAUD
@ 2010-12-21 10:53                                 ` Wolfgang Denk
  2010-12-21 12:35                                   ` Alexander Holler
  0 siblings, 1 reply; 48+ messages in thread
From: Wolfgang Denk @ 2010-12-21 10:53 UTC (permalink / raw)
  To: u-boot

Dear Albert ARIBAUD,

In message <4D1083B4.2060704@free.fr> you wrote:
>
> > And I at your suggestion tried modifying the binary changing the extra
> > ldrb to a nop and it works.
>
> Seems like a compiler issue to me, as the preprocessed C source is the
> same for the register access and does not call for a re-read (that is
> what I wanted to see in the preprocessed version), yet the ASM sequence
> does the re-read.

I also tend to think this is a compiler problem.  Searching the gcc
bugzilla entries for "ldrb" turns up quite a number of hits.  I'm not
sure which of these we are running into here, but there are enough of
them so you can chose freely :-(

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
Every revolutionary idea - in science, politics, art, or  whatever  -
evokes three stages of reaction in a hearer:
  1. It is completely impossible - don't waste my time.
  2. It is possible, but it is not worth doing.
  3. I said it was a good idea all along.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 10:53                                 ` Wolfgang Denk
@ 2010-12-21 12:35                                   ` Alexander Holler
  2010-12-21 12:51                                     ` Albert ARIBAUD
  0 siblings, 1 reply; 48+ messages in thread
From: Alexander Holler @ 2010-12-21 12:35 UTC (permalink / raw)
  To: u-boot

Am 21.12.2010 11:53, schrieb Wolfgang Denk:
> Dear Albert ARIBAUD,
>
> In message<4D1083B4.2060704@free.fr>  you wrote:
>>
>>> And I at your suggestion tried modifying the binary changing the extra
>>> ldrb to a nop and it works.
>>
>> Seems like a compiler issue to me, as the preprocessed C source is the
>> same for the register access and does not call for a re-read (that is
>> what I wanted to see in the preprocessed version), yet the ASM sequence
>> does the re-read.
>
> I also tend to think this is a compiler problem.  Searching the gcc
> bugzilla entries for "ldrb" turns up quite a number of hits.  I'm not
> sure which of these we are running into here, but there are enough of
> them so you can chose freely :-(

Hmm, is there actual somethinbg which should forbid the compiler to 
generate such code which rereads something? It might not be nice, but I 
don't think that it is forbidden for a compiler to do so. So the proper 
way to handle such, might be to use asm to avoid that the compiler 
touches that register.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 12:35                                   ` Alexander Holler
@ 2010-12-21 12:51                                     ` Albert ARIBAUD
  2010-12-21 13:30                                       ` Alexander Holler
  0 siblings, 1 reply; 48+ messages in thread
From: Albert ARIBAUD @ 2010-12-21 12:51 UTC (permalink / raw)
  To: u-boot

Le 21/12/2010 13:35, Alexander Holler a ?crit :

> Hmm, is there actual somethinbg which should forbid the compiler to
> generate such code which rereads something? It might not be nice, but I
> don't think that it is forbidden for a compiler to do so. So the proper
> way to handle such, might be to use asm to avoid that the compiler
> touches that register.

Yes there is something that should prevent a compiler from inserting 
reads: these accesses are to hardware, not memory, and may cause side 
effects even on read (these could be acknowledges, for instance; I've 
seen instances of that myself on some HW).

Another way to look at it is that the semantics of " *ptr = value " is a 
pure write and should not result in a write-then-read.

> Regards,
>
> Alexander

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 12:51                                     ` Albert ARIBAUD
@ 2010-12-21 13:30                                       ` Alexander Holler
  2010-12-21 14:33                                         ` Albert ARIBAUD
  0 siblings, 1 reply; 48+ messages in thread
From: Alexander Holler @ 2010-12-21 13:30 UTC (permalink / raw)
  To: u-boot

Am 21.12.2010 13:51, schrieb Albert ARIBAUD:
> Le 21/12/2010 13:35, Alexander Holler a ?crit :
>
>> Hmm, is there actual somethinbg which should forbid the compiler to
>> generate such code which rereads something? It might not be nice, but I
>> don't think that it is forbidden for a compiler to do so. So the proper
>> way to handle such, might be to use asm to avoid that the compiler
>> touches that register.
>
> Yes there is something that should prevent a compiler from inserting
> reads: these accesses are to hardware, not memory, and may cause side
> effects even on read (these could be acknowledges, for instance; I've
> seen instances of that myself on some HW).
>
> Another way to look at it is that the semantics of " *ptr = value " is a
> pure write and should not result in a write-then-read.

I think it's something like atomic_read. E.g. when reading an 32bit int 
(uint32_t i = *bla;), nothing forbids that the compiler generates code 
which reads those 4 bytes byte by byte (and so becoming a non-atomic 
operation). It's unusual to do so on 32bit architectures but valid.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21  0:25                   ` John Rigby
  2010-12-21  0:46                     ` John Rigby
@ 2010-12-21 13:38                     ` Alexander Holler
  1 sibling, 0 replies; 48+ messages in thread
From: Alexander Holler @ 2010-12-21 13:38 UTC (permalink / raw)
  To: u-boot

Hello,

Am 21.12.2010 01:25, schrieb John Rigby:
> On Mon, Dec 20, 2010 at 10:12 AM, Alexander Holler<holler@ahsoftware.de>  wrote:
>
>> There must be more problems. Using gcc 4.5.1, the read*/write*-patch and
...
> Yes, you are correct, I see the same here with 4.5.2.  I noticed that
> bd did not have correct values of machine type and boot params:
>
> bd address  = 0x8FF24FE0
...

Great!

Thanks a lot for searching and finding that.

I've just tested an u-boot for the BeagleBoard with your "Move 
DECLARE..."-patch compiled with gcc 4.5.1 and it now boots. ;)

I remember having seen those two DECLARE_GLOBAL_DATA_PTR in beagle.c 
too, but I was to lazy at that time to check if that has (bad) 
consequences. ;)

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 13:30                                       ` Alexander Holler
@ 2010-12-21 14:33                                         ` Albert ARIBAUD
  2010-12-21 19:52                                           ` Wolfgang Denk
  0 siblings, 1 reply; 48+ messages in thread
From: Albert ARIBAUD @ 2010-12-21 14:33 UTC (permalink / raw)
  To: u-boot

Le 21/12/2010 14:30, Alexander Holler a ?crit :
> Am 21.12.2010 13:51, schrieb Albert ARIBAUD:
>> Le 21/12/2010 13:35, Alexander Holler a ?crit :
>>
>>> Hmm, is there actual somethinbg which should forbid the compiler to
>>> generate such code which rereads something? It might not be nice, but I
>>> don't think that it is forbidden for a compiler to do so. So the proper
>>> way to handle such, might be to use asm to avoid that the compiler
>>> touches that register.
>>
>> Yes there is something that should prevent a compiler from inserting
>> reads: these accesses are to hardware, not memory, and may cause side
>> effects even on read (these could be acknowledges, for instance; I've
>> seen instances of that myself on some HW).
>>
>> Another way to look at it is that the semantics of " *ptr = value " is a
>> pure write and should not result in a write-then-read.
>
> I think it's something like atomic_read. E.g. when reading an 32bit int
> (uint32_t i = *bla;), nothing forbids that the compiler generates code
> which reads those 4 bytes byte by byte (and so becoming a non-atomic
> operation). It's unusual to do so on 32bit architectures but valid.

OTOH, this still respects the semantics (the compiler is allowed to do a 
non-atomic read) while the bug we're seeing does not repect the 
semantics (the compiler is not asked to do any read but does one).

> Regards,
>
> Alexander

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 14:33                                         ` Albert ARIBAUD
@ 2010-12-21 19:52                                           ` Wolfgang Denk
  2010-12-21 20:04                                             ` Dirk Behme
  0 siblings, 1 reply; 48+ messages in thread
From: Wolfgang Denk @ 2010-12-21 19:52 UTC (permalink / raw)
  To: u-boot

Dear Albert & friends,

what is your opinion?  Should we include the memory barrier patch into
the upcoming release (and eventually delay it for further testing), or
release as is and solve this issue in the next release?

I tend to leave it as is, as I expect that most people will disappear
in the next few days for holidays, so no much testing will be done
anyway, and we then can solve this with less pressure in the next
release - but I'm not really sure if this is a good idea?

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
A committee is a life form with six or more legs and no brain.
                              -- Lazarus Long, "Time Enough For Love"

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 19:52                                           ` Wolfgang Denk
@ 2010-12-21 20:04                                             ` Dirk Behme
  2010-12-21 21:49                                               ` Albert ARIBAUD
  2010-12-22  0:11                                               ` Alexander Holler
  0 siblings, 2 replies; 48+ messages in thread
From: Dirk Behme @ 2010-12-21 20:04 UTC (permalink / raw)
  To: u-boot

On 21.12.2010 20:52, Wolfgang Denk wrote:
> Dear Albert&  friends,
>
> what is your opinion?  Should we include the memory barrier patch into
> the upcoming release (and eventually delay it for further testing), or
> release as is and solve this issue in the next release?
>
> I tend to leave it as is, as I expect that most people will disappear
> in the next few days for holidays, so no much testing will be done
> anyway, and we then can solve this with less pressure in the next
> release - but I'm not really sure if this is a good idea?

I somehow tend to leave it as is, too.

We have issues with some recent compilers. For these we found a fix 
using the io.h somehow the same way the Linux kernel does. But this 
introduces new issues for us, we haven't found a proper fix yet 
(except changing the code to the 'old' io.h style). But we don't know 
where we might have this issue additionally, yet.

I would like to talk with some tool chain guys about this, too.

Could we add a readme or a note somewhere about the issues with the 
recent tool chains in this release?

Best regards

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 20:04                                             ` Dirk Behme
@ 2010-12-21 21:49                                               ` Albert ARIBAUD
  2010-12-22  0:11                                               ` Alexander Holler
  1 sibling, 0 replies; 48+ messages in thread
From: Albert ARIBAUD @ 2010-12-21 21:49 UTC (permalink / raw)
  To: u-boot

Le 21/12/2010 21:04, Dirk Behme a ?crit :

> I somehow tend to leave it as is, too.

Agree.

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-21 20:04                                             ` Dirk Behme
  2010-12-21 21:49                                               ` Albert ARIBAUD
@ 2010-12-22  0:11                                               ` Alexander Holler
  2010-12-22  7:02                                                 ` Albert ARIBAUD
  1 sibling, 1 reply; 48+ messages in thread
From: Alexander Holler @ 2010-12-22  0:11 UTC (permalink / raw)
  To: u-boot

Am 21.12.2010 21:04, schrieb Dirk Behme:
> On 21.12.2010 20:52, Wolfgang Denk wrote:
>> Dear Albert&   friends,
>>
>> what is your opinion?  Should we include the memory barrier patch into
>> the upcoming release (and eventually delay it for further testing), or
>> release as is and solve this issue in the next release?
>>
>> I tend to leave it as is, as I expect that most people will disappear
>> in the next few days for holidays, so no much testing will be done
>> anyway, and we then can solve this with less pressure in the next
>> release - but I'm not really sure if this is a good idea?
>
> I somehow tend to leave it as is, too.
>
> We have issues with some recent compilers. For these we found a fix
> using the io.h somehow the same way the Linux kernel does. But this
> introduces new issues for us, we haven't found a proper fix yet
> (except changing the code to the 'old' io.h style). But we don't know
> where we might have this issue additionally, yet.

The only real problem found with that patch was one with a register 
which doesn't like an (unmotivated) read after write.

On the other side, without that patch, using gcc >= 4.5.x (at least on 
arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring 
that volatile, 4.5.x still fixes many bugs for arm and gcc >= 4.5.x is 
necessary for hardfloat. So it's likely that more people will start 
using 4.5.x (4.5.2 was just released).

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  0:11                                               ` Alexander Holler
@ 2010-12-22  7:02                                                 ` Albert ARIBAUD
  2010-12-22  7:18                                                   ` Dirk Behme
  2010-12-22  9:56                                                   ` Alexander Holler
  0 siblings, 2 replies; 48+ messages in thread
From: Albert ARIBAUD @ 2010-12-22  7:02 UTC (permalink / raw)
  To: u-boot

Le 22/12/2010 01:11, Alexander Holler a ?crit :
> Am 21.12.2010 21:04, schrieb Dirk Behme:
>> On 21.12.2010 20:52, Wolfgang Denk wrote:
>>> Dear Albert&    friends,
>>>
>>> what is your opinion?  Should we include the memory barrier patch into
>>> the upcoming release (and eventually delay it for further testing), or
>>> release as is and solve this issue in the next release?
>>>
>>> I tend to leave it as is, as I expect that most people will disappear
>>> in the next few days for holidays, so no much testing will be done
>>> anyway, and we then can solve this with less pressure in the next
>>> release - but I'm not really sure if this is a good idea?
>>
>> I somehow tend to leave it as is, too.
>>
>> We have issues with some recent compilers. For these we found a fix
>> using the io.h somehow the same way the Linux kernel does. But this
>> introduces new issues for us, we haven't found a proper fix yet
>> (except changing the code to the 'old' io.h style). But we don't know
>> where we might have this issue additionally, yet.
>
> The only real problem found with that patch was one with a register
> which doesn't like an (unmotivated) read after write.

Yes, and this is enough for me to not want it right away: we caught this 
one, but how many others, so far unseen, will creep up?

> On the other side, without that patch, using gcc>= 4.5.x (at least on
> arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring
> that volatile, 4.5.x still fixes many bugs for arm and gcc>= 4.5.x is
> necessary for hardfloat. So it's likely that more people will start
> using 4.5.x (4.5.2 was just released).

Do we need hard floating point in u-boot? IIRC, and unless this changed, 
the kernel does not want floating point, so I wonder why u-boot would.

As for getting 4.5 to work, for the next cycle people can still use pre 
4.5 gccs / toolchains, so this is important but not urgent to the point 
of rushing decisions.

> Regards,
>
> Alexander

Amicalement,
-- 
Albert.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  7:02                                                 ` Albert ARIBAUD
@ 2010-12-22  7:18                                                   ` Dirk Behme
  2010-12-22  7:52                                                     ` Wolfgang Denk
  2010-12-22  9:56                                                   ` Alexander Holler
  1 sibling, 1 reply; 48+ messages in thread
From: Dirk Behme @ 2010-12-22  7:18 UTC (permalink / raw)
  To: u-boot

On 22.12.2010 08:02, Albert ARIBAUD wrote:
> Le 22/12/2010 01:11, Alexander Holler a ?crit :
>> Am 21.12.2010 21:04, schrieb Dirk Behme:
>>> On 21.12.2010 20:52, Wolfgang Denk wrote:
>>>> Dear Albert&     friends,
>>>>
>>>> what is your opinion?  Should we include the memory barrier patch into
>>>> the upcoming release (and eventually delay it for further testing), or
>>>> release as is and solve this issue in the next release?
>>>>
>>>> I tend to leave it as is, as I expect that most people will disappear
>>>> in the next few days for holidays, so no much testing will be done
>>>> anyway, and we then can solve this with less pressure in the next
>>>> release - but I'm not really sure if this is a good idea?
>>>
>>> I somehow tend to leave it as is, too.
>>>
>>> We have issues with some recent compilers. For these we found a fix
>>> using the io.h somehow the same way the Linux kernel does. But this
>>> introduces new issues for us, we haven't found a proper fix yet
>>> (except changing the code to the 'old' io.h style). But we don't know
>>> where we might have this issue additionally, yet.
>>
>> The only real problem found with that patch was one with a register
>> which doesn't like an (unmotivated) read after write.
>
> Yes, and this is enough for me to not want it right away: we caught this
> one, but how many others, so far unseen, will creep up?
>
>> On the other side, without that patch, using gcc>= 4.5.x (at least on
>> arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring
>> that volatile, 4.5.x still fixes many bugs for arm and gcc>= 4.5.x is
>> necessary for hardfloat. So it's likely that more people will start
>> using 4.5.x (4.5.2 was just released).
>
> Do we need hard floating point in u-boot? IIRC, and unless this changed,
> the kernel does not want floating point, so I wonder why u-boot would.
>
> As for getting 4.5 to work, for the next cycle people can still use pre
> 4.5 gccs / toolchains, so this is important but not urgent to the point
> of rushing decisions.

Agree.

Btw, I tried to send a summary of our issues to the Codesourcery 
mailing list:

http://www.codesourcery.com/archives/arm-gnu/msg03989.html

Let's see if we get an answer.

Best regards

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  7:18                                                   ` Dirk Behme
@ 2010-12-22  7:52                                                     ` Wolfgang Denk
  2010-12-23 16:40                                                       ` Dirk Behme
  0 siblings, 1 reply; 48+ messages in thread
From: Wolfgang Denk @ 2010-12-22  7:52 UTC (permalink / raw)
  To: u-boot

Dear Dirk Behme,

In message <4D11A65E.8040200@googlemail.com> you wrote:
>
> Btw, I tried to send a summary of our issues to the Codesourcery =
> 
> mailing list:
> 
> http://www.codesourcery.com/archives/arm-gnu/msg03989.html
> 
> Let's see if we get an answer.

You got one:
http://www.codesourcery.com/archives/arm-gnu/msg03990.html

And it sounds like a reasonable explanation to me.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
Cigarette, n.: A fire at one end, a fool at the other, and a  bit  of
tobacco in between.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-18 22:27 [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends Alexander Holler
  2010-12-19  7:51 ` Dirk Behme
@ 2010-12-22  8:02 ` Wolfgang Denk
  2010-12-22 11:23   ` Alexander Holler
                     ` (2 more replies)
  1 sibling, 3 replies; 48+ messages in thread
From: Wolfgang Denk @ 2010-12-22  8:02 UTC (permalink / raw)
  To: u-boot

Dear Alexander Holler,

In message <1292711230-3234-1-git-send-email-holler@ahsoftware.de> you wrote:
> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
> avoid that as done in the kernel.
...
> +#define writeb(v,c)			({ __iowmb(); __arch_putb(v,c); })
> +#define writew(v,c)			({ __iowmb(); __arch_putw(v,c); })
> +#define writel(v,c)			({ __iowmb(); __arch_putl(v,c); })

http://www.codesourcery.com/archives/arm-gnu/msg03990.html  explains
why this construct is causing errors in cases where an additional read
from the address is unsupported.

Can you please try the following patch instead?

-------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  7:02                                                 ` Albert ARIBAUD
  2010-12-22  7:18                                                   ` Dirk Behme
@ 2010-12-22  9:56                                                   ` Alexander Holler
  1 sibling, 0 replies; 48+ messages in thread
From: Alexander Holler @ 2010-12-22  9:56 UTC (permalink / raw)
  To: u-boot

Am 22.12.2010 08:02, schrieb Albert ARIBAUD:
> Le 22/12/2010 01:11, Alexander Holler a ?crit :
>> Am 21.12.2010 21:04, schrieb Dirk Behme:
>>> On 21.12.2010 20:52, Wolfgang Denk wrote:
>>>> Dear Albert&     friends,
>>>>
>>>> what is your opinion?  Should we include the memory barrier patch into
>>>> the upcoming release (and eventually delay it for further testing), or
>>>> release as is and solve this issue in the next release?
>>>>
>>>> I tend to leave it as is, as I expect that most people will disappear
>>>> in the next few days for holidays, so no much testing will be done
>>>> anyway, and we then can solve this with less pressure in the next
>>>> release - but I'm not really sure if this is a good idea?
>>>
>>> I somehow tend to leave it as is, too.
>>>
>>> We have issues with some recent compilers. For these we found a fix
>>> using the io.h somehow the same way the Linux kernel does. But this
>>> introduces new issues for us, we haven't found a proper fix yet
>>> (except changing the code to the 'old' io.h style). But we don't know
>>> where we might have this issue additionally, yet.
>>
>> The only real problem found with that patch was one with a register
>> which doesn't like an (unmotivated) read after write.
>
> Yes, and this is enough for me to not want it right away: we caught this
> one, but how many others, so far unseen, will creep up?
>
>> On the other side, without that patch, using gcc>= 4.5.x (at least on
>> arm) proved to fail. In contrast to that problem of gcc 4.5.x ignoring
>> that volatile, 4.5.x still fixes many bugs for arm and gcc>= 4.5.x is
>> necessary for hardfloat. So it's likely that more people will start
>> using 4.5.x (4.5.2 was just released).
>
> Do we need hard floating point in u-boot? IIRC, and unless this changed,
> the kernel does not want floating point, so I wonder why u-boot would.

Most people won't use U-Boot as base for the decision which compiler 
version to use.

> As for getting 4.5 to work, for the next cycle people can still use pre
> 4.5 gccs / toolchains, so this is important but not urgent to the point
> of rushing decisions.

I've just written down my opinion.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  8:02 ` Wolfgang Denk
@ 2010-12-22 11:23   ` Alexander Holler
  2010-12-29  9:40   ` Dirk Behme
  2010-12-29 23:10   ` Alessandro Rubini
  2 siblings, 0 replies; 48+ messages in thread
From: Alexander Holler @ 2010-12-22 11:23 UTC (permalink / raw)
  To: u-boot

Hello,

Am 22.12.2010 09:02, schrieb Wolfgang Denk:

> In message<1292711230-3234-1-git-send-email-holler@ahsoftware.de>  you wrote:
>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>> avoid that as done in the kernel.
> ...
>> +#define writeb(v,c)			({ __iowmb(); __arch_putb(v,c); })
>> +#define writew(v,c)			({ __iowmb(); __arch_putw(v,c); })
>> +#define writel(v,c)			({ __iowmb(); __arch_putl(v,c); })
>
> http://www.codesourcery.com/archives/arm-gnu/msg03990.html  explains
> why this construct is causing errors in cases where an additional read
> from the address is unsupported.
>
> Can you please try the following patch instead?

Indeed. Using do {} while (0) instead of that "GCC statement-expression" 
fixes the problem with the read after write.

I didn't know about that "GCC statement-expression" and my usage of 
({...}) was based on the writeb in the kernel-headers. But they (seem 
to) expand to something returning (void) which might avoid the problem.

Good that I've added the sentence that using the kernel-headers instead 
of that patch might be a better solution. But this might bring in much 
more changes, including real memory barriers. ;)

Anyway, now the master (including the GLOBAL...-patch) + patch v3 for 
read*/write* is good here. Just tested with both gcc 4.3.5 and gcc 4.5.1 
using binutils 2.20.1.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  7:52                                                     ` Wolfgang Denk
@ 2010-12-23 16:40                                                       ` Dirk Behme
  0 siblings, 0 replies; 48+ messages in thread
From: Dirk Behme @ 2010-12-23 16:40 UTC (permalink / raw)
  To: u-boot

On 22.12.2010 08:52, Wolfgang Denk wrote:
> Dear Dirk Behme,
>
> In message<4D11A65E.8040200@googlemail.com>  you wrote:
>>
>> Btw, I tried to send a summary of our issues to the Codesourcery =
>>
>> mailing list:
>>
>> http://www.codesourcery.com/archives/arm-gnu/msg03989.html
>>
>> Let's see if we get an answer.
>
> You got one:
> http://www.codesourcery.com/archives/arm-gnu/msg03990.html
>
> And it sounds like a reasonable explanation to me.

An additional answer, mainly covering the question why the volatile is 
optimized away:

http://www.codesourcery.com/archives/arm-gnu/msg03999.html

Best regards

Dirk

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  8:02 ` Wolfgang Denk
  2010-12-22 11:23   ` Alexander Holler
@ 2010-12-29  9:40   ` Dirk Behme
  2010-12-29 23:10   ` Alessandro Rubini
  2 siblings, 0 replies; 48+ messages in thread
From: Dirk Behme @ 2010-12-29  9:40 UTC (permalink / raw)
  To: u-boot

On 22.12.2010 09:02, Wolfgang Denk wrote:
> Dear Alexander Holler,
>
> In message<1292711230-3234-1-git-send-email-holler@ahsoftware.de>  you wrote:
>> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
>> avoid that as done in the kernel.
> ...
>> +#define writeb(v,c)			({ __iowmb(); __arch_putb(v,c); })
>> +#define writew(v,c)			({ __iowmb(); __arch_putw(v,c); })
>> +#define writel(v,c)			({ __iowmb(); __arch_putl(v,c); })
>
> http://www.codesourcery.com/archives/arm-gnu/msg03990.html  explains
> why this construct is causing errors in cases where an additional read
> from the address is unsupported.

Just for the record:

The trick is to ensure that the __arch_putx() containing the volatile 
is not the last statement in the GCC statement-expression. So, using 
something like

#define writeb(v,c)         ({ __iowmb(); __arch_putb(v,c); v;})

(note the additional 'v;') should result in correct code, too.

The patches sent by Wolfgang and Alexander using

#define writeb(v,c)	do { __iowmb(); __arch_putb(v,c); } while (0)

do the same with a slightly different syntax, so these patches are 
fine, too.

Thanks

Dirk

> Can you please try the following patch instead?
>
> -------------------------------------------------------------------------
>
>> From 4672bbddaf8ce7e17a99ba737782cc527d46e5eb Mon Sep 17 00:00:00 2001
> From: Alexander Holler<holler@ahsoftware.de>
> Date: Sat, 18 Dec 2010 23:27:10 +0100
> Subject: [PATCH] ARM: Avoid compiler optimization for readb, writeb and friends.
>
> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
> avoid that as done in the kernel.
>
> Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that
> gcc version to ignore the volatile type qualifier used e.g. in __arch_getl().
> Anyway, using a definition as in the kernel headers avoids such optimizations when
> gcc 4.5.1 is used.
>
> Maybe the headers as used in the current linux-kernel should be used,
> but to avoid large changes, I've just added a small change to the current headers.
>
> Signed-off-by: Alexander Holler<holler@ahsoftware.de>
> Signed-off-by: Wolfgang Denk<wd@denx.de>
> ---
>   arch/arm/include/asm/io.h |   32 ++++++++++++++++++++------------
>   1 files changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
> index ff1518e..647503a 100644
> --- a/arch/arm/include/asm/io.h
> +++ b/arch/arm/include/asm/io.h
> @@ -117,21 +117,29 @@ extern inline void __raw_readsl(unsigned int addr, void *data, int longlen)
>   		*buf++ = __arch_getl(addr);
>   }
>
> -#define __raw_writeb(v,a)		__arch_putb(v,a)
> -#define __raw_writew(v,a)		__arch_putw(v,a)
> -#define __raw_writel(v,a)		__arch_putl(v,a)
> +#define __raw_writeb(v,a)	__arch_putb(v,a)
> +#define __raw_writew(v,a)	__arch_putw(v,a)
> +#define __raw_writel(v,a)	__arch_putl(v,a)
>
> -#define __raw_readb(a)			__arch_getb(a)
> -#define __raw_readw(a)			__arch_getw(a)
> -#define __raw_readl(a)			__arch_getl(a)
> +#define __raw_readb(a)		__arch_getb(a)
> +#define __raw_readw(a)		__arch_getw(a)
> +#define __raw_readl(a)		__arch_getl(a)
>
> -#define writeb(v,a)			__arch_putb(v,a)
> -#define writew(v,a)			__arch_putw(v,a)
> -#define writel(v,a)			__arch_putl(v,a)
> +/*
> + * TODO: The kernel offers some more advanced versions of barriers, it might
> + * have some advantages to use them instead of the simple one here.
> + */
> +#define dmb()		__asm__ __volatile__ ("" : : : "memory")
> +#define __iormb()	dmb()
> +#define __iowmb()	dmb()
> +
> +#define writeb(v,c)	do { __iowmb(); __arch_putb(v,c); } while (0)
> +#define writew(v,c)	do { __iowmb(); __arch_putw(v,c); } while (0)
> +#define writel(v,c)	do { __iowmb(); __arch_putl(v,c); } while (0)
>
> -#define readb(a)			__arch_getb(a)
> -#define readw(a)			__arch_getw(a)
> -#define readl(a)			__arch_getl(a)
> +#define readb(c)	({ u8  __v = __arch_getb(c); __iormb(); __v; })
> +#define readw(c)	({ u16 __v = __arch_getw(c); __iormb(); __v; })
> +#define readl(c)	({ u32 __v = __arch_getl(c); __iormb(); __v; })
>
>   /*
>    * The compiler seems to be incapable of optimising constants

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-22  8:02 ` Wolfgang Denk
  2010-12-22 11:23   ` Alexander Holler
  2010-12-29  9:40   ` Dirk Behme
@ 2010-12-29 23:10   ` Alessandro Rubini
  2010-12-30 10:39     ` Dirk Behme
  2011-01-09 22:03     ` Wolfgang Denk
  2 siblings, 2 replies; 48+ messages in thread
From: Alessandro Rubini @ 2010-12-29 23:10 UTC (permalink / raw)
  To: u-boot

Dirk Behme:
> Just for the record:
> 
> The trick is to ensure that the __arch_putx() containing the volatile 
> is not the last statement in the GCC statement-expression. So, using 
> something like
> 
> #define writeb(v,c)         ({ __iowmb(); __arch_putb(v,c); v;})
> 
> (note the additional 'v;') should result in correct code, too.

Yes, that's good. Also "0" may work, and may be more readable, (or not,
according to who reads it).

> The patches sent by Wolfgang and Alexander using
> 
> #define writeb(v,c)	do { __iowmb(); __arch_putb(v,c); } while (0)
> 
> do the same with a slightly different syntax, so these patches are 
> fine, too.

It's not just different syntax, it's different semantics.

The ({...})  trick turns statements into expressions, while the "do
{...} while(0)" does not.  I'd better not forbid statements like

	while (reg = readb(addr), reg != VALUE) { .... }

or

	if (readb(addr) == VALUE) { ... } 

or 
	swtich (readb(addr)) { ... }
 
While I agree they may in general not be clean, I can forsee
meaningful uses. Moreover, I'd better allow expression-looking macros
to really behave like expressions -- otherwise error messages are quite
hard to understand for the unaquainted.

IMHO, the only reason to use "do {} while(0)" over statemente
expressions is being portable but in u-boot we are gcc-specific
anyways.

/alessandro

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-29 23:10   ` Alessandro Rubini
@ 2010-12-30 10:39     ` Dirk Behme
  2011-01-09 22:03     ` Wolfgang Denk
  1 sibling, 0 replies; 48+ messages in thread
From: Dirk Behme @ 2010-12-30 10:39 UTC (permalink / raw)
  To: u-boot

On 30.12.2010 00:10, Alessandro Rubini wrote:
> Dirk Behme:
>> Just for the record:
>>
>> The trick is to ensure that the __arch_putx() containing the volatile
>> is not the last statement in the GCC statement-expression. So, using
>> something like
>>
>> #define writeb(v,c)         ({ __iowmb(); __arch_putb(v,c); v;})
>>
>> (note the additional 'v;') should result in correct code, too.
>
> Yes, that's good. Also "0" may work, and may be more readable, (or not,
> according to who reads it).

Yes, indeed,

#define writeb(v,c)         ({ __iowmb(); __arch_putb(v,c); 0;})

seems to work, too :)

Thanks

Dirk

>> The patches sent by Wolfgang and Alexander using
>>
>> #define writeb(v,c)	do { __iowmb(); __arch_putb(v,c); } while (0)
>>
>> do the same with a slightly different syntax, so these patches are
>> fine, too.
>
> It's not just different syntax, it's different semantics.
>
> The ({...})  trick turns statements into expressions, while the "do
> {...} while(0)" does not.  I'd better not forbid statements like
>
> 	while (reg = readb(addr), reg != VALUE) { .... }
>
> or
>
> 	if (readb(addr) == VALUE) { ... }
>
> or
> 	swtich (readb(addr)) { ... }
>
> While I agree they may in general not be clean, I can forsee
> meaningful uses. Moreover, I'd better allow expression-looking macros
> to really behave like expressions -- otherwise error messages are quite
> hard to understand for the unaquainted.
>
> IMHO, the only reason to use "do {} while(0)" over statemente
> expressions is being portable but in u-boot we are gcc-specific
> anyways.
>
> /alessandro
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-29 23:10   ` Alessandro Rubini
  2010-12-30 10:39     ` Dirk Behme
@ 2011-01-09 22:03     ` Wolfgang Denk
  1 sibling, 0 replies; 48+ messages in thread
From: Wolfgang Denk @ 2011-01-09 22:03 UTC (permalink / raw)
  To: u-boot

Dear Alessandro Rubini,

In message <20101229231004.GA17999@mail.gnudd.com> you wrote:
>
> > #define writeb(v,c)         ({ __iowmb(); __arch_putb(v,c); v;})
> > 
> > (note the additional 'v;') should result in correct code, too.
> 
> Yes, that's good. Also "0" may work, and may be more readable, (or not,
> according to who reads it).

'0' looks wrong to me, as it changes behaviour.  Keep in mind that the
previous implemention usually looks like this:

	#define writeb(val,addr) (*(volatile unsigned char *)(addr)) = (val)

The value of this should be "val", not 0.


> > #define writeb(v,c)	do { __iowmb(); __arch_putb(v,c); } while (0)
> > 
> > do the same with a slightly different syntax, so these patches are 
> > fine, too.
> 
> It's not just different syntax, it's different semantics.

This is true.  Thanks for pointing out.

> The ({...})  trick turns statements into expressions, while the "do
> {...} while(0)" does not.  I'd better not forbid statements like
> 
> 	while (reg = readb(addr), reg != VALUE) { .... }
> 
> or
> 
> 	if (readb(addr) == VALUE) { ... } 
> 
> or 
> 	swtich (readb(addr)) { ... }

Here you are mixing things up.  There is no issue with the read*()
code, only the write*() code is affected.

And while no person ion a sane mind of state should use write*() in
such a context, I agree that for the sake of backward compatibility we
should change the code.  Patch follows.

> While I agree they may in general not be clean, I can forsee
> meaningful uses. Moreover, I'd better allow expression-looking macros
> to really behave like expressions -- otherwise error messages are quite
> hard to understand for the unaquainted.

Agreed.  But the write*() "functions" are considered to return void,
so there should not be any such issues.

> IMHO, the only reason to use "do {} while(0)" over statemente
> expressions is being portable but in u-boot we are gcc-specific
> anyways.

This is wrong interpretation, too; nothing in this construct is GCC
specific.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
Documentation is the castor oil of programming.
Managers know it must be good because the programmers hate it so much.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
  2010-12-20 16:08                     ` John Rigby
  2010-12-20 16:12                       ` John Rigby
@ 2011-01-17  4:35                       ` Alexander Holler
  1 sibling, 0 replies; 48+ messages in thread
From: Alexander Holler @ 2011-01-17  4:35 UTC (permalink / raw)
  To: u-boot

Hello,

Am 20.12.2010 17:08, schrieb John Rigby:
> Earlier in this thread Alexander said:
>> I haven't add the definitions which are using a memory barrier because I haven't found
>> a place in the kernel where they were actually enabled
>> (CONFIG_ARM_DMA_MEM_BUFFERABLE).

Because I've just run again into such a "search problem":

Don't use "git grep" on a kernel configured for x86, when you are 
searching an option for another architecture. ;)

Regards,

Alexander

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2011-01-17  4:35 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-18 22:27 [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends Alexander Holler
2010-12-19  7:51 ` Dirk Behme
2010-12-19 10:22   ` Alexander Holler
2010-12-19 11:28     ` Albert ARIBAUD
2010-12-19 16:34       ` Alexander Holler
2010-12-19 18:45     ` John Rigby
2010-12-19 19:59       ` Alexander Holler
2010-12-20  0:39         ` John Rigby
2010-12-20  0:56           ` Alexander Holler
2010-12-20  4:18             ` John Rigby
2010-12-20  6:07               ` John Rigby
2010-12-20  6:49                 ` Dirk Behme
2010-12-20  7:37                   ` John Rigby
2010-12-20 16:08                     ` John Rigby
2010-12-20 16:12                       ` John Rigby
2011-01-17  4:35                       ` Alexander Holler
2010-12-20 17:12                 ` Alexander Holler
2010-12-21  0:25                   ` John Rigby
2010-12-21  0:46                     ` John Rigby
2010-12-21  7:11                       ` Dirk Behme
2010-12-21  7:21                         ` Albert ARIBAUD
2010-12-21  8:05                           ` Dirk Behme
2010-12-21  8:17                             ` Wolfgang Denk
2010-12-21  8:37                               ` Dirk Behme
2010-12-21  8:35                           ` Dirk Behme
2010-12-21  8:46                             ` John Rigby
2010-12-21 10:38                               ` Albert ARIBAUD
2010-12-21 10:53                                 ` Wolfgang Denk
2010-12-21 12:35                                   ` Alexander Holler
2010-12-21 12:51                                     ` Albert ARIBAUD
2010-12-21 13:30                                       ` Alexander Holler
2010-12-21 14:33                                         ` Albert ARIBAUD
2010-12-21 19:52                                           ` Wolfgang Denk
2010-12-21 20:04                                             ` Dirk Behme
2010-12-21 21:49                                               ` Albert ARIBAUD
2010-12-22  0:11                                               ` Alexander Holler
2010-12-22  7:02                                                 ` Albert ARIBAUD
2010-12-22  7:18                                                   ` Dirk Behme
2010-12-22  7:52                                                     ` Wolfgang Denk
2010-12-23 16:40                                                       ` Dirk Behme
2010-12-22  9:56                                                   ` Alexander Holler
2010-12-21 13:38                     ` Alexander Holler
2010-12-22  8:02 ` Wolfgang Denk
2010-12-22 11:23   ` Alexander Holler
2010-12-29  9:40   ` Dirk Behme
2010-12-29 23:10   ` Alessandro Rubini
2010-12-30 10:39     ` Dirk Behme
2011-01-09 22:03     ` Wolfgang Denk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.