Hi Angelo,

On 22/08/17 10:35, Angelo Dureghello wrote:
> On 21/08/2017 09:15, Greg Ungerer wrote:
>> On 20/08/17 23:26, Angelo Dureghello wrote:
>>> On 20/08/2017 14:44, Greg Ungerer wrote:
>>>> On 18/08/17 01:02, Angelo Dureghello wrote:
>>>>> On 14/08/2017 06:16, Greg Ungerer wrote:
>>>>>> On 12/08/17 21:17, Angelo Dureghello wrote:
>>>>>>> On 10/08/2017 09:06, Greg Ungerer wrote:
>>>>>>>> On 10/08/17 01:32, Angelo Dureghello wrote:
>>>>>>>> [snip]
>>>>>>>>> sure, on this board http://sysam.it/cff_stmark2.html
>>>>>>>>> there are 128MB of ddr2.
>>>>>>>>>
>>>>>>>>> External SDRAM is accessible, at least without any mmc support enabled,
>>>>>>>>> from 0x40000000.
>>>>>>>>>
>>>>>>>>> I have following test config:
>>>>>>>>>
>>>>>>>>> GNU nano 2.8.6 File: arch/m68k/configs/stmark2_defconfig
>>>>>>>>>
>>>>>>>>> CONFIG_LOCALVERSION="stmark2-001"
>>>>>>>> [snip]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I tried still yesterday a bit, but seems there is not much support for
>>>>>>>>> earlyprintk / low level debug for this architecture.
>>>>>>>>>
>>>>>>>>> In case i can try with a gpio toggling routine, at least to find
>>>>>>>>> where kernel stops.
>>>>>>>>
>>>>>>>> The attached patch is a quick and dirty early console output method.
>>>>>>>> It works for me on the m5475, should work for you "as is" on the 5441x too.
>>>>>>>>
>>>>>>>> It is kind of an early printk. Of course it still needs the early
>>>>>>>> kernel boot to have succeeded before you will get anything much coming out.
>>>>>>>> But it is worth trying.
>>>>>>>
>>>>>>> Ok many thanks. Btw i used a __square(); function written in asm, so i am
>>>>>>> sure i see the gpio toggling in very early stages.
>>>>>>>
>>>>>>>>
>>>>>>>> I am wondering if the non-0 base RAM may be a problem. I have only run
>>>>>>>> the MMU enabled code on platforms with 0 based RAM so far. But let's see if
>>>>>>>> the early console trace attached gives us anything before digging into that.
>>>>>>>>
>>>>>>>
>>>>>>> This MCU has sdram area physically mapped at 0x4000 0000 so U-Boot, to be
>>>>>>> able to execute the kernel must load it to that location/area anyway.
>>>>>>>
>>>>>>> But i have seen that it is not a problem, after MMU is enabled in head.S
>>>>>>> the jump
>>>>>>>     movel #_vstart,%a0 /* jump to "virtual" space */
>>>>>>>     jmp %a0@
>>>>>>>
>>>>>>> works fine. Since that range is not hitting anything that is maintained
>>>>>>> physical, it can be translated into virtual without any issue.
>>>>>>
>>>>>> Yeah, it is not so much the initial start up that I think will
>>>>>> be the problem. More the setup of the MMU mapping tables later
>>>>>> in boot.
>>>>>>
>>>>>>
>>>>>>> After some hard debug, i see the execution stops at:
>>>>>>>
>>>>>>> asmlinkage __visible void __init start_kernel(void)
>>>>>>>   ...
>>>>>>>   setup_arch(&command_line); setup_mm.c
>>>>>>>   ...
>>>>>>>   paging_init(); mm/mcfmmu.c
>>>>>>>   ...
>>>>>>>   empty_zero_page = (void *) alloc_bootmem_pages(PAGE_SIZE);
>>>>>>>   ^line 47 mcfmmu.c
>>>>>>>
>>>>>>> Inside alloc_bootmem_pages(), execution seems to end up finally to
>>>>>>> mm/bootmem.c and likely to alloc_bootmem_bdata().
>>>>>>> In case i can still proceed to find the exact place where execution stops,
>>>>>>> but i suspect in the while(1), line 545.
>>>>>>>
>>>>>>> As a curious thing, i find in a different cf CPU code "m54xx.c"
>>>>>>> the following:
>>>>>>>
>>>>>>> void __init config_BSP(char *commandp, int size)
>>>>>>> {
>>>>>>> #ifdef CONFIG_MMU
>>>>>>>     cf_bootmem_alloc();
>>>>>>>     mmu_context_init();
>>>>>>> #endif
>>>>>>> Do also m5441x.c maybe need this calls ?
>>>>>>
>>>>>> Yes, you will need this. So that code above is only getting run when
>>>>>> configured for a 547x CPU family. Attached is a rework of that code
>>>>>> so that it will be run for all ColdFire MMU variants. Can you try
>>>>>> that out?
>>>>>>
>>>>>>
>>>>>>> Would be very nice to have MMU working. Strangely, i don't see any
>>>>>>> board_config with it enabled. Was it ever tested on some Coldfire ?
>>>>>>
>>>>>> Oh, yeah, I run this on a real M5475 EVB board for every kernel
>>>>>> mainline release, with and without MMU enabled. See the
>>>>>> arch/m68k/configs/m5475evb_defconfig, it will default to having
>>>>>> the MMU enabled.
>>>>>>
>>>>>> I have today's linux-4.13-rc5 running on it here now:
>>>>>>
>>>>>> # cat /proc/version
>>>>>> Linux version 4.13.0-rc5-00001-gb014090-dirty (gerg@goober) (gcc version 5.4.0 (GCC)) #1 Mon Aug 14 10:14:12 AEST 2017
>>>>>>
>>>>>> # cat /proc/cpuinfo
>>>>>> CPU: ColdFire
>>>>>> MMU: ColdFire
>>>>>> FPU: ColdFire
>>>>>> Clocking: 264.1MHz
>>>>>> BogoMips: 264.19
>>>>>> Calibration: 1320960 loops
>>>>>> #
>>>>>>
>>>>>> Regards
>>>>>> Greg
>>>>>
>>>>> Ok, i applied your patch, and still the kernel is hanging silently,
>>>>> so i started up a new debug session again.
>>>>>
>>>>> What is actually happening (after your patch has been applied) is:
>>>>>
>>>>> setup_arch() arch/m68k/kernel/setup_mm.c
>>>>>   paging_init()
>>>>>     memmap_init() mm/page_alloc.c
>>>>>       memmap_init_zone()
>>>>>         __init_single_page()
>>>>>           set_page_links() include/linux/mm.h
>>>>>             set_page_zone()
>>>>>               kernel hangs silently on this line
>>>>>               page->flags &= ~(ZONES_MASK << ZONES_PGSHIFT);
>>>>>
>>>>>>
>>>>>
>>
>> Can you run your current code with the console debug code I sent
>> a little while back?
>>
>> I ask because I suspect it should give something based on your debug
>> above. I played around a little trying to fake out my configuration
>> to make it look like the RAM was non-zero based. I couldn't get a fail,
>> but I would like to add some more debug to see what is going on with
>> the page pointers from your debug.
>>
>> Can you apply the attached patch and get any extra debug?
>>
>>
>>>>> I am wondering how mmu works, so at the moment mmu is enabled,
>>>>> in head.S, i would expect that code compiled for 0x40001000 would
>>>>> not run, since jumps would be translated to some different physical
>>>>> addresses, but execution still works.
>>>>> At the same time, after enabling mmu i would expect .data vars to be
>>>>> invalid, since their address would be translated to a different
>>>>> location, while not, the init values of .data variables are still
>>>>> valid. In case, i am interested to understand this points.
>>>>
>>>> On the ColdFire the kernel relies on all RAM and IO peripheral
>>>> addresses to "hit" the ACR registers - and essentially be passed
>>>> through as an identity physical = virtual mapping. If you look at
>>>> the operation of the memory address translation when virtual mode
>>>> is enabled (in the ColdFire MMU sections of the 5475 and 54411
>>>> reference manual) you will see that addresses are checked in order
>>>> to be for the MMUBAR, RAMBAR, ACR, then MMU.
>>>>
>>>> For example a kernel address when in supervisor mode will hit
>>>> ACR1 or ACR3 the way we set them up in arch/m68k/coldfire/head.S.
>>>> And that is why you see kernel code and data still being valid after
>>>> the MMU is enabled in virtual mode. No TLB entries required for this.
>>>>
>>>> Looking at your call sequence above I can see that the physical
>>>> RAM start address being non-zero is going to come into play. I'll
>>>> dig into this a little more tomorrow see if I can figure out what
>>>> is going on.
>>>>
>>>
>>> Thanks for the kind clarifications.
>>>
>>> I'll look in this things too in next days, learning is always nice.
>>> Btw, about load/entry address, i have noticed a possible basic
>>> difference between mcf5441x and mcf547x series:
>>>
>>> The second one (your cpu) is v4e and probably more recent i guess, and
>>> one major difference from datasheet seems to be that it is Harvard.
>>> So probably, for this reason, you can address ram from 0 there.
>>
>> IIRC the 5475 was the first ColdFire with MMU, it is pretty old. Pretty
>> sure the 54411 came later. Not sure what the thinking was on the different
>> default memory layout though.
>>
>
> Finally, cleaning out my debug lines, i found i removed an important line.
> So i am back to original "second" error we were trying to understand.
>
>
> So current more clear status is:
>
> U-Boot 2017.09-rc2-00151-g2d7cb5b426-dirty (Aug 22 2017 - 00:22:46 +0200)
>
> CPU: Freescale MCF54410 (Mask:9f Version:2)
> CPU CLK 240 MHz BUS CLK 120 MHz FLB CLK 60 MHz
> INP CLK 30 MHz VCO CLK 480 MHz
> SPI: ready
> DRAM: 128 MiB
> SF: Detected is25lp128 with page size 256 Bytes, erase size 64 KiB, total 16 MiB
> In: serial
> Out: serial
> Err: serial
> Hit any key to stop autoboot: 0
> SF: Detected is25lp128 with page size 256 Bytes, erase size 64 KiB, total 16 MiB
> device 0 offset 0x100000, size 0x1d9728
> SF: 1939240 bytes @ 0x100000 Read: OK
> ## Booting kernel from Legacy Image at 40001000 ...
> Image Name: mainline kernel
> Created: 2017-08-22 0:07:25 UTC
> Image Type: M68K Linux Kernel Image (uncompressed)
> Data Size: 1939176 Bytes = 1.8 MiB
> Load Address: 40001000
> Entry Point: 40001000
> Verifying Checksum ... OK
> Loading Kernel Image ... OK
> Linux version 4.12.0stmark2-001-11691-g571d81b2b55f-dirty (angelo@jerusalem) (gcc version 4.9.0 (crosstools-sysam-2016.04.16)) #182 Tue Aug 22 02:07:24 CEST 2017
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:6219 free_area_init_node+0x2f4/0x2fa
> CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0stmark2-001-11691-g571d81b2b55f-dirty #182
> Stack from 4017deec:
>         4017deec 4017b3dd 40007972 00000000 00000000 47d9f62c 00020000 00000000
>         00000000 4017df9c 40007a14 4016dd8e 0000184b 4019caca 00000009 00000000
>         00000000 4019caca 4016dd8e 0000184b 48000000 40204000 47d9f62c 40001000
>         00000000 47d9ef1c 40001480 4013010c 4012cd16 4017dfa8 4019ecc0 00012000
>         00002000 4019ccb4 00000000 4017df9c 00020000 00000000 4019a3f2 4017df9c
>         00000001 401da8c0 401da774 4019ebc8 00004000 00000000 00000000 4017dfc8
> Call Trace:
> [<40007972>] __warn+0xa4/0xc0
> [<40007a14>] warn_slowpath_null+0x1a/0x22
> [<4019caca>] free_area_init_node+0x2f4/0x2fa
> [<4019caca>] free_area_init_node+0x2f4/0x2fa
> [<40001000>] kernel_pg_dir+0x0/0x1000
> [<40001480>] kernel_pg_dir+0x480/0x1000
> [<4013010c>] memset+0x0/0x80
> [<4012cd16>] strlen+0x0/0x14
> [<4019ecc0>] __alloc_bootmem+0x16/0x3c
> [<4019ccb4>] free_area_init+0x20/0x26
> [<4019a3f2>] paging_init+0xee/0xfa
> [<4019ebc8>] free_bootmem_node+0x0/0x34
> [<40199fbc>] setup_arch+0xcc/0x16e
> [<40024eb2>] printk+0x0/0x18
> [<4019ecaa>] __alloc_bootmem+0x0/0x3c
> [<40198550>] start_kernel+0x68/0x3ae
> [<40001000>] kernel_pg_dir+0x0/0x1000
> [<400020f2>] _exit+0x0/0x6
>
> ---[ end trace 0000000000000000 ]---
> On node 0 totalpages: 16384
> free_area_init_node: node 0, pgdat 401da8c0, node_mem_map a8c0401d
> DMA zone: 72 pages used for memmap
> DMA zone: 0 pages reserved
> DMA zone: 16384 pages, LIFO batch:3
> /page_alloc.c(1171): page=a8c0401d pfn=131072

Another patch attached that digs a little deeper into why
that page pointer ends up being invalid. If you could run with
this and send the output that would be great.

Regards
Greg
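
As a cross-check on the numbers in the trace above, here is a minimal sketch of the arithmetic behind them. It assumes 8 KiB pages (a page shift of 13, which is what the trace values imply) and the 0x40000000 RAM base mentioned earlier in the thread. All names below are made up for illustration and are not kernel APIs. Under those assumptions pfn=131072 and totalpages 16384 both look consistent with 128 MiB of RAM starting at 0x40000000, so the pfn itself appears sane and only the page pointer is off. The last check just records that the bad pointer a8c0401d equals the pgdat address 401da8c0 with its two 16-bit halves swapped; whether that is a coincidence or a hint is not something this sketch can tell.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8 KiB pages, i.e. a page shift of 13 (inferred from the trace). */
#define SKETCH_PAGE_SHIFT 13u

/* Hypothetical helper: page frame number of a physical address. */
uint32_t sketch_phys_to_pfn(uint32_t phys)
{
    return phys >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical helper: number of whole pages in a region of the given size. */
uint32_t sketch_bytes_to_pages(uint32_t bytes)
{
    return bytes >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical helper: swap the two 16-bit halves of a 32-bit word. */
uint32_t sketch_swap_halves(uint32_t w)
{
    return (w << 16) | (w >> 16);
}

/* Compare the computed values against the ones printed in the boot log. */
void sketch_check_trace_numbers(void)
{
    const uint32_t ram_base = 0x40000000u;          /* SDRAM base on this board */
    const uint32_t ram_size = 128u * 1024u * 1024u; /* "DRAM: 128 MiB" */

    assert(sketch_phys_to_pfn(ram_base) == 131072u);   /* "pfn=131072" */
    assert(sketch_bytes_to_pages(ram_size) == 16384u); /* "totalpages: 16384" */

    /* node_mem_map a8c0401d vs. pgdat 401da8c0: same halves, swapped. */
    assert(sketch_swap_halves(0x401da8c0u) == 0xa8c0401du);
}
```

Calling sketch_check_trace_numbers() from a small test program runs all three checks; none of them should fire if the assumptions above hold.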