Arm64 Crashkernel doesn't work with FLATMEM anymore

From: Saeed Karimabadi (skarimab) @ 2019-12-17  0:02 UTC
  To: Catalin Marinas, Will Deacon
  Cc: linux-arm-kernel, xe-linux-external(mailer list)

Hello Kernel Maintainers,

The crash dump kernel has not worked with the FLATMEM memory model since version 4.11.0-rc3:
it panics at boot time with a kernel paging request fault. The crash happens while the kernel is
initializing the memmap zones inside the memmap_init_zone() function. The FLATMEM memory model
is very useful for systems with limited memory resources, where it is desirable to reserve as little
memory as possible for the crash kernel.
Is this a known issue, and is there a patch that fixes it?

-- Crash Dump Kernel starts here--
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd070]
[    0.000000] Linux version 5.5.0-rc1 (user@host) (gcc version 4.7.0 (GCC)) #163 SMP PREEMPT Tue Dec 10 11:12:37 PST 2019
[    0.000000] Machine model: linux,dummy-virt
[    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
[    0.000000] printk: bootconsole [pl11] enabled
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: UEFI not found.
[    0.000000] Reserving 1KB of memory at 0xbfdff000 for elfcorehdr
[    0.000000] On node 0 totalpages: 8192
[    0.000000]   DMA zone: 128 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 8192 pages, LIFO batch:0
[    0.000000] Unable to handle kernel paging request at virtual address ffffff8040ccf0b8
[    0.000000] Mem abort info:
[    0.000000]   ESR = 0x96000045
[    0.000000]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.000000]   SET = 0, FnV = 0
[    0.000000]   EA = 0, S1PTW = 0
[    0.000000] Data abort info:
[    0.000000]   ISV = 0, ISS = 0x00000045
[    0.000000]   CM = 0, WnR = 1
[    0.000000] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000bf068000
[    0.000000] [ffffff8040ccf0b8] pgd=0000000000000000, pud=0000000000000000
[    0.000000] Internal error: Oops: 96000045 [#1] PREEMPT SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc1 #163
[    0.000000] Hardware name: linux,dummy-virt (DT)
[    0.000000] pstate: 20000085 (nzCv daIf -PAN -UAO)
[    0.000000] pc : memmap_init_zone+0x68/0xe0
[    0.000000] lr : memmap_init+0x14/0x1c
[    0.000000] sp : ffffffc011773d60
[    0.000000] x29: ffffffc011773d60 x28: ffffffc01131e000
[    0.000000] x27: ffffffc011949680 x26: ffffffc011949680
[    0.000000] x25: 0000000000000000 x24: ffffffc011999000
[    0.000000] x23: 0000000000001000 x22: ffffffc0111db000
[    0.000000] x21: 0000000000000001 x20: 00000000ffffffff
[    0.000000] x19: 00000000000bfe00 x18: 000000001fbad8f6
[    0.000000] x17: fffffffefe695030 x16: ffffffc010c5f000
[    0.000000] x15: 0000000000000002 x14: ffffffffffffffff
[    0.000000] x13: 0000000000000000 x12: 0000000000000640
[    0.000000] x11: 00000000bfdff400 x10: 00000000000bde00
[    0.000000] x9 : 0000000000000078 x8 : ffffff803fdfffc8
[    0.000000] x7 : 0000000000000000 x6 : 00000000bfdfffc0
[    0.000000] x5 : ffffff803fd57080 x4 : 0000000000000000
[    0.000000] x3 : 00000000000bde00 x2 : 0000000000000000
[    0.000000] x1 : 0000000000f78000 x0 : ffffff8040ccf080
[    0.000000] Call trace:
[    0.000000]  memmap_init_zone+0x68/0xe0
[    0.000000]  memmap_init+0x14/0x1c
[    0.000000]  free_area_init_node+0x39c/0x3ec
[    0.000000]  bootmem_init+0x158/0x174
[    0.000000]  setup_arch+0x290/0x64c
[    0.000000]  start_kernel+0x5c/0x480
[    0.000000] Code: f945c705 cb813061 d37ae421 8b0100a0 (f9001c1f)
[    0.000000] random: get_random_bytes called from init_oops_id+0x3c/0x48 with crng_init=0
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

Kernel Config:
- Running on QEMU, default arm64 config
- 39-bit virtual addresses
- NUMA is disabled
- FLATMEM as the memory model

FLATMEM has been broken since the following two commits:
commit 8f579b1c4e347b23bfa747bc2cc0a55dd1b7e5fa ("arm64: limit memory regions based on DT property, usable-memory-range")
commit a7f8de168ace487fa7b88cb154e413cf40e87fc6 ("arm64: allow kernel Image to be loaded anywhere in physical memory")

The first patch limits the memory available to the crash kernel to the range passed from the main kernel via the device tree.

Breakpoint 1, fdt_enforce_memory_region () at arch/arm64/mm/init.c:213
220                     memblock_cap_memory_range(reg.base, reg.size);
(gdb) p/x reg
$2 = {base = 0xbde00000, size = 0x2000000, flags = 0x0}
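
For reference, the region printed by gdb can be converted into a page frame number (pfn) range. The following is a minimal user-space sketch, not kernel code: it assumes the 4 KiB page size reported in the oops ("4k pages"), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages, per the "4k pages" line in the oops */

/* Hypothetical helper: first page frame number of a physical region. */
static inline uint64_t region_start_pfn(uint64_t base)
{
    return base >> PAGE_SHIFT;
}

/* Hypothetical helper: one past the last page frame number of the region. */
static inline uint64_t region_end_pfn(uint64_t base, uint64_t size)
{
    return (base + size) >> PAGE_SHIFT;
}

/* Hypothetical helper: number of pages the region contains. */
static inline uint64_t region_nr_pages(uint64_t size)
{
    return size >> PAGE_SHIFT;
}
```

With base = 0xbde00000 and size = 0x2000000, this gives a start pfn of 0xbde00, an end pfn of 0xbfe00 and 8192 pages, which agrees with "totalpages: 8192" in the boot log and with the pfn value shown by gdb further down.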

Later on, arm64_memblock_init() aligns the base address to 1GB, rounding it down to 0x80000000 (= memstart_addr):

arm64_memblock_init () at arch/arm64/mm/init.c:343
240             memstart_addr = round_down(memblock_start_of_DRAM(),
241                                        ARM64_MEMSTART_ALIGN);
(gdb) p/x memstart_addr
$6 = 0x80000000
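
The rounding step can be reproduced outside the kernel. This is a rough sketch under the assumption that the alignment is 1 GiB, as described above; MEMSTART_ALIGN and both helper names are stand-ins chosen for illustration, not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1 GiB alignment, as described in the text above. */
#define MEMSTART_ALIGN (UINT64_C(1) << 30)

/* Round x down to a multiple of align; align must be a power of two. */
static inline uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
    return x & ~(align - 1);
}

/* Gap in bytes between the aligned base and the real start of memory. */
static inline uint64_t memstart_gap(uint64_t dram_start)
{
    return dram_start - round_down_pow2(dram_start, MEMSTART_ALIGN);
}
```

For a start of DRAM at 0xbde00000, round_down_pow2() returns 0x80000000 and memstart_gap() returns 0x3de00000, so the aligned base sits 990 MiB below the first byte of usable memory.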

The crash happens inside mm/page_alloc.c:memmap_init_zone() when the kernel tries to initialize the first pfn of ZONE_DMA. The code
computes a wrong struct page pointer, which points beyond the end of the available memory.

Breakpoint 3 at 0xffffff8008d463f0: file mm/page_alloc.c, line 5196.
<-- Snip -->
5276                            struct page *page = pfn_to_page(pfn);
(gdb) p/x pfn
$14 = 0xbde00
(gdb) p/x page
$16 = 0xffffffc040cd5780
(gdb) p *page
Cannot access memory at address 0xffffffc040cd5780

For the FLATMEM model, pfn_to_page() is defined as:
#define __pfn_to_page(pfn)       (mem_map + ((pfn) - ARCH_PFN_OFFSET))
(gdb) p/x mem_map
$17 = 0xffffffc03fd5d780
(gdb) x 0xffffffc040cd5780
0xffffffc040cd5780:     Cannot access memory at address 0xffffffc040cd5780

It looks like, when the kernel start address is not 1GB aligned, the ((pfn) - ARCH_PFN_OFFSET) part of the pfn_to_page() macro
expansion can create a huge offset from the base address of mem_map, which causes the calculated page address to point to a
location outside the available memory boundaries.
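
The arithmetic can be checked against the gdb values above with a small user-space sketch. It rests on assumptions that are not stated in the output itself: ARCH_PFN_OFFSET is taken to be memstart_addr >> PAGE_SHIFT, pages are 4 KiB, and sizeof(struct page) is taken to be 64 bytes, a value inferred from the distance between the two addresses printed by gdb. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT       12 /* assumption: 4 KiB pages */
#define STRUCT_PAGE_SIZE 64 /* assumption: sizeof(struct page), inferred from the gdb addresses */

/* Byte offset from mem_map at which the FLATMEM pfn_to_page() lands. */
static inline uint64_t flatmem_page_offset(uint64_t pfn, uint64_t arch_pfn_offset)
{
    return (pfn - arch_pfn_offset) * STRUCT_PAGE_SIZE;
}

/* Address that pfn_to_page() would compute for a given mem_map base. */
static inline uint64_t flatmem_page_addr(uint64_t mem_map, uint64_t pfn,
                                         uint64_t arch_pfn_offset)
{
    return mem_map + flatmem_page_offset(pfn, arch_pfn_offset);
}

/* Size in bytes of a mem_map array that covers nr_pages pages. */
static inline uint64_t memmap_bytes(uint64_t nr_pages)
{
    return nr_pages * STRUCT_PAGE_SIZE;
}
```

With pfn = 0xbde00 and an offset of 0x80000000 >> PAGE_SHIFT = 0x80000, flatmem_page_offset() yields 0xf78000 bytes, and adding that to mem_map = 0xffffffc03fd5d780 reproduces the faulting pointer 0xffffffc040cd5780. A mem_map sized for the 8192 pages of the crash kernel is only 0x80000 bytes (the 128 pages reported in the boot log), so under these assumptions the computed pointer lands roughly 15 MiB past the start of an array that is 512 KiB long.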

Regards,
Saeed Karimabadi
