linux-arm-kernel.lists.infradead.org archive mirror
* Arm64 Crashkernel doesn't work with FLATMEM anymore
@ 2019-12-17  0:02 Saeed Karimabadi (skarimab)
  2019-12-20 19:52 ` James Morse
  0 siblings, 1 reply; 5+ messages in thread
From: Saeed Karimabadi (skarimab) @ 2019-12-17  0:02 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon
  Cc: linux-arm-kernel, xe-linux-external(mailer list)

Hello Kernel Maintainers,

The crash dump kernel has not worked with the FLATMEM memory model since version 4.11.0-rc3:
it panics at boot time with a paging request exception. The crash happens while the kernel is
initializing the memmap zones inside memmap_init_zone(). The FLATMEM memory model is very useful
for systems with limited memory resources, where it is desirable to reserve as little memory as
possible for the crash kernel.
Is this a known issue, and is there a patch to fix it?

-- Crash Dump Kernel starts here--
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd070]
[    0.000000] Linux version 5.5.0-rc1 (user@host) (gcc version 4.7.0 (GCC)) #163 SMP PREEMPT Tue Dec 10 11:12:37 PST 2019
[    0.000000] Machine model: linux,dummy-virt
[    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
[    0.000000] printk: bootconsole [pl11] enabled
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: UEFI not found.
[    0.000000] Reserving 1KB of memory at 0xbfdff000 for elfcorehdr
[    0.000000] On node 0 totalpages: 8192
[    0.000000]   DMA zone: 128 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 8192 pages, LIFO batch:0
[    0.000000] Unable to handle kernel paging request at virtual address ffffff8040ccf0b8
[    0.000000] Mem abort info:
[    0.000000]   ESR = 0x96000045
[    0.000000]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.000000]   SET = 0, FnV = 0
[    0.000000]   EA = 0, S1PTW = 0
[    0.000000] Data abort info:
[    0.000000]   ISV = 0, ISS = 0x00000045
[    0.000000]   CM = 0, WnR = 1
[    0.000000] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000bf068000
[    0.000000] [ffffff8040ccf0b8] pgd=0000000000000000, pud=0000000000000000
[    0.000000] Internal error: Oops: 96000045 [#1] PREEMPT SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc1 #163
[    0.000000] Hardware name: linux,dummy-virt (DT)
[    0.000000] pstate: 20000085 (nzCv daIf -PAN -UAO)
[    0.000000] pc : memmap_init_zone+0x68/0xe0
[    0.000000] lr : memmap_init+0x14/0x1c
[    0.000000] sp : ffffffc011773d60
[    0.000000] x29: ffffffc011773d60 x28: ffffffc01131e000
[    0.000000] x27: ffffffc011949680 x26: ffffffc011949680
[    0.000000] x25: 0000000000000000 x24: ffffffc011999000
[    0.000000] x23: 0000000000001000 x22: ffffffc0111db000
[    0.000000] x21: 0000000000000001 x20: 00000000ffffffff
[    0.000000] x19: 00000000000bfe00 x18: 000000001fbad8f6
[    0.000000] x17: fffffffefe695030 x16: ffffffc010c5f000
[    0.000000] x15: 0000000000000002 x14: ffffffffffffffff
[    0.000000] x13: 0000000000000000 x12: 0000000000000640
[    0.000000] x11: 00000000bfdff400 x10: 00000000000bde00
[    0.000000] x9 : 0000000000000078 x8 : ffffff803fdfffc8
[    0.000000] x7 : 0000000000000000 x6 : 00000000bfdfffc0
[    0.000000] x5 : ffffff803fd57080 x4 : 0000000000000000
[    0.000000] x3 : 00000000000bde00 x2 : 0000000000000000
[    0.000000] x1 : 0000000000f78000 x0 : ffffff8040ccf080
[    0.000000] Call trace:
[    0.000000]  memmap_init_zone+0x68/0xe0
[    0.000000]  memmap_init+0x14/0x1c
[    0.000000]  free_area_init_node+0x39c/0x3ec
[    0.000000]  bootmem_init+0x158/0x174
[    0.000000]  setup_arch+0x290/0x64c
[    0.000000]  start_kernel+0x5c/0x480
[    0.000000] Code: f945c705 cb813061 d37ae421 8b0100a0 (f9001c1f)
[    0.000000] random: get_random_bytes called from init_oops_id+0x3c/0x48 with crng_init=0
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
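The "Mem abort info" lines in the log above can be cross-checked by decoding the ESR value by hand. The following is a minimal user-space sketch, not kernel code; the struct and function names are made up for illustration, and the bit positions are those of the Armv8-A ESR_ELx data abort encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the fields printed in the oops. */
struct esr_fields {
	unsigned int ec;   /* exception class, bits [31:26] */
	unsigned int il;   /* instruction length bit, bit 25 */
	unsigned int iss;  /* instruction specific syndrome, bits [24:0] */
	unsigned int wnr;  /* write-not-read, bit 6 of the data abort ISS */
	unsigned int dfsc; /* data fault status code, bits [5:0] */
};

static struct esr_fields decode_esr(uint32_t esr)
{
	struct esr_fields f;

	f.ec   = (esr >> 26) & 0x3f;
	f.il   = (esr >> 25) & 0x1;
	f.iss  = esr & 0x1ffffff;
	f.wnr  = (esr >> 6) & 0x1;
	f.dfsc = esr & 0x3f;
	return f;
}
```

For ESR = 0x96000045 this gives EC = 0x25, IL = 1, ISS = 0x45 and WnR = 1, matching the log. The low six bits are 0x05, which is the encoding for a level 1 translation fault; that agrees with the empty pgd/pud entries printed for the faulting address, i.e. a write to an address that is simply not mapped.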

Kernel Config:
- Running under QEMU with the default arm64 config
- 39-bit VA addresses
- NUMA is disabled
- FLATMEM as memory model

FLATMEM became broken after submission of these two patches: 
commit 8f579b1c4e347b23bfa747bc2cc0a55dd1b7e5fa      arm64: limit memory regions based on DT property, usable-memory-range
commit a7f8de168ace487fa7b88cb154e413cf40e87fc6       arm64: allow kernel Image to be loaded anywhere in physical memory

The first patch limits the available kernel memory to what has been passed to crash kernel from the main kernel via device tree. 

Breakpoint 1, fdt_enforce_memory_region () at arch/arm64/mm/init.c:213
220                     memblock_cap_memory_range(reg.base, reg.size);
(gdb) p/x reg
$2 = {base = 0xbde00000, size = 0x2000000, flags = 0x0}

Later on, arm64_memblock_init() aligns the base address to 1GB, rounding it down to 0x80000000 (= memstart_addr):

arm64_memblock_init () at arch/arm64/mm/init.c:343
240             memstart_addr = round_down(memblock_start_of_DRAM(),
241                                        ARM64_MEMSTART_ALIGN);
 (gdb) p/x memstart_addr
$6 = 0x80000000
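The rounding step can be reproduced in isolation. This is a user-space sketch with a hypothetical helper name, assuming a 1GB alignment as described above; it is not the kernel's round_down() macro itself.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align; align must be a power of two. */
static uint64_t round_down_pow2(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}
```

With the base 0xbde00000 from the gdb output and a 1GB (1 << 30) alignment, round_down_pow2() returns 0x80000000, so memstart_addr ends up 0x3de00000 bytes below the first usable byte of the crash kernel's memory.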

The crash happens inside mm/page_alloc.c:memmap_init_zone() when the kernel tries to initialize the first pfn of ZONE_DMA. The code
calculates a wrong struct page pointer, which points beyond the end of the available memory.

Breakpoint 3 at 0xffffff8008d463f0: file mm/page_alloc.c, line 5196.
<-- Snip -->
5276                            struct page *page = pfn_to_page(pfn);
(gdb) p/x pfn
$14 = 0xbde00
(gdb) p/x page
$16 = 0xffffffc040cd5780
(gdb) p *page
Cannot access memory at address 0xffffffc040cd5780

For the FLATMEM model, pfn_to_page is defined as:
#define __pfn_to_page(pfn)       (mem_map + ((pfn) - ARCH_PFN_OFFSET))
 (gdb) p/x mem_map
$17 = 0xffffffc03fd5d780
 (gdb) x 0xffffffc040cd5780
0xffffffc040cd5780:     Cannot access memory at address 0xffffffc040cd5780

It looks like, in the expansion of the pfn_to_page() macro, if the kernel start address is not 1GB aligned, the ((pfn) - ARCH_PFN_OFFSET) part
of the macro can create a huge offset from the base address of mem_map, which causes the calculated page address to point to a location
outside the available memory boundaries.
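The numbers from the gdb session can be checked with a small user-space sketch. It assumes 4K pages and a 64-byte struct page; the latter is not stated directly but is consistent with the log ("128 pages used for memmap" for 8192 pages). The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K    12
#define STRUCT_PAGE_SIZE 64 /* assumption: sizeof(struct page) on this build */

/* Address that FLATMEM's pfn_to_page() would compute, done on plain integers. */
static uint64_t flatmem_pfn_to_page_addr(uint64_t mem_map, uint64_t pfn,
					 uint64_t memstart_addr)
{
	uint64_t arch_pfn_offset = memstart_addr >> PAGE_SHIFT_4K;

	assert(pfn >= arch_pfn_offset);
	return mem_map + (pfn - arch_pfn_offset) * STRUCT_PAGE_SIZE;
}
```

With mem_map = 0xffffffc03fd5d780, pfn = 0xbde00 and memstart_addr = 0x80000000, the index is 0x3de00 and the byte offset is 0xf78000 (the value visible in x1 in the register dump), giving 0xffffffc040cd5780, the address gdb could not access. The mem_map array only has 8192 entries, so index 0x3de00 is far past its end.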

Regards,
Saeed Karimabadi


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: Arm64 Crashkernel doesn't work with FLATMEM anymore
  2019-12-17  0:02 Arm64 Crashkernel doesn't work with FLATMEM anymore Saeed Karimabadi (skarimab)
@ 2019-12-20 19:52 ` James Morse
  2019-12-23 22:24   ` Saeed Karimabadi (skarimab)
  0 siblings, 1 reply; 5+ messages in thread
From: James Morse @ 2019-12-20 19:52 UTC (permalink / raw)
  To: Saeed Karimabadi (skarimab), Catalin Marinas
  Cc: Ard Biesheuvel, Will Deacon, linux-arm-kernel,
	xe-linux-external(mailer list)

Hi Saeed,

Thanks for the bug report,

(CC: +Ard, KASLR+FLATMEM?)

On 17/12/2019 00:02, Saeed Karimabadi (skarimab) wrote:
> Crash dump  Kernel doesn't work with FLATMEM memory model since version 4.11.0-rc3 and it

v4.11? FLATMEM wasn't enabled until e7d4bac428edb in v4.19!

Kdump support wasn't added until v4.12. Catalin's pull request here:
http://lkml.iu.edu/hypermail/linux/kernel/1705.0/03077.html


You can't use a kernel that doesn't know about kdump as the kdump kernel. It must
understand the elfcorehdr and usable-memory-range DT properties, otherwise it can't know
not to trample on all of memory.


> will panic at boot time with a page request exception. The crash happens while kernel is initializing
> the memmap zones inside memmap_init_zone function. FLATMEM memory model is very useful
> for systems with limited memory resources where it is desirable to reserve as minimum as possible
> memory for the crash kernel. 

(I'd love to know how FLATMEM affects this... but we can save that for later)


> I'm wondering if this is a known issue or there is a patch to fix it?

No, I think this is new,


> -- Crash Dump Kernel starts here--
> [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd070]
> [    0.000000] Linux version 5.5.0-rc1 (user@host) (gcc version 4.7.0 (GCC)) #163 SMP PREEMPT Tue Dec 10 11:12:37 PST 2019

gcc 4.7!

> [    0.000000] Machine model: linux,dummy-virt
> [    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
> [    0.000000] printk: bootconsole [pl11] enabled

> [    0.000000] efi: Getting EFI parameters from FDT:
> [    0.000000] efi: UEFI not found.

Hmmm,


> [    0.000000] Reserving 1KB of memory at 0xbfdff000 for elfcorehdr
> [    0.000000] On node 0 totalpages: 8192
> [    0.000000]   DMA zone: 128 pages used for memmap
> [    0.000000]   DMA zone: 0 pages reserved
> [    0.000000]   DMA zone: 8192 pages, LIFO batch:0
> [    0.000000] Unable to handle kernel paging request at virtual address ffffff8040ccf0b8
> [    0.000000] Mem abort info:
> [    0.000000]   ESR = 0x96000045
> [    0.000000]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    0.000000]   SET = 0, FnV = 0
> [    0.000000]   EA = 0, S1PTW = 0
> [    0.000000] Data abort info:
> [    0.000000]   ISV = 0, ISS = 0x00000045
> [    0.000000]   CM = 0, WnR = 1
> [    0.000000] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000bf068000
> [    0.000000] [ffffff8040ccf0b8] pgd=0000000000000000, pud=0000000000000000
> [    0.000000] Internal error: Oops: 96000045 [#1] PREEMPT SMP
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc1 #163
> [    0.000000] Hardware name: linux,dummy-virt (DT)
> [    0.000000] pstate: 20000085 (nzCv daIf -PAN -UAO)
> [    0.000000] pc : memmap_init_zone+0x68/0xe0
> [    0.000000] lr : memmap_init+0x14/0x1c

> [    0.000000] Call trace:
> [    0.000000]  memmap_init_zone+0x68/0xe0
> [    0.000000]  memmap_init+0x14/0x1c
> [    0.000000]  free_area_init_node+0x39c/0x3ec
> [    0.000000]  bootmem_init+0x158/0x174
> [    0.000000]  setup_arch+0x290/0x64c
> [    0.000000]  start_kernel+0x5c/0x480
> [    0.000000] Code: f945c705 cb813061 d37ae421 8b0100a0 (f9001c1f)
> [    0.000000] random: get_random_bytes called from init_oops_id+0x3c/0x48 with crng_init=0
> [    0.000000] ---[ end trace 0000000000000000 ]---
> [    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> [    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

I've managed to reproduce something like this. In my case it's trying to zero a bogus
struct page.


> FLATMEM became broken after submission of these two patches: 
> commit 8f579b1c4e347b23bfa747bc2cc0a55dd1b7e5fa      arm64: limit memory regions based on DT property, usable-memory-range
> commit a7f8de168ace487fa7b88cb154e413cf40e87fc6       arm64: allow kernel Image to be loaded anywhere in physical memory

Those commits are in v4.12 and v4.6 respectively.
FLATMEM wasn't enabled until e7d4bac428edb in v4.19.

By 'after', you mean 'because of'?
Given the order of events here, 'FLATMEM has never worked for kdump' seems a fair summary.


[..]

> The crash happens inside mm/page_alloc.c:memmap_init_zone() when kernel tries to initialize the first pfn of ZONE_DMA. The code
>  would calculate a wrong page structure pointer which is pointing beyond the end address of available memory.

> Breakpoint 3 at 0xffffff8008d463f0: file mm/page_alloc.c, line 5196.
> <-- Snip -->

> for FLATMEM model pfn_to_page is defined as:
> #define __pfn_to_page(pfn)       (mem_map + ((pfn) - ARCH_PFN_OFFSET))
>  (gdb) p/x mem_map
> $17 = 0xffffffc03fd5d780
>  (gdb) x 0xffffffc040cd5780
> 0xffffffc040cd5780:     Cannot access memory at address 0xffffffc040cd5780
> 
> It looks like in expansion of the pfn_to_page() macro, if the kernel start address is not 1GB
> aligned, this part of macro ((pfn)-ARCH_PFN_OFFSET) can create a huge offset from the base address
> of mem_map which will cause the calculated page address to point a location outside of the 
> available memory boundaries.

The huge offset is the cause of the problem here. ARCH_PFN_OFFSET comes from memstart_addr.

We use memstart_addr for shifting memory's physical addresses to their offset in the
linear map's range. Otherwise, if memory started at 0x8000000000, we'd always lose a chunk
of address space because of this.

CONFIG_RANDOMIZE_BASE tinkers with this to randomise the placement of memory in the linear
map's range.

Catalin found disabling CONFIG_RANDOMIZE_BASE solved the issue for him. (evidently
kexec-tools is passing a random seed for kdump).

Do you have this option enabled in your kdump kernel?


The values are getting unbalanced because of FLATMEM's __page_to_pfn(). In particular
index-0 in the mem_map array causes it to return ARCH_PFN_OFFSET which leads to
memstart_addr, which is a value that may have been modified by CONFIG_RANDOMIZE_BASE.

FLATMEM's __page_to_pfn() can ignore KASLR because its page and mem_map both exist in the
randomised linear map.
Instead it wants to know memblock_start_of_DRAM() so the first DRAM page is index zero in
the array.

I think ARCH_PFN_OFFSET's meaning is different for FLATMEM.
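To make the difference concrete, here is a user-space sketch that computes the mem_map index for the same pfn under the two candidate bases. It uses the values from Saeed's report (not from my reproduction), assumes 4K pages, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12

/* FLATMEM-style index into mem_map for a pfn, given the physical base used as offset. */
static uint64_t flatmem_index(uint64_t pfn, uint64_t base_phys)
{
	uint64_t pfn_offset = base_phys >> PAGE_SHIFT_4K;

	assert(pfn >= pfn_offset);
	return pfn - pfn_offset;
}
```

For pfn 0xbde00, using memstart_addr (0x80000000) as the base yields index 0x3de00, while using the start of the capped memory (0xbde00000) yields index 0, which is what an 8192-entry mem_map needs for its first page.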

Ugly hack[0] works for me. With this page_to_pfn() and pfn_to_page() seem to be producing
better results.


But! There are bigger problems here. memstart_addr starts out as memblock's idea of the
base of DRAM after kdump's usable-memory-range restrictions have been applied.

memblock_cap_memory_range() won't remove nomap blocks. We need to remember these are
memory, and they are nomap. Drivers depend on this when they want to use some exotic
memory-attributes later on. (is it memory? yes, do we have it mapped with conflicting
attributes? no)

These nomap blocks do influence memblock's idea of the base of DRAM meaning you can get a
large hole in the flatmem mem_map...

For kdump on Seattle, I see:
| memblock_cap_memory_range(0x80bfe00000, 0x40000000)

but
| memblock_start_of_DRAM == 0x8000000000

which is well below the first page.
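A rough estimate of what that hole costs in a flat mem_map, as a user-space sketch. It assumes 4K pages and a 64-byte struct page, neither of which is confirmed for this Seattle setup, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K    12
#define STRUCT_PAGE_SIZE 64 /* assumption: sizeof(struct page) */

/* Bytes of mem_map spent describing the gap between dram_start and first_usable. */
static uint64_t flatmem_hole_bytes(uint64_t dram_start, uint64_t first_usable)
{
	assert(first_usable >= dram_start);
	return ((first_usable - dram_start) >> PAGE_SHIFT_4K) * STRUCT_PAGE_SIZE;
}
```

For dram_start = 0x8000000000 and first_usable = 0x80bfe00000 the gap is 0xbfe00 pages, which under these assumptions is 0x2ff8000 bytes (about 48MB) of struct pages for memory the kdump kernel cannot use. The 1GB usable range itself would need only 16MB.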

Because of these nomap memblocks, I don't think kdump is isolated enough from the system's
memory map for the flatmem illusion to hold just because it's kdump. You still need to
access firmware tables that describe the system, as well as any memory that was reserved
with mechanisms like this. This exposes you to the platform's not-really-flatmem memory
layout.

I think the real fix here is to remove FLATMEM.


Thanks,

James


[0]
--------------------%<--------------------
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index a4f9ca5479b0..bebeca58eda6 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -172,6 +172,7 @@ extern u64                  vabits_actual;
 #include <linux/bitops.h>
 #include <linux/mmdebug.h>

+extern u64                     arm64_memblock_start;
 extern s64                     physvirt_offset;
 extern s64                     memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
@@ -307,7 +308,11 @@ static inline void *phys_to_virt(phys_addr_t x)
  *  virt_to_page(x)    convert a _valid_ virtual address to struct page *
  *  virt_addr_valid(x) indicates whether a virtual address is valid
  */
+#ifdef CONFIG_FLATMEM
+#define ARCH_PFN_OFFSET                ((unsigned long)arm64_memblock_start >> PAGE_SHIFT)
+#else
 #define ARCH_PFN_OFFSET                ((unsigned long)PHYS_PFN_OFFSET)
+#endif /* CONFIG_FLATMEM */

 #if !defined(CONFIG_SPARSEMEM_VMEMMAP) || defined(CONFIG_DEBUG_VIRTUAL)
 #define virt_to_page(x)                pfn_to_page(virt_to_pfn(x))
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b65dffdfb201..8e29ca9cc9ed 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -44,6 +44,9 @@

 #define ARM64_ZONE_DMA_BITS    30

+u64 arm64_memblock_start;
+EXPORT_SYMBOL(arm64_memblock_start);
+
 /*
  * We need to be able to catch inadvertent references to memstart_addr
  * that occur (potentially in generic code) before arm64_memblock_init()
@@ -427,6 +430,8 @@ void __init arm64_memblock_init(void)
                }
        }

+       arm64_memblock_start = memblock_start_of_DRAM();
+
        /*
         * Register the kernel text, kernel data, initrd, and initial
         * pagetables with memblock.
--------------------%<--------------------


* RE: Arm64 Crashkernel doesn't work with FLATMEM anymore
  2019-12-20 19:52 ` James Morse
@ 2019-12-23 22:24   ` Saeed Karimabadi (skarimab)
  2020-01-02 17:42     ` Catalin Marinas
  0 siblings, 1 reply; 5+ messages in thread
From: Saeed Karimabadi (skarimab) @ 2019-12-23 22:24 UTC (permalink / raw)
  To: James Morse, Catalin Marinas
  Cc: Ard Biesheuvel, Will Deacon, linux-arm-kernel,
	xe-linux-external(mailer list)

Hi James,

Thank you for your detailed analysis, please see my comments inline.

On 20/12/2019 11:52 AM  James Morse <james.morse> wrote:
> On 17/12/2019 00:02, Saeed Karimabadi (skarimab) wrote:
> > Crash dump  Kernel doesn't work with FLATMEM memory model since version 4.11.0-rc3 and it
> v4.11? FLATMEM wasn't enabled until e7d4bac428edb in v4.19!
> Kdump support wasn't added until v4.12. Catalin's pull request here:
> http://lkml.iu.edu/hypermail/linux/kernel/1705.0/03077.html
> 
From the upstream point of view you are right: flatmem was not enabled publicly until e7d4bac428edb
in v4.19. Historically, though, e7d4bac428edb was written by one of my colleagues at Cisco, and we have had
that patch and flatmem enabled in our local repositories since kernel v4.4 (plus some of the necessary kdump
patches from Catalin's pull request, cherry-picked from v4.12 to v4.4). With this combination, the crash kernel
worked fine: it collected the core file, and we were able to decode the core file with gdb.
Regardless of our custom configuration, as you mentioned, this flatmem issue should be observable by the open source
community since version 4.19.

> > FLATMEM memory model is very useful for systems with limited memory resources where it is 
> > desirable to reserve as minimum as possible memory for the crash kernel.
> (I'd love to know how FLATMEM affects this... but we can save that for later)
 
One of the main reasons we started using flatmem was to save memory on resource-constrained platforms; we could save
up to ~14 MB by switching to flatmem. On a board with an arm64 processor and 2GB of RAM, the crash kernel with FLATMEM only
needs 32MB of reserved memory to collect the core, while with SPARSEMEM the kernel on the same hardware won't boot with
32MB of reserved memory and we have to increase it to 64MB to get it working. (The actual number is ~42MB, but we then
need some extra memory for user space, e.g. makedumpfile.) I'm not sure, but SPARSEMEM may need more memory to construct
and keep its data structures.

> > -- Crash Dump Kernel starts here--
> > [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd070]
> > [    0.000000] Linux version 5.5.0-rc1 (user@host) (gcc version 4.7.0 (GCC)) #163 SMP PREEMPT Tue
> Dec 10 11:12:37 PST 2019
> 
> gcc 4.7!
> 
Right, this kernel was built on one of the older build machines, but we have seen a similar panic with kernels built
with gcc 8.2 and newer.

> > [    0.000000] efi: Getting EFI parameters from FDT:
> > [    0.000000] efi: UEFI not found.
> Hmmm,
The bootloader is not UEFI, but it passes the KASLR_SEED via the device tree.

> I've managed to reproduce something like this. In my case its trying to zero a bogus
> struct page.
I think you have reproduced the same issue. In my case, too, the kernel is trying to memset to zero the struct page of
the first pfn, at a bogus address.
 
> 
> > FLATMEM became broken after submission of these two patches:
> > commit 8f579b1c4e347b23bfa747bc2cc0a55dd1b7e5fa      arm64: limit memory regions based on DT property, usable-memory-range
> > commit a7f8de168ace487fa7b88cb154e413cf40e87fc6       arm64: allow kernel Image to be loaded anywhere in physical memory
> 
> Those commits are in v4.12 and v4.6 respectively.
> FLATMEM wasn't enabled until e7d4bac428edb in v4.19.
> By 'after', you mean 'because of'?
As I mentioned in my earlier comment, we have had FLATMEM enabled in our local repositories since v4.4. Here I used "git bisect" to find out
which commit causes the panic. My finding is that, because of the 2nd patch "a7f8de168ace487fa7b88cb154e413cf40e87fc6", where
memstart_addr was introduced, the logic of page_to_pfn() and pfn_to_page() is broken with the flatmem layout.
 
> Given the order of events here, 'FLATMEM has never work for kdump' seems a fair summary.
That is a fair statement. It was working for us because we had v4.4 as a base release and cherry-picked only some of the patches from 4.12 and 4.19.

> > It looks like in expansion of the pfn_to_page() macro, if the kernel start address is not 1GB
> > aligned, this part of macro ((pfn)-ARCH_PFN_OFFSET) can create a huge offset from the base address
> > of mem_map which will cause the calculated page address to point a location outside of the
> > available memory boundaries.
> 
> huge offset is the cause of the problem here. ARCH_PFN_OFFSET comes from memstart_addr.
> 
> We use memstart_addr is for shifting memory's physical addresses to their offset in the
> linear map's range. Otherwise if memory started at 0x8000000000, we'd always lose a chunk
> of address space because of this.
> 
> CONFIG_RANDOMIZE_BASE tinkers with this to randomise the placement of memory in the linear
> map's range.
> 
> Catalin found disabling CONFIG_RANDOMIZE_BASE solved the issue for him. (evidently
> kexec-tools is passing a random seed for kdump).
> 
> Do you have this option enabled in your kdump kernel?
CONFIG_RANDOMIZE_BASE is enabled in our main kernel but not in the crash kernel. For the main kernel we have to support KASLR and that is why 
we need to keep this option enabled.

> 
> The values are getting unbalanced because of FLATMEM's __page_to_pfn(). In particular
> index-0 in the mem_map array causes it to return ARCH_PFN_OFFSET which leads to
> memstart_addr, which is a value that may have been modified by CONFIG_RANDOMIZE_BASE.
> 
> FLATMEM's __page_to_pfn() can ignore KASLR because its page and mem_map both exist in the
> randomised linear map.
> Instead it wants to know memblock_start_of_DRAM() so the first DRAM page is index zero in
> the array.
> 
> I think ARCH_PFN_OFFSET's meaning is different for FLATMEM.
> 
> Ugly hack[0] works for me. With this page_to_pfn() and pfn_to_page() seem to be producing
> better results.
> 
I'll test it and will share the result.

> But! There are bigger problems here. memstart_addr starts out as memblock's idea of the
> base of DRAM after kdumps usable-memory-range restrictions have been applied.
> 
> memblock_cap_memory_range() wont remove nomap blocks. We need to remember these are
> memory, and they are nomap. Drivers depend on this when they want to use some exotic
> memory-attributes later on. (is it memory? yes, do we have it mapped with conflicting
> attributes? no)
> 
> These nomap blocks do influence memblock's idea of the base of DRAM meaning you can get a
> large hole in the flatmem mem_map...
> 
> For kdump on Seattle, I see:
> | memblock_cap_memory_range(0x80bfe00000, 0x40000000)
> 
> but
> | memblock_start_of_DRAM == 0x8000000000
> 
> which is well below the first page.
> 
> Because of these nomap memblocks, I don't think kdump is isolated enough from the systems
> memory map for the flatmem illusion to hold just because its kdump. You still need to
> access firmware table that describe the system, as well as any memory that was reserved
> with mechanisms like this. This exposes you to the platform's not-really-flatmem memory
> layout.
> 
> I think the real fix here is to remove FLATMEM.

It looks like the major part of arm64 architecture development has been done with sparsemem as the default
memory layout, and if one now wants to fix FLATMEM, many low-level code areas need to be touched.
Given the above problems with flatmem, I'm wondering whether it is possible to use SPARSEMEM for the crash kernel
while still keeping the memory footprint as low as what one can achieve with flatmem.
For example, is it possible to reduce the amount of memory SPARSEMEM uses for its internal data structures or for keeping track
of different memory zones? Or is there any other suggestion for reducing the total memory size of the crash kernel?

Thanks,
Saeed


* Re: Arm64 Crashkernel doesn't work with FLATMEM anymore
  2019-12-23 22:24   ` Saeed Karimabadi (skarimab)
@ 2020-01-02 17:42     ` Catalin Marinas
  2020-01-07 18:21       ` Saeed Karimabadi (skarimab)
  0 siblings, 1 reply; 5+ messages in thread
From: Catalin Marinas @ 2020-01-02 17:42 UTC (permalink / raw)
  To: Saeed Karimabadi (skarimab)
  Cc: Will Deacon, Ard Biesheuvel, James Morse, linux-arm-kernel,
	xe-linux-external(mailer list)

On Mon, Dec 23, 2019 at 10:24:57PM +0000, Saeed Karimabadi (skarimab) wrote:
> On 20/12/2019 11:52 AM  James Morse <james.morse> wrote:
> > On 17/12/2019 00:02, Saeed Karimabadi (skarimab) wrote:
> > > Crash dump  Kernel doesn't work with FLATMEM memory model since version 4.11.0-rc3 and it
[...]
> > Because of these nomap memblocks, I don't think kdump is isolated enough from the systems
> > memory map for the flatmem illusion to hold just because its kdump. You still need to
> > access firmware table that describe the system, as well as any memory that was reserved
> > with mechanisms like this. This exposes you to the platform's not-really-flatmem memory
> > layout.
> > 
> > I think the real fix here is to remove FLATMEM.
[...]
> For example is it possible to reduce the amount of memory SPARMEM is
> using for its internal data structures or to keep track of different
> memory zones? Or any other suggestion of reducing the total memory
> size for crash kernel ?

Can you change SECTION_SIZE_BITS to 29 or 28 in
arch/arm64/include/asm/sparsemem.h and see whether it makes a
difference?
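As a rough sketch of why the section size could matter: if the memmap is populated a whole section at a time (an assumption here, along with 4K pages and a 64-byte struct page), then the memmap cost of each section that contains any memory scales with the section size. The helper name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K    12
#define STRUCT_PAGE_SIZE 64 /* assumption: sizeof(struct page) */

/* Bytes of struct page needed to cover one full sparsemem section. */
static uint64_t memmap_bytes_per_section(unsigned int section_size_bits)
{
	assert(section_size_bits >= PAGE_SHIFT_4K && section_size_bits < 64);
	return (1ULL << (section_size_bits - PAGE_SHIFT_4K)) * STRUCT_PAGE_SIZE;
}
```

Under these assumptions a section costs 16MB of memmap with SECTION_SIZE_BITS = 30, 8MB with 29 and 4MB with 28. Whether this accounts for the difference reported earlier in the thread is not established; it only shows the direction in which the change pushes.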

-- 
Catalin


* RE: Arm64 Crashkernel doesn't work with FLATMEM anymore
  2020-01-02 17:42     ` Catalin Marinas
@ 2020-01-07 18:21       ` Saeed Karimabadi (skarimab)
  0 siblings, 0 replies; 5+ messages in thread
From: Saeed Karimabadi (skarimab) @ 2020-01-07 18:21 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Ard Biesheuvel, James Morse, linux-arm-kernel,
	xe-linux-external(mailer list)

On Thursday, January 2, 2020 9:42 AM Catalin Marinas <catalin.marinas> wrote:
> On Mon, Dec 23, 2019 at 10:24:57PM +0000, Saeed Karimabadi (skarimab) wrote:
> > On 20/12/2019 11:52 AM  James Morse <james.morse> wrote:
> > > On 17/12/2019 00:02, Saeed Karimabadi (skarimab) wrote:
> > > > Crash dump  Kernel doesn't work with FLATMEM memory model since version 4.11.0-rc3 and it
> [...]
> > > Because of these nomap memblocks, I don't think kdump is isolated enough from the systems
> > > memory map for the flatmem illusion to hold just because its kdump. You still need to
> > > access firmware table that describe the system, as well as any memory that was reserved
> > > with mechanisms like this. This exposes you to the platform's not-really-flatmem memory
> > > layout.
> > >
> > > I think the real fix here is to remove FLATMEM.
> [...]
> > For example is it possible to reduce the amount of memory SPARMEM is
> > using for its internal data structures or to keep track of different
> > memory zones? Or any other suggestion of reducing the total memory
> > size for crash kernel ?
> 
> Can you change SECTION_SIZE_BITS to 29 or 28 in
> arch/arm64/include/asm/sparsemem.h and see whether it makes a
> difference?

I changed SECTION_SIZE_BITS to 29 in the main kernel as well as the crash kernel, and both boot properly. After
triggering a panic, however, the "makedumpfile" tool cannot collect the core file: it complains that it cannot find
the address of mem_section. I'll collect the raw VMCORE and check further whether I need to modify
the "makedumpfile" code. Also, "makedumpfile" recognizes the VMCORE memory model as SPARSEMEM, while
it should detect it as SPARSEMEM_EX.

# makedumpfile -F -E -D -d 31 /proc/vmcore | gzip > ./core.4.19.29
sadump: unsupported architecture
LOAD (0)
  phys_start : 80080000
  phys_end   : 80c35000
  virt_start : ffffff8008080000
  virt_end   : ffffff8008c35000
LOAD (1)
  phys_start : 80000000
  phys_end   : c2000000
  virt_start : ffffffc000000000
  virt_end   : ffffffc042000000
LOAD (2)
  phys_start : c203b000
  phys_end   : fbe00000
  virt_start : ffffffc04203b000
  virt_end   : ffffffc07be00000
LOAD (3)
  phys_start : ffe00000
  phys_end   : 100000000
  virt_start : ffffffc07fe00000
  virt_end   : ffffffc080000000
Linux kdump
page_size    : 4096
phys_base    : 80000000 (vmcoreinfo)

max_mapnr    : 100000
There is enough free memory to be done in one cycle.

Buffer size for the cyclic mode: 262144
va_bits      : 39
page_offset  : ffffffc000000000
kimage_voffset   : ffffff7f88000000
max_physmem_bits : 30
section_size_bits: 1e
va_bits      : 39
page_offset  : ffffffc000000000
num of NODEs : 1

Memory type  : SPARSEMEM
get_mm_sparsemem: Can't get the address of mem_section.
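The "max_physmem_bits" and "section_size_bits" lines above appear to be printed in hexadecimal ("1e" cannot be decimal). Assuming both fields are hex, which is a guess about the tool's output format rather than something verified against its source, they can be decoded with a small sketch like this; the helper name is made up.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a bare hexadecimal field as it appears in the debug output. */
static unsigned long parse_hex_field(const char *text)
{
	char *end = NULL;
	unsigned long value = strtoul(text, &end, 16);

	assert(end != text); /* at least one hex digit was consumed */
	return value;
}
```

Read this way, max_physmem_bits is 48 and section_size_bits is 30, i.e. the tool would still be working with a section size of 30 bits even though the kernels were built with 29. Whether that mismatch is related to the mem_section error has not been confirmed.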

Thanks,
Saeed
