linux-mips.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/12] mips: Post-bootmem-memblock transition fixes
@ 2019-04-23 22:47 Serge Semin
  2019-04-23 22:47 ` [PATCH 01/12] mips: Make sure kernel .bss exists in boot mem pool Serge Semin
                   ` (11 more replies)
  0 siblings, 12 replies; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

First attempt of making the MIPS subsystem utilizing the memblock early memory
allocator was done by me a few years ago. I created a patchset with
21 patches [1]. It turned out to be too complicated and I decided to resend a
reworked patchset with smaller number of changes [2]. I did this and after a
small review process a v2 patchset was also posted. Then my spare
time was over and I couldn't proceed with the patchset support and
resubmission.

In a year Mike Rapoport took charge in this task and posted a small
patch which essentially did the bootmem allocator removal from MIPS
subsystem [3]. A single small patch did in general the whole thing my huge
patchsetes were intended for in the first place (though it lacked a few fixes).
Mike even went further and completely removed the bootmem allocator from
kernel code, so all the subsystems would need to use the only one early
memory allocator. This significantly simplified the platforms code as well
as removed a deprecated subsystem with duplicated functionality. Million
credits to Mike for this.

Getting back to the MIPS subsystem and it memblock allocator usage. Even
though the patch suggested by Mike [3] fixed most of the problems caused
by enabling the memblock allocator usage, some of them have been left
uncovered by it. First of all the PFNs calculation algorithm hasn't been
fully refactored. A reserved memory declaration loop has been left
untouched though it was clearly over-complicated for the new initialization
procedure. Secondly the MIPS platform code reserved the whole space below
kernel start address, which doesn't seem right since kernel can be
located much higher than memory space really starts. Thirdly CMA if it
is enabled reserves memory regions by means of memblock in the first place.
So the bootmem-init code doesn't need to do it again. Fifthly at early
platform initialization stage non of bootmem-left methods can be called
since there is no memory pages mapping at that moment, so __nosave* region
must be reserved by means of memblock allocator. Finally aside from memblock
allocator introduction my early patchsets included a series of useful
alterations like "nomap" property implementation for "reserved-memory"
dts-nodes, memblock_dump_all() method call after early memory allocator
initialization for debugging, low-memory test procedure, kernel memory
mapping printout at boot-time, and so on. So all of these fixes and
alterations are introduced in this new patchset. Please review. Hope
this time I'll be more responsive and finish this series up until it
is merged.

[1] https://lkml.org/lkml/2016/12/18/195
[2] https://lkml.org/lkml/2018/1/17/1201
[3] https://lkml.org/lkml/2018/9/10/302

NOTE I added a few "Reviewed-by:  Matt Redfearn <matt.redfearn@mips.com>"
since some patches of this series have been picked up from my earlier
patchsets, which Matt's already reviewed. I didn't add the tag for patches,
which were either new or partially ported.

Serge Semin (12):
  mips: Make sure kernel .bss exists in boot mem pool
  mips: Discard rudiments from bootmem_init
  mips: Combine memblock init and memory reservation loops
  mips: Reserve memory for the kernel image resources
  mips: Discard post-CMA-init foreach loop
  mips: Use memblock to reserve the __nosave memory range
  mips: Add reserve-nomap memory type support
  mips: Dump memblock regions for debugging
  mips: Perform early low memory test
  mips: Print the kernel virtual mem layout on debugging
  mips: Make sure dt memory regions are valid
  mips: Enable OF_RESERVED_MEM config

 arch/mips/Kconfig                |   1 +
 arch/mips/include/asm/bootinfo.h |   1 +
 arch/mips/kernel/prom.c          |  18 ++++-
 arch/mips/kernel/setup.c         | 129 +++++++++----------------------
 arch/mips/mm/init.c              |  49 ++++++++++++
 5 files changed, 102 insertions(+), 96 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/12] mips: Make sure kernel .bss exists in boot mem pool
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-24 22:30   ` Paul Burton
  2019-04-23 22:47 ` [PATCH 02/12] mips: Discard rudiments from bootmem_init Serge Semin
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

Current MIPS platform code makes sure the kernel text, data and init
sections are added to the boot memory map pool right after the
arch-specific memory setup method has been executed. But for some reason
the MIPS platform code skipped the kernel .bss section, which definitely
should be in the boot mem pool as well in any case. Lets fix this just be
adding the space between __bss_start and __bss_stop.

Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 8d1dc6c71173..0ee033c44116 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -809,6 +809,9 @@ static void __init arch_mem_init(char **cmdline_p)
 	arch_mem_addpart(PFN_UP(__pa_symbol(&__init_begin)) << PAGE_SHIFT,
 			 PFN_DOWN(__pa_symbol(&__init_end)) << PAGE_SHIFT,
 			 BOOT_MEM_INIT_RAM);
+	arch_mem_addpart(PFN_DOWN(__pa_symbol(&__bss_start)) << PAGE_SHIFT,
+			 PFN_UP(__pa_symbol(&__bss_stop)) << PAGE_SHIFT,
+			 BOOT_MEM_RAM);
 
 	pr_info("Determined physical RAM map:\n");
 	print_memory_map();
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/12] mips: Discard rudiments from bootmem_init
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
  2019-04-23 22:47 ` [PATCH 01/12] mips: Make sure kernel .bss exists in boot mem pool Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-24 22:30   ` Paul Burton
  2019-04-23 22:47 ` [PATCH 03/12] mips: Combine memblock init and memory reservation loops Serge Semin
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

There is a pointless code left in the bootmem_init() method since
the bootmem allocator removal. First part resides the PFN ranges
calculation loop. The conditional expressions and continue operator
are useless there, since nothing is done after them. Second part is
in RAM ranges installation loop. We can simplify the conditions cascade
a bit without much of the logic redefinition, so to reduce the code
length. In particular the end boundary value can be verified after
the possible reduction to be below max_low_pfn.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 25 +++++--------------------
 1 file changed, 5 insertions(+), 20 deletions(-)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 0ee033c44116..53d93a727d1a 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -394,10 +394,7 @@ static void __init bootmem_init(void)
 	min_low_pfn = ~0UL;
 	max_low_pfn = 0;
 
-	/*
-	 * Find the highest page frame number we have available
-	 * and the lowest used RAM address
-	 */
+	/* Find the highest and lowest page frame numbers we have available. */
 	for (i = 0; i < boot_mem_map.nr_map; i++) {
 		unsigned long start, end;
 
@@ -427,13 +424,6 @@ static void __init bootmem_init(void)
 			max_low_pfn = end;
 		if (start < min_low_pfn)
 			min_low_pfn = start;
-		if (end <= reserved_end)
-			continue;
-#ifdef CONFIG_BLK_DEV_INITRD
-		/* Skip zones before initrd and initrd itself */
-		if (initrd_end && end <= (unsigned long)PFN_UP(__pa(initrd_end)))
-			continue;
-#endif
 	}
 
 	if (min_low_pfn >= max_low_pfn)
@@ -474,6 +464,7 @@ static void __init bootmem_init(void)
 		max_low_pfn = PFN_DOWN(HIGHMEM_START);
 	}
 
+	/* Install all valid RAM ranges to the memblock memory region */
 	for (i = 0; i < boot_mem_map.nr_map; i++) {
 		unsigned long start, end;
 
@@ -481,21 +472,15 @@ static void __init bootmem_init(void)
 		end = PFN_DOWN(boot_mem_map.map[i].addr
 				+ boot_mem_map.map[i].size);
 
-		if (start <= min_low_pfn)
+		if (start < min_low_pfn)
 			start = min_low_pfn;
-		if (start >= end)
-			continue;
-
 #ifndef CONFIG_HIGHMEM
+		/* Ignore highmem regions if highmem is unsupported */
 		if (end > max_low_pfn)
 			end = max_low_pfn;
-
-		/*
-		 * ... finally, is the area going away?
-		 */
+#endif
 		if (end <= start)
 			continue;
-#endif
 
 		memblock_add_node(PFN_PHYS(start), PFN_PHYS(end - start), 0);
 	}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/12] mips: Combine memblock init and memory reservation loops
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
  2019-04-23 22:47 ` [PATCH 01/12] mips: Make sure kernel .bss exists in boot mem pool Serge Semin
  2019-04-23 22:47 ` [PATCH 02/12] mips: Discard rudiments from bootmem_init Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-24 22:30   ` Paul Burton
  2019-04-23 22:47 ` [PATCH 04/12] mips: Reserve memory for the kernel image resources Serge Semin
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

Before bootmem was completely removed from the kernel, the last loop
in the bootmem_init() had been used to reserve the correspondingly
marked regions, initialize sparsemem sections and to free the low memory
pages, which then would be used for early memory allocations. After the
bootmem removing patchset had been merged the loop was left to do the first
two things only. But it didn't do them quite well.

First of all it leaves the BOOT_MEM_INIT_RAM memory types unreserved,
which is definitely bug (although it isn't noticeable due to being used
by the kernel region only, which is fully marked as reserved). Secondly
the reservation is supposed to be done for any memory including the
high one. (I couldn't figure out why the highmem was ignored in the first
place, since platforms and dts' may declare any memory region for
reservation) Thirdly the reserved_end variable had been used here to not
accidentally free memory occupied by kernel. Since we already reserved the
corresponding region higher in this method there is no need in using the
variable here anymore. Fourthly the sparsemem should be aware of all the
memory types in the system including the ROM_DATA even if it is going to
be reserved for the whole system uptime. Finally after all these notes are
fixed the loop of memory reservation can be freely merged into the memory
installation loop as it's done in this patch.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 48 ++++++----------------------------------
 1 file changed, 7 insertions(+), 41 deletions(-)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 53d93a727d1a..185e0e42e009 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -483,55 +483,21 @@ static void __init bootmem_init(void)
 			continue;
 
 		memblock_add_node(PFN_PHYS(start), PFN_PHYS(end - start), 0);
-	}
-
-	/*
-	 * Register fully available low RAM pages with the bootmem allocator.
-	 */
-	for (i = 0; i < boot_mem_map.nr_map; i++) {
-		unsigned long start, end, size;
-
-		start = PFN_UP(boot_mem_map.map[i].addr);
-		end   = PFN_DOWN(boot_mem_map.map[i].addr
-				    + boot_mem_map.map[i].size);
 
-		/*
-		 * Reserve usable memory.
-		 */
+		/* Reserve any memory except the ordinary RAM ranges. */
 		switch (boot_mem_map.map[i].type) {
 		case BOOT_MEM_RAM:
 			break;
-		case BOOT_MEM_INIT_RAM:
-			memory_present(0, start, end);
-			continue;
-		default:
-			/* Not usable memory */
-			if (start > min_low_pfn && end < max_low_pfn)
-				memblock_reserve(boot_mem_map.map[i].addr,
-						boot_mem_map.map[i].size);
-
-			continue;
+		default: /* Reserve the rest of the memory types at boot time */
+			memblock_reserve(PFN_PHYS(start), PFN_PHYS(end - start));
+			break;
 		}
 
 		/*
-		 * We are rounding up the start address of usable memory
-		 * and at the end of the usable range downwards.
-		 */
-		if (start >= max_low_pfn)
-			continue;
-		if (start < reserved_end)
-			start = reserved_end;
-		if (end > max_low_pfn)
-			end = max_low_pfn;
-
-		/*
-		 * ... finally, is the area going away?
+		 * In any case the added to the memblock memory regions
+		 * (highmem/lowmem, available/reserved, etc) are considered
+		 * as present, so inform sparsemem about them.
 		 */
-		if (end <= start)
-			continue;
-		size = end - start;
-
-		/* Register lowmem ranges */
 		memory_present(0, start, end);
 	}
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (2 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 03/12] mips: Combine memblock init and memory reservation loops Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-24 22:43   ` Paul Burton
                     ` (2 more replies)
  2019-04-23 22:47 ` [PATCH 05/12] mips: Discard post-CMA-init foreach loop Serge Semin
                   ` (7 subsequent siblings)
  11 siblings, 3 replies; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

The reserved_end variable had been used by the bootmem_init() code
to find a lowest limit of memory available for memmap blob. The original
code just tried to find a free memory space higher than kernel was placed.
This limitation seems justified for the memmap ragion search process, but
I can't see any obvious reason to reserve the unused space below kernel
seeing some platforms place it much higher than standard 1MB. Moreover
the RELOCATION config enables it to be loaded at any memory address.
So lets reserve the memory occupied by the kernel only, leaving the region
below being free for allocations. After doing this we can now discard the
code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
since it's going to be free anyway (unless marked as reserved by
platforms).

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 30 +++---------------------------
 1 file changed, 3 insertions(+), 27 deletions(-)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 185e0e42e009..f71a7d32a687 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -371,7 +371,6 @@ static void __init bootmem_init(void)
 
 static void __init bootmem_init(void)
 {
-	unsigned long reserved_end;
 	phys_addr_t ramstart = PHYS_ADDR_MAX;
 	int i;
 
@@ -382,10 +381,10 @@ static void __init bootmem_init(void)
 	 * will reserve the area used for the initrd.
 	 */
 	init_initrd();
-	reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
 
-	memblock_reserve(PHYS_OFFSET,
-			 (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
+	/* Reserve memory occupied by kernel. */
+	memblock_reserve(__pa_symbol(&_text),
+			__pa_symbol(&_end) - __pa_symbol(&_text));
 
 	/*
 	 * max_low_pfn is not a number of pages. The number of pages
@@ -501,29 +500,6 @@ static void __init bootmem_init(void)
 		memory_present(0, start, end);
 	}
 
-#ifdef CONFIG_RELOCATABLE
-	/*
-	 * The kernel reserves all memory below its _end symbol as bootmem,
-	 * but the kernel may now be at a much higher address. The memory
-	 * between the original and new locations may be returned to the system.
-	 */
-	if (__pa_symbol(_text) > __pa_symbol(VMLINUX_LOAD_ADDRESS)) {
-		unsigned long offset;
-		extern void show_kernel_relocation(const char *level);
-
-		offset = __pa_symbol(_text) - __pa_symbol(VMLINUX_LOAD_ADDRESS);
-		memblock_free(__pa_symbol(VMLINUX_LOAD_ADDRESS), offset);
-
-#if defined(CONFIG_DEBUG_KERNEL) && defined(CONFIG_DEBUG_INFO)
-		/*
-		 * This information is necessary when debugging the kernel
-		 * But is a security vulnerability otherwise!
-		 */
-		show_kernel_relocation(KERN_INFO);
-#endif
-	}
-#endif
-
 	/*
 	 * Reserve initrd memory if needed.
 	 */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/12] mips: Discard post-CMA-init foreach loop
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (3 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 04/12] mips: Reserve memory for the kernel image resources Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-05-02 18:35   ` Paul Burton
  2019-04-23 22:47 ` [PATCH 06/12] mips: Use memblock to reserve the __nosave memory range Serge Semin
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

Really the loop is pointless, since it walks over memblock-reserved
memory regions and mark them as reserved in memblock. Before
bootmem was removed from the kernel, this loop had been
used to map the memory reserved by CMA into the legacy bootmem
allocator. But now the early memory allocator is memblock,
which is used by CMA for reservation, so we don't need any mapping
anymore.

Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index f71a7d32a687..2ae6b02b948f 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -708,7 +708,6 @@ static void __init request_crashkernel(struct resource *res)
  */
 static void __init arch_mem_init(char **cmdline_p)
 {
-	struct memblock_region *reg;
 	extern void plat_mem_setup(void);
 
 	/*
@@ -814,10 +813,6 @@ static void __init arch_mem_init(char **cmdline_p)
 	plat_swiotlb_setup();
 
 	dma_contiguous_reserve(PFN_PHYS(max_low_pfn));
-	/* Tell bootmem about cma reserved memblock section */
-	for_each_memblock(reserved, reg)
-		if (reg->size != 0)
-			memblock_reserve(reg->base, reg->size);
 
 	reserve_bootmem_region(__pa_symbol(&__nosave_begin),
 			__pa_symbol(&__nosave_end)); /* Reserve for hibernation */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/12] mips: Use memblock to reserve the __nosave memory range
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (4 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 05/12] mips: Discard post-CMA-init foreach loop Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-05-02 18:35   ` Paul Burton
  2019-04-23 22:47 ` [PATCH 07/12] mips: Add reserve-nomap memory type support Serge Semin
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

Originally before legacy bootmem was removed, the memory for the range was
correctly reserved by reserve_bootmem_region(). But since memblock has been
selected for early memory allocation the function can be utilized only
after paging is fully initialized (as it is done by memblock_free_all()
function). So calling it from arch_mem_init() method is prone to errors,
and at this stage we need to reserve the memory in the memblock allocator.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 2ae6b02b948f..3a5140943f54 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -814,8 +814,9 @@ static void __init arch_mem_init(char **cmdline_p)
 
 	dma_contiguous_reserve(PFN_PHYS(max_low_pfn));
 
-	reserve_bootmem_region(__pa_symbol(&__nosave_begin),
-			__pa_symbol(&__nosave_end)); /* Reserve for hibernation */
+	/* Reserve for hibernation. */
+	memblock_reserve(__pa_symbol(&__nosave_begin),
+		__pa_symbol(&__nosave_end) - __pa_symbol(&__nosave_begin));
 }
 
 static void __init resource_init(void)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/12] mips: Add reserve-nomap memory type support
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (5 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 06/12] mips: Use memblock to reserve the __nosave memory range Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-05-02 18:35   ` Paul Burton
  2019-04-23 22:47 ` [PATCH 08/12] mips: Dump memblock regions for debugging Serge Semin
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

It might be necessary to prevent the virtual mapping creation for a
requested memory region. For instance there is a "no-map" property
indicating exactly this feature. In this case we need to not only
reserve the specified region by pretending it doesn't exist in the
memory space, but completely remove the range from system just by
removing it from memblock. The same way it's done in default
early_init_dt_reserve_memory_arch() method.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/include/asm/bootinfo.h | 1 +
 arch/mips/kernel/prom.c          | 4 +++-
 arch/mips/kernel/setup.c         | 8 ++++++++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/bootinfo.h b/arch/mips/include/asm/bootinfo.h
index a301a8f4bc66..235bc2f52113 100644
--- a/arch/mips/include/asm/bootinfo.h
+++ b/arch/mips/include/asm/bootinfo.h
@@ -92,6 +92,7 @@ extern unsigned long mips_machtype;
 #define BOOT_MEM_ROM_DATA	2
 #define BOOT_MEM_RESERVED	3
 #define BOOT_MEM_INIT_RAM	4
+#define BOOT_MEM_NOMAP		5
 
 /*
  * A memory map that's built upon what was determined
diff --git a/arch/mips/kernel/prom.c b/arch/mips/kernel/prom.c
index 93b8e0b4332f..437a174e3ef9 100644
--- a/arch/mips/kernel/prom.c
+++ b/arch/mips/kernel/prom.c
@@ -47,7 +47,9 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
 int __init early_init_dt_reserve_memory_arch(phys_addr_t base,
 					phys_addr_t size, bool nomap)
 {
-	add_memory_region(base, size, BOOT_MEM_RESERVED);
+	add_memory_region(base, size,
+			  nomap ? BOOT_MEM_NOMAP : BOOT_MEM_RESERVED);
+
 	return 0;
 }
 
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 3a5140943f54..2a1b2e7a1bc9 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -178,6 +178,7 @@ static bool __init __maybe_unused memory_region_available(phys_addr_t start,
 				in_ram = true;
 			break;
 		case BOOT_MEM_RESERVED:
+		case BOOT_MEM_NOMAP:
 			if ((start >= start_ && start < end_) ||
 			    (start < start_ && start + size >= start_))
 				free = false;
@@ -213,6 +214,9 @@ static void __init print_memory_map(void)
 		case BOOT_MEM_RESERVED:
 			printk(KERN_CONT "(reserved)\n");
 			break;
+		case BOOT_MEM_NOMAP:
+			printk(KERN_CONT "(nomap)\n");
+			break;
 		default:
 			printk(KERN_CONT "type %lu\n", boot_mem_map.map[i].type);
 			break;
@@ -487,6 +491,9 @@ static void __init bootmem_init(void)
 		switch (boot_mem_map.map[i].type) {
 		case BOOT_MEM_RAM:
 			break;
+		case BOOT_MEM_NOMAP: /* Discard the range from the system. */
+			memblock_remove(PFN_PHYS(start), PFN_PHYS(end - start));
+			continue;
 		default: /* Reserve the rest of the memory types at boot time */
 			memblock_reserve(PFN_PHYS(start), PFN_PHYS(end - start));
 			break;
@@ -861,6 +868,7 @@ static void __init resource_init(void)
 			res->flags |= IORESOURCE_SYSRAM;
 			break;
 		case BOOT_MEM_RESERVED:
+		case BOOT_MEM_NOMAP:
 		default:
 			res->name = "reserved";
 		}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/12] mips: Dump memblock regions for debugging
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (6 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 07/12] mips: Add reserve-nomap memory type support Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-24 13:45   ` Mike Rapoport
  2019-04-23 22:47 ` [PATCH 09/12] mips: Perform early low memory test Serge Semin
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

It is useful to have the whole memblock memory space printed to console
when basic memlock initializations are done. It can be performed by
ready-to-use method memblock_dump_all(), which prints the available
and reserved memory spaces if MEMBLOCK_DEBUG config is enabled.
Lets call it at the very end of arch_mem_init() function, when
all memblock memory and reserved regions are defined, but before
any serious allocation is performed.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 2a1b2e7a1bc9..ca493fdf69b0 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -824,6 +824,8 @@ static void __init arch_mem_init(char **cmdline_p)
 	/* Reserve for hibernation. */
 	memblock_reserve(__pa_symbol(&__nosave_begin),
 		__pa_symbol(&__nosave_end) - __pa_symbol(&__nosave_begin));
+
+	memblock_dump_all();
 }
 
 static void __init resource_init(void)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/12] mips: Perform early low memory test
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (7 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 08/12] mips: Dump memblock regions for debugging Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-23 22:47 ` [PATCH 10/12] mips: Print the kernel virtual mem layout on debugging Serge Semin
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

memblock subsystem provides a method to optionally test the passed
memory region in case if it was requested via special kernel boot
argument. Lets add the function at the bottom of the arch_mem_init()
method. Testing at this point in the boot sequence should be safe since all
critical areas are now reserved and a minimum of allocations have been
done.

Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/setup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index ca493fdf69b0..fbd216b4e929 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -826,6 +826,8 @@ static void __init arch_mem_init(char **cmdline_p)
 		__pa_symbol(&__nosave_end) - __pa_symbol(&__nosave_begin));
 
 	memblock_dump_all();
+
+	early_memtest(PFN_PHYS(min_low_pfn), PFN_PHYS(max_low_pfn));
 }
 
 static void __init resource_init(void)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/12] mips: Print the kernel virtual mem layout on debugging
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (8 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 09/12] mips: Perform early low memory test Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-24 13:47   ` Mike Rapoport
  2019-04-23 22:47 ` [PATCH 11/12] mips: Make sure dt memory regions are valid Serge Semin
  2019-04-23 22:47 ` [PATCH 12/12] mips: Enable OF_RESERVED_MEM config Serge Semin
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

It is useful at least for debugging to have the kernel virtual
memory layout printed at boot time so to have the full information
about the booted kernel. Make the printing optional and available
only when DEBUG_KERNEL config is enabled so not to leak a sensitive
kernel information.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/mm/init.c | 49 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index bbb196ad5f26..c338bbd03b2a 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -31,6 +31,7 @@
 #include <linux/gfp.h>
 #include <linux/kcore.h>
 #include <linux/initrd.h>
+#include <linux/sizes.h>
 
 #include <asm/bootinfo.h>
 #include <asm/cachectl.h>
@@ -56,6 +57,53 @@ unsigned long empty_zero_page, zero_page_mask;
 EXPORT_SYMBOL_GPL(empty_zero_page);
 EXPORT_SYMBOL(zero_page_mask);
 
+/*
+ * Print out the kernel virtual memory layout
+ */
+#define MLK(b, t) (void *)b, (void *)t, ((t) - (b)) >> 10
+#define MLM(b, t) (void *)b, (void *)t, ((t) - (b)) >> 20
+#define MLK_ROUNDUP(b, t) (void *)b, (void *)t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
+static void __init mem_print_kmap_info(void)
+{
+#ifdef CONFIG_DEBUG_KERNEL
+	pr_notice("Kernel virtual memory layout:\n"
+		  "    lowmem  : 0x%px - 0x%px  (%4ld MB)\n"
+		  "      .text : 0x%px - 0x%px  (%4td kB)\n"
+		  "      .data : 0x%px - 0x%px  (%4td kB)\n"
+		  "      .init : 0x%px - 0x%px  (%4td kB)\n"
+		  "      .bss  : 0x%px - 0x%px  (%4td kB)\n"
+		  "    vmalloc : 0x%px - 0x%px  (%4ld MB)\n"
+#ifdef CONFIG_HIGHMEM
+		  "    pkmap   : 0x%px - 0x%px  (%4ld MB)\n"
+#endif
+		  "    fixmap  : 0x%px - 0x%px  (%4ld kB)\n",
+		  MLM(PAGE_OFFSET, (unsigned long)high_memory),
+		  MLK_ROUNDUP(_text, _etext),
+		  MLK_ROUNDUP(_sdata, _edata),
+		  MLK_ROUNDUP(__init_begin, __init_end),
+		  MLK_ROUNDUP(__bss_start, __bss_stop),
+		  MLM(VMALLOC_START, VMALLOC_END),
+#ifdef CONFIG_HIGHMEM
+		  MLM(PKMAP_BASE, (PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE)),
+#endif
+		  MLK(FIXADDR_START, FIXADDR_TOP));
+
+	/* Check some fundamental inconsistencies. May add something else? */
+#ifdef CONFIG_HIGHMEM
+	BUILD_BUG_ON(VMALLOC_END < PAGE_OFFSET);
+	BUG_ON(VMALLOC_END < (unsigned long)high_memory);
+	BUILD_BUG_ON((PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE) < PAGE_OFFSET);
+	BUG_ON((PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE) <
+		(unsigned long)high_memory);
+#endif
+	BUILD_BUG_ON(FIXADDR_TOP < PAGE_OFFSET);
+	BUG_ON(FIXADDR_TOP < (unsigned long)high_memory);
+#endif /* CONFIG_DEBUG_KERNEL */
+}
+#undef MLK
+#undef MLM
+#undef MLK_ROUNDUP
+
 /*
  * Not static inline because used by IP27 special magic initialization code
  */
@@ -479,6 +527,7 @@ void __init mem_init(void)
 	setup_zero_pages();	/* Setup zeroed pages.  */
 	mem_init_free_highmem();
 	mem_init_print_info(NULL);
+	mem_print_kmap_info();
 
 #ifdef CONFIG_64BIT
 	if ((unsigned long) &_text > (unsigned long) CKSEG0)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/12] mips: Make sure dt memory regions are valid
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (9 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 10/12] mips: Print the kernel virtual mem layout on debugging Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-23 22:47 ` [PATCH 12/12] mips: Enable OF_RESERVED_MEM config Serge Semin
  11 siblings, 0 replies; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

There are situations when memory regions coming from dts may be
too big for the platform physical address space. It especially
concerns XPA-capable systems. Bootleader may determine more than 4GB
memory available and pass it to the kernel over dts memory node, while
kernel is built without XPA support. In this case the region
may either simply be truncated by add_memory_region() method
or by u64->phys_addr_t type casting. But in worst case the method
can even drop the memory region if it exceedes PHYS_ADDR_MAX size.
So lets make sure the retrieved from dts memory regions are valid,
and if some of them isn't just manually truncate it with a warning
printed out.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/kernel/prom.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/mips/kernel/prom.c b/arch/mips/kernel/prom.c
index 437a174e3ef9..28bf01961bb2 100644
--- a/arch/mips/kernel/prom.c
+++ b/arch/mips/kernel/prom.c
@@ -41,7 +41,19 @@ char *mips_get_machine_name(void)
 #ifdef CONFIG_USE_OF
 void __init early_init_dt_add_memory_arch(u64 base, u64 size)
 {
-	return add_memory_region(base, size, BOOT_MEM_RAM);
+	if (base >= PHYS_ADDR_MAX) {
+		pr_warn("Trying to add an invalid memory region, skipped\n");
+		return;
+	}
+
+	/* Truncate the passed memory region instead of type casting */
+	if (base + size - 1 >= PHYS_ADDR_MAX || base + size < base) {
+		pr_warn("Truncate memory region %llx @ %llx to size %llx\n",
+			size, base, PHYS_ADDR_MAX - base);
+		size = PHYS_ADDR_MAX - base;
+	}
+
+	add_memory_region(base, size, BOOT_MEM_RAM);
 }
 
 int __init early_init_dt_reserve_memory_arch(phys_addr_t base,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/12] mips: Enable OF_RESERVED_MEM config
  2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
                   ` (10 preceding siblings ...)
  2019-04-23 22:47 ` [PATCH 11/12] mips: Make sure dt memory regions are valid Serge Semin
@ 2019-04-23 22:47 ` Serge Semin
  2019-04-24  6:17   ` Christoph Hellwig
  11 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-23 22:47 UTC (permalink / raw)
  To: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross
  Cc: linux-mips, linux-kernel, Serge Semin

Since memblock-patchset was introduced the reserved-memory nodes are
supported being declared in dt-files. So these nodes are actually parsed
during the arch setup procedure when the early_init_fdt_scan_reserved_mem()
method is called. But some of the features like private reserved memory
pools aren't available at the moment, since OF_RESERVED_MEM isn't enabled
for the MIPS platform. Lets fix it by enabling the config.

But due to the arch-specific boot mem_map container utilization we need
to manually call the fdt_init_reserved_mem() method after all the available
and reserved memory has been moved to memblock. The function call performed
before bootmem_init() fails due to the lack of any memblock memory regions
to allocate from at that stage.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
---
 arch/mips/Kconfig        | 1 +
 arch/mips/kernel/setup.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 4a5f5b0ee9a9..0bf9e89e4023 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2988,6 +2988,7 @@ config USE_OF
 	bool
 	select OF
 	select OF_EARLY_FLATTREE
+	select OF_RESERVED_MEM
 	select IRQ_DOMAIN
 
 config UHI_BOOT
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index fbd216b4e929..ab349d2381c3 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -27,6 +27,7 @@
 #include <linux/dma-contiguous.h>
 #include <linux/decompress/generic.h>
 #include <linux/of_fdt.h>
+#include <linux/of_reserved_mem.h>
 
 #include <asm/addrspace.h>
 #include <asm/bootinfo.h>
@@ -825,6 +826,8 @@ static void __init arch_mem_init(char **cmdline_p)
 	memblock_reserve(__pa_symbol(&__nosave_begin),
 		__pa_symbol(&__nosave_end) - __pa_symbol(&__nosave_begin));
 
+	fdt_init_reserved_mem();
+
 	memblock_dump_all();
 
 	early_memtest(PFN_PHYS(min_low_pfn), PFN_PHYS(max_low_pfn));
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/12] mips: Enable OF_RESERVED_MEM config
  2019-04-23 22:47 ` [PATCH 12/12] mips: Enable OF_RESERVED_MEM config Serge Semin
@ 2019-04-24  6:17   ` Christoph Hellwig
  2019-04-24  8:34     ` Serge Semin
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2019-04-24  6:17 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel

On Wed, Apr 24, 2019 at 01:47:48AM +0300, Serge Semin wrote:
> Since memblock-patchset was introduced the reserved-memory nodes are
> supported being declared in dt-files. So these nodes are actually parsed
> during the arch setup procedure when the early_init_fdt_scan_reserved_mem()
> method is called. But some of the features like private reserved memory
> pools aren't available at the moment, since OF_RESERVED_MEM isn't enabled
> for the MIPS platform. Lets fix it by enabling the config.
> 
> But due to the arch-specific boot mem_map container utilization we need
> to manually call the fdt_init_reserved_mem() method after all the available
> and reserved memory has been moved to memblock. The function call performed
> before bootmem_init() fails due to the lack of any memblock memory regions
> to allocate from at that stage.

Architectures should not select this symbol directly, it will be
automatically enabled if either DMA_DECLARE_COHERENT or DMA_CMA
are enabled, which are required for the actual underlying memory
allocators.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/12] mips: Enable OF_RESERVED_MEM config
  2019-04-24  6:17   ` Christoph Hellwig
@ 2019-04-24  8:34     ` Serge Semin
  0 siblings, 0 replies; 43+ messages in thread
From: Serge Semin @ 2019-04-24  8:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel

On Tue, Apr 23, 2019 at 11:17:50PM -0700, Christoph Hellwig wrote:

Hello Christoph

> On Wed, Apr 24, 2019 at 01:47:48AM +0300, Serge Semin wrote:
> > Since memblock-patchset was introduced the reserved-memory nodes are
> > supported being declared in dt-files. So these nodes are actually parsed
> > during the arch setup procedure when the early_init_fdt_scan_reserved_mem()
> > method is called. But some of the features like private reserved memory
> > pools aren't available at the moment, since OF_RESERVED_MEM isn't enabled
> > for the MIPS platform. Lets fix it by enabling the config.
> > 
> > But due to the arch-specific boot mem_map container utilization we need
> > to manually call the fdt_init_reserved_mem() method after all the available
> > and reserved memory has been moved to memblock. The function call performed
> > before bootmem_init() fails due to the lack of any memblock memory regions
> > to allocate from at that stage.
> 
> Architectures should not select this symbol directly, it will be
> automatically enabled if either DMA_DECLARE_COHERENT or DMA_CMA
> are enabled, which are required for the actual underlying memory
> allocators.

Thanks for the comment. I should have checked this before porting the patch from
kernel 4.9. This symbol has been selected there by the platforms. I'll remove the
forcible selection of the config in the next patchset revision.

-Sergey


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/12] mips: Dump memblock regions for debugging
  2019-04-23 22:47 ` [PATCH 08/12] mips: Dump memblock regions for debugging Serge Semin
@ 2019-04-24 13:45   ` Mike Rapoport
  2019-04-24 14:20     ` Serge Semin
  0 siblings, 1 reply; 43+ messages in thread
From: Mike Rapoport @ 2019-04-24 13:45 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel

On Wed, Apr 24, 2019 at 01:47:44AM +0300, Serge Semin wrote:
> It is useful to have the whole memblock memory space printed to console
> when basic memlock initializations are done. It can be performed by
> ready-to-use method memblock_dump_all(), which prints the available
> and reserved memory spaces if MEMBLOCK_DEBUG config is enabled.

Nit: there's no MEMBLOCK_DEBUG config option but rather memblock=debug
command line parameter ;-)

> Lets call it at the very end of arch_mem_init() function, when
> all memblock memory and reserved regions are defined, but before
> any serious allocation is performed.
> 
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> ---
>  arch/mips/kernel/setup.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
> index 2a1b2e7a1bc9..ca493fdf69b0 100644
> --- a/arch/mips/kernel/setup.c
> +++ b/arch/mips/kernel/setup.c
> @@ -824,6 +824,8 @@ static void __init arch_mem_init(char **cmdline_p)
>  	/* Reserve for hibernation. */
>  	memblock_reserve(__pa_symbol(&__nosave_begin),
>  		__pa_symbol(&__nosave_end) - __pa_symbol(&__nosave_begin));
> +
> +	memblock_dump_all();
>  }
>  
>  static void __init resource_init(void)
> -- 
> 2.21.0
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/12] mips: Print the kernel virtual mem layout on debugging
  2019-04-23 22:47 ` [PATCH 10/12] mips: Print the kernel virtual mem layout on debugging Serge Semin
@ 2019-04-24 13:47   ` Mike Rapoport
  2019-04-24 14:35     ` Serge Semin
  0 siblings, 1 reply; 43+ messages in thread
From: Mike Rapoport @ 2019-04-24 13:47 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel

On Wed, Apr 24, 2019 at 01:47:46AM +0300, Serge Semin wrote:
> It is useful at least for debugging to have the kernel virtual
> memory layout printed at boot time so to have the full information
> about the booted kernel. Make the printing optional and available
> only when DEBUG_KERNEL config is enabled so not to leak a sensitive
> kernel information.
> 
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> ---
>  arch/mips/mm/init.c | 49 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
> index bbb196ad5f26..c338bbd03b2a 100644
> --- a/arch/mips/mm/init.c
> +++ b/arch/mips/mm/init.c
> @@ -31,6 +31,7 @@
>  #include <linux/gfp.h>
>  #include <linux/kcore.h>
>  #include <linux/initrd.h>
> +#include <linux/sizes.h>
>  
>  #include <asm/bootinfo.h>
>  #include <asm/cachectl.h>
> @@ -56,6 +57,53 @@ unsigned long empty_zero_page, zero_page_mask;
>  EXPORT_SYMBOL_GPL(empty_zero_page);
>  EXPORT_SYMBOL(zero_page_mask);
>  
> +/*
> + * Print out the kernel virtual memory layout
> + */
> +#define MLK(b, t) (void *)b, (void *)t, ((t) - (b)) >> 10
> +#define MLM(b, t) (void *)b, (void *)t, ((t) - (b)) >> 20
> +#define MLK_ROUNDUP(b, t) (void *)b, (void *)t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
> +static void __init mem_print_kmap_info(void)
> +{
> +#ifdef CONFIG_DEBUG_KERNEL

Maybe CONFIG_DEBUG_VM?

> +	pr_notice("Kernel virtual memory layout:\n"
> +		  "    lowmem  : 0x%px - 0x%px  (%4ld MB)\n"
> +		  "      .text : 0x%px - 0x%px  (%4td kB)\n"
> +		  "      .data : 0x%px - 0x%px  (%4td kB)\n"
> +		  "      .init : 0x%px - 0x%px  (%4td kB)\n"
> +		  "      .bss  : 0x%px - 0x%px  (%4td kB)\n"
> +		  "    vmalloc : 0x%px - 0x%px  (%4ld MB)\n"
> +#ifdef CONFIG_HIGHMEM
> +		  "    pkmap   : 0x%px - 0x%px  (%4ld MB)\n"
> +#endif
> +		  "    fixmap  : 0x%px - 0x%px  (%4ld kB)\n",
> +		  MLM(PAGE_OFFSET, (unsigned long)high_memory),
> +		  MLK_ROUNDUP(_text, _etext),
> +		  MLK_ROUNDUP(_sdata, _edata),
> +		  MLK_ROUNDUP(__init_begin, __init_end),
> +		  MLK_ROUNDUP(__bss_start, __bss_stop),
> +		  MLM(VMALLOC_START, VMALLOC_END),
> +#ifdef CONFIG_HIGHMEM
> +		  MLM(PKMAP_BASE, (PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE)),
> +#endif
> +		  MLK(FIXADDR_START, FIXADDR_TOP));
> +
> +	/* Check some fundamental inconsistencies. May add something else? */
> +#ifdef CONFIG_HIGHMEM
> +	BUILD_BUG_ON(VMALLOC_END < PAGE_OFFSET);
> +	BUG_ON(VMALLOC_END < (unsigned long)high_memory);
> +	BUILD_BUG_ON((PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE) < PAGE_OFFSET);
> +	BUG_ON((PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE) <
> +		(unsigned long)high_memory);
> +#endif
> +	BUILD_BUG_ON(FIXADDR_TOP < PAGE_OFFSET);
> +	BUG_ON(FIXADDR_TOP < (unsigned long)high_memory);
> +#endif /* CONFIG_DEBUG_KERNEL */
> +}
> +#undef MLK
> +#undef MLM
> +#undef MLK_ROUNDUP
> +
>  /*
>   * Not static inline because used by IP27 special magic initialization code
>   */
> @@ -479,6 +527,7 @@ void __init mem_init(void)
>  	setup_zero_pages();	/* Setup zeroed pages.  */
>  	mem_init_free_highmem();
>  	mem_init_print_info(NULL);
> +	mem_print_kmap_info();
>  
>  #ifdef CONFIG_64BIT
>  	if ((unsigned long) &_text > (unsigned long) CKSEG0)
> -- 
> 2.21.0
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/12] mips: Dump memblock regions for debugging
  2019-04-24 13:45   ` Mike Rapoport
@ 2019-04-24 14:20     ` Serge Semin
  0 siblings, 0 replies; 43+ messages in thread
From: Serge Semin @ 2019-04-24 14:20 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Ralf Baechle, Paul Burton, James Hogan, Andrew Morton,
	Michal Hocko, Greg Kroah-Hartman, Thomas Bogendoerfer,
	Huacai Chen, Stefan Agner, Stephen Rothwell, Alexandre Belloni,
	Juergen Gross, linux-mips, linux-kernel

On Wed, Apr 24, 2019 at 04:45:47PM +0300, Mike Rapoport wrote:
> On Wed, Apr 24, 2019 at 01:47:44AM +0300, Serge Semin wrote:
> > It is useful to have the whole memblock memory space printed to console
> > when basic memlock initializations are done. It can be performed by
> > ready-to-use method memblock_dump_all(), which prints the available
> > and reserved memory spaces if MEMBLOCK_DEBUG config is enabled.
> 
> Nit: there's no MEMBLOCK_DEBUG config option but rather memblock=debug
> command line parameter ;-)
> 

Right. Thanks. I'll reword the message in the next patchset revision.

-Sergey

> > Lets call it at the very end of arch_mem_init() function, when
> > all memblock memory and reserved regions are defined, but before
> > any serious allocation is performed.
> > 
> > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > ---
> >  arch/mips/kernel/setup.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
> > index 2a1b2e7a1bc9..ca493fdf69b0 100644
> > --- a/arch/mips/kernel/setup.c
> > +++ b/arch/mips/kernel/setup.c
> > @@ -824,6 +824,8 @@ static void __init arch_mem_init(char **cmdline_p)
> >  	/* Reserve for hibernation. */
> >  	memblock_reserve(__pa_symbol(&__nosave_begin),
> >  		__pa_symbol(&__nosave_end) - __pa_symbol(&__nosave_begin));
> > +
> > +	memblock_dump_all();
> >  }
> >  
> >  static void __init resource_init(void)
> > -- 
> > 2.21.0
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/12] mips: Print the kernel virtual mem layout on debugging
  2019-04-24 13:47   ` Mike Rapoport
@ 2019-04-24 14:35     ` Serge Semin
  0 siblings, 0 replies; 43+ messages in thread
From: Serge Semin @ 2019-04-24 14:35 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel

On Wed, Apr 24, 2019 at 04:47:11PM +0300, Mike Rapoport wrote:
> On Wed, Apr 24, 2019 at 01:47:46AM +0300, Serge Semin wrote:
> > It is useful at least for debugging to have the kernel virtual
> > memory layout printed at boot time so to have the full information
> > about the booted kernel. Make the printing optional and available
> > only when DEBUG_KERNEL config is enabled so not to leak a sensitive
> > kernel information.
> > 
> > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > ---
> >  arch/mips/mm/init.c | 49 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 49 insertions(+)
> > 
> > diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
> > index bbb196ad5f26..c338bbd03b2a 100644
> > --- a/arch/mips/mm/init.c
> > +++ b/arch/mips/mm/init.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/gfp.h>
> >  #include <linux/kcore.h>
> >  #include <linux/initrd.h>
> > +#include <linux/sizes.h>
> >  
> >  #include <asm/bootinfo.h>
> >  #include <asm/cachectl.h>
> > @@ -56,6 +57,53 @@ unsigned long empty_zero_page, zero_page_mask;
> >  EXPORT_SYMBOL_GPL(empty_zero_page);
> >  EXPORT_SYMBOL(zero_page_mask);
> >  
> > +/*
> > + * Print out the kernel virtual memory layout
> > + */
> > +#define MLK(b, t) (void *)b, (void *)t, ((t) - (b)) >> 10
> > +#define MLM(b, t) (void *)b, (void *)t, ((t) - (b)) >> 20
> > +#define MLK_ROUNDUP(b, t) (void *)b, (void *)t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
> > +static void __init mem_print_kmap_info(void)
> > +{
> > +#ifdef CONFIG_DEBUG_KERNEL
> 
> Maybe CONFIG_DEBUG_VM?
> 

Last time I posted this patch Matt suggested to use CONFIG_DEBUG_KERNEL [1].
On the other hand arm platform prints this table unconditionally, but uses %lx
format for low-memory ranges and %p for kernel segments. I even more inclined
in the arm solution. But if selecting between DEBUG_KERNEL and DEBUG_VM I'd
stick with DEBUG_KERNEL, since VM-debug config help text states it is
intended for special performance checks: "Enable this to turn on extended
checks in the virtual-memory system that may impact performance," and
not for memory layout.

-Sergey

[1] https://lkml.org/lkml/2018/2/13/494

> > +	pr_notice("Kernel virtual memory layout:\n"
> > +		  "    lowmem  : 0x%px - 0x%px  (%4ld MB)\n"
> > +		  "      .text : 0x%px - 0x%px  (%4td kB)\n"
> > +		  "      .data : 0x%px - 0x%px  (%4td kB)\n"
> > +		  "      .init : 0x%px - 0x%px  (%4td kB)\n"
> > +		  "      .bss  : 0x%px - 0x%px  (%4td kB)\n"
> > +		  "    vmalloc : 0x%px - 0x%px  (%4ld MB)\n"
> > +#ifdef CONFIG_HIGHMEM
> > +		  "    pkmap   : 0x%px - 0x%px  (%4ld MB)\n"
> > +#endif
> > +		  "    fixmap  : 0x%px - 0x%px  (%4ld kB)\n",
> > +		  MLM(PAGE_OFFSET, (unsigned long)high_memory),
> > +		  MLK_ROUNDUP(_text, _etext),
> > +		  MLK_ROUNDUP(_sdata, _edata),
> > +		  MLK_ROUNDUP(__init_begin, __init_end),
> > +		  MLK_ROUNDUP(__bss_start, __bss_stop),
> > +		  MLM(VMALLOC_START, VMALLOC_END),
> > +#ifdef CONFIG_HIGHMEM
> > +		  MLM(PKMAP_BASE, (PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE)),
> > +#endif
> > +		  MLK(FIXADDR_START, FIXADDR_TOP));
> > +
> > +	/* Check some fundamental inconsistencies. May add something else? */
> > +#ifdef CONFIG_HIGHMEM
> > +	BUILD_BUG_ON(VMALLOC_END < PAGE_OFFSET);
> > +	BUG_ON(VMALLOC_END < (unsigned long)high_memory);
> > +	BUILD_BUG_ON((PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE) < PAGE_OFFSET);
> > +	BUG_ON((PKMAP_BASE) + (LAST_PKMAP)*(PAGE_SIZE) <
> > +		(unsigned long)high_memory);
> > +#endif
> > +	BUILD_BUG_ON(FIXADDR_TOP < PAGE_OFFSET);
> > +	BUG_ON(FIXADDR_TOP < (unsigned long)high_memory);
> > +#endif /* CONFIG_DEBUG_KERNEL */
> > +}
> > +#undef MLK
> > +#undef MLM
> > +#undef MLK_ROUNDUP
> > +
> >  /*
> >   * Not static inline because used by IP27 special magic initialization code
> >   */
> > @@ -479,6 +527,7 @@ void __init mem_init(void)
> >  	setup_zero_pages();	/* Setup zeroed pages.  */
> >  	mem_init_free_highmem();
> >  	mem_init_print_info(NULL);
> > +	mem_print_kmap_info();
> >  
> >  #ifdef CONFIG_64BIT
> >  	if ((unsigned long) &_text > (unsigned long) CKSEG0)
> > -- 
> > 2.21.0
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/12] mips: Make sure kernel .bss exists in boot mem pool
  2019-04-23 22:47 ` [PATCH 01/12] mips: Make sure kernel .bss exists in boot mem pool Serge Semin
@ 2019-04-24 22:30   ` Paul Burton
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Burton @ 2019-04-24 22:30 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel,
	Serge Semin, linux-mips

Hello,

Serge Semin wrote:
> Current MIPS platform code makes sure the kernel text, data and init
> sections are added to the boot memory map pool right after the
> arch-specific memory setup method has been executed. But for some reason
> the MIPS platform code skipped the kernel .bss section, which definitely
> should be in the boot mem pool as well in any case. Lets fix this just be
> adding the space between __bss_start and __bss_stop.
> 
> Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

Applied to mips-next.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/12] mips: Discard rudiments from bootmem_init
  2019-04-23 22:47 ` [PATCH 02/12] mips: Discard rudiments from bootmem_init Serge Semin
@ 2019-04-24 22:30   ` Paul Burton
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Burton @ 2019-04-24 22:30 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel,
	Serge Semin, linux-mips

Hello,

Serge Semin wrote:
> There is a pointless code left in the bootmem_init() method since
> the bootmem allocator removal. First part resides the PFN ranges
> calculation loop. The conditional expressions and continue operator
> are useless there, since nothing is done after them. Second part is
> in RAM ranges installation loop. We can simplify the conditions cascade
> a bit without much of the logic redefinition, so to reduce the code
> length. In particular the end boundary value can be verified after
> the possible reduction to be below max_low_pfn.
> 
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

Applied to mips-next.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/12] mips: Combine memblock init and memory reservation loops
  2019-04-23 22:47 ` [PATCH 03/12] mips: Combine memblock init and memory reservation loops Serge Semin
@ 2019-04-24 22:30   ` Paul Burton
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Burton @ 2019-04-24 22:30 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel,
	Serge Semin, linux-mips

Hello,

Serge Semin wrote:
> Before bootmem was completely removed from the kernel, the last loop
> in the bootmem_init() had been used to reserve the correspondingly
> marked regions, initialize sparsemem sections and to free the low memory
> pages, which then would be used for early memory allocations. After the
> bootmem removing patchset had been merged the loop was left to do the first
> two things only. But it didn't do them quite well.
> 
> First of all it leaves the BOOT_MEM_INIT_RAM memory types unreserved,
> which is definitely bug (although it isn't noticeable due to being used
> by the kernel region only, which is fully marked as reserved). Secondly
> the reservation is supposed to be done for any memory including the
> high one. (I couldn't figure out why the highmem was ignored in the first
> place, since platforms and dts' may declare any memory region for
> reservation) Thirdly the reserved_end variable had been used here to not
> accidentally free memory occupied by kernel. Since we already reserved the
> corresponding region higher in this method there is no need in using the
> variable here anymore. Fourthly the sparsemem should be aware of all the
> memory types in the system including the ROM_DATA even if it is going to
> be reserved for the whole system uptime. Finally after all these notes are
> fixed the loop of memory reservation can be freely merged into the memory
> installation loop as it's done in this patch.
> 
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

Applied to mips-next.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-04-23 22:47 ` [PATCH 04/12] mips: Reserve memory for the kernel image resources Serge Semin
@ 2019-04-24 22:43   ` Paul Burton
  2019-04-26  0:00     ` Serge Semin
  2019-05-02 18:35   ` Paul Burton
  2019-05-21 14:56   ` Geert Uytterhoeven
  2 siblings, 1 reply; 43+ messages in thread
From: Paul Burton @ 2019-04-24 22:43 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, James Hogan, Matt Redfearn, Mike Rapoport,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel

Hi Serge,

On Wed, Apr 24, 2019 at 01:47:40AM +0300, Serge Semin wrote:
> The reserved_end variable had been used by the bootmem_init() code
> to find a lowest limit of memory available for memmap blob. The original
> code just tried to find a free memory space higher than kernel was placed.
> This limitation seems justified for the memmap ragion search process, but
> I can't see any obvious reason to reserve the unused space below kernel
> seeing some platforms place it much higher than standard 1MB.

There are 2 reasons I'm aware of:

 1) Older systems generally had something like an ISA bus which used
    addresses below the kernel, and bootloaders like YAMON left behind
    functions that could be called right at the start of RAM. This sort
    of thing should be accounted for by /memreserve/ in DT or similar
    platform-specific reservations though rather than generically, and
    at least Malta & SEAD-3 DTs already have /memreserve/ entries for
    it. So this part I think is OK. Some other older platforms might
    need updating, but that's fine.

 2) trap_init() only allocates memory for the exception vector if using
    a vectored interrupt mode. In other cases it just uses CAC_BASE
    which currently gets reserved as part of this region between
    PHYS_OFFSET & _text.

    I think this behavior is bogus, and we should instead:

    - Allocate the exception vector memory using memblock_alloc() for
      CPUs implementing MIPSr2 or higher (ie. CPUs with a programmable
      EBase register). If we're not using vectored interrupts then
      allocating one page will do, and we already have the size
      calculation for if we are.

    - Otherwise use CAC_BASE but call memblock_reserve() on the first
      page.

    I think we should make that change before this one goes in. I can
    try to get to it tomorrow, but feel free to beat me to it.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-04-24 22:43   ` Paul Burton
@ 2019-04-26  0:00     ` Serge Semin
  2019-04-30 22:58       ` Paul Burton
  0 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-04-26  0:00 UTC (permalink / raw)
  To: Paul Burton
  Cc: Ralf Baechle, James Hogan, Mike Rapoport, Andrew Morton,
	Michal Hocko, Greg Kroah-Hartman, Thomas Bogendoerfer,
	Huacai Chen, Stefan Agner, Stephen Rothwell, Alexandre Belloni,
	Juergen Gross, linux-mips, linux-kernel

On Wed, Apr 24, 2019 at 10:43:48PM +0000, Paul Burton wrote:
> Hi Serge,
> 
> On Wed, Apr 24, 2019 at 01:47:40AM +0300, Serge Semin wrote:
> > The reserved_end variable had been used by the bootmem_init() code
> > to find a lowest limit of memory available for memmap blob. The original
> > code just tried to find a free memory space higher than kernel was placed.
> > This limitation seems justified for the memmap ragion search process, but
> > I can't see any obvious reason to reserve the unused space below kernel
> > seeing some platforms place it much higher than standard 1MB.
> 
> There are 2 reasons I'm aware of:
> 
>  1) Older systems generally had something like an ISA bus which used
>     addresses below the kernel, and bootloaders like YAMON left behind
>     functions that could be called right at the start of RAM. This sort
>     of thing should be accounted for by /memreserve/ in DT or similar
>     platform-specific reservations though rather than generically, and
>     at least Malta & SEAD-3 DTs already have /memreserve/ entries for
>     it. So this part I think is OK. Some other older platforms might
>     need updating, but that's fine.
> 

Regarding ISA. As far as I remember devices on that bus can DMA only to the
lowest 16MB. So in case if kernel is too big or placed pretty much high,
they may be left even without reachable memory at all in current
implementation.

>  2) trap_init() only allocates memory for the exception vector if using
>     a vectored interrupt mode. In other cases it just uses CAC_BASE
>     which currently gets reserved as part of this region between
>     PHYS_OFFSET & _text.
> 
>     I think this behavior is bogus, and we should instead:
> 
>     - Allocate the exception vector memory using memblock_alloc() for
>       CPUs implementing MIPSr2 or higher (ie. CPUs with a programmable
>       EBase register). If we're not using vectored interrupts then
>       allocating one page will do, and we already have the size
>       calculation for if we are.
> 
>     - Otherwise use CAC_BASE but call memblock_reserve() on the first
>       page.
> 
>     I think we should make that change before this one goes in. I can
>     try to get to it tomorrow, but feel free to beat me to it.
> 

As far as I understood you and the code this should be enough to fix
the problem:
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index 98ca55d62201..f680253e2617 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -2326,6 +2326,8 @@ void __init trap_init(void)
 				ebase += (read_c0_ebase() & 0x3ffff000);
 			}
 		}
+
+		memblock_reserve(ebase, PAGE_SIZE);
 	}
 
 	if (cpu_has_mmips) {
---

Allocation has already been implemented in the if-branch under the
(cpu_has_veic || cpu_has_vint) condition. So we don't need to change
there anything.
In case if vectored interrupts aren't supported the else-clause is
taken and we need to reserve whatever is set in the exception base
address variable.

I'll add this patch between 3d and 4th ones if you are ok with it.

-Sergey

> Thanks,
>     Paul

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-04-26  0:00     ` Serge Semin
@ 2019-04-30 22:58       ` Paul Burton
  2019-05-02 14:24         ` Serge Semin
  0 siblings, 1 reply; 43+ messages in thread
From: Paul Burton @ 2019-04-30 22:58 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, James Hogan, Mike Rapoport, Andrew Morton,
	Michal Hocko, Greg Kroah-Hartman, Thomas Bogendoerfer,
	Huacai Chen, Stefan Agner, Stephen Rothwell, Alexandre Belloni,
	Juergen Gross, linux-mips, linux-kernel

Hi Serge,

On Fri, Apr 26, 2019 at 03:00:36AM +0300, Serge Semin wrote:
> >  1) Older systems generally had something like an ISA bus which used
> >     addresses below the kernel, and bootloaders like YAMON left behind
> >     functions that could be called right at the start of RAM. This sort
> >     of thing should be accounted for by /memreserve/ in DT or similar
> >     platform-specific reservations though rather than generically, and
> >     at least Malta & SEAD-3 DTs already have /memreserve/ entries for
> >     it. So this part I think is OK. Some other older platforms might
> >     need updating, but that's fine.
> > 
> 
> Regarding ISA. As far as I remember devices on that bus can DMA only to the
> lowest 16MB. So in case if kernel is too big or placed pretty much high,
> they may be left even without reachable memory at all in current
> implementation.

Sure - I'm not too worried about these old buses, platforms can continue
to reserve the memory through DT or otherwise if they need to.

> >  2) trap_init() only allocates memory for the exception vector if using
> >     a vectored interrupt mode. In other cases it just uses CAC_BASE
> >     which currently gets reserved as part of this region between
> >     PHYS_OFFSET & _text.
> > 
> >     I think this behavior is bogus, and we should instead:
> > 
> >     - Allocate the exception vector memory using memblock_alloc() for
> >       CPUs implementing MIPSr2 or higher (ie. CPUs with a programmable
> >       EBase register). If we're not using vectored interrupts then
> >       allocating one page will do, and we already have the size
> >       calculation for if we are.
> > 
> >     - Otherwise use CAC_BASE but call memblock_reserve() on the first
> >       page.
> > 
> >     I think we should make that change before this one goes in. I can
> >     try to get to it tomorrow, but feel free to beat me to it.
> > 
> 
> As far as I understood you and the code this should be enough to fix
> the problem:
> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> index 98ca55d62201..f680253e2617 100644
> --- a/arch/mips/kernel/traps.c
> +++ b/arch/mips/kernel/traps.c
> @@ -2326,6 +2326,8 @@ void __init trap_init(void)
>  				ebase += (read_c0_ebase() & 0x3ffff000);
>  			}
>  		}
> +
> +		memblock_reserve(ebase, PAGE_SIZE);
>  	}
>  
>  	if (cpu_has_mmips) {
> ---
> 
> Allocation has already been implemented in the if-branch under the
> (cpu_has_veic || cpu_has_vint) condition. So we don't need to change
> there anything.
> In case if vectored interrupts aren't supported the else-clause is
> taken and we need to reserve whatever is set in the exception base
> address variable.
> 
> I'll add this patch between 3d and 4th ones if you are ok with it.

I think that would work, but I have other motivations to allocate the
memory in non-vectored cases anyway. I just sent a series that does that
& cleans up a little [1]. If you could take a look that would be great.
With that change made I think this patch will be good to apply.

Thanks,
    Paul

[1] https://lore.kernel.org/linux-mips/20190430225216.7164-1-paul.burton@mips.com/T/#t

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-04-30 22:58       ` Paul Burton
@ 2019-05-02 14:24         ` Serge Semin
  2019-05-02 18:45           ` Paul Burton
  0 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-05-02 14:24 UTC (permalink / raw)
  To: Paul Burton
  Cc: Ralf Baechle, James Hogan, Mike Rapoport, Andrew Morton,
	Michal Hocko, Greg Kroah-Hartman, Thomas Bogendoerfer,
	Huacai Chen, Stefan Agner, Stephen Rothwell, Alexandre Belloni,
	Juergen Gross, linux-mips, linux-kernel

On Tue, Apr 30, 2019 at 10:58:33PM +0000, Paul Burton wrote:

Hello Paul

> Hi Serge,
> 
> On Fri, Apr 26, 2019 at 03:00:36AM +0300, Serge Semin wrote:
> > >  1) Older systems generally had something like an ISA bus which used
> > >     addresses below the kernel, and bootloaders like YAMON left behind
> > >     functions that could be called right at the start of RAM. This sort
> > >     of thing should be accounted for by /memreserve/ in DT or similar
> > >     platform-specific reservations though rather than generically, and
> > >     at least Malta & SEAD-3 DTs already have /memreserve/ entries for
> > >     it. So this part I think is OK. Some other older platforms might
> > >     need updating, but that's fine.
> > > 
> > 
> > Regarding ISA. As far as I remember devices on that bus can DMA only to the
> > lowest 16MB. So in case if kernel is too big or placed pretty much high,
> > they may be left even without reachable memory at all in current
> > implementation.
> 
> Sure - I'm not too worried about these old buses, platforms can continue
> to reserve the memory through DT or otherwise if they need to.
> 
> > >  2) trap_init() only allocates memory for the exception vector if using
> > >     a vectored interrupt mode. In other cases it just uses CAC_BASE
> > >     which currently gets reserved as part of this region between
> > >     PHYS_OFFSET & _text.
> > > 
> > >     I think this behavior is bogus, and we should instead:
> > > 
> > >     - Allocate the exception vector memory using memblock_alloc() for
> > >       CPUs implementing MIPSr2 or higher (ie. CPUs with a programmable
> > >       EBase register). If we're not using vectored interrupts then
> > >       allocating one page will do, and we already have the size
> > >       calculation for if we are.
> > > 
> > >     - Otherwise use CAC_BASE but call memblock_reserve() on the first
> > >       page.
> > > 
> > >     I think we should make that change before this one goes in. I can
> > >     try to get to it tomorrow, but feel free to beat me to it.
> > > 
> > 
> > As far as I understood you and the code this should be enough to fix
> > the problem:
> > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > index 98ca55d62201..f680253e2617 100644
> > --- a/arch/mips/kernel/traps.c
> > +++ b/arch/mips/kernel/traps.c
> > @@ -2326,6 +2326,8 @@ void __init trap_init(void)
> >  				ebase += (read_c0_ebase() & 0x3ffff000);
> >  			}
> >  		}
> > +
> > +		memblock_reserve(ebase, PAGE_SIZE);
> >  	}
> >  
> >  	if (cpu_has_mmips) {
> > ---
> > 
> > Allocation has already been implemented in the if-branch under the
> > (cpu_has_veic || cpu_has_vint) condition. So we don't need to change
> > there anything.
> > In case if vectored interrupts aren't supported the else-clause is
> > taken and we need to reserve whatever is set in the exception base
> > address variable.
> > 
> > I'll add this patch between 3d and 4th ones if you are ok with it.
> 
> I think that would work, but I have other motivations to allocate the
> memory in non-vectored cases anyway. I just sent a series that does that
> & cleans up a little [1]. If you could take a look that would be great.
> With that change made I think this patch will be good to apply.
> 

Just reviewed and tested your series on my machine. I tagged the whole series
in a response to the cover-letter of [1].

Could you please proceed with this patchset review procedure? There are
also eight more patches left without your tag or comment.  This patch
is also left with no explicit tag.

BTW I see you already applied patches 1-3 to the mips-next, so what shall I
do when sending a v2 patchset with fixes asked to be provided for patch 12
and possibly for others in future? Shall I just resend the series without that
applied patches or send them over with your acked-by tagges?

-Sergey

> Thanks,
>     Paul
> 
> [1] https://lore.kernel.org/linux-mips/20190430225216.7164-1-paul.burton@mips.com/T/#t

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-04-23 22:47 ` [PATCH 04/12] mips: Reserve memory for the kernel image resources Serge Semin
  2019-04-24 22:43   ` Paul Burton
@ 2019-05-02 18:35   ` Paul Burton
  2019-05-21 14:56   ` Geert Uytterhoeven
  2 siblings, 0 replies; 43+ messages in thread
From: Paul Burton @ 2019-05-02 18:35 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel,
	Serge Semin, linux-mips

Hello,

Serge Semin wrote:
> The reserved_end variable had been used by the bootmem_init() code
> to find a lowest limit of memory available for memmap blob. The original
> code just tried to find a free memory space higher than kernel was placed.
> This limitation seems justified for the memmap ragion search process, but
> I can't see any obvious reason to reserve the unused space below kernel
> seeing some platforms place it much higher than standard 1MB. Moreover
> the RELOCATION config enables it to be loaded at any memory address.
> So lets reserve the memory occupied by the kernel only, leaving the region
> below being free for allocations. After doing this we can now discard the
> code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> since it's going to be free anyway (unless marked as reserved by
> platforms).
> 
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

Applied to mips-next.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/12] mips: Discard post-CMA-init foreach loop
  2019-04-23 22:47 ` [PATCH 05/12] mips: Discard post-CMA-init foreach loop Serge Semin
@ 2019-05-02 18:35   ` Paul Burton
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Burton @ 2019-05-02 18:35 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel,
	Serge Semin, linux-mips

Hello,

Serge Semin wrote:
> Really the loop is pointless, since it walks over memblock-reserved
> memory regions and mark them as reserved in memblock. Before
> bootmem was removed from the kernel, this loop had been
> used to map the memory reserved by CMA into the legacy bootmem
> allocator. But now the early memory allocator is memblock,
> which is used by CMA for reservation, so we don't need any mapping
> anymore.
> 
> Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

Applied to mips-next.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/12] mips: Use memblock to reserve the __nosave memory range
  2019-04-23 22:47 ` [PATCH 06/12] mips: Use memblock to reserve the __nosave memory range Serge Semin
@ 2019-05-02 18:35   ` Paul Burton
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Burton @ 2019-05-02 18:35 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel,
	Serge Semin, linux-mips

Hello,

Serge Semin wrote:
> Originally before legacy bootmem was removed, the memory for the range was
> correctly reserved by reserve_bootmem_region(). But since memblock has been
> selected for early memory allocation the function can be utilized only
> after paging is fully initialized (as it is done by memblock_free_all()
> function). So calling it from arch_mem_init() method is prone to errors,
> and at this stage we need to reserve the memory in the memblock allocator.
> 
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

Applied to mips-next.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/12] mips: Add reserve-nomap memory type support
  2019-04-23 22:47 ` [PATCH 07/12] mips: Add reserve-nomap memory type support Serge Semin
@ 2019-05-02 18:35   ` Paul Burton
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Burton @ 2019-05-02 18:35 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips, linux-kernel,
	Serge Semin, linux-mips

Hello,

Serge Semin wrote:
> It might be necessary to prevent the virtual mapping creation for a
> requested memory region. For instance there is a "no-map" property
> indicating exactly this feature. In this case we need to not only
> reserve the specified region by pretending it doesn't exist in the
> memory space, but completely remove the range from system just by
> removing it from memblock. The same way it's done in default
> early_init_dt_reserve_memory_arch() method.
> 
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

Applied to mips-next.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-02 14:24         ` Serge Semin
@ 2019-05-02 18:45           ` Paul Burton
  2019-05-03 17:21             ` Serge Semin
  0 siblings, 1 reply; 43+ messages in thread
From: Paul Burton @ 2019-05-02 18:45 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, James Hogan, Mike Rapoport, Andrew Morton,
	Michal Hocko, Greg Kroah-Hartman, Thomas Bogendoerfer,
	Huacai Chen, Stefan Agner, Stephen Rothwell, Alexandre Belloni,
	Juergen Gross, linux-mips, linux-kernel

Hi Serge,

On Thu, May 02, 2019 at 05:24:37PM +0300, Serge Semin wrote:
> Just reviewed and tested your series on my machine. I tagged the whole series
> in a response to the cover-letter of [1].

Thanks for the review & testing; that series is now in mips-next.

> Could you please proceed with this patchset review procedure? There are
> also eight more patches left without your tag or comment.  This patch
> is also left with no explicit tag.
> 
> BTW I see you already applied patches 1-3 to the mips-next, so what shall I
> do when sending a v2 patchset with fixes asked to be provided for patch 12
> and possibly for others in future? Shall I just resend the series without that
> applied patches or send them over with your acked-by tagges?

I've so far applied patches 1-7 of your series to mips-next, and stopped
at patch 8 which has a comment to address.

My preference would be if you could send a v2 which just contains the
remaining patches (ie. patches 8-12 become patches 1-5), ideally atop
the mips-next branch.

The series looks good to me once the review comments are addressed, but
no need to add an Acked-by - it'll be implicit when I apply them to
mips-next.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-02 18:45           ` Paul Burton
@ 2019-05-03 17:21             ` Serge Semin
  0 siblings, 0 replies; 43+ messages in thread
From: Serge Semin @ 2019-05-03 17:21 UTC (permalink / raw)
  To: Paul Burton
  Cc: Ralf Baechle, James Hogan, Mike Rapoport, Andrew Morton,
	Michal Hocko, Greg Kroah-Hartman, Thomas Bogendoerfer,
	Huacai Chen, Stefan Agner, Stephen Rothwell, Alexandre Belloni,
	Juergen Gross, linux-mips, linux-kernel

Hello Paul

On Thu, May 02, 2019 at 06:45:39PM +0000, Paul Burton wrote:
> Hi Serge,
> 
> On Thu, May 02, 2019 at 05:24:37PM +0300, Serge Semin wrote:
> > Just reviewed and tested your series on my machine. I tagged the whole series
> > in a response to the cover-letter of [1].
> 
> Thanks for the review & testing; that series is now in mips-next.
> 
> > Could you please proceed with this patchset review procedure? There are
> > also eight more patches left without your tag or comment.  This patch
> > is also left with no explicit tag.
> > 
> > BTW I see you already applied patches 1-3 to the mips-next, so what shall I
> > do when sending a v2 patchset with fixes asked to be provided for patch 12
> > and possibly for others in future? Shall I just resend the series without that
> > applied patches or send them over with your acked-by tagges?
> 
> I've so far applied patches 1-7 of your series to mips-next, and stopped
> at patch 8 which has a comment to address.
> 
> My preference would be if you could send a v2 which just contains the
> remaining patches (ie. patches 8-12 become patches 1-5), ideally atop
> the mips-next branch.
> 
> The series looks good to me once the review comments are addressed, but
> no need to add an Acked-by - it'll be implicit when I apply them to
> mips-next.
> 

Agreed. I'll do it shortly.

-Sergey

> Thanks,
>     Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-04-23 22:47 ` [PATCH 04/12] mips: Reserve memory for the kernel image resources Serge Semin
  2019-04-24 22:43   ` Paul Burton
  2019-05-02 18:35   ` Paul Burton
@ 2019-05-21 14:56   ` Geert Uytterhoeven
  2019-05-21 15:53     ` Mike Rapoport
  2 siblings, 1 reply; 43+ messages in thread
From: Geert Uytterhoeven @ 2019-05-21 14:56 UTC (permalink / raw)
  To: Serge Semin
  Cc: Ralf Baechle, Paul Burton, James Hogan, Matt Redfearn,
	Mike Rapoport, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto

Hi Serge,

On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> The reserved_end variable had been used by the bootmem_init() code
> to find a lowest limit of memory available for memmap blob. The original
> code just tried to find a free memory space higher than kernel was placed.
> This limitation seems justified for the memmap ragion search process, but
> I can't see any obvious reason to reserve the unused space below kernel
> seeing some platforms place it much higher than standard 1MB. Moreover
> the RELOCATION config enables it to be loaded at any memory address.
> So lets reserve the memory occupied by the kernel only, leaving the region
> below being free for allocations. After doing this we can now discard the
> code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> since it's going to be free anyway (unless marked as reserved by
> platforms).
>
> Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:

    VFS: Mounted root (nfs filesystem) on device 0:13.
    devtmpfs: mounted
    BUG: Bad page state in process swapper  pfn:00001
    page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
    flags: 0x0()
    raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
    page dumped because: nonzero mapcount
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper Not tainted
5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
    Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
00000001 804a3490
            00000001 81000000 0030f231 80148558 00000003 10008400
87c1dd80 7599ee13
            00000000 00000000 804b0000 00000000 00000007 00000000
00000085 00000000
            62722d31 00000084 804b0000 39347874 00000000 804b7820
8040cef8 81000010
            00000001 00000007 00000001 81000000 00000008 8021de24
00000000 804a0000
            ...
    Call Trace:
    [<8010adec>] show_stack+0x74/0x104
    [<801a5e44>] bad_page+0x130/0x138
    [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
    [<801a789c>] free_unref_page+0x40/0x68
    [<801120f4>] free_init_pages+0xec/0x104
    [<803bdde8>] free_initmem+0x10/0x58
    [<803bdb8c>] kernel_init+0x20/0x100
    [<801057c8>] ret_from_kernel_thread+0x14/0x1c
    Disabling lock debugging due to kernel taint
    BUG: Bad page state in process swapper  pfn:00002
    [...]

CONFIG_RELOCATABLE is not set, so the only relevant part is the
change quoted below.

> --- a/arch/mips/kernel/setup.c
> +++ b/arch/mips/kernel/setup.c
> @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
>
>  static void __init bootmem_init(void)
>  {
> -       unsigned long reserved_end;
>         phys_addr_t ramstart = PHYS_ADDR_MAX;
>         int i;
>
> @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
>          * will reserve the area used for the initrd.
>          */
>         init_initrd();
> -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
>
> -       memblock_reserve(PHYS_OFFSET,
> -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> +       /* Reserve memory occupied by kernel. */
> +       memblock_reserve(__pa_symbol(&_text),
> +                       __pa_symbol(&_end) - __pa_symbol(&_text));
>
>         /*
>          * max_low_pfn is not a number of pages. The number of pages

With some debug code added:

    Determined physical RAM map:
     memory: 08000000 @ 00000000 (usable)
    bootmem_init:390: PHYS_OFFSET = 0x0
    bootmem_init:391: __pa_symbol(&_text) = 0x100000
    bootmem_init:392: __pa_symbol(&_end) = 0x4b77c8
    bootmem_init:393: PFN_UP(__pa_symbol(&_end)) = 0x4b8

Hence the old code reserved 1 MiB extra at the beginning.

Note that the new code also dropped the rounding up of the memory block
size to a multiple of PAGE_SIZE. I'm not sure the latter actually
matters or not.

Do you have a clue? Thanks!

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-21 14:56   ` Geert Uytterhoeven
@ 2019-05-21 15:53     ` Mike Rapoport
  2019-05-21 16:39       ` Serge Semin
  2019-05-22  7:47       ` Geert Uytterhoeven
  0 siblings, 2 replies; 43+ messages in thread
From: Mike Rapoport @ 2019-05-21 15:53 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Serge Semin, Ralf Baechle, Paul Burton, James Hogan,
	Matt Redfearn, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto

Hi Geert,

On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> Hi Serge,
> 
> On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > The reserved_end variable had been used by the bootmem_init() code
> > to find a lowest limit of memory available for memmap blob. The original
> > code just tried to find a free memory space higher than kernel was placed.
> > This limitation seems justified for the memmap ragion search process, but
> > I can't see any obvious reason to reserve the unused space below kernel
> > seeing some platforms place it much higher than standard 1MB. Moreover
> > the RELOCATION config enables it to be loaded at any memory address.
> > So lets reserve the memory occupied by the kernel only, leaving the region
> > below being free for allocations. After doing this we can now discard the
> > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > since it's going to be free anyway (unless marked as reserved by
> > platforms).
> >
> > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> 
> This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> 
>     VFS: Mounted root (nfs filesystem) on device 0:13.
>     devtmpfs: mounted
>     BUG: Bad page state in process swapper  pfn:00001
>     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
>     flags: 0x0()
>     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
>     page dumped because: nonzero mapcount
>     Modules linked in:
>     CPU: 0 PID: 1 Comm: swapper Not tainted
> 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
>     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> 00000001 804a3490
>             00000001 81000000 0030f231 80148558 00000003 10008400
> 87c1dd80 7599ee13
>             00000000 00000000 804b0000 00000000 00000007 00000000
> 00000085 00000000
>             62722d31 00000084 804b0000 39347874 00000000 804b7820
> 8040cef8 81000010
>             00000001 00000007 00000001 81000000 00000008 8021de24
> 00000000 804a0000
>             ...
>     Call Trace:
>     [<8010adec>] show_stack+0x74/0x104
>     [<801a5e44>] bad_page+0x130/0x138
>     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
>     [<801a789c>] free_unref_page+0x40/0x68
>     [<801120f4>] free_init_pages+0xec/0x104
>     [<803bdde8>] free_initmem+0x10/0x58
>     [<803bdb8c>] kernel_init+0x20/0x100
>     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
>     Disabling lock debugging due to kernel taint
>     BUG: Bad page state in process swapper  pfn:00002
>     [...]
> 
> CONFIG_RELOCATABLE is not set, so the only relevant part is the
> change quoted below.
> 
> > --- a/arch/mips/kernel/setup.c
> > +++ b/arch/mips/kernel/setup.c
> > @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
> >
> >  static void __init bootmem_init(void)
> >  {
> > -       unsigned long reserved_end;
> >         phys_addr_t ramstart = PHYS_ADDR_MAX;
> >         int i;
> >
> > @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
> >          * will reserve the area used for the initrd.
> >          */
> >         init_initrd();
> > -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
> >
> > -       memblock_reserve(PHYS_OFFSET,
> > -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> > +       /* Reserve memory occupied by kernel. */
> > +       memblock_reserve(__pa_symbol(&_text),
> > +                       __pa_symbol(&_end) - __pa_symbol(&_text));
> >
> >         /*
> >          * max_low_pfn is not a number of pages. The number of pages
> 
> With some debug code added:
> 
>     Determined physical RAM map:
>      memory: 08000000 @ 00000000 (usable)
>     bootmem_init:390: PHYS_OFFSET = 0x0
>     bootmem_init:391: __pa_symbol(&_text) = 0x100000
>     bootmem_init:392: __pa_symbol(&_end) = 0x4b77c8
>     bootmem_init:393: PFN_UP(__pa_symbol(&_end)) = 0x4b8

Have you tried adding memblock=debug to the command line?
Not sure it'll help, but still :)
 
> Hence the old code reserved 1 MiB extra at the beginning.
> 
> Note that the new code also dropped the rounding up of the memory block
> size to a multiple of PAGE_SIZE. I'm not sure the latter actually
> matters or not.

I'd say that bad page state for pfn 1 is caused by "freeing" the first 1M.

> Do you have a clue? Thanks!
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> -- 
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-21 15:53     ` Mike Rapoport
@ 2019-05-21 16:39       ` Serge Semin
  2019-05-22  7:50         ` Geert Uytterhoeven
  2019-05-22  7:47       ` Geert Uytterhoeven
  1 sibling, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-05-21 16:39 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Geert Uytterhoeven, Ralf Baechle, Paul Burton, James Hogan,
	Matt Redfearn, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto

Hello Geert, Mike

On Tue, May 21, 2019 at 06:53:10PM +0300, Mike Rapoport wrote:
> Hi Geert,
> 
> On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> > Hi Serge,
> > 
> > On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > > The reserved_end variable had been used by the bootmem_init() code
> > > to find a lowest limit of memory available for memmap blob. The original
> > > code just tried to find a free memory space higher than kernel was placed.
> > > This limitation seems justified for the memmap ragion search process, but
> > > I can't see any obvious reason to reserve the unused space below kernel
> > > seeing some platforms place it much higher than standard 1MB. Moreover
> > > the RELOCATION config enables it to be loaded at any memory address.
> > > So lets reserve the memory occupied by the kernel only, leaving the region
> > > below being free for allocations. After doing this we can now discard the
> > > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > > since it's going to be free anyway (unless marked as reserved by
> > > platforms).
> > >
> > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > 
> > This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> > image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> > 
> >     VFS: Mounted root (nfs filesystem) on device 0:13.
> >     devtmpfs: mounted
> >     BUG: Bad page state in process swapper  pfn:00001
> >     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
> >     flags: 0x0()
> >     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
> >     page dumped because: nonzero mapcount
> >     Modules linked in:
> >     CPU: 0 PID: 1 Comm: swapper Not tainted
> > 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
> >     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> > 00000001 804a3490
> >             00000001 81000000 0030f231 80148558 00000003 10008400
> > 87c1dd80 7599ee13
> >             00000000 00000000 804b0000 00000000 00000007 00000000
> > 00000085 00000000
> >             62722d31 00000084 804b0000 39347874 00000000 804b7820
> > 8040cef8 81000010
> >             00000001 00000007 00000001 81000000 00000008 8021de24
> > 00000000 804a0000
> >             ...
> >     Call Trace:
> >     [<8010adec>] show_stack+0x74/0x104
> >     [<801a5e44>] bad_page+0x130/0x138
> >     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
> >     [<801a789c>] free_unref_page+0x40/0x68
> >     [<801120f4>] free_init_pages+0xec/0x104
> >     [<803bdde8>] free_initmem+0x10/0x58
> >     [<803bdb8c>] kernel_init+0x20/0x100
> >     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
> >     Disabling lock debugging due to kernel taint
> >     BUG: Bad page state in process swapper  pfn:00002
> >     [...]
> > 

The root cause of the problem most likely is in prom_free_prom_memory() method of
arch/mips/txx9/generic/setup.c:
void __init prom_free_prom_memory(void)
{
        unsigned long saddr = PAGE_SIZE;
        unsigned long eaddr = __pa_symbol(&_text);
        
        if (saddr < eaddr)
                free_init_pages("prom memory", saddr, eaddr);
}

As you can see the txx9 platform tries to free a memory which isn't reserved
and set free from the very beginning due to the patch above. So as soon as you
remove the free_init_pages("prom memory", ...) the problem shall be fixed.
Could you try it and send a result to us whether it helped?

Regards,
-Sergey

> > CONFIG_RELOCATABLE is not set, so the only relevant part is the
> > change quoted below.
> > 
> > > --- a/arch/mips/kernel/setup.c
> > > +++ b/arch/mips/kernel/setup.c
> > > @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
> > >
> > >  static void __init bootmem_init(void)
> > >  {
> > > -       unsigned long reserved_end;
> > >         phys_addr_t ramstart = PHYS_ADDR_MAX;
> > >         int i;
> > >
> > > @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
> > >          * will reserve the area used for the initrd.
> > >          */
> > >         init_initrd();
> > > -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
> > >
> > > -       memblock_reserve(PHYS_OFFSET,
> > > -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> > > +       /* Reserve memory occupied by kernel. */
> > > +       memblock_reserve(__pa_symbol(&_text),
> > > +                       __pa_symbol(&_end) - __pa_symbol(&_text));
> > >
> > >         /*
> > >          * max_low_pfn is not a number of pages. The number of pages
> > 
> > With some debug code added:
> > 
> >     Determined physical RAM map:
> >      memory: 08000000 @ 00000000 (usable)
> >     bootmem_init:390: PHYS_OFFSET = 0x0
> >     bootmem_init:391: __pa_symbol(&_text) = 0x100000
> >     bootmem_init:392: __pa_symbol(&_end) = 0x4b77c8
> >     bootmem_init:393: PFN_UP(__pa_symbol(&_end)) = 0x4b8
> 
> Have you tried adding memblock=debug to the command line?
> Not sure it'll help, but still :)
>  
> > Hence the old code reserved 1 MiB extra at the beginning.
> > 
> > Note that the new code also dropped the rounding up of the memory block
> > size to a multiple of PAGE_SIZE. I'm not sure the latter actually
> > matters or not.
> 
> I'd say that bad page state for pfn 1 is caused by "freeing" the first 1M.
> 
> > Do you have a clue? Thanks!
> > 
> > Gr{oetje,eeting}s,
> > 
> >                         Geert
> > 
> > -- 
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> > 
> > In personal conversations with technical people, I call myself a hacker. But
> > when I'm talking to journalists I just say "programmer" or something like that.
> >                                 -- Linus Torvalds
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-21 15:53     ` Mike Rapoport
  2019-05-21 16:39       ` Serge Semin
@ 2019-05-22  7:47       ` Geert Uytterhoeven
  2019-05-22  8:08         ` Mike Rapoport
  1 sibling, 1 reply; 43+ messages in thread
From: Geert Uytterhoeven @ 2019-05-22  7:47 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Serge Semin, Ralf Baechle, Paul Burton, James Hogan,
	Matt Redfearn, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto

Hi Mike,

On Tue, May 21, 2019 at 5:53 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> > On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > > The reserved_end variable had been used by the bootmem_init() code
> > > to find a lowest limit of memory available for memmap blob. The original
> > > code just tried to find a free memory space higher than kernel was placed.
> > > This limitation seems justified for the memmap ragion search process, but
> > > I can't see any obvious reason to reserve the unused space below kernel
> > > seeing some platforms place it much higher than standard 1MB. Moreover
> > > the RELOCATION config enables it to be loaded at any memory address.
> > > So lets reserve the memory occupied by the kernel only, leaving the region
> > > below being free for allocations. After doing this we can now discard the
> > > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > > since it's going to be free anyway (unless marked as reserved by
> > > platforms).
> > >
> > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> >
> > This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> > image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> >
> >     VFS: Mounted root (nfs filesystem) on device 0:13.
> >     devtmpfs: mounted
> >     BUG: Bad page state in process swapper  pfn:00001
> >     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
> >     flags: 0x0()
> >     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
> >     page dumped because: nonzero mapcount
> >     Modules linked in:
> >     CPU: 0 PID: 1 Comm: swapper Not tainted
> > 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
> >     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> > 00000001 804a3490
> >             00000001 81000000 0030f231 80148558 00000003 10008400
> > 87c1dd80 7599ee13
> >             00000000 00000000 804b0000 00000000 00000007 00000000
> > 00000085 00000000
> >             62722d31 00000084 804b0000 39347874 00000000 804b7820
> > 8040cef8 81000010
> >             00000001 00000007 00000001 81000000 00000008 8021de24
> > 00000000 804a0000
> >             ...
> >     Call Trace:
> >     [<8010adec>] show_stack+0x74/0x104
> >     [<801a5e44>] bad_page+0x130/0x138
> >     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
> >     [<801a789c>] free_unref_page+0x40/0x68
> >     [<801120f4>] free_init_pages+0xec/0x104
> >     [<803bdde8>] free_initmem+0x10/0x58
> >     [<803bdb8c>] kernel_init+0x20/0x100
> >     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
> >     Disabling lock debugging due to kernel taint
> >     BUG: Bad page state in process swapper  pfn:00002
> >     [...]
> >
> > CONFIG_RELOCATABLE is not set, so the only relevant part is the
> > change quoted below.
> >
> > > --- a/arch/mips/kernel/setup.c
> > > +++ b/arch/mips/kernel/setup.c
> > > @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
> > >
> > >  static void __init bootmem_init(void)
> > >  {
> > > -       unsigned long reserved_end;
> > >         phys_addr_t ramstart = PHYS_ADDR_MAX;
> > >         int i;
> > >
> > > @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
> > >          * will reserve the area used for the initrd.
> > >          */
> > >         init_initrd();
> > > -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
> > >
> > > -       memblock_reserve(PHYS_OFFSET,
> > > -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> > > +       /* Reserve memory occupied by kernel. */
> > > +       memblock_reserve(__pa_symbol(&_text),
> > > +                       __pa_symbol(&_end) - __pa_symbol(&_text));
> > >
> > >         /*
> > >          * max_low_pfn is not a number of pages. The number of pages
> >
> > With some debug code added:
> >
> >     Determined physical RAM map:
> >      memory: 08000000 @ 00000000 (usable)
> >     bootmem_init:390: PHYS_OFFSET = 0x0
> >     bootmem_init:391: __pa_symbol(&_text) = 0x100000
> >     bootmem_init:392: __pa_symbol(&_end) = 0x4b77c8
> >     bootmem_init:393: PFN_UP(__pa_symbol(&_end)) = 0x4b8
>
> Have you tried adding memblock=debug to the command line?
> Not sure it'll help, but still :)

Thanks! Output below...

 Determined physical RAM map:
  memory: 08000000 @ 00000000 (usable)
+memblock_reserve: [0x00100000-0x004b77c7] setup_arch+0x258/0x8e4
 Initrd not found or empty - disabling initrd
+memblock_reserve: [0x00448000-0x00447fff] setup_arch+0x5ac/0x8e4
+MEMBLOCK configuration:
+ memory size = 0x08000000 reserved size = 0x003b77c8
+ memory.cnt  = 0x1
+ memory[0x0]    [0x00000000-0x07ffffff], 0x08000000 bytes on node 0 flags: 0x0
+ reserved.cnt  = 0x1
+ reserved[0x0]  [0x00100000-0x004b77c7], 0x003b77c8 bytes flags: 0x0
+memblock_alloc_try_nid: 32 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 setup_arch+0x7ec/0x8e4
+memblock_reserve: [0x004b77e0-0x004b77ff] memblock_alloc_range_nid+0x130/0x178
 Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
 Primary data cache 32kB, 4-way, VIPT, cache aliases, linesize 32 bytes
 Zone ranges:
@@ -16,10 +26,48 @@ Movable zone start for each node
 Early memory node ranges
   node   0: [mem 0x0000000000000000-0x0000000007ffffff]
 Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
+memblock_alloc_try_nid: 1048576 bytes align=0x20 nid=0
from=0x00000000 max_addr=0x00000000
alloc_node_mem_map.constprop.31+0x6c/0xc8
+memblock_reserve: [0x004b7800-0x005b77ff] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 16 bytes align=0x20 nid=0 from=0x00000000
max_addr=0x00000000 setup_usemap.isra.13+0x68/0xa0
+memblock_reserve: [0x005b7800-0x005b780f] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 116 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 start_kernel+0xb0/0x508
+memblock_reserve: [0x005b7820-0x005b7893] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 116 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 start_kernel+0xf0/0x508
+memblock_reserve: [0x005b78a0-0x005b7913] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 116 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 start_kernel+0x114/0x508
+memblock_reserve: [0x005b7920-0x005b7993] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 4096 bytes align=0x1000 nid=-1
from=0x00000000 max_addr=0x00000000 pcpu_alloc_alloc_info+0x60/0xb8
+memblock_reserve: [0x005b8000-0x005b8fff] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 32768 bytes align=0x1000 nid=-1
from=0x01000000 max_addr=0x00000000 setup_per_cpu_areas+0x38/0xa8
+memblock_reserve: [0x01000000-0x01007fff] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_setup_first_chunk+0x200/0x588
+memblock_reserve: [0x005b79a0-0x005b79a3] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_setup_first_chunk+0x22c/0x588
+memblock_reserve: [0x005b79c0-0x005b79c3] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_setup_first_chunk+0x250/0x588
+memblock_reserve: [0x005b79e0-0x005b79e3] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_setup_first_chunk+0x288/0x588
+memblock_reserve: [0x005b7a00-0x005b7a03] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 120 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_setup_first_chunk+0x47c/0x588
+memblock_reserve: [0x005b7a20-0x005b7a97] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 89 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_alloc_first_chunk+0x88/0x2e0
+memblock_reserve: [0x005b7aa0-0x005b7af8] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 1024 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_alloc_first_chunk+0xd8/0x2e0
+memblock_reserve: [0x005b7b00-0x005b7eff] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 1028 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_alloc_first_chunk+0x118/0x2e0
+memblock_reserve: [0x005b9000-0x005b9403] memblock_alloc_range_nid+0x130/0x178
+memblock_alloc_try_nid: 256 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 pcpu_alloc_first_chunk+0x154/0x2e0
+memblock_reserve: [0x005b7f00-0x005b7fff] memblock_alloc_range_nid+0x130/0x178
+   memblock_free: [0x005b8000-0x005b8fff] start_kernel+0x164/0x508
 Built 1 zonelists, mobility grouping on.  Total pages: 32512
-Kernel command line:   console=ttyS0,9600 ip=on root=/dev/nfs rw
nfsroot=192.168.97.29:/nas/rbtx4927/debian-mipsel,tcp,v3
+Kernel command line:   console=ttyS0,9600 ip=on root=/dev/nfs rw
nfsroot=192.168.97.29:/nas/rbtx4927/debian-mipsel,tcp,v3
memblock=debug
+memblock_alloc_try_nid: 65536 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 alloc_large_system_hash+0x270/0x478
+memblock_reserve: [0x005b9420-0x005c941f] memblock_alloc_range_nid+0x130/0x178
 Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
+memblock_alloc_try_nid: 32768 bytes align=0x20 nid=-1 from=0x00000000
max_addr=0x00000000 alloc_large_system_hash+0x270/0x478
+memblock_reserve: [0x005c9420-0x005d141f] memblock_alloc_range_nid+0x130/0x178
 Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
+memblock_reserve: [0x00000000-0x000003ff] trap_init+0x58/0x474
 Memory: 126100K/131072K available (2830K kernel code, 147K rwdata,
508K rodata, 220K init, 93K bss, 4972K reserved, 0K cma-reserved)

> > Hence the old code reserved 1 MiB extra at the beginning.
> >
> > Note that the new code also dropped the rounding up of the memory block
> > size to a multiple of PAGE_SIZE. I'm not sure the latter actually
> > matters or not.
>
> I'd say that bad page state for pfn 1 is caused by "freeing" the first 1M.

Yes, by prom_free_prom_memory(), as pointed out by Serge.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-21 16:39       ` Serge Semin
@ 2019-05-22  7:50         ` Geert Uytterhoeven
  0 siblings, 0 replies; 43+ messages in thread
From: Geert Uytterhoeven @ 2019-05-22  7:50 UTC (permalink / raw)
  To: Serge Semin
  Cc: Mike Rapoport, Ralf Baechle, Paul Burton, James Hogan,
	Matt Redfearn, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto

Hi Serge,

On Tue, May 21, 2019 at 6:39 PM Serge Semin <fancer.lancer@gmail.com> wrote:
> On Tue, May 21, 2019 at 06:53:10PM +0300, Mike Rapoport wrote:
> > On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> > > On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > > > The reserved_end variable had been used by the bootmem_init() code
> > > > to find a lowest limit of memory available for memmap blob. The original
> > > > code just tried to find a free memory space higher than kernel was placed.
> > > > This limitation seems justified for the memmap ragion search process, but
> > > > I can't see any obvious reason to reserve the unused space below kernel
> > > > seeing some platforms place it much higher than standard 1MB. Moreover
> > > > the RELOCATION config enables it to be loaded at any memory address.
> > > > So lets reserve the memory occupied by the kernel only, leaving the region
> > > > below being free for allocations. After doing this we can now discard the
> > > > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > > > since it's going to be free anyway (unless marked as reserved by
> > > > platforms).
> > > >
> > > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > >
> > > This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> > > image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> > >
> > >     VFS: Mounted root (nfs filesystem) on device 0:13.
> > >     devtmpfs: mounted
> > >     BUG: Bad page state in process swapper  pfn:00001
> > >     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
> > >     flags: 0x0()
> > >     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
> > >     page dumped because: nonzero mapcount
> > >     Modules linked in:
> > >     CPU: 0 PID: 1 Comm: swapper Not tainted
> > > 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
> > >     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> > > 00000001 804a3490
> > >             00000001 81000000 0030f231 80148558 00000003 10008400
> > > 87c1dd80 7599ee13
> > >             00000000 00000000 804b0000 00000000 00000007 00000000
> > > 00000085 00000000
> > >             62722d31 00000084 804b0000 39347874 00000000 804b7820
> > > 8040cef8 81000010
> > >             00000001 00000007 00000001 81000000 00000008 8021de24
> > > 00000000 804a0000
> > >             ...
> > >     Call Trace:
> > >     [<8010adec>] show_stack+0x74/0x104
> > >     [<801a5e44>] bad_page+0x130/0x138
> > >     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
> > >     [<801a789c>] free_unref_page+0x40/0x68
> > >     [<801120f4>] free_init_pages+0xec/0x104
> > >     [<803bdde8>] free_initmem+0x10/0x58
> > >     [<803bdb8c>] kernel_init+0x20/0x100
> > >     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
> > >     Disabling lock debugging due to kernel taint
> > >     BUG: Bad page state in process swapper  pfn:00002
> > >     [...]
> > >
>
> The root cause of the problem most likely is in prom_free_prom_memory() method of
> arch/mips/txx9/generic/setup.c:
> void __init prom_free_prom_memory(void)
> {
>         unsigned long saddr = PAGE_SIZE;
>         unsigned long eaddr = __pa_symbol(&_text);
>
>         if (saddr < eaddr)
>                 free_init_pages("prom memory", saddr, eaddr);
> }
>
> As you can see the txx9 platform tries to free a memory which isn't reserved
> and set free from the very beginning due to the patch above. So as soon as you
> remove the free_init_pages("prom memory", ...) the problem shall be fixed.
> Could you try it and send a result to us whether it helped?

Thanks, that does the trick!
Will send a patch shortly...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-22  7:47       ` Geert Uytterhoeven
@ 2019-05-22  8:08         ` Mike Rapoport
  2019-05-22  8:14           ` Geert Uytterhoeven
  0 siblings, 1 reply; 43+ messages in thread
From: Mike Rapoport @ 2019-05-22  8:08 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Serge Semin, Ralf Baechle, Paul Burton, James Hogan,
	Matt Redfearn, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto

On Wed, May 22, 2019 at 09:47:04AM +0200, Geert Uytterhoeven wrote:
> Hi Mike,
> 
> On Tue, May 21, 2019 at 5:53 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> > > On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > > > The reserved_end variable had been used by the bootmem_init() code
> > > > to find a lowest limit of memory available for memmap blob. The original
> > > > code just tried to find a free memory space higher than kernel was placed.
> > > > This limitation seems justified for the memmap ragion search process, but
> > > > I can't see any obvious reason to reserve the unused space below kernel
> > > > seeing some platforms place it much higher than standard 1MB. Moreover
> > > > the RELOCATION config enables it to be loaded at any memory address.
> > > > So lets reserve the memory occupied by the kernel only, leaving the region
> > > > below being free for allocations. After doing this we can now discard the
> > > > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > > > since it's going to be free anyway (unless marked as reserved by
> > > > platforms).
> > > >
> > > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > >
> > > This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> > > image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> > >
> > >     VFS: Mounted root (nfs filesystem) on device 0:13.
> > >     devtmpfs: mounted
> > >     BUG: Bad page state in process swapper  pfn:00001
> > >     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
> > >     flags: 0x0()
> > >     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
> > >     page dumped because: nonzero mapcount
> > >     Modules linked in:
> > >     CPU: 0 PID: 1 Comm: swapper Not tainted
> > > 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
> > >     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> > > 00000001 804a3490
> > >             00000001 81000000 0030f231 80148558 00000003 10008400
> > > 87c1dd80 7599ee13
> > >             00000000 00000000 804b0000 00000000 00000007 00000000
> > > 00000085 00000000
> > >             62722d31 00000084 804b0000 39347874 00000000 804b7820
> > > 8040cef8 81000010
> > >             00000001 00000007 00000001 81000000 00000008 8021de24
> > > 00000000 804a0000
> > >             ...
> > >     Call Trace:
> > >     [<8010adec>] show_stack+0x74/0x104
> > >     [<801a5e44>] bad_page+0x130/0x138
> > >     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
> > >     [<801a789c>] free_unref_page+0x40/0x68
> > >     [<801120f4>] free_init_pages+0xec/0x104
> > >     [<803bdde8>] free_initmem+0x10/0x58
> > >     [<803bdb8c>] kernel_init+0x20/0x100
> > >     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
> > >     Disabling lock debugging due to kernel taint
> > >     BUG: Bad page state in process swapper  pfn:00002
> > >     [...]
> > >
> > > CONFIG_RELOCATABLE is not set, so the only relevant part is the
> > > change quoted below.
> > >
> > > > --- a/arch/mips/kernel/setup.c
> > > > +++ b/arch/mips/kernel/setup.c
> > > > @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
> > > >
> > > >  static void __init bootmem_init(void)
> > > >  {
> > > > -       unsigned long reserved_end;
> > > >         phys_addr_t ramstart = PHYS_ADDR_MAX;
> > > >         int i;
> > > >
> > > > @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
> > > >          * will reserve the area used for the initrd.
> > > >          */
> > > >         init_initrd();
> > > > -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
> > > >
> > > > -       memblock_reserve(PHYS_OFFSET,
> > > > -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> > > > +       /* Reserve memory occupied by kernel. */
> > > > +       memblock_reserve(__pa_symbol(&_text),
> > > > +                       __pa_symbol(&_end) - __pa_symbol(&_text));
> > > >
> > > >         /*
> > > >          * max_low_pfn is not a number of pages. The number of pages
> > >
> > > With some debug code added:
> > >
> > >     Determined physical RAM map:
> > >      memory: 08000000 @ 00000000 (usable)
> > >     bootmem_init:390: PHYS_OFFSET = 0x0
> > >     bootmem_init:391: __pa_symbol(&_text) = 0x100000
> > >     bootmem_init:392: __pa_symbol(&_end) = 0x4b77c8
> > >     bootmem_init:393: PFN_UP(__pa_symbol(&_end)) = 0x4b8
> >
> > Have you tried adding memblock=debug to the command line?
> > Not sure it'll help, but still :)
> 
> Thanks! Output below...
> 
>  Determined physical RAM map:
>   memory: 08000000 @ 00000000 (usable)
> +memblock_reserve: [0x00100000-0x004b77c7] setup_arch+0x258/0x8e4
>  Initrd not found or empty - disabling initrd
> +memblock_reserve: [0x00448000-0x00447fff] setup_arch+0x5ac/0x8e4
> +MEMBLOCK configuration:
> + memory size = 0x08000000 reserved size = 0x003b77c8
> + memory.cnt  = 0x1
> + memory[0x0]    [0x00000000-0x07ffffff], 0x08000000 bytes on node 0 flags: 0x0
> + reserved.cnt  = 0x1
> + reserved[0x0]  [0x00100000-0x004b77c7], 0x003b77c8 bytes flags: 0x0
> +memblock_alloc_try_nid: 32 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 setup_arch+0x7ec/0x8e4
> +memblock_reserve: [0x004b77e0-0x004b77ff] memblock_alloc_range_nid+0x130/0x178
>  Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
>  Primary data cache 32kB, 4-way, VIPT, cache aliases, linesize 32 bytes
>  Zone ranges:
> @@ -16,10 +26,48 @@ Movable zone start for each node
>  Early memory node ranges
>    node   0: [mem 0x0000000000000000-0x0000000007ffffff]
>  Initmem setup node 0 [mem 0x0000000000000000-0x0000000007ffffff]
> +memblock_alloc_try_nid: 1048576 bytes align=0x20 nid=0
> from=0x00000000 max_addr=0x00000000
> alloc_node_mem_map.constprop.31+0x6c/0xc8
> +memblock_reserve: [0x004b7800-0x005b77ff] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 16 bytes align=0x20 nid=0 from=0x00000000
> max_addr=0x00000000 setup_usemap.isra.13+0x68/0xa0
> +memblock_reserve: [0x005b7800-0x005b780f] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 116 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 start_kernel+0xb0/0x508
> +memblock_reserve: [0x005b7820-0x005b7893] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 116 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 start_kernel+0xf0/0x508
> +memblock_reserve: [0x005b78a0-0x005b7913] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 116 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 start_kernel+0x114/0x508
> +memblock_reserve: [0x005b7920-0x005b7993] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 4096 bytes align=0x1000 nid=-1
> from=0x00000000 max_addr=0x00000000 pcpu_alloc_alloc_info+0x60/0xb8
> +memblock_reserve: [0x005b8000-0x005b8fff] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 32768 bytes align=0x1000 nid=-1
> from=0x01000000 max_addr=0x00000000 setup_per_cpu_areas+0x38/0xa8
> +memblock_reserve: [0x01000000-0x01007fff] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_setup_first_chunk+0x200/0x588
> +memblock_reserve: [0x005b79a0-0x005b79a3] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_setup_first_chunk+0x22c/0x588
> +memblock_reserve: [0x005b79c0-0x005b79c3] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_setup_first_chunk+0x250/0x588
> +memblock_reserve: [0x005b79e0-0x005b79e3] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 4 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_setup_first_chunk+0x288/0x588
> +memblock_reserve: [0x005b7a00-0x005b7a03] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 120 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_setup_first_chunk+0x47c/0x588
> +memblock_reserve: [0x005b7a20-0x005b7a97] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 89 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_alloc_first_chunk+0x88/0x2e0
> +memblock_reserve: [0x005b7aa0-0x005b7af8] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 1024 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_alloc_first_chunk+0xd8/0x2e0
> +memblock_reserve: [0x005b7b00-0x005b7eff] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 1028 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_alloc_first_chunk+0x118/0x2e0
> +memblock_reserve: [0x005b9000-0x005b9403] memblock_alloc_range_nid+0x130/0x178
> +memblock_alloc_try_nid: 256 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 pcpu_alloc_first_chunk+0x154/0x2e0
> +memblock_reserve: [0x005b7f00-0x005b7fff] memblock_alloc_range_nid+0x130/0x178
> +   memblock_free: [0x005b8000-0x005b8fff] start_kernel+0x164/0x508
>  Built 1 zonelists, mobility grouping on.  Total pages: 32512
> -Kernel command line:   console=ttyS0,9600 ip=on root=/dev/nfs rw
> nfsroot=192.168.97.29:/nas/rbtx4927/debian-mipsel,tcp,v3
> +Kernel command line:   console=ttyS0,9600 ip=on root=/dev/nfs rw
> nfsroot=192.168.97.29:/nas/rbtx4927/debian-mipsel,tcp,v3
> memblock=debug
> +memblock_alloc_try_nid: 65536 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 alloc_large_system_hash+0x270/0x478
> +memblock_reserve: [0x005b9420-0x005c941f] memblock_alloc_range_nid+0x130/0x178
>  Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
> +memblock_alloc_try_nid: 32768 bytes align=0x20 nid=-1 from=0x00000000
> max_addr=0x00000000 alloc_large_system_hash+0x270/0x478
> +memblock_reserve: [0x005c9420-0x005d141f] memblock_alloc_range_nid+0x130/0x178
>  Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
> +memblock_reserve: [0x00000000-0x000003ff] trap_init+0x58/0x474
>  Memory: 126100K/131072K available (2830K kernel code, 147K rwdata,
> 508K rodata, 220K init, 93K bss, 4972K reserved, 0K cma-reserved)

Presuming your system is !cpu_has_mips_r2_r6 and CAC_BASE is 0 the log
looks completely sane
 
> > > Hence the old code reserved 1 MiB extra at the beginning.
> > >
> > > Note that the new code also dropped the rounding up of the memory block
> > > size to a multiple of PAGE_SIZE. I'm not sure the latter actually
> > > matters or not.
> >
> > I'd say that bad page state for pfn 1 is caused by "freeing" the first 1M.
> 
> Yes, by prom_free_prom_memory(), as pointed out by Serge.

I wonder how other MIPS variants would react to the fact that the memory
below the kernel is not reserved ;-)

> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> -- 
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-22  8:08         ` Mike Rapoport
@ 2019-05-22  8:14           ` Geert Uytterhoeven
  2019-05-22 13:34             ` Serge Semin
  0 siblings, 1 reply; 43+ messages in thread
From: Geert Uytterhoeven @ 2019-05-22  8:14 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Serge Semin, Ralf Baechle, Paul Burton, James Hogan,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto, Maciej W. Rozycki

Hi Mike,

On Wed, May 22, 2019 at 10:08 AM Mike Rapoport <rppt@linux.ibm.com> wrote:
> On Wed, May 22, 2019 at 09:47:04AM +0200, Geert Uytterhoeven wrote:
> > On Tue, May 21, 2019 at 5:53 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> > > > On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > > > > The reserved_end variable had been used by the bootmem_init() code
> > > > > to find a lowest limit of memory available for memmap blob. The original
> > > > > code just tried to find a free memory space higher than kernel was placed.
> > > > > This limitation seems justified for the memmap ragion search process, but
> > > > > I can't see any obvious reason to reserve the unused space below kernel
> > > > > seeing some platforms place it much higher than standard 1MB. Moreover
> > > > > the RELOCATION config enables it to be loaded at any memory address.
> > > > > So lets reserve the memory occupied by the kernel only, leaving the region
> > > > > below being free for allocations. After doing this we can now discard the
> > > > > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > > > > since it's going to be free anyway (unless marked as reserved by
> > > > > platforms).
> > > > >
> > > > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > > >
> > > > This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> > > > image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> > > >
> > > >     VFS: Mounted root (nfs filesystem) on device 0:13.
> > > >     devtmpfs: mounted
> > > >     BUG: Bad page state in process swapper  pfn:00001
> > > >     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
> > > >     flags: 0x0()
> > > >     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
> > > >     page dumped because: nonzero mapcount
> > > >     Modules linked in:
> > > >     CPU: 0 PID: 1 Comm: swapper Not tainted
> > > > 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
> > > >     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> > > > 00000001 804a3490
> > > >             00000001 81000000 0030f231 80148558 00000003 10008400
> > > > 87c1dd80 7599ee13
> > > >             00000000 00000000 804b0000 00000000 00000007 00000000
> > > > 00000085 00000000
> > > >             62722d31 00000084 804b0000 39347874 00000000 804b7820
> > > > 8040cef8 81000010
> > > >             00000001 00000007 00000001 81000000 00000008 8021de24
> > > > 00000000 804a0000
> > > >             ...
> > > >     Call Trace:
> > > >     [<8010adec>] show_stack+0x74/0x104
> > > >     [<801a5e44>] bad_page+0x130/0x138
> > > >     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
> > > >     [<801a789c>] free_unref_page+0x40/0x68
> > > >     [<801120f4>] free_init_pages+0xec/0x104
> > > >     [<803bdde8>] free_initmem+0x10/0x58
> > > >     [<803bdb8c>] kernel_init+0x20/0x100
> > > >     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
> > > >     Disabling lock debugging due to kernel taint
> > > >     BUG: Bad page state in process swapper  pfn:00002
> > > >     [...]
> > > >
> > > > CONFIG_RELOCATABLE is not set, so the only relevant part is the
> > > > change quoted below.
> > > >
> > > > > --- a/arch/mips/kernel/setup.c
> > > > > +++ b/arch/mips/kernel/setup.c
> > > > > @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
> > > > >
> > > > >  static void __init bootmem_init(void)
> > > > >  {
> > > > > -       unsigned long reserved_end;
> > > > >         phys_addr_t ramstart = PHYS_ADDR_MAX;
> > > > >         int i;
> > > > >
> > > > > @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
> > > > >          * will reserve the area used for the initrd.
> > > > >          */
> > > > >         init_initrd();
> > > > > -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
> > > > >
> > > > > -       memblock_reserve(PHYS_OFFSET,
> > > > > -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> > > > > +       /* Reserve memory occupied by kernel. */
> > > > > +       memblock_reserve(__pa_symbol(&_text),
> > > > > +                       __pa_symbol(&_end) - __pa_symbol(&_text));
> > > > >
> > > > >         /*
> > > > >          * max_low_pfn is not a number of pages. The number of pages

> > > > Hence the old code reserved 1 MiB extra at the beginning.
> > > >
> > > > Note that the new code also dropped the rounding up of the memory block
> > > > size to a multiple of PAGE_SIZE. I'm not sure the latter actually
> > > > matters or not.
> > >
> > > I'd say that bad page state for pfn 1 is caused by "freeing" the first 1M.
> >
> > Yes, by prom_free_prom_memory(), as pointed out by Serge.
>
> I wonder how other MIPS variants would react to the fact that the memory
> below the kernel is not reserved ;-)

Exactly...

Looks like at least arch/mips/dec/prom/memory.c needs a similar but\
more complicated fix, due to declance handling...


Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-22  8:14           ` Geert Uytterhoeven
@ 2019-05-22 13:34             ` Serge Semin
  2019-05-22 13:44               ` Geert Uytterhoeven
  0 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-05-22 13:34 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Mike Rapoport, Ralf Baechle, Paul Burton, James Hogan,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto, Maciej W. Rozycki

Hello folks,

On Wed, May 22, 2019 at 10:14:47AM +0200, Geert Uytterhoeven wrote:
> Hi Mike,
> 
> On Wed, May 22, 2019 at 10:08 AM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > On Wed, May 22, 2019 at 09:47:04AM +0200, Geert Uytterhoeven wrote:
> > > On Tue, May 21, 2019 at 5:53 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > > On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> > > > > On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > > > > > The reserved_end variable had been used by the bootmem_init() code
> > > > > > to find a lowest limit of memory available for memmap blob. The original
> > > > > > code just tried to find a free memory space higher than kernel was placed.
> > > > > > This limitation seems justified for the memmap ragion search process, but
> > > > > > I can't see any obvious reason to reserve the unused space below kernel
> > > > > > seeing some platforms place it much higher than standard 1MB. Moreover
> > > > > > the RELOCATION config enables it to be loaded at any memory address.
> > > > > > So lets reserve the memory occupied by the kernel only, leaving the region
> > > > > > below being free for allocations. After doing this we can now discard the
> > > > > > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > > > > > since it's going to be free anyway (unless marked as reserved by
> > > > > > platforms).
> > > > > >
> > > > > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > > > >
> > > > > This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> > > > > image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> > > > >
> > > > >     VFS: Mounted root (nfs filesystem) on device 0:13.
> > > > >     devtmpfs: mounted
> > > > >     BUG: Bad page state in process swapper  pfn:00001
> > > > >     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
> > > > >     flags: 0x0()
> > > > >     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
> > > > >     page dumped because: nonzero mapcount
> > > > >     Modules linked in:
> > > > >     CPU: 0 PID: 1 Comm: swapper Not tainted
> > > > > 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
> > > > >     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> > > > > 00000001 804a3490
> > > > >             00000001 81000000 0030f231 80148558 00000003 10008400
> > > > > 87c1dd80 7599ee13
> > > > >             00000000 00000000 804b0000 00000000 00000007 00000000
> > > > > 00000085 00000000
> > > > >             62722d31 00000084 804b0000 39347874 00000000 804b7820
> > > > > 8040cef8 81000010
> > > > >             00000001 00000007 00000001 81000000 00000008 8021de24
> > > > > 00000000 804a0000
> > > > >             ...
> > > > >     Call Trace:
> > > > >     [<8010adec>] show_stack+0x74/0x104
> > > > >     [<801a5e44>] bad_page+0x130/0x138
> > > > >     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
> > > > >     [<801a789c>] free_unref_page+0x40/0x68
> > > > >     [<801120f4>] free_init_pages+0xec/0x104
> > > > >     [<803bdde8>] free_initmem+0x10/0x58
> > > > >     [<803bdb8c>] kernel_init+0x20/0x100
> > > > >     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
> > > > >     Disabling lock debugging due to kernel taint
> > > > >     BUG: Bad page state in process swapper  pfn:00002
> > > > >     [...]
> > > > >
> > > > > CONFIG_RELOCATABLE is not set, so the only relevant part is the
> > > > > change quoted below.
> > > > >
> > > > > > --- a/arch/mips/kernel/setup.c
> > > > > > +++ b/arch/mips/kernel/setup.c
> > > > > > @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
> > > > > >
> > > > > >  static void __init bootmem_init(void)
> > > > > >  {
> > > > > > -       unsigned long reserved_end;
> > > > > >         phys_addr_t ramstart = PHYS_ADDR_MAX;
> > > > > >         int i;
> > > > > >
> > > > > > @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
> > > > > >          * will reserve the area used for the initrd.
> > > > > >          */
> > > > > >         init_initrd();
> > > > > > -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
> > > > > >
> > > > > > -       memblock_reserve(PHYS_OFFSET,
> > > > > > -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> > > > > > +       /* Reserve memory occupied by kernel. */
> > > > > > +       memblock_reserve(__pa_symbol(&_text),
> > > > > > +                       __pa_symbol(&_end) - __pa_symbol(&_text));
> > > > > >
> > > > > >         /*
> > > > > >          * max_low_pfn is not a number of pages. The number of pages
> 
> > > > > Hence the old code reserved 1 MiB extra at the beginning.
> > > > >
> > > > > Note that the new code also dropped the rounding up of the memory block
> > > > > size to a multiple of PAGE_SIZE. I'm not sure the latter actually
> > > > > matters or not.
> > > >
> > > > I'd say that bad page state for pfn 1 is caused by "freeing" the first 1M.
> > >
> > > Yes, by prom_free_prom_memory(), as pointed out by Serge.
> >
> > I wonder how other MIPS variants would react to the fact that the memory
> > below the kernel is not reserved ;-)
> 
> Exactly...
> 
> Looks like at least arch/mips/dec/prom/memory.c needs a similar but\
> more complicated fix, due to declance handling...
> 

The problem might be fixed there by the next patch:
---
diff --git a/arch/mips/dec/prom/memory.c b/arch/mips/dec/prom/memory.c
index 5073d2ed78bb..5a0c734b5d04 100644
--- a/arch/mips/dec/prom/memory.c
+++ b/arch/mips/dec/prom/memory.c
@@ -91,29 +91,14 @@ void __init prom_meminit(u32 magic)
 		pmax_setup_memory_region();
 	else
 		rex_setup_memory_region();
-}
-
-void __init prom_free_prom_memory(void)
-{
-	unsigned long end;
-
-	/*
-	 * Free everything below the kernel itself but leave
-	 * the first page reserved for the exception handlers.
-	 */
 
 #if IS_ENABLED(CONFIG_DECLANCE)
 	/*
-	 * Leave 128 KB reserved for Lance memory for
-	 * IOASIC DECstations.
+	 * Reserve 128 KB for Lance memory for IOASIC DECstations.
 	 *
 	 * XXX: save this address for use in dec_lance.c?
 	 */
 	if (IOASIC)
-		end = __pa(&_text) - 0x00020000;
-	else
+		memblock_reserve(__pa_symbol(&_text), 0x00020000);
 #endif
-		end = __pa(&_text);
-
-	free_init_pages("unused PROM memory", PAGE_SIZE, end);
 }
---

Didn't wanna use prom_FREE_prom_memory to actually reserve a memory
chunk, so I moved the reservation into the prom_meminit() method.

Regarding the first page for the exception handlers. We don't need
to reserve it here, since it is already done in arch/mips/kernel/traps.c .

Regards,
-Sergey


> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-22 13:34             ` Serge Semin
@ 2019-05-22 13:44               ` Geert Uytterhoeven
  2019-05-22 13:54                 ` Serge Semin
  0 siblings, 1 reply; 43+ messages in thread
From: Geert Uytterhoeven @ 2019-05-22 13:44 UTC (permalink / raw)
  To: Serge Semin
  Cc: Mike Rapoport, Ralf Baechle, Paul Burton, James Hogan,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto, Maciej W. Rozycki

Hi Serge,

On Wed, May 22, 2019 at 3:34 PM Serge Semin <fancer.lancer@gmail.com> wrote:
> On Wed, May 22, 2019 at 10:14:47AM +0200, Geert Uytterhoeven wrote:
> > On Wed, May 22, 2019 at 10:08 AM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > On Wed, May 22, 2019 at 09:47:04AM +0200, Geert Uytterhoeven wrote:
> > > > On Tue, May 21, 2019 at 5:53 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > > > On Tue, May 21, 2019 at 04:56:39PM +0200, Geert Uytterhoeven wrote:
> > > > > > On Wed, Apr 24, 2019 at 12:50 AM Serge Semin <fancer.lancer@gmail.com> wrote:
> > > > > > > The reserved_end variable had been used by the bootmem_init() code
> > > > > > > to find a lowest limit of memory available for memmap blob. The original
> > > > > > > code just tried to find a free memory space higher than kernel was placed.
> > > > > > > This limitation seems justified for the memmap ragion search process, but
> > > > > > > I can't see any obvious reason to reserve the unused space below kernel
> > > > > > > seeing some platforms place it much higher than standard 1MB. Moreover
> > > > > > > the RELOCATION config enables it to be loaded at any memory address.
> > > > > > > So lets reserve the memory occupied by the kernel only, leaving the region
> > > > > > > below being free for allocations. After doing this we can now discard the
> > > > > > > code freeing a space between kernel _text and VMLINUX_LOAD_ADDRESS symbols
> > > > > > > since it's going to be free anyway (unless marked as reserved by
> > > > > > > platforms).
> > > > > > >
> > > > > > > Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
> > > > > >
> > > > > > This is now commit b93ddc4f9156205e ("mips: Reserve memory for the kernel
> > > > > > image resources") in v5.2-rc1, which causes rbtx4927 to crash during boot:
> > > > > >
> > > > > >     VFS: Mounted root (nfs filesystem) on device 0:13.
> > > > > >     devtmpfs: mounted
> > > > > >     BUG: Bad page state in process swapper  pfn:00001
> > > > > >     page:804b7820 refcount:0 mapcount:-128 mapping:00000000 index:0x1
> > > > > >     flags: 0x0()
> > > > > >     raw: 00000000 00000100 00000200 00000000 00000001 00000000 ffffff7f 00000000
> > > > > >     page dumped because: nonzero mapcount
> > > > > >     Modules linked in:
> > > > > >     CPU: 0 PID: 1 Comm: swapper Not tainted
> > > > > > 5.2.0-rc1-rbtx4927-00468-g3c05ea3d4077b756-dirty #137
> > > > > >     Stack : 00000000 10008400 8040dd2c 87c1b974 8044af63 8040dd2c
> > > > > > 00000001 804a3490
> > > > > >             00000001 81000000 0030f231 80148558 00000003 10008400
> > > > > > 87c1dd80 7599ee13
> > > > > >             00000000 00000000 804b0000 00000000 00000007 00000000
> > > > > > 00000085 00000000
> > > > > >             62722d31 00000084 804b0000 39347874 00000000 804b7820
> > > > > > 8040cef8 81000010
> > > > > >             00000001 00000007 00000001 81000000 00000008 8021de24
> > > > > > 00000000 804a0000
> > > > > >             ...
> > > > > >     Call Trace:
> > > > > >     [<8010adec>] show_stack+0x74/0x104
> > > > > >     [<801a5e44>] bad_page+0x130/0x138
> > > > > >     [<801a654c>] free_pcppages_bulk+0x17c/0x3b0
> > > > > >     [<801a789c>] free_unref_page+0x40/0x68
> > > > > >     [<801120f4>] free_init_pages+0xec/0x104
> > > > > >     [<803bdde8>] free_initmem+0x10/0x58
> > > > > >     [<803bdb8c>] kernel_init+0x20/0x100
> > > > > >     [<801057c8>] ret_from_kernel_thread+0x14/0x1c
> > > > > >     Disabling lock debugging due to kernel taint
> > > > > >     BUG: Bad page state in process swapper  pfn:00002
> > > > > >     [...]
> > > > > >
> > > > > > CONFIG_RELOCATABLE is not set, so the only relevant part is the
> > > > > > change quoted below.
> > > > > >
> > > > > > > --- a/arch/mips/kernel/setup.c
> > > > > > > +++ b/arch/mips/kernel/setup.c
> > > > > > > @@ -371,7 +371,6 @@ static void __init bootmem_init(void)
> > > > > > >
> > > > > > >  static void __init bootmem_init(void)
> > > > > > >  {
> > > > > > > -       unsigned long reserved_end;
> > > > > > >         phys_addr_t ramstart = PHYS_ADDR_MAX;
> > > > > > >         int i;
> > > > > > >
> > > > > > > @@ -382,10 +381,10 @@ static void __init bootmem_init(void)
> > > > > > >          * will reserve the area used for the initrd.
> > > > > > >          */
> > > > > > >         init_initrd();
> > > > > > > -       reserved_end = (unsigned long) PFN_UP(__pa_symbol(&_end));
> > > > > > >
> > > > > > > -       memblock_reserve(PHYS_OFFSET,
> > > > > > > -                        (reserved_end << PAGE_SHIFT) - PHYS_OFFSET);
> > > > > > > +       /* Reserve memory occupied by kernel. */
> > > > > > > +       memblock_reserve(__pa_symbol(&_text),
> > > > > > > +                       __pa_symbol(&_end) - __pa_symbol(&_text));
> > > > > > >
> > > > > > >         /*
> > > > > > >          * max_low_pfn is not a number of pages. The number of pages
> >
> > > > > > Hence the old code reserved 1 MiB extra at the beginning.
> > > > > >
> > > > > > Note that the new code also dropped the rounding up of the memory block
> > > > > > size to a multiple of PAGE_SIZE. I'm not sure the latter actually
> > > > > > matters or not.
> > > > >
> > > > > I'd say that bad page state for pfn 1 is caused by "freeing" the first 1M.
> > > >
> > > > Yes, by prom_free_prom_memory(), as pointed out by Serge.
> > >
> > > I wonder how other MIPS variants would react to the fact that the memory
> > > below the kernel is not reserved ;-)
> >
> > Exactly...
> >
> > Looks like at least arch/mips/dec/prom/memory.c needs a similar but\
> > more complicated fix, due to declance handling...
> >
>
> The problem might be fixed there by the next patch:
> ---
> diff --git a/arch/mips/dec/prom/memory.c b/arch/mips/dec/prom/memory.c
> index 5073d2ed78bb..5a0c734b5d04 100644
> --- a/arch/mips/dec/prom/memory.c
> +++ b/arch/mips/dec/prom/memory.c
> @@ -91,29 +91,14 @@ void __init prom_meminit(u32 magic)
>                 pmax_setup_memory_region();
>         else
>                 rex_setup_memory_region();
> -}
> -
> -void __init prom_free_prom_memory(void)
> -{
> -       unsigned long end;
> -
> -       /*
> -        * Free everything below the kernel itself but leave
> -        * the first page reserved for the exception handlers.
> -        */
>
>  #if IS_ENABLED(CONFIG_DECLANCE)
>         /*
> -        * Leave 128 KB reserved for Lance memory for
> -        * IOASIC DECstations.
> +        * Reserve 128 KB for Lance memory for IOASIC DECstations.
>          *
>          * XXX: save this address for use in dec_lance.c?
>          */
>         if (IOASIC)
> -               end = __pa(&_text) - 0x00020000;
> -       else
> +               memblock_reserve(__pa_symbol(&_text), 0x00020000);

Shouldn't that be

    memblock_reserve(__pa_symbol(&_text) - 0x00020000, 0x00020000);

?

>  #endif
> -               end = __pa(&_text);
> -
> -       free_init_pages("unused PROM memory", PAGE_SIZE, end);
>  }
> ---
>
> Didn't wanna use prom_FREE_prom_memory to actually reserve a memory
> chunk, so I moved the reservation into the prom_meminit() method.

I guess Maciej will test this on real hardware, eventually...

> Regarding the first page for the exception handlers. We don't need
> to reserve it here, since it is already done in arch/mips/kernel/traps.c .

Thanks for checking! That was actually something I was still wondering
about...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-22 13:44               ` Geert Uytterhoeven
@ 2019-05-22 13:54                 ` Serge Semin
  2020-10-14  9:49                   ` Maciej W. Rozycki
  0 siblings, 1 reply; 43+ messages in thread
From: Serge Semin @ 2019-05-22 13:54 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Mike Rapoport, Ralf Baechle, Paul Burton, James Hogan,
	Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto, Maciej W. Rozycki

[-- Attachment #1: Type: text/plain, Size: 2420 bytes --]

On Wed, May 22, 2019 at 03:44:47PM +0200, Geert Uytterhoeven wrote:
> Hi Serge,
>
> ...
> 
> >
> > The problem might be fixed there by the next patch:
> > ---
> > diff --git a/arch/mips/dec/prom/memory.c b/arch/mips/dec/prom/memory.c
> > index 5073d2ed78bb..5a0c734b5d04 100644
> > --- a/arch/mips/dec/prom/memory.c
> > +++ b/arch/mips/dec/prom/memory.c
> > @@ -91,29 +91,14 @@ void __init prom_meminit(u32 magic)
> >                 pmax_setup_memory_region();
> >         else
> >                 rex_setup_memory_region();
> > -}
> > -
> > -void __init prom_free_prom_memory(void)
> > -{
> > -       unsigned long end;
> > -
> > -       /*
> > -        * Free everything below the kernel itself but leave
> > -        * the first page reserved for the exception handlers.
> > -        */
> >
> >  #if IS_ENABLED(CONFIG_DECLANCE)
> >         /*
> > -        * Leave 128 KB reserved for Lance memory for
> > -        * IOASIC DECstations.
> > +        * Reserve 128 KB for Lance memory for IOASIC DECstations.
> >          *
> >          * XXX: save this address for use in dec_lance.c?
> >          */
> >         if (IOASIC)
> > -               end = __pa(&_text) - 0x00020000;
> > -       else
> > +               memblock_reserve(__pa_symbol(&_text), 0x00020000);
> 
> Shouldn't that be
> 
>     memblock_reserve(__pa_symbol(&_text) - 0x00020000, 0x00020000);
> 
> ?
> 

Right. Thanks. The updated version of the patch is attached to this email.

-Sergey

> >  #endif
> > -               end = __pa(&_text);
> > -
> > -       free_init_pages("unused PROM memory", PAGE_SIZE, end);
> >  }
> > ---
> >
> > Didn't wanna use prom_FREE_prom_memory to actually reserve a memory
> > chunk, so I moved the reservation into the prom_meminit() method.
> 
> I guess Maciej will test this on real hardware, eventually...
> 
> > Regarding the first page for the exception handlers. We don't need
> > to reserve it here, since it is already done in arch/mips/kernel/traps.c .
> 
> Thanks for checking! That was actually something I was still wondering
> about...
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> -- 
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

[-- Attachment #2: fix_dec.patch --]
[-- Type: text/x-patch, Size: 952 bytes --]

diff --git a/arch/mips/dec/prom/memory.c b/arch/mips/dec/prom/memory.c
index 5073d2ed78bb..031c0cd45d85 100644
--- a/arch/mips/dec/prom/memory.c
+++ b/arch/mips/dec/prom/memory.c
@@ -91,29 +91,14 @@ void __init prom_meminit(u32 magic)
 		pmax_setup_memory_region();
 	else
 		rex_setup_memory_region();
-}
-
-void __init prom_free_prom_memory(void)
-{
-	unsigned long end;
-
-	/*
-	 * Free everything below the kernel itself but leave
-	 * the first page reserved for the exception handlers.
-	 */
 
 #if IS_ENABLED(CONFIG_DECLANCE)
 	/*
-	 * Leave 128 KB reserved for Lance memory for
-	 * IOASIC DECstations.
+	 * Reserve 128 KB for Lance memory for IOASIC DECstations.
 	 *
 	 * XXX: save this address for use in dec_lance.c?
 	 */
 	if (IOASIC)
-		end = __pa(&_text) - 0x00020000;
-	else
+		memblock_reserve(__pa_symbol(&_text) - 0x00020000, 0x00020000);
 #endif
-		end = __pa(&_text);
-
-	free_init_pages("unused PROM memory", PAGE_SIZE, end);
 }

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/12] mips: Reserve memory for the kernel image resources
  2019-05-22 13:54                 ` Serge Semin
@ 2020-10-14  9:49                   ` Maciej W. Rozycki
  0 siblings, 0 replies; 43+ messages in thread
From: Maciej W. Rozycki @ 2020-10-14  9:49 UTC (permalink / raw)
  To: Serge Semin
  Cc: Geert Uytterhoeven, Mike Rapoport, Ralf Baechle, Paul Burton,
	James Hogan, Andrew Morton, Michal Hocko, Greg Kroah-Hartman,
	Thomas Bogendoerfer, Huacai Chen, Stefan Agner, Stephen Rothwell,
	Alexandre Belloni, Juergen Gross, linux-mips,
	Linux Kernel Mailing List, Atsushi Nemoto

On Wed, 22 May 2019, Serge Semin wrote:

> > > The problem might be fixed there by the next patch:
> > > ---
> > > diff --git a/arch/mips/dec/prom/memory.c b/arch/mips/dec/prom/memory.c
> > > index 5073d2ed78bb..5a0c734b5d04 100644
> > > --- a/arch/mips/dec/prom/memory.c
> > > +++ b/arch/mips/dec/prom/memory.c
> > > @@ -91,29 +91,14 @@ void __init prom_meminit(u32 magic)
> > >                 pmax_setup_memory_region();
> > >         else
> > >                 rex_setup_memory_region();
> > > -}
> > > -
> > > -void __init prom_free_prom_memory(void)
> > > -{
> > > -       unsigned long end;
> > > -
> > > -       /*
> > > -        * Free everything below the kernel itself but leave
> > > -        * the first page reserved for the exception handlers.
> > > -        */
> > >
> > >  #if IS_ENABLED(CONFIG_DECLANCE)
> > >         /*
> > > -        * Leave 128 KB reserved for Lance memory for
> > > -        * IOASIC DECstations.
> > > +        * Reserve 128 KB for Lance memory for IOASIC DECstations.
> > >          *
> > >          * XXX: save this address for use in dec_lance.c?
> > >          */
> > >         if (IOASIC)
> > > -               end = __pa(&_text) - 0x00020000;
> > > -       else
> > > +               memblock_reserve(__pa_symbol(&_text), 0x00020000);
> > 
> > Shouldn't that be
> > 
> >     memblock_reserve(__pa_symbol(&_text) - 0x00020000, 0x00020000);
> > 
> > ?
> > 
> 
> Right. Thanks. The updated version of the patch is attached to this email.
> 
> -Sergey
> 
> > >  #endif
> > > -               end = __pa(&_text);
> > > -
> > > -       free_init_pages("unused PROM memory", PAGE_SIZE, end);
> > >  }
> > > ---
> > >
> > > Didn't wanna use prom_FREE_prom_memory to actually reserve a memory
> > > chunk, so I moved the reservation into the prom_meminit() method.
> > 
> > I guess Maciej will test this on real hardware, eventually...

 I finally got to it as I was hit by it the hard way (and I do have kept 
the thread in my inbox!), however this is the wrong fix.

 With DEC hardware the whole 128KiB region is reserved as firmware working 
memory area and we call into the firmware throughout bootstrap on several 
occasions.  Therefore we have to stay away from it until we know we won't 
need any firmware services any longer, which is at this point.  So this 
piece has to stay, and the removed reservation has to be reinstated in 
platform code.  I'll be posting a fix separately.

 NB I suspect CFE platforms may need a similar fix, but I don't have 
access to my SWARM now, so I'll verify it on another occasion.

 Other platforms may need it too; at least up to a point an assumption was 
the kernel load address is just at the end of any firmware working area 
typically allocated right beyond the exception vector area, hence the 
reservation.  I realise the assumption may have changed at one point and 
the oldtimers who have known it may have been away or not paying enough 
attention while the newcomers did not realise that.

  Maciej

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2020-10-14  9:49 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-23 22:47 [PATCH 00/12] mips: Post-bootmem-memblock transition fixes Serge Semin
2019-04-23 22:47 ` [PATCH 01/12] mips: Make sure kernel .bss exists in boot mem pool Serge Semin
2019-04-24 22:30   ` Paul Burton
2019-04-23 22:47 ` [PATCH 02/12] mips: Discard rudiments from bootmem_init Serge Semin
2019-04-24 22:30   ` Paul Burton
2019-04-23 22:47 ` [PATCH 03/12] mips: Combine memblock init and memory reservation loops Serge Semin
2019-04-24 22:30   ` Paul Burton
2019-04-23 22:47 ` [PATCH 04/12] mips: Reserve memory for the kernel image resources Serge Semin
2019-04-24 22:43   ` Paul Burton
2019-04-26  0:00     ` Serge Semin
2019-04-30 22:58       ` Paul Burton
2019-05-02 14:24         ` Serge Semin
2019-05-02 18:45           ` Paul Burton
2019-05-03 17:21             ` Serge Semin
2019-05-02 18:35   ` Paul Burton
2019-05-21 14:56   ` Geert Uytterhoeven
2019-05-21 15:53     ` Mike Rapoport
2019-05-21 16:39       ` Serge Semin
2019-05-22  7:50         ` Geert Uytterhoeven
2019-05-22  7:47       ` Geert Uytterhoeven
2019-05-22  8:08         ` Mike Rapoport
2019-05-22  8:14           ` Geert Uytterhoeven
2019-05-22 13:34             ` Serge Semin
2019-05-22 13:44               ` Geert Uytterhoeven
2019-05-22 13:54                 ` Serge Semin
2020-10-14  9:49                   ` Maciej W. Rozycki
2019-04-23 22:47 ` [PATCH 05/12] mips: Discard post-CMA-init foreach loop Serge Semin
2019-05-02 18:35   ` Paul Burton
2019-04-23 22:47 ` [PATCH 06/12] mips: Use memblock to reserve the __nosave memory range Serge Semin
2019-05-02 18:35   ` Paul Burton
2019-04-23 22:47 ` [PATCH 07/12] mips: Add reserve-nomap memory type support Serge Semin
2019-05-02 18:35   ` Paul Burton
2019-04-23 22:47 ` [PATCH 08/12] mips: Dump memblock regions for debugging Serge Semin
2019-04-24 13:45   ` Mike Rapoport
2019-04-24 14:20     ` Serge Semin
2019-04-23 22:47 ` [PATCH 09/12] mips: Perform early low memory test Serge Semin
2019-04-23 22:47 ` [PATCH 10/12] mips: Print the kernel virtual mem layout on debugging Serge Semin
2019-04-24 13:47   ` Mike Rapoport
2019-04-24 14:35     ` Serge Semin
2019-04-23 22:47 ` [PATCH 11/12] mips: Make sure dt memory regions are valid Serge Semin
2019-04-23 22:47 ` [PATCH 12/12] mips: Enable OF_RESERVED_MEM config Serge Semin
2019-04-24  6:17   ` Christoph Hellwig
2019-04-24  8:34     ` Serge Semin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).