* [PATCH -v9 00/31] use lmb with x86
@ 2010-03-29  2:42 ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:42 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

The new lmb could be used to replace early_res in x86.

Suggested by: David, Ben, and Thomas

First three patches should go into 2.6.34

-v6: change sequence as requested by Thomas
-v7: separate them into more patches
-v8: add boundary checking to make sure we do not free a partial page.
-v9: use lmb_debug to control the printout of reserve_lmb.
     add e820 cleanup, and e820 becomes __initdata

> size vmlinux.*
   text		   data	    bss		    dec		    hex	filename
20195694	4149812	12627536	36973042	23429f2	vmlinux.before_lmb_patchset
20198187	4163892	12614736	36976815	23438af	vmlinux.after_lmb_patchset

Before:
[   12.124431] Freeing unused kernel memory: 2740k freed
After:
[   11.514822] Freeing unused kernel memory: 2764k freed

So we move about 24k to .init.

Please check

Thanks

Yinghai

* [PATCH 01/31] x86: Make smp_locks end with page alignment
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:42   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:42 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Try to fix
------------[ cut here ]------------
WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa()
free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.34-rc2-tip-03946-g4f16b23-dirty #50
Call Trace:
 [<40232e9f>] warn_slowpath_common+0x65/0x7c
 [<4021c9f0>] ? free_init_pages+0x4c/0xfa
 [<40881434>] ? _etext+0x0/0x24
 [<40232eea>] warn_slowpath_fmt+0x24/0x27
 [<4021c9f0>] free_init_pages+0x4c/0xfa
 [<40881434>] ? _etext+0x0/0x24
 [<40d3f4bd>] alternative_instructions+0xf6/0x100
 [<40d3fe4f>] check_bugs+0xbd/0xbf
 [<40d398a7>] start_kernel+0x2d5/0x2e4
 [<40d390ce>] i386_start_kernel+0xce/0xd5
---[ end trace 4eaa2a86a8e2da22 ]---

Comments in vmlinux.lds.S already said:
|        /*
|         * smp_locks might be freed after init
|         * start/end must be page aligned
|         */
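
For reference, the boundary check that fires this warning can be modeled
in plain C (a simplified, hypothetical sketch of the arch/x86/mm/init.c
check, not the verbatim kernel code):

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PAGE_MASK	(~(PAGE_SIZE - 1))

/* Model of the free_init_pages() boundary check */
static void check_range(unsigned long begin, unsigned long end)
{
	if ((begin & ~PAGE_MASK) || (end & ~PAGE_MASK))
		printf("free_init_pages: range [0x%lx, 0x%lx] is not aligned\n",
		       begin, end);
}

int main(void)
{
	/* the unaligned __smp_locks_end seen in the trace above */
	check_range(0x40daf000UL, 0x40db5c24UL);
	return 0;
}

With __smp_locks_end placed before the ALIGN(PAGE_SIZE), the end of the
freed range lands mid-page and the check above trips; moving it after
the ALIGN makes both boundaries page aligned.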

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
 arch/x86/kernel/vmlinux.lds.S |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 44879df..2cc2497 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -291,8 +291,8 @@ SECTIONS
 	.smp_locks : AT(ADDR(.smp_locks) - LOAD_OFFSET) {
 		__smp_locks = .;
 		*(.smp_locks)
-		__smp_locks_end = .;
 		. = ALIGN(PAGE_SIZE);
+		__smp_locks_end = .;
 	}
 
 #ifdef CONFIG_X86_64
-- 
1.6.4.2


* [PATCH 02/31] x86: Make sure free_init_pages() free pages in boundary
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:42   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:42 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

With CONFIG_NO_BOOTMEM, memory can be used more efficiently, i.e. more
compactly.

For example:
Allocated new RAMDISK: 00ec2000 - 0248ce57
Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56

The new RAMDISK's end is not page aligned, so its last page could end up
shared with another user.

When free_init_pages() is called for the initrd or .init, freeing that
shared page could corrupt the other user's data.

The code segment in free_init_pages():
|        for (; addr < end; addr += PAGE_SIZE) {
|                ClearPageReserved(virt_to_page(addr));
|                init_page_count(virt_to_page(addr));
|                memset((void *)(addr & ~(PAGE_SIZE-1)),
|                        POISON_FREE_INITMEM, PAGE_SIZE);
|                free_page(addr);
|                totalram_pages++;
|        }
The last partial page would be handed back as one whole free page.

Try to make the boundaries page aligned.
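
The boundary arithmetic involved, as a minimal sketch (standard
definitions of PAGE_ALIGN/PAGE_MASK, with the unaligned RAMDISK end
address from the example above):

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & PAGE_MASK)

int main(void)
{
	unsigned long end = 0x0248ce57UL;	/* unaligned RAMDISK end */

	/* round up: keep the whole trailing partial page reserved */
	printf("PAGE_ALIGN(end) = 0x%lx\n", PAGE_ALIGN(end));	/* 0x248d000 */
	/* round down: free only whole pages */
	printf("end & PAGE_MASK = 0x%lx\n", end & PAGE_MASK);	/* 0x248c000 */
	return 0;
}

Reservations round the end up so nobody else can grab the partial page,
while free_init_pages() rounds down so it never frees past the range it
was given.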

-v2: make the original initrd reservation aligned, according to Johannes;
     otherwise we have a chance to lose one page.
     We still need to keep initrd_end unaligned, otherwise it could
     confuse the decompressor.
-v3: change to WARN_ON instead, according to Johannes.
-v4: use PAGE_ALIGN, according to Johannes.
     We may fix that macro name later to PAGE_ALIGN_UP, with PAGE_ALIGN_DOWN.
     Add comments about assuming the ramdisk start is aligned.
     In relocate_initrd(), re-get ramdisk_image instead of saving it,
     to make the diff smaller.
     Add WARN for wrong ranges, according to Johannes.
-v6: remove one WARN.
     We need to align begin in free_init_pages().
     Do not copy more than ramdisk_size, according to Johannes.

Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 arch/x86/kernel/head32.c |    3 ++-
 arch/x86/kernel/head64.c |    3 ++-
 arch/x86/kernel/setup.c  |   10 ++++++----
 arch/x86/mm/init.c       |   32 ++++++++++++++++++++++++++------
 4 files changed, 36 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index adedeef..9a97504 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -44,9 +44,10 @@ void __init i386_start_kernel(void)
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
 	if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
+		/* Assume only end is not page aligned */
 		u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 		u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
-		u64 ramdisk_end   = ramdisk_image + ramdisk_size;
+		u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index b5a9896..7147143 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -103,9 +103,10 @@ void __init x86_64_start_reservations(char *real_mode_data)
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
 	if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
+		/* Assume only end is not page aligned */
 		unsigned long ramdisk_image = boot_params.hdr.ramdisk_image;
 		unsigned long ramdisk_size  = boot_params.hdr.ramdisk_size;
-		unsigned long ramdisk_end   = ramdisk_image + ramdisk_size;
+		unsigned long ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 5d7ba1a..d76e185 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -314,16 +314,17 @@ static void __init reserve_brk(void)
 #define MAX_MAP_CHUNK	(NR_FIX_BTMAPS << PAGE_SHIFT)
 static void __init relocate_initrd(void)
 {
-
+	/* Assume only end is not page aligned */
 	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+	u64 area_size     = PAGE_ALIGN(ramdisk_size);
 	u64 end_of_lowmem = max_low_pfn_mapped << PAGE_SHIFT;
 	u64 ramdisk_here;
 	unsigned long slop, clen, mapaddr;
 	char *p, *q;
 
 	/* We need to move the initrd down into lowmem */
-	ramdisk_here = find_e820_area(0, end_of_lowmem, ramdisk_size,
+	ramdisk_here = find_e820_area(0, end_of_lowmem, area_size,
 					 PAGE_SIZE);
 
 	if (ramdisk_here == -1ULL)
@@ -332,7 +333,7 @@ static void __init relocate_initrd(void)
 
 	/* Note: this includes all the lowmem currently occupied by
 	   the initrd, we rely on that fact to keep the data intact. */
-	reserve_early(ramdisk_here, ramdisk_here + ramdisk_size,
+	reserve_early(ramdisk_here, ramdisk_here + area_size,
 			 "NEW RAMDISK");
 	initrd_start = ramdisk_here + PAGE_OFFSET;
 	initrd_end   = initrd_start + ramdisk_size;
@@ -376,9 +377,10 @@ static void __init relocate_initrd(void)
 
 static void __init reserve_initrd(void)
 {
+	/* Assume only end is not page aligned */
 	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
-	u64 ramdisk_end   = ramdisk_image + ramdisk_size;
+	u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 	u64 end_of_lowmem = max_low_pfn_mapped << PAGE_SHIFT;
 
 	if (!boot_params.hdr.type_of_loader ||
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index e71c5cb..452ee5b 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -331,11 +331,23 @@ int devmem_is_allowed(unsigned long pagenr)
 
 void free_init_pages(char *what, unsigned long begin, unsigned long end)
 {
-	unsigned long addr = begin;
+	unsigned long addr;
+	unsigned long begin_aligned, end_aligned;
 
-	if (addr >= end)
+	/* Make sure boundaries are page aligned */
+	begin_aligned = PAGE_ALIGN(begin);
+	end_aligned   = end & PAGE_MASK;
+
+	if (WARN_ON(begin_aligned != begin || end_aligned != end)) {
+		begin = begin_aligned;
+		end   = end_aligned;
+	}
+
+	if (begin >= end)
 		return;
 
+	addr = begin;
+
 	/*
 	 * If debugging page accesses then do not free this memory but
 	 * mark them not present - any buggy init-section access will
@@ -343,7 +355,7 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
 	 */
 #ifdef CONFIG_DEBUG_PAGEALLOC
 	printk(KERN_INFO "debug: unmapping init memory %08lx..%08lx\n",
-		begin, PAGE_ALIGN(end));
+		begin, end);
 	set_memory_np(begin, (end - begin) >> PAGE_SHIFT);
 #else
 	/*
@@ -358,8 +370,7 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
 	for (; addr < end; addr += PAGE_SIZE) {
 		ClearPageReserved(virt_to_page(addr));
 		init_page_count(virt_to_page(addr));
-		memset((void *)(addr & ~(PAGE_SIZE-1)),
-			POISON_FREE_INITMEM, PAGE_SIZE);
+		memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
 		free_page(addr);
 		totalram_pages++;
 	}
@@ -376,6 +387,15 @@ void free_initmem(void)
 #ifdef CONFIG_BLK_DEV_INITRD
 void free_initrd_mem(unsigned long start, unsigned long end)
 {
-	free_init_pages("initrd memory", start, end);
+	/*
+	 * end may be unaligned, and we cannot align it:
+	 * the decompressor could be confused by an aligned initrd_end.
+	 * We already reserved the trailing partial page before, in
+	 *   - i386_start_kernel()
+	 *   - x86_64_start_kernel()
+	 *   - relocate_initrd()
+	 * so here we can safely PAGE_ALIGN() to get that partial page freed.
+	 */
+	free_init_pages("initrd memory", start, PAGE_ALIGN(end));
 }
 #endif
-- 
1.6.4.2


* [PATCH 03/31] x86: Do not free zero sized per cpu areas
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:42   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:42 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Ian Campbell,
	Yinghai Lu, Peter Zijlstra, Ingo Molnar

From: Ian Campbell <ian.campbell@citrix.com>

This avoids an infinite loop in free_early_partial().

Add a warning to free_early_partial to catch future problems.
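
How the loop went infinite, as a minimal model (a hypothetical
simplification of the early_res walk, not the verbatim kernel code):

/* free_early_partial() looks for an early_res entry overlapping
 * [start, end) and frees the intersection, retrying until nothing
 * overlaps.  An empty range (start == end) can still pass the overlap
 * test, but freeing an empty intersection makes no progress, so the
 * walk never terminated.  A zero sized per cpu area produces exactly
 * such a call. */
void __init free_early_partial(u64 start, u64 end)
{
	if (start == end)	/* the fix: nothing to free */
		return;

	if (WARN_ONCE(start > end,
		      "  wrong range [%#llx, %#llx]\n", start, end))
		return;		/* the fix: reject bogus ranges */

	/* ... walk overlapping early_res entries and punch holes ... */
}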

-v5: put the start > end check back into WARN_ONCE()
-v6: use one line for the warning, according to Linus
-v7: more test by
-v8: remove the function name, according to Johannes;
     WARN_ONCE will print that function name anyway.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Joel Becker <joel.becker@oracle.com>
Tested-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
---
 kernel/early_res.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/early_res.c b/kernel/early_res.c
index 3cb2c66..31aa933 100644
--- a/kernel/early_res.c
+++ b/kernel/early_res.c
@@ -333,6 +333,12 @@ void __init free_early_partial(u64 start, u64 end)
 	struct early_res *r;
 	int i;
 
+	if (start == end)
+		return;
+
+	if (WARN_ONCE(start > end, "  wrong range [%#llx, %#llx]\n", start, end))
+		return;
+
 try_next:
 	i = find_overlapped_early(start, end);
 	if (i >= max_early_res)
-- 
1.6.4.2


* [PATCH 04/31] lmb: Move lmb.c to mm/
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:42   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:42 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

It is memory related, so move it to mm/, according to Ingo.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 lib/Makefile      |    2 --
 mm/Makefile       |    2 ++
 {lib => mm}/lmb.c |    0
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename {lib => mm}/lmb.c (100%)

diff --git a/lib/Makefile b/lib/Makefile
index 2e152ae..a463a4d 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -85,8 +85,6 @@ obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
 
-obj-$(CONFIG_HAVE_LMB) += lmb.o
-
 obj-$(CONFIG_HAVE_ARCH_TRACEHOOK) += syscall.o
 
 obj-$(CONFIG_DYNAMIC_DEBUG) += dynamic_debug.o
diff --git a/mm/Makefile b/mm/Makefile
index 7a68d2a..df22fd1 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -15,6 +15,8 @@ obj-y			:= bootmem.o filemap.o mempool.o oom_kill.o fadvise.o \
 			   $(mmu-y)
 obj-y += init-mm.o
 
+obj-$(CONFIG_HAVE_LMB) += lmb.o
+
 obj-$(CONFIG_BOUNCE)	+= bounce.o
 obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o thrash.o
 obj-$(CONFIG_HAS_DMA)	+= dmapool.o
diff --git a/lib/lmb.c b/mm/lmb.c
similarity index 100%
rename from lib/lmb.c
rename to mm/lmb.c
-- 
1.6.4.2


* [PATCH 05/31] lmb: Separate region array from lmb_region struct
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:42   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:42 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

lmb_init() will connect them back.

Add nr_regions in struct lmb_region to track region array size.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    3 ++-
 mm/lmb.c            |    9 ++++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index f3d1433..e14ea8d 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -26,7 +26,8 @@ struct lmb_property {
 struct lmb_region {
 	unsigned long cnt;
 	u64 size;
-	struct lmb_property region[MAX_LMB_REGIONS+1];
+	struct lmb_property *region;
+	unsigned long nr_regions;
 };
 
 struct lmb {
diff --git a/mm/lmb.c b/mm/lmb.c
index b1fc526..65b62dc 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -18,6 +18,8 @@
 #define LMB_ALLOC_ANYWHERE	0
 
 struct lmb lmb;
+static struct lmb_property lmb_memory_region[MAX_LMB_REGIONS + 1];
+static struct lmb_property lmb_reserved_region[MAX_LMB_REGIONS + 1];
 
 static int lmb_debug;
 
@@ -106,6 +108,11 @@ static void lmb_coalesce_regions(struct lmb_region *rgn,
 
 void __init lmb_init(void)
 {
+	lmb.memory.region   = lmb_memory_region;
+	lmb.reserved.region = lmb_reserved_region;
+	lmb.memory.nr_regions   = ARRAY_SIZE(lmb_memory_region);
+	lmb.reserved.nr_regions = ARRAY_SIZE(lmb_reserved_region);
+
 	/* Create a dummy zero size LMB which will get coalesced away later.
 	 * This simplifies the lmb_add() code below...
 	 */
@@ -169,7 +176,7 @@ static long lmb_add_region(struct lmb_region *rgn, u64 base, u64 size)
 
 	if (coalesced)
 		return coalesced;
-	if (rgn->cnt >= MAX_LMB_REGIONS)
+	if (rgn->cnt > rgn->nr_regions)
 		return -1;
 
 	/* Couldn't coalesce the LMB, so add it to the sorted table. */
-- 
1.6.4.2


* [PATCH 06/31] lmb: Add find_lmb_area()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:42   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:42 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

It will try to find an area matching size/align in the specified range
(start, end).

We need it to find a correct buffer for the new lmb.reserved.region array.

It also makes it easier for x86 to use lmb:
x86 early_res uses a find/reserve pattern instead of alloc.

find_lmb_area() will honor the goal.

When we need a temporary buffer for a range array etc., using lmb_alloc()
would require fixup code afterwards for that buffer, because it would
already be in lmb.reserved.
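
The find/reserve pattern would look roughly like this (a usage sketch
against the new interface; example_find_then_reserve is an illustrative
name, not part of the patch):

#include <linux/lmb.h>
#include <linux/mm.h>

/* Pick a spot first, commit it later -- unlike lmb_alloc(), nothing
 * lands in lmb.reserved until we decide to keep the area. */
static u64 __init example_find_then_reserve(u64 start, u64 end, u64 size)
{
	u64 addr = find_lmb_area(start, end, size, PAGE_SIZE);

	if (addr == -1ULL)
		return addr;		/* no free area in [start, end) */

	lmb_reserve(addr, size);	/* now it shows up in lmb.reserved */
	return addr;
}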

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    4 ++
 mm/lmb.c            |   81 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index e14ea8d..05234bd 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -83,6 +83,10 @@ lmb_end_pfn(struct lmb_region *type, unsigned long region_nr)
 	       lmb_size_pages(type, region_nr);
 }
 
+u64 __find_lmb_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
+			 u64 size, u64 align);
+u64 find_lmb_area(u64 start, u64 end, u64 size, u64 align);
+
 #include <asm/lmb.h>
 
 #endif /* __KERNEL__ */
diff --git a/mm/lmb.c b/mm/lmb.c
index 65b62dc..d5d5dc4 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -11,9 +11,13 @@
  */
 
 #include <linux/kernel.h>
+#include <linux/types.h>
 #include <linux/init.h>
 #include <linux/bitops.h>
 #include <linux/lmb.h>
+#include <linux/bootmem.h>
+#include <linux/mm.h>
+#include <linux/range.h>
 
 #define LMB_ALLOC_ANYWHERE	0
 
@@ -546,3 +550,80 @@ int lmb_find(struct lmb_property *res)
 	}
 	return -1;
 }
+
+static int __init find_overlapped_early(u64 start, u64 end)
+{
+	int i;
+	struct lmb_property *r;
+
+	for (i = 0; i < lmb.reserved.cnt && lmb.reserved.region[i].size; i++) {
+		r = &lmb.reserved.region[i];
+		if (end > r->base && start < (r->base + r->size))
+			break;
+	}
+
+	return i;
+}
+
+/* Check for already reserved areas */
+static inline bool __init bad_addr(u64 *addrp, u64 size, u64 align)
+{
+	int i;
+	u64 addr = *addrp;
+	bool changed = false;
+	struct lmb_property *r;
+again:
+	i = find_overlapped_early(addr, addr + size);
+	r = &lmb.reserved.region[i];
+	if (i < lmb.reserved.cnt && r->size) {
+		*addrp = addr = round_up(r->base + r->size, align);
+		changed = true;
+		goto again;
+	}
+	return changed;
+}
+
+u64 __init __find_lmb_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
+				 u64 size, u64 align)
+{
+	u64 addr, last;
+
+	addr = round_up(ei_start, align);
+	if (addr < start)
+		addr = round_up(start, align);
+	if (addr >= ei_last)
+		goto out;
+	while (bad_addr(&addr, size, align) && addr+size <= ei_last)
+		;
+	last = addr + size;
+	if (last > ei_last)
+		goto out;
+	if (last > end)
+		goto out;
+
+	return addr;
+
+out:
+	return -1ULL;
+}
+
+/*
+ * Find a free area with specified alignment in a specific range.
+ */
+u64 __init find_lmb_area(u64 start, u64 end, u64 size, u64 align)
+{
+	int i;
+
+	for (i = 0; i < lmb.memory.cnt; i++) {
+		u64 ei_start = lmb.memory.region[i].base;
+		u64 ei_last = ei_start + lmb.memory.region[i].size;
+		u64 addr;
+
+		addr = __find_lmb_area(ei_start, ei_last, start, end,
+					 size, align);
+
+		if (addr != -1ULL)
+			return addr;
+	}
+	return -1ULL;
+}
-- 
1.6.4.2


* [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

They will check if the region array is big enough.

__check_and_double_region_array() will double the region array if its
spare slots are not enough.
find_lmb_area() is used to find a good position for the new region array.
The old array will be copied to the new one.

Arch code should provide get_max_mapped(), so the new array has an
accessible address.
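
An arch override of the __weak default might look like this (a
hypothetical x86 sketch, assuming max_pfn_mapped as the bookkeeping for
the highest directly mapped pfn):

#include <asm/page.h>

/* Return the highest address the kernel has mapped so far, so the
 * doubled region array lands somewhere we can actually access. */
u64 __init get_max_mapped(void)
{
	return (u64)max_pfn_mapped << PAGE_SHIFT;
}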

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    4 ++
 mm/lmb.c            |   89 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 05234bd..95ae3f4 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -83,9 +83,13 @@ lmb_end_pfn(struct lmb_region *type, unsigned long region_nr)
 	       lmb_size_pages(type, region_nr);
 }
 
+void reserve_lmb(u64 start, u64 end, char *name);
+void free_lmb(u64 start, u64 end);
+void add_lmb_memory(u64 start, u64 end);
 u64 __find_lmb_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
 			 u64 size, u64 align);
 u64 find_lmb_area(u64 start, u64 end, u64 size, u64 align);
+u64 get_max_mapped(void);
 
 #include <asm/lmb.h>
 
diff --git a/mm/lmb.c b/mm/lmb.c
index d5d5dc4..9798458 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
 	return -1;
 }
 
+u64 __weak __init get_max_mapped(void)
+{
+	u64 end = max_low_pfn;
+
+	end <<= PAGE_SHIFT;
+
+	return end;
+}
+
+static void __init __check_and_double_region_array(struct lmb_region *type,
+			 struct lmb_property *static_region,
+			 u64 ex_start, u64 ex_end)
+{
+	u64 start, end, size, mem;
+	struct lmb_property *new, *old;
+	unsigned long rgnsz = type->nr_regions;
+
+	/* Do we have enough slots left ? */
+	if ((rgnsz - type->cnt) > max_t(unsigned long, rgnsz/8, 2))
+		return;
+
+	old = type->region;
+	/* Double the array size */
+	size = sizeof(struct lmb_property) * rgnsz * 2;
+	if (old == static_region)
+		start = 0;
+	else
+		start = __pa(old) + sizeof(struct lmb_property) * rgnsz;
+	end = ex_start;
+	mem = -1ULL;
+	if (start + size < end)
+		mem = find_lmb_area(start, end, size, sizeof(struct lmb_property));
+	if (mem == -1ULL) {
+		start = ex_end;
+		end = get_max_mapped();
+		if (start + size < end)
+			mem = find_lmb_area(start, end, size, sizeof(struct lmb_property));
+	}
+	if (mem == -1ULL)
+		panic("can not find more space for lmb.reserved.region array");
+
+	new = __va(mem);
+	/* Copy old to new */
+	memcpy(&new[0], &old[0], sizeof(struct lmb_property) * rgnsz);
+	memset(&new[rgnsz], 0, sizeof(struct lmb_property) * rgnsz);
+
+	memset(&old[0], 0, sizeof(struct lmb_property) * rgnsz);
+	type->region = new;
+	type->nr_regions = rgnsz * 2;
+	printk(KERN_DEBUG "lmb.reserved.region array is doubled to %ld at [%llx - %llx]\n",
+		type->nr_regions, mem, mem + size - 1);
+
+	/* Reserve new array and free old one */
+	lmb_reserve(mem, sizeof(struct lmb_property) * rgnsz * 2);
+	if (old != static_region)
+		lmb_free(__pa(old), sizeof(struct lmb_property) * rgnsz);
+}
+
+void __init add_lmb_memory(u64 start, u64 end)
+{
+	__check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
+	lmb_add(start, end - start);
+}
+
+void __init reserve_lmb(u64 start, u64 end, char *name)
+{
+	if (start == end)
+		return;
+
+	if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
+		return;
+
+	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
+	lmb_reserve(start, end - start);
+}
+
+void __init free_lmb(u64 start, u64 end)
+{
+	if (start == end)
+		return;
+
+	if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
+		return;
+
+	/* keep punching hole, could run out of slots too */
+	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
+	lmb_free(start, end - start);
+}
+
 static int __init find_overlapped_early(u64 start, u64 end)
 {
 	int i;
-- 
1.6.4.2


* [PATCH 08/31] lmb: Add find_lmb_area_size()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

The same as find_lmb_area(), but the size of the free range found is
returned via *sizep.

It will be used to find free ranges for early_memtest and the memory
corruption check.
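
A consumer would walk all free ranges roughly like this (a usage sketch
modeled on an early_memtest style loop; example_scan_free_ranges is an
illustrative name):

#include <linux/lmb.h>
#include <linux/mm.h>

static void __init example_scan_free_ranges(void)
{
	u64 start = 0, size;

	for (;;) {
		u64 addr = find_lmb_area_size(start, &size, PAGE_SIZE);

		if (addr == -1ULL)
			break;			/* no free range left */

		/* ... memtest or corruption-check [addr, addr + size) ... */

		start = addr + size;		/* resume after this range */
	}
}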

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    1 +
 mm/lmb.c            |   80 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 95ae3f4..7301072 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -89,6 +89,7 @@ void add_lmb_memory(u64 start, u64 end);
 u64 __find_lmb_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
 			 u64 size, u64 align);
 u64 find_lmb_area(u64 start, u64 end, u64 size, u64 align);
+u64 find_lmb_area_size(u64 start, u64 *sizep, u64 align);
 u64 get_max_mapped(void);
 
 #include <asm/lmb.h>
diff --git a/mm/lmb.c b/mm/lmb.c
index 9798458..a91f48d 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -672,6 +672,40 @@ again:
 	return changed;
 }
 
+/* Check for already reserved areas */
+static inline bool __init bad_addr_size(u64 *addrp, u64 *sizep, u64 align)
+{
+	int i;
+	u64 addr = *addrp, last;
+	u64 size = *sizep;
+	bool changed = false;
+again:
+	last = addr + size;
+	for (i = 0; i < lmb.reserved.cnt && lmb.reserved.region[i].size; i++) {
+		struct lmb_property *r = &lmb.reserved.region[i];
+		if (last > r->base && addr < r->base) {
+			size = r->base - addr;
+			changed = true;
+			goto again;
+		}
+		if (last > (r->base + r->size) && addr < (r->base + r->size)) {
+			addr = round_up(r->base + r->size, align);
+			size = last - addr;
+			changed = true;
+			goto again;
+		}
+		if (last <= (r->base + r->size) && addr >= r->base) {
+			(*sizep)++;
+			return false;
+		}
+	}
+	if (changed) {
+		*addrp = addr;
+		*sizep = size;
+	}
+	return changed;
+}
+
 u64 __init __find_lmb_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
 				 u64 size, u64 align)
 {
@@ -696,6 +730,29 @@ out:
 	return -1ULL;
 }
 
+static u64 __init __find_lmb_area_size(u64 ei_start, u64 ei_last, u64 start,
+			 u64 *sizep, u64 align)
+{
+	u64 addr, last;
+
+	addr = round_up(ei_start, align);
+	if (addr < start)
+		addr = round_up(start, align);
+	if (addr >= ei_last)
+		goto out;
+	*sizep = ei_last - addr;
+	while (bad_addr_size(&addr, sizep, align) && addr + *sizep <= ei_last)
+		;
+	last = addr + *sizep;
+	if (last > ei_last)
+		goto out;
+
+	return addr;
+
+out:
+	return -1ULL;
+}
+
 /*
  * Find a free area with specified alignment in a specific range.
  */
@@ -716,3 +773,26 @@ u64 __init find_lmb_area(u64 start, u64 end, u64 size, u64 align)
 	}
 	return -1ULL;
 }
+
+/*
+ * Find next free range after *start
+ */
+u64 __init find_lmb_area_size(u64 start, u64 *sizep, u64 align)
+{
+	int i;
+
+	for (i = 0; i < lmb.memory.cnt; i++) {
+		u64 ei_start = lmb.memory.region[i].base;
+		u64 ei_last = ei_start + lmb.memory.region[i].size;
+		u64 addr;
+
+		addr = __find_lmb_area_size(ei_start, ei_last, start,
+					 sizep, align);
+
+		if (addr != -1ULL)
+			return addr;
+	}
+
+	return -1ULL;
+}
+
-- 
1.6.4.2


* [PATCH 09/31] bootmem, x86: Add weak version of reserve_bootmem_generic
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

It will be used by lmb_to_bootmem() when converting lmb reservations to
bootmem.

It is a wrapper for reserve_bootmem(), and x86 64-bit uses a special
version of it.

Also clean up that version for x86_64: we don't need to take care of the
NUMA path there, bootmem can handle it now.
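
For illustration, a minimal standalone sketch of the weak-symbol mechanism
this relies on (not part of this patch): the weak default loses to any
strong definition at link time.

	/* generic file: weak default */
	int __attribute__((weak)) foo(int x)
	{
		return x;		/* generic fallback */
	}

	/* arch file: strong definition, picked by the linker instead */
	int foo(int x)
	{
		return 2 * x;		/* arch-specific override */
	}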

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/mm/init_32.c |    6 ------
 arch/x86/mm/init_64.c |   20 ++------------------
 mm/bootmem.c          |    6 ++++++
 3 files changed, 8 insertions(+), 24 deletions(-)

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 5cb3f0f..804bbe9 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -1069,9 +1069,3 @@ void mark_rodata_ro(void)
 #endif
 }
 #endif
-
-int __init reserve_bootmem_generic(unsigned long phys, unsigned long len,
-				   int flags)
-{
-	return reserve_bootmem(phys, len, flags);
-}
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index e9b040e..5ba6b0e 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -798,13 +798,10 @@ void mark_rodata_ro(void)
 
 #endif
 
+#ifndef CONFIG_NO_BOOTMEM
 int __init reserve_bootmem_generic(unsigned long phys, unsigned long len,
 				   int flags)
 {
-#ifdef CONFIG_NUMA
-	int nid, next_nid;
-	int ret;
-#endif
 	unsigned long pfn = phys >> PAGE_SHIFT;
 
 	if (pfn >= max_pfn) {
@@ -820,21 +817,7 @@ int __init reserve_bootmem_generic(unsigned long phys, unsigned long len,
 		return -EFAULT;
 	}
 
-	/* Should check here against the e820 map to avoid double free */
-#ifdef CONFIG_NUMA
-	nid = phys_to_nid(phys);
-	next_nid = phys_to_nid(phys + len - 1);
-	if (nid == next_nid)
-		ret = reserve_bootmem_node(NODE_DATA(nid), phys, len, flags);
-	else
-		ret = reserve_bootmem(phys, len, flags);
-
-	if (ret != 0)
-		return ret;
-
-#else
 	reserve_bootmem(phys, len, flags);
-#endif
 
 	if (phys+len <= MAX_DMA_PFN*PAGE_SIZE) {
 		dma_reserve += len / PAGE_SIZE;
@@ -843,6 +826,7 @@ int __init reserve_bootmem_generic(unsigned long phys, unsigned long len,
 
 	return 0;
 }
+#endif
 
 int kern_addr_valid(unsigned long addr)
 {
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 9b13446..fadbc3b 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -512,6 +512,12 @@ int __init reserve_bootmem(unsigned long addr, unsigned long size,
 }
 
 #ifndef CONFIG_NO_BOOTMEM
+int __weak __init reserve_bootmem_generic(unsigned long phys, unsigned long len,
+				   int flags)
+{
+	return reserve_bootmem(phys, len, flags);
+}
+
 static unsigned long __init align_idx(struct bootmem_data *bdata,
 				      unsigned long idx, unsigned long step)
 {
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 10/31] lmb: Add lmb_to_bootmem()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

lmb_to_bootmem() will reserve the entries in lmb.reserved.region with bootmem.

We can use it with all arches that support lmb.
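
A rough sketch of the expected call site (hypothetical; the x86 conversion
comes later in the series):

	/* after bootmem is initialized, hand the lmb reservations
	 * below the bootmem limit over to bootmem */
	lmb_to_bootmem(0, max_low_pfn << PAGE_SHIFT);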

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    2 ++
 mm/lmb.c            |   32 ++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 7301072..f5071e1 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -92,6 +92,8 @@ u64 find_lmb_area(u64 start, u64 end, u64 size, u64 align);
 u64 find_lmb_area_size(u64 start, u64 *sizep, u64 align);
 u64 get_max_mapped(void);
 
+void lmb_to_bootmem(u64 start, u64 end);
+
 #include <asm/lmb.h>
 
 #endif /* __KERNEL__ */
diff --git a/mm/lmb.c b/mm/lmb.c
index a91f48d..71a45b4 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -640,6 +640,38 @@ void __init free_lmb(u64 start, u64 end)
 	lmb_free(start, end - start);
 }
 
+#ifndef CONFIG_NO_BOOTMEM
+void __init lmb_to_bootmem(u64 start, u64 end)
+{
+	int i, count;
+	u64 final_start, final_end;
+
+	/* Take out region array itself */
+	if (lmb.reserved.region != lmb_reserved_region)
+		lmb_free(__pa(lmb.reserved.region), sizeof(struct lmb_property) * lmb.reserved.nr_regions);
+
+	count  = lmb.reserved.cnt;
+	pr_info("(%d early reservations) ==> bootmem [%010llx - %010llx]\n", count, start, end);
+	for (i = 0; i < count; i++) {
+		struct lmb_property *r = &lmb.reserved.region[i];
+		pr_info("  #%d [%010llx - %010llx] ", i, r->base, r->base + r->size);
+		final_start = max(start, r->base);
+		final_end = min(end, r->base + r->size);
+		if (final_start >= final_end) {
+			pr_cont("\n");
+			continue;
+		}
+		pr_cont(" ==> [%010llx - %010llx]\n", final_start, final_end);
+		reserve_bootmem_generic(final_start, final_end - final_start, BOOTMEM_DEFAULT);
+	}
+	/* Clear them to avoid misusing ? */
+	memset(&lmb.reserved.region[0], 0, sizeof(struct lmb_property) * lmb.reserved.nr_regions);
+	lmb.reserved.region = NULL;
+	lmb.reserved.nr_regions = 0;
+	lmb.reserved.cnt = 0;
+}
+#endif
+
 static int __init find_overlapped_early(u64 start, u64 end)
 {
 	int i;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 11/31] lmb: Add get_free_all_memory_range()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu, Jan Beulich

get_free_all_memory_range() is for CONFIG_NO_BOOTMEM, and will be called by
free_all_memory_core_early().

It will take early_node_map[] (aka active ranges) and subtract lmb.reserved
from it to get all free ranges.

-v2: Update with Jan Beulich's patch "fix allocation done in get_free_all_memory_range()", that one is for early_res.
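
A rough sketch of the expected caller (hypothetical; the real user is
free_all_memory_core_early(), wired up later in the series):

	struct range *range;
	int i, nr_range;

	nr_range = get_free_all_memory_range(&range, nodeid);
	for (i = 0; i < nr_range; i++) {
		/* each entry is a free [start, end) pfn range that can
		 * go back to the page allocator; helper is hypothetical */
		free_pfn_range(range[i].start, range[i].end);
	}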

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jan Beulich <jbeulich@novell.com>
---
 include/linux/lmb.h |    2 +
 mm/lmb.c            |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 87 insertions(+), 1 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index f5071e1..9e2dcf5 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -93,6 +93,8 @@ u64 find_lmb_area_size(u64 start, u64 *sizep, u64 align);
 u64 get_max_mapped(void);
 
 void lmb_to_bootmem(u64 start, u64 end);
+struct range;
+int get_free_all_memory_range(struct range **rangep, int nodeid);
 
 #include <asm/lmb.h>
 
diff --git a/mm/lmb.c b/mm/lmb.c
index 71a45b4..abece72 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -640,7 +640,91 @@ void __init free_lmb(u64 start, u64 end)
 	lmb_free(start, end - start);
 }
 
-#ifndef CONFIG_NO_BOOTMEM
+static __init struct range *find_range_array(int count)
+{
+	u64 end, size, mem = -1ULL;
+	struct range *range;
+
+	size = sizeof(struct range) * count;
+	end = get_max_mapped();
+#ifdef MAX_DMA32_PFN
+	if (end > (MAX_DMA32_PFN << PAGE_SHIFT))
+		mem = find_lmb_area(MAX_DMA32_PFN << PAGE_SHIFT, end,
+					 size, sizeof(struct range));
+#endif
+	if (mem == -1ULL)
+		mem = find_lmb_area(0, end, size, sizeof(struct range));
+	if (mem == -1ULL)
+		panic("can not find more space for range free");
+
+	range = __va(mem);
+	memset(range, 0, size);
+
+	return range;
+}
+
+#ifdef CONFIG_NO_BOOTMEM
+static void __init subtract_lmb_reserved(struct range *range, int az)
+{
+	int i, count;
+	u64 final_start, final_end;
+
+	/* Take out region array itself at first*/
+	if (lmb.reserved.region != lmb_reserved_region)
+		lmb_free(__pa(lmb.reserved.region), sizeof(struct lmb_property) * lmb.reserved.nr_regions);
+
+	count  = lmb.reserved.cnt;
+
+	pr_info("Subtract (%d early reservations)\n", count);
+
+	for (i = 0; i < count; i++) {
+		struct lmb_property *r = &lmb.reserved.region[i];
+		pr_info("  #%d [%010llx - %010llx]\n", i, r->base, r->base + r->size);
+		final_start = PFN_DOWN(r->base);
+		final_end = PFN_UP(r->base + r->size);
+		if (final_start >= final_end)
+			continue;
+		subtract_range(range, az, final_start, final_end);
+	}
+	/* Put region array back ? */
+	if (lmb.reserved.region != lmb_reserved_region)
+		lmb_reserve(__pa(lmb.reserved.region), sizeof(struct lmb_property) * lmb.reserved.nr_regions);
+}
+
+int __init get_free_all_memory_range(struct range **rangep, int nodeid)
+{
+	int count;
+	struct range *range;
+	int nr_range;
+
+	count = lmb.reserved.cnt * 2;
+
+	range = find_range_array(count);
+	nr_range = 0;
+
+	/*
+	 * Use early_node_map[] and lmb.reserved.region to get range array
+	 * at first
+	 */
+	nr_range = add_from_early_node_map(range, count, nr_range, nodeid);
+#ifdef CONFIG_X86_32
+	subtract_range(range, count, max_low_pfn, -1ULL);
+#endif
+	subtract_lmb_reserved(range, count);
+	nr_range = clean_sort_range(range, count);
+
+	/* Need to clear it ? */
+	if (nodeid == MAX_NUMNODES) {
+		memset(&lmb.reserved.region[0], 0, sizeof(struct lmb_property) * lmb.reserved.nr_regions);
+		lmb.reserved.region = NULL;
+		lmb.reserved.nr_regions = 0;
+		lmb.reserved.cnt = 0;
+	}
+
+	*rangep = range;
+	return nr_range;
+}
+#else
 void __init lmb_to_bootmem(u64 start, u64 end)
 {
 	int i, count;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 12/31] lmb: Add lmb_register_active_regions() and lmb_hole_size()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

lmb_register_active_regions() will be used to fill early_node_map[];
the result is the intersection of lmb.memory.region and the NUMA data.

lmb_hole_size() will be used to find the hole size in lmb.memory.region
within the specified range.
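
A rough usage sketch (hypothetical NUMA setup code; nid, start_pfn and
last_pfn would come from the SRAT/NUMA layer):

	u64 hole;

	/* fill early_node_map[] with this node's lmb memory */
	lmb_register_active_regions(nid, start_pfn, last_pfn);

	/* bytes of the node's PFN span that are not backed by RAM */
	hole = lmb_hole_size((u64)start_pfn << PAGE_SHIFT,
			     (u64)last_pfn << PAGE_SHIFT);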

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    4 +++
 mm/lmb.c            |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 9e2dcf5..6568e9d 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -96,6 +96,10 @@ void lmb_to_bootmem(u64 start, u64 end);
 struct range;
 int get_free_all_memory_range(struct range **rangep, int nodeid);
 
+void lmb_register_active_regions(int nid, unsigned long start_pfn,
+					 unsigned long last_pfn);
+u64 lmb_hole_size(u64 start, u64 end);
+
 #include <asm/lmb.h>
 
 #endif /* __KERNEL__ */
diff --git a/mm/lmb.c b/mm/lmb.c
index abece72..7e3c2c1 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -912,3 +912,71 @@ u64 __init find_lmb_area_size(u64 start, u64 *sizep, u64 align)
 	return -1ULL;
 }
 
+/*
+ * Finds an active region in the address range from start_pfn to last_pfn and
+ * returns its range in ei_startpfn and ei_endpfn for the lmb entry.
+ */
+static int __init lmb_find_active_region(const struct lmb_property *ei,
+				  unsigned long start_pfn,
+				  unsigned long last_pfn,
+				  unsigned long *ei_startpfn,
+				  unsigned long *ei_endpfn)
+{
+	u64 align = PAGE_SIZE;
+
+	*ei_startpfn = round_up(ei->base, align) >> PAGE_SHIFT;
+	*ei_endpfn = round_down(ei->base + ei->size, align) >> PAGE_SHIFT;
+
+	/* Skip map entries smaller than a page */
+	if (*ei_startpfn >= *ei_endpfn)
+		return 0;
+
+	/* Skip if map is outside the node */
+	if (*ei_endpfn <= start_pfn || *ei_startpfn >= last_pfn)
+		return 0;
+
+	/* Check for overlaps */
+	if (*ei_startpfn < start_pfn)
+		*ei_startpfn = start_pfn;
+	if (*ei_endpfn > last_pfn)
+		*ei_endpfn = last_pfn;
+
+	return 1;
+}
+
+/* Walk the lmb.memory map and register active regions within a node */
+void __init lmb_register_active_regions(int nid, unsigned long start_pfn,
+					 unsigned long last_pfn)
+{
+	unsigned long ei_startpfn;
+	unsigned long ei_endpfn;
+	int i;
+
+	for (i = 0; i < lmb.memory.cnt; i++)
+		if (lmb_find_active_region(&lmb.memory.region[i],
+					    start_pfn, last_pfn,
+					    &ei_startpfn, &ei_endpfn))
+			add_active_range(nid, ei_startpfn, ei_endpfn);
+}
+
+/*
+ * Find the hole size (in bytes) in the memory range.
+ * @start: starting address of the memory range to scan
+ * @end: ending address of the memory range to scan
+ */
+u64 __init lmb_hole_size(u64 start, u64 end)
+{
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long last_pfn = end >> PAGE_SHIFT;
+	unsigned long ei_startpfn, ei_endpfn, ram = 0;
+	int i;
+
+	for (i = 0; i < lmb.memory.cnt; i++) {
+		if (lmb_find_active_region(&lmb.memory.region[i],
+					    start_pfn, last_pfn,
+					    &ei_startpfn, &ei_endpfn))
+			ram += ei_endpfn - ei_startpfn;
+	}
+	return end - start - ((u64)ram << PAGE_SHIFT);
+}
+
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 13/31] lmb: Prepare to include linux/lmb.h in core file
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Need to add a CONFIG_HAVE_LMB guard to linux/lmb.h, to prepare for including
it from mm/page_alloc.c, mm/bootmem.c, etc.
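
The consumer side can then include it unconditionally; a sketch of what the
core files do (matching the pattern used later in this series):

	#include <linux/lmb.h>	/* compiles away without CONFIG_HAVE_LMB */

	#ifdef CONFIG_HAVE_LMB
	/* lmb-only code, e.g. find_memory_core_early() */
	#endif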

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 6568e9d..143f97d 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -2,6 +2,7 @@
 #define _LINUX_LMB_H
 #ifdef __KERNEL__
 
+#ifdef CONFIG_HAVE_LMB
 /*
  * Logical memory blocks.
  *
@@ -102,6 +103,8 @@ u64 lmb_hole_size(u64 start, u64 end);
 
 #include <asm/lmb.h>
 
+#endif /* CONFIG_HAVE_LMB */
+
 #endif /* __KERNEL__ */
 
 #endif /* _LINUX_LMB_H */
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 14/31] lmb: Add find_memory_core_early()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Go over the node ranges in early_node_map[] and use __find_lmb_area()
to find a free range.

Will be used by find_lmb_area_node().

find_lmb_area_node() will be used to find the right buffer for NODE_DATA.
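
A rough sketch of the resulting lookup pattern (this is essentially what
find_lmb_area_node() in the next patch does):

	/* ask for size bytes for node nid, restricted to PFN ranges
	 * that early_node_map[] attributes to nid */
	addr = find_memory_core_early(nid, size, align, goal, limit);
	if (addr == -1ULL)
		/* fall back to a search over all of lmb.memory */
		addr = find_lmb_area(goal, limit, size, align);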

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/mm.h |    2 ++
 mm/page_alloc.c    |   29 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c8442b6..8070bd8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1159,6 +1159,8 @@ extern void free_bootmem_with_active_regions(int nid,
 						unsigned long max_low_pfn);
 int add_from_early_node_map(struct range *range, int az,
 				   int nr_range, int nid);
+u64 __init find_memory_core_early(int nid, u64 size, u64 align,
+					u64 goal, u64 limit);
 void *__alloc_memory_core_early(int nodeid, u64 size, u64 align,
 				 u64 goal, u64 limit);
 typedef int (*work_fn_t)(unsigned long, unsigned long, void *);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d03c946..faae23a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -21,6 +21,7 @@
 #include <linux/pagemap.h>
 #include <linux/jiffies.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/kmemcheck.h>
@@ -3393,6 +3394,34 @@ void __init free_bootmem_with_active_regions(int nid,
 	}
 }
 
+#ifdef CONFIG_HAVE_LMB
+u64 __init find_memory_core_early(int nid, u64 size, u64 align,
+					u64 goal, u64 limit)
+{
+	int i;
+
+	/* Need to go over early_node_map to find out good range for node */
+	for_each_active_range_index_in_nid(i, nid) {
+		u64 addr;
+		u64 ei_start, ei_last;
+
+		ei_last = early_node_map[i].end_pfn;
+		ei_last <<= PAGE_SHIFT;
+		ei_start = early_node_map[i].start_pfn;
+		ei_start <<= PAGE_SHIFT;
+		addr = __find_lmb_area(ei_start, ei_last,
+					 goal, limit, size, align);
+
+		if (addr == -1ULL)
+			continue;
+
+		return addr;
+	}
+
+	return -1ULL;
+}
+#endif
+
 int __init add_from_early_node_map(struct range *range, int az,
 				   int nr_range, int nid)
 {
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 15/31] lmb: Add find_lmb_area_node()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

It can be used to find the buffer for NODE_DATA on NUMA systems.

Need to make sure early_node_map[] is filled before it is called.
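
A rough sketch of allocating NODE_DATA with it (hypothetical caller; the
actual x86 NUMA conversion comes later in the series):

	u64 nd_pa;

	nd_pa = find_lmb_area_node(nid, start, end,
				   sizeof(pg_data_t), PAGE_SIZE);
	if (nd_pa == -1ULL)
		panic("cannot find memory for NODE_DATA %d\n", nid);
	reserve_lmb(nd_pa, nd_pa + sizeof(pg_data_t), "NODE DATA");
	node_data[nid] = __va(nd_pa);	/* x86_64-style NODE_DATA store */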

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    1 +
 mm/lmb.c            |   15 +++++++++++++++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 143f97d..019520a 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -91,6 +91,7 @@ u64 __find_lmb_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
 			 u64 size, u64 align);
 u64 find_lmb_area(u64 start, u64 end, u64 size, u64 align);
 u64 find_lmb_area_size(u64 start, u64 *sizep, u64 align);
+u64 find_lmb_area_node(int nid, u64 start, u64 end, u64 size, u64 align);
 u64 get_max_mapped(void);
 
 void lmb_to_bootmem(u64 start, u64 end);
diff --git a/mm/lmb.c b/mm/lmb.c
index 7e3c2c1..addfcb1 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -913,6 +913,21 @@ u64 __init find_lmb_area_size(u64 start, u64 *sizep, u64 align)
 }
 
 /*
+ * Need to call this function after lmb_register_active_regions,
+ * so early_node_map[] is filled already.
+ */
+u64 __init find_lmb_area_node(int nid, u64 start, u64 end, u64 size, u64 align)
+{
+	u64 addr;
+	addr = find_memory_core_early(nid, size, align, start, end);
+	if (addr != -1ULL)
+		return addr;
+
+	/* Fallback, should already have start end within node range */
+	return find_lmb_area(start, end, size, align);
+}
+
+/*
  * Finds an active region in the address range from start_pfn to last_pfn and
  * returns its range in ei_startpfn and ei_endpfn for the lmb entry.
  */
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 16/31] lmb: Add lmb_free_memory_size()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

It will return the free memory size in the specified range.

We can not simply use memory_size - reserved_size here, because some reserved
areas may not be within the scope of lmb.memory.region.

Subtract lmb.reserved.region from lmb.memory.region to get the free range
array, then sum the sizes of all free ranges.
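
An illustrative example: with lmb.memory holding [0x1000000, 0x8000000)
and lmb.reserved holding [0x2000000, 0x3000000), a call to
lmb_free_memory_size(0, 0x8000000) yields 0x7000000 - 0x1000000 =
0x6000000 bytes of free memory.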

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    1 +
 mm/lmb.c            |   51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 019520a..51a8653 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -101,6 +101,7 @@ int get_free_all_memory_range(struct range **rangep, int nodeid);
 void lmb_register_active_regions(int nid, unsigned long start_pfn,
 					 unsigned long last_pfn);
 u64 lmb_hole_size(u64 start, u64 end);
+u64 lmb_free_memory_size(u64 addr, u64 limit);
 
 #include <asm/lmb.h>
 
diff --git a/mm/lmb.c b/mm/lmb.c
index addfcb1..233c40d 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -756,6 +756,57 @@ void __init lmb_to_bootmem(u64 start, u64 end)
 }
 #endif
 
+u64 __init lmb_free_memory_size(u64 addr, u64 limit)
+{
+	int i, count;
+	struct range *range;
+	int nr_range;
+	u64 final_start, final_end;
+	u64 free_size;
+
+	count = lmb.reserved.cnt * 2;
+
+	range = find_range_array(count);
+	nr_range = 0;
+
+	addr = PFN_UP(addr);
+	limit = PFN_DOWN(limit);
+
+	for (i = 0; i < lmb.memory.cnt; i++) {
+		struct lmb_property *r = &lmb.memory.region[i];
+
+		final_start = PFN_UP(r->base);
+		final_end = PFN_DOWN(r->base + r->size);
+		if (final_start >= final_end)
+			continue;
+		if (final_start >= limit || final_end <= addr)
+			continue;
+
+		nr_range = add_range(range, count, nr_range, final_start, final_end);
+	}
+	subtract_range(range, count, 0, addr);
+	subtract_range(range, count, limit, -1ULL);
+	for (i = 0; i < lmb.reserved.cnt; i++) {
+		struct lmb_property *r = &lmb.reserved.region[i];
+
+		final_start = PFN_DOWN(r->base);
+		final_end = PFN_UP(r->base + r->size);
+		if (final_start >= final_end)
+			continue;
+		if (final_start >= limit || final_end <= addr)
+			continue;
+
+		subtract_range(range, count, final_start, final_end);
+	}
+	nr_range = clean_sort_range(range, count);
+
+	free_size = 0;
+	for (i = 0; i < nr_range; i++)
+		free_size += range[i].end - range[i].start;
+
+	return free_size << PAGE_SHIFT;
+}
+
 static int __init find_overlapped_early(u64 start, u64 end)
 {
 	int i;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 17/31] lmb: Add lmb_memory_size()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

It will return the memory size in the specified range according to
lmb.memory.region.

Share some code with lmb_free_memory_size() by passing a get_free flag to
the common __lmb_memory_size().
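
The resulting pair then behaves like this (sketch; the last line only equals
the reserved RAM if all reservations fall inside lmb.memory, which the
previous patch notes is not guaranteed):

	total = lmb_memory_size(addr, limit);	   /* RAM in [addr, limit) */
	free  = lmb_free_memory_size(addr, limit); /* RAM minus reserved */
	used  = total - free;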

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    1 +
 mm/lmb.c            |   18 +++++++++++++++++-
 2 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 51a8653..285f287 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -102,6 +102,7 @@ void lmb_register_active_regions(int nid, unsigned long start_pfn,
 					 unsigned long last_pfn);
 u64 lmb_hole_size(u64 start, u64 end);
 u64 lmb_free_memory_size(u64 addr, u64 limit);
+u64 lmb_memory_size(u64 addr, u64 limit);
 
 #include <asm/lmb.h>
 
diff --git a/mm/lmb.c b/mm/lmb.c
index 233c40d..f49d6c8 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -756,7 +756,7 @@ void __init lmb_to_bootmem(u64 start, u64 end)
 }
 #endif
 
-u64 __init lmb_free_memory_size(u64 addr, u64 limit)
+static u64 __init __lmb_memory_size(u64 addr, u64 limit, bool get_free)
 {
 	int i, count;
 	struct range *range;
@@ -786,6 +786,10 @@ u64 __init lmb_free_memory_size(u64 addr, u64 limit)
 	}
 	subtract_range(range, count, 0, addr);
 	subtract_range(range, count, limit, -1ULL);
+
+	/* Subtract lmb.reserved.region in range ? */
+	if (!get_free)
+		goto sort_and_count_them;
 	for (i = 0; i < lmb.reserved.cnt; i++) {
 		struct lmb_property *r = &lmb.reserved.region[i];
 
@@ -798,6 +802,8 @@ u64 __init lmb_free_memory_size(u64 addr, u64 limit)
 
 		subtract_range(range, count, final_start, final_end);
 	}
+
+sort_and_count_them:
 	nr_range = clean_sort_range(range, count);
 
 	free_size = 0;
@@ -807,6 +813,16 @@ u64 __init lmb_free_memory_size(u64 addr, u64 limit)
 	return free_size << PAGE_SHIFT;
 }
 
+u64 __init lmb_free_memory_size(u64 addr, u64 limit)
+{
+	return __lmb_memory_size(addr, limit, true);
+}
+
+u64 __init lmb_memory_size(u64 addr, u64 limit)
+{
+	return __lmb_memory_size(addr, limit, false);
+}
+
 static int __init find_overlapped_early(u64 start, u64 end)
 {
 	int i;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 18/31] lmb: Add reserve_lmb_overlap_ok()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Some areas from firmware could be reserved several times by different callers.

If these areas overlap, we may end up with overlapping entries in lmb.reserved.

So free the area first, before reserving it again.
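
A sketch of the firmware case this targets (hypothetical addresses and
names):

	/* two subsystems reserve overlapping fw-reported ranges */
	reserve_lmb_overlap_ok(0xa0000, 0xc0000, "FW TABLE A");
	reserve_lmb_overlap_ok(0xb0000, 0xd0000, "FW TABLE B");
	/* the second call frees [0xb0000, 0xd0000) first, so
	 * lmb.reserved ends up without overlapping entries */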

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    1 +
 mm/lmb.c            |   27 +++++++++++++++++++++++++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 285f287..1e11fe0 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -85,6 +85,7 @@ lmb_end_pfn(struct lmb_region *type, unsigned long region_nr)
 }
 
 void reserve_lmb(u64 start, u64 end, char *name);
+void reserve_lmb_overlap_ok(u64 start, u64 end, char *name);
 void free_lmb(u64 start, u64 end);
 void add_lmb_memory(u64 start, u64 end);
 u64 __find_lmb_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
diff --git a/mm/lmb.c b/mm/lmb.c
index f49d6c8..a709522 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -615,6 +615,12 @@ void __init add_lmb_memory(u64 start, u64 end)
 	lmb_add(start, end - start);
 }
 
+static void __init __reserve_lmb(u64 start, u64 end, char *name)
+{
+	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
+	lmb_reserve(start, end - start);
+}
+
 void __init reserve_lmb(u64 start, u64 end, char *name)
 {
 	if (start == end)
@@ -623,8 +629,25 @@ void __init reserve_lmb(u64 start, u64 end, char *name)
 	if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
 		return;
 
-	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
-	lmb_reserve(start, end - start);
+	__reserve_lmb(start, end, name);
+}
+
+/*
+ * Can be used to avoid overlapping entries in lmb.reserved.region.
+ *  No need to use it for areas obtained from find_lmb_area().
+ *  Only use it for areas hidden by firmware.
+ */
+void __init reserve_lmb_overlap_ok(u64 start, u64 end, char *name)
+{
+	if (start == end)
+		return;
+
+	if (WARN_ONCE(start > end, "reserve_lmb_overlap_ok: wrong range [%#llx, %#llx]\n", start, end))
+		return;
+
+	/* Free that region at first */
+	lmb_free(start, end - start);
+	__reserve_lmb(start, end, name);
 }
 
 void __init free_lmb(u64 start, u64 end)
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 19/31] lmb: Use lmb_debug to control debug message print out
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Also let reserve_lmb()/free_lmb() print out the name if lmb=debug is specified.
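
With lmb=debug on the kernel command line, boot output then looks roughly
like this (illustrative addresses and names, format per the pr_info strings
below):

	    reserve_lmb: [0000096000, 0000100000]    BIOS reserved
	       free_lmb: [0000100000, 0000200000]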

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 mm/lmb.c |   29 +++++++++++++++++++++--------
 1 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/mm/lmb.c b/mm/lmb.c
index a709522..91a2b33 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -600,8 +600,9 @@ static void __init __check_and_double_region_array(struct lmb_region *type,
 	memset(&old[0], 0, sizeof(struct lmb_property) * rgnsz);
 	type->region = new;
 	type->nr_regions = rgnsz * 2;
-	printk(KERN_DEBUG "lmb.reserved.region array is doubled to %ld at [%llx - %llx]\n",
-		type->nr_regions, mem, mem + size - 1);
+	if (lmb_debug)
+		pr_info("lmb.reserved.region array is doubled to %ld at [%010llx - %010llx]\n",
+				type->nr_regions, mem, mem + size - 1);
 
 	/* Reserve new array and free old one */
 	lmb_reserve(mem, sizeof(struct lmb_property) * rgnsz * 2);
@@ -629,6 +630,8 @@ void __init reserve_lmb(u64 start, u64 end, char *name)
 	if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
 		return;
 
+	if (lmb_debug)
+		pr_info("    reserve_lmb: [%010llx, %010llx] %16s\n", start, end, name);
 	__reserve_lmb(start, end, name);
 }
 
@@ -645,6 +648,8 @@ void __init reserve_lmb_overlap_ok(u64 start, u64 end, char *name)
 	if (WARN_ONCE(start > end, "reserve_lmb_overlap_ok: wrong range [%#llx, %#llx]\n", start, end))
 		return;
 
+	if (lmb_debug)
+		pr_info("    reserve_lmb_overlap_ok: [%010llx, %010llx] %16s\n", start, end, name);
 	/* Free that region first */
 	lmb_free(start, end - start);
 	__reserve_lmb(start, end, name);
@@ -658,6 +663,8 @@ void __init free_lmb(u64 start, u64 end)
 	if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
 		return;
 
+	if (lmb_debug)
+		pr_info("       free_lmb: [%010llx, %010llx]\n", start, end);
 	/* keep punching hole, could run out of slots too */
 	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
 	lmb_free(start, end - start);
@@ -698,11 +705,13 @@ static void __init subtract_lmb_reserved(struct range *range, int az)
 
 	count  = lmb.reserved.cnt;
 
-	pr_info("Subtract (%d early reservations)\n", count);
+	if (lmb_debug)
+		pr_info("Subtract (%d early reservations)\n", count);
 
 	for (i = 0; i < count; i++) {
 		struct lmb_property *r = &lmb.reserved.region[i];
-		pr_info("  #%d [%010llx - %010llx]\n", i, r->base, r->base + r->size);
+		if (lmb_debug)
+			pr_info("  #%03d [%010llx - %010llx]\n", i, r->base, r->base + r->size);
 		final_start = PFN_DOWN(r->base);
 		final_end = PFN_UP(r->base + r->size);
 		if (final_start >= final_end)
@@ -758,17 +767,21 @@ void __init lmb_to_bootmem(u64 start, u64 end)
 		lmb_free(__pa(lmb.reserved.region), sizeof(struct lmb_property) * lmb.reserved.nr_regions);
 
 	count  = lmb.reserved.cnt;
-	pr_info("(%d early reservations) ==> bootmem [%010llx - %010llx]\n", count, start, end);
+	if (lmb_debug)
+		pr_info("(%d early reservations) ==> bootmem [%010llx - %010llx]\n", count, start, end);
 	for (i = 0; i < count; i++) {
 		struct lmb_property *r = &lmb.reserved.region[i];
-		pr_info("  #%d [%010llx - %010llx] ", i, r->base, r->base + r->size);
+		if (lmb_debug)
+			pr_info("  #%03d [%010llx - %010llx] ", i, r->base, r->base + r->size);
 		final_start = max(start, r->base);
 		final_end = min(end, r->base + r->size);
 		if (final_start >= final_end) {
-			pr_cont("\n");
+			if (lmb_debug)
+				pr_cont("\n");
 			continue;
 		}
-		pr_cont(" ==> [%010llx - %010llx]\n", final_start, final_end);
+		if (lmb_debug)
+			pr_cont(" ==> [%010llx - %010llx]\n", final_start, final_end);
 		reserve_bootmem_generic(final_start, final_end - final_start, BOOTMEM_DEFAULT);
 	}
 	/* Clear them to avoid misusing ? */
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread
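
The lmb=debug switch rides on the standard early_param() hook. The body
of early_lmb() falls outside the hunks above, so the parsing below is a
sketch of the assumed pattern, not the exact code:

#include <linux/init.h>
#include <linux/string.h>

static int lmb_debug;	/* plain int as of this patch */

/* sketch: set lmb_debug when "lmb=debug" is on the kernel command line */
static int __init early_lmb(char *p)
{
	if (p && strstr(p, "debug"))
		lmb_debug = 1;
	return 0;
}
early_param("lmb", early_lmb);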

* [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

So those lmb bits can be released after the kernel is booted up.

Arch code can define __NOT_KEEP_LMB in asm/lmb.h; __init_lmb will then become __init

x86 code will use that.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 include/linux/lmb.h |    8 +++++++
 mm/lmb.c            |   54 ++++++++++++++++++++++++++++++--------------------
 2 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/include/linux/lmb.h b/include/linux/lmb.h
index 1e11fe0..7ae52c8 100644
--- a/include/linux/lmb.h
+++ b/include/linux/lmb.h
@@ -107,6 +107,14 @@ u64 lmb_memory_size(u64 addr, u64 limit);
 
 #include <asm/lmb.h>
 
+#ifdef __NOT_KEEP_LMB
+#define __init_lmb __init
+#define __initdata_lmb __initdata
+#else
+#define __init_lmb
+#define __initdata_lmb
+#endif
+
 #endif /* CONFIG_HAVE_LMB */
 
 #endif /* __KERNEL__ */
diff --git a/mm/lmb.c b/mm/lmb.c
index 91a2b33..c14deb9 100644
--- a/mm/lmb.c
+++ b/mm/lmb.c
@@ -21,11 +21,11 @@
 
 #define LMB_ALLOC_ANYWHERE	0
 
-struct lmb lmb;
-static struct lmb_property lmb_memory_region[MAX_LMB_REGIONS + 1];
-static struct lmb_property lmb_reserved_region[MAX_LMB_REGIONS + 1];
+struct lmb lmb __initdata_lmb;
+static struct lmb_property lmb_memory_region[MAX_LMB_REGIONS + 1] __initdata_lmb;
+static struct lmb_property lmb_reserved_region[MAX_LMB_REGIONS + 1] __initdata_lmb;
 
-static int lmb_debug;
+static int lmb_debug __initdata_lmb;
 
 static int __init early_lmb(char *p)
 {
@@ -35,7 +35,7 @@ static int __init early_lmb(char *p)
 }
 early_param("lmb", early_lmb);
 
-static void lmb_dump(struct lmb_region *region, char *name)
+static void __init_lmb lmb_dump(struct lmb_region *region, char *name)
 {
 	unsigned long long base, size;
 	int i;
@@ -51,7 +51,7 @@ static void lmb_dump(struct lmb_region *region, char *name)
 	}
 }
 
-void lmb_dump_all(void)
+void __init_lmb lmb_dump_all(void)
 {
 	if (!lmb_debug)
 		return;
@@ -64,13 +64,13 @@ void lmb_dump_all(void)
 	lmb_dump(&lmb.reserved, "reserved");
 }
 
-static unsigned long lmb_addrs_overlap(u64 base1, u64 size1, u64 base2,
+static unsigned long __init_lmb lmb_addrs_overlap(u64 base1, u64 size1, u64 base2,
 					u64 size2)
 {
 	return ((base1 < (base2 + size2)) && (base2 < (base1 + size1)));
 }
 
-static long lmb_addrs_adjacent(u64 base1, u64 size1, u64 base2, u64 size2)
+static long __init_lmb lmb_addrs_adjacent(u64 base1, u64 size1, u64 base2, u64 size2)
 {
 	if (base2 == base1 + size1)
 		return 1;
@@ -80,7 +80,7 @@ static long lmb_addrs_adjacent(u64 base1, u64 size1, u64 base2, u64 size2)
 	return 0;
 }
 
-static long lmb_regions_adjacent(struct lmb_region *rgn,
+static long __init_lmb lmb_regions_adjacent(struct lmb_region *rgn,
 		unsigned long r1, unsigned long r2)
 {
 	u64 base1 = rgn->region[r1].base;
@@ -91,7 +91,7 @@ static long lmb_regions_adjacent(struct lmb_region *rgn,
 	return lmb_addrs_adjacent(base1, size1, base2, size2);
 }
 
-static void lmb_remove_region(struct lmb_region *rgn, unsigned long r)
+static void __init_lmb lmb_remove_region(struct lmb_region *rgn, unsigned long r)
 {
 	unsigned long i;
 
@@ -103,7 +103,7 @@ static void lmb_remove_region(struct lmb_region *rgn, unsigned long r)
 }
 
 /* Assumption: base addr of region 1 < base addr of region 2 */
-static void lmb_coalesce_regions(struct lmb_region *rgn,
+static void __init_lmb lmb_coalesce_regions(struct lmb_region *rgn,
 		unsigned long r1, unsigned long r2)
 {
 	rgn->region[r1].size += rgn->region[r2].size;
@@ -140,7 +140,7 @@ void __init lmb_analyze(void)
 		lmb.memory.size += lmb.memory.region[i].size;
 }
 
-static long lmb_add_region(struct lmb_region *rgn, u64 base, u64 size)
+static long __init_lmb lmb_add_region(struct lmb_region *rgn, u64 base, u64 size)
 {
 	unsigned long coalesced = 0;
 	long adjacent, i;
@@ -204,7 +204,7 @@ static long lmb_add_region(struct lmb_region *rgn, u64 base, u64 size)
 	return 0;
 }
 
-long lmb_add(u64 base, u64 size)
+long __init_lmb lmb_add(u64 base, u64 size)
 {
 	struct lmb_region *_rgn = &lmb.memory;
 
@@ -216,7 +216,7 @@ long lmb_add(u64 base, u64 size)
 
 }
 
-static long __lmb_remove(struct lmb_region *rgn, u64 base, u64 size)
+static long __init_lmb __lmb_remove(struct lmb_region *rgn, u64 base, u64 size)
 {
 	u64 rgnbegin, rgnend;
 	u64 end = base + size;
@@ -264,7 +264,7 @@ static long __lmb_remove(struct lmb_region *rgn, u64 base, u64 size)
 	return lmb_add_region(rgn, end, rgnend - end);
 }
 
-long lmb_remove(u64 base, u64 size)
+long __init_lmb lmb_remove(u64 base, u64 size)
 {
 	return __lmb_remove(&lmb.memory, base, size);
 }
@@ -283,7 +283,7 @@ long __init lmb_reserve(u64 base, u64 size)
 	return lmb_add_region(_rgn, base, size);
 }
 
-long lmb_overlaps_region(struct lmb_region *rgn, u64 base, u64 size)
+long __init_lmb lmb_overlaps_region(struct lmb_region *rgn, u64 base, u64 size)
 {
 	unsigned long i;
 
@@ -297,12 +297,12 @@ long lmb_overlaps_region(struct lmb_region *rgn, u64 base, u64 size)
 	return (i < rgn->cnt) ? i : -1;
 }
 
-static u64 lmb_align_down(u64 addr, u64 size)
+static u64 __init_lmb lmb_align_down(u64 addr, u64 size)
 {
 	return addr & ~(size - 1);
 }
 
-static u64 lmb_align_up(u64 addr, u64 size)
+static u64 __init_lmb lmb_align_up(u64 addr, u64 size)
 {
 	return (addr + (size - 1)) & ~(size - 1);
 }
@@ -449,7 +449,7 @@ u64 __init lmb_phys_mem_size(void)
 	return lmb.memory.size;
 }
 
-u64 lmb_end_of_DRAM(void)
+u64 __init_lmb lmb_end_of_DRAM(void)
 {
 	int idx = lmb.memory.cnt - 1;
 
@@ -513,7 +513,7 @@ int __init lmb_is_reserved(u64 addr)
 	return 0;
 }
 
-int lmb_is_region_reserved(u64 base, u64 size)
+int __init_lmb lmb_is_region_reserved(u64 base, u64 size)
 {
 	return lmb_overlaps_region(&lmb.reserved, base, size);
 }
@@ -522,7 +522,7 @@ int lmb_is_region_reserved(u64 base, u64 size)
  * Given a <base, len>, find which memory regions belong to this range.
  * Adjust the request and return a contiguous chunk.
  */
-int lmb_find(struct lmb_property *res)
+int __init_lmb lmb_find(struct lmb_property *res)
 {
 	int i;
 	u64 rstart, rend;
@@ -699,10 +699,11 @@ static void __init subtract_lmb_reserved(struct range *range, int az)
 	int i, count;
 	u64 final_start, final_end;
 
+#ifdef __NOT_KEEP_LMB
 	/* Take out region array itself first */
 	if (lmb.reserved.region != lmb_reserved_region)
 		lmb_free(__pa(lmb.reserved.region), sizeof(struct lmb_property) * lmb.reserved.nr_regions);
-
+#endif
 	count  = lmb.reserved.cnt;
 
 	if (lmb_debug)
@@ -718,9 +719,11 @@ static void __init subtract_lmb_reserved(struct range *range, int az)
 			continue;
 		subtract_range(range, az, final_start, final_end);
 	}
+#ifdef __NOT_KEEP_LMB
 	/* Put region array back ? */
 	if (lmb.reserved.region != lmb_reserved_region)
 		lmb_reserve(__pa(lmb.reserved.region), sizeof(struct lmb_property) * lmb.reserved.nr_regions);
+#endif
 }
 
 int __init get_free_all_memory_range(struct range **rangep, int nodeid)
@@ -745,6 +748,7 @@ int __init get_free_all_memory_range(struct range **rangep, int nodeid)
 	subtract_lmb_reserved(range, count);
 	nr_range = clean_sort_range(range, count);
 
+#ifdef __NOT_KEEP_LMB
 	/* Need to clear it ? */
 	if (nodeid == MAX_NUMNODES) {
 		memset(&lmb.reserved.region[0], 0, sizeof(struct lmb_property) * lmb.reserved.nr_regions);
@@ -752,6 +756,7 @@ int __init get_free_all_memory_range(struct range **rangep, int nodeid)
 		lmb.reserved.nr_regions = 0;
 		lmb.reserved.cnt = 0;
 	}
+#endif
 
 	*rangep = range;
 	return nr_range;
@@ -762,9 +767,11 @@ void __init lmb_to_bootmem(u64 start, u64 end)
 	int i, count;
 	u64 final_start, final_end;
 
+#ifdef __NOT_KEEP_LMB
 	/* Take out region array itself */
 	if (lmb.reserved.region != lmb_reserved_region)
 		lmb_free(__pa(lmb.reserved.region), sizeof(struct lmb_property) * lmb.reserved.nr_regions);
+#endif
 
 	count  = lmb.reserved.cnt;
 	if (lmb_debug)
@@ -784,11 +791,14 @@ void __init lmb_to_bootmem(u64 start, u64 end)
 			pr_cont(" ==> [%010llx - %010llx]\n", final_start, final_end);
 		reserve_bootmem_generic(final_start, final_end - final_start, BOOTMEM_DEFAULT);
 	}
+
+#ifdef __NOT_KEEP_LMB
 	/* Clear them to avoid misusing ? */
 	memset(&lmb.reserved.region[0], 0, sizeof(struct lmb_property) * lmb.reserved.nr_regions);
 	lmb.reserved.region = NULL;
 	lmb.reserved.nr_regions = 0;
 	lmb.reserved.cnt = 0;
+#endif
 }
 #endif
 
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread
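
The mechanics are plain preprocessor indirection: an architecture opts
in from its <asm/lmb.h>, and every __init_lmb/__initdata_lmb annotation
collapses to __init/__initdata there while remaining a no-op on other
architectures. Condensed from the hunks above (x86's asm/lmb.h itself
is added in patch 22):

/* arch/<arch>/include/asm/lmb.h -- opt in */
#define __NOT_KEEP_LMB

/* include/linux/lmb.h, after #include <asm/lmb.h> */
#ifdef __NOT_KEEP_LMB
#define __init_lmb	__init
#define __initdata_lmb	__initdata
#else
#define __init_lmb
#define __initdata_lmb
#endif

/* so on such an arch this lands in .init.text and is freed after boot */
static void __init_lmb lmb_remove_region(struct lmb_region *rgn, unsigned long r);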

* [PATCH 21/31] x86: Add sanitize_e820_map()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

So we don't need to pass e820.map, its size, and &e820.nr_map at every call site.

Also change e820_saved to __initdata to get some bytes of memory back.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/e820.h |    5 ++---
 arch/x86/kernel/e820.c      |   26 ++++++++++++++++++--------
 arch/x86/kernel/efi.c       |    2 +-
 arch/x86/kernel/setup.c     |   10 +++++-----
 arch/x86/xen/setup.c        |    4 +---
 5 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index ec8a52d..0457c49 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -75,15 +75,14 @@ struct e820map {
 #ifdef __KERNEL__
 /* see comment in arch/x86/kernel/e820.c */
 extern struct e820map e820;
-extern struct e820map e820_saved;
 
 extern unsigned long pci_mem_start;
 extern int e820_any_mapped(u64 start, u64 end, unsigned type);
 extern int e820_all_mapped(u64 start, u64 end, unsigned type);
 extern void e820_add_region(u64 start, u64 size, int type);
 extern void e820_print_map(char *who);
-extern int
-sanitize_e820_map(struct e820entry *biosmap, int max_nr_map, u32 *pnr_map);
+int sanitize_e820_map(void);
+void save_e820_map(void);
 extern u64 e820_update_range(u64 start, u64 size, unsigned old_type,
 			       unsigned new_type);
 extern u64 e820_remove_range(u64 start, u64 size, unsigned old_type,
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 740b440..0eb9830 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -35,7 +35,7 @@
  * next kernel with full memory.
  */
 struct e820map e820;
-struct e820map e820_saved;
+static struct e820map __initdata e820_saved;
 
 /* For PCI or other memory-mapped resources */
 unsigned long pci_mem_start = 0xaeedbabe;
@@ -224,7 +224,7 @@ void __init e820_print_map(char *who)
  *	   ______________________4_
  */
 
-int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map,
+static int __init __sanitize_e820_map(struct e820entry *biosmap, int max_nr_map,
 			     u32 *pnr_map)
 {
 	struct change_member {
@@ -383,6 +383,11 @@ int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map,
 	return 0;
 }
 
+int __init sanitize_e820_map(void)
+{
+	return __sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+}
+
 static int __init __append_e820_map(struct e820entry *biosmap, int nr_map)
 {
 	while (nr_map) {
@@ -555,7 +560,7 @@ void __init update_e820(void)
 	u32 nr_map;
 
 	nr_map = e820.nr_map;
-	if (sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &nr_map))
+	if (__sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &nr_map))
 		return;
 	e820.nr_map = nr_map;
 	printk(KERN_INFO "modified physical RAM map:\n");
@@ -566,7 +571,7 @@ static void __init update_e820_saved(void)
 	u32 nr_map;
 
 	nr_map = e820_saved.nr_map;
-	if (sanitize_e820_map(e820_saved.map, ARRAY_SIZE(e820_saved.map), &nr_map))
+	if (__sanitize_e820_map(e820_saved.map, ARRAY_SIZE(e820_saved.map), &nr_map))
 		return;
 	e820_saved.nr_map = nr_map;
 }
@@ -661,7 +666,7 @@ void __init parse_e820_ext(struct setup_data *sdata, unsigned long pa_data)
 		sdata = early_ioremap(pa_data, map_len);
 	extmap = (struct e820entry *)(sdata->data);
 	__append_e820_map(extmap, entries);
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map();
 	if (map_len > PAGE_SIZE)
 		early_iounmap(sdata, map_len);
 	printk(KERN_INFO "extended physical RAM map:\n");
@@ -1028,7 +1033,7 @@ void __init finish_e820_parsing(void)
 	if (userdef) {
 		u32 nr = e820.nr_map;
 
-		if (sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &nr) < 0)
+		if (__sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &nr) < 0)
 			early_panic("Invalid user supplied memory map");
 		e820.nr_map = nr;
 
@@ -1158,7 +1163,7 @@ char *__init default_machine_specific_memory_setup(void)
 	 * the next section from 1mb->appropriate_mem_k
 	 */
 	new_nr = boot_params.e820_entries;
-	sanitize_e820_map(boot_params.e820_map,
+	__sanitize_e820_map(boot_params.e820_map,
 			ARRAY_SIZE(boot_params.e820_map),
 			&new_nr);
 	boot_params.e820_entries = new_nr;
@@ -1185,12 +1190,17 @@ char *__init default_machine_specific_memory_setup(void)
 	return who;
 }
 
+void __init save_e820_map(void)
+{
+	memcpy(&e820_saved, &e820, sizeof(struct e820map));
+}
+
 void __init setup_memory_map(void)
 {
 	char *who;
 
 	who = x86_init.resources.memory_setup();
-	memcpy(&e820_saved, &e820, sizeof(struct e820map));
+	save_e820_map();
 	printk(KERN_INFO "BIOS-provided physical RAM map:\n");
 	e820_print_map(who);
 }
diff --git a/arch/x86/kernel/efi.c b/arch/x86/kernel/efi.c
index c2fa9b8..299f03f 100644
--- a/arch/x86/kernel/efi.c
+++ b/arch/x86/kernel/efi.c
@@ -272,7 +272,7 @@ static void __init do_add_efi_memmap(void)
 		}
 		e820_add_region(start, size, e820_type);
 	}
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map();
 }
 
 void __init efi_reserve_early(void)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index d76e185..7ca6878 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -463,8 +463,8 @@ static void __init e820_reserve_setup_data(void)
 	if (!found)
 		return;
 
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
-	memcpy(&e820_saved, &e820, sizeof(struct e820map));
+	sanitize_e820_map();
+	save_e820_map();
 	printk(KERN_INFO "extended physical RAM map:\n");
 	e820_print_map("reserve setup_data");
 }
@@ -616,7 +616,7 @@ static int __init dmi_low_memory_corruption(const struct dmi_system_id *d)
 		d->ident);
 
 	e820_update_range(0, 0x10000, E820_RAM, E820_RESERVED);
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map();
 
 	return 0;
 }
@@ -685,7 +685,7 @@ static void __init trim_bios_range(void)
 	 * take them out.
 	 */
 	e820_remove_range(BIOS_BEGIN, BIOS_END - BIOS_BEGIN, E820_RAM, 1);
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map();
 }
 
 /*
@@ -856,7 +856,7 @@ void __init setup_arch(char **cmdline_p)
 	if (ppro_with_ram_bug()) {
 		e820_update_range(0x70000000ULL, 0x40000ULL, E820_RAM,
 				  E820_RESERVED);
-		sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+		sanitize_e820_map();
 		printk(KERN_INFO "fixed physical RAM map:\n");
 		e820_print_map("bad_ppro");
 	}
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index ad0047f..3f2c411 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -43,8 +43,6 @@ char * __init xen_memory_setup(void)
 
 	max_pfn = min(MAX_DOMAIN_PAGES, max_pfn);
 
-	e820.nr_map = 0;
-
 	e820_add_region(0, PFN_PHYS((u64)max_pfn), E820_RAM);
 
 	/*
@@ -65,7 +63,7 @@ char * __init xen_memory_setup(void)
 		      __pa(xen_start_info->pt_base),
 			"XEN START INFO");
 
-	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+	sanitize_e820_map();
 
 	return "Xen";
 }
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread
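
The conversion is mechanical: the old three-argument form survives as
the static __sanitize_e820_map() for callers that operate on other maps
(e820_saved, boot_params.e820_map), while everything that sanitized
e820 itself switches to the argument-free wrapper. Before and after, as
a sketch of a typical call site:

/* before */
sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);

/* after */
sanitize_e820_map();

/* snapshotting also gains a helper, since e820_saved is now static */
save_e820_map();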

* [PATCH 22/31] x86: Use lmb to replace early_res
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

1. replace find_e820_area with find_lmb_area
2. replace reserve_early with reserve_lmb
3. replace free_early with free_lmb
4. NO_BOOTMEM will switch to using lmb too
5. keep the _e820/_early wrappers in this patch; a following patch
   will replace them all
6. because free_lmb supports partial free, we can remove some
   special-case handling
7. make sure that find_lmb_area() is called after fill_lmb_memory(),
   so move some calls later in setup.c::setup_arch()
   -- corruption_check and mptable_update

-v2: Move reserve_brk() early, before fill_lmb_memory(), to avoid overlap
    between brk and find_lmb_area(): when there are more than 128 RAM
    entries in the E820 table, fill_lmb_memory() could use find_lmb_area()
    to find a new place for the lmb.memory.region array.
    We then also don't need to use extend_brk() after fill_lmb_memory().
-v3: Move find_smp_config early
    to make sure find_lmb_area() does not pick the wrong place when the
    BIOS doesn't put the mptable in the right place.
-v4: Treat RESERVED_KERN as RAM in lmb.memory; those ranges are already
    in lmb.reserved.
    Use __NOT_KEEP_LMB to make sure lmb-related code can be freed later.

So move reserve_brk() early, before fill_lmb_memory().
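
A simplified sketch of the resulting ordering in setup_arch(), reduced
to the calls this patch moves (see the setup.c hunk below for the full
context):

	max_pfn = e820_end_of_ram_pfn();
	...
	find_smp_config();		/* moved early: pin down the mptable */
	reserve_brk();			/* conclude brk before lmb can allocate */
	fill_lmb_memory();		/* after this, find_lmb_area() is usable */
	early_reserve_e820_mpc_new();	/* preallocate 4k for mptable mpc */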

Suggested-by: David S. Miller <davem@davemloft.net>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/Kconfig               |    9 +--
 arch/x86/include/asm/e820.h    |   15 +++--
 arch/x86/include/asm/lmb.h     |   10 +++
 arch/x86/kernel/check.c        |   14 ++--
 arch/x86/kernel/e820.c         |  139 +++++++++++-----------------------------
 arch/x86/kernel/head.c         |    3 +-
 arch/x86/kernel/head32.c       |    6 +-
 arch/x86/kernel/head64.c       |    3 +
 arch/x86/kernel/mpparse.c      |    5 +-
 arch/x86/kernel/setup.c        |   29 ++++++---
 arch/x86/kernel/setup_percpu.c |    6 --
 arch/x86/mm/numa_64.c          |    5 +-
 kernel/Makefile                |    1 -
 mm/bootmem.c                   |    1 +
 mm/page_alloc.c                |   35 ++--------
 mm/sparse-vmemmap.c            |   11 ---
 16 files changed, 111 insertions(+), 181 deletions(-)
 create mode 100644 arch/x86/include/asm/lmb.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6a80bce..3117de5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -27,6 +27,7 @@ config X86
 	select HAVE_PERF_EVENTS if (!M386 && !M486)
 	select HAVE_IOREMAP_PROT
 	select HAVE_KPROBES
+	select HAVE_LMB
 	select ARCH_WANT_OPTIONAL_GPIOLIB
 	select ARCH_WANT_FRAME_POINTERS
 	select HAVE_DMA_ATTRS
@@ -192,9 +193,6 @@ config ARCH_SUPPORTS_OPTIMIZED_INLINING
 config ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	def_bool y
 
-config HAVE_EARLY_RES
-	def_bool y
-
 config HAVE_INTEL_TXT
 	def_bool y
 	depends on EXPERIMENTAL && DMAR && ACPI
@@ -589,14 +587,13 @@ config NO_BOOTMEM
 	default y
 	bool "Disable Bootmem code"
 	---help---
-	  Use early_res directly instead of bootmem before slab is ready.
+	  Use lmb directly instead of bootmem before slab is ready.
 		- allocator (buddy) [generic]
 		- early allocator (bootmem) [generic]
-		- very early allocator (reserve_early*()) [x86]
+		- very early allocator (lmb) [some generic]
 		- very very early allocator (early brk model) [x86]
 	  So reduce one layer between early allocator to final allocator
 
-
 config MEMTEST
 	bool "Memtest"
 	---help---
diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 0457c49..396c849 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -116,24 +116,27 @@ extern unsigned long end_user_pfn;
 extern u64 find_e820_area(u64 start, u64 end, u64 size, u64 align);
 extern u64 find_e820_area_size(u64 start, u64 *sizep, u64 align);
 extern u64 early_reserve_e820(u64 startt, u64 sizet, u64 align);
-#include <linux/early_res.h>
 
 extern unsigned long e820_end_of_ram_pfn(void);
 extern unsigned long e820_end_of_low_ram_pfn(void);
-extern int e820_find_active_region(const struct e820entry *ei,
-				  unsigned long start_pfn,
-				  unsigned long last_pfn,
-				  unsigned long *ei_startpfn,
-				  unsigned long *ei_endpfn);
 extern void e820_register_active_regions(int nid, unsigned long start_pfn,
 					 unsigned long end_pfn);
 extern u64 e820_hole_size(u64 start, u64 end);
+
+extern u64 early_reserve_e820(u64 startt, u64 sizet, u64 align);
+
+void init_lmb_memory(void);
+void fill_lmb_memory(void);
+
 extern void finish_e820_parsing(void);
 extern void e820_reserve_resources(void);
 extern void e820_reserve_resources_late(void);
 extern void setup_memory_map(void);
 extern char *default_machine_specific_memory_setup(void);
 
+void reserve_early(u64 start, u64 end, char *name);
+void free_early(u64 start, u64 end);
+
 /*
  * Returns true iff the specified range [s,e) is completely contained inside
  * the ISA region.
diff --git a/arch/x86/include/asm/lmb.h b/arch/x86/include/asm/lmb.h
new file mode 100644
index 0000000..2fd1db4
--- /dev/null
+++ b/arch/x86/include/asm/lmb.h
@@ -0,0 +1,10 @@
+#ifndef _X86_LMB_H
+#define _X86_LMB_H
+
+#define LMB_DBG(fmt...) printk(fmt)
+
+#define LMB_REAL_LIMIT 0
+
+#define __NOT_KEEP_LMB
+
+#endif
diff --git a/arch/x86/kernel/check.c b/arch/x86/kernel/check.c
index fc999e6..90236af 100644
--- a/arch/x86/kernel/check.c
+++ b/arch/x86/kernel/check.c
@@ -2,7 +2,8 @@
 #include <linux/sched.h>
 #include <linux/kthread.h>
 #include <linux/workqueue.h>
-#include <asm/e820.h>
+#include <linux/lmb.h>
+
 #include <asm/proto.h>
 
 /*
@@ -18,10 +19,12 @@ static int __read_mostly memory_corruption_check = -1;
 static unsigned __read_mostly corruption_check_size = 64*1024;
 static unsigned __read_mostly corruption_check_period = 60; /* seconds */
 
-static struct e820entry scan_areas[MAX_SCAN_AREAS];
+static struct scan_area {
+	u64 addr;
+	u64 size;
+} scan_areas[MAX_SCAN_AREAS];
 static int num_scan_areas;
 
-
 static __init int set_corruption_check(char *arg)
 {
 	char *end;
@@ -81,7 +84,7 @@ void __init setup_bios_corruption_check(void)
 
 	while (addr < corruption_check_size && num_scan_areas < MAX_SCAN_AREAS) {
 		u64 size;
-		addr = find_e820_area_size(addr, &size, PAGE_SIZE);
+		addr = find_lmb_area_size(addr, &size, PAGE_SIZE);
 
 		if (!(addr + 1))
 			break;
@@ -92,7 +95,7 @@ void __init setup_bios_corruption_check(void)
 		if ((addr + size) > corruption_check_size)
 			size = corruption_check_size - addr;
 
-		e820_update_range(addr, size, E820_RAM, E820_RESERVED);
+		reserve_lmb(addr, addr + size, "SCAN RAM");
 		scan_areas[num_scan_areas].addr = addr;
 		scan_areas[num_scan_areas].size = size;
 		num_scan_areas++;
@@ -105,7 +108,6 @@ void __init setup_bios_corruption_check(void)
 
 	printk(KERN_INFO "Scanning %d areas for low memory corruption\n",
 	       num_scan_areas);
-	update_e820();
 }
 
 
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 0eb9830..c2a9ce4 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -15,6 +15,7 @@
 #include <linux/pfn.h>
 #include <linux/suspend.h>
 #include <linux/firmware-map.h>
+#include <linux/lmb.h>
 
 #include <asm/e820.h>
 #include <asm/proto.h>
@@ -731,30 +732,7 @@ core_initcall(e820_mark_nvs_memory);
  */
 u64 __init find_e820_area(u64 start, u64 end, u64 size, u64 align)
 {
-	int i;
-
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
-		u64 addr;
-		u64 ei_start, ei_last;
-
-		if (ei->type != E820_RAM)
-			continue;
-
-		ei_last = ei->addr + ei->size;
-		ei_start = ei->addr;
-		addr = find_early_area(ei_start, ei_last, start, end,
-					 size, align);
-
-		if (addr != -1ULL)
-			return addr;
-	}
-	return -1ULL;
-}
-
-u64 __init find_fw_memmap_area(u64 start, u64 end, u64 size, u64 align)
-{
-	return find_e820_area(start, end, size, align);
+	return find_lmb_area(start, end, size, align);
 }
 
 u64 __init get_max_mapped(void)
@@ -770,30 +748,11 @@ u64 __init get_max_mapped(void)
  */
 u64 __init find_e820_area_size(u64 start, u64 *sizep, u64 align)
 {
-	int i;
-
-	for (i = 0; i < e820.nr_map; i++) {
-		struct e820entry *ei = &e820.map[i];
-		u64 addr;
-		u64 ei_start, ei_last;
-
-		if (ei->type != E820_RAM)
-			continue;
-
-		ei_last = ei->addr + ei->size;
-		ei_start = ei->addr;
-		addr = find_early_area_size(ei_start, ei_last, start,
-					 sizep, align);
-
-		if (addr != -1ULL)
-			return addr;
-	}
-
-	return -1ULL;
+	return find_lmb_area_size(start, sizep, align);
 }
 
 /*
- * pre allocated 4k and reserved it in e820
+ * pre-allocate 4k and reserve it in lmb and e820_saved
  */
 u64 __init early_reserve_e820(u64 startt, u64 sizet, u64 align)
 {
@@ -802,7 +761,7 @@ u64 __init early_reserve_e820(u64 startt, u64 sizet, u64 align)
 	u64 start;
 
 	for (start = startt; ; start += size) {
-		start = find_e820_area_size(start, &size, align);
+		start = find_lmb_area_size(start, &size, align);
 		if (!(start + 1))
 			return 0;
 		if (size >= sizet)
@@ -819,10 +778,9 @@ u64 __init early_reserve_e820(u64 startt, u64 sizet, u64 align)
 	addr = round_down(start + size - sizet, align);
 	if (addr < start)
 		return 0;
-	e820_update_range(addr, sizet, E820_RAM, E820_RESERVED);
+	reserve_lmb(addr, addr + sizet, "new next");
 	e820_update_range_saved(addr, sizet, E820_RAM, E820_RESERVED);
-	printk(KERN_INFO "update e820 for early_reserve_e820\n");
-	update_e820();
+	printk(KERN_INFO "update e820_saved for early_reserve_e820\n");
 	update_e820_saved();
 
 	return addr;
@@ -884,52 +842,12 @@ unsigned long __init e820_end_of_low_ram_pfn(void)
 {
 	return e820_end_pfn(1UL<<(32 - PAGE_SHIFT), E820_RAM);
 }
-/*
- * Finds an active region in the address range from start_pfn to last_pfn and
- * returns its range in ei_startpfn and ei_endpfn for the e820 entry.
- */
-int __init e820_find_active_region(const struct e820entry *ei,
-				  unsigned long start_pfn,
-				  unsigned long last_pfn,
-				  unsigned long *ei_startpfn,
-				  unsigned long *ei_endpfn)
-{
-	u64 align = PAGE_SIZE;
-
-	*ei_startpfn = round_up(ei->addr, align) >> PAGE_SHIFT;
-	*ei_endpfn = round_down(ei->addr + ei->size, align) >> PAGE_SHIFT;
-
-	/* Skip map entries smaller than a page */
-	if (*ei_startpfn >= *ei_endpfn)
-		return 0;
-
-	/* Skip if map is outside the node */
-	if (ei->type != E820_RAM || *ei_endpfn <= start_pfn ||
-				    *ei_startpfn >= last_pfn)
-		return 0;
-
-	/* Check for overlaps */
-	if (*ei_startpfn < start_pfn)
-		*ei_startpfn = start_pfn;
-	if (*ei_endpfn > last_pfn)
-		*ei_endpfn = last_pfn;
-
-	return 1;
-}
 
 /* Walk the e820 map and register active regions within a node */
 void __init e820_register_active_regions(int nid, unsigned long start_pfn,
 					 unsigned long last_pfn)
 {
-	unsigned long ei_startpfn;
-	unsigned long ei_endpfn;
-	int i;
-
-	for (i = 0; i < e820.nr_map; i++)
-		if (e820_find_active_region(&e820.map[i],
-					    start_pfn, last_pfn,
-					    &ei_startpfn, &ei_endpfn))
-			add_active_range(nid, ei_startpfn, ei_endpfn);
+	lmb_register_active_regions(nid, start_pfn, last_pfn);
 }
 
 /*
@@ -939,18 +857,16 @@ void __init e820_register_active_regions(int nid, unsigned long start_pfn,
  */
 u64 __init e820_hole_size(u64 start, u64 end)
 {
-	unsigned long start_pfn = start >> PAGE_SHIFT;
-	unsigned long last_pfn = end >> PAGE_SHIFT;
-	unsigned long ei_startpfn, ei_endpfn, ram = 0;
-	int i;
+	return lmb_hole_size(start, end);
+}
 
-	for (i = 0; i < e820.nr_map; i++) {
-		if (e820_find_active_region(&e820.map[i],
-					    start_pfn, last_pfn,
-					    &ei_startpfn, &ei_endpfn))
-			ram += ei_endpfn - ei_startpfn;
-	}
-	return end - start - ((u64)ram << PAGE_SHIFT);
+void reserve_early(u64 start, u64 end, char *name)
+{
+	reserve_lmb(start, end, name);
+}
+void free_early(u64 start, u64 end)
+{
+	free_lmb(start, end);
 }
 
 static void early_panic(char *msg)
@@ -1204,3 +1120,24 @@ void __init setup_memory_map(void)
 	printk(KERN_INFO "BIOS-provided physical RAM map:\n");
 	e820_print_map(who);
 }
+
+void __init init_lmb_memory(void)
+{
+	lmb_init();
+}
+
+void __init fill_lmb_memory(void)
+{
+	int i;
+
+	for (i = 0; i < e820.nr_map; i++) {
+		struct e820entry *ei = &e820.map[i];
+
+		if (ei->type != E820_RAM && ei->type != E820_RESERVED_KERN)
+			continue;
+		add_lmb_memory(ei->addr, ei->addr + ei->size);
+	}
+
+	lmb_analyze();
+	lmb_dump_all();
+}
diff --git a/arch/x86/kernel/head.c b/arch/x86/kernel/head.c
index 3e66bd3..0341b09 100644
--- a/arch/x86/kernel/head.c
+++ b/arch/x86/kernel/head.c
@@ -1,5 +1,6 @@
 #include <linux/kernel.h>
 #include <linux/init.h>
+#include <linux/lmb.h>
 
 #include <asm/setup.h>
 #include <asm/bios_ebda.h>
@@ -51,5 +52,5 @@ void __init reserve_ebda_region(void)
 		lowmem = 0x9f000;
 
 	/* reserve all memory between lowmem and the 1MB mark */
-	reserve_early_overlap_ok(lowmem, 0x100000, "BIOS reserved");
+	reserve_lmb_overlap_ok(lowmem, 0x100000, "BIOS reserved");
 }
diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index 9a97504..cbab479 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -7,6 +7,7 @@
 
 #include <linux/init.h>
 #include <linux/start_kernel.h>
+#include <linux/lmb.h>
 
 #include <asm/setup.h>
 #include <asm/sections.h>
@@ -29,14 +30,15 @@ static void __init i386_default_early_setup(void)
 
 void __init i386_start_kernel(void)
 {
+	init_lmb_memory();
+
 #ifdef CONFIG_X86_TRAMPOLINE
 	/*
 	 * But first pinch a few for the stack/trampoline stuff
 	 * FIXME: Don't need the extra page at 4K, but need to fix
 	 * trampoline before removing it. (see the GDT stuff)
 	 */
-	reserve_early_overlap_ok(PAGE_SIZE, PAGE_SIZE + PAGE_SIZE,
-					 "EX TRAMPOLINE");
+	reserve_lmb(PAGE_SIZE, PAGE_SIZE + PAGE_SIZE, "EX TRAMPOLINE");
 #endif
 
 	reserve_early(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 7147143..89dd2de 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -12,6 +12,7 @@
 #include <linux/percpu.h>
 #include <linux/start_kernel.h>
 #include <linux/io.h>
+#include <linux/lmb.h>
 
 #include <asm/processor.h>
 #include <asm/proto.h>
@@ -96,6 +97,8 @@ void __init x86_64_start_kernel(char * real_mode_data)
 
 void __init x86_64_start_reservations(char *real_mode_data)
 {
+	init_lmb_memory();
+
 	copy_bootdata(__va(real_mode_data));
 
 	reserve_early(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index a2c1edd..5779c27 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -11,6 +11,7 @@
 #include <linux/init.h>
 #include <linux/delay.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/kernel_stat.h>
 #include <linux/mc146818rtc.h>
 #include <linux/bitops.h>
@@ -664,7 +665,7 @@ static void __init smp_reserve_memory(struct mpf_intel *mpf)
 {
 	unsigned long size = get_mpc_size(mpf->physptr);
 
-	reserve_early(mpf->physptr, mpf->physptr+size, "MP-table mpc");
+	reserve_lmb_overlap_ok(mpf->physptr, mpf->physptr+size, "MP-table mpc");
 }
 
 static int __init smp_scan_config(unsigned long base, unsigned long length)
@@ -693,7 +694,7 @@ static int __init smp_scan_config(unsigned long base, unsigned long length)
 			       mpf, (u64)virt_to_phys(mpf));
 
 			mem = virt_to_phys(mpf);
-			reserve_early(mem, mem + sizeof(*mpf), "MP-table mpf");
+			reserve_lmb_overlap_ok(mem, mem + sizeof(*mpf), "MP-table mpf");
 			if (mpf->physptr)
 				smp_reserve_memory(mpf);
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 7ca6878..a5c029f 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -31,6 +31,7 @@
 #include <linux/apm_bios.h>
 #include <linux/initrd.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/seq_file.h>
 #include <linux/console.h>
 #include <linux/mca.h>
@@ -870,8 +871,6 @@ void __init setup_arch(char **cmdline_p)
 	 */
 	max_pfn = e820_end_of_ram_pfn();
 
-	/* preallocate 4k for mptable mpc */
-	early_reserve_e820_mpc_new();
 	/* update e820 for memory not covered by WB MTRRs */
 	mtrr_bp_init();
 	if (mtrr_trim_uncached_memory(max_pfn))
@@ -896,6 +895,23 @@ void __init setup_arch(char **cmdline_p)
 	max_pfn_mapped = KERNEL_IMAGE_SIZE >> PAGE_SHIFT;
 #endif
 
+	/*
+	 * Find and reserve possible boot-time SMP configuration:
+	 */
+	find_smp_config();
+
+	/*
+	 * Need to conclude brk before fill_lmb_memory():
+	 * it could use find_lmb_area(), and allocations could
+	 * overlap with the brk area.
+	 */
+	reserve_brk();
+
+	fill_lmb_memory();
+
+	/* preallocate 4k for mptable mpc */
+	early_reserve_e820_mpc_new();
+
 #ifdef CONFIG_X86_CHECK_BIOS_CORRUPTION
 	setup_bios_corruption_check();
 #endif
@@ -903,13 +919,6 @@ void __init setup_arch(char **cmdline_p)
 	printk(KERN_DEBUG "initial memory mapped : 0 - %08lx\n",
 			max_pfn_mapped<<PAGE_SHIFT);
 
-	reserve_brk();
-
-	/*
-	 * Find and reserve possible boot-time SMP configuration:
-	 */
-	find_smp_config();
-
 	reserve_trampoline_memory();
 
 #ifdef CONFIG_ACPI_SLEEP
@@ -972,7 +981,7 @@ void __init setup_arch(char **cmdline_p)
 
 	initmem_init(0, max_pfn, acpi, k8);
 #ifndef CONFIG_NO_BOOTMEM
-	early_res_to_bootmem(0, max_low_pfn<<PAGE_SHIFT);
+	lmb_to_bootmem(0, max_low_pfn<<PAGE_SHIFT);
 #endif
 
 	dma32_reserve_bootmem();
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index ef6370b..35abcb8 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -137,13 +137,7 @@ static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, size_t align)
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-#ifdef CONFIG_NO_BOOTMEM
-	u64 start = __pa(ptr);
-	u64 end = start + size;
-	free_early_partial(start, end);
-#else
 	free_bootmem(__pa(ptr), size);
-#endif
 }
 
 static int __init pcpu_cpu_distance(unsigned int from, unsigned int to)
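
The CONFIG_NO_BOOTMEM special case in pcpu_fc_free() can go because
free_lmb(), unlike the old free_early(), accepts freeing only part of a
reserved range. A minimal illustration of the assumed behaviour:

	reserve_lmb(base, base + 4 * PAGE_SIZE, "PERCPU");
	/* free_early() required the exact reserved range (hence
	 * free_early_partial()); free_lmb() just trims the reservation */
	free_lmb(base + PAGE_SIZE, base + 2 * PAGE_SIZE);
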
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 8948f47..fc9a403 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -7,6 +7,7 @@
 #include <linux/string.h>
 #include <linux/init.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/mmzone.h>
 #include <linux/ctype.h>
 #include <linux/module.h>
@@ -174,7 +175,7 @@ static void * __init early_node_mem(int nodeid, unsigned long start,
 	if (start < (MAX_DMA32_PFN<<PAGE_SHIFT) &&
 	    end > (MAX_DMA32_PFN<<PAGE_SHIFT))
 		start = MAX_DMA32_PFN<<PAGE_SHIFT;
-	mem = find_e820_area(start, end, size, align);
+	mem = find_lmb_area_node(nodeid, start, end, size, align);
 	if (mem != -1L)
 		return __va(mem);
 
@@ -184,7 +185,7 @@ static void * __init early_node_mem(int nodeid, unsigned long start,
 		start = MAX_DMA32_PFN<<PAGE_SHIFT;
 	else
 		start = MAX_DMA_PFN<<PAGE_SHIFT;
-	mem = find_e820_area(start, end, size, align);
+	mem = find_lmb_area_node(nodeid, start, end, size, align);
 	if (mem != -1L)
 		return __va(mem);
 
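find_e820_area() is replaced by a node-aware lookup here. The signature
below is taken from the call sites above; that the search is restricted
to lmb ranges registered for the given node is an assumption based on
the name:

	u64 find_lmb_area_node(int nid, u64 start, u64 end, u64 size, u64 align);
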
diff --git a/kernel/Makefile b/kernel/Makefile
index d5c3006..754bf79 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -11,7 +11,6 @@ obj-y     = sched.o fork.o exec_domain.o panic.o printk.o \
 	    hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
 	    notifier.o ksysfs.o pm_qos_params.o sched_clock.o cred.o \
 	    async.o range.o
-obj-$(CONFIG_HAVE_EARLY_RES) += early_res.o
 obj-y += groups.o
 
 ifdef CONFIG_FUNCTION_TRACER
diff --git a/mm/bootmem.c b/mm/bootmem.c
index fadbc3b..81f9350 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -14,6 +14,7 @@
 #include <linux/module.h>
 #include <linux/kmemleak.h>
 #include <linux/range.h>
+#include <linux/lmb.h>
 
 #include <asm/bug.h>
 #include <asm/io.h>
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index faae23a..42bc09e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3441,38 +3441,19 @@ int __init add_from_early_node_map(struct range *range, int az,
 void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
 					u64 goal, u64 limit)
 {
-	int i;
 	void *ptr;
 
-	/* need to go over early_node_map to find out good range for node */
-	for_each_active_range_index_in_nid(i, nid) {
-		u64 addr;
-		u64 ei_start, ei_last;
-
-		ei_last = early_node_map[i].end_pfn;
-		ei_last <<= PAGE_SHIFT;
-		ei_start = early_node_map[i].start_pfn;
-		ei_start <<= PAGE_SHIFT;
-		addr = find_early_area(ei_start, ei_last,
-					 goal, limit, size, align);
-
-		if (addr == -1ULL)
-			continue;
+	u64 addr;
 
-#if 0
-		printk(KERN_DEBUG "alloc (nid=%d %llx - %llx) (%llx - %llx) %llx %llx => %llx\n",
-				nid,
-				ei_start, ei_last, goal, limit, size,
-				align, addr);
-#endif
+	addr = find_memory_core_early(nid, size, align, goal, limit);
 
-		ptr = phys_to_virt(addr);
-		memset(ptr, 0, size);
-		reserve_early_without_check(addr, addr + size, "BOOTMEM");
-		return ptr;
-	}
+	if (addr == -1ULL)
+		return NULL;
 
-	return NULL;
+	ptr = phys_to_virt(addr);
+	memset(ptr, 0, size);
+	reserve_lmb(addr, addr + size, "BOOTMEM");
+	return ptr;
 }
 #endif
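
The open-coded walk over early_node_map is factored out into
find_memory_core_early(); a sketch of what that helper presumably does,
reconstructed from the loop it replaces (whether it still uses
find_early_area() or an lmb-based range check underneath is not visible
in this patch):

u64 __init find_memory_core_early(int nid, u64 size, u64 align,
				  u64 goal, u64 limit)
{
	int i;

	/* go over early_node_map to find a good range for the node */
	for_each_active_range_index_in_nid(i, nid) {
		u64 ei_start = (u64)early_node_map[i].start_pfn << PAGE_SHIFT;
		u64 ei_last  = (u64)early_node_map[i].end_pfn << PAGE_SHIFT;
		u64 addr = find_early_area(ei_start, ei_last, goal, limit,
					   size, align);

		if (addr != -1ULL)
			return addr;
	}
	return -1ULL;
}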
 
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 392b9bb..cb917d5 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -219,18 +219,7 @@ void __init sparse_mem_maps_populate_node(struct page **map_map,
 
 	if (vmemmap_buf_start) {
 		/* need to free left buf */
-#ifdef CONFIG_NO_BOOTMEM
-		free_early(__pa(vmemmap_buf_start), __pa(vmemmap_buf_end));
-		if (vmemmap_buf_start < vmemmap_buf) {
-			char name[15];
-
-			snprintf(name, sizeof(name), "MEMMAP %d", nodeid);
-			reserve_early_without_check(__pa(vmemmap_buf_start),
-						    __pa(vmemmap_buf), name);
-		}
-#else
 		free_bootmem(__pa(vmemmap_buf), vmemmap_buf_end - vmemmap_buf);
-#endif
 		vmemmap_buf = NULL;
 		vmemmap_buf_end = NULL;
 	}
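
The removed NO_BOOTMEM branch is redundant because, with
CONFIG_NO_BOOTMEM set, free_bootmem() itself forwards to free_lmb()
(via the free_early() wrapper added in this patch, and directly after
the follow-up rename), and free_lmb() copes with freeing part of the
earlier "BOOTMEM" reservation:

#ifdef CONFIG_NO_BOOTMEM
void __init free_bootmem(unsigned long addr, unsigned long size)
{
	free_lmb(addr, addr + size);	/* partial free is fine */
}
#endif
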
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 23/31] x86: Replace e820_/_early string with lmb_
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

1. Include linux/lmb.h directly, so that e820.h references can be reduced later.
2. This patch was mostly generated with sed scripts.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
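Since the bulk of this patch is a mechanical rename, here is an
illustrative before/after taken from the trampoline.c hunk below; only
the function names change, not the arguments:

	/* before */
	mem = find_e820_area(0, 1<<20, TRAMPOLINE_SIZE, PAGE_SIZE);
	reserve_early(mem, mem + TRAMPOLINE_SIZE, "TRAMPOLINE");

	/* after */
	mem = find_lmb_area(0, 1<<20, TRAMPOLINE_SIZE, PAGE_SIZE);
	reserve_lmb(mem, mem + TRAMPOLINE_SIZE, "TRAMPOLINE");
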
 arch/x86/include/asm/efi.h      |    2 +-
 arch/x86/kernel/acpi/sleep.c    |    5 +++--
 arch/x86/kernel/apic/numaq_32.c |    3 ++-
 arch/x86/kernel/efi.c           |    5 +++--
 arch/x86/kernel/head32.c        |    4 ++--
 arch/x86/kernel/head64.c        |    4 ++--
 arch/x86/kernel/setup.c         |   25 ++++++++++++-------------
 arch/x86/kernel/trampoline.c    |    6 +++---
 arch/x86/mm/init.c              |    5 +++--
 arch/x86/mm/init_32.c           |   10 ++++++----
 arch/x86/mm/init_64.c           |    9 +++++----
 arch/x86/mm/k8topology_64.c     |    4 +++-
 arch/x86/mm/memtest.c           |    7 +++----
 arch/x86/mm/numa_32.c           |   17 +++++++++--------
 arch/x86/mm/numa_64.c           |   32 ++++++++++++++++----------------
 arch/x86/mm/srat_32.c           |    3 ++-
 arch/x86/mm/srat_64.c           |    9 +++++----
 arch/x86/xen/mmu.c              |    5 +++--
 arch/x86/xen/setup.c            |    3 ++-
 mm/bootmem.c                    |    4 ++--
 20 files changed, 87 insertions(+), 75 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 8406ed7..f756536 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -90,7 +90,7 @@ extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
 #endif /* CONFIG_X86_32 */
 
 extern int add_efi_memmap;
-extern void efi_reserve_early(void);
+extern void efi_reserve_lmb(void);
 extern void efi_call_phys_prelog(void);
 extern void efi_call_phys_epilog(void);
 
diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index f996103..0cabfaa 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -7,6 +7,7 @@
 
 #include <linux/acpi.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/dmi.h>
 #include <linux/cpumask.h>
 #include <asm/segment.h>
@@ -133,7 +134,7 @@ void __init acpi_reserve_wakeup_memory(void)
 		return;
 	}
 
-	mem = find_e820_area(0, 1<<20, WAKEUP_SIZE, PAGE_SIZE);
+	mem = find_lmb_area(0, 1<<20, WAKEUP_SIZE, PAGE_SIZE);
 
 	if (mem == -1L) {
 		printk(KERN_ERR "ACPI: Cannot allocate lowmem, S3 disabled.\n");
@@ -141,7 +142,7 @@ void __init acpi_reserve_wakeup_memory(void)
 	}
 	acpi_realmode = (unsigned long) phys_to_virt(mem);
 	acpi_wakeup_address = mem;
-	reserve_early(mem, mem + WAKEUP_SIZE, "ACPI WAKEUP");
+	reserve_lmb(mem, mem + WAKEUP_SIZE, "ACPI WAKEUP");
 }
 
 
diff --git a/arch/x86/kernel/apic/numaq_32.c b/arch/x86/kernel/apic/numaq_32.c
index 3e28401..c71e494 100644
--- a/arch/x86/kernel/apic/numaq_32.c
+++ b/arch/x86/kernel/apic/numaq_32.c
@@ -26,6 +26,7 @@
 #include <linux/nodemask.h>
 #include <linux/topology.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/threads.h>
 #include <linux/cpumask.h>
 #include <linux/kernel.h>
@@ -88,7 +89,7 @@ static inline void numaq_register_node(int node, struct sys_cfg_data *scd)
 	node_end_pfn[node] =
 		 MB_TO_PAGES(eq->hi_shrd_mem_start + eq->hi_shrd_mem_size);
 
-	e820_register_active_regions(node, node_start_pfn[node],
+	lmb_register_active_regions(node, node_start_pfn[node],
 						node_end_pfn[node]);
 
 	memory_present(node, node_start_pfn[node], node_end_pfn[node]);
diff --git a/arch/x86/kernel/efi.c b/arch/x86/kernel/efi.c
index 299f03f..35038a7 100644
--- a/arch/x86/kernel/efi.c
+++ b/arch/x86/kernel/efi.c
@@ -30,6 +30,7 @@
 #include <linux/init.h>
 #include <linux/efi.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/spinlock.h>
 #include <linux/uaccess.h>
 #include <linux/time.h>
@@ -275,7 +276,7 @@ static void __init do_add_efi_memmap(void)
 	sanitize_e820_map();
 }
 
-void __init efi_reserve_early(void)
+void __init efi_reserve_lmb(void)
 {
 	unsigned long pmap;
 
@@ -290,7 +291,7 @@ void __init efi_reserve_early(void)
 		boot_params.efi_info.efi_memdesc_size;
 	memmap.desc_version = boot_params.efi_info.efi_memdesc_version;
 	memmap.desc_size = boot_params.efi_info.efi_memdesc_size;
-	reserve_early(pmap, pmap + memmap.nr_map * memmap.desc_size,
+	reserve_lmb(pmap, pmap + memmap.nr_map * memmap.desc_size,
 		      "EFI memmap");
 }
 
diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index cbab479..9d09d26 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -41,7 +41,7 @@ void __init i386_start_kernel(void)
 	reserve_lmb(PAGE_SIZE, PAGE_SIZE + PAGE_SIZE, "EX TRAMPOLINE");
 #endif
 
-	reserve_early(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
+	reserve_lmb(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
 
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
@@ -50,7 +50,7 @@ void __init i386_start_kernel(void)
 		u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 		u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
 		u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
+		reserve_lmb(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
 
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 89dd2de..d4442a8 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -101,7 +101,7 @@ void __init x86_64_start_reservations(char *real_mode_data)
 
 	copy_bootdata(__va(real_mode_data));
 
-	reserve_early(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
+	reserve_lmb(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
 
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
@@ -110,7 +110,7 @@ void __init x86_64_start_reservations(char *real_mode_data)
 		unsigned long ramdisk_image = boot_params.hdr.ramdisk_image;
 		unsigned long ramdisk_size  = boot_params.hdr.ramdisk_size;
 		unsigned long ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
+		reserve_lmb(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a5c029f..3d43f12 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -303,7 +303,7 @@ static inline void init_gbpages(void)
 static void __init reserve_brk(void)
 {
 	if (_brk_end > _brk_start)
-		reserve_early(__pa(_brk_start), __pa(_brk_end), "BRK");
+		reserve_lmb(__pa(_brk_start), __pa(_brk_end), "BRK");
 
 	/* Mark brk area as locked down and no longer taking any
 	   new allocations */
@@ -325,7 +325,7 @@ static void __init relocate_initrd(void)
 	char *p, *q;
 
 	/* We need to move the initrd down into lowmem */
-	ramdisk_here = find_e820_area(0, end_of_lowmem, area_size,
+	ramdisk_here = find_lmb_area(0, end_of_lowmem, area_size,
 					 PAGE_SIZE);
 
 	if (ramdisk_here == -1ULL)
@@ -334,8 +334,7 @@ static void __init relocate_initrd(void)
 
 	/* Note: this includes all the lowmem currently occupied by
 	   the initrd, we rely on that fact to keep the data intact. */
-	reserve_early(ramdisk_here, ramdisk_here + area_size,
-			 "NEW RAMDISK");
+	reserve_lmb(ramdisk_here, ramdisk_here + area_size, "NEW RAMDISK");
 	initrd_start = ramdisk_here + PAGE_OFFSET;
 	initrd_end   = initrd_start + ramdisk_size;
 	printk(KERN_INFO "Allocated new RAMDISK: %08llx - %08llx\n",
@@ -391,7 +390,7 @@ static void __init reserve_initrd(void)
 	initrd_start = 0;
 
 	if (ramdisk_size >= (end_of_lowmem>>1)) {
-		free_early(ramdisk_image, ramdisk_end);
+		free_lmb(ramdisk_image, ramdisk_end);
 		printk(KERN_ERR "initrd too large to handle, "
 		       "disabling initrd\n");
 		return;
@@ -414,7 +413,7 @@ static void __init reserve_initrd(void)
 
 	relocate_initrd();
 
-	free_early(ramdisk_image, ramdisk_end);
+	free_lmb(ramdisk_image, ramdisk_end);
 }
 #else
 static void __init reserve_initrd(void)
@@ -470,7 +469,7 @@ static void __init e820_reserve_setup_data(void)
 	e820_print_map("reserve setup_data");
 }
 
-static void __init reserve_early_setup_data(void)
+static void __init reserve_lmb_setup_data(void)
 {
 	struct setup_data *data;
 	u64 pa_data;
@@ -482,7 +481,7 @@ static void __init reserve_early_setup_data(void)
 	while (pa_data) {
 		data = early_memremap(pa_data, sizeof(*data));
 		sprintf(buf, "setup data %x", data->type);
-		reserve_early(pa_data, pa_data+sizeof(*data)+data->len, buf);
+		reserve_lmb(pa_data, pa_data+sizeof(*data)+data->len, buf);
 		pa_data = data->next;
 		early_iounmap(data, sizeof(*data));
 	}
@@ -520,7 +519,7 @@ static void __init reserve_crashkernel(void)
 	if (crash_base <= 0) {
 		const unsigned long long alignment = 16<<20;	/* 16M */
 
-		crash_base = find_e820_area(alignment, ULONG_MAX, crash_size,
+		crash_base = find_lmb_area(alignment, ULONG_MAX, crash_size,
 				 alignment);
 		if (crash_base == -1ULL) {
 			pr_info("crashkernel reservation failed - No suitable area found.\n");
@@ -529,14 +528,14 @@ static void __init reserve_crashkernel(void)
 	} else {
 		unsigned long long start;
 
-		start = find_e820_area(crash_base, ULONG_MAX, crash_size,
+		start = find_lmb_area(crash_base, ULONG_MAX, crash_size,
 				 1<<20);
 		if (start != crash_base) {
 			pr_info("crashkernel reservation failed - memory is in use.\n");
 			return;
 		}
 	}
-	reserve_early(crash_base, crash_base + crash_size, "CRASH KERNEL");
+	reserve_lmb(crash_base, crash_base + crash_size, "CRASH KERNEL");
 
 	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
 			"for crashkernel (System RAM: %ldMB)\n",
@@ -756,7 +755,7 @@ void __init setup_arch(char **cmdline_p)
 #endif
 	 4)) {
 		efi_enabled = 1;
-		efi_reserve_early();
+		efi_reserve_lmb();
 	}
 #endif
 
@@ -816,7 +815,7 @@ void __init setup_arch(char **cmdline_p)
 	vmi_activate();
 
 	/* after early param, so could get panic from serial */
-	reserve_early_setup_data();
+	reserve_lmb_setup_data();
 
 	if (acpi_mps_check()) {
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/kernel/trampoline.c b/arch/x86/kernel/trampoline.c
index c652ef6..4a634f6 100644
--- a/arch/x86/kernel/trampoline.c
+++ b/arch/x86/kernel/trampoline.c
@@ -1,7 +1,7 @@
 #include <linux/io.h>
+#include <linux/lmb.h>
 
 #include <asm/trampoline.h>
-#include <asm/e820.h>
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_ACPI_SLEEP)
 #define __trampinit
@@ -19,12 +19,12 @@ void __init reserve_trampoline_memory(void)
 	unsigned long mem;
 
 	/* Has to be in very low memory so we can execute real-mode AP code. */
-	mem = find_e820_area(0, 1<<20, TRAMPOLINE_SIZE, PAGE_SIZE);
+	mem = find_lmb_area(0, 1<<20, TRAMPOLINE_SIZE, PAGE_SIZE);
 	if (mem == -1L)
 		panic("Cannot allocate trampoline\n");
 
 	trampoline_base = __va(mem);
-	reserve_early(mem, mem + TRAMPOLINE_SIZE, "TRAMPOLINE");
+	reserve_lmb(mem, mem + TRAMPOLINE_SIZE, "TRAMPOLINE");
 }
 
 /*
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 452ee5b..837aaa2 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -1,6 +1,7 @@
 #include <linux/initrd.h>
 #include <linux/ioport.h>
 #include <linux/swap.h>
+#include <linux/lmb.h>
 
 #include <asm/cacheflush.h>
 #include <asm/e820.h>
@@ -74,7 +75,7 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
 #else
 	start = 0x8000;
 #endif
-	e820_table_start = find_e820_area(start, max_pfn_mapped<<PAGE_SHIFT,
+	e820_table_start = find_lmb_area(start, max_pfn_mapped<<PAGE_SHIFT,
 					tables, PAGE_SIZE);
 	if (e820_table_start == -1UL)
 		panic("Cannot find space for the kernel page tables");
@@ -298,7 +299,7 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 	__flush_tlb_all();
 
 	if (!after_bootmem && e820_table_end > e820_table_start)
-		reserve_early(e820_table_start << PAGE_SHIFT,
+		reserve_lmb(e820_table_start << PAGE_SHIFT,
 				 e820_table_end << PAGE_SHIFT, "PGTABLE");
 
 	if (!after_bootmem)
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 804bbe9..26b9ceb 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -25,6 +25,7 @@
 #include <linux/pfn.h>
 #include <linux/poison.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/slab.h>
 #include <linux/proc_fs.h>
 #include <linux/memory_hotplug.h>
@@ -712,14 +713,14 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 	highstart_pfn = highend_pfn = max_pfn;
 	if (max_pfn > max_low_pfn)
 		highstart_pfn = max_low_pfn;
-	e820_register_active_regions(0, 0, highend_pfn);
+	lmb_register_active_regions(0, 0, highend_pfn);
 	sparse_memory_present_with_active_regions(0);
 	printk(KERN_NOTICE "%ldMB HIGHMEM available.\n",
 		pages_to_mb(highend_pfn - highstart_pfn));
 	num_physpages = highend_pfn;
 	high_memory = (void *) __va(highstart_pfn * PAGE_SIZE - 1) + 1;
 #else
-	e820_register_active_regions(0, 0, max_low_pfn);
+	lmb_register_active_regions(0, 0, max_low_pfn);
 	sparse_memory_present_with_active_regions(0);
 	num_physpages = max_low_pfn;
 	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE - 1) + 1;
@@ -781,11 +782,11 @@ void __init setup_bootmem_allocator(void)
 	 * Initialize the boot-time allocator (with low memory only):
 	 */
 	bootmap_size = bootmem_bootmap_pages(max_low_pfn)<<PAGE_SHIFT;
-	bootmap = find_e820_area(0, max_pfn_mapped<<PAGE_SHIFT, bootmap_size,
+	bootmap = find_lmb_area(0, max_pfn_mapped<<PAGE_SHIFT, bootmap_size,
 				 PAGE_SIZE);
 	if (bootmap == -1L)
 		panic("Cannot find bootmem map of size %ld\n", bootmap_size);
-	reserve_early(bootmap, bootmap + bootmap_size, "BOOTMAP");
+	reserve_lmb(bootmap, bootmap + bootmap_size, "BOOTMAP");
 #endif
 
 	printk(KERN_INFO "  mapped low ram: 0 - %08lx\n",
@@ -1069,3 +1070,4 @@ void mark_rodata_ro(void)
 #endif
 }
 #endif
+
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ba6b0e..b86492e 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -21,6 +21,7 @@
 #include <linux/initrd.h>
 #include <linux/pagemap.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/proc_fs.h>
 #include <linux/pci.h>
 #include <linux/pfn.h>
@@ -576,18 +577,18 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 	unsigned long bootmap_size, bootmap;
 
 	bootmap_size = bootmem_bootmap_pages(end_pfn)<<PAGE_SHIFT;
-	bootmap = find_e820_area(0, end_pfn<<PAGE_SHIFT, bootmap_size,
+	bootmap = find_lmb_area(0, end_pfn<<PAGE_SHIFT, bootmap_size,
 				 PAGE_SIZE);
 	if (bootmap == -1L)
 		panic("Cannot find bootmem map of size %ld\n", bootmap_size);
-	reserve_early(bootmap, bootmap + bootmap_size, "BOOTMAP");
+	reserve_lmb(bootmap, bootmap + bootmap_size, "BOOTMAP");
 	/* don't touch min_low_pfn */
 	bootmap_size = init_bootmem_node(NODE_DATA(0), bootmap >> PAGE_SHIFT,
 					 0, end_pfn);
-	e820_register_active_regions(0, start_pfn, end_pfn);
+	lmb_register_active_regions(0, start_pfn, end_pfn);
 	free_bootmem_with_active_regions(0, end_pfn);
 #else
-	e820_register_active_regions(0, start_pfn, end_pfn);
+	lmb_register_active_regions(0, start_pfn, end_pfn);
 #endif
 }
 #endif
diff --git a/arch/x86/mm/k8topology_64.c b/arch/x86/mm/k8topology_64.c
index 970ed57..d7d031b 100644
--- a/arch/x86/mm/k8topology_64.c
+++ b/arch/x86/mm/k8topology_64.c
@@ -11,6 +11,8 @@
 #include <linux/string.h>
 #include <linux/module.h>
 #include <linux/nodemask.h>
+#include <linux/lmb.h>
+
 #include <asm/io.h>
 #include <linux/pci_ids.h>
 #include <linux/acpi.h>
@@ -222,7 +224,7 @@ int __init k8_scan_nodes(void)
 	for_each_node_mask(i, node_possible_map) {
 		int j;
 
-		e820_register_active_regions(i,
+		lmb_register_active_regions(i,
 				nodes[i].start >> PAGE_SHIFT,
 				nodes[i].end >> PAGE_SHIFT);
 		for (j = apicid_base; j < cores + apicid_base; j++)
diff --git a/arch/x86/mm/memtest.c b/arch/x86/mm/memtest.c
index 18d244f..f889776 100644
--- a/arch/x86/mm/memtest.c
+++ b/arch/x86/mm/memtest.c
@@ -6,8 +6,7 @@
 #include <linux/smp.h>
 #include <linux/init.h>
 #include <linux/pfn.h>
-
-#include <asm/e820.h>
+#include <linux/lmb.h>
 
 static u64 patterns[] __initdata = {
 	0,
@@ -35,7 +34,7 @@ static void __init reserve_bad_mem(u64 pattern, u64 start_bad, u64 end_bad)
 	       (unsigned long long) pattern,
 	       (unsigned long long) start_bad,
 	       (unsigned long long) end_bad);
-	reserve_early(start_bad, end_bad, "BAD RAM");
+	reserve_lmb(start_bad, end_bad, "BAD RAM");
 }
 
 static void __init memtest(u64 pattern, u64 start_phys, u64 size)
@@ -74,7 +73,7 @@ static void __init do_one_pass(u64 pattern, u64 start, u64 end)
 	u64 size = 0;
 
 	while (start < end) {
-		start = find_e820_area_size(start, &size, 1);
+		start = find_lmb_area_size(start, &size, 1);
 
 		/* done ? */
 		if (start >= end)
diff --git a/arch/x86/mm/numa_32.c b/arch/x86/mm/numa_32.c
index 809baaa..3434a93 100644
--- a/arch/x86/mm/numa_32.c
+++ b/arch/x86/mm/numa_32.c
@@ -24,6 +24,7 @@
 
 #include <linux/mm.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/mmzone.h>
 #include <linux/highmem.h>
 #include <linux/initrd.h>
@@ -120,7 +121,7 @@ int __init get_memcfg_numa_flat(void)
 
 	node_start_pfn[0] = 0;
 	node_end_pfn[0] = max_pfn;
-	e820_register_active_regions(0, 0, max_pfn);
+	lmb_register_active_regions(0, 0, max_pfn);
 	memory_present(0, 0, max_pfn);
 	node_remap_size[0] = node_memmap_size_bytes(0, 0, max_pfn);
 
@@ -161,14 +162,14 @@ static void __init allocate_pgdat(int nid)
 		NODE_DATA(nid) = (pg_data_t *)node_remap_start_vaddr[nid];
 	else {
 		unsigned long pgdat_phys;
-		pgdat_phys = find_e820_area(min_low_pfn<<PAGE_SHIFT,
+		pgdat_phys = find_lmb_area(min_low_pfn<<PAGE_SHIFT,
 				 max_pfn_mapped<<PAGE_SHIFT,
 				 sizeof(pg_data_t),
 				 PAGE_SIZE);
 		NODE_DATA(nid) = (pg_data_t *)(pfn_to_kaddr(pgdat_phys>>PAGE_SHIFT));
 		memset(buf, 0, sizeof(buf));
 		sprintf(buf, "NODE_DATA %d",  nid);
-		reserve_early(pgdat_phys, pgdat_phys + sizeof(pg_data_t), buf);
+		reserve_lmb(pgdat_phys, pgdat_phys + sizeof(pg_data_t), buf);
 	}
 	printk(KERN_DEBUG "allocate_pgdat: node %d NODE_DATA %08lx\n",
 		nid, (unsigned long)NODE_DATA(nid));
@@ -291,7 +292,7 @@ static __init unsigned long calculate_numa_remap_pages(void)
 						 PTRS_PER_PTE);
 		node_kva_target <<= PAGE_SHIFT;
 		do {
-			node_kva_final = find_e820_area(node_kva_target,
+			node_kva_final = find_lmb_area(node_kva_target,
 					((u64)node_end_pfn[nid])<<PAGE_SHIFT,
 						((u64)size)<<PAGE_SHIFT,
 						LARGE_PAGE_BYTES);
@@ -318,9 +319,9 @@ static __init unsigned long calculate_numa_remap_pages(void)
 		 *  but we could have some hole in high memory, and it will only
 		 *  check page_is_ram(pfn) && !page_is_reserved_early(pfn) to decide
 		 *  to use it as free.
-		 *  So reserve_early here, hope we don't run out of that array
+		 *  So reserve_lmb here, hope we don't run out of that array
 		 */
-		reserve_early(node_kva_final,
+		reserve_lmb(node_kva_final,
 			      node_kva_final+(((u64)size)<<PAGE_SHIFT),
 			      "KVA RAM");
 
@@ -367,7 +368,7 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 
 	kva_target_pfn = round_down(max_low_pfn - kva_pages, PTRS_PER_PTE);
 	do {
-		kva_start_pfn = find_e820_area(kva_target_pfn<<PAGE_SHIFT,
+		kva_start_pfn = find_lmb_area(kva_target_pfn<<PAGE_SHIFT,
 					max_low_pfn<<PAGE_SHIFT,
 					kva_pages<<PAGE_SHIFT,
 					PTRS_PER_PTE<<PAGE_SHIFT) >> PAGE_SHIFT;
@@ -382,7 +383,7 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 	printk(KERN_INFO "max_pfn = %lx\n", max_pfn);
 
 	/* avoid clash with initrd */
-	reserve_early(kva_start_pfn<<PAGE_SHIFT,
+	reserve_lmb(kva_start_pfn<<PAGE_SHIFT,
 		      (kva_start_pfn + kva_pages)<<PAGE_SHIFT,
 		     "KVA PG");
 #ifdef CONFIG_HIGHMEM
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index fc9a403..80b568d 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -90,7 +90,7 @@ static int __init allocate_cachealigned_memnodemap(void)
 
 	addr = 0x8000;
 	nodemap_size = roundup(sizeof(s16) * memnodemapsize, L1_CACHE_BYTES);
-	nodemap_addr = find_e820_area(addr, max_pfn<<PAGE_SHIFT,
+	nodemap_addr = find_lmb_area(addr, max_pfn<<PAGE_SHIFT,
 				      nodemap_size, L1_CACHE_BYTES);
 	if (nodemap_addr == -1UL) {
 		printk(KERN_ERR
@@ -99,7 +99,7 @@ static int __init allocate_cachealigned_memnodemap(void)
 		return -1;
 	}
 	memnodemap = phys_to_virt(nodemap_addr);
-	reserve_early(nodemap_addr, nodemap_addr + nodemap_size, "MEMNODEMAP");
+	reserve_lmb(nodemap_addr, nodemap_addr + nodemap_size, "MEMNODEMAP");
 
 	printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
 	       nodemap_addr, nodemap_addr + nodemap_size);
@@ -230,7 +230,7 @@ setup_node_bootmem(int nodeid, unsigned long start, unsigned long end)
 	if (node_data[nodeid] == NULL)
 		return;
 	nodedata_phys = __pa(node_data[nodeid]);
-	reserve_early(nodedata_phys, nodedata_phys + pgdat_size, "NODE_DATA");
+	reserve_lmb(nodedata_phys, nodedata_phys + pgdat_size, "NODE_DATA");
 	printk(KERN_INFO "  NODE_DATA [%016lx - %016lx]\n", nodedata_phys,
 		nodedata_phys + pgdat_size - 1);
 	nid = phys_to_nid(nodedata_phys);
@@ -249,7 +249,7 @@ setup_node_bootmem(int nodeid, unsigned long start, unsigned long end)
 	 * Find a place for the bootmem map
 	 * nodedata_phys could be on other nodes by alloc_bootmem,
 	 * so need to sure bootmap_start not to be small, otherwise
-	 * early_node_mem will get that with find_e820_area instead
+	 * early_node_mem will get that with find_lmb_area instead
 	 * of alloc_bootmem, that could clash with reserved range
 	 */
 	bootmap_pages = bootmem_bootmap_pages(last_pfn - start_pfn);
@@ -261,12 +261,12 @@ setup_node_bootmem(int nodeid, unsigned long start, unsigned long end)
 	bootmap = early_node_mem(nodeid, bootmap_start, end,
 				 bootmap_pages<<PAGE_SHIFT, PAGE_SIZE);
 	if (bootmap == NULL)  {
-		free_early(nodedata_phys, nodedata_phys + pgdat_size);
+		free_lmb(nodedata_phys, nodedata_phys + pgdat_size);
 		node_data[nodeid] = NULL;
 		return;
 	}
 	bootmap_start = __pa(bootmap);
-	reserve_early(bootmap_start, bootmap_start+(bootmap_pages<<PAGE_SHIFT),
+	reserve_lmb(bootmap_start, bootmap_start+(bootmap_pages<<PAGE_SHIFT),
 			"BOOTMAP");
 
 	bootmap_size = init_bootmem_node(NODE_DATA(nodeid),
@@ -420,7 +420,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 		nr_nodes = MAX_NUMNODES;
 	}
 
-	size = (max_addr - addr - e820_hole_size(addr, max_addr)) / nr_nodes;
+	size = (max_addr - addr - lmb_hole_size(addr, max_addr)) / nr_nodes;
 	/*
 	 * Calculate the number of big nodes that can be allocated as a result
 	 * of consolidating the remainder.
@@ -456,7 +456,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 			 * non-reserved memory is less than the per-node size.
 			 */
 			while (end - physnodes[i].start -
-				e820_hole_size(physnodes[i].start, end) < size) {
+				lmb_hole_size(physnodes[i].start, end) < size) {
 				end += FAKE_NODE_MIN_SIZE;
 				if (end > physnodes[i].end) {
 					end = physnodes[i].end;
@@ -470,7 +470,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 			 * this one must extend to the boundary.
 			 */
 			if (end < dma32_end && dma32_end - end -
-			    e820_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
+			    lmb_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
 				end = dma32_end;
 
 			/*
@@ -479,7 +479,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 			 * physical node.
 			 */
 			if (physnodes[i].end - end -
-			    e820_hole_size(end, physnodes[i].end) < size)
+			    lmb_hole_size(end, physnodes[i].end) < size)
 				end = physnodes[i].end;
 
 			/*
@@ -507,7 +507,7 @@ static u64 __init find_end_of_node(u64 start, u64 max_addr, u64 size)
 {
 	u64 end = start + size;
 
-	while (end - start - e820_hole_size(start, end) < size) {
+	while (end - start - lmb_hole_size(start, end) < size) {
 		end += FAKE_NODE_MIN_SIZE;
 		if (end > max_addr) {
 			end = max_addr;
@@ -536,7 +536,7 @@ static int __init split_nodes_size_interleave(u64 addr, u64 max_addr, u64 size)
 	 * creates a uniform distribution of node sizes across the entire
 	 * machine (but not necessarily over physical nodes).
 	 */
-	min_size = (max_addr - addr - e820_hole_size(addr, max_addr)) /
+	min_size = (max_addr - addr - lmb_hole_size(addr, max_addr)) /
 						MAX_NUMNODES;
 	min_size = max(min_size, FAKE_NODE_MIN_SIZE);
 	if ((min_size & FAKE_NODE_MIN_HASH_MASK) < min_size)
@@ -569,7 +569,7 @@ static int __init split_nodes_size_interleave(u64 addr, u64 max_addr, u64 size)
 			 * this one must extend to the boundary.
 			 */
 			if (end < dma32_end && dma32_end - end -
-			    e820_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
+			    lmb_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
 				end = dma32_end;
 
 			/*
@@ -578,7 +578,7 @@ static int __init split_nodes_size_interleave(u64 addr, u64 max_addr, u64 size)
 			 * physical node.
 			 */
 			if (physnodes[i].end - end -
-			    e820_hole_size(end, physnodes[i].end) < size)
+			    lmb_hole_size(end, physnodes[i].end) < size)
 				end = physnodes[i].end;
 
 			/*
@@ -642,7 +642,7 @@ static int __init numa_emulation(unsigned long start_pfn,
 	 */
 	remove_all_active_ranges();
 	for_each_node_mask(i, node_possible_map) {
-		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
+		lmb_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
 		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
 	}
@@ -695,7 +695,7 @@ void __init initmem_init(unsigned long start_pfn, unsigned long last_pfn,
 	node_set(0, node_possible_map);
 	for (i = 0; i < nr_cpu_ids; i++)
 		numa_set_node(i, 0);
-	e820_register_active_regions(0, start_pfn, last_pfn);
+	lmb_register_active_regions(0, start_pfn, last_pfn);
 	setup_node_bootmem(0, start_pfn << PAGE_SHIFT, last_pfn << PAGE_SHIFT);
 }
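
lmb_hole_size() keeps the contract of e820_hole_size() throughout the
fake-NUMA sizing above: it returns the number of bytes in [start, end)
that are not backed by registered RAM, so usable memory per candidate
node is computed as:

	u64 usable = (end - start) - lmb_hole_size(start, end);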
 
diff --git a/arch/x86/mm/srat_32.c b/arch/x86/mm/srat_32.c
index 9324f13..68dd606 100644
--- a/arch/x86/mm/srat_32.c
+++ b/arch/x86/mm/srat_32.c
@@ -25,6 +25,7 @@
  */
 #include <linux/mm.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/mmzone.h>
 #include <linux/acpi.h>
 #include <linux/nodemask.h>
@@ -264,7 +265,7 @@ int __init get_memcfg_from_srat(void)
 		if (node_read_chunk(chunk->nid, chunk))
 			continue;
 
-		e820_register_active_regions(chunk->nid, chunk->start_pfn,
+		lmb_register_active_regions(chunk->nid, chunk->start_pfn,
 					     min(chunk->end_pfn, max_pfn));
 	}
 	/* for out of order entries in SRAT */
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index 28c6876..9eca2bc 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -16,6 +16,7 @@
 #include <linux/module.h>
 #include <linux/topology.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/mm.h>
 #include <asm/proto.h>
 #include <asm/numa.h>
@@ -98,7 +99,7 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
 	unsigned long phys;
 
 	length = slit->header.length;
-	phys = find_e820_area(0, max_pfn_mapped<<PAGE_SHIFT, length,
+	phys = find_lmb_area(0, max_pfn_mapped<<PAGE_SHIFT, length,
 		 PAGE_SIZE);
 
 	if (phys == -1L)
@@ -106,7 +107,7 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
 
 	acpi_slit = __va(phys);
 	memcpy(acpi_slit, slit, length);
-	reserve_early(phys, phys + length, "ACPI SLIT");
+	reserve_lmb(phys, phys + length, "ACPI SLIT");
 }
 
 /* Callback for Proximity Domain -> x2APIC mapping */
@@ -324,7 +325,7 @@ static int __init nodes_cover_memory(const struct bootnode *nodes)
 			pxmram = 0;
 	}
 
-	e820ram = max_pfn - (e820_hole_size(0, max_pfn<<PAGE_SHIFT)>>PAGE_SHIFT);
+	e820ram = max_pfn - (lmb_hole_size(0, max_pfn<<PAGE_SHIFT)>>PAGE_SHIFT);
 	/* We seem to lose 3 pages somewhere. Allow 1M of slack. */
 	if ((long)(e820ram - pxmram) >= (1<<(20 - PAGE_SHIFT))) {
 		printk(KERN_ERR
@@ -373,7 +374,7 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end)
 	}
 
 	for_each_node_mask(i, nodes_parsed)
-		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
+		lmb_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
 	/* for out of order entries in SRAT */
 	sort_node_map();
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index f9eb7de..e61658e 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -43,6 +43,7 @@
 #include <linux/debugfs.h>
 #include <linux/bug.h>
 #include <linux/module.h>
+#include <linux/lmb.h>
 
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
@@ -1734,7 +1735,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
 	__xen_write_cr3(true, __pa(pgd));
 	xen_mc_issue(PARAVIRT_LAZY_CPU);
 
-	reserve_early(__pa(xen_start_info->pt_base),
+	reserve_lmb(__pa(xen_start_info->pt_base),
 		      __pa(xen_start_info->pt_base +
 			   xen_start_info->nr_pt_frames * PAGE_SIZE),
 		      "XEN PAGETABLES");
@@ -1772,7 +1773,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
 
 	pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE, PFN_DOWN(__pa(swapper_pg_dir)));
 
-	reserve_early(__pa(xen_start_info->pt_base),
+	reserve_lmb(__pa(xen_start_info->pt_base),
 		      __pa(xen_start_info->pt_base +
 			   xen_start_info->nr_pt_frames * PAGE_SIZE),
 		      "XEN PAGETABLES");
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 3f2c411..e8b3ca1 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -8,6 +8,7 @@
 #include <linux/sched.h>
 #include <linux/mm.h>
 #include <linux/pm.h>
+#include <linux/lmb.h>
 
 #include <asm/elf.h>
 #include <asm/vdso.h>
@@ -59,7 +60,7 @@ char * __init xen_memory_setup(void)
 	 *  - xen_start_info
 	 * See comment above "struct start_info" in <xen/interface/xen.h>
 	 */
-	reserve_early(__pa(xen_start_info->mfn_list),
+	reserve_lmb(__pa(xen_start_info->mfn_list),
 		      __pa(xen_start_info->pt_base),
 			"XEN START INFO");
 
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 81f9350..7e10d41 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -421,7 +421,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
 			      unsigned long size)
 {
 #ifdef CONFIG_NO_BOOTMEM
-	free_early(physaddr, physaddr + size);
+	free_lmb(physaddr, physaddr + size);
 #else
 	unsigned long start, end;
 
@@ -446,7 +446,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
 void __init free_bootmem(unsigned long addr, unsigned long size)
 {
 #ifdef CONFIG_NO_BOOTMEM
-	free_early(addr, addr + size);
+	free_lmb(addr, addr + size);
 #else
 	unsigned long start, end;
 
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 23/31] x86: Replace e820_/_early string with lmb_
@ 2010-03-29  2:43   ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Be
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

1. Include linux/lmb.h directly, so that references to e820.h can be reduced later.
2. This patch is mostly mechanical, done mainly with sed scripts; a sketch of the idea follows.
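
A minimal sketch of the kind of rename script involved, assuming GNU sed.
The exact rule set and file list are assumptions, not taken from this
series, and one-off symbols such as reserve_early_setup_data or
efi_reserve_early would still need extra rules or a manual fixup:

	# hypothetical rename script; \b keeps find_e820_area from also
	# matching find_e820_area_size, and reserve_early from matching
	# reserve_early_setup_data
	for f in $(git grep -l -E \
		'find_e820_area|reserve_early|free_early|e820_register_active_regions|e820_hole_size' \
		-- arch/x86 mm); do
		sed -i \
			-e 's/\bfind_e820_area_size\b/find_lmb_area_size/g' \
			-e 's/\bfind_e820_area\b/find_lmb_area/g' \
			-e 's/\breserve_early\b/reserve_lmb/g' \
			-e 's/\bfree_early\b/free_lmb/g' \
			-e 's/\be820_register_active_regions\b/lmb_register_active_regions/g' \
			-e 's/\be820_hole_size\b/lmb_hole_size/g' \
			"$f"
	done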

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/efi.h      |    2 +-
 arch/x86/kernel/acpi/sleep.c    |    5 +++--
 arch/x86/kernel/apic/numaq_32.c |    3 ++-
 arch/x86/kernel/efi.c           |    5 +++--
 arch/x86/kernel/head32.c        |    4 ++--
 arch/x86/kernel/head64.c        |    4 ++--
 arch/x86/kernel/setup.c         |   25 ++++++++++++-------------
 arch/x86/kernel/trampoline.c    |    6 +++---
 arch/x86/mm/init.c              |    5 +++--
 arch/x86/mm/init_32.c           |   10 ++++++----
 arch/x86/mm/init_64.c           |    9 +++++----
 arch/x86/mm/k8topology_64.c     |    4 +++-
 arch/x86/mm/memtest.c           |    7 +++----
 arch/x86/mm/numa_32.c           |   17 +++++++++--------
 arch/x86/mm/numa_64.c           |   32 ++++++++++++++++----------------
 arch/x86/mm/srat_32.c           |    3 ++-
 arch/x86/mm/srat_64.c           |    9 +++++----
 arch/x86/xen/mmu.c              |    5 +++--
 arch/x86/xen/setup.c            |    3 ++-
 mm/bootmem.c                    |    4 ++--
 20 files changed, 87 insertions(+), 75 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 8406ed7..f756536 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -90,7 +90,7 @@ extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
 #endif /* CONFIG_X86_32 */
 
 extern int add_efi_memmap;
-extern void efi_reserve_early(void);
+extern void efi_reserve_lmb(void);
 extern void efi_call_phys_prelog(void);
 extern void efi_call_phys_epilog(void);
 
diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index f996103..0cabfaa 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -7,6 +7,7 @@
 
 #include <linux/acpi.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/dmi.h>
 #include <linux/cpumask.h>
 #include <asm/segment.h>
@@ -133,7 +134,7 @@ void __init acpi_reserve_wakeup_memory(void)
 		return;
 	}
 
-	mem = find_e820_area(0, 1<<20, WAKEUP_SIZE, PAGE_SIZE);
+	mem = find_lmb_area(0, 1<<20, WAKEUP_SIZE, PAGE_SIZE);
 
 	if (mem == -1L) {
 		printk(KERN_ERR "ACPI: Cannot allocate lowmem, S3 disabled.\n");
@@ -141,7 +142,7 @@ void __init acpi_reserve_wakeup_memory(void)
 	}
 	acpi_realmode = (unsigned long) phys_to_virt(mem);
 	acpi_wakeup_address = mem;
-	reserve_early(mem, mem + WAKEUP_SIZE, "ACPI WAKEUP");
+	reserve_lmb(mem, mem + WAKEUP_SIZE, "ACPI WAKEUP");
 }
 
 
diff --git a/arch/x86/kernel/apic/numaq_32.c b/arch/x86/kernel/apic/numaq_32.c
index 3e28401..c71e494 100644
--- a/arch/x86/kernel/apic/numaq_32.c
+++ b/arch/x86/kernel/apic/numaq_32.c
@@ -26,6 +26,7 @@
 #include <linux/nodemask.h>
 #include <linux/topology.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/threads.h>
 #include <linux/cpumask.h>
 #include <linux/kernel.h>
@@ -88,7 +89,7 @@ static inline void numaq_register_node(int node, struct sys_cfg_data *scd)
 	node_end_pfn[node] =
 		 MB_TO_PAGES(eq->hi_shrd_mem_start + eq->hi_shrd_mem_size);
 
-	e820_register_active_regions(node, node_start_pfn[node],
+	lmb_register_active_regions(node, node_start_pfn[node],
 						node_end_pfn[node]);
 
 	memory_present(node, node_start_pfn[node], node_end_pfn[node]);
diff --git a/arch/x86/kernel/efi.c b/arch/x86/kernel/efi.c
index 299f03f..35038a7 100644
--- a/arch/x86/kernel/efi.c
+++ b/arch/x86/kernel/efi.c
@@ -30,6 +30,7 @@
 #include <linux/init.h>
 #include <linux/efi.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/spinlock.h>
 #include <linux/uaccess.h>
 #include <linux/time.h>
@@ -275,7 +276,7 @@ static void __init do_add_efi_memmap(void)
 	sanitize_e820_map();
 }
 
-void __init efi_reserve_early(void)
+void __init efi_reserve_lmb(void)
 {
 	unsigned long pmap;
 
@@ -290,7 +291,7 @@ void __init efi_reserve_early(void)
 		boot_params.efi_info.efi_memdesc_size;
 	memmap.desc_version = boot_params.efi_info.efi_memdesc_version;
 	memmap.desc_size = boot_params.efi_info.efi_memdesc_size;
-	reserve_early(pmap, pmap + memmap.nr_map * memmap.desc_size,
+	reserve_lmb(pmap, pmap + memmap.nr_map * memmap.desc_size,
 		      "EFI memmap");
 }
 
diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index cbab479..9d09d26 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -41,7 +41,7 @@ void __init i386_start_kernel(void)
 	reserve_lmb(PAGE_SIZE, PAGE_SIZE + PAGE_SIZE, "EX TRAMPOLINE");
 #endif
 
-	reserve_early(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
+	reserve_lmb(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
 
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
@@ -50,7 +50,7 @@ void __init i386_start_kernel(void)
 		u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 		u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
 		u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
+		reserve_lmb(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
 
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 89dd2de..d4442a8 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -101,7 +101,7 @@ void __init x86_64_start_reservations(char *real_mode_data)
 
 	copy_bootdata(__va(real_mode_data));
 
-	reserve_early(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
+	reserve_lmb(__pa_symbol(&_text), __pa_symbol(&__bss_stop), "TEXT DATA BSS");
 
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
@@ -110,7 +110,7 @@ void __init x86_64_start_reservations(char *real_mode_data)
 		unsigned long ramdisk_image = boot_params.hdr.ramdisk_image;
 		unsigned long ramdisk_size  = boot_params.hdr.ramdisk_size;
 		unsigned long ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
+		reserve_lmb(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a5c029f..3d43f12 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -303,7 +303,7 @@ static inline void init_gbpages(void)
 static void __init reserve_brk(void)
 {
 	if (_brk_end > _brk_start)
-		reserve_early(__pa(_brk_start), __pa(_brk_end), "BRK");
+		reserve_lmb(__pa(_brk_start), __pa(_brk_end), "BRK");
 
 	/* Mark brk area as locked down and no longer taking any
 	   new allocations */
@@ -325,7 +325,7 @@ static void __init relocate_initrd(void)
 	char *p, *q;
 
 	/* We need to move the initrd down into lowmem */
-	ramdisk_here = find_e820_area(0, end_of_lowmem, area_size,
+	ramdisk_here = find_lmb_area(0, end_of_lowmem, area_size,
 					 PAGE_SIZE);
 
 	if (ramdisk_here == -1ULL)
@@ -334,8 +334,7 @@ static void __init relocate_initrd(void)
 
 	/* Note: this includes all the lowmem currently occupied by
 	   the initrd, we rely on that fact to keep the data intact. */
-	reserve_early(ramdisk_here, ramdisk_here + area_size,
-			 "NEW RAMDISK");
+	reserve_lmb(ramdisk_here, ramdisk_here + area_size, "NEW RAMDISK");
 	initrd_start = ramdisk_here + PAGE_OFFSET;
 	initrd_end   = initrd_start + ramdisk_size;
 	printk(KERN_INFO "Allocated new RAMDISK: %08llx - %08llx\n",
@@ -391,7 +390,7 @@ static void __init reserve_initrd(void)
 	initrd_start = 0;
 
 	if (ramdisk_size >= (end_of_lowmem>>1)) {
-		free_early(ramdisk_image, ramdisk_end);
+		free_lmb(ramdisk_image, ramdisk_end);
 		printk(KERN_ERR "initrd too large to handle, "
 		       "disabling initrd\n");
 		return;
@@ -414,7 +413,7 @@ static void __init reserve_initrd(void)
 
 	relocate_initrd();
 
-	free_early(ramdisk_image, ramdisk_end);
+	free_lmb(ramdisk_image, ramdisk_end);
 }
 #else
 static void __init reserve_initrd(void)
@@ -470,7 +469,7 @@ static void __init e820_reserve_setup_data(void)
 	e820_print_map("reserve setup_data");
 }
 
-static void __init reserve_early_setup_data(void)
+static void __init reserve_lmb_setup_data(void)
 {
 	struct setup_data *data;
 	u64 pa_data;
@@ -482,7 +481,7 @@ static void __init reserve_early_setup_data(void)
 	while (pa_data) {
 		data = early_memremap(pa_data, sizeof(*data));
 		sprintf(buf, "setup data %x", data->type);
-		reserve_early(pa_data, pa_data+sizeof(*data)+data->len, buf);
+		reserve_lmb(pa_data, pa_data+sizeof(*data)+data->len, buf);
 		pa_data = data->next;
 		early_iounmap(data, sizeof(*data));
 	}
@@ -520,7 +519,7 @@ static void __init reserve_crashkernel(void)
 	if (crash_base <= 0) {
 		const unsigned long long alignment = 16<<20;	/* 16M */
 
-		crash_base = find_e820_area(alignment, ULONG_MAX, crash_size,
+		crash_base = find_lmb_area(alignment, ULONG_MAX, crash_size,
 				 alignment);
 		if (crash_base == -1ULL) {
 			pr_info("crashkernel reservation failed - No suitable area found.\n");
@@ -529,14 +528,14 @@ static void __init reserve_crashkernel(void)
 	} else {
 		unsigned long long start;
 
-		start = find_e820_area(crash_base, ULONG_MAX, crash_size,
+		start = find_lmb_area(crash_base, ULONG_MAX, crash_size,
 				 1<<20);
 		if (start != crash_base) {
 			pr_info("crashkernel reservation failed - memory is in use.\n");
 			return;
 		}
 	}
-	reserve_early(crash_base, crash_base + crash_size, "CRASH KERNEL");
+	reserve_lmb(crash_base, crash_base + crash_size, "CRASH KERNEL");
 
 	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
 			"for crashkernel (System RAM: %ldMB)\n",
@@ -756,7 +755,7 @@ void __init setup_arch(char **cmdline_p)
 #endif
 	 4)) {
 		efi_enabled = 1;
-		efi_reserve_early();
+		efi_reserve_lmb();
 	}
 #endif
 
@@ -816,7 +815,7 @@ void __init setup_arch(char **cmdline_p)
 	vmi_activate();
 
 	/* after early param, so could get panic from serial */
-	reserve_early_setup_data();
+	reserve_lmb_setup_data();
 
 	if (acpi_mps_check()) {
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/kernel/trampoline.c b/arch/x86/kernel/trampoline.c
index c652ef6..4a634f6 100644
--- a/arch/x86/kernel/trampoline.c
+++ b/arch/x86/kernel/trampoline.c
@@ -1,7 +1,7 @@
 #include <linux/io.h>
+#include <linux/lmb.h>
 
 #include <asm/trampoline.h>
-#include <asm/e820.h>
 
 #if defined(CONFIG_X86_64) && defined(CONFIG_ACPI_SLEEP)
 #define __trampinit
@@ -19,12 +19,12 @@ void __init reserve_trampoline_memory(void)
 	unsigned long mem;
 
 	/* Has to be in very low memory so we can execute real-mode AP code. */
-	mem = find_e820_area(0, 1<<20, TRAMPOLINE_SIZE, PAGE_SIZE);
+	mem = find_lmb_area(0, 1<<20, TRAMPOLINE_SIZE, PAGE_SIZE);
 	if (mem == -1L)
 		panic("Cannot allocate trampoline\n");
 
 	trampoline_base = __va(mem);
-	reserve_early(mem, mem + TRAMPOLINE_SIZE, "TRAMPOLINE");
+	reserve_lmb(mem, mem + TRAMPOLINE_SIZE, "TRAMPOLINE");
 }
 
 /*
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 452ee5b..837aaa2 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -1,6 +1,7 @@
 #include <linux/initrd.h>
 #include <linux/ioport.h>
 #include <linux/swap.h>
+#include <linux/lmb.h>
 
 #include <asm/cacheflush.h>
 #include <asm/e820.h>
@@ -74,7 +75,7 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
 #else
 	start = 0x8000;
 #endif
-	e820_table_start = find_e820_area(start, max_pfn_mapped<<PAGE_SHIFT,
+	e820_table_start = find_lmb_area(start, max_pfn_mapped<<PAGE_SHIFT,
 					tables, PAGE_SIZE);
 	if (e820_table_start == -1UL)
 		panic("Cannot find space for the kernel page tables");
@@ -298,7 +299,7 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 	__flush_tlb_all();
 
 	if (!after_bootmem && e820_table_end > e820_table_start)
-		reserve_early(e820_table_start << PAGE_SHIFT,
+		reserve_lmb(e820_table_start << PAGE_SHIFT,
 				 e820_table_end << PAGE_SHIFT, "PGTABLE");
 
 	if (!after_bootmem)
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 804bbe9..26b9ceb 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -25,6 +25,7 @@
 #include <linux/pfn.h>
 #include <linux/poison.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/slab.h>
 #include <linux/proc_fs.h>
 #include <linux/memory_hotplug.h>
@@ -712,14 +713,14 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 	highstart_pfn = highend_pfn = max_pfn;
 	if (max_pfn > max_low_pfn)
 		highstart_pfn = max_low_pfn;
-	e820_register_active_regions(0, 0, highend_pfn);
+	lmb_register_active_regions(0, 0, highend_pfn);
 	sparse_memory_present_with_active_regions(0);
 	printk(KERN_NOTICE "%ldMB HIGHMEM available.\n",
 		pages_to_mb(highend_pfn - highstart_pfn));
 	num_physpages = highend_pfn;
 	high_memory = (void *) __va(highstart_pfn * PAGE_SIZE - 1) + 1;
 #else
-	e820_register_active_regions(0, 0, max_low_pfn);
+	lmb_register_active_regions(0, 0, max_low_pfn);
 	sparse_memory_present_with_active_regions(0);
 	num_physpages = max_low_pfn;
 	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE - 1) + 1;
@@ -781,11 +782,11 @@ void __init setup_bootmem_allocator(void)
 	 * Initialize the boot-time allocator (with low memory only):
 	 */
 	bootmap_size = bootmem_bootmap_pages(max_low_pfn)<<PAGE_SHIFT;
-	bootmap = find_e820_area(0, max_pfn_mapped<<PAGE_SHIFT, bootmap_size,
+	bootmap = find_lmb_area(0, max_pfn_mapped<<PAGE_SHIFT, bootmap_size,
 				 PAGE_SIZE);
 	if (bootmap == -1L)
 		panic("Cannot find bootmem map of size %ld\n", bootmap_size);
-	reserve_early(bootmap, bootmap + bootmap_size, "BOOTMAP");
+	reserve_lmb(bootmap, bootmap + bootmap_size, "BOOTMAP");
 #endif
 
 	printk(KERN_INFO "  mapped low ram: 0 - %08lx\n",
@@ -1069,3 +1070,4 @@ void mark_rodata_ro(void)
 #endif
 }
 #endif
+
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 5ba6b0e..b86492e 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -21,6 +21,7 @@
 #include <linux/initrd.h>
 #include <linux/pagemap.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/proc_fs.h>
 #include <linux/pci.h>
 #include <linux/pfn.h>
@@ -576,18 +577,18 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 	unsigned long bootmap_size, bootmap;
 
 	bootmap_size = bootmem_bootmap_pages(end_pfn)<<PAGE_SHIFT;
-	bootmap = find_e820_area(0, end_pfn<<PAGE_SHIFT, bootmap_size,
+	bootmap = find_lmb_area(0, end_pfn<<PAGE_SHIFT, bootmap_size,
 				 PAGE_SIZE);
 	if (bootmap == -1L)
 		panic("Cannot find bootmem map of size %ld\n", bootmap_size);
-	reserve_early(bootmap, bootmap + bootmap_size, "BOOTMAP");
+	reserve_lmb(bootmap, bootmap + bootmap_size, "BOOTMAP");
 	/* don't touch min_low_pfn */
 	bootmap_size = init_bootmem_node(NODE_DATA(0), bootmap >> PAGE_SHIFT,
 					 0, end_pfn);
-	e820_register_active_regions(0, start_pfn, end_pfn);
+	lmb_register_active_regions(0, start_pfn, end_pfn);
 	free_bootmem_with_active_regions(0, end_pfn);
 #else
-	e820_register_active_regions(0, start_pfn, end_pfn);
+	lmb_register_active_regions(0, start_pfn, end_pfn);
 #endif
 }
 #endif
diff --git a/arch/x86/mm/k8topology_64.c b/arch/x86/mm/k8topology_64.c
index 970ed57..d7d031b 100644
--- a/arch/x86/mm/k8topology_64.c
+++ b/arch/x86/mm/k8topology_64.c
@@ -11,6 +11,8 @@
 #include <linux/string.h>
 #include <linux/module.h>
 #include <linux/nodemask.h>
+#include <linux/lmb.h>
+
 #include <asm/io.h>
 #include <linux/pci_ids.h>
 #include <linux/acpi.h>
@@ -222,7 +224,7 @@ int __init k8_scan_nodes(void)
 	for_each_node_mask(i, node_possible_map) {
 		int j;
 
-		e820_register_active_regions(i,
+		lmb_register_active_regions(i,
 				nodes[i].start >> PAGE_SHIFT,
 				nodes[i].end >> PAGE_SHIFT);
 		for (j = apicid_base; j < cores + apicid_base; j++)
diff --git a/arch/x86/mm/memtest.c b/arch/x86/mm/memtest.c
index 18d244f..f889776 100644
--- a/arch/x86/mm/memtest.c
+++ b/arch/x86/mm/memtest.c
@@ -6,8 +6,7 @@
 #include <linux/smp.h>
 #include <linux/init.h>
 #include <linux/pfn.h>
-
-#include <asm/e820.h>
+#include <linux/lmb.h>
 
 static u64 patterns[] __initdata = {
 	0,
@@ -35,7 +34,7 @@ static void __init reserve_bad_mem(u64 pattern, u64 start_bad, u64 end_bad)
 	       (unsigned long long) pattern,
 	       (unsigned long long) start_bad,
 	       (unsigned long long) end_bad);
-	reserve_early(start_bad, end_bad, "BAD RAM");
+	reserve_lmb(start_bad, end_bad, "BAD RAM");
 }
 
 static void __init memtest(u64 pattern, u64 start_phys, u64 size)
@@ -74,7 +73,7 @@ static void __init do_one_pass(u64 pattern, u64 start, u64 end)
 	u64 size = 0;
 
 	while (start < end) {
-		start = find_e820_area_size(start, &size, 1);
+		start = find_lmb_area_size(start, &size, 1);
 
 		/* done ? */
 		if (start >= end)
diff --git a/arch/x86/mm/numa_32.c b/arch/x86/mm/numa_32.c
index 809baaa..3434a93 100644
--- a/arch/x86/mm/numa_32.c
+++ b/arch/x86/mm/numa_32.c
@@ -24,6 +24,7 @@
 
 #include <linux/mm.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/mmzone.h>
 #include <linux/highmem.h>
 #include <linux/initrd.h>
@@ -120,7 +121,7 @@ int __init get_memcfg_numa_flat(void)
 
 	node_start_pfn[0] = 0;
 	node_end_pfn[0] = max_pfn;
-	e820_register_active_regions(0, 0, max_pfn);
+	lmb_register_active_regions(0, 0, max_pfn);
 	memory_present(0, 0, max_pfn);
 	node_remap_size[0] = node_memmap_size_bytes(0, 0, max_pfn);
 
@@ -161,14 +162,14 @@ static void __init allocate_pgdat(int nid)
 		NODE_DATA(nid) = (pg_data_t *)node_remap_start_vaddr[nid];
 	else {
 		unsigned long pgdat_phys;
-		pgdat_phys = find_e820_area(min_low_pfn<<PAGE_SHIFT,
+		pgdat_phys = find_lmb_area(min_low_pfn<<PAGE_SHIFT,
 				 max_pfn_mapped<<PAGE_SHIFT,
 				 sizeof(pg_data_t),
 				 PAGE_SIZE);
 		NODE_DATA(nid) = (pg_data_t *)(pfn_to_kaddr(pgdat_phys>>PAGE_SHIFT));
 		memset(buf, 0, sizeof(buf));
 		sprintf(buf, "NODE_DATA %d",  nid);
-		reserve_early(pgdat_phys, pgdat_phys + sizeof(pg_data_t), buf);
+		reserve_lmb(pgdat_phys, pgdat_phys + sizeof(pg_data_t), buf);
 	}
 	printk(KERN_DEBUG "allocate_pgdat: node %d NODE_DATA %08lx\n",
 		nid, (unsigned long)NODE_DATA(nid));
@@ -291,7 +292,7 @@ static __init unsigned long calculate_numa_remap_pages(void)
 						 PTRS_PER_PTE);
 		node_kva_target <<= PAGE_SHIFT;
 		do {
-			node_kva_final = find_e820_area(node_kva_target,
+			node_kva_final = find_lmb_area(node_kva_target,
 					((u64)node_end_pfn[nid])<<PAGE_SHIFT,
 						((u64)size)<<PAGE_SHIFT,
 						LARGE_PAGE_BYTES);
@@ -318,9 +319,9 @@ static __init unsigned long calculate_numa_remap_pages(void)
 		 *  but we could have some hole in high memory, and it will only
 		 *  check page_is_ram(pfn) && !page_is_reserved_early(pfn) to decide
 		 *  to use it as free.
-		 *  So reserve_early here, hope we don't run out of that array
+		 *  So reserve_lmb here, hope we don't run out of that array
 		 */
-		reserve_early(node_kva_final,
+		reserve_lmb(node_kva_final,
 			      node_kva_final+(((u64)size)<<PAGE_SHIFT),
 			      "KVA RAM");
 
@@ -367,7 +368,7 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 
 	kva_target_pfn = round_down(max_low_pfn - kva_pages, PTRS_PER_PTE);
 	do {
-		kva_start_pfn = find_e820_area(kva_target_pfn<<PAGE_SHIFT,
+		kva_start_pfn = find_lmb_area(kva_target_pfn<<PAGE_SHIFT,
 					max_low_pfn<<PAGE_SHIFT,
 					kva_pages<<PAGE_SHIFT,
 					PTRS_PER_PTE<<PAGE_SHIFT) >> PAGE_SHIFT;
@@ -382,7 +383,7 @@ void __init initmem_init(unsigned long start_pfn, unsigned long end_pfn,
 	printk(KERN_INFO "max_pfn = %lx\n", max_pfn);
 
 	/* avoid clash with initrd */
-	reserve_early(kva_start_pfn<<PAGE_SHIFT,
+	reserve_lmb(kva_start_pfn<<PAGE_SHIFT,
 		      (kva_start_pfn + kva_pages)<<PAGE_SHIFT,
 		     "KVA PG");
 #ifdef CONFIG_HIGHMEM
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index fc9a403..80b568d 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -90,7 +90,7 @@ static int __init allocate_cachealigned_memnodemap(void)
 
 	addr = 0x8000;
 	nodemap_size = roundup(sizeof(s16) * memnodemapsize, L1_CACHE_BYTES);
-	nodemap_addr = find_e820_area(addr, max_pfn<<PAGE_SHIFT,
+	nodemap_addr = find_lmb_area(addr, max_pfn<<PAGE_SHIFT,
 				      nodemap_size, L1_CACHE_BYTES);
 	if (nodemap_addr == -1UL) {
 		printk(KERN_ERR
@@ -99,7 +99,7 @@ static int __init allocate_cachealigned_memnodemap(void)
 		return -1;
 	}
 	memnodemap = phys_to_virt(nodemap_addr);
-	reserve_early(nodemap_addr, nodemap_addr + nodemap_size, "MEMNODEMAP");
+	reserve_lmb(nodemap_addr, nodemap_addr + nodemap_size, "MEMNODEMAP");
 
 	printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
 	       nodemap_addr, nodemap_addr + nodemap_size);
@@ -230,7 +230,7 @@ setup_node_bootmem(int nodeid, unsigned long start, unsigned long end)
 	if (node_data[nodeid] == NULL)
 		return;
 	nodedata_phys = __pa(node_data[nodeid]);
-	reserve_early(nodedata_phys, nodedata_phys + pgdat_size, "NODE_DATA");
+	reserve_lmb(nodedata_phys, nodedata_phys + pgdat_size, "NODE_DATA");
 	printk(KERN_INFO "  NODE_DATA [%016lx - %016lx]\n", nodedata_phys,
 		nodedata_phys + pgdat_size - 1);
 	nid = phys_to_nid(nodedata_phys);
@@ -249,7 +249,7 @@ setup_node_bootmem(int nodeid, unsigned long start, unsigned long end)
 	 * Find a place for the bootmem map
 	 * nodedata_phys could be on other nodes by alloc_bootmem,
 	 * so need to sure bootmap_start not to be small, otherwise
-	 * early_node_mem will get that with find_e820_area instead
+	 * early_node_mem will get that with find_lmb_area instead
 	 * of alloc_bootmem, that could clash with reserved range
 	 */
 	bootmap_pages = bootmem_bootmap_pages(last_pfn - start_pfn);
@@ -261,12 +261,12 @@ setup_node_bootmem(int nodeid, unsigned long start, unsigned long end)
 	bootmap = early_node_mem(nodeid, bootmap_start, end,
 				 bootmap_pages<<PAGE_SHIFT, PAGE_SIZE);
 	if (bootmap == NULL)  {
-		free_early(nodedata_phys, nodedata_phys + pgdat_size);
+		free_lmb(nodedata_phys, nodedata_phys + pgdat_size);
 		node_data[nodeid] = NULL;
 		return;
 	}
 	bootmap_start = __pa(bootmap);
-	reserve_early(bootmap_start, bootmap_start+(bootmap_pages<<PAGE_SHIFT),
+	reserve_lmb(bootmap_start, bootmap_start+(bootmap_pages<<PAGE_SHIFT),
 			"BOOTMAP");
 
 	bootmap_size = init_bootmem_node(NODE_DATA(nodeid),
@@ -420,7 +420,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 		nr_nodes = MAX_NUMNODES;
 	}
 
-	size = (max_addr - addr - e820_hole_size(addr, max_addr)) / nr_nodes;
+	size = (max_addr - addr - lmb_hole_size(addr, max_addr)) / nr_nodes;
 	/*
 	 * Calculate the number of big nodes that can be allocated as a result
 	 * of consolidating the remainder.
@@ -456,7 +456,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 			 * non-reserved memory is less than the per-node size.
 			 */
 			while (end - physnodes[i].start -
-				e820_hole_size(physnodes[i].start, end) < size) {
+				lmb_hole_size(physnodes[i].start, end) < size) {
 				end += FAKE_NODE_MIN_SIZE;
 				if (end > physnodes[i].end) {
 					end = physnodes[i].end;
@@ -470,7 +470,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 			 * this one must extend to the boundary.
 			 */
 			if (end < dma32_end && dma32_end - end -
-			    e820_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
+			    lmb_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
 				end = dma32_end;
 
 			/*
@@ -479,7 +479,7 @@ static int __init split_nodes_interleave(u64 addr, u64 max_addr,
 			 * physical node.
 			 */
 			if (physnodes[i].end - end -
-			    e820_hole_size(end, physnodes[i].end) < size)
+			    lmb_hole_size(end, physnodes[i].end) < size)
 				end = physnodes[i].end;
 
 			/*
@@ -507,7 +507,7 @@ static u64 __init find_end_of_node(u64 start, u64 max_addr, u64 size)
 {
 	u64 end = start + size;
 
-	while (end - start - e820_hole_size(start, end) < size) {
+	while (end - start - lmb_hole_size(start, end) < size) {
 		end += FAKE_NODE_MIN_SIZE;
 		if (end > max_addr) {
 			end = max_addr;
@@ -536,7 +536,7 @@ static int __init split_nodes_size_interleave(u64 addr, u64 max_addr, u64 size)
 	 * creates a uniform distribution of node sizes across the entire
 	 * machine (but not necessarily over physical nodes).
 	 */
-	min_size = (max_addr - addr - e820_hole_size(addr, max_addr)) /
+	min_size = (max_addr - addr - lmb_hole_size(addr, max_addr)) /
 						MAX_NUMNODES;
 	min_size = max(min_size, FAKE_NODE_MIN_SIZE);
 	if ((min_size & FAKE_NODE_MIN_HASH_MASK) < min_size)
@@ -569,7 +569,7 @@ static int __init split_nodes_size_interleave(u64 addr, u64 max_addr, u64 size)
 			 * this one must extend to the boundary.
 			 */
 			if (end < dma32_end && dma32_end - end -
-			    e820_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
+			    lmb_hole_size(end, dma32_end) < FAKE_NODE_MIN_SIZE)
 				end = dma32_end;
 
 			/*
@@ -578,7 +578,7 @@ static int __init split_nodes_size_interleave(u64 addr, u64 max_addr, u64 size)
 			 * physical node.
 			 */
 			if (physnodes[i].end - end -
-			    e820_hole_size(end, physnodes[i].end) < size)
+			    lmb_hole_size(end, physnodes[i].end) < size)
 				end = physnodes[i].end;
 
 			/*
@@ -642,7 +642,7 @@ static int __init numa_emulation(unsigned long start_pfn,
 	 */
 	remove_all_active_ranges();
 	for_each_node_mask(i, node_possible_map) {
-		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
+		lmb_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
 		setup_node_bootmem(i, nodes[i].start, nodes[i].end);
 	}
@@ -695,7 +695,7 @@ void __init initmem_init(unsigned long start_pfn, unsigned long last_pfn,
 	node_set(0, node_possible_map);
 	for (i = 0; i < nr_cpu_ids; i++)
 		numa_set_node(i, 0);
-	e820_register_active_regions(0, start_pfn, last_pfn);
+	lmb_register_active_regions(0, start_pfn, last_pfn);
 	setup_node_bootmem(0, start_pfn << PAGE_SHIFT, last_pfn << PAGE_SHIFT);
 }
 
diff --git a/arch/x86/mm/srat_32.c b/arch/x86/mm/srat_32.c
index 9324f13..68dd606 100644
--- a/arch/x86/mm/srat_32.c
+++ b/arch/x86/mm/srat_32.c
@@ -25,6 +25,7 @@
  */
 #include <linux/mm.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/mmzone.h>
 #include <linux/acpi.h>
 #include <linux/nodemask.h>
@@ -264,7 +265,7 @@ int __init get_memcfg_from_srat(void)
 		if (node_read_chunk(chunk->nid, chunk))
 			continue;
 
-		e820_register_active_regions(chunk->nid, chunk->start_pfn,
+		lmb_register_active_regions(chunk->nid, chunk->start_pfn,
 					     min(chunk->end_pfn, max_pfn));
 	}
 	/* for out of order entries in SRAT */
diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c
index 28c6876..9eca2bc 100644
--- a/arch/x86/mm/srat_64.c
+++ b/arch/x86/mm/srat_64.c
@@ -16,6 +16,7 @@
 #include <linux/module.h>
 #include <linux/topology.h>
 #include <linux/bootmem.h>
+#include <linux/lmb.h>
 #include <linux/mm.h>
 #include <asm/proto.h>
 #include <asm/numa.h>
@@ -98,7 +99,7 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
 	unsigned long phys;
 
 	length = slit->header.length;
-	phys = find_e820_area(0, max_pfn_mapped<<PAGE_SHIFT, length,
+	phys = find_lmb_area(0, max_pfn_mapped<<PAGE_SHIFT, length,
 		 PAGE_SIZE);
 
 	if (phys == -1L)
@@ -106,7 +107,7 @@ void __init acpi_numa_slit_init(struct acpi_table_slit *slit)
 
 	acpi_slit = __va(phys);
 	memcpy(acpi_slit, slit, length);
-	reserve_early(phys, phys + length, "ACPI SLIT");
+	reserve_lmb(phys, phys + length, "ACPI SLIT");
 }
 
 /* Callback for Proximity Domain -> x2APIC mapping */
@@ -324,7 +325,7 @@ static int __init nodes_cover_memory(const struct bootnode *nodes)
 			pxmram = 0;
 	}
 
-	e820ram = max_pfn - (e820_hole_size(0, max_pfn<<PAGE_SHIFT)>>PAGE_SHIFT);
+	e820ram = max_pfn - (lmb_hole_size(0, max_pfn<<PAGE_SHIFT)>>PAGE_SHIFT);
 	/* We seem to lose 3 pages somewhere. Allow 1M of slack. */
 	if ((long)(e820ram - pxmram) >= (1<<(20 - PAGE_SHIFT))) {
 		printk(KERN_ERR
@@ -373,7 +374,7 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end)
 	}
 
 	for_each_node_mask(i, nodes_parsed)
-		e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
+		lmb_register_active_regions(i, nodes[i].start >> PAGE_SHIFT,
 						nodes[i].end >> PAGE_SHIFT);
 	/* for out of order entries in SRAT */
 	sort_node_map();
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index f9eb7de..e61658e 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -43,6 +43,7 @@
 #include <linux/debugfs.h>
 #include <linux/bug.h>
 #include <linux/module.h>
+#include <linux/lmb.h>
 
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
@@ -1734,7 +1735,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
 	__xen_write_cr3(true, __pa(pgd));
 	xen_mc_issue(PARAVIRT_LAZY_CPU);
 
-	reserve_early(__pa(xen_start_info->pt_base),
+	reserve_lmb(__pa(xen_start_info->pt_base),
 		      __pa(xen_start_info->pt_base +
 			   xen_start_info->nr_pt_frames * PAGE_SIZE),
 		      "XEN PAGETABLES");
@@ -1772,7 +1773,7 @@ __init pgd_t *xen_setup_kernel_pagetable(pgd_t *pgd,
 
 	pin_pagetable_pfn(MMUEXT_PIN_L3_TABLE, PFN_DOWN(__pa(swapper_pg_dir)));
 
-	reserve_early(__pa(xen_start_info->pt_base),
+	reserve_lmb(__pa(xen_start_info->pt_base),
 		      __pa(xen_start_info->pt_base +
 			   xen_start_info->nr_pt_frames * PAGE_SIZE),
 		      "XEN PAGETABLES");
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 3f2c411..e8b3ca1 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -8,6 +8,7 @@
 #include <linux/sched.h>
 #include <linux/mm.h>
 #include <linux/pm.h>
+#include <linux/lmb.h>
 
 #include <asm/elf.h>
 #include <asm/vdso.h>
@@ -59,7 +60,7 @@ char * __init xen_memory_setup(void)
 	 *  - xen_start_info
 	 * See comment above "struct start_info" in <xen/interface/xen.h>
 	 */
-	reserve_early(__pa(xen_start_info->mfn_list),
+	reserve_lmb(__pa(xen_start_info->mfn_list),
 		      __pa(xen_start_info->pt_base),
 			"XEN START INFO");
 
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 81f9350..7e10d41 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -421,7 +421,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
 			      unsigned long size)
 {
 #ifdef CONFIG_NO_BOOTMEM
-	free_early(physaddr, physaddr + size);
+	free_lmb(physaddr, physaddr + size);
 #else
 	unsigned long start, end;
 
@@ -446,7 +446,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
 void __init free_bootmem(unsigned long addr, unsigned long size)
 {
 #ifdef CONFIG_NO_BOOTMEM
-	free_early(addr, addr + size);
+	free_lmb(addr, addr + size);
 #else
 	unsigned long start, end;
 
-- 
1.6.4.2

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 24/31] x86: Remove not used early_res code
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Also remove some functions in e820.c that are no longer used.
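
As a quick sanity check (hypothetical, not part of this patch), one can
confirm that no callers of the removed interfaces remain in the tree:

	# hypothetical check: should print nothing once the wrappers and
	# the early_res implementation are gone
	git grep -nw -e reserve_early -e free_early \
		-e find_e820_area -e find_e820_area_size \
		-e e820_register_active_regions -e e820_hole_size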

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/e820.h |   14 -
 arch/x86/kernel/e820.c      |   41 ---
 include/linux/early_res.h   |   23 --
 kernel/early_res.c          |  584 -------------------------------------------
 4 files changed, 0 insertions(+), 662 deletions(-)
 delete mode 100644 include/linux/early_res.h
 delete mode 100644 kernel/early_res.c

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 396c849..de6cd06 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -111,32 +111,18 @@ static inline void early_memtest(unsigned long start, unsigned long end)
 }
 #endif
 
-extern unsigned long end_user_pfn;
-
-extern u64 find_e820_area(u64 start, u64 end, u64 size, u64 align);
-extern u64 find_e820_area_size(u64 start, u64 *sizep, u64 align);
-extern u64 early_reserve_e820(u64 startt, u64 sizet, u64 align);
-
 extern unsigned long e820_end_of_ram_pfn(void);
 extern unsigned long e820_end_of_low_ram_pfn(void);
-extern void e820_register_active_regions(int nid, unsigned long start_pfn,
-					 unsigned long end_pfn);
-extern u64 e820_hole_size(u64 start, u64 end);
-
 extern u64 early_reserve_e820(u64 startt, u64 sizet, u64 align);
 
 void init_lmb_memory(void);
 void fill_lmb_memory(void);
-
 extern void finish_e820_parsing(void);
 extern void e820_reserve_resources(void);
 extern void e820_reserve_resources_late(void);
 extern void setup_memory_map(void);
 extern char *default_machine_specific_memory_setup(void);
 
-void reserve_early(u64 start, u64 end, char *name);
-void free_early(u64 start, u64 end);
-
 /*
  * Returns true iff the specified range [s,e) is completely contained inside
  * the ISA region.
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index c2a9ce4..47eb188 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -727,14 +727,6 @@ static int __init e820_mark_nvs_memory(void)
 core_initcall(e820_mark_nvs_memory);
 #endif
 
-/*
- * Find a free area with specified alignment in a specific range.
- */
-u64 __init find_e820_area(u64 start, u64 end, u64 size, u64 align)
-{
-	return find_lmb_area(start, end, size, align);
-}
-
 u64 __init get_max_mapped(void)
 {
 	u64 end = max_pfn_mapped;
@@ -743,13 +735,6 @@ u64 __init get_max_mapped(void)
 
 	return end;
 }
-/*
- * Find next free range after *start
- */
-u64 __init find_e820_area_size(u64 start, u64 *sizep, u64 align)
-{
-	return find_lmb_area_size(start, sizep, align);
-}
 
 /*
  * pre allocated 4k and reserved it in lmb and e820_saved
@@ -843,32 +828,6 @@ unsigned long __init e820_end_of_low_ram_pfn(void)
 	return e820_end_pfn(1UL<<(32 - PAGE_SHIFT), E820_RAM);
 }
 
-/* Walk the e820 map and register active regions within a node */
-void __init e820_register_active_regions(int nid, unsigned long start_pfn,
-					 unsigned long last_pfn)
-{
-	lmb_register_active_regions(nid, start_pfn, last_pfn);
-}
-
-/*
- * Find the hole size (in bytes) in the memory range.
- * @start: starting address of the memory range to scan
- * @end: ending address of the memory range to scan
- */
-u64 __init e820_hole_size(u64 start, u64 end)
-{
-	return lmb_hole_size(start, end);
-}
-
-void reserve_early(u64 start, u64 end, char *name)
-{
-	reserve_lmb(start, end, name);
-}
-void free_early(u64 start, u64 end)
-{
-	free_lmb(start, end);
-}
-
 static void early_panic(char *msg)
 {
 	early_printk(msg);
diff --git a/include/linux/early_res.h b/include/linux/early_res.h
deleted file mode 100644
index 29c09f5..0000000
--- a/include/linux/early_res.h
+++ /dev/null
@@ -1,23 +0,0 @@
-#ifndef _LINUX_EARLY_RES_H
-#define _LINUX_EARLY_RES_H
-#ifdef __KERNEL__
-
-extern void reserve_early(u64 start, u64 end, char *name);
-extern void reserve_early_overlap_ok(u64 start, u64 end, char *name);
-extern void free_early(u64 start, u64 end);
-void free_early_partial(u64 start, u64 end);
-extern void early_res_to_bootmem(u64 start, u64 end);
-
-void reserve_early_without_check(u64 start, u64 end, char *name);
-u64 find_early_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
-			 u64 size, u64 align);
-u64 find_early_area_size(u64 ei_start, u64 ei_last, u64 start,
-			 u64 *sizep, u64 align);
-u64 find_fw_memmap_area(u64 start, u64 end, u64 size, u64 align);
-u64 get_max_mapped(void);
-#include <linux/range.h>
-int get_free_all_memory_range(struct range **rangep, int nodeid);
-
-#endif /* __KERNEL__ */
-
-#endif /* _LINUX_EARLY_RES_H */
diff --git a/kernel/early_res.c b/kernel/early_res.c
deleted file mode 100644
index 31aa933..0000000
--- a/kernel/early_res.c
+++ /dev/null
@@ -1,584 +0,0 @@
-/*
- * early_res, could be used to replace bootmem
- */
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/init.h>
-#include <linux/bootmem.h>
-#include <linux/mm.h>
-#include <linux/early_res.h>
-
-/*
- * Early reserved memory areas.
- */
-/*
- * need to make sure this one is bigger enough before
- * find_fw_memmap_area could be used
- */
-#define MAX_EARLY_RES_X 32
-
-struct early_res {
-	u64 start, end;
-	char name[15];
-	char overlap_ok;
-};
-static struct early_res early_res_x[MAX_EARLY_RES_X] __initdata;
-
-static int max_early_res __initdata = MAX_EARLY_RES_X;
-static struct early_res *early_res __initdata = &early_res_x[0];
-static int early_res_count __initdata;
-
-static int __init find_overlapped_early(u64 start, u64 end)
-{
-	int i;
-	struct early_res *r;
-
-	for (i = 0; i < max_early_res && early_res[i].end; i++) {
-		r = &early_res[i];
-		if (end > r->start && start < r->end)
-			break;
-	}
-
-	return i;
-}
-
-/*
- * Drop the i-th range from the early reservation map,
- * by copying any higher ranges down one over it, and
- * clearing what had been the last slot.
- */
-static void __init drop_range(int i)
-{
-	int j;
-
-	for (j = i + 1; j < max_early_res && early_res[j].end; j++)
-		;
-
-	memmove(&early_res[i], &early_res[i + 1],
-	       (j - 1 - i) * sizeof(struct early_res));
-
-	early_res[j - 1].end = 0;
-	early_res_count--;
-}
-
-static void __init drop_range_partial(int i, u64 start, u64 end)
-{
-	u64 common_start, common_end;
-	u64 old_start, old_end;
-
-	old_start = early_res[i].start;
-	old_end = early_res[i].end;
-	common_start = max(old_start, start);
-	common_end = min(old_end, end);
-
-	/* no overlap ? */
-	if (common_start >= common_end)
-		return;
-
-	if (old_start < common_start) {
-		/* make head segment */
-		early_res[i].end = common_start;
-		if (old_end > common_end) {
-			char name[15];
-
-			/*
-			 * Save a local copy of the name, since the
-			 * early_res array could get resized inside
-			 * reserve_early_without_check() ->
-			 * __check_and_double_early_res(), which would
-			 * make the current name pointer invalid.
-			 */
-			strncpy(name, early_res[i].name,
-					 sizeof(early_res[i].name) - 1);
-			/* add another for left over on tail */
-			reserve_early_without_check(common_end, old_end, name);
-		}
-		return;
-	} else {
-		if (old_end > common_end) {
-			/* reuse the entry for tail left */
-			early_res[i].start = common_end;
-			return;
-		}
-		/* all covered */
-		drop_range(i);
-	}
-}
-
-/*
- * Split any existing ranges that:
- *  1) are marked 'overlap_ok', and
- *  2) overlap with the stated range [start, end)
- * into whatever portion (if any) of the existing range is entirely
- * below or entirely above the stated range.  Drop the portion
- * of the existing range that overlaps with the stated range,
- * which will allow the caller of this routine to then add that
- * stated range without conflicting with any existing range.
- */
-static void __init drop_overlaps_that_are_ok(u64 start, u64 end)
-{
-	int i;
-	struct early_res *r;
-	u64 lower_start, lower_end;
-	u64 upper_start, upper_end;
-	char name[15];
-
-	for (i = 0; i < max_early_res && early_res[i].end; i++) {
-		r = &early_res[i];
-
-		/* Continue past non-overlapping ranges */
-		if (end <= r->start || start >= r->end)
-			continue;
-
-		/*
-		 * Leave non-ok overlaps as is; let caller
-		 * panic "Overlapping early reservations"
-		 * when it hits this overlap.
-		 */
-		if (!r->overlap_ok)
-			return;
-
-		/*
-		 * We have an ok overlap.  We will drop it from the early
-		 * reservation map, and add back in any non-overlapping
-		 * portions (lower or upper) as separate, overlap_ok,
-		 * non-overlapping ranges.
-		 */
-
-		/* 1. Note any non-overlapping (lower or upper) ranges. */
-		strncpy(name, r->name, sizeof(name) - 1);
-
-		lower_start = lower_end = 0;
-		upper_start = upper_end = 0;
-		if (r->start < start) {
-			lower_start = r->start;
-			lower_end = start;
-		}
-		if (r->end > end) {
-			upper_start = end;
-			upper_end = r->end;
-		}
-
-		/* 2. Drop the original ok overlapping range */
-		drop_range(i);
-
-		i--;		/* resume for-loop on copied down entry */
-
-		/* 3. Add back in any non-overlapping ranges. */
-		if (lower_end)
-			reserve_early_overlap_ok(lower_start, lower_end, name);
-		if (upper_end)
-			reserve_early_overlap_ok(upper_start, upper_end, name);
-	}
-}
-
-static void __init __reserve_early(u64 start, u64 end, char *name,
-						int overlap_ok)
-{
-	int i;
-	struct early_res *r;
-
-	i = find_overlapped_early(start, end);
-	if (i >= max_early_res)
-		panic("Too many early reservations");
-	r = &early_res[i];
-	if (r->end)
-		panic("Overlapping early reservations "
-		      "%llx-%llx %s to %llx-%llx %s\n",
-		      start, end - 1, name ? name : "", r->start,
-		      r->end - 1, r->name);
-	r->start = start;
-	r->end = end;
-	r->overlap_ok = overlap_ok;
-	if (name)
-		strncpy(r->name, name, sizeof(r->name) - 1);
-	early_res_count++;
-}
-
-/*
- * A few early reservtations come here.
- *
- * The 'overlap_ok' in the name of this routine does -not- mean it
- * is ok for these reservations to overlap an earlier reservation.
- * Rather it means that it is ok for subsequent reservations to
- * overlap this one.
- *
- * Use this entry point to reserve early ranges when you are doing
- * so out of "Paranoia", reserving perhaps more memory than you need,
- * just in case, and don't mind a subsequent overlapping reservation
- * that is known to be needed.
- *
- * The drop_overlaps_that_are_ok() call here isn't really needed.
- * It would be needed if we had two colliding 'overlap_ok'
- * reservations, so that the second such would not panic on the
- * overlap with the first.  We don't have any such as of this
- * writing, but might as well tolerate such if it happens in
- * the future.
- */
-void __init reserve_early_overlap_ok(u64 start, u64 end, char *name)
-{
-	drop_overlaps_that_are_ok(start, end);
-	__reserve_early(start, end, name, 1);
-}
-
-static void __init __check_and_double_early_res(u64 ex_start, u64 ex_end)
-{
-	u64 start, end, size, mem;
-	struct early_res *new;
-
-	/* do we have enough slots left ? */
-	if ((max_early_res - early_res_count) > max(max_early_res/8, 2))
-		return;
-
-	/* double it */
-	mem = -1ULL;
-	size = sizeof(struct early_res) * max_early_res * 2;
-	if (early_res == early_res_x)
-		start = 0;
-	else
-		start = early_res[0].end;
-	end = ex_start;
-	if (start + size < end)
-		mem = find_fw_memmap_area(start, end, size,
-					 sizeof(struct early_res));
-	if (mem == -1ULL) {
-		start = ex_end;
-		end = get_max_mapped();
-		if (start + size < end)
-			mem = find_fw_memmap_area(start, end, size,
-						 sizeof(struct early_res));
-	}
-	if (mem == -1ULL)
-		panic("can not find more space for early_res array");
-
-	new = __va(mem);
-	/* save the first one for own */
-	new[0].start = mem;
-	new[0].end = mem + size;
-	new[0].overlap_ok = 0;
-	/* copy old to new */
-	if (early_res == early_res_x) {
-		memcpy(&new[1], &early_res[0],
-			 sizeof(struct early_res) * max_early_res);
-		memset(&new[max_early_res+1], 0,
-			 sizeof(struct early_res) * (max_early_res - 1));
-		early_res_count++;
-	} else {
-		memcpy(&new[1], &early_res[1],
-			 sizeof(struct early_res) * (max_early_res - 1));
-		memset(&new[max_early_res], 0,
-			 sizeof(struct early_res) * max_early_res);
-	}
-	memset(&early_res[0], 0, sizeof(struct early_res) * max_early_res);
-	early_res = new;
-	max_early_res *= 2;
-	printk(KERN_DEBUG "early_res array is doubled to %d at [%llx - %llx]\n",
-		max_early_res, mem, mem + size - 1);
-}
-
-/*
- * Most early reservations come here.
- *
- * We first have drop_overlaps_that_are_ok() drop any pre-existing
- * 'overlap_ok' ranges, so that we can then reserve this memory
- * range without risk of panic'ing on an overlapping overlap_ok
- * early reservation.
- */
-void __init reserve_early(u64 start, u64 end, char *name)
-{
-	if (start >= end)
-		return;
-
-	__check_and_double_early_res(start, end);
-
-	drop_overlaps_that_are_ok(start, end);
-	__reserve_early(start, end, name, 0);
-}
-
-void __init reserve_early_without_check(u64 start, u64 end, char *name)
-{
-	struct early_res *r;
-
-	if (start >= end)
-		return;
-
-	__check_and_double_early_res(start, end);
-
-	r = &early_res[early_res_count];
-
-	r->start = start;
-	r->end = end;
-	r->overlap_ok = 0;
-	if (name)
-		strncpy(r->name, name, sizeof(r->name) - 1);
-	early_res_count++;
-}
-
-void __init free_early(u64 start, u64 end)
-{
-	struct early_res *r;
-	int i;
-
-	i = find_overlapped_early(start, end);
-	r = &early_res[i];
-	if (i >= max_early_res || r->end != end || r->start != start)
-		panic("free_early on not reserved area: %llx-%llx!",
-			 start, end - 1);
-
-	drop_range(i);
-}
-
-void __init free_early_partial(u64 start, u64 end)
-{
-	struct early_res *r;
-	int i;
-
-	if (start == end)
-		return;
-
-	if (WARN_ONCE(start > end, "  wrong range [%#llx, %#llx]\n", start, end))
-		return;
-
-try_next:
-	i = find_overlapped_early(start, end);
-	if (i >= max_early_res)
-		return;
-
-	r = &early_res[i];
-	/* hole ? */
-	if (r->end >= end && r->start <= start) {
-		drop_range_partial(i, start, end);
-		return;
-	}
-
-	drop_range_partial(i, start, end);
-	goto try_next;
-}
-
-#ifdef CONFIG_NO_BOOTMEM
-static void __init subtract_early_res(struct range *range, int az)
-{
-	int i, count;
-	u64 final_start, final_end;
-	int idx = 0;
-
-	count  = 0;
-	for (i = 0; i < max_early_res && early_res[i].end; i++)
-		count++;
-
-	/* need to skip first one ?*/
-	if (early_res != early_res_x)
-		idx = 1;
-
-#define DEBUG_PRINT_EARLY_RES 1
-
-#if DEBUG_PRINT_EARLY_RES
-	printk(KERN_INFO "Subtract (%d early reservations)\n", count);
-#endif
-	for (i = idx; i < count; i++) {
-		struct early_res *r = &early_res[i];
-#if DEBUG_PRINT_EARLY_RES
-		printk(KERN_INFO "  #%d [%010llx - %010llx] %15s\n", i,
-			r->start, r->end, r->name);
-#endif
-		final_start = PFN_DOWN(r->start);
-		final_end = PFN_UP(r->end);
-		if (final_start >= final_end)
-			continue;
-		subtract_range(range, az, final_start, final_end);
-	}
-
-}
-
-int __init get_free_all_memory_range(struct range **rangep, int nodeid)
-{
-	int i, count;
-	u64 start = 0, end;
-	u64 size;
-	u64 mem;
-	struct range *range;
-	int nr_range;
-
-	count  = 0;
-	for (i = 0; i < max_early_res && early_res[i].end; i++)
-		count++;
-
-	count *= 2;
-
-	size = sizeof(struct range) * count;
-	end = get_max_mapped();
-#ifdef MAX_DMA32_PFN
-	if (end > (MAX_DMA32_PFN << PAGE_SHIFT))
-		start = MAX_DMA32_PFN << PAGE_SHIFT;
-#endif
-	mem = find_fw_memmap_area(start, end, size, sizeof(struct range));
-	if (mem == -1ULL)
-		panic("can not find more space for range free");
-
-	range = __va(mem);
-	/* use early_node_map[] and early_res to get range array at first */
-	memset(range, 0, size);
-	nr_range = 0;
-
-	/* need to go over early_node_map to find out good range for node */
-	nr_range = add_from_early_node_map(range, count, nr_range, nodeid);
-#ifdef CONFIG_X86_32
-	subtract_range(range, count, max_low_pfn, -1ULL);
-#endif
-	subtract_early_res(range, count);
-	nr_range = clean_sort_range(range, count);
-
-	/* need to clear it ? */
-	if (nodeid == MAX_NUMNODES) {
-		memset(&early_res[0], 0,
-			 sizeof(struct early_res) * max_early_res);
-		early_res = NULL;
-		max_early_res = 0;
-	}
-
-	*rangep = range;
-	return nr_range;
-}
-#else
-void __init early_res_to_bootmem(u64 start, u64 end)
-{
-	int i, count;
-	u64 final_start, final_end;
-	int idx = 0;
-
-	count  = 0;
-	for (i = 0; i < max_early_res && early_res[i].end; i++)
-		count++;
-
-	/* need to skip first one ?*/
-	if (early_res != early_res_x)
-		idx = 1;
-
-	printk(KERN_INFO "(%d/%d early reservations) ==> bootmem [%010llx - %010llx]\n",
-			 count - idx, max_early_res, start, end);
-	for (i = idx; i < count; i++) {
-		struct early_res *r = &early_res[i];
-		printk(KERN_INFO "  #%d [%010llx - %010llx] %16s", i,
-			r->start, r->end, r->name);
-		final_start = max(start, r->start);
-		final_end = min(end, r->end);
-		if (final_start >= final_end) {
-			printk(KERN_CONT "\n");
-			continue;
-		}
-		printk(KERN_CONT " ==> [%010llx - %010llx]\n",
-			final_start, final_end);
-		reserve_bootmem_generic(final_start, final_end - final_start,
-				BOOTMEM_DEFAULT);
-	}
-	/* clear them */
-	memset(&early_res[0], 0, sizeof(struct early_res) * max_early_res);
-	early_res = NULL;
-	max_early_res = 0;
-	early_res_count = 0;
-}
-#endif
-
-/* Check for already reserved areas */
-static inline int __init bad_addr(u64 *addrp, u64 size, u64 align)
-{
-	int i;
-	u64 addr = *addrp;
-	int changed = 0;
-	struct early_res *r;
-again:
-	i = find_overlapped_early(addr, addr + size);
-	r = &early_res[i];
-	if (i < max_early_res && r->end) {
-		*addrp = addr = round_up(r->end, align);
-		changed = 1;
-		goto again;
-	}
-	return changed;
-}
-
-/* Check for already reserved areas */
-static inline int __init bad_addr_size(u64 *addrp, u64 *sizep, u64 align)
-{
-	int i;
-	u64 addr = *addrp, last;
-	u64 size = *sizep;
-	int changed = 0;
-again:
-	last = addr + size;
-	for (i = 0; i < max_early_res && early_res[i].end; i++) {
-		struct early_res *r = &early_res[i];
-		if (last > r->start && addr < r->start) {
-			size = r->start - addr;
-			changed = 1;
-			goto again;
-		}
-		if (last > r->end && addr < r->end) {
-			addr = round_up(r->end, align);
-			size = last - addr;
-			changed = 1;
-			goto again;
-		}
-		if (last <= r->end && addr >= r->start) {
-			(*sizep)++;
-			return 0;
-		}
-	}
-	if (changed) {
-		*addrp = addr;
-		*sizep = size;
-	}
-	return changed;
-}
-
-/*
- * Find a free area with specified alignment in a specific range.
- * only with the area.between start to end is active range from early_node_map
- * so they are good as RAM
- */
-u64 __init find_early_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
-			 u64 size, u64 align)
-{
-	u64 addr, last;
-
-	addr = round_up(ei_start, align);
-	if (addr < start)
-		addr = round_up(start, align);
-	if (addr >= ei_last)
-		goto out;
-	while (bad_addr(&addr, size, align) && addr+size <= ei_last)
-		;
-	last = addr + size;
-	if (last > ei_last)
-		goto out;
-	if (last > end)
-		goto out;
-
-	return addr;
-
-out:
-	return -1ULL;
-}
-
-u64 __init find_early_area_size(u64 ei_start, u64 ei_last, u64 start,
-			 u64 *sizep, u64 align)
-{
-	u64 addr, last;
-
-	addr = round_up(ei_start, align);
-	if (addr < start)
-		addr = round_up(start, align);
-	if (addr >= ei_last)
-		goto out;
-	*sizep = ei_last - addr;
-	while (bad_addr_size(&addr, sizep, align) && addr + *sizep <= ei_last)
-		;
-	last = addr + *sizep;
-	if (last > ei_last)
-		goto out;
-
-	return addr;
-
-out:
-	return -1ULL;
-}
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 24/31] x86: Remove not used early_res code
@ 2010-03-29  2:43   ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Be
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Also remove some functions in e820.c that are no longer used.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/e820.h |   14 -
 arch/x86/kernel/e820.c      |   41 ---
 include/linux/early_res.h   |   23 --
 kernel/early_res.c          |  584 -------------------------------------------
 4 files changed, 0 insertions(+), 662 deletions(-)
 delete mode 100644 include/linux/early_res.h
 delete mode 100644 kernel/early_res.c

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 396c849..de6cd06 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -111,32 +111,18 @@ static inline void early_memtest(unsigned long start, unsigned long end)
 }
 #endif
 
-extern unsigned long end_user_pfn;
-
-extern u64 find_e820_area(u64 start, u64 end, u64 size, u64 align);
-extern u64 find_e820_area_size(u64 start, u64 *sizep, u64 align);
-extern u64 early_reserve_e820(u64 startt, u64 sizet, u64 align);
-
 extern unsigned long e820_end_of_ram_pfn(void);
 extern unsigned long e820_end_of_low_ram_pfn(void);
-extern void e820_register_active_regions(int nid, unsigned long start_pfn,
-					 unsigned long end_pfn);
-extern u64 e820_hole_size(u64 start, u64 end);
-
 extern u64 early_reserve_e820(u64 startt, u64 sizet, u64 align);
 
 void init_lmb_memory(void);
 void fill_lmb_memory(void);
-
 extern void finish_e820_parsing(void);
 extern void e820_reserve_resources(void);
 extern void e820_reserve_resources_late(void);
 extern void setup_memory_map(void);
 extern char *default_machine_specific_memory_setup(void);
 
-void reserve_early(u64 start, u64 end, char *name);
-void free_early(u64 start, u64 end);
-
 /*
  * Returns true iff the specified range [s,e) is completely contained inside
  * the ISA region.
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index c2a9ce4..47eb188 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -727,14 +727,6 @@ static int __init e820_mark_nvs_memory(void)
 core_initcall(e820_mark_nvs_memory);
 #endif
 
-/*
- * Find a free area with specified alignment in a specific range.
- */
-u64 __init find_e820_area(u64 start, u64 end, u64 size, u64 align)
-{
-	return find_lmb_area(start, end, size, align);
-}
-
 u64 __init get_max_mapped(void)
 {
 	u64 end = max_pfn_mapped;
@@ -743,13 +735,6 @@ u64 __init get_max_mapped(void)
 
 	return end;
 }
-/*
- * Find next free range after *start
- */
-u64 __init find_e820_area_size(u64 start, u64 *sizep, u64 align)
-{
-	return find_lmb_area_size(start, sizep, align);
-}
 
 /*
  * pre allocated 4k and reserved it in lmb and e820_saved
@@ -843,32 +828,6 @@ unsigned long __init e820_end_of_low_ram_pfn(void)
 	return e820_end_pfn(1UL<<(32 - PAGE_SHIFT), E820_RAM);
 }
 
-/* Walk the e820 map and register active regions within a node */
-void __init e820_register_active_regions(int nid, unsigned long start_pfn,
-					 unsigned long last_pfn)
-{
-	lmb_register_active_regions(nid, start_pfn, last_pfn);
-}
-
-/*
- * Find the hole size (in bytes) in the memory range.
- * @start: starting address of the memory range to scan
- * @end: ending address of the memory range to scan
- */
-u64 __init e820_hole_size(u64 start, u64 end)
-{
-	return lmb_hole_size(start, end);
-}
-
-void reserve_early(u64 start, u64 end, char *name)
-{
-	reserve_lmb(start, end, name);
-}
-void free_early(u64 start, u64 end)
-{
-	free_lmb(start, end);
-}
-
 static void early_panic(char *msg)
 {
 	early_printk(msg);
diff --git a/include/linux/early_res.h b/include/linux/early_res.h
deleted file mode 100644
index 29c09f5..0000000
--- a/include/linux/early_res.h
+++ /dev/null
@@ -1,23 +0,0 @@
-#ifndef _LINUX_EARLY_RES_H
-#define _LINUX_EARLY_RES_H
-#ifdef __KERNEL__
-
-extern void reserve_early(u64 start, u64 end, char *name);
-extern void reserve_early_overlap_ok(u64 start, u64 end, char *name);
-extern void free_early(u64 start, u64 end);
-void free_early_partial(u64 start, u64 end);
-extern void early_res_to_bootmem(u64 start, u64 end);
-
-void reserve_early_without_check(u64 start, u64 end, char *name);
-u64 find_early_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
-			 u64 size, u64 align);
-u64 find_early_area_size(u64 ei_start, u64 ei_last, u64 start,
-			 u64 *sizep, u64 align);
-u64 find_fw_memmap_area(u64 start, u64 end, u64 size, u64 align);
-u64 get_max_mapped(void);
-#include <linux/range.h>
-int get_free_all_memory_range(struct range **rangep, int nodeid);
-
-#endif /* __KERNEL__ */
-
-#endif /* _LINUX_EARLY_RES_H */
diff --git a/kernel/early_res.c b/kernel/early_res.c
deleted file mode 100644
index 31aa933..0000000
--- a/kernel/early_res.c
+++ /dev/null
@@ -1,584 +0,0 @@
-/*
- * early_res, could be used to replace bootmem
- */
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/init.h>
-#include <linux/bootmem.h>
-#include <linux/mm.h>
-#include <linux/early_res.h>
-
-/*
- * Early reserved memory areas.
- */
-/*
- * need to make sure this one is bigger enough before
- * find_fw_memmap_area could be used
- */
-#define MAX_EARLY_RES_X 32
-
-struct early_res {
-	u64 start, end;
-	char name[15];
-	char overlap_ok;
-};
-static struct early_res early_res_x[MAX_EARLY_RES_X] __initdata;
-
-static int max_early_res __initdata = MAX_EARLY_RES_X;
-static struct early_res *early_res __initdata = &early_res_x[0];
-static int early_res_count __initdata;
-
-static int __init find_overlapped_early(u64 start, u64 end)
-{
-	int i;
-	struct early_res *r;
-
-	for (i = 0; i < max_early_res && early_res[i].end; i++) {
-		r = &early_res[i];
-		if (end > r->start && start < r->end)
-			break;
-	}
-
-	return i;
-}
-
-/*
- * Drop the i-th range from the early reservation map,
- * by copying any higher ranges down one over it, and
- * clearing what had been the last slot.
- */
-static void __init drop_range(int i)
-{
-	int j;
-
-	for (j = i + 1; j < max_early_res && early_res[j].end; j++)
-		;
-
-	memmove(&early_res[i], &early_res[i + 1],
-	       (j - 1 - i) * sizeof(struct early_res));
-
-	early_res[j - 1].end = 0;
-	early_res_count--;
-}
-
-static void __init drop_range_partial(int i, u64 start, u64 end)
-{
-	u64 common_start, common_end;
-	u64 old_start, old_end;
-
-	old_start = early_res[i].start;
-	old_end = early_res[i].end;
-	common_start = max(old_start, start);
-	common_end = min(old_end, end);
-
-	/* no overlap ? */
-	if (common_start >= common_end)
-		return;
-
-	if (old_start < common_start) {
-		/* make head segment */
-		early_res[i].end = common_start;
-		if (old_end > common_end) {
-			char name[15];
-
-			/*
-			 * Save a local copy of the name, since the
-			 * early_res array could get resized inside
-			 * reserve_early_without_check() ->
-			 * __check_and_double_early_res(), which would
-			 * make the current name pointer invalid.
-			 */
-			strncpy(name, early_res[i].name,
-					 sizeof(early_res[i].name) - 1);
-			/* add another for left over on tail */
-			reserve_early_without_check(common_end, old_end, name);
-		}
-		return;
-	} else {
-		if (old_end > common_end) {
-			/* reuse the entry for tail left */
-			early_res[i].start = common_end;
-			return;
-		}
-		/* all covered */
-		drop_range(i);
-	}
-}
-
-/*
- * Split any existing ranges that:
- *  1) are marked 'overlap_ok', and
- *  2) overlap with the stated range [start, end)
- * into whatever portion (if any) of the existing range is entirely
- * below or entirely above the stated range.  Drop the portion
- * of the existing range that overlaps with the stated range,
- * which will allow the caller of this routine to then add that
- * stated range without conflicting with any existing range.
- */
-static void __init drop_overlaps_that_are_ok(u64 start, u64 end)
-{
-	int i;
-	struct early_res *r;
-	u64 lower_start, lower_end;
-	u64 upper_start, upper_end;
-	char name[15];
-
-	for (i = 0; i < max_early_res && early_res[i].end; i++) {
-		r = &early_res[i];
-
-		/* Continue past non-overlapping ranges */
-		if (end <= r->start || start >= r->end)
-			continue;
-
-		/*
-		 * Leave non-ok overlaps as is; let caller
-		 * panic "Overlapping early reservations"
-		 * when it hits this overlap.
-		 */
-		if (!r->overlap_ok)
-			return;
-
-		/*
-		 * We have an ok overlap.  We will drop it from the early
-		 * reservation map, and add back in any non-overlapping
-		 * portions (lower or upper) as separate, overlap_ok,
-		 * non-overlapping ranges.
-		 */
-
-		/* 1. Note any non-overlapping (lower or upper) ranges. */
-		strncpy(name, r->name, sizeof(name) - 1);
-
-		lower_start = lower_end = 0;
-		upper_start = upper_end = 0;
-		if (r->start < start) {
-			lower_start = r->start;
-			lower_end = start;
-		}
-		if (r->end > end) {
-			upper_start = end;
-			upper_end = r->end;
-		}
-
-		/* 2. Drop the original ok overlapping range */
-		drop_range(i);
-
-		i--;		/* resume for-loop on copied down entry */
-
-		/* 3. Add back in any non-overlapping ranges. */
-		if (lower_end)
-			reserve_early_overlap_ok(lower_start, lower_end, name);
-		if (upper_end)
-			reserve_early_overlap_ok(upper_start, upper_end, name);
-	}
-}
-
-static void __init __reserve_early(u64 start, u64 end, char *name,
-						int overlap_ok)
-{
-	int i;
-	struct early_res *r;
-
-	i = find_overlapped_early(start, end);
-	if (i >= max_early_res)
-		panic("Too many early reservations");
-	r = &early_res[i];
-	if (r->end)
-		panic("Overlapping early reservations "
-		      "%llx-%llx %s to %llx-%llx %s\n",
-		      start, end - 1, name ? name : "", r->start,
-		      r->end - 1, r->name);
-	r->start = start;
-	r->end = end;
-	r->overlap_ok = overlap_ok;
-	if (name)
-		strncpy(r->name, name, sizeof(r->name) - 1);
-	early_res_count++;
-}
-
-/*
- * A few early reservtations come here.
- *
- * The 'overlap_ok' in the name of this routine does -not- mean it
- * is ok for these reservations to overlap an earlier reservation.
- * Rather it means that it is ok for subsequent reservations to
- * overlap this one.
- *
- * Use this entry point to reserve early ranges when you are doing
- * so out of "Paranoia", reserving perhaps more memory than you need,
- * just in case, and don't mind a subsequent overlapping reservation
- * that is known to be needed.
- *
- * The drop_overlaps_that_are_ok() call here isn't really needed.
- * It would be needed if we had two colliding 'overlap_ok'
- * reservations, so that the second such would not panic on the
- * overlap with the first.  We don't have any such as of this
- * writing, but might as well tolerate such if it happens in
- * the future.
- */
-void __init reserve_early_overlap_ok(u64 start, u64 end, char *name)
-{
-	drop_overlaps_that_are_ok(start, end);
-	__reserve_early(start, end, name, 1);
-}
-
-static void __init __check_and_double_early_res(u64 ex_start, u64 ex_end)
-{
-	u64 start, end, size, mem;
-	struct early_res *new;
-
-	/* do we have enough slots left ? */
-	if ((max_early_res - early_res_count) > max(max_early_res/8, 2))
-		return;
-
-	/* double it */
-	mem = -1ULL;
-	size = sizeof(struct early_res) * max_early_res * 2;
-	if (early_res == early_res_x)
-		start = 0;
-	else
-		start = early_res[0].end;
-	end = ex_start;
-	if (start + size < end)
-		mem = find_fw_memmap_area(start, end, size,
-					 sizeof(struct early_res));
-	if (mem == -1ULL) {
-		start = ex_end;
-		end = get_max_mapped();
-		if (start + size < end)
-			mem = find_fw_memmap_area(start, end, size,
-						 sizeof(struct early_res));
-	}
-	if (mem == -1ULL)
-		panic("can not find more space for early_res array");
-
-	new = __va(mem);
-	/* save the first one for own */
-	new[0].start = mem;
-	new[0].end = mem + size;
-	new[0].overlap_ok = 0;
-	/* copy old to new */
-	if (early_res == early_res_x) {
-		memcpy(&new[1], &early_res[0],
-			 sizeof(struct early_res) * max_early_res);
-		memset(&new[max_early_res+1], 0,
-			 sizeof(struct early_res) * (max_early_res - 1));
-		early_res_count++;
-	} else {
-		memcpy(&new[1], &early_res[1],
-			 sizeof(struct early_res) * (max_early_res - 1));
-		memset(&new[max_early_res], 0,
-			 sizeof(struct early_res) * max_early_res);
-	}
-	memset(&early_res[0], 0, sizeof(struct early_res) * max_early_res);
-	early_res = new;
-	max_early_res *= 2;
-	printk(KERN_DEBUG "early_res array is doubled to %d at [%llx - %llx]\n",
-		max_early_res, mem, mem + size - 1);
-}
-
-/*
- * Most early reservations come here.
- *
- * We first have drop_overlaps_that_are_ok() drop any pre-existing
- * 'overlap_ok' ranges, so that we can then reserve this memory
- * range without risk of panic'ing on an overlapping overlap_ok
- * early reservation.
- */
-void __init reserve_early(u64 start, u64 end, char *name)
-{
-	if (start >= end)
-		return;
-
-	__check_and_double_early_res(start, end);
-
-	drop_overlaps_that_are_ok(start, end);
-	__reserve_early(start, end, name, 0);
-}
-
-void __init reserve_early_without_check(u64 start, u64 end, char *name)
-{
-	struct early_res *r;
-
-	if (start >= end)
-		return;
-
-	__check_and_double_early_res(start, end);
-
-	r = &early_res[early_res_count];
-
-	r->start = start;
-	r->end = end;
-	r->overlap_ok = 0;
-	if (name)
-		strncpy(r->name, name, sizeof(r->name) - 1);
-	early_res_count++;
-}
-
-void __init free_early(u64 start, u64 end)
-{
-	struct early_res *r;
-	int i;
-
-	i = find_overlapped_early(start, end);
-	r = &early_res[i];
-	if (i >= max_early_res || r->end != end || r->start != start)
-		panic("free_early on not reserved area: %llx-%llx!",
-			 start, end - 1);
-
-	drop_range(i);
-}
-
-void __init free_early_partial(u64 start, u64 end)
-{
-	struct early_res *r;
-	int i;
-
-	if (start == end)
-		return;
-
-	if (WARN_ONCE(start > end, "  wrong range [%#llx, %#llx]\n", start, end))
-		return;
-
-try_next:
-	i = find_overlapped_early(start, end);
-	if (i >= max_early_res)
-		return;
-
-	r = &early_res[i];
-	/* hole ? */
-	if (r->end >= end && r->start <= start) {
-		drop_range_partial(i, start, end);
-		return;
-	}
-
-	drop_range_partial(i, start, end);
-	goto try_next;
-}
-
-#ifdef CONFIG_NO_BOOTMEM
-static void __init subtract_early_res(struct range *range, int az)
-{
-	int i, count;
-	u64 final_start, final_end;
-	int idx = 0;
-
-	count  = 0;
-	for (i = 0; i < max_early_res && early_res[i].end; i++)
-		count++;
-
-	/* need to skip first one ?*/
-	if (early_res != early_res_x)
-		idx = 1;
-
-#define DEBUG_PRINT_EARLY_RES 1
-
-#if DEBUG_PRINT_EARLY_RES
-	printk(KERN_INFO "Subtract (%d early reservations)\n", count);
-#endif
-	for (i = idx; i < count; i++) {
-		struct early_res *r = &early_res[i];
-#if DEBUG_PRINT_EARLY_RES
-		printk(KERN_INFO "  #%d [%010llx - %010llx] %15s\n", i,
-			r->start, r->end, r->name);
-#endif
-		final_start = PFN_DOWN(r->start);
-		final_end = PFN_UP(r->end);
-		if (final_start >= final_end)
-			continue;
-		subtract_range(range, az, final_start, final_end);
-	}
-
-}
-
-int __init get_free_all_memory_range(struct range **rangep, int nodeid)
-{
-	int i, count;
-	u64 start = 0, end;
-	u64 size;
-	u64 mem;
-	struct range *range;
-	int nr_range;
-
-	count  = 0;
-	for (i = 0; i < max_early_res && early_res[i].end; i++)
-		count++;
-
-	count *= 2;
-
-	size = sizeof(struct range) * count;
-	end = get_max_mapped();
-#ifdef MAX_DMA32_PFN
-	if (end > (MAX_DMA32_PFN << PAGE_SHIFT))
-		start = MAX_DMA32_PFN << PAGE_SHIFT;
-#endif
-	mem = find_fw_memmap_area(start, end, size, sizeof(struct range));
-	if (mem == -1ULL)
-		panic("can not find more space for range free");
-
-	range = __va(mem);
-	/* use early_node_map[] and early_res to get range array at first */
-	memset(range, 0, size);
-	nr_range = 0;
-
-	/* need to go over early_node_map to find out good range for node */
-	nr_range = add_from_early_node_map(range, count, nr_range, nodeid);
-#ifdef CONFIG_X86_32
-	subtract_range(range, count, max_low_pfn, -1ULL);
-#endif
-	subtract_early_res(range, count);
-	nr_range = clean_sort_range(range, count);
-
-	/* need to clear it ? */
-	if (nodeid == MAX_NUMNODES) {
-		memset(&early_res[0], 0,
-			 sizeof(struct early_res) * max_early_res);
-		early_res = NULL;
-		max_early_res = 0;
-	}
-
-	*rangep = range;
-	return nr_range;
-}
-#else
-void __init early_res_to_bootmem(u64 start, u64 end)
-{
-	int i, count;
-	u64 final_start, final_end;
-	int idx = 0;
-
-	count  = 0;
-	for (i = 0; i < max_early_res && early_res[i].end; i++)
-		count++;
-
-	/* need to skip first one ?*/
-	if (early_res != early_res_x)
-		idx = 1;
-
-	printk(KERN_INFO "(%d/%d early reservations) ==> bootmem [%010llx - %010llx]\n",
-			 count - idx, max_early_res, start, end);
-	for (i = idx; i < count; i++) {
-		struct early_res *r = &early_res[i];
-		printk(KERN_INFO "  #%d [%010llx - %010llx] %16s", i,
-			r->start, r->end, r->name);
-		final_start = max(start, r->start);
-		final_end = min(end, r->end);
-		if (final_start >= final_end) {
-			printk(KERN_CONT "\n");
-			continue;
-		}
-		printk(KERN_CONT " ==> [%010llx - %010llx]\n",
-			final_start, final_end);
-		reserve_bootmem_generic(final_start, final_end - final_start,
-				BOOTMEM_DEFAULT);
-	}
-	/* clear them */
-	memset(&early_res[0], 0, sizeof(struct early_res) * max_early_res);
-	early_res = NULL;
-	max_early_res = 0;
-	early_res_count = 0;
-}
-#endif
-
-/* Check for already reserved areas */
-static inline int __init bad_addr(u64 *addrp, u64 size, u64 align)
-{
-	int i;
-	u64 addr = *addrp;
-	int changed = 0;
-	struct early_res *r;
-again:
-	i = find_overlapped_early(addr, addr + size);
-	r = &early_res[i];
-	if (i < max_early_res && r->end) {
-		*addrp = addr = round_up(r->end, align);
-		changed = 1;
-		goto again;
-	}
-	return changed;
-}
-
-/* Check for already reserved areas */
-static inline int __init bad_addr_size(u64 *addrp, u64 *sizep, u64 align)
-{
-	int i;
-	u64 addr = *addrp, last;
-	u64 size = *sizep;
-	int changed = 0;
-again:
-	last = addr + size;
-	for (i = 0; i < max_early_res && early_res[i].end; i++) {
-		struct early_res *r = &early_res[i];
-		if (last > r->start && addr < r->start) {
-			size = r->start - addr;
-			changed = 1;
-			goto again;
-		}
-		if (last > r->end && addr < r->end) {
-			addr = round_up(r->end, align);
-			size = last - addr;
-			changed = 1;
-			goto again;
-		}
-		if (last <= r->end && addr >= r->start) {
-			(*sizep)++;
-			return 0;
-		}
-	}
-	if (changed) {
-		*addrp = addr;
-		*sizep = size;
-	}
-	return changed;
-}
-
-/*
- * Find a free area with specified alignment in a specific range.
- * only with the area.between start to end is active range from early_node_map
- * so they are good as RAM
- */
-u64 __init find_early_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
-			 u64 size, u64 align)
-{
-	u64 addr, last;
-
-	addr = round_up(ei_start, align);
-	if (addr < start)
-		addr = round_up(start, align);
-	if (addr >= ei_last)
-		goto out;
-	while (bad_addr(&addr, size, align) && addr+size <= ei_last)
-		;
-	last = addr + size;
-	if (last > ei_last)
-		goto out;
-	if (last > end)
-		goto out;
-
-	return addr;
-
-out:
-	return -1ULL;
-}
-
-u64 __init find_early_area_size(u64 ei_start, u64 ei_last, u64 start,
-			 u64 *sizep, u64 align)
-{
-	u64 addr, last;
-
-	addr = round_up(ei_start, align);
-	if (addr < start)
-		addr = round_up(start, align);
-	if (addr >= ei_last)
-		goto out;
-	*sizep = ei_last - addr;
-	while (bad_addr_size(&addr, sizep, align) && addr + *sizep <= ei_last)
-		;
-	last = addr + *sizep;
-	if (last > ei_last)
-		goto out;
-
-	return addr;
-
-out:
-	return -1ULL;
-}
-- 
1.6.4.2

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 25/31] x86, lmb: Use lmb_memory_size()/lmb_free_memory_size() to get correct dma_reserve
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

lmb_memory_size() returns the memory size in lmb.memory.region.
lmb_free_memory_size() returns the free memory size in lmb.memory.region.

So we can get the exact reserved size.

Set the size right after initmem_init(), because the bootmem API will later
allocate areas from 16M up (except for some fallbacks).

Later, after we remove bootmem, we can call it just before paging_init().
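
As a worked example (made-up numbers, 4k pages): if lmb reports 0x1000
pages of RAM below MAX_DMA_PFN and 0xe00 of them are still free, the
difference is what the early reservations already took out of the DMA
range:

	/* hypothetical values, for illustration only */
	mem_size_pfn  = 0x1000;		/* RAM pages below MAX_DMA_PFN */
	free_size_pfn = 0x0e00;		/* free pages below MAX_DMA_PFN */
	set_dma_reserve(mem_size_pfn - free_size_pfn);	/* 0x200 pages */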

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/e820.h |    2 ++
 arch/x86/kernel/e820.c      |   17 +++++++++++++++++
 arch/x86/kernel/setup.c     |    1 +
 arch/x86/mm/init_64.c       |    7 -------
 4 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index de6cd06..334281f 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -117,6 +117,8 @@ extern u64 early_reserve_e820(u64 startt, u64 sizet, u64 align);
 
 void init_lmb_memory(void);
 void fill_lmb_memory(void);
+void find_lmb_dma_reserve(void);
+
 extern void finish_e820_parsing(void);
 extern void e820_reserve_resources(void);
 extern void e820_reserve_resources_late(void);
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 47eb188..9c4b9da 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1100,3 +1100,20 @@ void __init fill_lmb_memory(void)
 	lmb_analyze();
 	lmb_dump_all();
 }
+
+void __init find_lmb_dma_reserve(void)
+{
+#ifdef CONFIG_X86_64
+	u64 free_size_pfn;
+	u64 mem_size_pfn;
+	/*
+	 * Find out the used area below MAX_DMA_PFN: use lmb to get the
+	 * free size in [0, MAX_DMA_PFN] first, and assume bootmem will
+	 * not allocate below MAX_DMA_PFN.
+	 */
+	mem_size_pfn = lmb_memory_size(0, MAX_DMA_PFN << PAGE_SHIFT) >> PAGE_SHIFT;
+	free_size_pfn = lmb_free_memory_size(0, MAX_DMA_PFN << PAGE_SHIFT) >> PAGE_SHIFT;
+	set_dma_reserve(mem_size_pfn - free_size_pfn);
+#endif
+}
+
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 3d43f12..b00ccc4 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -979,6 +979,7 @@ void __init setup_arch(char **cmdline_p)
 #endif
 
 	initmem_init(0, max_pfn, acpi, k8);
+	find_lmb_dma_reserve();
 #ifndef CONFIG_NO_BOOTMEM
 	lmb_to_bootmem(0, max_low_pfn<<PAGE_SHIFT);
 #endif
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b86492e..f1af41c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -52,8 +52,6 @@
 #include <asm/init.h>
 #include <linux/bootmem.h>
 
-static unsigned long dma_reserve __initdata;
-
 static int __init parse_direct_gbpages_off(char *arg)
 {
 	direct_gbpages = 0;
@@ -820,11 +818,6 @@ int __init reserve_bootmem_generic(unsigned long phys, unsigned long len,
 
 	reserve_bootmem(phys, len, flags);
 
-	if (phys+len <= MAX_DMA_PFN*PAGE_SIZE) {
-		dma_reserve += len / PAGE_SIZE;
-		set_dma_reserve(dma_reserve);
-	}
-
 	return 0;
 }
 #endif
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 26/31] x86: Align e820 ram range to page
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Work around BIOSes whose memory map contains RAM ranges that are not page aligned.
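
As an illustration (made-up map entries): a RAM entry [0x100000, 0x1ffc00)
keeps its aligned head and has the sub-page tail re-typed,

	start_aligned = round_up(0x100000, PAGE_SIZE);	/* 0x100000 */
	end_aligned = round_down(0x1ffc00, PAGE_SIZE);	/* 0x1ff000 */
	/* [0x1ff000, 0x1ffc00) becomes E820_RESERVED */

while an entry that does not cover even one whole page, e.g.
[0x9fc00, 0x9fe00), is converted to E820_RESERVED in one piece.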

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/e820.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 9c4b9da..6fdd9e9 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -903,6 +903,47 @@ static int __init parse_memmap_opt(char *p)
 }
 early_param("memmap", parse_memmap_opt);
 
+static void __init e820_align_ram_page(void)
+{
+	int i;
+	bool changed = false;
+
+	for (i = 0; i < e820.nr_map; i++) {
+		struct e820entry *entry = &e820.map[i];
+		u64 start, end;
+		u64 start_aligned, end_aligned;
+
+		if (entry->type != E820_RAM)
+			continue;
+
+		start = entry->addr;
+		end = start + entry->size;
+
+		start_aligned = round_up(start, PAGE_SIZE);
+		end_aligned = round_down(end, PAGE_SIZE);
+
+		if (end_aligned <= start_aligned) {
+			e820_update_range(start, end - start, E820_RAM, E820_RESERVED);
+			changed = true;
+			continue;
+		}
+		if (start < start_aligned) {
+			e820_update_range(start, start_aligned - start, E820_RAM, E820_RESERVED);
+			changed = true;
+		}
+		if (end_aligned < end) {
+			e820_update_range(end_aligned, end - end_aligned, E820_RAM, E820_RESERVED);
+			changed = true;
+		}
+	}
+
+	if (changed) {
+		sanitize_e820_map();
+		printk(KERN_INFO "aligned physical RAM map:\n");
+		e820_print_map("aligned");
+	}
+}
+
 void __init finish_e820_parsing(void)
 {
 	if (userdef) {
@@ -915,6 +956,9 @@ void __init finish_e820_parsing(void)
 		printk(KERN_INFO "user-defined physical RAM map:\n");
 		e820_print_map("user");
 	}
+
+	/* In case we have RAM entries that are not PAGE aligned */
+	e820_align_ram_page();
 }
 
 static inline const char *e820_type_to_string(int e820_type)
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 27/31] x86: Use walk_system_ram_range() instead of e820_any_mapped in agp path
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Move aperture_valid() back into .c files.

The early path still uses e820_any_mapped(), so later we can make
e820_any_mapped() __init.
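
A callback that just returns 1 turns the walk into an "is there any RAM
in this range?" test, assuming the usual walk_system_ram_range()
semantics (it propagates the first nonzero value the callback returns,
and returns a negative value when no System RAM is found):

	if (walk_system_ram_range(pfn, nr_pages, NULL, __is_ram) == 1)
		return 1;	/* some System RAM in [pfn, pfn + nr_pages) */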

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/gart.h   |   22 ----------------------
 arch/x86/kernel/aperture_64.c |   22 ++++++++++++++++++++++
 drivers/char/agp/amd64-agp.c  |   39 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 60 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/gart.h b/arch/x86/include/asm/gart.h
index 4ac5b0f..2b63a91 100644
--- a/arch/x86/include/asm/gart.h
+++ b/arch/x86/include/asm/gart.h
@@ -74,26 +74,4 @@ static inline void enable_gart_translation(struct pci_dev *dev, u64 addr)
         pci_write_config_dword(dev, AMD64_GARTAPERTURECTL, ctl);
 }
 
-static inline int aperture_valid(u64 aper_base, u32 aper_size, u32 min_size)
-{
-	if (!aper_base)
-		return 0;
-
-	if (aper_base + aper_size > 0x100000000ULL) {
-		printk(KERN_INFO "Aperture beyond 4GB. Ignoring.\n");
-		return 0;
-	}
-	if (e820_any_mapped(aper_base, aper_base + aper_size, E820_RAM)) {
-		printk(KERN_INFO "Aperture pointing to e820 RAM. Ignoring.\n");
-		return 0;
-	}
-	if (aper_size < min_size) {
-		printk(KERN_INFO "Aperture too small (%d MB) than (%d MB)\n",
-				 aper_size>>20, min_size>>20);
-		return 0;
-	}
-
-	return 1;
-}
-
 #endif /* _ASM_X86_GART_H */
diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 3704997..f6e6270 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -145,6 +145,28 @@ static u32 __init find_cap(int bus, int slot, int func, int cap)
 	return 0;
 }
 
+static int __init aperture_valid(u64 aper_base, u32 aper_size, u32 min_size)
+{
+	if (!aper_base)
+		return 0;
+
+	if (aper_base + aper_size > 0x100000000ULL) {
+		printk(KERN_INFO "Aperture beyond 4GB. Ignoring.\n");
+		return 0;
+	}
+	if (e820_any_mapped(aper_base, aper_base + aper_size, E820_RAM)) {
+		printk(KERN_INFO "Aperture pointing to e820 RAM. Ignoring.\n");
+		return 0;
+	}
+	if (aper_size < min_size) {
+		printk(KERN_INFO "Aperture too small (%d MB) than (%d MB)\n",
+				 aper_size>>20, min_size>>20);
+		return 0;
+	}
+
+	return 1;
+}
+
 /* Read a standard AGPv3 bridge header */
 static u32 __init read_agp(int bus, int slot, int func, int cap, u32 *order)
 {
diff --git a/drivers/char/agp/amd64-agp.c b/drivers/char/agp/amd64-agp.c
index fd50ead..85cabd0 100644
--- a/drivers/char/agp/amd64-agp.c
+++ b/drivers/char/agp/amd64-agp.c
@@ -14,7 +14,6 @@
 #include <linux/agp_backend.h>
 #include <linux/mmzone.h>
 #include <asm/page.h>		/* PAGE_SIZE */
-#include <asm/e820.h>
 #include <asm/k8.h>
 #include <asm/gart.h>
 #include "agp.h"
@@ -231,6 +230,44 @@ static const struct agp_bridge_driver amd_8151_driver = {
 	.agp_type_to_mask_type  = agp_generic_type_to_mask_type,
 };
 
+static int __devinit
+__is_ram(unsigned long pfn, unsigned long nr_pages, void *arg)
+{
+	return 1;
+}
+
+static int __devinit any_ram_in_range(u64 base, u64 size)
+{
+	unsigned long pfn, nr_pages;
+
+	pfn = base >> PAGE_SHIFT;
+	nr_pages = size >> PAGE_SHIFT;
+
+	return walk_system_ram_range(pfn, nr_pages, NULL, __is_ram) == 1;
+}
+
+static int __devinit aperture_valid(u64 aper_base, u32 aper_size, u32 min_size)
+{
+	if (!aper_base)
+		return 0;
+
+	if (aper_base + aper_size > 0x100000000ULL) {
+		printk(KERN_INFO "Aperture beyond 4GB. Ignoring.\n");
+		return 0;
+	}
+	if (any_ram_in_range(aper_base, aper_size)) {
+		printk(KERN_INFO "Aperture pointing to E820 RAM. Ignoring.\n");
+		return 0;
+	}
+	if (aper_size < min_size) {
+		printk(KERN_INFO "Aperture too small (%d MB) than (%d MB)\n",
+				 aper_size>>20, min_size>>20);
+		return 0;
+	}
+
+	return 1;
+}
+
 /* Some basic sanity checks for the aperture. */
 static int __devinit agp_aperture_valid(u64 aper, u32 size)
 {
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 28/31] x86: Add get_centaur_ram_top()
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

So we can avoid accessing e820.map[] directly.

Later we can make e820 static and __initdata.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/e820.h   |    9 ++++++
 arch/x86/kernel/cpu/centaur.c |   53 +-------------------------------------
 arch/x86/kernel/e820.c        |   56 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/setup.c       |    2 +
 4 files changed, 69 insertions(+), 51 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 334281f..cd7de51 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -76,6 +76,15 @@ struct e820map {
 /* see comment in arch/x86/kernel/e820.c */
 extern struct e820map e820;
 
+#if defined(CONFIG_X86_OOSTORE) && defined(CONFIG_CPU_SUP_CENTAUR)
+extern int centaur_ram_top;
+void get_centaur_ram_top(void);
+#else
+static inline void get_centaur_ram_top(void)
+{
+}
+#endif
+
 extern unsigned long pci_mem_start;
 extern int e820_any_mapped(u64 start, u64 end, unsigned type);
 extern int e820_all_mapped(u64 start, u64 end, unsigned type);
diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c
index e58d978..bb49358 100644
--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -37,63 +37,14 @@ static void __cpuinit centaur_mcr_insert(int reg, u32 base, u32 size, int key)
 	mtrr_centaur_report_mcr(reg, lo, hi);	/* Tell the mtrr driver */
 }
 
-/*
- * Figure what we can cover with MCR's
- *
- * Shortcut: We know you can't put 4Gig of RAM on a winchip
- */
-static u32 __cpuinit ramtop(void)
-{
-	u32 clip = 0xFFFFFFFFUL;
-	u32 top = 0;
-	int i;
-
-	for (i = 0; i < e820.nr_map; i++) {
-		unsigned long start, end;
-
-		if (e820.map[i].addr > 0xFFFFFFFFUL)
-			continue;
-		/*
-		 * Don't MCR over reserved space. Ignore the ISA hole
-		 * we frob around that catastrophe already
-		 */
-		if (e820.map[i].type == E820_RESERVED) {
-			if (e820.map[i].addr >= 0x100000UL &&
-			    e820.map[i].addr < clip)
-				clip = e820.map[i].addr;
-			continue;
-		}
-		start = e820.map[i].addr;
-		end = e820.map[i].addr + e820.map[i].size;
-		if (start >= end)
-			continue;
-		if (end > top)
-			top = end;
-	}
-	/*
-	 * Everything below 'top' should be RAM except for the ISA hole.
-	 * Because of the limited MCR's we want to map NV/ACPI into our
-	 * MCR range for gunk in RAM
-	 *
-	 * Clip might cause us to MCR insufficient RAM but that is an
-	 * acceptable failure mode and should only bite obscure boxes with
-	 * a VESA hole at 15Mb
-	 *
-	 * The second case Clip sometimes kicks in is when the EBDA is marked
-	 * as reserved. Again we fail safe with reasonable results
-	 */
-	if (top > clip)
-		top = clip;
-
-	return top;
-}
+int __cpuinitdata centaur_ram_top;
 
 /*
  * Compute a set of MCR's to give maximum coverage
  */
 static int __cpuinit centaur_mcr_compute(int nr, int key)
 {
-	u32 mem = ramtop();
+	u32 mem = centaur_ram_top;
 	u32 root = power2(mem);
 	u32 base = root;
 	u32 top = root;
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 6fdd9e9..6b17893 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1124,6 +1124,62 @@ void __init setup_memory_map(void)
 	e820_print_map(who);
 }
 
+#if defined(CONFIG_X86_OOSTORE) && defined(CONFIG_CPU_SUP_CENTAUR)
+/*
+ * Figure what we can cover with MCR's
+ *
+ * Shortcut: We know you can't put 4Gig of RAM on a winchip
+ */
+void __init get_centaur_ram_top(void)
+{
+	u32 clip = 0xFFFFFFFFUL;
+	u32 top = 0;
+	int i;
+
+	if (boot_cpu_data.x86_vendor != X86_VENDOR_CENTAUR)
+		return;
+
+	for (i = 0; i < e820.nr_map; i++) {
+		unsigned long start, end;
+
+		if (e820.map[i].addr > 0xFFFFFFFFUL)
+			continue;
+		/*
+		 * Don't MCR over reserved space. Ignore the ISA hole
+		 * we frob around that catastrophe already
+		 */
+		if (e820.map[i].type == E820_RESERVED) {
+			if (e820.map[i].addr >= 0x100000UL &&
+			    e820.map[i].addr < clip)
+				clip = e820.map[i].addr;
+			continue;
+		}
+		start = e820.map[i].addr;
+		end = e820.map[i].addr + e820.map[i].size;
+		if (start >= end)
+			continue;
+		if (end > top)
+			top = end;
+	}
+	/*
+	 * Everything below 'top' should be RAM except for the ISA hole.
+	 * Because of the limited MCR's we want to map NV/ACPI into our
+	 * MCR range for gunk in RAM
+	 *
+	 * Clip might cause us to MCR insufficient RAM but that is an
+	 * acceptable failure mode and should only bite obscure boxes with
+	 * a VESA hole at 15Mb
+	 *
+	 * The second case Clip sometimes kicks in is when the EBDA is marked
+	 * as reserved. Again we fail safe with reasonable results
+	 */
+	if (top > clip)
+		top = clip;
+
+	centaur_ram_top = top;
+}
+#endif
+
 void __init init_lmb_memory(void)
 {
 	lmb_init();
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index b00ccc4..0e52435 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -875,6 +875,8 @@ void __init setup_arch(char **cmdline_p)
 	if (mtrr_trim_uncached_memory(max_pfn))
 		max_pfn = e820_end_of_ram_pfn();
 
+	get_centaur_ram_top();
+
 #ifdef CONFIG_X86_32
 	/* max_low_pfn get updated here */
 	find_low_pfn_range();
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 29/31] x86: Make e820_any_mapped to __init
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

We don't need to expose e820_any_mapped anymore

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/e820.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 6b17893..2182bb0 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -47,9 +47,10 @@ EXPORT_SYMBOL(pci_mem_start);
 /*
  * This function checks if any part of the range <start,end> is mapped
  * with type.
+ * phys_pud_init() uses it and is __meminit, but we have !after_bootmem,
+ * so we can use __init_refok here to silence the section-mismatch warning.
  */
-int
-e820_any_mapped(u64 start, u64 end, unsigned type)
+int __init_refok e820_any_mapped(u64 start, u64 end, unsigned type)
 {
 	int i;
 
@@ -64,7 +65,6 @@ e820_any_mapped(u64 start, u64 end, unsigned type)
 	}
 	return 0;
 }
-EXPORT_SYMBOL_GPL(e820_any_mapped);
 
 /*
  * This function checks if the entire range <start,end> is mapped with type.
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 30/31] x86: Use walk_system_ram_range() instead of e820.map directly
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

So we can make e820 __initdata.
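
One subtlety in the new callback (illustration; assumes a 32-bit build,
where unsigned long is 32 bits wide): the pfn is copied into a u64
before shifting, so the byte address cannot be truncated:

	unsigned long pfn = 0x100000;		/* first pfn at 4GB */
	u64 wrong = pfn << PAGE_SHIFT;		/* shifted as 32-bit: 0 */
	u64 right = (u64)pfn << PAGE_SHIFT;	/* 0x100000000 */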

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/kernel/tboot.c |   22 +++++++++-------------
 1 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index cc2c604..cf27d64 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -170,34 +170,30 @@ static void tboot_create_trampoline(void)
 
 #ifdef CONFIG_ACPI_SLEEP
 
-static void add_mac_region(phys_addr_t start, unsigned long size)
+static int
+add_mac_region(unsigned long start_pfn, unsigned long nr_pages, void *arg)
 {
+	u64 start = start_pfn;
+	u64 size = nr_pages;
 	struct tboot_mac_region *mr;
-	phys_addr_t end = start + size;
 
 	if (tboot->num_mac_regions >= MAX_TB_MAC_REGIONS)
 		panic("tboot: Too many MAC regions\n");
 
 	if (start && size) {
 		mr = &tboot->mac_regions[tboot->num_mac_regions++];
-		mr->start = round_down(start, PAGE_SIZE);
-		mr->size  = round_up(end, PAGE_SIZE) - mr->start;
+		mr->start = start << PAGE_SHIFT;
+		mr->size  = (u32) (size << PAGE_SHIFT);
 	}
+
+	return 0;
 }
 
 static int tboot_setup_sleep(void)
 {
-	int i;
-
 	tboot->num_mac_regions = 0;
 
-	for (i = 0; i < e820.nr_map; i++) {
-		if ((e820.map[i].type != E820_RAM)
-		 && (e820.map[i].type != E820_RESERVED_KERN))
-			continue;
-
-		add_mac_region(e820.map[i].addr, e820.map[i].size);
-	}
+	walk_system_ram_range(0, max_pfn, NULL, add_mac_region);
 
 	tboot->acpi_sinfo.kernel_s3_resume_vector = acpi_wakeup_address;
 
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 31/31] x86: make e820 to be __initdata
  2010-03-29  2:42 ` Yinghai Lu
@ 2010-03-29  2:43   ` Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29  2:43 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds
  Cc: Johannes Weiner, linux-kernel, linux-arch, Yinghai Lu

Finally there are no users left after the init boot stage, so we can free it.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/include/asm/e820.h |    2 --
 arch/x86/kernel/e820.c      |    2 +-
 2 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index cd7de51..f2ab72e 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -73,8 +73,6 @@ struct e820map {
 #define BIOS_END		0x00100000
 
 #ifdef __KERNEL__
-/* see comment in arch/x86/kernel/e820.c */
-extern struct e820map e820;
 
 #if defined(CONFIG_X86_OOSTORE) && defined(CONFIG_CPU_SUP_CENTAUR)
 extern int centaur_ram_top;
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 2182bb0..5a0d688 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -35,7 +35,7 @@
  * user can e.g. boot the original kernel with mem=1G while still booting the
  * next kernel with full memory.
  */
-struct e820map e820;
+static struct e820map __initdata e820;
 static struct e820map __initdata e820_saved;
 
 /* For PCI or other memory-mapped resources */
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* Re: [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init
  2010-03-29  2:43   ` Yinghai Lu
  (?)
@ 2010-03-29 12:07   ` Michael Ellerman
  2010-03-29 16:20     ` Yinghai Lu
  -1 siblings, 1 reply; 106+ messages in thread
From: Michael Ellerman @ 2010-03-29 12:07 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
> So those lmb bits could released after kernel is booted up.
> 
> Arch code could define __NOT_KEEP_LMB in asm/lmb.h, __init_lmb will become __init

ARCH_KEEP_LMB or ARCH_DISCARD_LMB, or something like that, would be more
in keeping with existing flags like this.
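
i.e. roughly this pattern (a sketch only; __init_lmb is the marker the
patch introduces, ARCH_DISCARD_LMB is the name suggested above):

	/* arch's asm/lmb.h, when lmb can be discarded after boot */
	#define ARCH_DISCARD_LMB

	/* core lmb header */
	#ifdef ARCH_DISCARD_LMB
	#define __init_lmb	__init
	#else
	#define __init_lmb
	#endif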

cheers


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29  2:43   ` Yinghai Lu
  (?)
@ 2010-03-29 12:22   ` Michael Ellerman
  2010-03-29 16:45     ` Yinghai Lu
  2010-03-29 21:49     ` Benjamin Herrenschmidt
  -1 siblings, 2 replies; 106+ messages in thread
From: Michael Ellerman @ 2010-03-29 12:22 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
> They will check if the region array is big enough.
> 
> __check_and_double_region_array will try to double the region if that array spare
> slots if not big enough.
> find_lmb_area() is used to find good postion for new region array.
> Old array will be copied to new array.
> 
> Arch code should provide to get_max_mapped, so the new array have accessiable
> address
..
> diff --git a/mm/lmb.c b/mm/lmb.c
> index d5d5dc4..9798458 100644
> --- a/mm/lmb.c
> +++ b/mm/lmb.c
> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
>  	return -1;
>  }
>  
> +u64 __weak __init get_max_mapped(void)
> +{
> +	u64 end = max_low_pfn;
> +
> +	end <<= PAGE_SHIFT;
> +
> +	return end;
> +}

^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
adding this function, we could just say that the arch code needs to set
rmo_size up with an appropriate value, and then use that below. Though
maybe that's conflating things.
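
Ie. something like this (only a sketch of the suggestion):

u64 __init get_max_mapped(void)
{
	/* assumes the arch sets lmb.rmo_size up early enough */
	return lmb.rmo_size;
}

(or drop the function entirely and use lmb.rmo_size directly at the
call sites)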

...
> +
> +void __init add_lmb_memory(u64 start, u64 end)
> +{
> +	__check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
> +	lmb_add(start, end - start);
> +}
> +
> +void __init reserve_lmb(u64 start, u64 end, char *name)
> +{
> +	if (start == end)
> +		return;
> +
> +	if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
> +		return;
> +
> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> +	lmb_reserve(start, end - start);
> +}
> +
> +void __init free_lmb(u64 start, u64 end)
> +{
> +	if (start == end)
> +		return;
> +
> +	if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
> +		return;
> +
> +	/* keep punching hole, could run out of slots too */
> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> +	lmb_free(start, end - start);
> +}

Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
times then I'll potentially run out of space? So doesn't that
essentially break the existing API?

It seems to me that rather than adding these "special" routines that
check for enough space on the way in, instead you should be checking in
lmb_add_region() - which is where AFAICS all allocs/frees/reserves
eventually end up if they need to insert a new region.
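
Ie. a rough sketch, at the top of lmb_add_region() - where
lmb_double_array() is a made-up name for your grow logic, and rgn->max
stands for whatever tracks the array capacity:

	/* grow the array before the last free slot gets used */
	if (rgn->cnt >= rgn->max - 1)
		lmb_double_array(rgn);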

cheers




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29  2:42 ` Yinghai Lu
                   ` (31 preceding siblings ...)
  (?)
@ 2010-03-29 12:22 ` Michael Ellerman
  2010-03-29 16:52   ` Yinghai Lu
  -1 siblings, 1 reply; 106+ messages in thread
From: Michael Ellerman @ 2010-03-29 12:22 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch


On Sun, 2010-03-28 at 19:42 -0700, Yinghai Lu wrote:
> the new lmb could be used to early_res in x86.
> 
> Suggested by: David, Ben, and Thomas
> 
> First three patches should go into 2.6.34
> 
> -v6: change sequence as requested by Thomas
> -v7: seperate them to more patches
> -v8: add boundary checking to make sure not free partial page.
> -v9: use lmb_debug to control print out of reserve_lmb.
>      add e820 clean up, and e820 become __initdata

Bike shedding perhaps, but can you maintain the naming convention, ie.
lmb_xxx() rather than xxx_lmb(). Neither is necessarily better, but all
the existing functions use the lmb_xxx() style.

cheers



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init
  2010-03-29 12:07   ` Michael Ellerman
@ 2010-03-29 16:20     ` Yinghai Lu
  2010-03-29 18:34       ` David Miller
  0 siblings, 1 reply; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 16:20 UTC (permalink / raw)
  To: michael
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 05:07 AM, Michael Ellerman wrote:
> On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
>> So those lmb bits can be released after the kernel has booted up.
>>
>> Arch code can define __NOT_KEEP_LMB in asm/lmb.h, and __init_lmb will become __init
> 
> ARCH_KEEP_LMB or ARCH_DISCARD_LMB, or something like that, would be more
> in keeping with existing flags like this.
> 

ok, will use ARCH_DISCARD_LMB.

BTW, it seems only PowerPC needs to keep lmb after the init stage, right?

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 12:22   ` Michael Ellerman
@ 2010-03-29 16:45     ` Yinghai Lu
  2010-03-29 22:20       ` Michael Ellerman
  2010-03-29 21:49     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 16:45 UTC (permalink / raw)
  To: michael
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 05:22 AM, Michael Ellerman wrote:
> On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
>> They will check if the region array is big enough.
>>
>> __check_and_double_region_array() will try to double the region array if its
>> spare slots are not enough.
>> find_lmb_area() is used to find a good position for the new region array.
>> The old array will be copied to the new array.
>>
>> Arch code should provide get_max_mapped(), so the new array has an accessible
>> address
> ..
>> diff --git a/mm/lmb.c b/mm/lmb.c
>> index d5d5dc4..9798458 100644
>> --- a/mm/lmb.c
>> +++ b/mm/lmb.c
>> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
>>  	return -1;
>>  }
>>  
>> +u64 __weak __init get_max_mapped(void)
>> +{
>> +	u64 end = max_low_pfn;
>> +
>> +	end <<= PAGE_SHIFT;
>> +
>> +	return end;
>> +}
> 
> ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
> adding this function, we could just say that the arch code needs to set
> rmo_size up with an appropriate value, and then use that below. Though
> maybe that's conflating things.

ok

will have another patch following this patchset to use rmo_size to replace get_max_mapped()

long __init_lmb lmb_add(u64 base, u64 size)
{
        struct lmb_region *_rgn = &lmb.memory;

        /* On pSeries LPAR systems, the first LMB is our RMO region. */
        if (base == 0)
                lmb.rmo_size = size;

        return lmb_add_region(_rgn, base, size);

}

looks scary.
maybe later powerpc could use lmb_find and set_lmb_rmo_size in their arch code.


> 
> ...
>> +
>> +void __init add_lmb_memory(u64 start, u64 end)
>> +{
>> +	__check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
>> +	lmb_add(start, end - start);
>> +}
>> +
>> +void __init reserve_lmb(u64 start, u64 end, char *name)
>> +{
>> +	if (start == end)
>> +		return;
>> +
>> +	if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
>> +		return;
>> +
>> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>> +	lmb_reserve(start, end - start);
>> +}
>> +
>> +void __init free_lmb(u64 start, u64 end)
>> +{
>> +	if (start == end)
>> +		return;
>> +
>> +	if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
>> +		return;
>> +
>> +	/* keep punching hole, could run out of slots too */
>> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>> +	lmb_free(start, end - start);
>> +}
> 
> Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
> times then I'll potentially run out of space? So doesn't that
> essentially break the existing API?

No, I didn't touch the existing API; arches other than x86 should see little change, apart from
lmb.memory.region
lmb.reserved.region
becoming pointers instead of arrays.

> 
> It seems to me that rather than adding these "special" routines that
> check for enough space on the way in, instead you should be checking in
> lmb_add_region() - which is where AFAICS all allocs/frees/reserves
> eventually end up if they need to insert a new region.

later i prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 12:22 ` [PATCH -v9 00/31] use lmb with x86 Michael Ellerman
@ 2010-03-29 16:52   ` Yinghai Lu
  2010-03-29 20:39     ` Yinghai Lu
  2010-03-29 22:10     ` Michael Ellerman
  0 siblings, 2 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 16:52 UTC (permalink / raw)
  To: michael
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 05:22 AM, Michael Ellerman wrote:
> On Sun, 2010-03-28 at 19:42 -0700, Yinghai Lu wrote:
>> the new lmb could be used to early_res in x86.
>>
>> Suggested by: David, Ben, and Thomas
>>
>> First three patches should go into 2.6.34
>>
>> -v6: change sequence as requested by Thomas
>> -v7: seperate them to more patches
>> -v8: add boundary checking to make sure not free partial page.
>> -v9: use lmb_debug to control print out of reserve_lmb.
>>      add e820 clean up, and e820 become __initdata
> 
> Bike shedding perhaps, but can you maintain the naming convention, ie.
> lmb_xxx() rather than xxx_lmb(). Neither is necessarily better, but all
> the existing functions use the lmb_xxx() style.
> 

so you want

find_lmb_area ==> lmb_find_area
reserve_lmb ==> lmb_reserve
free_lmb ==> lmb_free

first one is ok, 

but for the next two we already have lmb_reserve and lmb_free, which do not check and grow the region array.

should i use 
lmb_reserve_with_check?

thanks

yinghai


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/31] x86: Make sure free_init_pages() free pages in boundary
  2010-03-29  2:42   ` Yinghai Lu
  (?)
@ 2010-03-29 16:57   ` Ingo Molnar
  2010-03-29 16:59     ` Yinghai Lu
  -1 siblings, 1 reply; 106+ messages in thread
From: Ingo Molnar @ 2010-03-29 16:57 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, David Miller,
	Benjamin Herrenschmidt, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch


Note, i applied the fix below, without it the 32-bit defconfig build would 
fail with:

 arch/x86/kernel/head32.c: In function 'i386_start_kernel':
 arch/x86/kernel/head32.c:50: error: implicit declaration of function 'PAGE_ALIGN'

	Ingo

Index: linux/arch/x86/kernel/head32.c
===================================================================
--- linux.orig/arch/x86/kernel/head32.c
+++ linux/arch/x86/kernel/head32.c
@@ -7,6 +7,7 @@
 
 #include <linux/init.h>
 #include <linux/start_kernel.h>
+#include <linux/mm.h>
 
 #include <asm/setup.h>
 #include <asm/sections.h>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/31] x86: Make sure free_init_pages() free pages in boundary
  2010-03-29 16:57   ` Ingo Molnar
@ 2010-03-29 16:59     ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 16:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, H. Peter Anvin, Andrew Morton, David Miller,
	Benjamin Herrenschmidt, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On 03/29/2010 09:57 AM, Ingo Molnar wrote:
> 
> Note, i applied the fix below, without it the 32-bit defconfig build would 
> fail with:
> 
>  arch/x86/kernel/head32.c: In function 'i386_start_kernel':
>  arch/x86/kernel/head32.c:50: error: implicit declaration of function 'PAGE_ALIGN'
> 
thanks.

after the lmb work, will check all of those:
round_up
round_down
roundup
PAGE_ALIGN
ALIGN
...

We promised Andrew we would clean them up.
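
(For reference, roughly what these expand to today, which shows the
overlap - the mask-based ones are for power-of-two alignments only:

#define round_up(x, y)   (((x) + (y) - 1) & ~((y) - 1))
#define roundup(x, y)    ((((x) + ((y) - 1)) / (y)) * (y))
#define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)

so for page alignment they all end up doing the same thing.)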


Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init
  2010-03-29 16:20     ` Yinghai Lu
@ 2010-03-29 18:34       ` David Miller
  2010-03-29 18:39         ` Yinghai Lu
  0 siblings, 1 reply; 106+ messages in thread
From: David Miller @ 2010-03-29 18:34 UTC (permalink / raw)
  To: yinghai
  Cc: michael, mingo, tglx, hpa, akpm, benh, torvalds, hannes,
	linux-kernel, linux-arch

From: Yinghai Lu <yinghai@kernel.org>
Date: Mon, 29 Mar 2010 09:20:28 -0700

> BTW, it seems only PowerPC needs to keep lmb after the init stage, right?

For now.  Sparc64 will need it at some point in the future.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init
  2010-03-29 18:34       ` David Miller
@ 2010-03-29 18:39         ` Yinghai Lu
  2010-03-29 19:11           ` David Miller
  0 siblings, 1 reply; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 18:39 UTC (permalink / raw)
  To: David Miller
  Cc: michael, mingo, tglx, hpa, akpm, benh, torvalds, hannes,
	linux-kernel, linux-arch

On 03/29/2010 11:34 AM, David Miller wrote:
> From: Yinghai Lu <yinghai@kernel.org>
> Date: Mon, 29 Mar 2010 09:20:28 -0700
> 
>> BTW, it seems only PowerPC needs to keep lmb after the init stage, right?
> 
> For now.  Sparc64 will need it at some point in the future.

for memory hotplug support?

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [tip:x86/urgent] x86: Make smp_locks end with page alignment
  2010-03-29  2:42   ` Yinghai Lu
  (?)
@ 2010-03-29 18:42   ` tip-bot for Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: tip-bot for Yinghai Lu @ 2010-03-29 18:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, yinghai, torvalds, hannes, davem, benh,
	tglx, mingo

Commit-ID:  596b711ed6b5235f8545680ef38ace00f9898c32
Gitweb:     http://git.kernel.org/tip/596b711ed6b5235f8545680ef38ace00f9898c32
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Sun, 28 Mar 2010 19:42:54 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 29 Mar 2010 18:42:30 +0200

x86: Make smp_locks end with page alignment

Fix:

 ------------[ cut here ]------------
 WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa()
 free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned
 Modules linked in:
 Pid: 0, comm: swapper Not tainted
 2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace:
  [<40232e9f>] warn_slowpath_common+0x65/0x7c
  [<4021c9f0>] ? free_init_pages+0x4c/0xfa
  [<40881434>] ? _etext+0x0/0x24
  [<40232eea>] warn_slowpath_fmt+0x24/0x27
  [<4021c9f0>] free_init_pages+0x4c/0xfa
  [<40881434>] ? _etext+0x0/0x24
  [<40d3f4bd>] alternative_instructions+0xf6/0x100
  [<40d3fe4f>] check_bugs+0xbd/0xbf
  [<40d398a7>] start_kernel+0x2d5/0x2e4
  [<40d390ce>] i386_start_kernel+0xce/0xd5
 ---[ end trace 4eaa2a86a8e2da22 ]---

Comments in vmlinux.lds.S already said:

 |        /*
 |         * smp_locks might be freed after init
 |         * start/end must be page aligned
 |         */

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1269830604-26214-2-git-send-email-yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/vmlinux.lds.S |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 44879df..2cc2497 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -291,8 +291,8 @@ SECTIONS
 	.smp_locks : AT(ADDR(.smp_locks) - LOAD_OFFSET) {
 		__smp_locks = .;
 		*(.smp_locks)
-		__smp_locks_end = .;
 		. = ALIGN(PAGE_SIZE);
+		__smp_locks_end = .;
 	}
 
 #ifdef CONFIG_X86_64

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [tip:x86/urgent] x86: Make sure free_init_pages() frees pages on page boundary
  2010-03-29  2:42   ` Yinghai Lu
  (?)
  (?)
@ 2010-03-29 18:42   ` tip-bot for Yinghai Lu
  -1 siblings, 0 replies; 106+ messages in thread
From: tip-bot for Yinghai Lu @ 2010-03-29 18:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, yinghai, torvalds, hannes, davem, benh,
	tglx, sgruszka, mingo

Commit-ID:  c967da6a0ba837f762042e931d4afcf72045547c
Gitweb:     http://git.kernel.org/tip/c967da6a0ba837f762042e931d4afcf72045547c
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Sun, 28 Mar 2010 19:42:55 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 29 Mar 2010 18:55:33 +0200

x86: Make sure free_init_pages() frees pages on page boundary

When CONFIG_NO_BOOTMEM=y, it could use memory more efficiently, or
in a more compact fashion.

Example:

 Allocated new RAMDISK: 00ec2000 - 0248ce57
 Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56

The new RAMDISK's end is not page aligned, so the
last page could be shared with other users.

When free_init_pages() is called for initrd or .init, that page
could be freed and we could corrupt other data.

code segment in free_init_pages():

 |        for (; addr < end; addr += PAGE_SIZE) {
 |                ClearPageReserved(virt_to_page(addr));
 |                init_page_count(virt_to_page(addr));
 |                memset((void *)(addr & ~(PAGE_SIZE-1)),
 |                        POISON_FREE_INITMEM, PAGE_SIZE);
 |                free_page(addr);
 |                totalram_pages++;
 |        }

The trailing half-used page would be freed as one whole free page.

So page align the boundaries.

-v2: make the original initramdisk aligned, according to
     Johannes, otherwise we have a chance to lose one page.
     We still need to keep initrd_end unaligned, otherwise it could
     confuse the decompressor.
-v3: change to WARN_ON instead, suggested by Johannes.
-v4: use PAGE_ALIGN, suggested by Johannes.
     We may rename that macro later to PAGE_ALIGN_UP, with PAGE_ALIGN_DOWN.
     Add comments about assuming the ramdisk start is aligned.
     In relocate_initrd(), change to re-get ramdisk_image instead of saving it,
     to make the diff smaller. Add a warning for wrong ranges, suggested by Johannes.
-v6: remove one WARN()
     We need to align the beginning in free_init_pages().
     Do not copy more than ramdisk_size, noticed by Johannes.

Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Tested-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1269830604-26214-3-git-send-email-yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/head32.c |    4 +++-
 arch/x86/kernel/head64.c |    3 ++-
 arch/x86/kernel/setup.c  |   10 ++++++----
 arch/x86/mm/init.c       |   32 ++++++++++++++++++++++++++------
 4 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index adedeef..b2e2460 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -7,6 +7,7 @@
 
 #include <linux/init.h>
 #include <linux/start_kernel.h>
+#include <linux/mm.h>
 
 #include <asm/setup.h>
 #include <asm/sections.h>
@@ -44,9 +45,10 @@ void __init i386_start_kernel(void)
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
 	if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
+		/* Assume only end is not page aligned */
 		u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 		u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
-		u64 ramdisk_end   = ramdisk_image + ramdisk_size;
+		u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index b5a9896..7147143 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -103,9 +103,10 @@ void __init x86_64_start_reservations(char *real_mode_data)
 #ifdef CONFIG_BLK_DEV_INITRD
 	/* Reserve INITRD */
 	if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
+		/* Assume only end is not page aligned */
 		unsigned long ramdisk_image = boot_params.hdr.ramdisk_image;
 		unsigned long ramdisk_size  = boot_params.hdr.ramdisk_size;
-		unsigned long ramdisk_end   = ramdisk_image + ramdisk_size;
+		unsigned long ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 		reserve_early(ramdisk_image, ramdisk_end, "RAMDISK");
 	}
 #endif
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 5d7ba1a..d76e185 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -314,16 +314,17 @@ static void __init reserve_brk(void)
 #define MAX_MAP_CHUNK	(NR_FIX_BTMAPS << PAGE_SHIFT)
 static void __init relocate_initrd(void)
 {
-
+	/* Assume only end is not page aligned */
 	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+	u64 area_size     = PAGE_ALIGN(ramdisk_size);
 	u64 end_of_lowmem = max_low_pfn_mapped << PAGE_SHIFT;
 	u64 ramdisk_here;
 	unsigned long slop, clen, mapaddr;
 	char *p, *q;
 
 	/* We need to move the initrd down into lowmem */
-	ramdisk_here = find_e820_area(0, end_of_lowmem, ramdisk_size,
+	ramdisk_here = find_e820_area(0, end_of_lowmem, area_size,
 					 PAGE_SIZE);
 
 	if (ramdisk_here == -1ULL)
@@ -332,7 +333,7 @@ static void __init relocate_initrd(void)
 
 	/* Note: this includes all the lowmem currently occupied by
 	   the initrd, we rely on that fact to keep the data intact. */
-	reserve_early(ramdisk_here, ramdisk_here + ramdisk_size,
+	reserve_early(ramdisk_here, ramdisk_here + area_size,
 			 "NEW RAMDISK");
 	initrd_start = ramdisk_here + PAGE_OFFSET;
 	initrd_end   = initrd_start + ramdisk_size;
@@ -376,9 +377,10 @@ static void __init relocate_initrd(void)
 
 static void __init reserve_initrd(void)
 {
+	/* Assume only end is not page aligned */
 	u64 ramdisk_image = boot_params.hdr.ramdisk_image;
 	u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
-	u64 ramdisk_end   = ramdisk_image + ramdisk_size;
+	u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 	u64 end_of_lowmem = max_low_pfn_mapped << PAGE_SHIFT;
 
 	if (!boot_params.hdr.type_of_loader ||
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index e71c5cb..452ee5b 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -331,11 +331,23 @@ int devmem_is_allowed(unsigned long pagenr)
 
 void free_init_pages(char *what, unsigned long begin, unsigned long end)
 {
-	unsigned long addr = begin;
+	unsigned long addr;
+	unsigned long begin_aligned, end_aligned;
 
-	if (addr >= end)
+	/* Make sure boundaries are page aligned */
+	begin_aligned = PAGE_ALIGN(begin);
+	end_aligned   = end & PAGE_MASK;
+
+	if (WARN_ON(begin_aligned != begin || end_aligned != end)) {
+		begin = begin_aligned;
+		end   = end_aligned;
+	}
+
+	if (begin >= end)
 		return;
 
+	addr = begin;
+
 	/*
 	 * If debugging page accesses then do not free this memory but
 	 * mark them not present - any buggy init-section access will
@@ -343,7 +355,7 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
 	 */
 #ifdef CONFIG_DEBUG_PAGEALLOC
 	printk(KERN_INFO "debug: unmapping init memory %08lx..%08lx\n",
-		begin, PAGE_ALIGN(end));
+		begin, end);
 	set_memory_np(begin, (end - begin) >> PAGE_SHIFT);
 #else
 	/*
@@ -358,8 +370,7 @@ void free_init_pages(char *what, unsigned long begin, unsigned long end)
 	for (; addr < end; addr += PAGE_SIZE) {
 		ClearPageReserved(virt_to_page(addr));
 		init_page_count(virt_to_page(addr));
-		memset((void *)(addr & ~(PAGE_SIZE-1)),
-			POISON_FREE_INITMEM, PAGE_SIZE);
+		memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
 		free_page(addr);
 		totalram_pages++;
 	}
@@ -376,6 +387,15 @@ void free_initmem(void)
 #ifdef CONFIG_BLK_DEV_INITRD
 void free_initrd_mem(unsigned long start, unsigned long end)
 {
-	free_init_pages("initrd memory", start, end);
+	/*
+	 * end could be unaligned, and we can not align it, since an
+	 * aligned initrd_end could confuse the decompressor.
+	 * We already reserved the trailing partial page earlier, in
+	 *   - i386_start_kernel()
+	 *   - x86_64_start_kernel()
+	 *   - relocate_initrd()
+	 * so here we can do PAGE_ALIGN() safely to get the partial page freed.
+	 */
+	free_init_pages("initrd memory", start, PAGE_ALIGN(end));
 }
 #endif

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [tip:x86/urgent] x86: Do not free zero sized per cpu areas
  2010-03-29  2:42   ` Yinghai Lu
  (?)
  (?)
@ 2010-03-29 18:43   ` tip-bot for Ian Campbell
  -1 siblings, 0 replies; 106+ messages in thread
From: tip-bot for Ian Campbell @ 2010-03-29 18:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, joel.becker, hpa, mingo, konrad.wilk, yinghai,
	torvalds, peterz, ian.campbell, hannes, davem, benh, tglx,
	sgruszka, mingo

Commit-ID:  eed63519e3e74d515d2007ecd895338d0ba2a85c
Gitweb:     http://git.kernel.org/tip/eed63519e3e74d515d2007ecd895338d0ba2a85c
Author:     Ian Campbell <ian.campbell@citrix.com>
AuthorDate: Sun, 28 Mar 2010 19:42:56 -0700
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 29 Mar 2010 18:55:40 +0200

x86: Do not free zero sized per cpu areas

This avoids an infinite loop in free_early_partial().

Add a warning to free_early_partial() to catch future problems.

-v5: put the start > end check back into WARN_ONCE()
-v6: use one line for warning, suggested by Linus
-v7: more tests
-v8: remove the function name as suggested by Johannes
     WARN_ONCE() will print out that function name.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Joel Becker <joel.becker@oracle.com>
Tested-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1269830604-26214-4-git-send-email-yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/early_res.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/early_res.c b/kernel/early_res.c
index 3cb2c66..31aa933 100644
--- a/kernel/early_res.c
+++ b/kernel/early_res.c
@@ -333,6 +333,12 @@ void __init free_early_partial(u64 start, u64 end)
 	struct early_res *r;
 	int i;
 
+	if (start == end)
+		return;
+
+	if (WARN_ONCE(start > end, "  wrong range [%#llx, %#llx]\n", start, end))
+		return;
+
 try_next:
 	i = find_overlapped_early(start, end);
 	if (i >= max_early_res)

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* Re: [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init
  2010-03-29 18:39         ` Yinghai Lu
@ 2010-03-29 19:11           ` David Miller
  2010-03-29 21:44             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 106+ messages in thread
From: David Miller @ 2010-03-29 19:11 UTC (permalink / raw)
  To: yinghai
  Cc: michael, mingo, tglx, hpa, akpm, benh, torvalds, hannes,
	linux-kernel, linux-arch

From: Yinghai Lu <yinghai@kernel.org>
Date: Mon, 29 Mar 2010 11:39:52 -0700

> On 03/29/2010 11:34 AM, David Miller wrote:
>> From: Yinghai Lu <yinghai@kernel.org>
>> Date: Mon, 29 Mar 2010 09:20:28 -0700
>> 
>>> BTW, it seems only PowerPC needs to keep lmb after the init stage, right?
>> 
>> For now.  Sparc64 will need it at some point in the future.
> 
> for memory hotplug support?

Yes.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 16:52   ` Yinghai Lu
@ 2010-03-29 20:39     ` Yinghai Lu
  2010-03-29 22:10     ` Michael Ellerman
  1 sibling, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 20:39 UTC (permalink / raw)
  To: michael
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 09:52 AM, Yinghai Lu wrote:
> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
>> On Sun, 2010-03-28 at 19:42 -0700, Yinghai Lu wrote:
>>> the new lmb could be used to early_res in x86.
>>>
>>> Suggested by: David, Ben, and Thomas
>>>
>>> First three patches should go into 2.6.34
>>>
>>> -v6: change sequence as requested by Thomas
>>> -v7: seperate them to more patches
>>> -v8: add boundary checking to make sure not free partial page.
>>> -v9: use lmb_debug to control print out of reserve_lmb.
>>>      add e820 clean up, and e820 become __initdata
>>
>> Bike shedding perhaps, but can you maintain the naming convention, ie.
>> lmb_xxx() rather than xxx_lmb(). Neither is necessarily better, but all
>> the existing functions use the lmb_xxx() style.
>>
> 
> so you want
> 
> find_lmb_area ==> lmb_find_area
> reserve_lmb ==> lmb_reserve
> free_lmb ==> lmb_free
> 
> first one is ok, 
> 
> but next two we already have lmb_reserved and lmb_free without checking and increasing the size of region array.
> 
> should i use 
> lmb_reserve_with_check?
> 

I change 

find_lmb_area ==> lmb_find_area
reserve_lmb ==> lmb_reserve_area
free_lmb ==> lmb_free_area

does that look ok to you?

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init
  2010-03-29 19:11           ` David Miller
@ 2010-03-29 21:44             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-29 21:44 UTC (permalink / raw)
  To: David Miller
  Cc: yinghai, michael, mingo, tglx, hpa, akpm, torvalds, hannes,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 12:11 -0700, David Miller wrote:
> From: Yinghai Lu <yinghai@kernel.org>
> Date: Mon, 29 Mar 2010 11:39:52 -0700
> 
> > On 03/29/2010 11:34 AM, David Miller wrote:
> >> From: Yinghai Lu <yinghai@kernel.org>
> >> Date: Mon, 29 Mar 2010 09:20:28 -0700
> >> 
> >>> BTW, it seems only PowerPC needs to keep lmb after the init stage, right?
> >> 
> >> For now.  Sparc64 will need it at some point in the future.
> > 
> > for memory hotplug support?
> 
> Yes.

We can also use it to implement page_is_ram(), which might help
get the /dev/mem cacheability setting right by default one day :-)
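
Eg. something like this sketch - lmb_is_memory() is a made-up helper
that would just scan lmb.memory for the range:

int page_is_ram(unsigned long pfn)
{
	return lmb_is_memory((u64)pfn << PAGE_SHIFT, PAGE_SIZE);
}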

However, we only need to keep the memory list, not the reserve list.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 12:22   ` Michael Ellerman
  2010-03-29 16:45     ` Yinghai Lu
@ 2010-03-29 21:49     ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-29 21:49 UTC (permalink / raw)
  To: michael
  Cc: Yinghai Lu, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 23:22 +1100, Michael Ellerman wrote:
> ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
> adding this function, we could just say that the arch code needs to set
> rmo_size up with an appropriate value, and then use that below. Though
> maybe that's conflating things. 

Well... not quite.

The RMO (which really is the RMA, historical misnaming) is the region of
memory we can access very early during boot (in real mode on ppc64 but
I plan to use it to represent the boot-time fixed mapping on ppc32 at
some stage). It's not strictly equivalent to max lowmem. However, on
ppc64, it happens to be the size of the first added LMB entry.

In any case, the LMBs should really not care. They allocate where you
tell them to. 

IE. That stuff is arch specific enough that I suspect we should just
move it out, while the concept of max_lowmem is common enough (at least
for 32-bit archs) that I'm happy to have some provisions for it inside
the LMB core.

Maybe what we need is an arch call to set the allocation "limit". It
could be set in stages during boot. To the initial mapped memory (bolted
TLB) on ppc32 very early, and then pushed up to max_low_pfn as soon as
the full MMU setup is done for example. IE. All we need is an
lmb_set_alloc_limit() called by the arch in the right spots, that
defines the default allocation limit for lmb_alloc() though of course
lmb_alloc_base() can be used by callers to enforce explicit limits.
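
As a very rough sketch (the field name is made up here):

void __init lmb_set_alloc_limit(u64 limit)
{
	/* new lmb.alloc_limit field, used as the default upper
	 * bound in lmb_alloc() when no explicit limit is given */
	lmb.alloc_limit = limit;
}

On ppc32 that would be called with the top of the bolted mapping very
early, then again with max_low_pfn << PAGE_SHIFT once the full MMU
setup is done.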

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 16:52   ` Yinghai Lu
  2010-03-29 20:39     ` Yinghai Lu
@ 2010-03-29 22:10     ` Michael Ellerman
  2010-03-29 22:17       ` Yinghai Lu
  1 sibling, 1 reply; 106+ messages in thread
From: Michael Ellerman @ 2010-03-29 22:10 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch


On Mon, 2010-03-29 at 09:52 -0700, Yinghai Lu wrote:
> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
> > On Sun, 2010-03-28 at 19:42 -0700, Yinghai Lu wrote:
> >> the new lmb could be used to early_res in x86.
> >>
> >> Suggested by: David, Ben, and Thomas
> >>
> >> First three patches should go into 2.6.34
> >>
> >> -v6: change sequence as requested by Thomas
> >> -v7: seperate them to more patches
> >> -v8: add boundary checking to make sure not free partial page.
> >> -v9: use lmb_debug to control print out of reserve_lmb.
> >>      add e820 clean up, and e820 become __initdata
> > 
> > Bike shedding perhaps, but can you maintain the naming convention, ie.
> > lmb_xxx() rather than xxx_lmb(). Neither is necessarily better, but all
> > the existing functions use the lmb_xxx() style.
> > 
> 
> so you want
> 
> find_lmb_area ==> lmb_find_area
> reserve_lmb ==> lmb_reserve
> free_lmb ==> lmb_free
> 
> first one is ok, 
> 
> but for the next two we already have lmb_reserve and lmb_free, which do not check and grow the region array.

That was the point of my other mail. We now have two lmb APIs, one which
checks if the array will overflow and one which doesn't. That seems like
a bad idea. Having one called lmb_free() and one called free_lmb() is
definitely a bad idea, because it's completely non obvious which one
caters for overflow.

cheers






^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 22:10     ` Michael Ellerman
@ 2010-03-29 22:17       ` Yinghai Lu
  2010-03-29 22:32         ` Michael Ellerman
  2010-03-29 23:29         ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 22:17 UTC (permalink / raw)
  To: michael
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 03:10 PM, Michael Ellerman wrote:
> On Mon, 2010-03-29 at 09:52 -0700, Yinghai Lu wrote:
>> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
>>> On Sun, 2010-03-28 at 19:42 -0700, Yinghai Lu wrote:
>>>> the new lmb could be used to early_res in x86.
>>>>
>>>> Suggested by: David, Ben, and Thomas
>>>>
>>>> First three patches should go into 2.6.34
>>>>
>>>> -v6: change sequence as requested by Thomas
>>>> -v7: seperate them to more patches
>>>> -v8: add boundary checking to make sure not free partial page.
>>>> -v9: use lmb_debug to control print out of reserve_lmb.
>>>>      add e820 clean up, and e820 become __initdata
>>>
>>> Bike shedding perhaps, but can you maintain the naming convention, ie.
>>> lmb_xxx() rather than xxx_lmb(). Neither is necessarily better, but all
>>> the existing functions use the lmb_xxx() style.
>>>
>>
>> so you want
>>
>> find_lmb_area ==> lmb_find_area
>> reserve_lmb ==> lmb_reserve
>> free_lmb ==> lmb_free
>>
>> first one is ok, 
>>
>> but for the next two we already have lmb_reserve and lmb_free, which do not check and grow the region array.
> 
> That was the point of my other mail. We now have two lmb APIs, one which
> checks if the array will overflow and one which doesn't. That seems like
> a bad idea. Having one called lmb_free() and one called free_lmb() is
> definitely a bad idea, because it's completely non obvious which one
> caters for overflow.

I want to keep the effects on other lmb users to a minimum at first.

and we can merge those functions later.

or do you insist on merging them in this patchset?

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 16:45     ` Yinghai Lu
@ 2010-03-29 22:20       ` Michael Ellerman
  2010-03-29 22:37         ` Yinghai Lu
  2010-03-29 23:31         ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 106+ messages in thread
From: Michael Ellerman @ 2010-03-29 22:20 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch


On Mon, 2010-03-29 at 09:45 -0700, Yinghai Lu wrote:
> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
> > On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
> >> They will check if the region array is big enough.
> >>
> >> __check_and_double_region_array() will try to double the region array if its
> >> spare slots are not enough.
> >> find_lmb_area() is used to find a good position for the new region array.
> >> The old array will be copied to the new array.
> >>
> >> Arch code should provide get_max_mapped(), so the new array has an accessible
> >> address
> > ..
> >> diff --git a/mm/lmb.c b/mm/lmb.c
> >> index d5d5dc4..9798458 100644
> >> --- a/mm/lmb.c
> >> +++ b/mm/lmb.c
> >> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
> >>  	return -1;
> >>  }
> >>  
> >> +u64 __weak __init get_max_mapped(void)
> >> +{
> >> +	u64 end = max_low_pfn;
> >> +
> >> +	end <<= PAGE_SHIFT;
> >> +
> >> +	return end;
> >> +}
> > 
> > ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
> > adding this function, we could just say that the arch code needs to set
> > rmo_size up with an appropriate value, and then use that below. Though
> > maybe that's conflating things.
> 
> ok
> 
> will have another patch following this patchset to use rmo_size to replace get_max_mapped()

No don't, Benh's idea was better. Leave rmo_size for now, we can clean
that up later.

We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
code calls when it knows what the alloc limit is (and can call multiple
times during boot). Or maybe it should be called "default_alloc_limit",
but that's getting a bit long winded.
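
Ie. just (a sketch, assuming the current struct lmb layout):

struct lmb {
	unsigned long debug;
	u64 rmo_size;
	u64 alloc_limit;	/* new: default upper bound for allocs */
	struct lmb_region memory;
	struct lmb_region reserved;
};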

> 
> long __init_lmb lmb_add(u64 base, u64 size)
> {
>         struct lmb_region *_rgn = &lmb.memory;
> 
>         /* On pSeries LPAR systems, the first LMB is our RMO region. */
>         if (base == 0)
>                 lmb.rmo_size = size;
> 
>         return lmb_add_region(_rgn, base, size);
> 
> }
> 
> looks scary.
> maybe later powerpc could use lmb_find and set_lmb_rmo_size in their arch code.

It's not really scary, and it gives you a hint where the code came from
originally :)

We can remove that later though, with some powerpc code to detect the
first memory region before we put it into lmb.

> > ...
> >> +
> >> +void __init add_lmb_memory(u64 start, u64 end)
> >> +{
> >> +	__check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
> >> +	lmb_add(start, end - start);
> >> +}
> >> +
> >> +void __init reserve_lmb(u64 start, u64 end, char *name)
> >> +{
> >> +	if (start == end)
> >> +		return;
> >> +
> >> +	if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
> >> +		return;
> >> +
> >> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> >> +	lmb_reserve(start, end - start);
> >> +}
> >> +
> >> +void __init free_lmb(u64 start, u64 end)
> >> +{
> >> +	if (start == end)
> >> +		return;
> >> +
> >> +	if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
> >> +		return;
> >> +
> >> +	/* keep punching hole, could run out of slots too */
> >> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
> >> +	lmb_free(start, end - start);
> >> +}
> > 
> > Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
> > times then I'll potentially run out of space? So doesn't that
> > essentially break the existing API?
> 
> No, I didn't touch the existing API; arches other than x86 should see little change, apart from
> lmb.memory.region
> lmb.reserved.region
> becoming pointers instead of arrays.

But that's my point. You shouldn't need to touch the existing API, and
you shouldn't need to add a new parallel API. You should just be able to
add the logic for doubling the array in the lmb core, and then everyone
gets dynamically expandable lmb. I don't see any reason why we want to
have two APIs.

> > It seems to me that rather than adding these "special" routines that
> > check for enough space on the way in, instead you should be checking in
> > lmb_add_region() - which is where AFAICS all allocs/frees/reserves
> > eventually end up if they need to insert a new region.
> 
> later i prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.

Why? The existing code has been working for years and is well tested?

cheers




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 22:17       ` Yinghai Lu
@ 2010-03-29 22:32         ` Michael Ellerman
  2010-03-29 22:41           ` Yinghai Lu
  2010-03-29 23:33           ` Benjamin Herrenschmidt
  2010-03-29 23:29         ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 106+ messages in thread
From: Michael Ellerman @ 2010-03-29 22:32 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch


On Mon, 2010-03-29 at 15:17 -0700, Yinghai Lu wrote:
> On 03/29/2010 03:10 PM, Michael Ellerman wrote:
> > On Mon, 2010-03-29 at 09:52 -0700, Yinghai Lu wrote:
> >> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
> >>> On Sun, 2010-03-28 at 19:42 -0700, Yinghai Lu wrote:
> >>>> the new lmb could be used to early_res in x86.
> >>>>
> >>>> Suggested by: David, Ben, and Thomas
> >>>>
> >>>> First three patches should go into 2.6.34
> >>>>
> >>>> -v6: change sequence as requested by Thomas
> >>>> -v7: seperate them to more patches
> >>>> -v8: add boundary checking to make sure not free partial page.
> >>>> -v9: use lmb_debug to control print out of reserve_lmb.
> >>>>      add e820 clean up, and e820 become __initdata
> >>>
> >>> Bike shedding perhaps, but can you maintain the naming convention, ie.
> >>> lmb_xxx() rather than xxx_lmb(). Neither is necessarily better, but all
> >>> the existing functions use the lmb_xxx() style.
> >>>
> >>
> >> so you want
> >>
> >> find_lmb_area ==> lmb_find_area
> >> reserve_lmb ==> lmb_reserve
> >> free_lmb ==> lmb_free
> >>
> >> first one is ok, 
> >>
> >> but for the next two we already have lmb_reserve and lmb_free, which do not check and grow the region array.
> > 
> > That was the point of my other mail. We now have two lmb APIs, one which
> > checks if the array will overflow and one which doesn't. That seems like
> > a bad idea. Having one called lmb_free() and one called free_lmb() is
> > definitely a bad idea, because it's completely non obvious which one
> > caters for overflow.
> 
> I want to keep the effects on other lmb users to a minimum at first.

That's a good plan, but I don't think this is the nicest way to do it.

> and we can merge those functions later.
> 
> or do you insist on merging them in this patchset?

No I don't insist.

I _suggest_ that if we want to avoid affecting existing lmb users, then
the checking logic should go into the existing API, but be #ifdef'ed for
now - eg. CONFIG_DYNAMIC_LMB or something. That way you avoid affecting
existing users (more or less), but you don't add a new API that you then
have to remove later.
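
Ie. the lmb_add_region() check from my other mail, but guarded
(sketch; lmb_double_array() is a made-up name, as before):

#ifdef CONFIG_DYNAMIC_LMB
	if (rgn->cnt >= rgn->max - 1)
		lmb_double_array(rgn);
#endif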

Having said that I don't think it really does affect existing users that
much. We still have the statically defined region arrays, and they're
still the same size, so sparc and powerpc should never need to resize,
except on machines where we currently run out of space in the array
anyway.

cheers


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 22:20       ` Michael Ellerman
@ 2010-03-29 22:37         ` Yinghai Lu
  2010-03-29 23:34           ` Benjamin Herrenschmidt
  2010-03-29 23:31         ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 22:37 UTC (permalink / raw)
  To: michael
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 03:20 PM, Michael Ellerman wrote:
> On Mon, 2010-03-29 at 09:45 -0700, Yinghai Lu wrote:
>> On 03/29/2010 05:22 AM, Michael Ellerman wrote:
>>> On Sun, 2010-03-28 at 19:43 -0700, Yinghai Lu wrote:
>>>> They will check if the region array is big enough.
>>>>
>>>> __check_and_double_region_array() will try to double the region array if its
>>>> spare slots are not enough.
>>>> find_lmb_area() is used to find a good position for the new region array.
>>>> The old array will be copied to the new array.
>>>>
>>>> Arch code should provide get_max_mapped(), so the new array has an accessible
>>>> address
>>> ..
>>>> diff --git a/mm/lmb.c b/mm/lmb.c
>>>> index d5d5dc4..9798458 100644
>>>> --- a/mm/lmb.c
>>>> +++ b/mm/lmb.c
>>>> @@ -551,6 +551,95 @@ int lmb_find(struct lmb_property *res)
>>>>  	return -1;
>>>>  }
>>>>  
>>>> +u64 __weak __init get_max_mapped(void)
>>>> +{
>>>> +	u64 end = max_low_pfn;
>>>> +
>>>> +	end <<= PAGE_SHIFT;
>>>> +
>>>> +	return end;
>>>> +}
>>>
>>> ^ This is (sort of) what lmb.rmo_size represents. So maybe instead of
>>> adding this function, we could just say that the arch code needs to set
>>> rmo_size up with an appropriate value, and then use that below. Though
>>> maybe that's conflating things.
>>
>> ok
>>
>> will have another patch following this patchset to use rmo_size to replace get_max_mapped()
> 
> No don't, Benh's idea was better. Leave rmo_size for now, we can clean
> that up later.
> 
> We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
> code calls when it knows what the alloc limit is (and can call multiple
> times during boot). Or maybe it should be called "default_alloc_limit",
> but that's getting a bit long winded.

ok, I will keep get_max_mapped() for now, and will change to the new field later

> 
>>
>> long __init_lmb lmb_add(u64 base, u64 size)
>> {
>>         struct lmb_region *_rgn = &lmb.memory;
>>
>>         /* On pSeries LPAR systems, the first LMB is our RMO region. */
>>         if (base == 0)
>>                 lmb.rmo_size = size;
>>
>>         return lmb_add_region(_rgn, base, size);
>>
>> }
>>
>> looks scary.
>> maybe later powerpc could use lmb_find and set_lmb_rmo_size in their arch code.
> 
> It's not really scary, and it gives you a hint where the code came from
> originally :)
> 
> We can remove that later though, with some powerpc code to detect the
> first memory region before we put it into lmb.

good.

> 
>>> ...
>>>> +
>>>> +void __init add_lmb_memory(u64 start, u64 end)
>>>> +{
>>>> +	__check_and_double_region_array(&lmb.memory, &lmb_memory_region[0], start, end);
>>>> +	lmb_add(start, end - start);
>>>> +}
>>>> +
>>>> +void __init reserve_lmb(u64 start, u64 end, char *name)
>>>> +{
>>>> +	if (start == end)
>>>> +		return;
>>>> +
>>>> +	if (WARN_ONCE(start > end, "reserve_lmb: wrong range [%#llx, %#llx]\n", start, end))
>>>> +		return;
>>>> +
>>>> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>>>> +	lmb_reserve(start, end - start);
>>>> +}
>>>> +
>>>> +void __init free_lmb(u64 start, u64 end)
>>>> +{
>>>> +	if (start == end)
>>>> +		return;
>>>> +
>>>> +	if (WARN_ONCE(start > end, "free_lmb: wrong range [%#llx, %#llx]\n", start, end))
>>>> +		return;
>>>> +
>>>> +	/* keep punching hole, could run out of slots too */
>>>> +	__check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
>>>> +	lmb_free(start, end - start);
>>>> +}
>>>
>>> Doesn't this mean that if I call lmb_alloc() or lmb_free() too many
>>> times then I'll potentially run out of space? So doesn't that
>>> essentially break the existing API?
>>
>> No, I didn't touch the existing API; arches other than x86 should see little change, apart from
>> lmb.memory.region
>> lmb.reserved.region
>> becoming pointers instead of arrays.
> 
> But that's my point. You shouldn't need to touch the existing API, and
> you shouldn't need to add a new parallel API. You should just be able to
> add the logic for doubling the array in the lmb core, and then everyone
> gets dynamically expandable lmb. I don't see any reason why we want to
> have two APIs.

that would be too much change for all the x86 callers.

that set of APIs is __init, and will be freed later.

and these two are just wrappers for the old ones, to make the x86 transition easier.

> 
>>> It seems to me that rather than adding these "special" routines that
>>> check for enough space on the way in, instead you should be checking in
>>> lmb_add_region() - which is where AFAICS all allocs/frees/reserves
>>> eventually end up if they need to insert a new region.
>>
>> later i prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.
> 
> Why? The existing code has been working for years and is well tested?

We need find_lmb_area() to find a good position for the new reserved region array.

static void __init __reserve_lmb(u64 start, u64 end, char *name)
{
        __check_and_double_region_array(&lmb.reserved, &lmb_reserved_region[0], start, end);
        lmb_reserve(start, end - start);
}

With the excluded area passed to __check_and_double_region_array(), it will not place the new array so that it overlaps the area we are going to reserve.

and find_lmb_area + reserve_lmb should be identical to lmb_alloc...
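
ie. something like this sketch (the wrapper name, the find_lmb_area()
signature and the -1ULL failure convention are all assumed here):

u64 __init lmb_alloc_area(u64 size, u64 align)
{
	u64 addr = find_lmb_area(0, get_max_mapped(), size, align);

	if (addr == -1ULL)
		return 0;
	reserve_lmb(addr, addr + size, "ALLOC");
	return addr;
}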

Thanks

Yinghai Lu

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 22:32         ` Michael Ellerman
@ 2010-03-29 22:41           ` Yinghai Lu
  2010-03-29 23:33           ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 22:41 UTC (permalink / raw)
  To: michael
  Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Andrew Morton,
	David Miller, Benjamin Herrenschmidt, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 03:32 PM, Michael Ellerman wrote:
>>
>> I want to keep the effects on other lmb users to a minimum at first.
> 
> That's a good plan, but I don't think this is the nicest way to do it.
> 
Too much to be tested, so it is better to keep the impact on current lmb users small.
>> and we can merge those functions later.
>>
>> or do you insist on merging them in this patchset?
> 
> No I don't insist.
> 
> I _suggest_ that if we want to avoid affecting existing lmb users, then
> the checking logic should go into the existing API, but be #ifdef'ed for
> now - eg. CONFIG_DYNAMIC_LMB or something. That way you avoid affecting
> existing users (more or less), but you don't add a new API that you then
> have to remove later.
> 
> Having said that I don't think it really does affect existing users that
> much. We still have the statically defined region arrays, and they're
> still the same size, so sparc and powerpc should never need to resize,
> except on machines where we currently run out of space in the array
> anyway.

The dynamic region array for lmb.memory and lmb.reserved is actually only needed when we plan to use
lmb to replace the bootmem code.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 22:17       ` Yinghai Lu
  2010-03-29 22:32         ` Michael Ellerman
@ 2010-03-29 23:29         ` Benjamin Herrenschmidt
  2010-03-29 23:47           ` Yinghai Lu
  1 sibling, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-29 23:29 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 15:17 -0700, Yinghai Lu wrote:
> > That was the point of my other mail. We now have two lmb APIs, one
> which
> > checks if the array will overflow and one which doesn't. That seems
> like
> > a bad idea. Having one called lmb_free() and one called free_lmb()
> is
> > definitely a bad idea, because it's completely non obvious which one
> > caters for overflow.
> 
> I want to keep the effects on other lmb users to a minimum at first.
> 
> and we can merge those functions later.
> 
> or do you insist on merging them in this patchset?

As a separate patch sure, but you should really separate the patch
series that changes LMB from the one that moves x86 to it imho. It would
make things much clearer.

It would also allow you to spend some time properly -documenting- why
you need to change LMB the way you do, since it's non obvious for those
not familiar with x86 needs. I'm not objecting to the changes, I'm just
asking for much better documentation as to why they are needed and what
function they provide.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 22:20       ` Michael Ellerman
  2010-03-29 22:37         ` Yinghai Lu
@ 2010-03-29 23:31         ` Benjamin Herrenschmidt
  2010-03-30  0:03           ` Yinghai Lu
  1 sibling, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-29 23:31 UTC (permalink / raw)
  To: michael
  Cc: Yinghai Lu, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Tue, 2010-03-30 at 09:20 +1100, Michael Ellerman wrote:
> 
> But that's my point. You shouldn't need to touch the existing API, and
> you shouldn't need to add a new parallel API. You should just be able to
> add the logic for doubling the array in the lmb core, and then everyone
> gets dynamically expandable lmb. I don't see any reason why we want to
> have two APIs.

Ack.

> > > It seems to me that rather than adding these "special" routines that
> > > check for enough space on the way in, instead you should be checking in
> > > lmb_add_region() - which is where AFAICS all allocs/frees/reserves
> > > eventually end up if they need to insert a new region.
> > 
> > later i prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.
> 
> Why? The existing code has been working for years and is well tested? 

I still don't totally understand why he needs a find_lmb_area()
anyways. 

It might be justified ... or not. I just want it to be better
documented.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 22:32         ` Michael Ellerman
  2010-03-29 22:41           ` Yinghai Lu
@ 2010-03-29 23:33           ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-29 23:33 UTC (permalink / raw)
  To: michael
  Cc: Yinghai Lu, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Tue, 2010-03-30 at 09:32 +1100, Michael Ellerman wrote:
> I _suggest_ that if we want to avoid affecting existing lmb users, then
> the checking logic should go into the existing API, but be #ifdef'ed for
> now - eg. CONFIG_DYNAMIC_LMB or something. That way you avoid affecting
> existing users (more or less), but you don't add a new API that you then
> have to remove later.
> 
> Having said that I don't think it really does affect existing users that
> much. We still have the statically defined region arrays, and they're
> still the same size, so sparc and powerpc should never need to resize,
> except on machines where we currently run out of space in the array
> anyway. 

We'll want the dynamic sizing if we switch bootmem to lmb (though last I
looked, that still needs sparsemem to be fixed as well, since it still uses
bootmem). There's also some interest on the ARM side, I've heard.

I still have my evil plan to turn it into lists anyways, but we'll see
how that goes later. In any case, the doubling ability does the job for
now and yes, it should definitely be part of the core.

LMB should be able to use its own storage to do the doubling. I.e.
always guarantee that you have at least 1 or 2 free entries in the
table; if you're going past that threshold, then use one of the
remaining entries to allocate a new table. Easy :-)

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 22:37         ` Yinghai Lu
@ 2010-03-29 23:34           ` Benjamin Herrenschmidt
  2010-03-29 23:53             ` Yinghai Lu
  0 siblings, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-29 23:34 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 15:37 -0700, Yinghai Lu wrote:
> > We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
> > code calls when it knows what the alloc limit is (and can call multiple
> > times during boot). Or maybe it should be called "default_alloc_limit",
> > but that's getting a bit long winded.
> 
> ok, I will use get_max_mapped() for now, and will change to a new field later

No. Do it now. get_max_mapped() sucks as an identifier.

Cheers,
Ben.

 


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH -v9 00/31] use lmb with x86
  2010-03-29 23:29         ` Benjamin Herrenschmidt
@ 2010-03-29 23:47           ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 23:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On 03/29/2010 04:29 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2010-03-29 at 15:17 -0700, Yinghai Lu wrote:
>>> That was the point of my other mail. We now have two lmb APIs, one
>> which
>>> checks if the array will overflow and one which doesn't. That seems
>> like
>>> a bad idea. Having one called lmb_free() and one called free_lmb()
>> is
>>> definitely a bad idea, because it's completely non obvious which one
>>> caters for overflow.
>>
>> I want to keep the effects on other lmb users to a minimum at first.
>>
>> and we can merge those functions later.
>>
>> or do you insist on merging them in this patchset?
> 
> As a separate patch sure, but you should really separate the patch
> series that changes LMB from the one that moves x86 to it imho. It would
> make things much clearer.

Should those patches go through tip/x86?

Please check only the patches that have "lmb:" in the subject.

> 
> It would also allow you to spend some time properly -documenting- why
> you need to change LMB the way you do, since it's non obvious for those
> not familiar with x86 needs. I'm not objecting to the changes, I'm just
> asking for much better documentation as to why they are needed and what
> function they provide.

Thanks, I will try to write more detailed changelogs for the next revision.

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 23:34           ` Benjamin Herrenschmidt
@ 2010-03-29 23:53             ` Yinghai Lu
  2010-03-30  4:13               ` Michael Ellerman
  2010-03-30  5:24               ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-29 23:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On 03/29/2010 04:34 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2010-03-29 at 15:37 -0700, Yinghai Lu wrote:
>>> We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
>>> code calls when it knows what the alloc limit is (and can call multiple
>>> times during boot). Or maybe it should be called "default_alloc_limit",
>>> but that's getting a bit long winded.
>>
>> ok, I will use get_max_mapped() for now, and will change to a new field later
> 
> No. Do it now. get_max_mapped() sucks as an identifier.
> 

ok, can I reuse rmo_size, or introduce a new member in struct lmb?

default_limit?

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 23:31         ` Benjamin Herrenschmidt
@ 2010-03-30  0:03           ` Yinghai Lu
  2010-03-30  5:26             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 106+ messages in thread
From: Yinghai Lu @ 2010-03-30  0:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On 03/29/2010 04:31 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2010-03-30 at 09:20 +1100, Michael Ellerman wrote:
>>
>> But that's my point. You shouldn't need to touch the existing API, and
>> you shouldn't need to add a new parallel API. You should just be able to
>> add the logic for doubling the array in the lmb core, and then everyone
>> gets dynamically expandable lmb. I don't see any reason why we want to
>> have two APIs.
> 
> Ack.
ok, we can merge them later.
> 
>>>> It seems to me that rather than adding these "special" routines that
>>>> check for enough space on the way in, instead you should be checking in
>>>> lmb_add_region() - which is where AFAICS all allocs/frees/reserves
>>>> eventually end up if they need to insert a new region.
>>>
>>> later I prefer to replace lmb_alloc with find_lmb_area + reserve_lmb.
>>
>> Why? The existing code has been working for years and is well tested.
> 
> I still don't totally understand why he needs a find_lmb_area()
> anyways. 
> 
> It might be justified ... or not. I just want it to be better
> documented.


current changelog for that

------------------

Subject: [PATCH 6/31] lmb: Add lmb_find_area()

It will try to find an area matching size/align in the specified range (start, end).

We need it to find a proper buffer for the new lmb.reserved.region array.

It also makes it easier for x86 to use lmb:
x86 early_res uses a find/reserve pattern instead of alloc.

lmb_find_area() will honor the goal.

When we need a temporary buffer, e.g. a range array for range subtraction work,
using lmb_alloc() would force us to add fixup code for that buffer afterwards,
because it would already be in lmb.reserved.

----------------

in short: it lets us avoid handing out the very range that we are about
to reserve, when we look for a new position for the lmb.reserved.region
array.
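
For illustration, the find/reserve pattern could look like the sketch
below (signature inferred from the description above; example_reserve(),
the range arguments and the "EXAMPLE" name are placeholders):

/*
 * Pure search: find @size bytes aligned to @align inside [@start, @end).
 * Nothing gets reserved; returns -1ULL if there is no fit.
 */
u64 lmb_find_area(u64 start, u64 end, u64 size, u64 align);

/* x86 early_res style usage: search first, then reserve explicitly */
static u64 __init example_reserve(u64 start, u64 end, u64 size)
{
	u64 addr = lmb_find_area(start, end, size, PAGE_SIZE);

	if (addr == -1ULL)
		panic("cannot find area");
	reserve_lmb(addr, addr + size, "EXAMPLE");
	return addr;
}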

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 23:53             ` Yinghai Lu
@ 2010-03-30  4:13               ` Michael Ellerman
  2010-03-30  4:21                 ` Yinghai Lu
  2010-03-30  5:24               ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 106+ messages in thread
From: Michael Ellerman @ 2010-03-30  4:13 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Benjamin Herrenschmidt, Ingo Molnar, Thomas Gleixner,
	H. Peter Anvin, Andrew Morton, David Miller, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On Mon, 2010-03-29 at 16:53 -0700, Yinghai Lu wrote:
> On 03/29/2010 04:34 PM, Benjamin Herrenschmidt wrote:
> > On Mon, 2010-03-29 at 15:37 -0700, Yinghai Lu wrote:
> >>> We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
> >>> code calls when it knows what the alloc limit is (and can call multiple
> >>> times during boot). Or maybe it should be called "default_alloc_limit",
> >>> but that's getting a bit long winded.
> >>
> >> ok, I will use get_max_mapped() for now, and will change to a new field later
> > 
> > No. Do it now. get_max_mapped() sucks as an identifier.
> > 
> 
> ok, can I reuse rmo_size, or introduce a new member in struct lmb?
> 
> default_limit?

alloc_limit or default_alloc_limit

cheers

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  4:13               ` Michael Ellerman
@ 2010-03-30  4:21                 ` Yinghai Lu
  2010-03-30  5:29                   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 106+ messages in thread
From: Yinghai Lu @ 2010-03-30  4:21 UTC (permalink / raw)
  To: michael
  Cc: Benjamin Herrenschmidt, Ingo Molnar, Thomas Gleixner,
	H. Peter Anvin, Andrew Morton, David Miller, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 09:13 PM, Michael Ellerman wrote:
> On Mon, 2010-03-29 at 16:53 -0700, Yinghai Lu wrote:
>> On 03/29/2010 04:34 PM, Benjamin Herrenschmidt wrote:
>>> On Mon, 2010-03-29 at 15:37 -0700, Yinghai Lu wrote:
>>>>> We just need a lmb.alloc_limit and a lmb_set_alloc_limit() which arch
>>>>> code calls when it knows what the alloc limit is (and can call multiple
>>>>> times during boot). Or maybe it should be called "default_alloc_limit",
>>>>> but that's getting a bit long winded.
>>>>
>>>> ok, I will use get_max_mapped() for now, and will change to a new field later
>>>
>>> No. Do it now. get_max_mapped() sucks as an identifier.
>>>
>>
>> ok, can I reuse rmo_size, or introduce a new member in struct lmb?
>>
>> default_limit?
> 
> alloc_limit or default_alloc_limit
> 
That does not look accurate.

If someone wants to find some area but is not going to access that range, we should still let them alloc it.

how about access_limit?

Thanks

Yinghai Lu

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-29 23:53             ` Yinghai Lu
  2010-03-30  4:13               ` Michael Ellerman
@ 2010-03-30  5:24               ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-30  5:24 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 16:53 -0700, Yinghai Lu wrote:
> 
> ok, can I reuse rmo_size, or introduce a new member in struct lmb?
> 
> default_limit? 

lmb.default_limit sounds good to me.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  0:03           ` Yinghai Lu
@ 2010-03-30  5:26             ` Benjamin Herrenschmidt
  2010-03-30  6:12               ` Yinghai Lu
  0 siblings, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-30  5:26 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 17:03 -0700, Yinghai Lu wrote:
> 
> in short: it lets us avoid handing out the very range that we are about
> to reserve, when we look for a new position for the lmb.reserved.region
> array.

I'm not too sure I follow you. For the resizing, I would just basically
call a low-level variant of alloc (__lmb_alloc ?) that explicitly
doesn't honor the total-2 "reserved" entries in the array.

Ie. It should all be one single find/allocation function.

In fact, you want to split lmb_find from lmb_reserve, then just make
lmb_alloc use the above, I don't want 2 implementations of the same
thing (maybe call it __lmb_find to expose the fact that it's a low level
function to avoid for normal use).

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  4:21                 ` Yinghai Lu
@ 2010-03-30  5:29                   ` Benjamin Herrenschmidt
  2010-03-30  5:40                     ` Yinghai Lu
  0 siblings, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-30  5:29 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 21:21 -0700, Yinghai Lu wrote:
> 
> If someone wants to find some area but is not going to access that range,
> then we should still let them alloc it.
> 
> how about access_limit? 

No, no you don't get it. default_alloc_limit is fine. It's the -default-
limit. It doesn't apply to lmb_alloc_base() that takes an explicit
limit...

Something along those lines.
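
i.e. roughly this sketch, built around the proposed field (nothing
final):

/* arch code calls this whenever it learns the mapped limit */
void lmb_set_default_limit(u64 limit)
{
	lmb.default_limit = limit;
}

/* plain lmb_alloc() then picks up the default implicitly ... */
u64 lmb_alloc(u64 size, u64 align)
{
	return lmb_alloc_base(size, align, lmb.default_limit);
}

/*
 * ... while lmb_alloc_base(size, align, max_addr) keeps taking an
 * explicit limit and is unaffected by the default.
 */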

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  5:29                   ` Benjamin Herrenschmidt
@ 2010-03-30  5:40                     ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-30  5:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On 03/29/2010 10:29 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2010-03-29 at 21:21 -0700, Yinghai Lu wrote:
>>
>> If someone wants to find some area but is not going to access that range,
>> then we should still let them alloc it.
>>
>> how about access_limit? 
> 
> No, no you don't get it. default_alloc_limit is fine. It's the -default-
> limit. It doesn't apply to lmb_alloc_base() that takes an explicit
> limit...
> 
fine, will use default_limit.

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  5:26             ` Benjamin Herrenschmidt
@ 2010-03-30  6:12               ` Yinghai Lu
  2010-03-30  6:46                 ` Michael Ellerman
  2010-03-30 21:30                 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-30  6:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On 03/29/2010 10:26 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2010-03-29 at 17:03 -0700, Yinghai Lu wrote:
>>
>> in short: it lets us avoid handing out the very range that we are about
>> to reserve, when we look for a new position for the lmb.reserved.region
>> array.
> 
> I'm not too sure I follow you. For the resizing, I would just basically
> call a low-level variant of alloc (__lmb_alloc ?) that explicitly
> doesn't honor the total-2 "reserved" entries in the array.

1. you want to reserve rangeA
2. before that, it will check if the region array is big enough,
3. if the region array is not big enough, it will call lmb_alloc to get a new range.
   lmb_alloc could return rangeB, which overlaps rangeA
4. current lmb_alloc only honors the limit, and doesn't honor a low limit.

another usage is: for temporary buffer, like range array for subtraction.
we don't need to do free later.

> 
> Ie. It should all be one single find/allocation function.
> 
> In fact, you want to split lmb_find from lmb_reserve, then just make
> lmb_alloc use the above, I don't want 2 implementations of the same
> thing (maybe call it __lmb_find to expose the fact that it's a low level
> function to avoid for normal use).

there is some difference between them, and lmb_alloc doesn't honor a low limit.

we can provide
lmb_find_area
lmb_reserve_area
lmb_free_area

and use lmb_find_area + lmb_reserve_area to get one lmb_alloc()

x86 sometimes uses find_lmb_area to find a big area, then uses that area and later reserves the actually used part.

you could use lmb_alloc, and later lmb_free the unused part; that is equal to lmb_find + lmb_reserve + lmb_free ...
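
In code, the two patterns side by side (sketch; use_area() and the
sizes are placeholders):

static void __init compare_patterns(u64 start, u64 end, u64 big,
				    u64 align, u64 used)
{
	u64 a, b;

	/* pattern A: alloc a big area, free the unused tail afterwards */
	a = lmb_alloc(big, align);
	use_area(a, used);			/* used <= big */
	lmb_free(a + used, big - used);

	/* pattern B, x86 style: find, use, then reserve only what was used */
	b = lmb_find_area(start, end, big, align);
	use_area(b, used);
	reserve_lmb(b, b + used, "used part");
}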

Thanks

Yinghai Lu

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  6:12               ` Yinghai Lu
@ 2010-03-30  6:46                 ` Michael Ellerman
  2010-03-30  6:57                   ` Yinghai Lu
  2010-03-30 21:30                 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 106+ messages in thread
From: Michael Ellerman @ 2010-03-30  6:46 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Benjamin Herrenschmidt, Ingo Molnar, Thomas Gleixner,
	H. Peter Anvin, Andrew Morton, David Miller, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On Mon, 2010-03-29 at 23:12 -0700, Yinghai Lu wrote:
> On 03/29/2010 10:26 PM, Benjamin Herrenschmidt wrote:
> > On Mon, 2010-03-29 at 17:03 -0700, Yinghai Lu wrote:
> >>
> >> in short: it lets us avoid handing out the very range that we are about
> >> to reserve, when we look for a new position for the lmb.reserved.region
> >> array.
> > 
> > I'm not too sure I follow you. For the resizing, I would just basically
> > call a low-level variant of alloc (__lmb_alloc ?) that explicitly
> > doesn't honor the total-2 "reserved" entries in the array.
> 
> 1. you want to reserve rangeA
> 2. before that, it will check if the region array is big enough,
> 3. if the region array is not big enough, it will call lmb_alloc to get a new range.
>    lmb_alloc could return rangeB, which overlaps rangeA

So instead you do it the other way.

1. you want to reserve rangeA
2. you reserve rangeA
3. if reserving rangeA consumed a slot in the array then you check if
you have at least two free slots. If not you realloc. You don't need any
special tricks because you have space to lmb_alloc() a new area and move
everything over.
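
Something like the sketch below; the cnt/max pair and
lmb_double_array() are made-up names for the dynamic-array version:

long lmb_reserve(u64 base, u64 size)
{
	long ret = lmb_add_region(&lmb.reserved, base, size);

	/*
	 * Grow *after* the insert: with two slots of slack the insert
	 * above cannot overflow, and the lmb_alloc() inside
	 * lmb_double_array() still has a free slot left to record the
	 * new table's own reservation.
	 */
	if (lmb.reserved.cnt + 2 > lmb.reserved.max)
		lmb_double_array(&lmb.reserved);

	return ret;
}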

cheers

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  6:46                 ` Michael Ellerman
@ 2010-03-30  6:57                   ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-30  6:57 UTC (permalink / raw)
  To: michael
  Cc: Benjamin Herrenschmidt, Ingo Molnar, Thomas Gleixner,
	H. Peter Anvin, Andrew Morton, David Miller, Linus Torvalds,
	Johannes Weiner, linux-kernel, linux-arch

On 03/29/2010 11:46 PM, Michael Ellerman wrote:
> On Mon, 2010-03-29 at 23:12 -0700, Yinghai Lu wrote:
>> On 03/29/2010 10:26 PM, Benjamin Herrenschmidt wrote:
>>> On Mon, 2010-03-29 at 17:03 -0700, Yinghai Lu wrote:
>>>>
>>>> in short: it lets us avoid handing out the very range that we are about
>>>> to reserve, when we look for a new position for the lmb.reserved.region
>>>> array.
>>>
>>> I'm not too sure I follow you. For the resizing, I would just basically
>>> call a low-level variant of alloc (__lmb_alloc ?) that explicitly
>>> doesn't honor the total-2 "reserved" entries in the array.
>>
>> 1. you want to reserve rangeA
>> 2. before that, it will check if the region array is big enough,
>> 3. if the region array is not big enough, it will call lmb_alloc to get a new range.
>>    lmb_alloc could return rangeB, which overlaps rangeA
> 
> So instead you do it the other way.
> 
> 1. you want to reserve rangeA
> 2. you reserve rangeA
> 3. if reserving rangeA consumed a slot in the array then you check if
> you have at least two free slots. If not you realloc. You don't need any
> special tricks because you have space to lmb_alloc() a new area and move
> everything over.

So that checks it later; that should work.

One less find_lmb_area user.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30  6:12               ` Yinghai Lu
  2010-03-30  6:46                 ` Michael Ellerman
@ 2010-03-30 21:30                 ` Benjamin Herrenschmidt
  2010-03-30 22:42                   ` Yinghai Lu
  1 sibling, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-30 21:30 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On Mon, 2010-03-29 at 23:12 -0700, Yinghai Lu wrote:
> 
> 1. you want to reserve rangeA
> 2. before that, it will check if the region array is big enough,
> 3. if the region array is not big enough, it will call lmb_alloc to get a new range.
>    lmb_alloc could return rangeB, which overlaps rangeA
> 4. current lmb_alloc only honors the limit, and doesn't honor a low limit.

I see. This is a direct consequence of you wanting to use find/reserve
instead of alloc tho :-)

This is also easily fixed. Instead of doing the resize "on demand" like
I originally proposed, do it at the end of reserve/alloc. If the number
of free entries is down to 1 or 0, then alloc a new chunk.

Of course, all of that requires that reservations coming from FW (taking
memory out because it must not be accessed) all be done before the first
grow of the array, and so the static array must be sized accordingly. We
may want to catch these things too. We don't want to warn on multiple
overlapping lmb_reserve() on powerpc though...
 
> another usage is: for temporary buffer, like range array for
> subtraction. we don't need to do free later.

Sorry, doesn't parse.

> > Ie. It should all be one single find/allocation function.
> > 
> > In fact, you want to split lmb_find from lmb_reserve, then just make
> > lmb_alloc use the above, I don't want 2 implementations of the same
> > thing (maybe call it __lmb_find to expose the fact that it's a low
> level
> > function to avoid for normal use).
> 
> there is some difference between them, and lmb_alloc doesn't honor a low
> limit.

If you want a low and a high limit, then add the low limit to
lmb_alloc_base(), it's easy to fix all callers, there aren't many, and
make not one but two defaults for lmb_alloc(), one for the low limit and
one for the high limit. Problem solved.
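
Spelled out as a sketch (the low parameter and the default_low_limit
field are the new, hypothetical parts):

/* explicit bounds: allocate within [low, high) */
u64 lmb_alloc_base(u64 size, u64 align, u64 low, u64 high);

/* the common case picks up both defaults */
static inline u64 lmb_alloc(u64 size, u64 align)
{
	return lmb_alloc_base(size, align,
			      lmb.default_low_limit,
			      lmb.default_limit);
}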

> we can provide
> lmb_find_area
> lmb_reserve_area
> lmb_free_area
> 
> and use lmb_find_area + lmb_reserve_area to get one lmb_alloc()

I still don't understand why you insist on using find + reserve instead
of alloc in x86 land. Can you give me a proper explanation as to why
that is needed, since it seems to be causing problems, and so far I
don't see what it solves.

> x86 sometimes uses find_lmb_area to find a big area, then uses that
> area and later reserves the actually used part.

That's very wrong. If you use something, alloc/reserve it. You can
always free it later.

> you could use lmb_alloc, and later lmb_free the unused part; that is
> equal to lmb_find + lmb_reserve + lmb_free ...

Sure, then just do alloc + later free.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/31] lmb: Add reserve_lmb/free_lmb
  2010-03-30 21:30                 ` Benjamin Herrenschmidt
@ 2010-03-30 22:42                   ` Yinghai Lu
  0 siblings, 0 replies; 106+ messages in thread
From: Yinghai Lu @ 2010-03-30 22:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: michael, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
	Andrew Morton, David Miller, Linus Torvalds, Johannes Weiner,
	linux-kernel, linux-arch

On 03/30/2010 02:30 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2010-03-29 at 23:12 -0700, Yinghai Lu wrote:
>>
>> 1. you want to reserve rangeA
>> 2. before that, it will check if the region array is big enough,
>> 3. if the region array is not big enough, it will call lmb_alloc to get a new range.
>>    lmb_alloc could return rangeB, which overlaps rangeA
>> 4. current lmb_alloc only honors the limit, and doesn't honor a low limit.
> 
> I see. This is a direct consequence of you wanting to use find/reserve
> instead of alloc tho :-)
> 
> This is also easily fixed. Instead of doing the resize "on demand" like
> I originally proposed, do it at the end of reserve/alloc. If the number
> of free entries is down to 1 or 0, then alloc a new chunk.

Already done; Michael pointed that out last night.

> 
> Of course, all of that requires that reservations done from FW to take
> memory out because it must not be accessed need to be all done before
> the first grow of the array, and so the static array must be sized
> accordingly. We may want to catch these things too. We don't want to
> warn on multiple overlapping lmb_reserve() tho on powerpc...
>  
>> another usage is: for temporary buffer, like range array for
>> subtraction. we don't need to do free later.
> 
> Sorry, doesn't parse.
never mind.
> 
>>> Ie. It should all be one single find/allocation function.
>>>
>>> In fact, you want to split lmb_find from lmb_reserve, then just make
>>> lmb_alloc use the above, I don't want 2 implementations of the same
>>> thing (maybe call it __lmb_find to expose the fact that it's a low
>> level
>>> function to avoid for normal use).
>>
>> there is some difference between them, and lmb_alloc doesn't honor a low
>> limit.
> 
> If you want a low and a high limit, then add the low limit to
> lmb_alloc_base(), it's easy to fix all callers, there aren't many, and
> make not one but two defaults for lmb_alloc(), one for the low limit and
> one for the high limit. Problem solved.

I made 32-bit x86 work from high to low now.

> 
>> we can provide
>> lmb_find_area
>> lmb_reserve_area
>> lmb_free_area
>>
>> and use lmb_find_area + lmb_reserve_area to get one lmb_alloc()
> 
> I still don't understand why you insist on using find + reserve instead
> of alloc in x86 land. Can you give me a proper explanation as to why
> that is needed, since it seems to be causing problems, and so far I
> don't see what it solves.
> 
>> x86 sometimes uses find_lmb_area to find a big area, then uses that
>> area and later reserves the actually used part.
> 
> That's very wrong. If you use something, alloc/reserve it. You can
> always free it later.
> 
>> you could use lmb_alloc, and later lmb_free the unused part; that is
>> equal to lmb_find + lmb_reserve + lmb_free ...
> 
> Sure, then just do alloc + later free.

ok, I will replace that later after it gets stable,
but at first I will expose find_lmb_area.

I will send out -v11; please check that version.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread

Thread overview: 106+ messages
2010-03-29  2:42 [PATCH -v9 00/31] use lmb with x86 Yinghai Lu
2010-03-29  2:42 ` Yinghai Lu
2010-03-29  2:42 ` [PATCH 01/31] x86: Make smp_locks end with page alignment Yinghai Lu
2010-03-29  2:42   ` Yinghai Lu
2010-03-29 18:42   ` [tip:x86/urgent] " tip-bot for Yinghai Lu
2010-03-29  2:42 ` [PATCH 02/31] x86: Make sure free_init_pages() free pages in boundary Yinghai Lu
2010-03-29  2:42   ` Yinghai Lu
2010-03-29 16:57   ` Ingo Molnar
2010-03-29 16:59     ` Yinghai Lu
2010-03-29 18:42   ` [tip:x86/urgent] x86: Make sure free_init_pages() frees pages on page boundary tip-bot for Yinghai Lu
2010-03-29  2:42 ` [PATCH 03/31] x86: Do not free zero sized per cpu areas Yinghai Lu
2010-03-29  2:42   ` Yinghai Lu
2010-03-29  2:42   ` Yinghai Lu
2010-03-29 18:43   ` [tip:x86/urgent] " tip-bot for Ian Campbell
2010-03-29  2:42 ` [PATCH 04/31] lmb: Move lmb.c to mm/ Yinghai Lu
2010-03-29  2:42   ` Yinghai Lu
2010-03-29  2:42 ` [PATCH 05/31] lmb: Separate region array from lmb_region struct Yinghai Lu
2010-03-29  2:42   ` Yinghai Lu
2010-03-29  2:42 ` [PATCH 06/31] lmb: Add find_lmb_area() Yinghai Lu
2010-03-29  2:42   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 07/31] lmb: Add reserve_lmb/free_lmb Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29 12:22   ` Michael Ellerman
2010-03-29 16:45     ` Yinghai Lu
2010-03-29 22:20       ` Michael Ellerman
2010-03-29 22:37         ` Yinghai Lu
2010-03-29 23:34           ` Benjamin Herrenschmidt
2010-03-29 23:53             ` Yinghai Lu
2010-03-30  4:13               ` Michael Ellerman
2010-03-30  4:21                 ` Yinghai Lu
2010-03-30  5:29                   ` Benjamin Herrenschmidt
2010-03-30  5:40                     ` Yinghai Lu
2010-03-30  5:24               ` Benjamin Herrenschmidt
2010-03-29 23:31         ` Benjamin Herrenschmidt
2010-03-30  0:03           ` Yinghai Lu
2010-03-30  5:26             ` Benjamin Herrenschmidt
2010-03-30  6:12               ` Yinghai Lu
2010-03-30  6:46                 ` Michael Ellerman
2010-03-30  6:57                   ` Yinghai Lu
2010-03-30 21:30                 ` Benjamin Herrenschmidt
2010-03-30 22:42                   ` Yinghai Lu
2010-03-29 21:49     ` Benjamin Herrenschmidt
2010-03-29  2:43 ` [PATCH 08/31] lmb: Add find_lmb_area_size() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 09/31] bootmem, x86: Add weak version of reserve_bootmem_generic Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 10/31] lmb: Add lmb_to_bootmem() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 11/31] lmb: Add get_free_all_memory_range() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 12/31] lmb: Add lmb_register_active_regions() and lmb_hole_size() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 13/31] lmb: Prepare to include linux/lmb.h in core file Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 14/31] lmb: Add find_memory_core_early() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 15/31] lmb: Add find_lmb_area_node() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 16/31] lmb: Add lmb_free_memory_size() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 17/31] lmb: Add lmb_memory_size() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 18/31] lmb: Add reserve_lmb_overlap_ok() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 19/31] lmb: Use lmb_debug to control debug message print out Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 20/31] lmb: Add __NOT_KEEP_LMB to put lmb code to .init Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29 12:07   ` Michael Ellerman
2010-03-29 16:20     ` Yinghai Lu
2010-03-29 18:34       ` David Miller
2010-03-29 18:39         ` Yinghai Lu
2010-03-29 19:11           ` David Miller
2010-03-29 21:44             ` Benjamin Herrenschmidt
2010-03-29  2:43 ` [PATCH 21/31] x86: Add sanitize_e820_map() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 22/31] x86: Use lmb to replace early_res Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 23/31] x86: Replace e820_/_early string with lmb_ Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 24/31] x86: Remove not used early_res code Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 25/31] x86, lmb: Use lmb_memory_size()/lmb_free_memory_size() to get correct dma_reserve Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 26/31] x86: Align e820 ram range to page Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 27/31] x86: Use walk_system_ram_range instead of e820_any_mapped in agp path Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 28/31] x86: Add get_centaur_ram_top() Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 29/31] x86: Make e820_any_mapped to __init Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 30/31] x86: Use walk_system_ram_range() instead of e820.map directly Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29  2:43 ` [PATCH 31/31] x86: make e820 to be __initdata Yinghai Lu
2010-03-29  2:43   ` Yinghai Lu
2010-03-29 12:22 ` [PATCH -v9 00/31] use lmb with x86 Michael Ellerman
2010-03-29 16:52   ` Yinghai Lu
2010-03-29 20:39     ` Yinghai Lu
2010-03-29 22:10     ` Michael Ellerman
2010-03-29 22:17       ` Yinghai Lu
2010-03-29 22:32         ` Michael Ellerman
2010-03-29 22:41           ` Yinghai Lu
2010-03-29 23:33           ` Benjamin Herrenschmidt
2010-03-29 23:29         ` Benjamin Herrenschmidt
2010-03-29 23:47           ` Yinghai Lu
