linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Patch 000/005](memory hotplug) freeing pages allocated by bootmem for hotremove v3.
@ 2008-04-07 12:43 Yasunori Goto
  2008-04-07 12:45 ` [Patch 001/005](memory hotplug) register section/node id to free Yasunori Goto
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Yasunori Goto @ 2008-04-07 12:43 UTC (permalink / raw)
  To: Badari Pulavarty, Andrew Morton
  Cc: Linux Kernel ML, linux-mm, Yinghai Lu, Yasunori Goto


Hello.

This is v3 of the patch set for freeing pages allocated by bootmem
for memory hot-remove.

Please apply.

---
This patch set frees pages which were allocated by bootmem for
memory hot-remove. Some memory management structures (e.g. memmap)
are allocated by bootmem, and to remove memory physically, some of
them must be freed according to circumstance.
This patch set lays the groundwork for freeing those pages, and
actually frees the memmaps.

My basic idea is to use the remaining members of struct page to
remember which bootmem user (section number or node id) owns each
page. When a section is being removed, the kernel can check this
information. With it, several issues can be solved:

  1) When the memmap of a removing section is allocated on another
     section by bootmem, it should and can be freed.
  2) When the memmap of a removing section is allocated on the
     same section, it must not be freed, because the section has
     already been logically offlined and all of its pages are isolated
     from the page allocator. If the memmap were freed, the page
     allocator might use pages which will soon be removed physically.
  3) When a removing section holds another section's memmap,
     the kernel will be able to show the user which section should
     be removed before it. (Not implemented yet.)
  4) In case 2) above, page isolation will be able to check for and
     skip the memmap's pages during logical memory offline
     (offline_pages()). The current page isolation code fails in this
     case because the pages are just reserved pages and it cannot
     distinguish whether they may be removed. This patch will make
     that possible. (Not implemented yet.)
  5) Node information such as pgdat has similar issues, which can be
     solved the same way. (Not implemented yet, but the node id is
     remembered in the pages.)

Fortunately, the current bootmem allocator just sets the PageReserved
flag and doesn't use any other members of struct page. The users of
bootmem don't use them either.
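
As a rough sketch (the registration half mirrors patch 001 below; the
read-back at the end is purely illustrative and not part of this set):

    /* registration: remember which user owns this bootmem page */
    atomic_set(&page->_mapcount, magic);   /* SECTION_INFO etc.    */
    SetPagePrivate(page);
    set_page_private(page, info);          /* section nr / node id */
    atomic_inc(&page->_count);             /* one more user        */

    /* hypothetical read-back during a hot-remove check: all the
     * magics are below the smallest normal mapcount, -1 */
    if (PagePrivate(page) && atomic_read(&page->_mapcount) < -1)
        info = page_private(page);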

This patch set is for 2.6.25-rc8-mm1.

Change log since v2.
  - Rebase onto 2.6.25-rc8-mm1.
  - Fix a panic at boot when CONFIG_SPARSEMEM_VMEMMAP is selected;
    the kernel returns EBUSY for physical removal in that case.
    (This should be removed once vmemmap removal is implemented.)
  - Reword unclear comments.

Change log since v1.
  - Allocate the usemap on the same section as pgdat. A usemap's page is
    hard to remove until the other sections are removed. This avoids a
    dependency problem between sections.
  - Add alloc_bootmem_section() for the above.
  - Fix compile errors for other configs.
  - Add user counting, so that it can be checked whether a page is used
    by some user.

Todo:
  - Support SPARSEMEM_VMEMMAP.
    Freeing vmemmap's pages is more difficult than for normal sparsemem,
    because not only the memmap's pages but also pages such as page
    tables must be removed. If a removing section holds pages for page
    tables, they must be migrated too; relocatable page tables are
    necessary. (Ross Biro-san is working on it.)
    http://marc.info/?l=linux-mm&m=120110502617654&w=2



Thanks.



-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Patch 001/005](memory hotplug) register section/node id to free
  2008-04-07 12:43 [Patch 000/005](memory hotplug) freeing pages allocated by bootmem for hotremove v3 Yasunori Goto
@ 2008-04-07 12:45 ` Yasunori Goto
  2008-06-16 10:21   ` Andy Whitcroft
  2008-04-07 12:46 ` [Patch 002/005](memory hotplug) align memmap to page size Yasunori Goto
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Yasunori Goto @ 2008-04-07 12:45 UTC (permalink / raw)
  To: Badari Pulavarty, Andrew Morton; +Cc: Linux Kernel ML, linux-mm, Yinghai Lu


This patch registers which node or section id owns each page
allocated by bootmem, so the kernel can distinguish which node/section
uses such pages. This is the basis for hot-removing sections or nodes.


Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>

---

 include/linux/memory_hotplug.h |   27 +++++++++++
 include/linux/mmzone.h         |    1 
 mm/bootmem.c                   |    1 
 mm/memory_hotplug.c            |   99 ++++++++++++++++++++++++++++++++++++++++-
 mm/sparse.c                    |    3 -
 5 files changed, 128 insertions(+), 3 deletions(-)

Index: current/mm/bootmem.c
===================================================================
--- current.orig/mm/bootmem.c	2008-04-07 16:06:49.000000000 +0900
+++ current/mm/bootmem.c	2008-04-07 20:08:14.000000000 +0900
@@ -458,6 +458,7 @@
 
 unsigned long __init free_all_bootmem_node(pg_data_t *pgdat)
 {
+	register_page_bootmem_info_node(pgdat);
 	return free_all_bootmem_core(pgdat);
 }
 
Index: current/include/linux/memory_hotplug.h
===================================================================
--- current.orig/include/linux/memory_hotplug.h	2008-04-07 16:06:49.000000000 +0900
+++ current/include/linux/memory_hotplug.h	2008-04-07 16:33:12.000000000 +0900
@@ -11,6 +11,15 @@
 struct mem_section;
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+
+/*
+ * Magic numbers for freeing bootmem.
+ * The normal smallest mapcount is -1; these are smaller values.
+ */
+#define SECTION_INFO		0xfffffffe
+#define MIX_INFO		0xfffffffd
+#define NODE_INFO		0xfffffffc
+
 /*
  * pgdat resizing functions
  */
@@ -145,6 +154,18 @@
 #endif /* CONFIG_NUMA */
 #endif /* CONFIG_HAVE_ARCH_NODEDATA_EXTENSION */
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
+{
+}
+static inline void put_page_bootmem(struct page *page)
+{
+}
+#else
+extern void register_page_bootmem_info_node(struct pglist_data *pgdat);
+extern void put_page_bootmem(struct page *page);
+#endif
+
 #else /* ! CONFIG_MEMORY_HOTPLUG */
 /*
  * Stub functions for when hotplug is off
@@ -172,6 +193,10 @@
 	return -ENOSYS;
 }
 
+static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
+{
+}
+
 #endif /* ! CONFIG_MEMORY_HOTPLUG */
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
@@ -192,5 +217,7 @@
 extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
 								int nr_pages);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
+extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
+					  unsigned long pnum);
 
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
Index: current/include/linux/mmzone.h
===================================================================
--- current.orig/include/linux/mmzone.h	2008-04-07 16:06:49.000000000 +0900
+++ current/include/linux/mmzone.h	2008-04-07 18:29:08.000000000 +0900
@@ -879,6 +879,7 @@
 	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
 }
 extern int __section_nr(struct mem_section* ms);
+extern unsigned long usemap_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
Index: current/mm/memory_hotplug.c
===================================================================
--- current.orig/mm/memory_hotplug.c	2008-04-07 16:06:49.000000000 +0900
+++ current/mm/memory_hotplug.c	2008-04-07 20:08:13.000000000 +0900
@@ -59,8 +59,105 @@
 	return;
 }
 
-
 #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifndef CONFIG_SPARSEMEM_VMEMMAP
+static void get_page_bootmem(unsigned long info,  struct page *page, int magic)
+{
+	atomic_set(&page->_mapcount, magic);
+	SetPagePrivate(page);
+	set_page_private(page, info);
+	atomic_inc(&page->_count);
+}
+
+void put_page_bootmem(struct page *page)
+{
+	int magic;
+
+	magic = atomic_read(&page->_mapcount);
+	BUG_ON(magic >= -1);
+
+	if (atomic_dec_return(&page->_count) == 1) {
+		ClearPagePrivate(page);
+		set_page_private(page, 0);
+		reset_page_mapcount(page);
+		__free_pages_bootmem(page, 0);
+	}
+
+}
+
+void register_page_bootmem_info_section(unsigned long start_pfn)
+{
+	unsigned long *usemap, mapsize, section_nr, i;
+	struct mem_section *ms;
+	struct page *page, *memmap;
+
+	if (!pfn_valid(start_pfn))
+		return;
+
+	section_nr = pfn_to_section_nr(start_pfn);
+	ms = __nr_to_section(section_nr);
+
+	/* Get section's memmap address */
+	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
+
+	/*
+	 * Get page for the memmap's phys address
+	 * XXX: need more consideration for sparse_vmemmap...
+	 */
+	page = virt_to_page(memmap);
+	mapsize = sizeof(struct page) * PAGES_PER_SECTION;
+	mapsize = PAGE_ALIGN(mapsize) >> PAGE_SHIFT;
+
+	/* remember memmap's page */
+	for (i = 0; i < mapsize; i++, page++)
+		get_page_bootmem(section_nr, page, SECTION_INFO);
+
+	usemap = __nr_to_section(section_nr)->pageblock_flags;
+	page = virt_to_page(usemap);
+
+	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+
+	for (i = 0; i < mapsize; i++, page++)
+		get_page_bootmem(section_nr, page, MIX_INFO);
+
+}
+
+void register_page_bootmem_info_node(struct pglist_data *pgdat)
+{
+	unsigned long i, pfn, end_pfn, nr_pages;
+	int node = pgdat->node_id;
+	struct page *page;
+	struct zone *zone;
+
+	nr_pages = PAGE_ALIGN(sizeof(struct pglist_data)) >> PAGE_SHIFT;
+	page = virt_to_page(pgdat);
+
+	for (i = 0; i < nr_pages; i++, page++)
+		get_page_bootmem(node, page, NODE_INFO);
+
+	zone = &pgdat->node_zones[0];
+	for (; zone < pgdat->node_zones + MAX_NR_ZONES - 1; zone++) {
+		if (zone->wait_table) {
+			nr_pages = zone->wait_table_hash_nr_entries
+				* sizeof(wait_queue_head_t);
+			nr_pages = PAGE_ALIGN(nr_pages) >> PAGE_SHIFT;
+			page = virt_to_page(zone->wait_table);
+
+			for (i = 0; i < nr_pages; i++, page++)
+				get_page_bootmem(node, page, NODE_INFO);
+		}
+	}
+
+	pfn = pgdat->node_start_pfn;
+	end_pfn = pfn + pgdat->node_spanned_pages;
+
+	/* register_section info */
+	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION)
+		register_page_bootmem_info_section(pfn);
+
+}
+#endif /* !CONFIG_SPARSEMEM_VMEMMAP */
+
 static int __add_zone(struct zone *zone, unsigned long phys_start_pfn)
 {
 	struct pglist_data *pgdat = zone->zone_pgdat;
Index: current/mm/sparse.c
===================================================================
--- current.orig/mm/sparse.c	2008-04-07 16:06:49.000000000 +0900
+++ current/mm/sparse.c	2008-04-07 20:08:16.000000000 +0900
@@ -200,7 +200,6 @@
 /*
  * Decode mem_map from the coded memmap
  */
-static
 struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
 {
 	/* mask off the extra low bits of information */
@@ -223,7 +222,7 @@
 	return 1;
 }
 
-static unsigned long usemap_size(void)
+unsigned long usemap_size(void)
 {
 	unsigned long size_bytes;
 	size_bytes = roundup(SECTION_BLOCKFLAGS_BITS, 8) / 8;

-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Patch 002/005](memory hotplug) align memmap to page size
  2008-04-07 12:43 [Patch 000/005](memory hotplug) freeing pages allocated by bootmem for hotremove v3 Yasunori Goto
  2008-04-07 12:45 ` [Patch 001/005](memory hotplug) register section/node id to free Yasunori Goto
@ 2008-04-07 12:46 ` Yasunori Goto
  2008-06-16 10:26   ` Andy Whitcroft
  2008-04-07 12:47 ` [Patch 003/005](memory hotplug) make alloc_bootmem_section() Yasunori Goto
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Yasunori Goto @ 2008-04-07 12:46 UTC (permalink / raw)
  To: Badari Pulavarty, Andrew Morton; +Cc: Linux Kernel ML, linux-mm, Yinghai Lu


To make the memmap easier to free, this patch aligns it to page size.
The bootmem allocator may mix several objects in one page, which is
bad for freeing the memmap on memory hot-remove.
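
For illustration only (the actual change is the one-hunk diff below):
alloc_bootmem_node() aligns only to SMP_CACHE_BYTES, so the memmap
could share its first and last pages with unrelated boot-time objects.
The replacement gives the memmap whole pages of its own:

    size = sizeof(struct page) * PAGES_PER_SECTION;
    map  = alloc_bootmem_pages_node(NODE_DATA(nid), PAGE_ALIGN(size));
    /* map is now page aligned and spans PAGE_ALIGN(size) bytes, so
     * no other bootmem object is packed into its pages and they can
     * be freed page by page on hot-remove. */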


Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>

---
 mm/sparse.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: current/mm/sparse.c
===================================================================
--- current.orig/mm/sparse.c	2008-04-07 19:18:50.000000000 +0900
+++ current/mm/sparse.c	2008-04-07 20:08:13.000000000 +0900
@@ -265,8 +265,8 @@
 	if (map)
 		return map;
 
-	map = alloc_bootmem_node(NODE_DATA(nid),
-			sizeof(struct page) * PAGES_PER_SECTION);
+	map = alloc_bootmem_pages_node(NODE_DATA(nid),
+		       PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION));
 	return map;
 }
 #endif /* !CONFIG_SPARSEMEM_VMEMMAP */

-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Patch 003/005](memory hotplug) make alloc_bootmem_section()
  2008-04-07 12:43 [Patch 000/005](memory hotplug) freeing pages allocated by bootmem for hotremove v3 Yasunori Goto
  2008-04-07 12:45 ` [Patch 001/005](memory hotplug) register section/node id to free Yasunori Goto
  2008-04-07 12:46 ` [Patch 002/005](memory hotplug) align memmap to page size Yasunori Goto
@ 2008-04-07 12:47 ` Yasunori Goto
  2008-06-16 10:32   ` Andy Whitcroft
  2008-04-07 12:48 ` [Patch 004/005](memory hotplug)allocate usemap on the section with pgdat Yasunori Goto
  2008-04-07 12:50 ` [Patch 005/005](memory hotplug) free memmaps allocated by bootmem Yasunori Goto
  4 siblings, 1 reply; 15+ messages in thread
From: Yasunori Goto @ 2008-04-07 12:47 UTC (permalink / raw)
  To: Badari Pulavarty, Andrew Morton; +Cc: Linux Kernel ML, linux-mm, Yinghai Lu


alloc_bootmem_section() allocates memory within a specified section.
A later patch uses it to keep the usemap on the same section as pgdat.
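
For context, patch 004 of this series calls it like this to place the
usemap on the section that holds pgdat:

    section_nr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
    usemap = alloc_bootmem_section(usemap_size(), section_nr);
    /* NULL means the allocation did not fit inside that section;
     * it must never silently land in a different one, which is why
     * the function below checks and frees a misplaced allocation. */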

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>

---
 include/linux/bootmem.h |    2 ++
 mm/bootmem.c            |   31 +++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

Index: current/include/linux/bootmem.h
===================================================================
--- current.orig/include/linux/bootmem.h	2008-04-07 19:18:44.000000000 +0900
+++ current/include/linux/bootmem.h	2008-04-07 19:30:08.000000000 +0900
@@ -101,6 +101,8 @@
 extern void free_bootmem_node(pg_data_t *pgdat,
 			      unsigned long addr,
 			      unsigned long size);
+extern void *alloc_bootmem_section(unsigned long size,
+				   unsigned long section_nr);
 
 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
 #define alloc_bootmem_node(pgdat, x) \
Index: current/mm/bootmem.c
===================================================================
--- current.orig/mm/bootmem.c	2008-04-07 19:18:44.000000000 +0900
+++ current/mm/bootmem.c	2008-04-07 19:30:08.000000000 +0900
@@ -540,6 +540,37 @@
 	return __alloc_bootmem(size, align, goal);
 }
 
+#ifdef CONFIG_SPARSEMEM
+void * __init alloc_bootmem_section(unsigned long size,
+				    unsigned long section_nr)
+{
+	void *ptr;
+	unsigned long limit, goal, start_nr, end_nr, pfn;
+	struct pglist_data *pgdat;
+
+	pfn = section_nr_to_pfn(section_nr);
+	goal = PFN_PHYS(pfn);
+	limit = PFN_PHYS(section_nr_to_pfn(section_nr + 1)) - 1;
+	pgdat = NODE_DATA(early_pfn_to_nid(pfn));
+	ptr = __alloc_bootmem_core(pgdat->bdata, size, SMP_CACHE_BYTES, goal,
+				   limit);
+
+	if (!ptr)
+		return NULL;
+
+	start_nr = pfn_to_section_nr(PFN_DOWN(__pa(ptr)));
+	end_nr = pfn_to_section_nr(PFN_DOWN(__pa(ptr) + size));
+	if (start_nr != section_nr || end_nr != section_nr) {
+		printk(KERN_WARNING "alloc_bootmem failed on section %ld.\n",
+		       section_nr);
+		free_bootmem_core(pgdat->bdata, __pa(ptr), size);
+		ptr = NULL;
+	}
+
+	return ptr;
+}
+#endif
+
 #ifndef ARCH_LOW_ADDRESS_LIMIT
 #define ARCH_LOW_ADDRESS_LIMIT	0xffffffffUL
 #endif

-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Patch 004/005](memory hotplug)allocate usemap on the section with pgdat
  2008-04-07 12:43 [Patch 000/005](memory hotplug) freeing pages allocated by bootmem for hotremove v3 Yasunori Goto
                   ` (2 preceding siblings ...)
  2008-04-07 12:47 ` [Patch 003/005](memory hotplug) make alloc_bootmem_section() Yasunori Goto
@ 2008-04-07 12:48 ` Yasunori Goto
  2008-04-07 12:50 ` [Patch 005/005](memory hotplug) free memmaps allocated by bootmem Yasunori Goto
  4 siblings, 0 replies; 15+ messages in thread
From: Yasunori Goto @ 2008-04-07 12:48 UTC (permalink / raw)
  To: Badari Pulavarty, Andrew Morton; +Cc: Linux Kernel ML, linux-mm, Yinghai Lu


With this patch, usemaps are allocated on the section which holds
pgdat.

Because a usemap is very small, the usemaps of many sections are
packed into a single page. A section holding usemaps can't be removed
until the sections using them are removed. This dependency is
undesirable for memory removal.

Pgdat has a similar property: a section holding the pgdat area must
be the last section removed on its node. So if section A holds pgdat
and section B holds section A's usemap, neither section can be
removed because each depends on the other.

To solve this, the patch collects all usemaps on the same section as
pgdat. If no other section depends on it, this section can finally be
removed.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>

---

 mm/sparse.c |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

Index: current/mm/sparse.c
===================================================================
--- current.orig/mm/sparse.c	2008-04-07 20:12:55.000000000 +0900
+++ current/mm/sparse.c	2008-04-07 20:13:15.000000000 +0900
@@ -239,11 +239,22 @@
 
 static unsigned long *__init sparse_early_usemap_alloc(unsigned long pnum)
 {
-	unsigned long *usemap;
+	unsigned long *usemap, section_nr;
 	struct mem_section *ms = __nr_to_section(pnum);
 	int nid = sparse_early_nid(ms);
+	struct pglist_data *pgdat = NODE_DATA(nid);
 
-	usemap = alloc_bootmem_node(NODE_DATA(nid), usemap_size());
+	/*
+	 * The usemap's page can't be freed until the sections using
+	 * it are freed, and pgdat has the same property.
+	 * If section A holds pgdat and section B holds usemaps for
+	 * other sections (including A), neither section can be removed,
+	 * because each depends on the other.
+	 * To solve this, collect all usemaps on the same section
+	 * which holds pgdat.
+	 */
+	section_nr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
+	usemap = alloc_bootmem_section(usemap_size(), section_nr);
 	printk(KERN_INFO "sparse_early_usemap_alloc: usemap = %p size = %ld\n",
 		usemap, usemap_size());
 	if (usemap)

-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Patch 005/005](memory hotplug) free memmaps allocated by bootmem
  2008-04-07 12:43 [Patch 000/005](memory hotplug) freeing pages allocated by bootmem for hotremove v3 Yasunori Goto
                   ` (3 preceding siblings ...)
  2008-04-07 12:48 ` [Patch 004/005](memory hotplug)allocate usemap on the section with pgdat Yasunori Goto
@ 2008-04-07 12:50 ` Yasunori Goto
  2008-06-16 10:44   ` Andy Whitcroft
  4 siblings, 1 reply; 15+ messages in thread
From: Yasunori Goto @ 2008-04-07 12:50 UTC (permalink / raw)
  To: Badari Pulavarty, Andrew Morton; +Cc: Linux Kernel ML, linux-mm, Yinghai Lu


This patch frees memmaps which were allocated by bootmem.

Freeing the usemap is not necessary; the usemap's pages may still be
needed by other sections.

If the removing section is the last section on the node, that section
is the final user of the usemap page (usemaps are allocated on that
section by the previous patch). Even then it must not be freed,
because the section is in the logically offlined state, with all its
pages isolated from the page allocator. If the page were freed, the
page allocator might hand it out although it will be removed
physically soon, which would be a disaster. So this patch keeps it as
it is.


Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>

---
 mm/internal.h       |    3 +--
 mm/memory_hotplug.c |   11 +++++++++++
 mm/page_alloc.c     |    2 +-
 mm/sparse.c         |   51 +++++++++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 60 insertions(+), 7 deletions(-)

Index: current/mm/sparse.c
===================================================================
--- current.orig/mm/sparse.c	2008-04-07 20:13:25.000000000 +0900
+++ current/mm/sparse.c	2008-04-07 20:27:20.000000000 +0900
@@ -8,6 +8,7 @@
 #include <linux/module.h>
 #include <linux/spinlock.h>
 #include <linux/vmalloc.h>
+#include "internal.h"
 #include <asm/dma.h>
 #include <asm/pgalloc.h>
 #include <asm/pgtable.h>
@@ -360,6 +361,9 @@
 {
 	return; /* XXX: Not implemented yet */
 }
+static void free_map_bootmem(struct page *page, unsigned long nr_pages)
+{
+}
 #else
 static struct page *__kmalloc_section_memmap(unsigned long nr_pages)
 {
@@ -397,17 +401,47 @@
 		free_pages((unsigned long)memmap,
 			   get_order(sizeof(struct page) * nr_pages));
 }
+
+static void free_map_bootmem(struct page *page, unsigned long nr_pages)
+{
+	unsigned long maps_section_nr, removing_section_nr, i;
+	int magic;
+
+	for (i = 0; i < nr_pages; i++, page++) {
+		magic = atomic_read(&page->_mapcount);
+
+		BUG_ON(magic == NODE_INFO);
+
+		maps_section_nr = pfn_to_section_nr(page_to_pfn(page));
+		removing_section_nr = page->private;
+
+		/*
+		 * When this function is called, the removing section is
+		 * in the logically offlined state: all of its pages are
+		 * isolated from the page allocator. If the removing
+		 * section's memmap is placed on that same section, it
+		 * must not be freed; otherwise the page allocator could
+		 * hand out pages that will soon be removed physically.
+		 */
+		if (maps_section_nr != removing_section_nr)
+			put_page_bootmem(page);
+	}
+}
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
 
 static void free_section_usemap(struct page *memmap, unsigned long *usemap)
 {
+	struct page *usemap_page;
+	unsigned long nr_pages;
+
 	if (!usemap)
 		return;
 
+	usemap_page = virt_to_page(usemap);
 	/*
 	 * Check to see if allocation came from hot-plug-add
 	 */
-	if (PageSlab(virt_to_page(usemap))) {
+	if (PageSlab(usemap_page)) {
 		kfree(usemap);
 		if (memmap)
 			__kfree_section_memmap(memmap, PAGES_PER_SECTION);
@@ -415,10 +449,19 @@
 	}
 
 	/*
-	 * TODO: Allocations came from bootmem - how do I free up ?
+	 * The usemap came from bootmem. This is packed with other usemaps
+	 * on the section which has pgdat at boot time. Just keep it as is now.
 	 */
-	printk(KERN_WARNING "Not freeing up allocations from bootmem "
-			"- leaking memory\n");
+
+	if (memmap) {
+		struct page *memmap_page;
+		memmap_page = virt_to_page(memmap);
+
+		nr_pages = PAGE_ALIGN(PAGES_PER_SECTION * sizeof(struct page))
+			>> PAGE_SHIFT;
+
+		free_map_bootmem(memmap_page, nr_pages);
+	}
 }
 
 /*
Index: current/mm/page_alloc.c
===================================================================
--- current.orig/mm/page_alloc.c	2008-04-07 20:12:55.000000000 +0900
+++ current/mm/page_alloc.c	2008-04-07 20:13:29.000000000 +0900
@@ -568,7 +568,7 @@
 /*
  * permit the bootmem allocator to evade page validation on high-order frees
  */
-void __init __free_pages_bootmem(struct page *page, unsigned int order)
+void __free_pages_bootmem(struct page *page, unsigned int order)
 {
 	if (order == 0) {
 		__ClearPageReserved(page);
Index: current/mm/internal.h
===================================================================
--- current.orig/mm/internal.h	2008-04-07 20:12:55.000000000 +0900
+++ current/mm/internal.h	2008-04-07 20:13:29.000000000 +0900
@@ -34,8 +34,7 @@
 	atomic_dec(&page->_count);
 }
 
-extern void __init __free_pages_bootmem(struct page *page,
-						unsigned int order);
+extern void __free_pages_bootmem(struct page *page, unsigned int order);
 
 /*
  * function for dealing with page's order in buddy system.
Index: current/mm/memory_hotplug.c
===================================================================
--- current.orig/mm/memory_hotplug.c	2008-04-07 20:12:55.000000000 +0900
+++ current/mm/memory_hotplug.c	2008-04-07 20:13:29.000000000 +0900
@@ -199,6 +199,16 @@
 	return register_new_memory(__pfn_to_section(phys_start_pfn));
 }
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+static int __remove_section(struct zone *zone, struct mem_section *ms)
+{
+	/*
+	 * XXX: Freeing memmap with vmemmap is not implemented yet.
+	 *      This should be removed later.
+	 */
+	return -EBUSY;
+}
+#else
 static int __remove_section(struct zone *zone, struct mem_section *ms)
 {
 	unsigned long flags;
@@ -217,6 +227,7 @@
 	pgdat_resize_unlock(pgdat, &flags);
 	return 0;
 }
+#endif
 
 /*
  * Reasonably generic function for adding memory.  It is

-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Patch 001/005](memory hotplug) register section/node id to free
  2008-04-07 12:45 ` [Patch 001/005](memory hotplug) register section/node id to free Yasunori Goto
@ 2008-06-16 10:21   ` Andy Whitcroft
  2008-06-16 13:58     ` Yasunori Goto
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Whitcroft @ 2008-06-16 10:21 UTC (permalink / raw)
  To: Yasunori Goto
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu

On Mon, Apr 07, 2008 at 09:45:04PM +0900, Yasunori Goto wrote:
> This patch registers which node or section id owns each page
> allocated by bootmem, so the kernel can distinguish which node/section
> uses such pages. This is the basis for hot-removing sections or nodes.
> 
> 
> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
> 
> ---
> 
>  include/linux/memory_hotplug.h |   27 +++++++++++
>  include/linux/mmzone.h         |    1 
>  mm/bootmem.c                   |    1 
>  mm/memory_hotplug.c            |   99 ++++++++++++++++++++++++++++++++++++++++-
>  mm/sparse.c                    |    3 -
>  5 files changed, 128 insertions(+), 3 deletions(-)
> 
> Index: current/mm/bootmem.c
> ===================================================================
> --- current.orig/mm/bootmem.c	2008-04-07 16:06:49.000000000 +0900
> +++ current/mm/bootmem.c	2008-04-07 20:08:14.000000000 +0900
> @@ -458,6 +458,7 @@
>  
>  unsigned long __init free_all_bootmem_node(pg_data_t *pgdat)
>  {
> +	register_page_bootmem_info_node(pgdat);
>  	return free_all_bootmem_core(pgdat);
>  }
>  
> Index: current/include/linux/memory_hotplug.h
> ===================================================================
> --- current.orig/include/linux/memory_hotplug.h	2008-04-07 16:06:49.000000000 +0900
> +++ current/include/linux/memory_hotplug.h	2008-04-07 16:33:12.000000000 +0900
> @@ -11,6 +11,15 @@
>  struct mem_section;
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
> +
> +/*
> + * Magic numbers for freeing bootmem.
> + * The normal smallest mapcount is -1; these are smaller values.
> + */
> +#define SECTION_INFO		0xfffffffe
> +#define MIX_INFO		0xfffffffd
> +#define NODE_INFO		0xfffffffc

Perhaps these should be defined relative to -1 to make that very
explicit.

#define SECTION_INFO	(-1 - 1)
#define MIX_INFO	(-1 - 2)
#define NODE_INFO	(-1 - 3)

Also from a scan of this patch I cannot see why I might care about the
type of these.  Yes it appears you are going to need a marker to say
which bootmem pages are under this reference counted scheme and which
are not.  From a review perspective having some clue in the leader about
the type and why we care would help.

From the names I was expecting SECTION related info, NODE related info,
and a MIXture of things.  However, SECTION seems to be the actual sections,
NODE seems to be pgdat information, MIX seems to be usemap?  Why is it
not USEMAP here?  Possibly I will find out in a later patch but a clue
here might help.

> +
>  /*
>   * pgdat resizing functions
>   */
> @@ -145,6 +154,18 @@
>  #endif /* CONFIG_NUMA */
>  #endif /* CONFIG_HAVE_ARCH_NODEDATA_EXTENSION */
>  
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
> +{
> +}
> +static inline void put_page_bootmem(struct page *page)
> +{
> +}
> +#else
> +extern void register_page_bootmem_info_node(struct pglist_data *pgdat);
> +extern void put_page_bootmem(struct page *page);
> +#endif
> +
>  #else /* ! CONFIG_MEMORY_HOTPLUG */
>  /*
>   * Stub functions for when hotplug is off
> @@ -172,6 +193,10 @@
>  	return -ENOSYS;
>  }
>  
> +static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
> +{
> +}
> +
>  #endif /* ! CONFIG_MEMORY_HOTPLUG */
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
> @@ -192,5 +217,7 @@
>  extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>  								int nr_pages);
>  extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
> +extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
> +					  unsigned long pnum);
>  
>  #endif /* __LINUX_MEMORY_HOTPLUG_H */
> Index: current/include/linux/mmzone.h
> ===================================================================
> --- current.orig/include/linux/mmzone.h	2008-04-07 16:06:49.000000000 +0900
> +++ current/include/linux/mmzone.h	2008-04-07 18:29:08.000000000 +0900
> @@ -879,6 +879,7 @@
>  	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
>  }
>  extern int __section_nr(struct mem_section* ms);
> +extern unsigned long usemap_size(void);
>  
>  /*
>   * We use the lower bits of the mem_map pointer to store
> Index: current/mm/memory_hotplug.c
> ===================================================================
> --- current.orig/mm/memory_hotplug.c	2008-04-07 16:06:49.000000000 +0900
> +++ current/mm/memory_hotplug.c	2008-04-07 20:08:13.000000000 +0900
> @@ -59,8 +59,105 @@
>  	return;
>  }
>  
> -
>  #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> +#ifndef CONFIG_SPARSEMEM_VMEMMAP
> +static void get_page_bootmem(unsigned long info,  struct page *page, int magic)
> +{
> +	atomic_set(&page->_mapcount, magic);
> +	SetPagePrivate(page);
> +	set_page_private(page, info);
> +	atomic_inc(&page->_count);
> +}

Although I guess these 'magic' constants are effectively magic numbers,
they are also the type.  So I do wonder if this would be better called type.

> +
> +void put_page_bootmem(struct page *page)
> +{
> +	int magic;
> +
> +	magic = atomic_read(&page->_mapcount);
> +	BUG_ON(magic >= -1);
> +
> +	if (atomic_dec_return(&page->_count) == 1) {
> +		ClearPagePrivate(page);
> +		set_page_private(page, 0);
> +		reset_page_mapcount(page);
> +		__free_pages_bootmem(page, 0);
> +	}
> +
> +}

That seems pretty sensible, using _count to track the number of
users of this page.  But there was no mention of this in the
changelog, so I was about to complain that get_ was a strange name
for something which sets the magic numbers.  Its mirroring of
get_page/put_page makes the name sensible.  But please document that
in the changelog.

The BUG in put_page_bootmem I assume is effectively saying "this page was
not reference counted and so cannot be freed with this call".  Is there
anything stopping us simply reference counting all bootmem allocations
in this manner?  So that any of them could be released?

Also how does this scheme cope with things being merged into the end of
the blocks you mark as freeable.  bootmem can pack small things into the
end of the previous allocation if they fit and alignment allows.  Is it
not possible that such allocations would get packed in, but not
accounted for in the _count so that when hotplug frees these things the
bootmem page would get dropped, but still have useful data in it?

> +
> +void register_page_bootmem_info_section(unsigned long start_pfn)
> +{
> +	unsigned long *usemap, mapsize, section_nr, i;
> +	struct mem_section *ms;
> +	struct page *page, *memmap;
> +
> +	if (!pfn_valid(start_pfn))
> +		return;
> +
> +	section_nr = pfn_to_section_nr(start_pfn);
> +	ms = __nr_to_section(section_nr);
> +
> +	/* Get section's memmap address */
> +	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
> +
> +	/*
> +	 * Get page for the memmap's phys address
> +	 * XXX: need more consideration for sparse_vmemmap...
> +	 */
> +	page = virt_to_page(memmap);
> +	mapsize = sizeof(struct page) * PAGES_PER_SECTION;
> +	mapsize = PAGE_ALIGN(mapsize) >> PAGE_SHIFT;
> +
> +	/* remember memmap's page */
> +	for (i = 0; i < mapsize; i++, page++)
> +		get_page_bootmem(section_nr, page, SECTION_INFO);
> +
> +	usemap = __nr_to_section(section_nr)->pageblock_flags;
> +	page = virt_to_page(usemap);
> +
> +	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
> +
> +	for (i = 0; i < mapsize; i++, page++)
> +		get_page_bootmem(section_nr, page, MIX_INFO);
> +

I am concerned that some of these pages might be in the numa remap
space.  If they are, they were not part of bootmem; will they free
correctly in the same manner?  They are necessarily not mapped at the
correct kernel virtual address, so __pa() is not going to find the
right struct page, is it?

Perhaps if you simply reference counted all bootmem allocations you
would avoid this problem?

> +}
> +
> +void register_page_bootmem_info_node(struct pglist_data *pgdat)
> +{
> +	unsigned long i, pfn, end_pfn, nr_pages;
> +	int node = pgdat->node_id;
> +	struct page *page;
> +	struct zone *zone;
> +
> +	nr_pages = PAGE_ALIGN(sizeof(struct pglist_data)) >> PAGE_SHIFT;
> +	page = virt_to_page(pgdat);
> +
> +	for (i = 0; i < nr_pages; i++, page++)
> +		get_page_bootmem(node, page, NODE_INFO);
> +
> +	zone = &pgdat->node_zones[0];
> +	for (; zone < pgdat->node_zones + MAX_NR_ZONES - 1; zone++) {
> +		if (zone->wait_table) {
> +			nr_pages = zone->wait_table_hash_nr_entries
> +				* sizeof(wait_queue_head_t);
> +			nr_pages = PAGE_ALIGN(nr_pages) >> PAGE_SHIFT;
> +			page = virt_to_page(zone->wait_table);
> +
> +			for (i = 0; i < nr_pages; i++, page++)
> +				get_page_bootmem(node, page, NODE_INFO);
> +		}
> +	}
> +
> +	pfn = pgdat->node_start_pfn;
> +	end_pfn = pfn + pgdat->node_spanned_pages;
> +
> +	/* register_section info */
> +	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION)
> +		register_page_bootmem_info_section(pfn);
> +
> +}
> +#endif /* !CONFIG_SPARSEMEM_VMEMMAP */
> +
>  static int __add_zone(struct zone *zone, unsigned long phys_start_pfn)
>  {
>  	struct pglist_data *pgdat = zone->zone_pgdat;
> Index: current/mm/sparse.c
> ===================================================================
> --- current.orig/mm/sparse.c	2008-04-07 16:06:49.000000000 +0900
> +++ current/mm/sparse.c	2008-04-07 20:08:16.000000000 +0900
> @@ -200,7 +200,6 @@
>  /*
>   * Decode mem_map from the coded memmap
>   */
> -static
>  struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
>  {
>  	/* mask off the extra low bits of information */
> @@ -223,7 +222,7 @@
>  	return 1;
>  }
>  
> -static unsigned long usemap_size(void)
> +unsigned long usemap_size(void)
>  {
>  	unsigned long size_bytes;
>  	size_bytes = roundup(SECTION_BLOCKFLAGS_BITS, 8) / 8;
> 

I wonder if these export changes might make more sense as a separate
patch; they are effectively just noise.

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Patch 002/005](memory hotplug) align memmap to page size
  2008-04-07 12:46 ` [Patch 002/005](memory hotplug) align memmap to page size Yasunori Goto
@ 2008-06-16 10:26   ` Andy Whitcroft
  2008-06-16 13:26     ` Yasunori Goto
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Whitcroft @ 2008-06-16 10:26 UTC (permalink / raw)
  To: Yasunori Goto
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu

On Mon, Apr 07, 2008 at 09:46:19PM +0900, Yasunori Goto wrote:
> To make the memmap easier to free, this patch aligns it to page size.
> The bootmem allocator may mix several objects in one page, which is
> bad for freeing the memmap on memory hot-remove.
> 
> 
> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
> 
> ---
>  mm/sparse.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: current/mm/sparse.c
> ===================================================================
> --- current.orig/mm/sparse.c	2008-04-07 19:18:50.000000000 +0900
> +++ current/mm/sparse.c	2008-04-07 20:08:13.000000000 +0900
> @@ -265,8 +265,8 @@
>  	if (map)
>  		return map;
>  
> -	map = alloc_bootmem_node(NODE_DATA(nid),
> -			sizeof(struct page) * PAGES_PER_SECTION);
> +	map = alloc_bootmem_pages_node(NODE_DATA(nid),
> +		       PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION));
>  	return map;
>  }
>  #endif /* !CONFIG_SPARSEMEM_VMEMMAP */

Ahh ok, we do make sure the memmap uses up the rest of the space.
That is a shame though, as we cannot slip the usemap into the end of
the space any more (assuming we could).

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Patch 003/005](memory hotplug) make alloc_bootmem_section()
  2008-04-07 12:47 ` [Patch 003/005](memory hotplug) make alloc_bootmem_section() Yasunori Goto
@ 2008-06-16 10:32   ` Andy Whitcroft
  2008-06-16 13:18     ` Yasunori Goto
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Whitcroft @ 2008-06-16 10:32 UTC (permalink / raw)
  To: Yasunori Goto
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu

On Mon, Apr 07, 2008 at 09:47:29PM +0900, Yasunori Goto wrote:
> alloc_bootmem_section() allocates memory within a specified section.
> A later patch uses it to keep the usemap on the same section as pgdat.
> 
> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
> 
> ---
>  include/linux/bootmem.h |    2 ++
>  mm/bootmem.c            |   31 +++++++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> Index: current/include/linux/bootmem.h
> ===================================================================
> --- current.orig/include/linux/bootmem.h	2008-04-07 19:18:44.000000000 +0900
> +++ current/include/linux/bootmem.h	2008-04-07 19:30:08.000000000 +0900
> @@ -101,6 +101,8 @@
>  extern void free_bootmem_node(pg_data_t *pgdat,
>  			      unsigned long addr,
>  			      unsigned long size);
> +extern void *alloc_bootmem_section(unsigned long size,
> +				   unsigned long section_nr);
>  
>  #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
>  #define alloc_bootmem_node(pgdat, x) \
> Index: current/mm/bootmem.c
> ===================================================================
> --- current.orig/mm/bootmem.c	2008-04-07 19:18:44.000000000 +0900
> +++ current/mm/bootmem.c	2008-04-07 19:30:08.000000000 +0900
> @@ -540,6 +540,37 @@
>  	return __alloc_bootmem(size, align, goal);
>  }
>  
> +#ifdef CONFIG_SPARSEMEM
> +void * __init alloc_bootmem_section(unsigned long size,
> +				    unsigned long section_nr)
> +{
> +	void *ptr;
> +	unsigned long limit, goal, start_nr, end_nr, pfn;
> +	struct pglist_data *pgdat;
> +
> +	pfn = section_nr_to_pfn(section_nr);
> +	goal = PFN_PHYS(pfn);
> +	limit = PFN_PHYS(section_nr_to_pfn(section_nr + 1)) - 1;
> +	pgdat = NODE_DATA(early_pfn_to_nid(pfn));
> +	ptr = __alloc_bootmem_core(pgdat->bdata, size, SMP_CACHE_BYTES, goal,
> +				   limit);
> +
> +	if (!ptr)
> +		return NULL;
> +
This also indicates a failure allocating within the section, and yet we
do not report it here.

> +	start_nr = pfn_to_section_nr(PFN_DOWN(__pa(ptr)));
> +	end_nr = pfn_to_section_nr(PFN_DOWN(__pa(ptr) + size));
> +	if (start_nr != section_nr || end_nr != section_nr) {
> +		printk(KERN_WARNING "alloc_bootmem failed on section %ld.\n",
> +		       section_nr);
> +		free_bootmem_core(pgdat->bdata, __pa(ptr), size);

But we do here.  I think we should report both if this is worth
reporting.

> +		ptr = NULL;
> +	}
> +
> +	return ptr;
> +}
> +#endif
> +
>  #ifndef ARCH_LOW_ADDRESS_LIMIT
>  #define ARCH_LOW_ADDRESS_LIMIT	0xffffffffUL
>  #endif

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Patch 005/005](memory hotplug) free memmaps allocated by bootmem
  2008-04-07 12:50 ` [Patch 005/005](memory hotplug) free memmaps allocated by bootmem Yasunori Goto
@ 2008-06-16 10:44   ` Andy Whitcroft
  2008-06-16 14:09     ` Yasunori Goto
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Whitcroft @ 2008-06-16 10:44 UTC (permalink / raw)
  To: Yasunori Goto
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu

On Mon, Apr 07, 2008 at 09:50:18PM +0900, Yasunori Goto wrote:
> This patch frees memmaps which were allocated by bootmem.
> 
> Freeing the usemap is not necessary; the usemap's pages may still be
> needed by other sections.
> 
> If the removing section is the last section on the node, that section
> is the final user of the usemap page (usemaps are allocated on that
> section by the previous patch). Even then it must not be freed,
> because the section is in the logically offlined state, with all its
> pages isolated from the page allocator. If the page were freed, the
> page allocator might hand it out although it will be removed
> physically soon, which would be a disaster. So this patch keeps it as
> it is.
> 
> 
> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
> 
> ---
>  mm/internal.h       |    3 +--
>  mm/memory_hotplug.c |   11 +++++++++++
>  mm/page_alloc.c     |    2 +-
>  mm/sparse.c         |   51 +++++++++++++++++++++++++++++++++++++++++++++++----
>  4 files changed, 60 insertions(+), 7 deletions(-)
> 
> Index: current/mm/sparse.c
> ===================================================================
> --- current.orig/mm/sparse.c	2008-04-07 20:13:25.000000000 +0900
> +++ current/mm/sparse.c	2008-04-07 20:27:20.000000000 +0900
> @@ -8,6 +8,7 @@
>  #include <linux/module.h>
>  #include <linux/spinlock.h>
>  #include <linux/vmalloc.h>
> +#include "internal.h"
>  #include <asm/dma.h>
>  #include <asm/pgalloc.h>
>  #include <asm/pgtable.h>
> @@ -360,6 +361,9 @@
>  {
>  	return; /* XXX: Not implemented yet */
>  }
> +static void free_map_bootmem(struct page *page, unsigned long nr_pages)
> +{
> +}
>  #else
>  static struct page *__kmalloc_section_memmap(unsigned long nr_pages)
>  {
> @@ -397,17 +401,47 @@
>  		free_pages((unsigned long)memmap,
>  			   get_order(sizeof(struct page) * nr_pages));
>  }
> +
> +static void free_map_bootmem(struct page *page, unsigned long nr_pages)
> +{
> +	unsigned long maps_section_nr, removing_section_nr, i;
> +	int magic;
> +
> +	for (i = 0; i < nr_pages; i++, page++) {
> +		magic = atomic_read(&page->_mapcount);
> +
> +		BUG_ON(magic == NODE_INFO);

Are we sure the node area was big enough to never allocate usemaps into
it and change the magic to MIX?  I saw you make the section information
page sized but not the others.

> +
> +		maps_section_nr = pfn_to_section_nr(page_to_pfn(page));
> +		removing_section_nr = page->private;
> +
> +		/*
> > +		 * When this function is called, the removing section is
> > +		 * in the logically offlined state: all of its pages are
> > +		 * isolated from the page allocator. If the removing
> > +		 * section's memmap is placed on that same section, it
> > +		 * must not be freed; otherwise the page allocator could
> > +		 * hand out pages that will soon be removed physically.
> +		 */
> +		if (maps_section_nr != removing_section_nr)
> +			put_page_bootmem(page);

Would the section memmap have its own get_page_bootmem reference here?
Would that not protect it from release?

> +	}
> +}
>  #endif /* CONFIG_SPARSEMEM_VMEMMAP */
>  
>  static void free_section_usemap(struct page *memmap, unsigned long *usemap)
>  {
> +	struct page *usemap_page;
> +	unsigned long nr_pages;
> +
>  	if (!usemap)
>  		return;
>  
> +	usemap_page = virt_to_page(usemap);
>  	/*
>  	 * Check to see if allocation came from hot-plug-add
>  	 */
> -	if (PageSlab(virt_to_page(usemap))) {
> +	if (PageSlab(usemap_page)) {
>  		kfree(usemap);
>  		if (memmap)
>  			__kfree_section_memmap(memmap, PAGES_PER_SECTION);
> @@ -415,10 +449,19 @@
>  	}
>  
>  	/*
> -	 * TODO: Allocations came from bootmem - how do I free up ?
> +	 * The usemap came from bootmem. This is packed with other usemaps
> +	 * on the section which has pgdat at boot time. Just keep it as is now.
>  	 */
> -	printk(KERN_WARNING "Not freeing up allocations from bootmem "
> -			"- leaking memory\n");
> +
> +	if (memmap) {
> +		struct page *memmap_page;
> +		memmap_page = virt_to_page(memmap);
> +
> +		nr_pages = PAGE_ALIGN(PAGES_PER_SECTION * sizeof(struct page))
> +			>> PAGE_SHIFT;
> +
> +		free_map_bootmem(memmap_page, nr_pages);
> +	}
>  }
>  
>  /*
> Index: current/mm/page_alloc.c
> ===================================================================
> --- current.orig/mm/page_alloc.c	2008-04-07 20:12:55.000000000 +0900
> +++ current/mm/page_alloc.c	2008-04-07 20:13:29.000000000 +0900
> @@ -568,7 +568,7 @@
>  /*
>   * permit the bootmem allocator to evade page validation on high-order frees
>   */
> -void __init __free_pages_bootmem(struct page *page, unsigned int order)
> +void __free_pages_bootmem(struct page *page, unsigned int order)

not __meminit or something?

>  {
>  	if (order == 0) {
>  		__ClearPageReserved(page);
> Index: current/mm/internal.h
> ===================================================================
> --- current.orig/mm/internal.h	2008-04-07 20:12:55.000000000 +0900
> +++ current/mm/internal.h	2008-04-07 20:13:29.000000000 +0900
> @@ -34,8 +34,7 @@
>  	atomic_dec(&page->_count);
>  }
>  
> -extern void __init __free_pages_bootmem(struct page *page,
> -						unsigned int order);
> +extern void __free_pages_bootmem(struct page *page, unsigned int order);
>  
>  /*
>   * function for dealing with page's order in buddy system.
> Index: current/mm/memory_hotplug.c
> ===================================================================
> --- current.orig/mm/memory_hotplug.c	2008-04-07 20:12:55.000000000 +0900
> +++ current/mm/memory_hotplug.c	2008-04-07 20:13:29.000000000 +0900
> @@ -199,6 +199,16 @@
>  	return register_new_memory(__pfn_to_section(phys_start_pfn));
>  }
>  
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +static int __remove_section(struct zone *zone, struct mem_section *ms)
> +{
> +	/*
> > +	 * XXX: Freeing memmap with vmemmap is not implemented yet.
> +	 *      This should be removed later.
> +	 */
> +	return -EBUSY;
> +}
> +#else
>  static int __remove_section(struct zone *zone, struct mem_section *ms)
>  {
>  	unsigned long flags;
> @@ -217,6 +227,7 @@
>  	pgdat_resize_unlock(pgdat, &flags);
>  	return 0;
>  }
> +#endif
>  
>  /*
>   * Reasonably generic function for adding memory.  It is

-apw

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Patch 003/005](memory hotplug) make alloc_bootmem_section()
  2008-06-16 10:32   ` Andy Whitcroft
@ 2008-06-16 13:18     ` Yasunori Goto
  0 siblings, 0 replies; 15+ messages in thread
From: Yasunori Goto @ 2008-06-16 13:18 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu

> On Mon, Apr 07, 2008 at 09:47:29PM +0900, Yasunori Goto wrote:
> > alloc_bootmem_section() allocates memory within a specified section.
> > A later patch uses it to keep the usemap on the same section as pgdat.
> > 
> > Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
> > 
> > ---
> >  include/linux/bootmem.h |    2 ++
> >  mm/bootmem.c            |   31 +++++++++++++++++++++++++++++++
> >  2 files changed, 33 insertions(+)
> > 
> > Index: current/include/linux/bootmem.h
> > ===================================================================
> > --- current.orig/include/linux/bootmem.h	2008-04-07 19:18:44.000000000 +0900
> > +++ current/include/linux/bootmem.h	2008-04-07 19:30:08.000000000 +0900
> > @@ -101,6 +101,8 @@
> >  extern void free_bootmem_node(pg_data_t *pgdat,
> >  			      unsigned long addr,
> >  			      unsigned long size);
> > +extern void *alloc_bootmem_section(unsigned long size,
> > +				   unsigned long section_nr);
> >  
> >  #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
> >  #define alloc_bootmem_node(pgdat, x) \
> > Index: current/mm/bootmem.c
> > ===================================================================
> > --- current.orig/mm/bootmem.c	2008-04-07 19:18:44.000000000 +0900
> > +++ current/mm/bootmem.c	2008-04-07 19:30:08.000000000 +0900
> > @@ -540,6 +540,37 @@
> >  	return __alloc_bootmem(size, align, goal);
> >  }
> >  
> > +#ifdef CONFIG_SPARSEMEM
> > +void * __init alloc_bootmem_section(unsigned long size,
> > +				    unsigned long section_nr)
> > +{
> > +	void *ptr;
> > +	unsigned long limit, goal, start_nr, end_nr, pfn;
> > +	struct pglist_data *pgdat;
> > +
> > +	pfn = section_nr_to_pfn(section_nr);
> > +	goal = PFN_PHYS(pfn);
> > +	limit = PFN_PHYS(section_nr_to_pfn(section_nr + 1)) - 1;
> > +	pgdat = NODE_DATA(early_pfn_to_nid(pfn));
> > +	ptr = __alloc_bootmem_core(pgdat->bdata, size, SMP_CACHE_BYTES, goal,
> > +				   limit);
> > +
> > +	if (!ptr)
> > +		return NULL;
> > +
> This also indicates a failure allocating within the section, and yet we
> do not report it here.
> 
> > +	start_nr = pfn_to_section_nr(PFN_DOWN(__pa(ptr)));
> > +	end_nr = pfn_to_section_nr(PFN_DOWN(__pa(ptr) + size));
> > +	if (start_nr != section_nr || end_nr != section_nr) {
> > +		printk(KERN_WARNING "alloc_bootmem failed on section %ld.\n",
> > +		       section_nr);
> > +		free_bootmem_core(pgdat->bdata, __pa(ptr), size);
> 
> But we do here.  I think we should report both if this is worth
> reporting.

This code has already been removed in the newest -mm.
(The bootmem code was rewritten by Johannes Weiner recently.)
Please check the newest -mm code.

-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Patch 002/005](memory hotplug) align memmap to page size
  2008-06-16 10:26   ` Andy Whitcroft
@ 2008-06-16 13:26     ` Yasunori Goto
  0 siblings, 0 replies; 15+ messages in thread
From: Yasunori Goto @ 2008-06-16 13:26 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu

> On Mon, Apr 07, 2008 at 09:46:19PM +0900, Yasunori Goto wrote:
> > To make the memmap easier to free, this patch aligns it to page size.
> > The bootmem allocator may mix several objects in one page, which is
> > bad for freeing the memmap on memory hot-remove.
> > 
> > 
> > Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
> > 
> > ---
> >  mm/sparse.c |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > Index: current/mm/sparse.c
> > ===================================================================
> > --- current.orig/mm/sparse.c	2008-04-07 19:18:50.000000000 +0900
> > +++ current/mm/sparse.c	2008-04-07 20:08:13.000000000 +0900
> > @@ -265,8 +265,8 @@
> >  	if (map)
> >  		return map;
> >  
> > -	map = alloc_bootmem_node(NODE_DATA(nid),
> > -			sizeof(struct page) * PAGES_PER_SECTION);
> > +	map = alloc_bootmem_pages_node(NODE_DATA(nid),
> > +		       PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION));
> >  	return map;
> >  }
> >  #endif /* !CONFIG_SPARSEMEM_VMEMMAP */
> 
> Ahh ok, we do make sure the memmap uses up the rest of the space.
> That is a shame though, as we cannot slip the usemap into the end of
> the space any more (assuming we could).

I considered merging the memmap and the usemap into the same pages.
However, since the memmap's size is now an exact multiple of the page
size, a page would end up holding only the single usemap in the end.

Bye.

-- 
Yasunori Goto 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Patch 001/005](memory hotplug) register section/node id to free
  2008-06-16 10:21   ` Andy Whitcroft
@ 2008-06-16 13:58     ` Yasunori Goto
  2008-06-17 11:39       ` [Patch](memory hotplug) Tiny fixes of bootmem free patch for memory hotremove Yasunori Goto
  0 siblings, 1 reply; 15+ messages in thread
From: Yasunori Goto @ 2008-06-16 13:58 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu

> > Index: current/include/linux/memory_hotplug.h
> > ===================================================================
> > --- current.orig/include/linux/memory_hotplug.h	2008-04-07 16:06:49.000000000 +0900
> > +++ current/include/linux/memory_hotplug.h	2008-04-07 16:33:12.000000000 +0900
> > @@ -11,6 +11,15 @@
> >  struct mem_section;
> >  
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > +
> > +/*
> > + * Magic numbers for freeing bootmem.
> > + * The normal smallest mapcount is -1; these are smaller values.
> > + */
> > +#define SECTION_INFO		0xfffffffe
> > +#define MIX_INFO		0xfffffffd
> > +#define NODE_INFO		0xfffffffc
> 
> Perhaps these should be defined relative to -1 to make that very
> explicit.
> 
> #define SECTION_INFO	(-1 - 1)
> #define MIX_INFO	(-1 - 2)
> #define NODE_INFO	(-1 - 3)
> 

Ah, ok. I'll change them. 
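
A possible revision along those lines (illustrative; the final names
may differ, and MIX_SECTION_INFO picks up the spelling mentioned
below):

    /* smaller than the smallest normal mapcount, which is -1 */
    #define SECTION_INFO        (-1 - 1)
    #define MIX_SECTION_INFO    (-1 - 2)
    #define NODE_INFO           (-1 - 3)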

> Also from a scan of this patch I cannot see why I might care about the
> type of these.  Yes it appears you are going to need a marker to say
> which bootmem pages are under this reference counted scheme and which
> are not.  From a review perspective having some clue in the leader about
> the type and why we care would help.
> 
> From the names I was expecting SECTION related info, NODE related info,
> and a MIXture of things.  However, SECTION seems to be the actual sections,
> NODE seems to be pgdat information, MIX seems to be usemap?  Why is it
> not USEMAP here?  Possibly I will find out in a later patch but a clue
> here might help.

MIX_INFO might not be a good name; I intended it as MIX_SECTION_INFO.
When SECTION_INFO is specified, the section number in ->private
tells which section uses this page.
This is important for the dependency check.

When MIX_(SECTION_)INFO is specified, many sections use this page,
so the dependency check is not possible.
Currently this type is used just for the usemap, but I thought other
features may use it later...



> > Index: current/mm/memory_hotplug.c
> > ===================================================================
> > --- current.orig/mm/memory_hotplug.c	2008-04-07 16:06:49.000000000 +0900
> > +++ current/mm/memory_hotplug.c	2008-04-07 20:08:13.000000000 +0900
> > @@ -59,8 +59,105 @@
> >  	return;
> >  }
> >  
> > -
> >  #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
> > +#ifndef CONFIG_SPARSEMEM_VMEMMAP
> > +static void get_page_bootmem(unsigned long info,  struct page *page, int magic)
> > +{
> > +	atomic_set(&page->_mapcount, magic);
> > +	SetPagePrivate(page);
> > +	set_page_private(page, info);
> > +	atomic_inc(&page->_count);
> > +}
> 
> Although I guess these 'magic' constants are effectively magic numbers,
> they are also the type.  So I do wonder if this would be better called type.

Ah, if you feel type is better than magic, then I'll change it.
To be honest, I'm not good at naming functions/parameters,
so good advice is welcome.


> > +
> > +void put_page_bootmem(struct page *page)
> > +{
> > +	int magic;
> > +
> > +	magic = atomic_read(&page->_mapcount);
> > +	BUG_ON(magic >= -1);
> > +
> > +	if (atomic_dec_return(&page->_count) == 1) {
> > +		ClearPagePrivate(page);
> > +		set_page_private(page, 0);
> > +		reset_page_mapcount(page);
> > +		__free_pages_bootmem(page, 0);
> > +	}
> > +
> > +}
> 
> That seems pretty sensible, using _count to track the number of
> users of this page.  But there was no mention of this in the
> changelog, so I was about to complain that get_ was a strange name
> for something which sets the magic numbers.  Its mirroring of
> get_page/put_page makes the name sensible.  But please document that
> in the changelog.

Ah, sorry.

> The BUG in put_page_bootmem I assume is effectively saying "this page was
> not reference counted and so cannot be freed with this call".  Is there
> anything stopping us simply reference counting all bootmem allocations
> in this manner?  So that any of them could be released?

This reference count is set only for the memory management structures.

> 
> Also how does this scheme cope with things being merged into the end of
> the blocks you mark as freeable.  bootmem can pack small things into the
> end of the previous allocation if they fit and alignment allows.  Is it
> not possible that such allocations would get packed in, but not
> accounted for in the _count so that when hotplug frees these things the
> bootmem page would get dropped, but still have useful data in it?

(See the second patch.)

> 
> > +
> > +void register_page_bootmem_info_section(unsigned long start_pfn)
> > +{
> > +	unsigned long *usemap, mapsize, section_nr, i;
> > +	struct mem_section *ms;
> > +	struct page *page, *memmap;
> > +
> > +	if (!pfn_valid(start_pfn))
> > +		return;
> > +
> > +	section_nr = pfn_to_section_nr(start_pfn);
> > +	ms = __nr_to_section(section_nr);
> > +
> > +	/* Get section's memmap address */
> > +	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
> > +
> > +	/*
> > +	 * Get page for the memmap's phys address
> > +	 * XXX: need more consideration for sparse_vmemmap...
> > +	 */
> > +	page = virt_to_page(memmap);
> > +	mapsize = sizeof(struct page) * PAGES_PER_SECTION;
> > +	mapsize = PAGE_ALIGN(mapsize) >> PAGE_SHIFT;
> > +
> > +	/* remember memmap's page */
> > +	for (i = 0; i < mapsize; i++, page++)
> > +		get_page_bootmem(section_nr, page, SECTION_INFO);
> > +
> > +	usemap = __nr_to_section(section_nr)->pageblock_flags;
> > +	page = virt_to_page(usemap);
> > +
> > +	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
> > +
> > +	for (i = 0; i < mapsize; i++, page++)
> > +		get_page_bootmem(section_nr, page, MIX_INFO);
> > +
> 
> I am concerned that some of these pages might be in the NUMA remap space.
> If they are, they were not part of bootmem; will they free correctly in the
> same manner?  They are necessarily not mapped at the correct kernel virtual
> address, so the __pa() is not going to find the right struct page, is it?

Good point. Yes, you are right.

> 
> Perhaps if you simply reference counted all bootmem allocations you
> would avoid this problem?

No. At least offline_pages() must distinguish between the following:
 A) pages that will become unnecessary later.
 B) pages that are really busy.

A) covers the pages for the section's memmap, pgdat and so on.
B) covers everything else.

A) pages must be isolated in the final phase of memory hotplug, because
they are used by the page-offlining code itself, and offline_pages() runs
before that final phase.

B) pages should be freed before offline_pages(); if one is still busy,
offline_pages() should be cancelled.
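
To illustrate the direction for A) (not implemented yet, and the helper
name below is made up), page isolation could recognize such pages by the
type stored in _mapcount instead of failing on them:

	/* Sketch only: true for pages that hold memory management
	 * structures registered by get_page_bootmem() (case A). */
	static int is_section_info_page(struct page *page)
	{
		int type = atomic_read(&page->_mapcount);

		return PagePrivate(page) &&
		       (type == SECTION_INFO || type == MIX_INFO);
	}

offline_pages() could then skip these pages during isolation and leave
them for the final phase of removal.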


> > Index: current/mm/sparse.c
> > ===================================================================
> > --- current.orig/mm/sparse.c	2008-04-07 16:06:49.000000000 +0900
> > +++ current/mm/sparse.c	2008-04-07 20:08:16.000000000 +0900
> > @@ -200,7 +200,6 @@
> >  /*
> >   * Decode mem_map from the coded memmap
> >   */
> > -static
> >  struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
> >  {
> >  	/* mask off the extra low bits of information */
> > @@ -223,7 +222,7 @@
> >  	return 1;
> >  }
> >  
> > -static unsigned long usemap_size(void)
> > +unsigned long usemap_size(void)
> >  {
> >  	unsigned long size_bytes;
> >  	size_bytes = roundup(SECTION_BLOCKFLAGS_BITS, 8) / 8;
> > 
> 
> I wonder if these export changes might make more sense as a separate
> patch; they are effectively just noise here.

Ah, yes, they should be separated out. Sorry.

Thanks.

-- 
Yasunori Goto 




* Re: [Patch 005/005](memory hotplug) free memmaps allocated by bootmem
  2008-06-16 10:44   ` Andy Whitcroft
@ 2008-06-16 14:09     ` Yasunori Goto
  0 siblings, 0 replies; 15+ messages in thread
From: Yasunori Goto @ 2008-06-16 14:09 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: Badari Pulavarty, Andrew Morton, Linux Kernel ML, linux-mm, Yinghai Lu


> > Index: current/mm/sparse.c
> > ===================================================================
> > --- current.orig/mm/sparse.c	2008-04-07 20:13:25.000000000 +0900
> > +++ current/mm/sparse.c	2008-04-07 20:27:20.000000000 +0900
> > @@ -8,6 +8,7 @@
> >  #include <linux/module.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/vmalloc.h>
> > +#include "internal.h"
> >  #include <asm/dma.h>
> >  #include <asm/pgalloc.h>
> >  #include <asm/pgtable.h>
> > @@ -360,6 +361,9 @@
> >  {
> >  	return; /* XXX: Not implemented yet */
> >  }
> > +static void free_map_bootmem(struct page *page, unsigned long nr_pages)
> > +{
> > +}
> >  #else
> >  static struct page *__kmalloc_section_memmap(unsigned long nr_pages)
> >  {
> > @@ -397,17 +401,47 @@
> >  		free_pages((unsigned long)memmap,
> >  			   get_order(sizeof(struct page) * nr_pages));
> >  }
> > +
> > +static void free_map_bootmem(struct page *page, unsigned long nr_pages)
> > +{
> > +	unsigned long maps_section_nr, removing_section_nr, i;
> > +	int magic;
> > +
> > +	for (i = 0; i < nr_pages; i++, page++) {
> > +		magic = atomic_read(&page->_mapcount);
> > +
> > +		BUG_ON(magic == NODE_INFO);
> 
> Are we sure the node area was big enough that usemaps never get allocated
> into it, changing the magic to MIX?  I saw you made the section information
> page-sized but not the others.

This is not the finished support for removing a whole node, just
preparation. I would like to get section removal working first, before
node removal.

> > +
> > +		maps_section_nr = pfn_to_section_nr(page_to_pfn(page));
> > +		removing_section_nr = page->private;
> > +
> > +		/*
> > +		 * When this function is called, the removing section is
> > +		 * logical offlined state. This means all pages are isolated
> > +		 * from page allocator. If removing section's memmap is placed
> > +		 * on the same section, it must not be freed.
> > +		 * If it is freed, page allocator may allocate it which will
> > +		 * be removed physically soon.
> > +		 */
> > +		if (maps_section_nr != removing_section_nr)
> > +			put_page_bootmem(page);
> 
> Would the section memmap have its own get_page_bootmem reference here?
> Would that not protect it from release?

It's not protected by the reference count alone. Each memmap page holds
only the single reference taken at registration time, so one
put_page_bootmem() call would free it; the maps_section_nr check above
is what keeps a same-section memmap alive.



> > Index: current/mm/page_alloc.c
> > ===================================================================
> > --- current.orig/mm/page_alloc.c	2008-04-07 20:12:55.000000000 +0900
> > +++ current/mm/page_alloc.c	2008-04-07 20:13:29.000000000 +0900
> > @@ -568,7 +568,7 @@
> >  /*
> >   * permit the bootmem allocator to evade page validation on high-order frees
> >   */
> > -void __init __free_pages_bootmem(struct page *page, unsigned int order)
> > +void __free_pages_bootmem(struct page *page, unsigned int order)
> 
> not __meminit or something?

Ah, yes. I'll fix it.

Thanks for the comments.

Bye.

-- 
Yasunori Goto 




* [Patch](memory hotplug) Tiny fixes of bootmem free patch for memory hotremove
  2008-06-16 13:58     ` Yasunori Goto
@ 2008-06-17 11:39       ` Yasunori Goto
  0 siblings, 0 replies; 15+ messages in thread
From: Yasunori Goto @ 2008-06-17 11:39 UTC (permalink / raw)
  To: Andy Whitcroft, Andrew Morton
  Cc: Badari Pulavarty, Linux Kernel ML, linux-mm, Yinghai Lu


Here are tiny fixes for the bootmem free patches for memory hotremove.

  - Change some naming:
      * magic -> type
      * MIX_INFO -> MIX_SECTION_INFO
      * Define the bootmem types relative to -1 instead of as direct
        hex values.
  - __free_pages_bootmem() becomes __meminit.

This is for 2.6.26-rc5-mm3.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>


---
 include/linux/memory_hotplug.h |    8 ++++----
 mm/memory_hotplug.c            |   12 ++++++------
 mm/page_alloc.c                |    2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

Index: current/include/linux/memory_hotplug.h
===================================================================
--- current.orig/include/linux/memory_hotplug.h	2008-06-10 20:23:30.000000000 +0900
+++ current/include/linux/memory_hotplug.h	2008-06-17 20:30:14.000000000 +0900
@@ -13,12 +13,12 @@
 #ifdef CONFIG_MEMORY_HOTPLUG
 
 /*
- * Magic number for free bootmem.
+ * Types for free bootmem.
  * The normal smallest mapcount is -1. Here is smaller value than it.
  */
-#define SECTION_INFO		0xfffffffe
-#define MIX_INFO		0xfffffffd
-#define NODE_INFO		0xfffffffc
+#define SECTION_INFO		(-1 - 1)
+#define MIX_SECTION_INFO	(-1 - 2)
+#define NODE_INFO		(-1 - 3)
 
 /*
  * pgdat resizing functions
Index: current/mm/memory_hotplug.c
===================================================================
--- current.orig/mm/memory_hotplug.c	2008-06-17 15:34:29.000000000 +0900
+++ current/mm/memory_hotplug.c	2008-06-17 20:31:59.000000000 +0900
@@ -62,9 +62,9 @@
 
 #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
-static void get_page_bootmem(unsigned long info,  struct page *page, int magic)
+static void get_page_bootmem(unsigned long info,  struct page *page, int type)
 {
-	atomic_set(&page->_mapcount, magic);
+	atomic_set(&page->_mapcount, type);
 	SetPagePrivate(page);
 	set_page_private(page, info);
 	atomic_inc(&page->_count);
@@ -72,10 +72,10 @@
 
 void put_page_bootmem(struct page *page)
 {
-	int magic;
+	int type;
 
-	magic = atomic_read(&page->_mapcount);
-	BUG_ON(magic >= -1);
+	type = atomic_read(&page->_mapcount);
+	BUG_ON(type >= -1);
 
 	if (atomic_dec_return(&page->_count) == 1) {
 		ClearPagePrivate(page);
@@ -119,7 +119,7 @@
 	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
-		get_page_bootmem(section_nr, page, MIX_INFO);
+		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
 
 }
 
Index: current/mm/page_alloc.c
===================================================================
--- current.orig/mm/page_alloc.c	2008-06-17 15:34:29.000000000 +0900
+++ current/mm/page_alloc.c	2008-06-17 20:08:47.000000000 +0900
@@ -583,7 +583,7 @@
 /*
  * permit the bootmem allocator to evade page validation on high-order frees
  */
-void __free_pages_bootmem(struct page *page, unsigned int order)
+void __meminit __free_pages_bootmem(struct page *page, unsigned int order)
 {
 	if (order == 0) {
 		__ClearPageReserved(page);

-- 
Yasunori Goto 




Thread overview: 15+ messages
2008-04-07 12:43 [Patch 000/005](memory hotplug) freeing pages allocated by bootmem for hotremove v3 Yasunori Goto
2008-04-07 12:45 ` [Patch 001/005](memory hotplug) register section/node id to free Yasunori Goto
2008-06-16 10:21   ` Andy Whitcroft
2008-06-16 13:58     ` Yasunori Goto
2008-06-17 11:39       ` [Patch](memory hotplug) Tiny fixes of bootmem free patch for memory hotremove Yasunori Goto
2008-04-07 12:46 ` [Patch 002/005](memory hotplug) align memmap to page size Yasunori Goto
2008-06-16 10:26   ` Andy Whitcroft
2008-06-16 13:26     ` Yasunori Goto
2008-04-07 12:47 ` [Patch 003/005](memory hotplug) make alloc_bootmem_section() Yasunori Goto
2008-06-16 10:32   ` Andy Whitcroft
2008-06-16 13:18     ` Yasunori Goto
2008-04-07 12:48 ` [Patch 004/005](memory hotplug)allocate usemap on the section with pgdat Yasunori Goto
2008-04-07 12:50 ` [Patch 005/005](memory hotplug) free memmaps allocated by bootmem Yasunori Goto
2008-06-16 10:44   ` Andy Whitcroft
2008-06-16 14:09     ` Yasunori Goto
