* [PATCH 1/2] Add alloc_pages_exact_nid()
@ 2011-05-05 19:46 Andi Kleen
2011-05-05 19:46 ` [PATCH 2/2] Allocate memory cgroup structures in local nodes v3 Andi Kleen
0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2011-05-05 19:46 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add alloc_pages_exact_nid(), a variant of alloc_pages_exact() that
tries to allocate on a specific node first. The naming is quite broken
(it is easily confused with alloc_pages_exact_node()), but fixing that
would need a larger renaming.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/gfp.h | 2 ++
mm/page_alloc.c | 48 ++++++++++++++++++++++++++++++++++++------------
2 files changed, 38 insertions(+), 12 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index bfb8f93..56d8fc8 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -353,6 +353,8 @@ extern unsigned long get_zeroed_page(gfp_t gfp_mask);
void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
void free_pages_exact(void *virt, size_t size);
+/* This is different from alloc_pages_exact_node !!! */
+void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
#define __get_free_page(gfp_mask) \
__get_free_pages((gfp_mask), 0)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f8a97b..5219dac 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2317,6 +2317,21 @@ void free_pages(unsigned long addr, unsigned int order)
EXPORT_SYMBOL(free_pages);
+static void *make_alloc_exact(void *addr, unsigned order, size_t size)
+{
+ if (addr) {
+ unsigned long alloc_end = addr + (PAGE_SIZE << order);
+ unsigned long used = addr + PAGE_ALIGN(size);
+
+ split_page(virt_to_page((void *)addr), order);
+ while (used < alloc_end) {
+ free_page(used);
+ used += PAGE_SIZE;
+ }
+ }
+ return (void *)addr;
+}
+
/**
* alloc_pages_exact - allocate an exact number physically-contiguous pages.
* @size: the number of bytes to allocate
@@ -2336,22 +2351,31 @@ void *alloc_pages_exact(size_t size, gfp_t gfp_mask)
unsigned long addr;
addr = __get_free_pages(gfp_mask, order);
- if (addr) {
- unsigned long alloc_end = addr + (PAGE_SIZE << order);
- unsigned long used = addr + PAGE_ALIGN(size);
-
- split_page(virt_to_page((void *)addr), order);
- while (used < alloc_end) {
- free_page(used);
- used += PAGE_SIZE;
- }
- }
-
- return (void *)addr;
+ return make_alloc_exact(addr, order, size);
}
EXPORT_SYMBOL(alloc_pages_exact);
/**
+ * alloc_pages_exact_nid - allocate an exact number of physically-contiguous pages on a node.
+ * @size: the number of bytes to allocate
+ * @gfp_mask: GFP flags for the allocation
+ *
+ * Like alloc_pages_exact, but try to allocate on node nid first
+ * before falling back.
+ * Note this is not alloc_pages_exact_node() which allocates
+ * on a specific node, but is not exact.
+ */
+void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask)
+{
+ unsigned order = get_order(size);
+ struct page *p = alloc_pages_node(nid, gfp_mask, order);
+ if (!p)
+ return NULL;
+ return make_alloc_exact(page_address(p), order, size);
+}
+EXPORT_SYMBOL(alloc_pages_exact_nid);
+
+/**
* free_pages_exact - release memory allocated via alloc_pages_exact()
* @virt: the value returned by alloc_pages_exact.
* @size: size of allocation, same value as passed to alloc_pages_exact().
--
1.7.4.4
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
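[Editorial note: the page accounting behind make_alloc_exact() in the
patch above can be sketched in user space. This is only an illustration,
not kernel code. It assumes 4 KiB pages and a size greater than zero, and
every sketch_* name is hypothetical; the helpers merely stand in for
get_order(), PAGE_ALIGN() and the free_page() loop.]

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SIZE is per-architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical stand-in for get_order(): the smallest order for which
 * (PAGE_SIZE << order) covers size bytes.
 */
static unsigned sketch_get_order(size_t size)
{
	unsigned order = 0;

	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Stand-in for PAGE_ALIGN(size), expressed as a page count. */
static size_t sketch_pages_used(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * Number of tail pages that the free_page() loop in make_alloc_exact()
 * would give back after split_page().
 */
static size_t sketch_pages_freed(size_t size)
{
	size_t allocated = (size_t)1 << sketch_get_order(size);
	size_t used = sketch_pages_used(size);

	assert(used <= allocated);
	return allocated - used;
}
```

[Under these assumptions a five-page request is rounded up to order 3,
that is eight pages, and sketch_pages_freed(5 * SKETCH_PAGE_SIZE)
reports the three tail pages that are returned to the allocator.]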
* [PATCH 2/2] Allocate memory cgroup structures in local nodes v3
2011-05-05 19:46 [PATCH 1/2] Add alloc_pages_exact_nid() Andi Kleen
@ 2011-05-05 19:46 ` Andi Kleen
2011-05-06 8:49 ` Michal Hocko
2011-05-06 19:46 ` Balbir Singh
0 siblings, 2 replies; 5+ messages in thread
From: Andi Kleen @ 2011-05-05 19:46 UTC (permalink / raw)
To: linux-mm
Cc: akpm, Andi Kleen, rientjes, Michal Hocko, Dave Hansen,
Balbir Singh, Johannes Weiner
From: Andi Kleen <ak@linux.intel.com>
dde79e005a769 added a regression that the memory cgroup data structures
all end up in node 0 because the first attempt at allocating them
would not pass in a node hint. Since the initialization runs on CPU #0
it would all end up node 0. This is a problem on large memory systems,
where node 0 would lose a lot of memory.
Change the alloc_pages_exact to alloc_pages_exact_node. This will
still fall back to other nodes if not enough memory is available.
[RED-PEN: right now it would fall back first before trying
vmalloc_node. Probably not the best strategy ... But I left it like
that for now.]
v3: Really call the correct function now. Thanks for everyone who commented.
Reported-by: Doug Nelson
Cc: rientjes@google.com
CC: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
mm/page_alloc.c | 4 ++--
mm/page_cgroup.c | 6 ++++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5219dac..44e175d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2317,7 +2317,7 @@ void free_pages(unsigned long addr, unsigned int order)
EXPORT_SYMBOL(free_pages);
-static void *make_alloc_exact(void *addr, unsigned order, size_t size)
+static void *make_alloc_exact(unsigned long addr, unsigned order, size_t size)
{
if (addr) {
unsigned long alloc_end = addr + (PAGE_SIZE << order);
@@ -2371,7 +2371,7 @@ void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask)
struct page *p = alloc_pages_node(nid, gfp_mask, order);
if (!p)
return NULL;
- return make_alloc_exact(page_address(p), order, size);
+ return make_alloc_exact((unsigned long)page_address(p), order, size);
}
EXPORT_SYMBOL(alloc_pages_exact_nid);
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 9905501..347ab60 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -134,9 +134,11 @@ static void *__init_refok alloc_page_cgroup(size_t size, int nid)
{
void *addr = NULL;
- addr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
- if (addr)
+ addr = alloc_pages_exact_nid(nid, size, GFP_KERNEL | __GFP_NOWARN);
+ if (addr) {
+ printk("%s: allocated exact\n", __FUNCTION__);
return addr;
+ }
if (node_state(nid, N_HIGH_MEMORY))
addr = vmalloc_node(size, nid);
--
1.7.4.4
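[Editorial note: the fallback order described in the changelog above,
including the RED-PEN remark, can be modelled with a toy user-space
sketch. It is not the kernel implementation. The per-node tables, the
round-robin order in which other nodes are tried, and all sketch_* names
are assumptions made for illustration; the real allocator picks fallback
nodes from its own zonelists.]

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NODES 4

enum sketch_source {
	SKETCH_LOCAL,	/* contiguous pages on the requested node */
	SKETCH_REMOTE,	/* contiguous pages on some other node */
	SKETCH_VMALLOC,	/* vmalloc-style fallback on the requested node */
	SKETCH_FAILED
};

/* Toy model of the machine: one entry per NUMA node. */
struct sketch_system {
	size_t contig_free[SKETCH_NODES];	/* largest contiguous run, in pages */
	int vmalloc_ok[SKETCH_NODES];		/* non-zero if the vmalloc path can succeed */
};

/*
 * Mirrors the order in alloc_page_cgroup() after the patch: try the
 * node-hinted exact allocation first (which may itself fall back to
 * other nodes), and only then the vmalloc path on the requested node.
 */
static enum sketch_source
sketch_alloc_page_cgroup(const struct sketch_system *sys, int nid,
			 size_t pages, int *node_out)
{
	int i;

	assert(nid >= 0 && nid < SKETCH_NODES);

	if (sys->contig_free[nid] >= pages) {
		*node_out = nid;
		return SKETCH_LOCAL;
	}

	/* Assumed fallback order: the remaining nodes, round-robin. */
	for (i = 1; i < SKETCH_NODES; i++) {
		int other = (nid + i) % SKETCH_NODES;

		if (sys->contig_free[other] >= pages) {
			*node_out = other;
			return SKETCH_REMOTE;
		}
	}

	if (sys->vmalloc_ok[nid]) {
		*node_out = nid;
		return SKETCH_VMALLOC;
	}

	*node_out = -1;
	return SKETCH_FAILED;
}
```

[In this model a request for node 2 is served from a remote node
whenever any other node has enough contiguous pages, even if the
vmalloc path on node 2 would have succeeded. That is the ordering the
RED-PEN remark calls "probably not the best strategy".]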
* Re: [PATCH 2/2] Allocate memory cgroup structures in local nodes v3
2011-05-05 19:46 ` [PATCH 2/2] Allocate memory cgroup structures in local nodes v3 Andi Kleen
@ 2011-05-06 8:49 ` Michal Hocko
2011-05-06 17:06 ` Andi Kleen
2011-05-06 19:46 ` Balbir Singh
1 sibling, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2011-05-06 8:49 UTC (permalink / raw)
To: Andi Kleen
Cc: linux-mm, akpm, Andi Kleen, rientjes, Dave Hansen, Balbir Singh,
Johannes Weiner
On Thu 05-05-11 12:46:02, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> dde79e005a769 added a regression that the memory cgroup data structures
> all end up in node 0 because the first attempt at allocating them
> would not pass in a node hint. Since the initialization runs on CPU #0
> it would all end up node 0. This is a problem on large memory systems,
> where node 0 would lose a lot of memory.
>
> Change the alloc_pages_exact to alloc_pages_exact_node. This will
> still fall back to other nodes if not enough memory is available.
>
> [RED-PEN: right now it would fall back first before trying
> vmalloc_node. Probably not the best strategy ... But I left it like
> that for now.]
>
> v3: Really call the correct function now. Thanks for everyone who commented.
> Reported-by: Doug Nelson
> Cc: rientjes@google.com
> CC: Michal Hocko <mhocko@suse.cz>
> Cc: Dave Hansen <dave@linux.vnet.ibm.com>
> Cc: Balbir Singh <balbir@in.ibm.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> mm/page_alloc.c | 4 ++--
> mm/page_cgroup.c | 6 ++++--
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5219dac..44e175d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2317,7 +2317,7 @@ void free_pages(unsigned long addr, unsigned int order)
>
> EXPORT_SYMBOL(free_pages);
>
> -static void *make_alloc_exact(void *addr, unsigned order, size_t size)
> +static void *make_alloc_exact(unsigned long addr, unsigned order, size_t size)
> {
> if (addr) {
> unsigned long alloc_end = addr + (PAGE_SIZE << order);
> @@ -2371,7 +2371,7 @@ void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask)
> struct page *p = alloc_pages_node(nid, gfp_mask, order);
> if (!p)
> return NULL;
> - return make_alloc_exact(page_address(p), order, size);
> + return make_alloc_exact((unsigned long)page_address(p), order, size);
I am not sure whether this clashes with what Dave was working on. Some
pieces are already in the -mm tree, but I do not see the node versions
being renamed.
> }
> EXPORT_SYMBOL(alloc_pages_exact_nid);
>
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 9905501..347ab60 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -134,9 +134,11 @@ static void *__init_refok alloc_page_cgroup(size_t size, int nid)
> {
> void *addr = NULL;
>
> - addr = alloc_pages_exact(size, GFP_KERNEL | __GFP_NOWARN);
> - if (addr)
> + addr = alloc_pages_exact_nid(nid, size, GFP_KERNEL | __GFP_NOWARN);
> + if (addr) {
> + printk("%s: allocated exact\n", __FUNCTION__);
What is this printk for? Other than that the change looks good to me.
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Thanks
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic
* Re: [PATCH 2/2] Allocate memory cgroup structures in local nodes v3
2011-05-06 8:49 ` Michal Hocko
@ 2011-05-06 17:06 ` Andi Kleen
0 siblings, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2011-05-06 17:06 UTC (permalink / raw)
To: Michal Hocko
Cc: Andi Kleen, linux-mm, akpm, Andi Kleen, rientjes, Dave Hansen,
Balbir Singh, Johannes Weiner
> What is this printk for? Other than that the change looks good to me.
Leftover debugging code. I'll remove it.
Thanks.
-Andi
* Re: [PATCH 2/2] Allocate memory cgroup structures in local nodes v3
2011-05-05 19:46 ` [PATCH 2/2] Allocate memory cgroup structures in local nodes v3 Andi Kleen
2011-05-06 8:49 ` Michal Hocko
@ 2011-05-06 19:46 ` Balbir Singh
1 sibling, 0 replies; 5+ messages in thread
From: Balbir Singh @ 2011-05-06 19:46 UTC (permalink / raw)
To: Andi Kleen
Cc: linux-mm, akpm, Andi Kleen, rientjes, Michal Hocko, Dave Hansen,
Johannes Weiner
* Andi Kleen <andi@firstfloor.org> [2011-05-05 12:46:02]:
> From: Andi Kleen <ak@linux.intel.com>
>
> dde79e005a769 added a regression that the memory cgroup data structures
> all end up in node 0 because the first attempt at allocating them
> would not pass in a node hint. Since the initialization runs on CPU #0
> it would all end up node 0. This is a problem on large memory systems,
> where node 0 would lose a lot of memory.
>
> Change the alloc_pages_exact to alloc_pages_exact_node. This will
(typo: this should be alloc_pages_exact_nid)
> still fall back to other nodes if not enough memory is available.
>
> [RED-PEN: right now it would fall back first before trying
> vmalloc_node. Probably not the best strategy ... But I left it like
> that for now.]
>
The patch looks good except for the printk.
--
Three Cheers,
Balbir