* [PATCH 0/4] cache-specific changes for memcg (preparation)
@ 2012-06-20 20:59 Glauber Costa
2012-06-20 20:59 ` [PATCH 1/4] slab: rename gfpflags to allocflags Glauber Costa
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Glauber Costa @ 2012-06-20 20:59 UTC (permalink / raw)
To: Pekka Enberg; +Cc: linux-mm, Cristoph Lameter, David Rientjes
Pekka,
Please consider merging the following patches.
Patches 1-3 are around for a while, specially 1 and 3.
Patch #4 is a bit newer, and got less reviews (reviews appreciated),
but I am using it myself without further problems.
Feel free to pick all of them, or part of them, the way you prefer.
Glauber Costa (4):
slab: rename gfpflags to allocflags
Wipe out CFLGS_OFF_SLAB from flags during initial slab creation
slab: move FULL state transition to an initcall
don't do __ClearPageSlab before freeing slab page.
include/linux/slab_def.h | 2 +-
mm/page_alloc.c | 5 ++++-
mm/slab.c | 29 +++++++++++++++--------------
mm/slob.c | 1 -
mm/slub.c | 1 -
5 files changed, 20 insertions(+), 18 deletions(-)
--
1.7.10.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/4] slab: rename gfpflags to allocflags
2012-06-20 20:59 [PATCH 0/4] cache-specific changes for memcg (preparation) Glauber Costa
@ 2012-06-20 20:59 ` Glauber Costa
2012-06-21 7:53 ` David Rientjes
2012-06-20 20:59 ` [PATCH 2/4] Wipe out CFLGS_OFF_SLAB from flags during initial slab creation Glauber Costa
` (2 subsequent siblings)
3 siblings, 1 reply; 16+ messages in thread
From: Glauber Costa @ 2012-06-20 20:59 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, Cristoph Lameter, David Rientjes, Glauber Costa, Pekka Enberg
A consistent name with slub saves us an acessor function.
In both caches, this field represents the same thing. We would
like to use it from the mem_cgroup code.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
CC: Pekka Enberg <penberg@cs.helsinki.fi>
---
include/linux/slab_def.h | 2 +-
mm/slab.c | 10 +++++-----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 1d93f27..0c634fa 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -39,7 +39,7 @@ struct kmem_cache {
unsigned int gfporder;
/* force GFP flags, e.g. GFP_DMA */
- gfp_t gfpflags;
+ gfp_t allocflags;
size_t colour; /* cache colouring range */
unsigned int colour_off; /* colour offset */
diff --git a/mm/slab.c b/mm/slab.c
index dd607a8..bb79652 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1771,7 +1771,7 @@ static void *kmem_getpages(struct kmem_cache *cachep, gfp_t flags, int nodeid)
flags |= __GFP_COMP;
#endif
- flags |= cachep->gfpflags;
+ flags |= cachep->allocflags;
if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
flags |= __GFP_RECLAIMABLE;
@@ -2482,9 +2482,9 @@ kmem_cache_create (const char *name, size_t size, size_t align,
cachep->colour = left_over / cachep->colour_off;
cachep->slab_size = slab_size;
cachep->flags = flags;
- cachep->gfpflags = 0;
+ cachep->allocflags = 0;
if (CONFIG_ZONE_DMA_FLAG && (flags & SLAB_CACHE_DMA))
- cachep->gfpflags |= GFP_DMA;
+ cachep->allocflags |= GFP_DMA;
cachep->size = size;
cachep->reciprocal_buffer_size = reciprocal_value(size);
@@ -2831,9 +2831,9 @@ static void kmem_flagcheck(struct kmem_cache *cachep, gfp_t flags)
{
if (CONFIG_ZONE_DMA_FLAG) {
if (flags & GFP_DMA)
- BUG_ON(!(cachep->gfpflags & GFP_DMA));
+ BUG_ON(!(cachep->allocflags & GFP_DMA));
else
- BUG_ON(cachep->gfpflags & GFP_DMA);
+ BUG_ON(cachep->allocflags & GFP_DMA);
}
}
--
1.7.10.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/4] Wipe out CFLGS_OFF_SLAB from flags during initial slab creation
2012-06-20 20:59 [PATCH 0/4] cache-specific changes for memcg (preparation) Glauber Costa
2012-06-20 20:59 ` [PATCH 1/4] slab: rename gfpflags to allocflags Glauber Costa
@ 2012-06-20 20:59 ` Glauber Costa
2012-06-21 7:59 ` David Rientjes
2012-06-20 20:59 ` [PATCH 3/4] slab: move FULL state transition to an initcall Glauber Costa
2012-06-20 20:59 ` [PATCH 4/4] don't do __ClearPageSlab before freeing slab page Glauber Costa
3 siblings, 1 reply; 16+ messages in thread
From: Glauber Costa @ 2012-06-20 20:59 UTC (permalink / raw)
To: Pekka Enberg; +Cc: linux-mm, Cristoph Lameter, David Rientjes, Pekka Enberg
CFLGS_OFF_SLAB is not a valid flag to be passed to cache creation.
If we are duplicating a cache - support added in a future patch -
we will rely on the flags it has stored in itself. That may include
CFLGS_OFF_SLAB.
So it is better to clean this flag at cache creation.
CC: Christoph Lameter <cl@linux.com>
CC: Pekka Enberg <penberg@cs.helsinki.fi>
CC: David Rientjes <rientjes@google.com>
---
mm/slab.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/mm/slab.c b/mm/slab.c
index bb79652..0d1bd09 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2317,6 +2317,12 @@ kmem_cache_create (const char *name, size_t size, size_t align,
BUG_ON(flags & SLAB_POISON);
#endif
/*
+ * Passing this flag at creation time is invalid, but if we're
+ * duplicating a slab, it may happen.
+ */
+ flags &= ~CFLGS_OFF_SLAB;
+
+ /*
* Always checks flags, a caller might be expecting debug support which
* isn't available.
*/
--
1.7.10.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/4] slab: move FULL state transition to an initcall
2012-06-20 20:59 [PATCH 0/4] cache-specific changes for memcg (preparation) Glauber Costa
2012-06-20 20:59 ` [PATCH 1/4] slab: rename gfpflags to allocflags Glauber Costa
2012-06-20 20:59 ` [PATCH 2/4] Wipe out CFLGS_OFF_SLAB from flags during initial slab creation Glauber Costa
@ 2012-06-20 20:59 ` Glauber Costa
2012-06-21 8:01 ` David Rientjes
2012-06-20 20:59 ` [PATCH 4/4] don't do __ClearPageSlab before freeing slab page Glauber Costa
3 siblings, 1 reply; 16+ messages in thread
From: Glauber Costa @ 2012-06-20 20:59 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, Cristoph Lameter, David Rientjes, Glauber Costa, Pekka Enberg
During kmem_cache_init_late(), we transition to the LATE state,
and after some more work, to the FULL state, its last state
This is quite different from slub, that will only transition to
its last state (previously SYSFS), in a (late)initcall, after a lot
more of the kernel is ready.
This means that in slab, we have no way to taking actions dependent
on the initialization of other pieces of the kernel that are supposed
to start way after kmem_init_late(), such as cgroups initialization.
To achieve more consistency in this behavior, that patch only
transitions to the UP state in kmem_init_late. In my analysis,
setup_cpu_cache() should be happy to test for >= UP, instead of
== FULL. It also has passed some tests I've made.
We then only mark FULL state after the reap timers are in place,
meaning that no further setup is expected.
Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
CC: Pekka Enberg <penberg@cs.helsinki.fi>
CC: David Rientjes <rientjes@google.com>
---
mm/slab.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index 0d1bd09..cb6da05 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1668,9 +1668,6 @@ void __init kmem_cache_init_late(void)
BUG();
mutex_unlock(&cache_chain_mutex);
- /* Done! */
- g_cpucache_up = FULL;
-
/*
* Register a cpu startup notifier callback that initializes
* cpu_cache_get for all new cpus
@@ -1700,6 +1697,9 @@ static int __init cpucache_init(void)
*/
for_each_online_cpu(cpu)
start_cpu_timer(cpu);
+
+ /* Done! */
+ g_cpucache_up = FULL;
return 0;
}
__initcall(cpucache_init);
@@ -2167,7 +2167,7 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
static int __init_refok setup_cpu_cache(struct kmem_cache *cachep, gfp_t gfp)
{
- if (g_cpucache_up == FULL)
+ if (g_cpucache_up >= LATE)
return enable_cpucache(cachep, gfp);
if (g_cpucache_up == NONE) {
--
1.7.10.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-20 20:59 [PATCH 0/4] cache-specific changes for memcg (preparation) Glauber Costa
` (2 preceding siblings ...)
2012-06-20 20:59 ` [PATCH 3/4] slab: move FULL state transition to an initcall Glauber Costa
@ 2012-06-20 20:59 ` Glauber Costa
2012-06-21 8:04 ` David Rientjes
3 siblings, 1 reply; 16+ messages in thread
From: Glauber Costa @ 2012-06-20 20:59 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, Cristoph Lameter, David Rientjes, Glauber Costa,
Pekka Enberg, Michal Hocko, Kamezawa Hiroyuki, Johannes Weiner,
Suleiman Souhlal
This will give the oportunity to the page allocator to
determine that a given page was previously a slab page, and
take action accordingly.
If memcg kmem is present, this means that that page needs to
be unaccounted. The page allocator will now have the responsibility
to clear that bit upon free_pages().
It is not uncommon to have the page allocator to check page flags.
Mlock flag, for instance, is checked pervasively all over the place.
So I hope this is okay for the slab as well.
Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: Pekka Enberg <penberg@cs.helsinki.fi>
CC: Michal Hocko <mhocko@suse.cz>
CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Johannes Weiner <hannes@cmpxchg.org>
CC: Suleiman Souhlal <suleiman@google.com>
---
mm/page_alloc.c | 5 ++++-
mm/slab.c | 5 -----
mm/slob.c | 1 -
mm/slub.c | 1 -
4 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6092f33..fdec73e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -698,8 +698,10 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
if (PageAnon(page))
page->mapping = NULL;
- for (i = 0; i < (1 << order); i++)
+ for (i = 0; i < (1 << order); i++) {
+ __ClearPageSlab(page + i);
bad += free_pages_check(page + i);
+ }
if (bad)
return false;
@@ -2561,6 +2563,7 @@ EXPORT_SYMBOL(get_zeroed_page);
void __free_pages(struct page *page, unsigned int order)
{
if (put_page_testzero(page)) {
+ __ClearPageSlab(page);
if (order == 0)
free_hot_cold_page(page, 0);
else
diff --git a/mm/slab.c b/mm/slab.c
index cb6da05..3e578fc 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1821,11 +1821,6 @@ static void kmem_freepages(struct kmem_cache *cachep, void *addr)
else
sub_zone_page_state(page_zone(page),
NR_SLAB_UNRECLAIMABLE, nr_freed);
- while (i--) {
- BUG_ON(!PageSlab(page));
- __ClearPageSlab(page);
- page++;
- }
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += nr_freed;
free_pages((unsigned long)addr, cachep->gfporder);
diff --git a/mm/slob.c b/mm/slob.c
index 95d1c7d..48b9a79 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -359,7 +359,6 @@ static void slob_free(void *block, int size)
if (slob_page_free(sp))
clear_slob_page_free(sp);
spin_unlock_irqrestore(&slob_lock, flags);
- __ClearPageSlab(sp);
reset_page_mapcount(sp);
slob_free_pages(b, 0);
return;
diff --git a/mm/slub.c b/mm/slub.c
index f96d8bc..b0ac04a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1413,7 +1413,6 @@ static void __free_slab(struct kmem_cache *s, struct page *page)
NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
-pages);
- __ClearPageSlab(page);
reset_page_mapcount(page);
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += pages;
--
1.7.10.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 1/4] slab: rename gfpflags to allocflags
2012-06-20 20:59 ` [PATCH 1/4] slab: rename gfpflags to allocflags Glauber Costa
@ 2012-06-21 7:53 ` David Rientjes
0 siblings, 0 replies; 16+ messages in thread
From: David Rientjes @ 2012-06-21 7:53 UTC (permalink / raw)
To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, Cristoph Lameter, Pekka Enberg
On Thu, 21 Jun 2012, Glauber Costa wrote:
> A consistent name with slub saves us an acessor function.
> In both caches, this field represents the same thing. We would
> like to use it from the mem_cgroup code.
>
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Acked-by: Christoph Lameter <cl@linux.com>
> CC: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: David Rientjes <rientjes@google.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/4] Wipe out CFLGS_OFF_SLAB from flags during initial slab creation
2012-06-20 20:59 ` [PATCH 2/4] Wipe out CFLGS_OFF_SLAB from flags during initial slab creation Glauber Costa
@ 2012-06-21 7:59 ` David Rientjes
0 siblings, 0 replies; 16+ messages in thread
From: David Rientjes @ 2012-06-21 7:59 UTC (permalink / raw)
To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, Cristoph Lameter, Pekka Enberg
On Thu, 21 Jun 2012, Glauber Costa wrote:
> CFLGS_OFF_SLAB is not a valid flag to be passed to cache creation.
> If we are duplicating a cache - support added in a future patch -
> we will rely on the flags it has stored in itself. That may include
> CFLGS_OFF_SLAB.
>
> So it is better to clean this flag at cache creation.
>
I think this should be folded into the patch that allows cache
duplication, it doesn't make sense to apply right now because
CFLGS_OFF_SLAB is internal to mm/slab.c and it's not going to be passing
this to kmem_cache_create() itself yet.
Also, do we care that cache_estimate() uses a formal of type int for flags
when CFLGS_OFF_SLAB is unsigned long?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] slab: move FULL state transition to an initcall
2012-06-20 20:59 ` [PATCH 3/4] slab: move FULL state transition to an initcall Glauber Costa
@ 2012-06-21 8:01 ` David Rientjes
2012-07-02 10:57 ` Pekka Enberg
0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-21 8:01 UTC (permalink / raw)
To: Glauber Costa; +Cc: Pekka Enberg, linux-mm, Cristoph Lameter, Pekka Enberg
On Thu, 21 Jun 2012, Glauber Costa wrote:
> During kmem_cache_init_late(), we transition to the LATE state,
> and after some more work, to the FULL state, its last state
>
> This is quite different from slub, that will only transition to
> its last state (previously SYSFS), in a (late)initcall, after a lot
> more of the kernel is ready.
>
> This means that in slab, we have no way to taking actions dependent
> on the initialization of other pieces of the kernel that are supposed
> to start way after kmem_init_late(), such as cgroups initialization.
>
> To achieve more consistency in this behavior, that patch only
> transitions to the UP state in kmem_init_late. In my analysis,
> setup_cpu_cache() should be happy to test for >= UP, instead of
> == FULL. It also has passed some tests I've made.
>
> We then only mark FULL state after the reap timers are in place,
> meaning that no further setup is expected.
>
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> Acked-by: Christoph Lameter <cl@linux.com>
> CC: Pekka Enberg <penberg@cs.helsinki.fi>
> CC: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
[ Might want to fix your address book in your email client because
Christoph's name is misspelled in the cc list. ]
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-20 20:59 ` [PATCH 4/4] don't do __ClearPageSlab before freeing slab page Glauber Costa
@ 2012-06-21 8:04 ` David Rientjes
2012-06-21 8:13 ` Glauber Costa
0 siblings, 1 reply; 16+ messages in thread
From: David Rientjes @ 2012-06-21 8:04 UTC (permalink / raw)
To: Glauber Costa
Cc: Pekka Enberg, linux-mm, Cristoph Lameter, Pekka Enberg,
Michal Hocko, Kamezawa Hiroyuki, Johannes Weiner,
Suleiman Souhlal
On Thu, 21 Jun 2012, Glauber Costa wrote:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6092f33..fdec73e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -698,8 +698,10 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
>
> if (PageAnon(page))
> page->mapping = NULL;
> - for (i = 0; i < (1 << order); i++)
> + for (i = 0; i < (1 << order); i++) {
> + __ClearPageSlab(page + i);
> bad += free_pages_check(page + i);
> + }
> if (bad)
> return false;
>
> @@ -2561,6 +2563,7 @@ EXPORT_SYMBOL(get_zeroed_page);
> void __free_pages(struct page *page, unsigned int order)
> {
> if (put_page_testzero(page)) {
> + __ClearPageSlab(page);
> if (order == 0)
> free_hot_cold_page(page, 0);
> else
These are called from a number of different places that has nothing to do
with slab so it's certainly out of place here. Is there really no
alternative way of doing this?
> diff --git a/mm/slab.c b/mm/slab.c
> index cb6da05..3e578fc 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1821,11 +1821,6 @@ static void kmem_freepages(struct kmem_cache *cachep, void *addr)
> else
> sub_zone_page_state(page_zone(page),
> NR_SLAB_UNRECLAIMABLE, nr_freed);
> - while (i--) {
> - BUG_ON(!PageSlab(page));
> - __ClearPageSlab(page);
> - page++;
> - }
> if (current->reclaim_state)
> current->reclaim_state->reclaimed_slab += nr_freed;
> free_pages((unsigned long)addr, cachep->gfporder);
And we lose this validation in slab.
I'm hoping there's an alternative.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-21 8:04 ` David Rientjes
@ 2012-06-21 8:13 ` Glauber Costa
2012-06-21 11:04 ` Kamezawa Hiroyuki
2012-07-03 18:40 ` Christoph Lameter
0 siblings, 2 replies; 16+ messages in thread
From: Glauber Costa @ 2012-06-21 8:13 UTC (permalink / raw)
To: David Rientjes
Cc: Pekka Enberg, linux-mm, Cristoph Lameter, Pekka Enberg,
Michal Hocko, Kamezawa Hiroyuki, Johannes Weiner,
Suleiman Souhlal
On 06/21/2012 12:04 PM, David Rientjes wrote:
> On Thu, 21 Jun 2012, Glauber Costa wrote:
>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 6092f33..fdec73e 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -698,8 +698,10 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
>>
>> if (PageAnon(page))
>> page->mapping = NULL;
>> - for (i = 0; i < (1 << order); i++)
>> + for (i = 0; i < (1 << order); i++) {
>> + __ClearPageSlab(page + i);
>> bad += free_pages_check(page + i);
>> + }
>> if (bad)
>> return false;
>>
>> @@ -2561,6 +2563,7 @@ EXPORT_SYMBOL(get_zeroed_page);
>> void __free_pages(struct page *page, unsigned int order)
>> {
>> if (put_page_testzero(page)) {
>> + __ClearPageSlab(page);
>> if (order == 0)
>> free_hot_cold_page(page, 0);
>> else
>
> These are called from a number of different places that has nothing to do
> with slab so it's certainly out of place here. Is there really no
> alternative way of doing this?
Well, if the requirement is that we must handle this from the page
allocator, how else should I know if I must call the corresponding free
functions ?
Also note that other bits are tested inside the page allocator as well,
such as MLock.
I saw no other way, but if you have suggestions, I'd be open to try
them, of course.
>> diff --git a/mm/slab.c b/mm/slab.c
>> index cb6da05..3e578fc 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -1821,11 +1821,6 @@ static void kmem_freepages(struct kmem_cache *cachep, void *addr)
>> else
>> sub_zone_page_state(page_zone(page),
>> NR_SLAB_UNRECLAIMABLE, nr_freed);
>> - while (i--) {
>> - BUG_ON(!PageSlab(page));
>> - __ClearPageSlab(page);
>> - page++;
>> - }
>> if (current->reclaim_state)
>> current->reclaim_state->reclaimed_slab += nr_freed;
>> free_pages((unsigned long)addr, cachep->gfporder);
>
> And we lose this validation in slab.
>
> I'm hoping there's an alternative.
The validation can stay, if you would prefer. We still expect the page
to be PageSlab at this point, so we can test for it, and only clear later.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-21 8:13 ` Glauber Costa
@ 2012-06-21 11:04 ` Kamezawa Hiroyuki
2012-06-21 11:14 ` Glauber Costa
2012-06-21 20:24 ` Glauber Costa
2012-07-03 18:40 ` Christoph Lameter
1 sibling, 2 replies; 16+ messages in thread
From: Kamezawa Hiroyuki @ 2012-06-21 11:04 UTC (permalink / raw)
To: Glauber Costa
Cc: David Rientjes, Pekka Enberg, linux-mm, Cristoph Lameter,
Pekka Enberg, Michal Hocko, Johannes Weiner, Suleiman Souhlal
(2012/06/21 17:13), Glauber Costa wrote:
> On 06/21/2012 12:04 PM, David Rientjes wrote:
>> On Thu, 21 Jun 2012, Glauber Costa wrote:
>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 6092f33..fdec73e 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -698,8 +698,10 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
>>>
>>> if (PageAnon(page))
>>> page->mapping = NULL;
>>> - for (i = 0; i < (1 << order); i++)
>>> + for (i = 0; i < (1 << order); i++) {
>>> + __ClearPageSlab(page + i);
>>> bad += free_pages_check(page + i);
>>> + }
>>> if (bad)
>>> return false;
>>>
>>> @@ -2561,6 +2563,7 @@ EXPORT_SYMBOL(get_zeroed_page);
>>> void __free_pages(struct page *page, unsigned int order)
>>> {
>>> if (put_page_testzero(page)) {
>>> + __ClearPageSlab(page);
>>> if (order == 0)
>>> free_hot_cold_page(page, 0);
>>> else
>>
>> These are called from a number of different places that has nothing to do
>> with slab so it's certainly out of place here. Is there really no
>> alternative way of doing this?
>
> Well, if the requirement is that we must handle this from the page allocator, how else should I know if I must call the corresponding free functions ?
>
> Also note that other bits are tested inside the page allocator as well, such as MLock.
>
> I saw no other way, but if you have suggestions, I'd be open to try them, of course.
>
I'm sorry I don't understand the logic enough well.
Why check in __free_pages() is better than check in callers of slab.c/slub.c ?
Thanks,
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-21 11:04 ` Kamezawa Hiroyuki
@ 2012-06-21 11:14 ` Glauber Costa
2012-06-21 20:24 ` Glauber Costa
1 sibling, 0 replies; 16+ messages in thread
From: Glauber Costa @ 2012-06-21 11:14 UTC (permalink / raw)
To: Kamezawa Hiroyuki
Cc: David Rientjes, Pekka Enberg, linux-mm, Cristoph Lameter,
Pekka Enberg, Michal Hocko, Johannes Weiner, Suleiman Souhlal
On 06/21/2012 03:04 PM, Kamezawa Hiroyuki wrote:
> (2012/06/21 17:13), Glauber Costa wrote:
>> On 06/21/2012 12:04 PM, David Rientjes wrote:
>>> On Thu, 21 Jun 2012, Glauber Costa wrote:
>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 6092f33..fdec73e 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -698,8 +698,10 @@ static bool free_pages_prepare(struct page
>>>> *page, unsigned int order)
>>>>
>>>> if (PageAnon(page))
>>>> page->mapping = NULL;
>>>> - for (i = 0; i < (1 << order); i++)
>>>> + for (i = 0; i < (1 << order); i++) {
>>>> + __ClearPageSlab(page + i);
>>>> bad += free_pages_check(page + i);
>>>> + }
>>>> if (bad)
>>>> return false;
>>>>
>>>> @@ -2561,6 +2563,7 @@ EXPORT_SYMBOL(get_zeroed_page);
>>>> void __free_pages(struct page *page, unsigned int order)
>>>> {
>>>> if (put_page_testzero(page)) {
>>>> + __ClearPageSlab(page);
>>>> if (order == 0)
>>>> free_hot_cold_page(page, 0);
>>>> else
>>>
>>> These are called from a number of different places that has nothing
>>> to do
>>> with slab so it's certainly out of place here. Is there really no
>>> alternative way of doing this?
>>
>> Well, if the requirement is that we must handle this from the page
>> allocator, how else should I know if I must call the corresponding
>> free functions ?
>>
>> Also note that other bits are tested inside the page allocator as
>> well, such as MLock.
>>
>> I saw no other way, but if you have suggestions, I'd be open to try
>> them, of course.
>>
>
> I'm sorry I don't understand the logic enough well.
>
> Why check in __free_pages() is better than check in callers of
> slab.c/slub.c ?
>
It's less intrusive, because it is independent of the cache being used
and doesn't modify the cache code. That's one of the things I tried to
achieve in this last patchset.
If the slab guys think this change in the slab is acceptable, and better
than in __free_pages, I'd be happy to change back to it. But then, I'd
like to to the same for the allocation.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-21 11:04 ` Kamezawa Hiroyuki
2012-06-21 11:14 ` Glauber Costa
@ 2012-06-21 20:24 ` Glauber Costa
2012-07-03 18:42 ` Christoph Lameter
1 sibling, 1 reply; 16+ messages in thread
From: Glauber Costa @ 2012-06-21 20:24 UTC (permalink / raw)
To: Kamezawa Hiroyuki
Cc: David Rientjes, Pekka Enberg, linux-mm, Cristoph Lameter,
Pekka Enberg, Michal Hocko, Johannes Weiner, Suleiman Souhlal
On 06/21/2012 03:04 PM, Kamezawa Hiroyuki wrote:
> (2012/06/21 17:13), Glauber Costa wrote:
>> On 06/21/2012 12:04 PM, David Rientjes wrote:
>>> On Thu, 21 Jun 2012, Glauber Costa wrote:
>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 6092f33..fdec73e 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -698,8 +698,10 @@ static bool free_pages_prepare(struct page
>>>> *page, unsigned int order)
>>>>
>>>> if (PageAnon(page))
>>>> page->mapping = NULL;
>>>> - for (i = 0; i < (1 << order); i++)
>>>> + for (i = 0; i < (1 << order); i++) {
>>>> + __ClearPageSlab(page + i);
>>>> bad += free_pages_check(page + i);
>>>> + }
>>>> if (bad)
>>>> return false;
>>>>
>>>> @@ -2561,6 +2563,7 @@ EXPORT_SYMBOL(get_zeroed_page);
>>>> void __free_pages(struct page *page, unsigned int order)
>>>> {
>>>> if (put_page_testzero(page)) {
>>>> + __ClearPageSlab(page);
>>>> if (order == 0)
>>>> free_hot_cold_page(page, 0);
>>>> else
>>>
>>> These are called from a number of different places that has nothing
>>> to do
>>> with slab so it's certainly out of place here. Is there really no
>>> alternative way of doing this?
>>
>> Well, if the requirement is that we must handle this from the page
>> allocator, how else should I know if I must call the corresponding
>> free functions ?
>>
>> Also note that other bits are tested inside the page allocator as
>> well, such as MLock.
>>
>> I saw no other way, but if you have suggestions, I'd be open to try
>> them, of course.
>>
>
> I'm sorry I don't understand the logic enough well.
>
> Why check in __free_pages() is better than check in callers of
> slab.c/slub.c ?
>
> Thanks,
> -Kame
>
How would the slab people feel, specially Christoph, about a simple
change in the caches, replacing free_pages and alloc_pages by common
functions that calls the memcg correspondents when needed ?
It could even be done in a header file, and the change in the slab goes
only as far as changing names in the call sites. (And maybe
standardizing, because we have a mixture of free_pages and __free_pages
around)
This would possibly render the __GFP_SLABMEMC not needed, since we'd
have stable call sites for memcg to derive its context from.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] slab: move FULL state transition to an initcall
2012-06-21 8:01 ` David Rientjes
@ 2012-07-02 10:57 ` Pekka Enberg
0 siblings, 0 replies; 16+ messages in thread
From: Pekka Enberg @ 2012-07-02 10:57 UTC (permalink / raw)
To: David Rientjes; +Cc: Glauber Costa, linux-mm, Cristoph Lameter, Pekka Enberg
On Thu, 21 Jun 2012, David Rientjes wrote:
> > During kmem_cache_init_late(), we transition to the LATE state,
> > and after some more work, to the FULL state, its last state
> >
> > This is quite different from slub, that will only transition to
> > its last state (previously SYSFS), in a (late)initcall, after a lot
> > more of the kernel is ready.
> >
> > This means that in slab, we have no way to taking actions dependent
> > on the initialization of other pieces of the kernel that are supposed
> > to start way after kmem_init_late(), such as cgroups initialization.
> >
> > To achieve more consistency in this behavior, that patch only
> > transitions to the UP state in kmem_init_late. In my analysis,
> > setup_cpu_cache() should be happy to test for >= UP, instead of
> > == FULL. It also has passed some tests I've made.
> >
> > We then only mark FULL state after the reap timers are in place,
> > meaning that no further setup is expected.
> >
> > Signed-off-by: Glauber Costa <glommer@parallels.com>
> > Acked-by: Christoph Lameter <cl@linux.com>
> > CC: Pekka Enberg <penberg@cs.helsinki.fi>
> > CC: David Rientjes <rientjes@google.com>
>
> Acked-by: David Rientjes <rientjes@google.com>
>
> [ Might want to fix your address book in your email client because
> Christoph's name is misspelled in the cc list. ]
Applied, thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-21 8:13 ` Glauber Costa
2012-06-21 11:04 ` Kamezawa Hiroyuki
@ 2012-07-03 18:40 ` Christoph Lameter
1 sibling, 0 replies; 16+ messages in thread
From: Christoph Lameter @ 2012-07-03 18:40 UTC (permalink / raw)
To: Glauber Costa
Cc: David Rientjes, Pekka Enberg, linux-mm, Pekka Enberg,
Michal Hocko, Kamezawa Hiroyuki, Johannes Weiner,
Suleiman Souhlal
On Thu, 21 Jun 2012, Glauber Costa wrote:
> Well, if the requirement is that we must handle this from the page allocator,
> how else should I know if I must call the corresponding free functions ?
Is there such a requirement? I believe I was talking about a wrapper that
accounts for page allocator requests.
> Also note that other bits are tested inside the page allocator as well, such
> as MLock.
Yea so we could do this but doing so requires agreement with the people
involved in page allocator development.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] don't do __ClearPageSlab before freeing slab page.
2012-06-21 20:24 ` Glauber Costa
@ 2012-07-03 18:42 ` Christoph Lameter
0 siblings, 0 replies; 16+ messages in thread
From: Christoph Lameter @ 2012-07-03 18:42 UTC (permalink / raw)
To: Glauber Costa
Cc: Kamezawa Hiroyuki, David Rientjes, Pekka Enberg, linux-mm,
Pekka Enberg, Michal Hocko, Johannes Weiner, Suleiman Souhlal
On Fri, 22 Jun 2012, Glauber Costa wrote:
> How would the slab people feel, specially Christoph, about a simple change in
> the caches, replacing free_pages and alloc_pages by common functions that
> calls the memcg correspondents when needed ?
I believe that is one of the optionst that I proposed earlier.
> This would possibly render the __GFP_SLABMEMC not needed, since we'd have
> stable call sites for memcg to derive its context from.
Right.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2012-07-03 18:42 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-20 20:59 [PATCH 0/4] cache-specific changes for memcg (preparation) Glauber Costa
2012-06-20 20:59 ` [PATCH 1/4] slab: rename gfpflags to allocflags Glauber Costa
2012-06-21 7:53 ` David Rientjes
2012-06-20 20:59 ` [PATCH 2/4] Wipe out CFLGS_OFF_SLAB from flags during initial slab creation Glauber Costa
2012-06-21 7:59 ` David Rientjes
2012-06-20 20:59 ` [PATCH 3/4] slab: move FULL state transition to an initcall Glauber Costa
2012-06-21 8:01 ` David Rientjes
2012-07-02 10:57 ` Pekka Enberg
2012-06-20 20:59 ` [PATCH 4/4] don't do __ClearPageSlab before freeing slab page Glauber Costa
2012-06-21 8:04 ` David Rientjes
2012-06-21 8:13 ` Glauber Costa
2012-06-21 11:04 ` Kamezawa Hiroyuki
2012-06-21 11:14 ` Glauber Costa
2012-06-21 20:24 ` Glauber Costa
2012-07-03 18:42 ` Christoph Lameter
2012-07-03 18:40 ` Christoph Lameter
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.