linux-kernel.vger.kernel.org archive mirror
* [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB
@ 2016-10-26 17:41 Thomas Garnier
  2016-10-26 19:08 ` Christoph Lameter
  2016-10-27  7:25 ` Michal Hocko
  0 siblings, 2 replies; 6+ messages in thread
From: Thomas Garnier @ 2016-10-26 17:41 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton
  Cc: linux-mm, linux-kernel, gthelen, Thomas Garnier

While testing OBJFREELIST_SLAB integration with pagealloc, we found a
bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
CFLGS_OBJFREELIST_SLAB.

The original kmem_cache is created early, which makes OFF_SLAB impossible.
When kmem_cache(sys) is created, OFF_SLAB is possible, and if pagealloc
is enabled it will try to enable it first under certain conditions.
Because kmem_cache(sys) reuses the original flags, both flags can be set
at the same time, resulting in allocation failures and odd behavior.

The proposed fix removes these flags by default at the entrance of
__kmem_cache_create. This way the function will define which way the
freelist should be handled at this stage for the new cache.

Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
---
Based on next-20161025
---
 mm/slab.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/slab.c b/mm/slab.c
index 3c83c29..efe280a 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	int err;
 	size_t size = cachep->size;
 
+	/*
+	 * memcg re-creates caches with the flags of the originals. Remove
+	 * the freelist related flags to ensure they are re-defined at this
+	 * stage. Prevent having both flags on edge cases like with pagealloc
+	 * if the original cache was created too early to be OFF_SLAB.
+	 */
+	flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
+
 #if DEBUG
 #if FORCED_DEBUG
 	/*
-- 
2.8.0.rc3.226.g39d4020
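
The collision described in the commit message can be modelled in userspace. This is only a sketch under stated assumptions: the two flag values are assumed for illustration, and model_pick_freelist() and model_has_conflict() are hypothetical helpers, not functions from mm/slab.c. The real selection logic in __kmem_cache_create() has more cases than the single choice shown here.

```c
#include <stdbool.h>

/* Values assumed for illustration; the real definitions live in mm/slab.c. */
#define CFLGS_OBJFREELIST_SLAB 0x40000000UL
#define CFLGS_OFF_SLAB         0x80000000UL
#define CFLGS_FREELIST_MASK    (CFLGS_OBJFREELIST_SLAB | CFLGS_OFF_SLAB)

/*
 * Hypothetical stand-in for the freelist selection done at cache creation.
 * 'inherited' holds the flags copied from the original cache. When
 * 'clear_first' is true, the mask proposed in the patch is applied first.
 */
static unsigned long model_pick_freelist(unsigned long inherited,
					 bool off_slab_possible,
					 bool clear_first)
{
	unsigned long flags = inherited;

	if (clear_first)
		flags &= ~CFLGS_FREELIST_MASK;

	/* The new cache picks one layout based on current conditions. */
	flags |= off_slab_possible ? CFLGS_OFF_SLAB : CFLGS_OBJFREELIST_SLAB;
	return flags;
}

/* True when both mutually exclusive layout bits are set. */
static bool model_has_conflict(unsigned long flags)
{
	return (flags & CFLGS_FREELIST_MASK) == CFLGS_FREELIST_MASK;
}
```

In this model, a cache created early ends up with only the OBJFREELIST bit. Re-creating it later from those flags without clearing them sets both bits, while clearing first leaves exactly one.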


* Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB
  2016-10-26 17:41 [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB Thomas Garnier
@ 2016-10-26 19:08 ` Christoph Lameter
  2016-10-26 19:22   ` Thomas Garnier
  2016-10-27  7:25 ` Michal Hocko
  1 sibling, 1 reply; 6+ messages in thread
From: Christoph Lameter @ 2016-10-26 19:08 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, linux-kernel, gthelen

Hmmm... Doesn't this belong in memcg_create_kmem_cache() or in
kmem_cache_create() in mm/slab_common.c? Definitely not in an
allocator-specific function, since this is an issue for all allocators.

memcg_create_kmem_cache() simply assumes that it can pass flags from the
kmem_cache structure to kmem_cache_create(). However, those flags may
contain slab specific options.

kmem_cache_create() could filter out flags that cannot be specified.

Maybe create SLAB_FLAGS_PERMITTED in linux/mm/slab.h and mask other bits
out in kmem_cache_create()?

SLUB also has internal flags, and those should not be passed to
kmem_cache_create() either. If we define the valid ones, we can mask out
the rest.

The cleanest approach would be for kmem_cache_create() to reject invalid
flags and fail, and for memcg_create_kmem_cache() to mask out the invalid
flags using SLAB_FLAGS_PERMITTED or so.
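
A minimal userspace sketch of that two-layer idea, assuming a placeholder SLAB_FLAGS_PERMITTED made of just two public flags. model_cache_create() and model_memcg_cache_create() are hypothetical stand-ins for kmem_cache_create() and memcg_create_kmem_cache(); the real functions take more arguments and do far more work.

```c
#include <errno.h>

/* Placeholder public flags and mask; the mask contents are an assumption. */
#define SLAB_HWCACHE_ALIGN   0x00002000UL
#define SLAB_ACCOUNT         0x04000000UL
#define SLAB_FLAGS_PERMITTED (SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT)

/* An allocator-internal bit that callers must never pass in (value assumed). */
#define CFLGS_OFF_SLAB       0x80000000UL

/* Hypothetical model of kmem_cache_create(): reject anything not permitted. */
static int model_cache_create(unsigned long flags)
{
	if (flags & ~SLAB_FLAGS_PERMITTED)
		return -EINVAL;
	return 0;
}

/* Hypothetical model of memcg_create_kmem_cache(): mask, then create. */
static int model_memcg_cache_create(unsigned long root_flags)
{
	return model_cache_create(root_flags & SLAB_FLAGS_PERMITTED);
}
```

The split keeps the public entry point strict, so a caller passing internal bits fails loudly, while the memcg path, which copies flags from an existing cache, strips the internal bits before calling in.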




* Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB
  2016-10-26 19:08 ` Christoph Lameter
@ 2016-10-26 19:22   ` Thomas Garnier
  2016-10-26 20:47     ` Christoph Lameter
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Garnier @ 2016-10-26 19:22 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Linux-MM, LKML, Greg Thelen

On Wed, Oct 26, 2016 at 12:08 PM, Christoph Lameter <cl@linux.com> wrote:
> Hmmm... Doesn't this belong in memcg_create_kmem_cache() or in
> kmem_cache_create() in mm/slab_common.c? Definitely not in an
> allocator-specific function, since this is an issue for all allocators.
>
> memcg_create_kmem_cache() simply assumes that it can pass flags from the
> kmem_cache structure to kmem_cache_create(). However, those flags may
> contain slab specific options.
>
> kmem_cache_create() could filter out flags that cannot be specified.

That makes sense.

>
> Maybe create SLAB_FLAGS_PERMITTED in linux/mm/slab.h and mask other bits
> out in kmem_cache_create()?
>
> SLUB also has internal flags, and those should not be passed to
> kmem_cache_create() either. If we define the valid ones, we can mask out
> the rest.
>
> The cleanest approach would be for kmem_cache_create() to reject invalid
> flags and fail, and for memcg_create_kmem_cache() to mask out the invalid
> flags using SLAB_FLAGS_PERMITTED or so.

Okay, I think for SLAB we can allow everything except the two flags
mentioned here.

Should I deny certain flags for SLUB? I can allow everything for now.




-- 
Thomas


* Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB
  2016-10-26 19:22   ` Thomas Garnier
@ 2016-10-26 20:47     ` Christoph Lameter
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Lameter @ 2016-10-26 20:47 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	Linux-MM, LKML, Greg Thelen

On Wed, 26 Oct 2016, Thomas Garnier wrote:

> Okay, I think for SLAB we can allow everything except the two flags
> mentioned here.

No no no. Just allow the flags already defined in include/linux/slab.h
that subsystems can specify when they call into the slab allocators.

> Should I deny certain flags for SLUB? I can allow everything for now.

All allocators should allow only the flags defined in include/linux/slab.h
to be passed to kmem_cache_create(). That is the API that all allocators
need to support. If someone wants to add new flags, then we need to make
sure that all allocators can handle them.


The flags are (from include/linux/slab.h):
/*
 * Flags to pass to kmem_cache_create().
 */
#define SLAB_CONSISTENCY_CHECKS 0x00000100UL    /* DEBUG: Perform (expensive) checks on alloc/free */
#define SLAB_RED_ZONE           0x00000400UL    /* DEBUG: Red zone objs in a cache */
#define SLAB_POISON             0x00000800UL    /* DEBUG: Poison objects */
#define SLAB_HWCACHE_ALIGN      0x00002000UL    /* Align objs on cache lines */
#define SLAB_CACHE_DMA          0x00004000UL    /* Use GFP_DMA memory */
#define SLAB_STORE_USER         0x00010000UL    /* DEBUG: Store the last owner for bug hunting */
#define SLAB_PANIC              0x00040000UL    /* Panic if kmem_cache_create() fails */
#define SLAB_DESTROY_BY_RCU     0x00080000UL    /* Defer freeing slabs to RCU */
#define SLAB_MEM_SPREAD         0x00100000UL    /* Spread some memory over cpuset */
#define SLAB_TRACE              0x00200000UL    /* Trace allocations and frees */
#define SLAB_DEBUG_OBJECTS      0x00400000UL
#define SLAB_NOLEAKTRACE        0x00800000UL    /* Avoid kmemleak tracing */
#define SLAB_NOTRACK            0x01000000UL
#define SLAB_FAILSLAB           0x02000000UL    /* Fault injection mark */
#define SLAB_ACCOUNT            0x04000000UL    /* Account to memcg */
#define SLAB_KASAN              0x08000000UL
#define SLAB_RECLAIM_ACCOUNT    0x00020000UL    /* Objects are reclaimable */
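
One way the permitted mask could be assembled from the values in the listing above, as a sketch only. The name SLAB_FLAGS_PERMITTED comes from earlier in this thread, but its composition here is an assumption, as are the two CFLGS_* values and the hypothetical helper model_permitted_only().

```c
/* Flag values copied from the include/linux/slab.h listing above. */
#define SLAB_CONSISTENCY_CHECKS 0x00000100UL
#define SLAB_RED_ZONE           0x00000400UL
#define SLAB_POISON             0x00000800UL
#define SLAB_HWCACHE_ALIGN      0x00002000UL
#define SLAB_CACHE_DMA          0x00004000UL
#define SLAB_STORE_USER         0x00010000UL
#define SLAB_RECLAIM_ACCOUNT    0x00020000UL
#define SLAB_PANIC              0x00040000UL
#define SLAB_DESTROY_BY_RCU     0x00080000UL
#define SLAB_MEM_SPREAD         0x00100000UL
#define SLAB_TRACE              0x00200000UL
#define SLAB_DEBUG_OBJECTS      0x00400000UL
#define SLAB_NOLEAKTRACE        0x00800000UL
#define SLAB_NOTRACK            0x01000000UL
#define SLAB_FAILSLAB           0x02000000UL
#define SLAB_ACCOUNT            0x04000000UL
#define SLAB_KASAN              0x08000000UL

/* Assumed composition: every flag a subsystem may pass, nothing else. */
#define SLAB_FLAGS_PERMITTED (SLAB_CONSISTENCY_CHECKS | SLAB_RED_ZONE | \
			      SLAB_POISON | SLAB_HWCACHE_ALIGN | \
			      SLAB_CACHE_DMA | SLAB_STORE_USER | \
			      SLAB_RECLAIM_ACCOUNT | SLAB_PANIC | \
			      SLAB_DESTROY_BY_RCU | SLAB_MEM_SPREAD | \
			      SLAB_TRACE | SLAB_DEBUG_OBJECTS | \
			      SLAB_NOLEAKTRACE | SLAB_NOTRACK | \
			      SLAB_FAILSLAB | SLAB_ACCOUNT | SLAB_KASAN)

/* Allocator-internal SLAB bits; values assumed for illustration. */
#define CFLGS_OBJFREELIST_SLAB  0x40000000UL
#define CFLGS_OFF_SLAB          0x80000000UL

/* Hypothetical helper: keep only the bits callers are allowed to pass. */
static unsigned long model_permitted_only(unsigned long flags)
{
	return flags & SLAB_FLAGS_PERMITTED;
}
```

With these values the internal bits fall outside the mask, so filtering a copied flag word drops them without touching the public flags.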


* Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB
  2016-10-26 17:41 [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB Thomas Garnier
  2016-10-26 19:08 ` Christoph Lameter
@ 2016-10-27  7:25 ` Michal Hocko
  2016-10-27 14:34   ` Thomas Garnier
  1 sibling, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2016-10-27  7:25 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, linux-kernel, gthelen, Vladimir Davydov

The patch is marked for memcg, but I do not see any direct relation.
I am probably not familiar enough with this code, but if this really is
memcg kmem related, please do not forget to CC Vladimir.


-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB
  2016-10-27  7:25 ` Michal Hocko
@ 2016-10-27 14:34   ` Thomas Garnier
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Garnier @ 2016-10-27 14:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Linux-MM, LKML, Greg Thelen, Vladimir Davydov

On Thu, Oct 27, 2016 at 12:25 AM, Michal Hocko <mhocko@kernel.org> wrote:
> The patch is marked for memcg, but I do not see any direct relation.
> I am probably not familiar enough with this code, but if this really is
> memcg kmem related, please do not forget to CC Vladimir.
>

Yes, the next iteration should be closer to memcg. I will CC Vladimir.

Thanks for the heads-up.




-- 
Thomas
