All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
@ 2018-04-10 12:53 Matthew Wilcox
  2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Matthew Wilcox @ 2018-04-10 12:53 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable

From: Matthew Wilcox <mawilcox@microsoft.com>

__GFP_ZERO requests that the object be initialised to all-zeroes,
while the purpose of a constructor is to initialise an object to a
particular pattern.  We cannot do both.  Add a warning to catch any
users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
a constructor.

Fixes: d07dbea46405 ("Slab allocators: support __GFP_ZERO in all allocators")
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: stable@vger.kernel.org
---
 mm/slab.c | 6 ++++--
 mm/slob.c | 4 +++-
 mm/slub.c | 6 ++++--
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 38d3f4fd17d7..8b2cb7db85db 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3313,8 +3313,10 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
 	local_irq_restore(save_flags);
 	ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
 
-	if (unlikely(flags & __GFP_ZERO) && ptr)
-		memset(ptr, 0, cachep->object_size);
+	if (unlikely(flags & __GFP_ZERO) && ptr) {
+		if (!WARN_ON_ONCE(cachep->ctor))
+			memset(ptr, 0, cachep->object_size);
+	}
 
 	slab_post_alloc_hook(cachep, flags, 1, &ptr);
 	return ptr;
diff --git a/mm/slob.c b/mm/slob.c
index 1a46181b675c..958173fd7c24 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -556,8 +556,10 @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
 					    flags, node);
 	}
 
-	if (b && c->ctor)
+	if (b && c->ctor) {
+		WARN_ON_ONCE(flags & __GFP_ZERO);
 		c->ctor(b);
+	}
 
 	kmemleak_alloc_recursive(b, c->size, 1, c->flags, flags);
 	return b;
diff --git a/mm/slub.c b/mm/slub.c
index 9e1100f9298f..0f55f0a0dcaa 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2714,8 +2714,10 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
 		stat(s, ALLOC_FASTPATH);
 	}
 
-	if (unlikely(gfpflags & __GFP_ZERO) && object)
-		memset(object, 0, s->object_size);
+	if (unlikely(gfpflags & __GFP_ZERO) && object) {
+		if (!WARN_ON_ONCE(s->ctor))
+			memset(object, 0, s->object_size);
+	}
 
 	slab_post_alloc_hook(s, gfpflags, 1, &object);
 
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/2] page cache: Mask off unwanted GFP flags
  2018-04-10 12:53 [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Matthew Wilcox
@ 2018-04-10 12:53 ` Matthew Wilcox
  2018-04-10 13:08   ` Johannes Weiner
                     ` (3 more replies)
  2018-04-10 13:00 ` [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Johannes Weiner
                   ` (4 subsequent siblings)
  5 siblings, 4 replies; 22+ messages in thread
From: Matthew Wilcox @ 2018-04-10 12:53 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable

From: Matthew Wilcox <mawilcox@microsoft.com>

The page cache has used the mapping's GFP flags for allocating
radix tree nodes for a long time.  It took care to always mask off the
__GFP_HIGHMEM flag, and masked off other flags in other paths, but the
__GFP_ZERO flag was still able to sneak through.  The __GFP_DMA and
__GFP_DMA32 flags would also have been able to sneak through if they
were ever used.  Fix them all by using GFP_RECLAIM_MASK at the innermost
location, and remove it from earlier in the callchain.

Fixes: 19f99cee206c ("f2fs: add core inode operations")
Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: stable@vger.kernel.org
---
 mm/filemap.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index c2147682f4c3..1a4bfc5ed3dc 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -785,7 +785,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
 	VM_BUG_ON_PAGE(!PageLocked(new), new);
 	VM_BUG_ON_PAGE(new->mapping, new);
 
-	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
 	if (!error) {
 		struct address_space *mapping = old->mapping;
 		void (*freepage)(struct page *);
@@ -841,7 +841,7 @@ static int __add_to_page_cache_locked(struct page *page,
 			return error;
 	}
 
-	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
 	if (error) {
 		if (!huge)
 			mem_cgroup_cancel_charge(page, memcg, false);
@@ -1574,8 +1574,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
 		if (fgp_flags & FGP_ACCESSED)
 			__SetPageReferenced(page);
 
-		err = add_to_page_cache_lru(page, mapping, offset,
-				gfp_mask & GFP_RECLAIM_MASK);
+		err = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
 		if (unlikely(err)) {
 			put_page(page);
 			page = NULL;
@@ -2378,7 +2377,7 @@ static int page_cache_read(struct file *file, pgoff_t offset, gfp_t gfp_mask)
 		if (!page)
 			return -ENOMEM;
 
-		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask & GFP_KERNEL);
+		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
 		if (ret == 0)
 			ret = mapping->a_ops->readpage(file, page);
 		else if (ret == -EEXIST)
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 12:53 [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Matthew Wilcox
  2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
@ 2018-04-10 13:00 ` Johannes Weiner
  2018-04-10 13:07 ` Michal Hocko
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Johannes Weiner @ 2018-04-10 13:00 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, Apr 10, 2018 at 05:53:50AM -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> __GFP_ZERO requests that the object be initialised to all-zeroes,
> while the purpose of a constructor is to initialise an object to a
> particular pattern.  We cannot do both.  Add a warning to catch any
> users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> a constructor.
> 
> Fixes: d07dbea46405 ("Slab allocators: support __GFP_ZERO in all allocators")
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 12:53 [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Matthew Wilcox
  2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
  2018-04-10 13:00 ` [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Johannes Weiner
@ 2018-04-10 13:07 ` Michal Hocko
  2018-04-10 13:40 ` Vlastimil Babka
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-10 13:07 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue 10-04-18 05:53:50, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> __GFP_ZERO requests that the object be initialised to all-zeroes,
> while the purpose of a constructor is to initialise an object to a
> particular pattern.  We cannot do both.  Add a warning to catch any
> users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> a constructor.
> 
> Fixes: d07dbea46405 ("Slab allocators: support __GFP_ZERO in all allocators")
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org
> ---
>  mm/slab.c | 6 ++++--
>  mm/slob.c | 4 +++-
>  mm/slub.c | 6 ++++--
>  3 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 38d3f4fd17d7..8b2cb7db85db 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3313,8 +3313,10 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
>  	local_irq_restore(save_flags);
>  	ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
>  
> -	if (unlikely(flags & __GFP_ZERO) && ptr)
> -		memset(ptr, 0, cachep->object_size);
> +	if (unlikely(flags & __GFP_ZERO) && ptr) {
> +		if (!WARN_ON_ONCE(cachep->ctor))
> +			memset(ptr, 0, cachep->object_size);
> +	}
>  
>  	slab_post_alloc_hook(cachep, flags, 1, &ptr);
>  	return ptr;

Why don't we need to cover this in slab_alloc and kmem_cache_alloc_bulk as well?

Other than that this patch makes sense to me.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] page cache: Mask off unwanted GFP flags
  2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
@ 2018-04-10 13:08   ` Johannes Weiner
  2018-04-10 13:09   ` Michal Hocko
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Johannes Weiner @ 2018-04-10 13:08 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, Apr 10, 2018 at 05:53:51AM -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> The page cache has used the mapping's GFP flags for allocating
> radix tree nodes for a long time.  It took care to always mask off the
> __GFP_HIGHMEM flag, and masked off other flags in other paths, but the
> __GFP_ZERO flag was still able to sneak through.  The __GFP_DMA and
> __GFP_DMA32 flags would also have been able to sneak through if they
> were ever used.  Fix them all by using GFP_RECLAIM_MASK at the innermost
> location, and remove it from earlier in the callchain.

Could you please mention the nullptr crash here, maybe even in the
patch subject? That makes it much easier to find this patch when you
run into that bug or when evaluating backport candidates.

Other than that,

> Fixes: 19f99cee206c ("f2fs: add core inode operations")
> Reported-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] page cache: Mask off unwanted GFP flags
  2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
  2018-04-10 13:08   ` Johannes Weiner
@ 2018-04-10 13:09   ` Michal Hocko
  2018-04-10 13:45   ` Minchan Kim
  2018-04-10 13:46   ` Jan Kara
  3 siblings, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2018-04-10 13:09 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue 10-04-18 05:53:51, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> The page cache has used the mapping's GFP flags for allocating
> radix tree nodes for a long time.  It took care to always mask off the
> __GFP_HIGHMEM flag, and masked off other flags in other paths, but the
> __GFP_ZERO flag was still able to sneak through.  The __GFP_DMA and
> __GFP_DMA32 flags would also have been able to sneak through if they
> were ever used.  Fix them all by using GFP_RECLAIM_MASK at the innermost
> location, and remove it from earlier in the callchain.
> 
> Fixes: 19f99cee206c ("f2fs: add core inode operations")
> Reported-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org

I would push this into __radix_tree_preload...
Anyway
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/filemap.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index c2147682f4c3..1a4bfc5ed3dc 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -785,7 +785,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
>  	VM_BUG_ON_PAGE(!PageLocked(new), new);
>  	VM_BUG_ON_PAGE(new->mapping, new);
>  
> -	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> +	error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
>  	if (!error) {
>  		struct address_space *mapping = old->mapping;
>  		void (*freepage)(struct page *);
> @@ -841,7 +841,7 @@ static int __add_to_page_cache_locked(struct page *page,
>  			return error;
>  	}
>  
> -	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
> +	error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
>  	if (error) {
>  		if (!huge)
>  			mem_cgroup_cancel_charge(page, memcg, false);
> @@ -1574,8 +1574,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
>  		if (fgp_flags & FGP_ACCESSED)
>  			__SetPageReferenced(page);
>  
> -		err = add_to_page_cache_lru(page, mapping, offset,
> -				gfp_mask & GFP_RECLAIM_MASK);
> +		err = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
>  		if (unlikely(err)) {
>  			put_page(page);
>  			page = NULL;
> @@ -2378,7 +2377,7 @@ static int page_cache_read(struct file *file, pgoff_t offset, gfp_t gfp_mask)
>  		if (!page)
>  			return -ENOMEM;
>  
> -		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask & GFP_KERNEL);
> +		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
>  		if (ret == 0)
>  			ret = mapping->a_ops->readpage(file, page);
>  		else if (ret == -EEXIST)
> -- 
> 2.16.3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 12:53 [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Matthew Wilcox
                   ` (2 preceding siblings ...)
  2018-04-10 13:07 ` Michal Hocko
@ 2018-04-10 13:40 ` Vlastimil Babka
  2018-04-10 13:53 ` Eric Dumazet
  2018-04-10 14:21 ` Christopher Lameter
  5 siblings, 0 replies; 22+ messages in thread
From: Vlastimil Babka @ 2018-04-10 13:40 UTC (permalink / raw)
  To: Matthew Wilcox, linux-mm
  Cc: Matthew Wilcox, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable

On 04/10/2018 02:53 PM, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> __GFP_ZERO requests that the object be initialised to all-zeroes,
> while the purpose of a constructor is to initialise an object to a
> particular pattern.  We cannot do both.  Add a warning to catch any
> users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> a constructor.
> 
> Fixes: d07dbea46405 ("Slab allocators: support __GFP_ZERO in all allocators")
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org

It doesn't fix any known problem, does it? Then the stable tag seems too
much IMHO. Especially for fixing a 2007 commit.

Otherwise
Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/slab.c | 6 ++++--
>  mm/slob.c | 4 +++-
>  mm/slub.c | 6 ++++--
>  3 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 38d3f4fd17d7..8b2cb7db85db 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3313,8 +3313,10 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
>  	local_irq_restore(save_flags);
>  	ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
>  
> -	if (unlikely(flags & __GFP_ZERO) && ptr)
> -		memset(ptr, 0, cachep->object_size);
> +	if (unlikely(flags & __GFP_ZERO) && ptr) {
> +		if (!WARN_ON_ONCE(cachep->ctor))
> +			memset(ptr, 0, cachep->object_size);
> +	}
>  
>  	slab_post_alloc_hook(cachep, flags, 1, &ptr);
>  	return ptr;
> diff --git a/mm/slob.c b/mm/slob.c
> index 1a46181b675c..958173fd7c24 100644
> --- a/mm/slob.c
> +++ b/mm/slob.c
> @@ -556,8 +556,10 @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
>  					    flags, node);
>  	}
>  
> -	if (b && c->ctor)
> +	if (b && c->ctor) {
> +		WARN_ON_ONCE(flags & __GFP_ZERO);
>  		c->ctor(b);
> +	}
>  
>  	kmemleak_alloc_recursive(b, c->size, 1, c->flags, flags);
>  	return b;
> diff --git a/mm/slub.c b/mm/slub.c
> index 9e1100f9298f..0f55f0a0dcaa 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2714,8 +2714,10 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>  		stat(s, ALLOC_FASTPATH);
>  	}
>  
> -	if (unlikely(gfpflags & __GFP_ZERO) && object)
> -		memset(object, 0, s->object_size);
> +	if (unlikely(gfpflags & __GFP_ZERO) && object) {
> +		if (!WARN_ON_ONCE(s->ctor))
> +			memset(object, 0, s->object_size);
> +	}
>  
>  	slab_post_alloc_hook(s, gfpflags, 1, &object);
>  
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] page cache: Mask off unwanted GFP flags
  2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
  2018-04-10 13:08   ` Johannes Weiner
  2018-04-10 13:09   ` Michal Hocko
@ 2018-04-10 13:45   ` Minchan Kim
  2018-04-10 14:02     ` Matthew Wilcox
  2018-04-10 13:46   ` Jan Kara
  3 siblings, 1 reply; 22+ messages in thread
From: Minchan Kim @ 2018-04-10 13:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable, jaegeuk

On Tue, Apr 10, 2018 at 05:53:51AM -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> The page cache has used the mapping's GFP flags for allocating
> radix tree nodes for a long time.  It took care to always mask off the
> __GFP_HIGHMEM flag, and masked off other flags in other paths, but the
> __GFP_ZERO flag was still able to sneak through.  The __GFP_DMA and
> __GFP_DMA32 flags would also have been able to sneak through if they
> were ever used.  Fix them all by using GFP_RECLAIM_MASK at the innermost
> location, and remove it from earlier in the callchain.
> 
> Fixes: 19f99cee206c ("f2fs: add core inode operations")

Why this patch fix 19f99cee206c instead of 449dd6984d0e?
F2FS doesn't have any problem before introducing 449dd6984d0e?


> Reported-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org
> ---
>  mm/filemap.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index c2147682f4c3..1a4bfc5ed3dc 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -785,7 +785,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
>  	VM_BUG_ON_PAGE(!PageLocked(new), new);
>  	VM_BUG_ON_PAGE(new->mapping, new);
>  
> -	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> +	error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
>  	if (!error) {
>  		struct address_space *mapping = old->mapping;
>  		void (*freepage)(struct page *);
> @@ -841,7 +841,7 @@ static int __add_to_page_cache_locked(struct page *page,
>  			return error;
>  	}
>  
> -	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
> +	error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
>  	if (error) {
>  		if (!huge)
>  			mem_cgroup_cancel_charge(page, memcg, false);
> @@ -1574,8 +1574,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
>  		if (fgp_flags & FGP_ACCESSED)
>  			__SetPageReferenced(page);
>  
> -		err = add_to_page_cache_lru(page, mapping, offset,
> -				gfp_mask & GFP_RECLAIM_MASK);
> +		err = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
>  		if (unlikely(err)) {
>  			put_page(page);
>  			page = NULL;
> @@ -2378,7 +2377,7 @@ static int page_cache_read(struct file *file, pgoff_t offset, gfp_t gfp_mask)
>  		if (!page)
>  			return -ENOMEM;
>  
> -		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask & GFP_KERNEL);
> +		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
>  		if (ret == 0)
>  			ret = mapping->a_ops->readpage(file, page);
>  		else if (ret == -EEXIST)
> -- 
> 2.16.3
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] page cache: Mask off unwanted GFP flags
  2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
                     ` (2 preceding siblings ...)
  2018-04-10 13:45   ` Minchan Kim
@ 2018-04-10 13:46   ` Jan Kara
  3 siblings, 0 replies; 22+ messages in thread
From: Jan Kara @ 2018-04-10 13:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue 10-04-18 05:53:51, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> The page cache has used the mapping's GFP flags for allocating
> radix tree nodes for a long time.  It took care to always mask off the
> __GFP_HIGHMEM flag, and masked off other flags in other paths, but the
> __GFP_ZERO flag was still able to sneak through.  The __GFP_DMA and
> __GFP_DMA32 flags would also have been able to sneak through if they
> were ever used.  Fix them all by using GFP_RECLAIM_MASK at the innermost
> location, and remove it from earlier in the callchain.
> 
> Fixes: 19f99cee206c ("f2fs: add core inode operations")
> Reported-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org

Looks good to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  mm/filemap.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index c2147682f4c3..1a4bfc5ed3dc 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -785,7 +785,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
>  	VM_BUG_ON_PAGE(!PageLocked(new), new);
>  	VM_BUG_ON_PAGE(new->mapping, new);
>  
> -	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
> +	error = radix_tree_preload(gfp_mask & GFP_RECLAIM_MASK);
>  	if (!error) {
>  		struct address_space *mapping = old->mapping;
>  		void (*freepage)(struct page *);
> @@ -841,7 +841,7 @@ static int __add_to_page_cache_locked(struct page *page,
>  			return error;
>  	}
>  
> -	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
> +	error = radix_tree_maybe_preload(gfp_mask & GFP_RECLAIM_MASK);
>  	if (error) {
>  		if (!huge)
>  			mem_cgroup_cancel_charge(page, memcg, false);
> @@ -1574,8 +1574,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset,
>  		if (fgp_flags & FGP_ACCESSED)
>  			__SetPageReferenced(page);
>  
> -		err = add_to_page_cache_lru(page, mapping, offset,
> -				gfp_mask & GFP_RECLAIM_MASK);
> +		err = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
>  		if (unlikely(err)) {
>  			put_page(page);
>  			page = NULL;
> @@ -2378,7 +2377,7 @@ static int page_cache_read(struct file *file, pgoff_t offset, gfp_t gfp_mask)
>  		if (!page)
>  			return -ENOMEM;
>  
> -		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask & GFP_KERNEL);
> +		ret = add_to_page_cache_lru(page, mapping, offset, gfp_mask);
>  		if (ret == 0)
>  			ret = mapping->a_ops->readpage(file, page);
>  		else if (ret == -EEXIST)
> -- 
> 2.16.3
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 12:53 [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Matthew Wilcox
                   ` (3 preceding siblings ...)
  2018-04-10 13:40 ` Vlastimil Babka
@ 2018-04-10 13:53 ` Eric Dumazet
  2018-04-10 16:50   ` Matthew Wilcox
  2018-04-10 14:21 ` Christopher Lameter
  5 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2018-04-10 13:53 UTC (permalink / raw)
  To: Matthew Wilcox, linux-mm
  Cc: Matthew Wilcox, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable



On 04/10/2018 05:53 AM, Matthew Wilcox wrote:
> From: Matthew Wilcox <mawilcox@microsoft.com>
> 
> __GFP_ZERO requests that the object be initialised to all-zeroes,
> while the purpose of a constructor is to initialise an object to a
> particular pattern.  We cannot do both.  Add a warning to catch any
> users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> a constructor.
> 
> Fixes: d07dbea46405 ("Slab allocators: support __GFP_ZERO in all allocators")
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> Cc: stable@vger.kernel.org


Since there are probably no bug to fix, what about adding the extra check
only for some DEBUG option ?

How many caches are still using ctor these days ?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] page cache: Mask off unwanted GFP flags
  2018-04-10 13:45   ` Minchan Kim
@ 2018-04-10 14:02     ` Matthew Wilcox
  2018-04-10 15:18       ` Jaegeuk Kim
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2018-04-10 14:02 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable, jaegeuk

On Tue, Apr 10, 2018 at 10:45:45PM +0900, Minchan Kim wrote:
> On Tue, Apr 10, 2018 at 05:53:51AM -0700, Matthew Wilcox wrote:
> > From: Matthew Wilcox <mawilcox@microsoft.com>
> > 
> > The page cache has used the mapping's GFP flags for allocating
> > radix tree nodes for a long time.  It took care to always mask off the
> > __GFP_HIGHMEM flag, and masked off other flags in other paths, but the
> > __GFP_ZERO flag was still able to sneak through.  The __GFP_DMA and
> > __GFP_DMA32 flags would also have been able to sneak through if they
> > were ever used.  Fix them all by using GFP_RECLAIM_MASK at the innermost
> > location, and remove it from earlier in the callchain.
> > 
> > Fixes: 19f99cee206c ("f2fs: add core inode operations")
> 
> Why this patch fix 19f99cee206c instead of 449dd6984d0e?
> F2FS doesn't have any problem before introducing 449dd6984d0e?

Well, there's the problem.  This bug is the combination of three different
things:

1. The working set code relying on list_empty.
2. The page cache not filtering out the bad flags.
3. F2FS specifying a flag nobody had ever specified before.

So what single patch does this patch fix?  I don't think it really matters.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 12:53 [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Matthew Wilcox
                   ` (4 preceding siblings ...)
  2018-04-10 13:53 ` Eric Dumazet
@ 2018-04-10 14:21 ` Christopher Lameter
  2018-04-10 14:26   ` Christopher Lameter
  2018-04-10 15:54   ` Matthew Wilcox
  5 siblings, 2 replies; 22+ messages in thread
From: Christopher Lameter @ 2018-04-10 14:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable

On Tue, 10 Apr 2018, Matthew Wilcox wrote:

> __GFP_ZERO requests that the object be initialised to all-zeroes,
> while the purpose of a constructor is to initialise an object to a
> particular pattern.  We cannot do both.  Add a warning to catch any
> users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> a constructor.

Can we move this check out of the critical paths and check for
a ctor and GFP_ZERO when calling the page allocator? F.e. in
allocate_slab()?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 14:21 ` Christopher Lameter
@ 2018-04-10 14:26   ` Christopher Lameter
  2018-04-10 15:54   ` Matthew Wilcox
  1 sibling, 0 replies; 22+ messages in thread
From: Christopher Lameter @ 2018-04-10 14:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable

On Tue, 10 Apr 2018, Christopher Lameter wrote:

> On Tue, 10 Apr 2018, Matthew Wilcox wrote:
>
> > __GFP_ZERO requests that the object be initialised to all-zeroes,
> > while the purpose of a constructor is to initialise an object to a
> > particular pattern.  We cannot do both.  Add a warning to catch any
> > users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> > a constructor.
>
> Can we move this check out of the critical paths and check for
> a ctor and GFP_ZERO when calling the page allocator? F.e. in
> allocate_slab()?

The ctor's are run at that point from setup_object() so it makes sense.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] page cache: Mask off unwanted GFP flags
  2018-04-10 14:02     ` Matthew Wilcox
@ 2018-04-10 15:18       ` Jaegeuk Kim
  0 siblings, 0 replies; 22+ messages in thread
From: Jaegeuk Kim @ 2018-04-10 15:18 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Minchan Kim, linux-mm, Matthew Wilcox, Christoph Lameter,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-kernel, Jan Kara, Jeff Layton, Mel Gorman, stable

On 04/10, Matthew Wilcox wrote:
> On Tue, Apr 10, 2018 at 10:45:45PM +0900, Minchan Kim wrote:
> > On Tue, Apr 10, 2018 at 05:53:51AM -0700, Matthew Wilcox wrote:
> > > From: Matthew Wilcox <mawilcox@microsoft.com>
> > > 
> > > The page cache has used the mapping's GFP flags for allocating
> > > radix tree nodes for a long time.  It took care to always mask off the
> > > __GFP_HIGHMEM flag, and masked off other flags in other paths, but the
> > > __GFP_ZERO flag was still able to sneak through.  The __GFP_DMA and
> > > __GFP_DMA32 flags would also have been able to sneak through if they
> > > were ever used.  Fix them all by using GFP_RECLAIM_MASK at the innermost
> > > location, and remove it from earlier in the callchain.
> > > 
> > > Fixes: 19f99cee206c ("f2fs: add core inode operations")
> > 
> > Why this patch fix 19f99cee206c instead of 449dd6984d0e?
> > F2FS doesn't have any problem before introducing 449dd6984d0e?
> 
> Well, there's the problem.  This bug is the combination of three different
> things:
> 
> 1. The working set code relying on list_empty.
> 2. The page cache not filtering out the bad flags.
> 3. F2FS specifying a flag nobody had ever specified before.
> 
> So what single patch does this patch fix?  I don't think it really matters.

Hope there'd be someone who does care about patch description though, IMHO,
this fixes the MM regression introduced by:
449dd6984d0e ("mm: keep page cache radix tree nodes in check") merged in v3.15,
2014.

19f99cee206c ("f2fs: add core inode operations) merged in v3.8, 2012, just
revealed this out. In fact, I've never hit this bug in old kernels.

>From the user viewpoint, may I suggest to describe what kind of symptom we're
able to see due to this bug?

Something like:

[ 7858.792946] [<ffffff80086f4de0>] __list_del_entry+0x30/0xd0
[ 7858.792951] [<ffffff8008362018>] list_lru_del+0xac/0x1ac
[ 7858.792957] [<ffffff800830f04c>] page_cache_tree_insert+0xd8/0x110
[ 7858.792962] [<ffffff8008310188>] __add_to_page_cache_locked+0xf8/0x4e0
[ 7858.792967] [<ffffff800830ff34>] add_to_page_cache_lru+0x50/0x1ac
[ 7858.792972] [<ffffff800830fdd0>] pagecache_get_page+0x468/0x57c
[ 7858.792979] [<ffffff80085d081c>] __get_node_page+0x84/0x764
[ 7858.792986] [<ffffff800859cd94>] f2fs_iget+0x264/0xdc8
[ 7858.792991] [<ffffff800859ee00>] f2fs_lookup+0x3b4/0x660
[ 7858.792998] [<ffffff80083d2540>] lookup_slow+0x1e4/0x348
[ 7858.793003] [<ffffff80083d0eb8>] walk_component+0x21c/0x320
[ 7858.793008] [<ffffff80083d0010>] path_lookupat+0x90/0x1bc
[ 7858.793013] [<ffffff80083cfe6c>] filename_lookup+0x8c/0x1a0
[ 7858.793018] [<ffffff80083c52d0>] vfs_fstatat+0x84/0x10c
[ 7858.793023] [<ffffff80083c5b00>] SyS_newfstatat+0x28/0x64

Thanks,

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 14:21 ` Christopher Lameter
  2018-04-10 14:26   ` Christopher Lameter
@ 2018-04-10 15:54   ` Matthew Wilcox
  2018-04-10 17:04     ` Christopher Lameter
  1 sibling, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2018-04-10 15:54 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: linux-mm, Matthew Wilcox, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable

On Tue, Apr 10, 2018 at 09:21:20AM -0500, Christopher Lameter wrote:
> On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> 
> > __GFP_ZERO requests that the object be initialised to all-zeroes,
> > while the purpose of a constructor is to initialise an object to a
> > particular pattern.  We cannot do both.  Add a warning to catch any
> > users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> > a constructor.
> 
> Can we move this check out of the critical paths and check for
> a ctor and GFP_ZERO when calling the page allocator? F.e. in
> allocate_slab()?

Are you willing to have this kind of bug go uncaught for a while?
In this specific case, __GFP_ZERO was only being passed on a few of the
calls to kmem_cache_alloc.  So we'd happily trash the constructed object
any time we didn't allocate a page.

I appreciate it's a tradeoff, and we don't want to clutter the critical
path unnecessarily.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 13:53 ` Eric Dumazet
@ 2018-04-10 16:50   ` Matthew Wilcox
  2018-04-10 17:30     ` Christopher Lameter
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2018-04-10 16:50 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linux-mm, Matthew Wilcox, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, Apr 10, 2018 at 06:53:04AM -0700, Eric Dumazet wrote:
> On 04/10/2018 05:53 AM, Matthew Wilcox wrote:
> > From: Matthew Wilcox <mawilcox@microsoft.com>
> > 
> > __GFP_ZERO requests that the object be initialised to all-zeroes,
> > while the purpose of a constructor is to initialise an object to a
> > particular pattern.  We cannot do both.  Add a warning to catch any
> > users who mistakenly pass a __GFP_ZERO flag when allocating a slab with
> > a constructor.
> > 
> > Fixes: d07dbea46405 ("Slab allocators: support __GFP_ZERO in all allocators")
> > Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
> > Cc: stable@vger.kernel.org
> 
> Since there are probably no bug to fix, what about adding the extra check
> only for some DEBUG option ?
> 
> How many caches are still using ctor these days ?

That's a really good question, and strangely hard to find out.  I settled
on "git grep -A4 kmem_cache_alloc" and then searching the 'less' output
with '[^L]);'.

--
arch/powerpc/kvm/book3s_64_mmu_radix.c: kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor);
--
arch/powerpc/mm/init-common.c:  new = kmem_cache_create(name, table_size, align, 0, ctor);
--
arch/powerpc/platforms/cell/spufs/inode.c:      spufs_inode_cache = kmem_cache_create("spufs_inode_cache",
arch/powerpc/platforms/cell/spufs/inode.c-                      sizeof(struct spufs_inode_info), 0,
arch/powerpc/platforms/cell/spufs/inode.c-                      SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, spufs_init_once);
--
arch/sh/mm/pgtable.c:   pgd_cachep = kmem_cache_create("pgd_cache",
arch/sh/mm/pgtable.c-                                  PTRS_PER_PGD * (1<<PTE_MAGNITUDE),
arch/sh/mm/pgtable.c-                                  PAGE_SIZE, SLAB_PANIC, pgd_ctor);
--
arch/sparc/mm/tsb.c:    pgtable_cache = kmem_cache_create("pgtable_cache",
arch/sparc/mm/tsb.c-                                      PAGE_SIZE, PAGE_SIZE,
arch/sparc/mm/tsb.c-                                      0,
arch/sparc/mm/tsb.c-                                      _clear_page);
--
drivers/dax/super.c:    dax_cache = kmem_cache_create("dax_cache", sizeof(struct
 dax_device), 0,
drivers/dax/super.c-                    (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT
|
drivers/dax/super.c-                     SLAB_MEM_SPREAD|SLAB_ACCOUNT),
drivers/dax/super.c-                    init_once);
--
drivers/staging/ncpfs/inode.c:  ncp_inode_cachep = kmem_cache_create("ncp_inode_
cache",
drivers/staging/ncpfs/inode.c-                                       sizeof(stru
ct ncp_inode_info),
drivers/staging/ncpfs/inode.c-                                       0, (SLAB_RE
CLAIM_ACCOUNT|
drivers/staging/ncpfs/inode.c-                                          SLAB_MEM
_SPREAD|SLAB_ACCOUNT),
drivers/staging/ncpfs/inode.c-                                       init_once);
--
drivers/usb/mon/mon_text.c:     rp->e_slab = kmem_cache_create(rp->slab_name,
drivers/usb/mon/mon_text.c-         sizeof(struct mon_event_text), sizeof(long),
 0,
drivers/usb/mon/mon_text.c-         mon_text_ctor);
--
fs/9p/v9fs.c:   v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
fs/9p/v9fs.c-                                     sizeof(struct v9fs_inode),
fs/9p/v9fs.c-                                     0, (SLAB_RECLAIM_ACCOUNT|
fs/9p/v9fs.c-                                         SLAB_MEM_SPREAD|SLAB_ACCOUNT),
fs/9p/v9fs.c-                                     v9fs_inode_init_once);
--
fs/adfs/super.c:        adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
fs/adfs/super.c-                                             sizeof(struct adfs_inode_info),
fs/adfs/super.c-                                             0, (SLAB_RECLAIM_ACCOUNT|
fs/adfs/super.c-                                                SLAB_MEM_SPREAD|SLAB_ACCOUNT),
fs/adfs/super.c-                                             init_once);
... snip a huge number of filesystems ...
--
ipc/mqueue.c:   mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
ipc/mqueue.c-                           sizeof(struct mqueue_inode_info), 0,
ipc/mqueue.c-                           SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, init_once);
--
kernel/fork.c:  sighand_cachep = kmem_cache_create("sighand_cache",
kernel/fork.c-                  sizeof(struct sighand_struct), 0,
kernel/fork.c-                  SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_TYPESAFE_BY_R
CU|
kernel/fork.c-                  SLAB_ACCOUNT, sighand_ctor);
--
lib/radix-tree.c:       radix_tree_node_cachep = kmem_cache_create("radix_tree_n
ode",
lib/radix-tree.c-                       sizeof(struct radix_tree_node), 0,
lib/radix-tree.c-                       SLAB_PANIC | SLAB_RECLAIM_ACCOUNT,
lib/radix-tree.c-                       radix_tree_node_ctor);
--
mm/rmap.c:      anon_vma_cachep = kmem_cache_create("anon_vma", sizeof(struct an
on_vma),
mm/rmap.c-                      0, SLAB_TYPESAFE_BY_RCU|SLAB_PANIC|SLAB_ACCOUNT,
mm/rmap.c-                      anon_vma_ctor);
--
mm/shmem.c:     shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
mm/shmem.c-                             sizeof(struct shmem_inode_info),
mm/shmem.c-                             0, SLAB_PANIC|SLAB_ACCOUNT, shmem_init_inode);
--
net/sunrpc/rpc_pipe.c:  rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
net/sunrpc/rpc_pipe.c-                          sizeof(struct rpc_inode),
net/sunrpc/rpc_pipe.c-                          0, (SLAB_HWCACHE_ALIGN|SLAB_RECL
AIM_ACCOUNT|
net/sunrpc/rpc_pipe.c-                                          SLAB_MEM_SPREAD|
SLAB_ACCOUNT),
net/sunrpc/rpc_pipe.c-                          init_once);
--
security/integrity/iint.c:          kmem_cache_create("iint_cache", sizeof(struc
t integrity_iint_cache),
security/integrity/iint.c-                            0, SLAB_PANIC, init_once);

So aside from the filesystems, about fourteen places use it in the kernel.

If we want to get rid of the concept of constructors, it's doable,
but somebody needs to do the work to show what the effects will be.

For example, I took a quick look at the sighand_struct in kernel/fork.c.
That initialises the spinlock and waitqueue head which are at the end
of sighand_struct.  The caller who allocates sighand_struct touches
the head of the struct.  So if we removed the ctor, we'd touch two
cachelines on allocation instead of one ... but we could rearrange the
sighand_struct to put all the initialised bits in the first cacheline
(and we probably should).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 15:54   ` Matthew Wilcox
@ 2018-04-10 17:04     ` Christopher Lameter
  0 siblings, 0 replies; 22+ messages in thread
From: Christopher Lameter @ 2018-04-10 17:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, Matthew Wilcox, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-kernel, Jan Kara, Jeff Layton,
	Mel Gorman, stable

On Tue, 10 Apr 2018, Matthew Wilcox wrote:

> Are you willing to have this kind of bug go uncaught for a while?

There will be frequent allocations and this will show up at some point.

Also you could put this into the debug only portions somehwere so we
always catch it when debugging is on,
'

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 16:50   ` Matthew Wilcox
@ 2018-04-10 17:30     ` Christopher Lameter
  2018-04-10 17:38       ` Matthew Wilcox
  0 siblings, 1 reply; 22+ messages in thread
From: Christopher Lameter @ 2018-04-10 17:30 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Dumazet, linux-mm, Matthew Wilcox, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, 10 Apr 2018, Matthew Wilcox wrote:

> If we want to get rid of the concept of constructors, it's doable,
> but somebody needs to do the work to show what the effects will be.

How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches?
Those must have a defined state of the objects at all times and a constructor is
required for that. And their use of RCU is required for numerous lockless
lookup algorithms in the kernhel.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 17:30     ` Christopher Lameter
@ 2018-04-10 17:38       ` Matthew Wilcox
  2018-04-10 17:45         ` Christopher Lameter
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2018-04-10 17:38 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Eric Dumazet, linux-mm, Matthew Wilcox, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, Apr 10, 2018 at 12:30:23PM -0500, Christopher Lameter wrote:
> On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> 
> > If we want to get rid of the concept of constructors, it's doable,
> > but somebody needs to do the work to show what the effects will be.
> 
> How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches?
> Those must have a defined state of the objects at all times and a constructor is
> required for that. And their use of RCU is required for numerous lockless
> lookup algorithms in the kernhel.

Not at all times.  Only once they've been used.  Re-constructing them
once they've been used might break the rcu typesafety, I suppose ...
would need to examine the callers.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 17:38       ` Matthew Wilcox
@ 2018-04-10 17:45         ` Christopher Lameter
  2018-04-10 17:50           ` Matthew Wilcox
  0 siblings, 1 reply; 22+ messages in thread
From: Christopher Lameter @ 2018-04-10 17:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Dumazet, linux-mm, Matthew Wilcox, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, 10 Apr 2018, Matthew Wilcox wrote:

> > How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches?
> > Those must have a defined state of the objects at all times and a constructor is
> > required for that. And their use of RCU is required for numerous lockless
> > lookup algorithms in the kernhel.
>
> Not at all times.  Only once they've been used.  Re-constructing them
> once they've been used might break the rcu typesafety, I suppose ...
> would need to examine the callers.

Objects can be freed and reused and still be accessed from code that
thinks the object is the old and not the new object....

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 17:45         ` Christopher Lameter
@ 2018-04-10 17:50           ` Matthew Wilcox
  2018-04-10 20:21             ` Christopher Lameter
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Wilcox @ 2018-04-10 17:50 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Eric Dumazet, linux-mm, Matthew Wilcox, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, Apr 10, 2018 at 12:45:56PM -0500, Christopher Lameter wrote:
> On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> 
> > > How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches?
> > > Those must have a defined state of the objects at all times and a constructor is
> > > required for that. And their use of RCU is required for numerous lockless
> > > lookup algorithms in the kernhel.
> >
> > Not at all times.  Only once they've been used.  Re-constructing them
> > once they've been used might break the rcu typesafety, I suppose ...
> > would need to examine the callers.
> 
> Objects can be freed and reused and still be accessed from code that
> thinks the object is the old and not the new object....

Yes, I know, that's the point of RCU typesafety.  My point is that an
object *which has never been used* can't be accessed.  So you don't *need*
a constructor.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor
  2018-04-10 17:50           ` Matthew Wilcox
@ 2018-04-10 20:21             ` Christopher Lameter
  0 siblings, 0 replies; 22+ messages in thread
From: Christopher Lameter @ 2018-04-10 20:21 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Eric Dumazet, linux-mm, Matthew Wilcox, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-kernel,
	Jan Kara, Jeff Layton, Mel Gorman, stable

On Tue, 10 Apr 2018, Matthew Wilcox wrote:

> > Objects can be freed and reused and still be accessed from code that
> > thinks the object is the old and not the new object....
>
> Yes, I know, that's the point of RCU typesafety.  My point is that an
> object *which has never been used* can't be accessed.  So you don't *need*
> a constructor.

But the object needs to have the proper contents after it was released and
re-allocated. Some objects may rely on contents (like list heads)
surviving the realloc process because access must always be possible.

validate_slab() checks on proper metadata content in a slab
although it does not access the payload. So that may work you separate
the payload init from the metadata init.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2018-04-10 20:21 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-10 12:53 [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Matthew Wilcox
2018-04-10 12:53 ` [PATCH 2/2] page cache: Mask off unwanted GFP flags Matthew Wilcox
2018-04-10 13:08   ` Johannes Weiner
2018-04-10 13:09   ` Michal Hocko
2018-04-10 13:45   ` Minchan Kim
2018-04-10 14:02     ` Matthew Wilcox
2018-04-10 15:18       ` Jaegeuk Kim
2018-04-10 13:46   ` Jan Kara
2018-04-10 13:00 ` [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor Johannes Weiner
2018-04-10 13:07 ` Michal Hocko
2018-04-10 13:40 ` Vlastimil Babka
2018-04-10 13:53 ` Eric Dumazet
2018-04-10 16:50   ` Matthew Wilcox
2018-04-10 17:30     ` Christopher Lameter
2018-04-10 17:38       ` Matthew Wilcox
2018-04-10 17:45         ` Christopher Lameter
2018-04-10 17:50           ` Matthew Wilcox
2018-04-10 20:21             ` Christopher Lameter
2018-04-10 14:21 ` Christopher Lameter
2018-04-10 14:26   ` Christopher Lameter
2018-04-10 15:54   ` Matthew Wilcox
2018-04-10 17:04     ` Christopher Lameter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.