linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
@ 2021-07-28  9:13 Wang Hai
  2021-07-28 10:49 ` Kefeng Wang
  2021-07-28 13:23 ` Michal Hocko
  0 siblings, 2 replies; 9+ messages in thread
From: Wang Hai @ 2021-07-28  9:13 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka, hannes,
	shakeelb, ast, wangkefeng.wang
  Cc: linux-mm, linux-kernel

When I use kfree_rcu() to free a large memory allocated by
kmalloc_node(), the following dump occurs.

BUG: kernel NULL pointer dereference, address: 0000000000000020
[...]
Oops: 0000 [#1] SMP
[...]
Workqueue: events kfree_rcu_work
RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
[...]
Call Trace:
 kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
 kfree_bulk include/linux/slab.h:413 [inline]
 kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
 process_one_work+0x207/0x530 kernel/workqueue.c:2276
 worker_thread+0x320/0x610 kernel/workqueue.c:2422
 kthread+0x13d/0x160 kernel/kthread.c:313
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

When kmalloc_node() a large memory, page is allocated, not slab,
so when freeing memory via kfree_rcu(), this large memory should not
be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
is used for slab.

So in this case, there is no need to do anything with this large
page in memcg_slab_free_hook(), just skip it.

Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
 mm/slab.h | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 67e06637ff2e..247d3f9c21f7 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
 			continue;
 
 		page = virt_to_head_page(p[i]);
+		if (!s_orig) {
+			if (unlikely(!PageSlab(page))) {
+				BUG_ON(!PageCompound(page));
+				continue;
+			}
+			s = page->slab_cache;
+		} else {
+			s = s_orig;
+		}
+
 		objcgs = page_objcgs(page);
 		if (!objcgs)
 			continue;
 
-		if (!s_orig)
-			s = page->slab_cache;
-		else
-			s = s_orig;
-
 		off = obj_to_index(s, page, p[i]);
 		objcg = objcgs[off];
 		if (!objcg)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28  9:13 [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook() Wang Hai
@ 2021-07-28 10:49 ` Kefeng Wang
  2021-07-28 13:23 ` Michal Hocko
  1 sibling, 0 replies; 9+ messages in thread
From: Kefeng Wang @ 2021-07-28 10:49 UTC (permalink / raw)
  To: Wang Hai, cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka,
	hannes, shakeelb, ast
  Cc: linux-mm, linux-kernel


On 2021/7/28 17:13, Wang Hai wrote:
> When I use kfree_rcu() to free a large memory allocated by
> kmalloc_node(), the following dump occurs.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> [...]
> Oops: 0000 [#1] SMP
> [...]
> Workqueue: events kfree_rcu_work
> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> [...]
> Call Trace:
>   kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
>   kfree_bulk include/linux/slab.h:413 [inline]
>   kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
>   process_one_work+0x207/0x530 kernel/workqueue.c:2276
>   worker_thread+0x320/0x610 kernel/workqueue.c:2422
>   kthread+0x13d/0x160 kernel/kthread.c:313
>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>
> When kmalloc_node() a large memory, page is allocated, not slab,
> so when freeing memory via kfree_rcu(), this large memory should not
> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> is used for slab.
>
> So in this case, there is no need to do anything with this large
> page in memcg_slab_free_hook(), just skip it.
>
> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>   mm/slab.h | 15 ++++++++++-----
>   1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 67e06637ff2e..247d3f9c21f7 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
>   			continue;
>   
>   		page = virt_to_head_page(p[i]);
> +		if (!s_orig) {
> +			if (unlikely(!PageSlab(page))) {
> +				BUG_ON(!PageCompound(page));
> +				continue;
> +			}
> +			s = page->slab_cache;
> +		} else {
> +			s = s_orig;
> +		}
> +
>   		objcgs = page_objcgs(page);
>   		if (!objcgs)
>   			continue;
>   
> -		if (!s_orig)
> -			s = page->slab_cache;
> -		else
> -			s = s_orig;
> -
>   		off = obj_to_index(s, page, p[i]);
>   		objcg = objcgs[off];
>   		if (!objcg)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28  9:13 [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook() Wang Hai
  2021-07-28 10:49 ` Kefeng Wang
@ 2021-07-28 13:23 ` Michal Hocko
  2021-07-28 14:10   ` Shakeel Butt
  2021-07-28 14:21   ` Kefeng Wang
  1 sibling, 2 replies; 9+ messages in thread
From: Michal Hocko @ 2021-07-28 13:23 UTC (permalink / raw)
  To: Wang Hai
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka, hannes,
	shakeelb, ast, wangkefeng.wang, linux-mm, linux-kernel

On Wed 28-07-21 17:13:48, Wang Hai wrote:
> When I use kfree_rcu() to free a large memory allocated by
> kmalloc_node(), the following dump occurs.
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> [...]
> Oops: 0000 [#1] SMP
> [...]
> Workqueue: events kfree_rcu_work
> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> [...]
> Call Trace:
>  kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
>  kfree_bulk include/linux/slab.h:413 [inline]
>  kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
>  process_one_work+0x207/0x530 kernel/workqueue.c:2276
>  worker_thread+0x320/0x610 kernel/workqueue.c:2422
>  kthread+0x13d/0x160 kernel/kthread.c:313
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> 
> When kmalloc_node() a large memory, page is allocated, not slab,
> so when freeing memory via kfree_rcu(), this large memory should not
> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> is used for slab.
> 
> So in this case, there is no need to do anything with this large
> page in memcg_slab_free_hook(), just skip it.
> 
> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")

Are you sure that this commit is really breaking the code. Unless I have
missed something there shouldn't be any real change wrt. large
allocations here. page_has_obj_cgroups is just a different name for what
what page_objcgs is giving us.

I haven't studied the kfree_rcu part but isn't the problem its use of
kmem_cache_free_bulk or isn't the problem right there in the bulk free?

> Signed-off-by: Wang Hai <wanghai38@huawei.com>
> ---
>  mm/slab.h | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/slab.h b/mm/slab.h
> index 67e06637ff2e..247d3f9c21f7 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
>  			continue;
>  
>  		page = virt_to_head_page(p[i]);
> +		if (!s_orig) {
> +			if (unlikely(!PageSlab(page))) {
> +				BUG_ON(!PageCompound(page));

BUG_ON is not really a good idea here. Why should we crash the kernel
just because of an unexpected page showing up. Leaking it would be more
appropriate (the same would apply to kfree btw). I would just warn
here. Also don't we need any hookd here.  Looking at kfree path it does
call kfree_hook. Why is that not needed here?

> +				continue;
> +			}
> +			s = page->slab_cache;
> +		} else {
> +			s = s_orig;
> +		}
> +
>  		objcgs = page_objcgs(page);
>  		if (!objcgs)
>  			continue;
>  
> -		if (!s_orig)
> -			s = page->slab_cache;
> -		else
> -			s = s_orig;
> -
>  		off = obj_to_index(s, page, p[i]);
>  		objcg = objcgs[off];
>  		if (!objcg)
> -- 
> 2.17.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28 13:23 ` Michal Hocko
@ 2021-07-28 14:10   ` Shakeel Butt
  2021-07-28 16:43     ` Michal Hocko
  2021-07-28 14:21   ` Kefeng Wang
  1 sibling, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2021-07-28 14:10 UTC (permalink / raw)
  To: Michal Hocko, Roman Gushchin
  Cc: Wang Hai, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Vlastimil Babka, Johannes Weiner,
	Alexei Starovoitov, wangkefeng.wang, Linux MM, LKML

+Roman

On Wed, Jul 28, 2021 at 6:23 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 28-07-21 17:13:48, Wang Hai wrote:
> > When I use kfree_rcu() to free a large memory allocated by
> > kmalloc_node(), the following dump occurs.
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000020
> > [...]
> > Oops: 0000 [#1] SMP
> > [...]
> > Workqueue: events kfree_rcu_work
> > RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> > RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> > RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> > [...]
> > Call Trace:
> >  kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> >  kfree_bulk include/linux/slab.h:413 [inline]
> >  kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> >  process_one_work+0x207/0x530 kernel/workqueue.c:2276
> >  worker_thread+0x320/0x610 kernel/workqueue.c:2422
> >  kthread+0x13d/0x160 kernel/kthread.c:313
> >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >
> > When kmalloc_node() a large memory, page is allocated, not slab,
> > so when freeing memory via kfree_rcu(), this large memory should not
> > be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> > is used for slab.
> >
> > So in this case, there is no need to do anything with this large
> > page in memcg_slab_free_hook(), just skip it.
> >
> > Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
>
> Are you sure that this commit is really breaking the code. Unless I have
> missed something there shouldn't be any real change wrt. large
> allocations here. page_has_obj_cgroups is just a different name for what
> what page_objcgs is giving us.

Actually they are different. For MEMCG_DATA_KMEM page,
page_has_obj_cgroups() will return false while page_objcgs() on
non-VM_DEBUG kernels will return "struct obj_cgroup *" instead of
"struct obj_cgroup **".

>
> I haven't studied the kfree_rcu part but isn't the problem its use of
> kmem_cache_free_bulk or isn't the problem right there in the bulk free?
>

SLUB's kmem_cache_free_bulk() is doing two passes. In first pass
uncharges all the objects and in the second pass frees them to the
slub infra. There is nothing wrong with that. It is just that SLUB
mixes page (for higher order) and slab allocations and requires
special handling.

> > Signed-off-by: Wang Hai <wanghai38@huawei.com>
> > ---
> >  mm/slab.h | 15 ++++++++++-----
> >  1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/slab.h b/mm/slab.h
> > index 67e06637ff2e..247d3f9c21f7 100644
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
> >                       continue;
> >
> >               page = virt_to_head_page(p[i]);
> > +             if (!s_orig) {
> > +                     if (unlikely(!PageSlab(page))) {
> > +                             BUG_ON(!PageCompound(page));
>
> BUG_ON is not really a good idea here. Why should we crash the kernel
> just because of an unexpected page showing up. Leaking it would be more
> appropriate (the same would apply to kfree btw). I would just warn
> here.

The simplest fix would be to not call memcg_slab_free_hook() in
kmem_cache_free_bulk() because slab_free() will call
memcg_slab_free_hook() for individual objects. Not sure if there will
be any performance impact.

>  Also don't we need any hookd here.  Looking at kfree path it does
> call kfree_hook. Why is that not needed here?

These are called later in build_detached_freelist().

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28 13:23 ` Michal Hocko
  2021-07-28 14:10   ` Shakeel Butt
@ 2021-07-28 14:21   ` Kefeng Wang
  2021-07-28 14:26     ` Shakeel Butt
  1 sibling, 1 reply; 9+ messages in thread
From: Kefeng Wang @ 2021-07-28 14:21 UTC (permalink / raw)
  To: Michal Hocko, Wang Hai
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, vbabka, hannes,
	shakeelb, ast, linux-mm, linux-kernel


On 2021/7/28 21:23, Michal Hocko wrote:
> On Wed 28-07-21 17:13:48, Wang Hai wrote:
>> When I use kfree_rcu() to free a large memory allocated by
>> kmalloc_node(), the following dump occurs.
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000020
>> [...]
>> Oops: 0000 [#1] SMP
>> [...]
>> Workqueue: events kfree_rcu_work
>> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
>> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
>> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
>> [...]
>> Call Trace:
>>   kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
>>   kfree_bulk include/linux/slab.h:413 [inline]
>>   kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
>>   process_one_work+0x207/0x530 kernel/workqueue.c:2276
>>   worker_thread+0x320/0x610 kernel/workqueue.c:2422
>>   kthread+0x13d/0x160 kernel/kthread.c:313
>>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>>
>> When kmalloc_node() a large memory, page is allocated, not slab,
>> so when freeing memory via kfree_rcu(), this large memory should not
>> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
>> is used for slab.
>>
>> So in this case, there is no need to do anything with this large
>> page in memcg_slab_free_hook(), just skip it.
>>
>> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> Are you sure that this commit is really breaking the code. Unless I have
Yes, we confirmed that this commit introduces the bug.
> missed something there shouldn't be any real change wrt. large
> allocations here. page_has_obj_cgroups is just a different name for what
> what page_objcgs is giving us.

maybe we could simply use page_objcgs_check to fix the issue ? we will 
check it again.

diff --git a/mm/slab.h b/mm/slab.h
index f997fd5e42c8..58c01a34e5b8 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -346,7 +346,7 @@ static inline void memcg_slab_free_hook(struct 
kmem_cache *s_orig,
                         continue;

                 page = virt_to_head_page(p[i]);
-               objcgs = page_objcgs(page);
+               objcgs = page_objcgs_check(page);
                 if (!objcgs)
                         continue;

>
> I haven't studied the kfree_rcu part but isn't the problem its use of
> kmem_cache_free_bulk or isn't the problem right there in the bulk free?
>
>> Signed-off-by: Wang Hai <wanghai38@huawei.com>
>> ---
>>   mm/slab.h | 15 ++++++++++-----
>>   1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/slab.h b/mm/slab.h
>> index 67e06637ff2e..247d3f9c21f7 100644
>> --- a/mm/slab.h
>> +++ b/mm/slab.h
>> @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
>>   			continue;
>>   
>>   		page = virt_to_head_page(p[i]);
>> +		if (!s_orig) {
>> +			if (unlikely(!PageSlab(page))) {
>> +				BUG_ON(!PageCompound(page));
the same logical is in build_detached_freelist()
> BUG_ON is not really a good idea here. Why should we crash the kernel
> just because of an unexpected page showing up. Leaking it would be more
> appropriate (the same would apply to kfree btw). I would just warn
> here. Also don't we need any hookd here.  Looking at kfree path it does
> call kfree_hook. Why is that not needed here?
kmem_cache_free_bulk
-- memcg_slab_free_hook  // skip objcgs ops for page
-- build_detached_freelist  // kfree_hook and page free


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28 14:21   ` Kefeng Wang
@ 2021-07-28 14:26     ` Shakeel Butt
  2021-07-28 14:33       ` wanghai (M)
  0 siblings, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2021-07-28 14:26 UTC (permalink / raw)
  To: Kefeng Wang
  Cc: Michal Hocko, Wang Hai, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Vlastimil Babka,
	Johannes Weiner, Alexei Starovoitov, Linux MM, LKML

On Wed, Jul 28, 2021 at 7:21 AM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>
> On 2021/7/28 21:23, Michal Hocko wrote:
> > On Wed 28-07-21 17:13:48, Wang Hai wrote:
> >> When I use kfree_rcu() to free a large memory allocated by
> >> kmalloc_node(), the following dump occurs.
> >>
> >> BUG: kernel NULL pointer dereference, address: 0000000000000020
> >> [...]
> >> Oops: 0000 [#1] SMP
> >> [...]
> >> Workqueue: events kfree_rcu_work
> >> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> >> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> >> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> >> [...]
> >> Call Trace:
> >>   kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> >>   kfree_bulk include/linux/slab.h:413 [inline]
> >>   kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> >>   process_one_work+0x207/0x530 kernel/workqueue.c:2276
> >>   worker_thread+0x320/0x610 kernel/workqueue.c:2422
> >>   kthread+0x13d/0x160 kernel/kthread.c:313
> >>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >>
> >> When kmalloc_node() a large memory, page is allocated, not slab,
> >> so when freeing memory via kfree_rcu(), this large memory should not
> >> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> >> is used for slab.
> >>
> >> So in this case, there is no need to do anything with this large
> >> page in memcg_slab_free_hook(), just skip it.
> >>
> >> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> > Are you sure that this commit is really breaking the code. Unless I have
> Yes, we confirmed that this commit introduces the bug.
> > missed something there shouldn't be any real change wrt. large
> > allocations here. page_has_obj_cgroups is just a different name for what
> > what page_objcgs is giving us.
>
> maybe we could simply use page_objcgs_check to fix the issue ? we will
> check it again.

You will see the same crash with page_objcgs_check as well.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28 14:26     ` Shakeel Butt
@ 2021-07-28 14:33       ` wanghai (M)
  2021-07-28 14:44         ` Shakeel Butt
  0 siblings, 1 reply; 9+ messages in thread
From: wanghai (M) @ 2021-07-28 14:33 UTC (permalink / raw)
  To: Shakeel Butt, Kefeng Wang
  Cc: Michal Hocko, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Vlastimil Babka, Johannes Weiner,
	Alexei Starovoitov, Linux MM, LKML


在 2021/7/28 22:26, Shakeel Butt 写道:
> On Wed, Jul 28, 2021 at 7:21 AM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>
>> On 2021/7/28 21:23, Michal Hocko wrote:
>>> On Wed 28-07-21 17:13:48, Wang Hai wrote:
>>>> When I use kfree_rcu() to free a large memory allocated by
>>>> kmalloc_node(), the following dump occurs.
>>>>
>>>> BUG: kernel NULL pointer dereference, address: 0000000000000020
>>>> [...]
>>>> Oops: 0000 [#1] SMP
>>>> [...]
>>>> Workqueue: events kfree_rcu_work
>>>> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
>>>> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
>>>> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
>>>> [...]
>>>> Call Trace:
>>>>    kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
>>>>    kfree_bulk include/linux/slab.h:413 [inline]
>>>>    kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
>>>>    process_one_work+0x207/0x530 kernel/workqueue.c:2276
>>>>    worker_thread+0x320/0x610 kernel/workqueue.c:2422
>>>>    kthread+0x13d/0x160 kernel/kthread.c:313
>>>>    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
>>>>
>>>> When kmalloc_node() a large memory, page is allocated, not slab,
>>>> so when freeing memory via kfree_rcu(), this large memory should not
>>>> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
>>>> is used for slab.
>>>>
>>>> So in this case, there is no need to do anything with this large
>>>> page in memcg_slab_free_hook(), just skip it.
>>>>
>>>> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
>>> Are you sure that this commit is really breaking the code. Unless I have
>> Yes, we confirmed that this commit introduces the bug.
>>> missed something there shouldn't be any real change wrt. large
>>> allocations here. page_has_obj_cgroups is just a different name for what
>>> what page_objcgs is giving us.
>> maybe we could simply use page_objcgs_check to fix the issue ? we will
>> check it again.
> You will see the same crash with page_objcgs_check as well.
> .

I just test it. It won't crash.

This is the test case:

node = kmalloc_node(299999, GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT, -1);
kfree_rcu(node, rcu);


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28 14:33       ` wanghai (M)
@ 2021-07-28 14:44         ` Shakeel Butt
  0 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2021-07-28 14:44 UTC (permalink / raw)
  To: wanghai (M)
  Cc: Kefeng Wang, Michal Hocko, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Vlastimil Babka,
	Johannes Weiner, Alexei Starovoitov, Linux MM, LKML

On Wed, Jul 28, 2021 at 7:34 AM wanghai (M) <wanghai38@huawei.com> wrote:
>
>
> 在 2021/7/28 22:26, Shakeel Butt 写道:
> > On Wed, Jul 28, 2021 at 7:21 AM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >>
> >> On 2021/7/28 21:23, Michal Hocko wrote:
> >>> On Wed 28-07-21 17:13:48, Wang Hai wrote:
> >>>> When I use kfree_rcu() to free a large memory allocated by
> >>>> kmalloc_node(), the following dump occurs.
> >>>>
> >>>> BUG: kernel NULL pointer dereference, address: 0000000000000020
> >>>> [...]
> >>>> Oops: 0000 [#1] SMP
> >>>> [...]
> >>>> Workqueue: events kfree_rcu_work
> >>>> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> >>>> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> >>>> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> >>>> [...]
> >>>> Call Trace:
> >>>>    kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> >>>>    kfree_bulk include/linux/slab.h:413 [inline]
> >>>>    kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> >>>>    process_one_work+0x207/0x530 kernel/workqueue.c:2276
> >>>>    worker_thread+0x320/0x610 kernel/workqueue.c:2422
> >>>>    kthread+0x13d/0x160 kernel/kthread.c:313
> >>>>    ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >>>>
> >>>> When kmalloc_node() a large memory, page is allocated, not slab,
> >>>> so when freeing memory via kfree_rcu(), this large memory should not
> >>>> be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> >>>> is used for slab.
> >>>>
> >>>> So in this case, there is no need to do anything with this large
> >>>> page in memcg_slab_free_hook(), just skip it.
> >>>>
> >>>> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> >>> Are you sure that this commit is really breaking the code. Unless I have
> >> Yes, we confirmed that this commit introduces the bug.
> >>> missed something there shouldn't be any real change wrt. large
> >>> allocations here. page_has_obj_cgroups is just a different name for what
> >>> what page_objcgs is giving us.
> >> maybe we could simply use page_objcgs_check to fix the issue ? we will
> >> check it again.
> > You will see the same crash with page_objcgs_check as well.
> > .
>
> I just test it. It won't crash.
>
> This is the test case:
>
> node = kmalloc_node(299999, GFP_ATOMIC | __GFP_NOWARN | __GFP_ACCOUNT, -1);
> kfree_rcu(node, rcu);
>

Indeed it is checking MEMCG_DATA_OBJCGS directly. Yeah, we can use
page_objcgs_check(). Please send a new version of the patch and do CC
Roman.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
  2021-07-28 14:10   ` Shakeel Butt
@ 2021-07-28 16:43     ` Michal Hocko
  0 siblings, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2021-07-28 16:43 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Roman Gushchin, Wang Hai, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Vlastimil Babka,
	Johannes Weiner, Alexei Starovoitov, wangkefeng.wang, Linux MM,
	LKML

On Wed 28-07-21 07:10:26, Shakeel Butt wrote:
> +Roman
> 
> On Wed, Jul 28, 2021 at 6:23 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 28-07-21 17:13:48, Wang Hai wrote:
> > > When I use kfree_rcu() to free a large memory allocated by
> > > kmalloc_node(), the following dump occurs.
> > >
> > > BUG: kernel NULL pointer dereference, address: 0000000000000020
> > > [...]
> > > Oops: 0000 [#1] SMP
> > > [...]
> > > Workqueue: events kfree_rcu_work
> > > RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> > > RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> > > RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> > > [...]
> > > Call Trace:
> > >  kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> > >  kfree_bulk include/linux/slab.h:413 [inline]
> > >  kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> > >  process_one_work+0x207/0x530 kernel/workqueue.c:2276
> > >  worker_thread+0x320/0x610 kernel/workqueue.c:2422
> > >  kthread+0x13d/0x160 kernel/kthread.c:313
> > >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> > >
> > > When kmalloc_node() a large memory, page is allocated, not slab,
> > > so when freeing memory via kfree_rcu(), this large memory should not
> > > be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is
> > > is used for slab.
> > >
> > > So in this case, there is no need to do anything with this large
> > > page in memcg_slab_free_hook(), just skip it.
> > >
> > > Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> >
> > Are you sure that this commit is really breaking the code. Unless I have
> > missed something there shouldn't be any real change wrt. large
> > allocations here. page_has_obj_cgroups is just a different name for what
> > what page_objcgs is giving us.
> 
> Actually they are different. For MEMCG_DATA_KMEM page,
> page_has_obj_cgroups() will return false while page_objcgs() on
> non-VM_DEBUG kernels will return "struct obj_cgroup *" instead of
> "struct obj_cgroup **".

Right. Thanks for the clarification. I have missed that subtle
difference.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-07-28 16:43 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-28  9:13 [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook() Wang Hai
2021-07-28 10:49 ` Kefeng Wang
2021-07-28 13:23 ` Michal Hocko
2021-07-28 14:10   ` Shakeel Butt
2021-07-28 16:43     ` Michal Hocko
2021-07-28 14:21   ` Kefeng Wang
2021-07-28 14:26     ` Shakeel Butt
2021-07-28 14:33       ` wanghai (M)
2021-07-28 14:44         ` Shakeel Butt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).