All of lore.kernel.org
 help / color / mirror / Atom feed
* Q, slab, kmemleak_erase() and redzone?
@ 2009-11-20 16:14 hooanon05
  2009-11-22  9:35 ` Pekka Enberg
  0 siblings, 1 reply; 10+ messages in thread
From: hooanon05 @ 2009-11-20 16:14 UTC (permalink / raw)
  To: Catalin Marinas, Pekka Enberg; +Cc: linux-kernel


in short: Is it safe to assign NULL to the un-adjusted pointer in
	  kmemleak_erase()?

in long:
I've met a strange redzone warning at deleting a module.

----------------------------------------------------------------------
slab error in verify_redzone_free(): cache `size-256': memory outside object was overwritten
Pid: 5080, comm: modprobe Not tainted 2.6.32-rc7aufsD #320
Call Trace:
 [<ffffffff811010d1>] ? dbg_redzone2+0x31/0x70
 [<ffffffff81101371>] __slab_error+0x21/0x30
 [<ffffffff811022cd>] cache_free_debugcheck+0x1fd/0x300
 [<ffffffff811041e5>] ? __kmem_cache_destroy+0x65/0x110
 [<ffffffff81103bc0>] kfree+0x1c0/0x260
 [<ffffffff811041e5>] __kmem_cache_destroy+0x65/0x110
 [<ffffffff81104336>] kmem_cache_destroy+0xa6/0x100
 [<ffffffffa03130b4>] au_cache_fin+0xb4/0xf0 [aufs]
 [<ffffffff81458387>] ? _write_unlock+0x57/0x70
 [<ffffffffa0348c75>] aufs_exit+0x15/0x39 [aufs]
 [<ffffffff81095cdb>] sys_delete_module+0x19b/0x260
 [<ffffffff81087e3d>] ? trace_hardirqs_on_caller+0x14d/0x1a0
 [<ffffffff8145797e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff810127c2>] system_call_fastpath+0x16/0x1b
ffff88000e87aa40: redzone 1:0xd84156c5635688c0, redzone 2:0x0.
----------------------------------------------------------------------

When delete_module(2) removes aufs.ko, aufs_exit() calls
kmem_cache_destroy() (SLAB) to remove some aufs specific caches whose
name are NOT 'size-256.' Diving into kmemcache, I found the trigger is
in __kmem_cache_destroy(),
	for_each_online_cpu(i)
	    kfree(cachep->array[i]);
The 'cachep->array[i]' is in 'size-256' cache, and kfree() for it
produced the warning.

At first, I thought I made mistakes in my module and corruped
memory. But I could not find such bug.
Inserting some code to check the correctness of cachep->array[i] for
size-256 everywhere led me to kmemleak_erase() in ____cache_alloc().

__cache_alloc()
{
	objp = __do_cache_alloc(cachep, flags);
	:::
	objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller);
	:::
	return objp;
}

__do_cache_alloc()
{
	:::
	objp = ____cache_alloc(cache, flags);
	:::
	return objp;
}
	
____cache_alloc()
{
	objp = ac->entry[--ac->avail];
	or
	objp = cache_alloc_refill(cachep, flags);
	:::
	kmemleak_erase(&ac->entry[ac->avail]);
	return objp;
}

kmemleak_erase(void **ptr)
{
	*ptr = NULL;
}

cache_alloc_debugcheck_after() adjusts the passed objp by
	objp += obj_offset(cachep);
which is 4 or 8 bytes when CONFIG_DEBUG_SLAB is enabled (also
cache_alloc_refill() may return NULL).
In other words, the passed pointer to kmemleak_erase() is not adjusted
yet.
Is it safe to assign NULL to the un-adjusted pointer in kmemleak_erase()?


J. R. Okajima

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-11-20 16:14 Q, slab, kmemleak_erase() and redzone? hooanon05
@ 2009-11-22  9:35 ` Pekka Enberg
  2009-11-24  7:06   ` hooanon05
  0 siblings, 1 reply; 10+ messages in thread
From: Pekka Enberg @ 2009-11-22  9:35 UTC (permalink / raw)
  To: hooanon05; +Cc: Catalin Marinas, linux-kernel

On Fri, Nov 20, 2009 at 6:14 PM,  <hooanon05@yahoo.co.jp> wrote:
> in short: Is it safe to assign NULL to the un-adjusted pointer in
>          kmemleak_erase()?
>
> in long:
> I've met a strange redzone warning at deleting a module.
>
> ----------------------------------------------------------------------
> slab error in verify_redzone_free(): cache `size-256': memory outside object was overwritten
> Pid: 5080, comm: modprobe Not tainted 2.6.32-rc7aufsD #320
> Call Trace:
>  [<ffffffff811010d1>] ? dbg_redzone2+0x31/0x70
>  [<ffffffff81101371>] __slab_error+0x21/0x30
>  [<ffffffff811022cd>] cache_free_debugcheck+0x1fd/0x300
>  [<ffffffff811041e5>] ? __kmem_cache_destroy+0x65/0x110
>  [<ffffffff81103bc0>] kfree+0x1c0/0x260
>  [<ffffffff811041e5>] __kmem_cache_destroy+0x65/0x110
>  [<ffffffff81104336>] kmem_cache_destroy+0xa6/0x100
>  [<ffffffffa03130b4>] au_cache_fin+0xb4/0xf0 [aufs]
>  [<ffffffff81458387>] ? _write_unlock+0x57/0x70
>  [<ffffffffa0348c75>] aufs_exit+0x15/0x39 [aufs]
>  [<ffffffff81095cdb>] sys_delete_module+0x19b/0x260
>  [<ffffffff81087e3d>] ? trace_hardirqs_on_caller+0x14d/0x1a0
>  [<ffffffff8145797e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff810127c2>] system_call_fastpath+0x16/0x1b
> ffff88000e87aa40: redzone 1:0xd84156c5635688c0, redzone 2:0x0.
> ----------------------------------------------------------------------
>
> When delete_module(2) removes aufs.ko, aufs_exit() calls
> kmem_cache_destroy() (SLAB) to remove some aufs specific caches whose
> name are NOT 'size-256.' Diving into kmemcache, I found the trigger is
> in __kmem_cache_destroy(),
>        for_each_online_cpu(i)
>            kfree(cachep->array[i]);
> The 'cachep->array[i]' is in 'size-256' cache, and kfree() for it
> produced the warning.
>
> At first, I thought I made mistakes in my module and corruped
> memory. But I could not find such bug.
> Inserting some code to check the correctness of cachep->array[i] for
> size-256 everywhere led me to kmemleak_erase() in ____cache_alloc().
>
> __cache_alloc()
> {
>        objp = __do_cache_alloc(cachep, flags);
>        :::
>        objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller);
>        :::
>        return objp;
> }
>
> __do_cache_alloc()
> {
>        :::
>        objp = ____cache_alloc(cache, flags);
>        :::
>        return objp;
> }
>
> ____cache_alloc()
> {
>        objp = ac->entry[--ac->avail];
>        or
>        objp = cache_alloc_refill(cachep, flags);
>        :::
>        kmemleak_erase(&ac->entry[ac->avail]);
>        return objp;
> }
>
> kmemleak_erase(void **ptr)
> {
>        *ptr = NULL;
> }
>
> cache_alloc_debugcheck_after() adjusts the passed objp by
>        objp += obj_offset(cachep);
> which is 4 or 8 bytes when CONFIG_DEBUG_SLAB is enabled (also
> cache_alloc_refill() may return NULL).
> In other words, the passed pointer to kmemleak_erase() is not adjusted
> yet.
> Is it safe to assign NULL to the un-adjusted pointer in kmemleak_erase()?

We are setting an element in the per CPU array to NULL so the the
kmemleak code in ____cache_alloc() is safe. Red-zoning is done at the
_object_ which is not touched by kmemleak. Looking at the oops, it
does seem likely that you have a bug in your module (or in some other
part of the kernel).

                        Pekka

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-11-22  9:35 ` Pekka Enberg
@ 2009-11-24  7:06   ` hooanon05
  2009-12-01 11:49     ` Pekka Enberg
  0 siblings, 1 reply; 10+ messages in thread
From: hooanon05 @ 2009-11-24  7:06 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Catalin Marinas, linux-kernel


Pekka Enberg:
> We are setting an element in the per CPU array to NULL so the the
> kmemleak code in ____cache_alloc() is safe. Red-zoning is done at the
> _object_ which is not touched by kmemleak. Looking at the oops, it
> does seem likely that you have a bug in your module (or in some other
> part of the kernel).

Thanks for reply.
In ____cache_alloc(), the variable 'ac' is assigned before
cache_alloc_refill() call, and it is used for the parameter of
kmemleak_erase(). The value may be changed by cache_alloc_refill(),
isn't it?
In this case, kmemleak_erase() receives the incorrect pointer and sets
NULL to somewhere else which may be redzone?

How about this fix?
If cpu_cache_get() call is heavy and we cannot ignore it when KMEMLEAK
is disabled, then a new wrapper may be necessary.

diff --git a/mm/slab.c b/mm/slab.c
index 71e0a1f..3f3e018 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3104,6 +3104,7 @@ static inline void *____cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 	} else {
 		STATS_INC_ALLOCMISS(cachep);
 		objp = cache_alloc_refill(cachep, flags);
+		ac = cpu_cache_get(cachep);
 	}
 	/*
 	 * To avoid a false negative, if an object that is in one of the



J. R. Okajima

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-11-24  7:06   ` hooanon05
@ 2009-12-01 11:49     ` Pekka Enberg
  2009-12-01 17:56       ` Catalin Marinas
  2009-12-02  3:21       ` hooanon05
  0 siblings, 2 replies; 10+ messages in thread
From: Pekka Enberg @ 2009-12-01 11:49 UTC (permalink / raw)
  To: hooanon05; +Cc: Catalin Marinas, linux-kernel

hooanon05@yahoo.co.jp kirjoitti:
> Pekka Enberg:
>> We are setting an element in the per CPU array to NULL so the the
>> kmemleak code in ____cache_alloc() is safe. Red-zoning is done at the
>> _object_ which is not touched by kmemleak. Looking at the oops, it
>> does seem likely that you have a bug in your module (or in some other
>> part of the kernel).
> 
> Thanks for reply.
> In ____cache_alloc(), the variable 'ac' is assigned before
> cache_alloc_refill() call, and it is used for the parameter of
> kmemleak_erase(). The value may be changed by cache_alloc_refill(),
> isn't it?

No. The pointer returned by cpu_cache_get() is not changed by 
cache_alloc_refill(). The contents of the array might change, yes. That 
said, we should check if objp is NULL before calling kmemleak_erase(). 
Catalin?

> In this case, kmemleak_erase() receives the incorrect pointer and sets
> NULL to somewhere else which may be redzone?
> 
> How about this fix?
> If cpu_cache_get() call is heavy and we cannot ignore it when KMEMLEAK
> is disabled, then a new wrapper may be necessary.
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 71e0a1f..3f3e018 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3104,6 +3104,7 @@ static inline void *____cache_alloc(struct kmem_cache *cachep, gfp_t flags)
>  	} else {
>  		STATS_INC_ALLOCMISS(cachep);
>  		objp = cache_alloc_refill(cachep, flags);
> +		ac = cpu_cache_get(cachep);
>  	}
>  	/*
>  	 * To avoid a false negative, if an object that is in one of the
> 
> 
> 
> J. R. Okajima


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-12-01 11:49     ` Pekka Enberg
@ 2009-12-01 17:56       ` Catalin Marinas
  2009-12-02  6:31         ` Pekka Enberg
  2009-12-02  3:21       ` hooanon05
  1 sibling, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2009-12-01 17:56 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: hooanon05, linux-kernel

On Tue, 2009-12-01 at 11:49 +0000, Pekka Enberg wrote:
> hooanon05@yahoo.co.jp kirjoitti:
> > Pekka Enberg:
> >> We are setting an element in the per CPU array to NULL so the the
> >> kmemleak code in ____cache_alloc() is safe. Red-zoning is done at the
> >> _object_ which is not touched by kmemleak. Looking at the oops, it
> >> does seem likely that you have a bug in your module (or in some other
> >> part of the kernel).
> >
> > Thanks for reply.
> > In ____cache_alloc(), the variable 'ac' is assigned before
> > cache_alloc_refill() call, and it is used for the parameter of
> > kmemleak_erase(). The value may be changed by cache_alloc_refill(),
> > isn't it?
> 
> No. The pointer returned by cpu_cache_get() is not changed by
> cache_alloc_refill(). The contents of the array might change, yes. That
> said, we should check if objp is NULL before calling kmemleak_erase().

Possibly but I don't understand why that's needed. The kmemleak_erase()
call just sets the ac->entry[ac->avail] to NULL. If ac->avail is 0, it
doesn't cause any harm.

Thanks.

-- 
Catalin


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-12-01 11:49     ` Pekka Enberg
  2009-12-01 17:56       ` Catalin Marinas
@ 2009-12-02  3:21       ` hooanon05
  1 sibling, 0 replies; 10+ messages in thread
From: hooanon05 @ 2009-12-02  3:21 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Catalin Marinas, linux-kernel


Pekka Enberg:
> No. The pointer returned by cpu_cache_get() is not changed by 
> cache_alloc_refill(). The contents of the array might change, yes. That 
> said, we should check if objp is NULL before calling kmemleak_erase(). 

To test whether objp is NULL or not is another issue.
'ac' is changed actually. You can confirm it by inserting 
	WARN_ON_ONCE(ac != cpu_cache_get(cachep));
after cache_alloc_refill() in ____cache_alloc().

And do you think these comments/code in cache_alloc_refill() are wrong?
{
	:::
	x = cache_grow(cachep, flags | GFP_THISNODE, node, NULL);

	/* cache_grow can reenable interrupts, then ac could change. */
	ac = cpu_cache_get(cachep);
	:::
}


J. R. Okajima

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-12-01 17:56       ` Catalin Marinas
@ 2009-12-02  6:31         ` Pekka Enberg
  2009-12-02  6:32           ` Pekka Enberg
  0 siblings, 1 reply; 10+ messages in thread
From: Pekka Enberg @ 2009-12-02  6:31 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: hooanon05, linux-kernel

Catalin Marinas kirjoitti:
> On Tue, 2009-12-01 at 11:49 +0000, Pekka Enberg wrote:
>> hooanon05@yahoo.co.jp kirjoitti:
>>> Pekka Enberg:
>>>> We are setting an element in the per CPU array to NULL so the the
>>>> kmemleak code in ____cache_alloc() is safe. Red-zoning is done at the
>>>> _object_ which is not touched by kmemleak. Looking at the oops, it
>>>> does seem likely that you have a bug in your module (or in some other
>>>> part of the kernel).
>>> Thanks for reply.
>>> In ____cache_alloc(), the variable 'ac' is assigned before
>>> cache_alloc_refill() call, and it is used for the parameter of
>>> kmemleak_erase(). The value may be changed by cache_alloc_refill(),
>>> isn't it?
>> No. The pointer returned by cpu_cache_get() is not changed by
>> cache_alloc_refill(). The contents of the array might change, yes. That
>> said, we should check if objp is NULL before calling kmemleak_erase().
> 
> Possibly but I don't understand why that's needed. The kmemleak_erase()
> call just sets the ac->entry[ac->avail] to NULL. If ac->avail is 0, it
> doesn't cause any harm.

No, you are absolutely correct. Can you please send an updated patch to 
Catalin that adds a comment on top of the cpu_cache_get() call that 
explains why we need it there?

			Pekka

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-12-02  6:31         ` Pekka Enberg
@ 2009-12-02  6:32           ` Pekka Enberg
  2009-12-02  6:57             ` hooanon05
  0 siblings, 1 reply; 10+ messages in thread
From: Pekka Enberg @ 2009-12-02  6:32 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: hooanon05, linux-kernel

Pekka Enberg kirjoitti:
> Catalin Marinas kirjoitti:
>> On Tue, 2009-12-01 at 11:49 +0000, Pekka Enberg wrote:
>>> hooanon05@yahoo.co.jp kirjoitti:
>>>> Pekka Enberg:
>>>>> We are setting an element in the per CPU array to NULL so the the
>>>>> kmemleak code in ____cache_alloc() is safe. Red-zoning is done at the
>>>>> _object_ which is not touched by kmemleak. Looking at the oops, it
>>>>> does seem likely that you have a bug in your module (or in some other
>>>>> part of the kernel).
>>>> Thanks for reply.
>>>> In ____cache_alloc(), the variable 'ac' is assigned before
>>>> cache_alloc_refill() call, and it is used for the parameter of
>>>> kmemleak_erase(). The value may be changed by cache_alloc_refill(),
>>>> isn't it?
>>> No. The pointer returned by cpu_cache_get() is not changed by
>>> cache_alloc_refill(). The contents of the array might change, yes. That
>>> said, we should check if objp is NULL before calling kmemleak_erase().
>>
>> Possibly but I don't understand why that's needed. The kmemleak_erase()
>> call just sets the ac->entry[ac->avail] to NULL. If ac->avail is 0, it
>> doesn't cause any harm.
> 
> No, you are absolutely correct. Can you please send an updated patch to 
> Catalin that adds a comment on top of the cpu_cache_get() call that 
> explains why we need it there?

Doh, this was supposed to be a reply to Okajima's email :-).

			Pekka

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-12-02  6:32           ` Pekka Enberg
@ 2009-12-02  6:57             ` hooanon05
  2009-12-02  7:01               ` Pekka Enberg
  0 siblings, 1 reply; 10+ messages in thread
From: hooanon05 @ 2009-12-02  6:57 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Catalin Marinas, linux-kernel


Pekka Enberg:
> > No, you are absolutely correct. Can you please send an updated patch to 
> > Catalin that adds a comment on top of the cpu_cache_get() call that 
> > explains why we need it there?
> 
> Doh, this was supposed to be a reply to Okajima's email :-).

Before I send a small patch, let me make sure about other small issues.

- How heavy is 'ac = cpu_cache_get(cachep)' (which will be inserted by
  the patch)?
  It will be compiled and executed regardless CONFIG_DEBUG_KMEMLEAK, and
  it is totally meaningless when DEBUG_KMEMLEAK is disabled. Can we
  ignore this loss?

- Should we add a condition 'if (objp)' before calling kmemleak_erase()?
  As Catalin wrote, it may be harmless. But setting NULL is unnecessary.
  Do you accept this change too?


J. R. Okajima

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Q, slab, kmemleak_erase() and redzone?
  2009-12-02  6:57             ` hooanon05
@ 2009-12-02  7:01               ` Pekka Enberg
  0 siblings, 0 replies; 10+ messages in thread
From: Pekka Enberg @ 2009-12-02  7:01 UTC (permalink / raw)
  To: hooanon05; +Cc: Catalin Marinas, linux-kernel

Hi!

hooanon05@yahoo.co.jp kirjoitti:
> Pekka Enberg:
>>> No, you are absolutely correct. Can you please send an updated patch to 
>>> Catalin that adds a comment on top of the cpu_cache_get() call that 
>>> explains why we need it there?
>> Doh, this was supposed to be a reply to Okajima's email :-).
> 
> Before I send a small patch, let me make sure about other small issues.
> 
> - How heavy is 'ac = cpu_cache_get(cachep)' (which will be inserted by
>   the patch)?
>   It will be compiled and executed regardless CONFIG_DEBUG_KMEMLEAK, and
>   it is totally meaningless when DEBUG_KMEMLEAK is disabled. Can we
>   ignore this loss?

No, it won't be. The compiler should notice it's dead code and remove it 
when CONFIG_KMEMLEAK is disabled.

> - Should we add a condition 'if (objp)' before calling kmemleak_erase()?
>   As Catalin wrote, it may be harmless. But setting NULL is unnecessary.
>   Do you accept this change too?

Yeah, I'd prefer to see the check there. While Catalin is correct, 
that's not obvious from reading the code.

			Pekka

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2009-12-02  7:01 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-20 16:14 Q, slab, kmemleak_erase() and redzone? hooanon05
2009-11-22  9:35 ` Pekka Enberg
2009-11-24  7:06   ` hooanon05
2009-12-01 11:49     ` Pekka Enberg
2009-12-01 17:56       ` Catalin Marinas
2009-12-02  6:31         ` Pekka Enberg
2009-12-02  6:32           ` Pekka Enberg
2009-12-02  6:57             ` hooanon05
2009-12-02  7:01               ` Pekka Enberg
2009-12-02  3:21       ` hooanon05

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.