All of lore.kernel.org
 help / color / mirror / Atom feed
* [linux-next-20130422] Bug in bootup code or debug code?
@ 2013-04-24 12:08 Tetsuo Handa
  2013-04-25 12:20 ` [linux-next-20130422] Bug in SLAB? Tetsuo Handa
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-24 12:08 UTC (permalink / raw)
  To: linux-kernel

Hello.

linux-next-20130422 does not boot when built with CONFIG_DEBUG_SLAB=y &&
CONFIG_DEBUG_SPINLOCK=y && CONFIG_DEBUG_PAGEALLOC=y .

It hangs (with CPU#0 spinning) immediately after printing

  Decompressing Linux... Parsing ELF... done.
  Booting the kernel.

lines.

Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc8-next-20130422 .

Any clue before starting bisection?

Regards.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [linux-next-20130422] Bug in SLAB?
  2013-04-24 12:08 [linux-next-20130422] Bug in bootup code or debug code? Tetsuo Handa
@ 2013-04-25 12:20 ` Tetsuo Handa
  2013-04-29  2:40   ` Tetsuo Handa
  2013-04-29  2:56     ` Zhan Jianyu
  0 siblings, 2 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-25 12:20 UTC (permalink / raw)
  To: glommer, cl, penberg; +Cc: linux-kernel

Tetsuo Handa wrote:
> Hello.
> 
> linux-next-20130422 does not boot when built with CONFIG_DEBUG_SLAB=y &&
> CONFIG_DEBUG_SPINLOCK=y && CONFIG_DEBUG_PAGEALLOC=y .
> 
> It hangs (with CPU#0 spinning) immediately after printing
> 
>   Decompressing Linux... Parsing ELF... done.
>   Booting the kernel.
> 
> lines.
> 
> Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc8-next-20130422 .
> 
> Any clue before starting bisection?
> 
> Regards.
> 

Bisection (with a build fix from commit db845067 "slab: Fixup
CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections") reached commit e3366016
"slab: Use common kmalloc_index/kmalloc_size functions".
Would you have a look at commit e3366016?



By the way, I have a fear regarding commit e3366016.

include/linux/slab_def.h says

  extern struct kmem_cache *kmalloc_caches[PAGE_SHIFT + MAX_ORDER];
  extern struct kmem_cache *kmalloc_dma_caches[PAGE_SHIFT + MAX_ORDER];

while mm/slab.c says

  struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
  struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];

and include/linux/slab.h says

  #define KMALLOC_SHIFT_HIGH      ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
                                  (MAX_ORDER + PAGE_SHIFT - 1) : 25)

.

If (MAX_ORDER + PAGE_SHIFT - 1) <= 25 is true, mm/slab.c will allocate

  struct kmem_cache *kmalloc_caches[MAX_ORDER + PAGE_SHIFT - 1 + 1];
  struct kmem_cache *kmalloc_dma_caches[MAX_ORDER + PAGE_SHIFT - 1 + 1];

which matches the size declared in include/linux/slab_def.h .

If (MAX_ORDER + PAGE_SHIFT - 1) > 25 is true (I don't know whether such case
happens), mm/slab.c will allocate

  struct kmem_cache *kmalloc_caches[25 + 1];
  struct kmem_cache *kmalloc_dma_caches[25 + 1];

which is smaller than the size declared in include/linux/slab_def.h .

Well, this mismatch seems to be fixed by commit 9425c58e "slab: Common
definition for the array of kmalloc caches". But usage like

  for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
          struct kmem_cache_node *n;
          struct kmem_cache *cache = kmalloc_caches[i];

is remaining.

Also, kmalloc_index() in include/linux/slab.h can return 0 to 26.

If (MAX_ORDER + PAGE_SHIFT - 1) > 25 is true and
kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
happens), kmalloc_caches[26] is beyond the array, for kmalloc_caches[26]
allows 0 to 25.

If (MAX_ORDER + PAGE_SHIFT - 1) <= 25 is true and
kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
happens), kmalloc_caches[26] is beyond the array, for
kmalloc_caches[MAX_ORDER + PAGE_SHIFT] allows 0 to MAX_ORDER + PAGE_SHIFT - 1.

Would you recheck that the array size is correct?



Regards.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-25 12:20 ` [linux-next-20130422] Bug in SLAB? Tetsuo Handa
@ 2013-04-29  2:40   ` Tetsuo Handa
  2013-04-29 10:12     ` Pekka Enberg
  2013-04-29  2:56     ` Zhan Jianyu
  1 sibling, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-29  2:40 UTC (permalink / raw)
  To: glommer, cl, penberg; +Cc: linux-kernel

Tetsuo Handa wrote:
> Also, kmalloc_index() in include/linux/slab.h can return 0 to 26.
> 
> If (MAX_ORDER + PAGE_SHIFT - 1) > 25 is true and
> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
> happens), kmalloc_caches[26] is beyond the array, for kmalloc_caches[26]
> allows 0 to 25.
> 
> If (MAX_ORDER + PAGE_SHIFT - 1) <= 25 is true and
> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
> happens), kmalloc_caches[26] is beyond the array, for
> kmalloc_caches[MAX_ORDER + PAGE_SHIFT] allows 0 to MAX_ORDER + PAGE_SHIFT - 1.
> 
> Would you recheck that the array size is correct?
> 

I confirmed (on x86_32) that

  volatile unsigned int size = 8 * 1024 * 1024;
  kmalloc(size, GFP_KERNEL);

causes no warning at compile time and returns NULL at runtime. But

  unsigned int size = 8 * 1024 * 1024;
  kmalloc(size, GFP_KERNEL);

causes compile time warning

  include/linux/slab_def.h:136: warning: array subscript is above array bounds

and runtime bug.

  BUG: unable to handle kernel NULL pointer dereference at 00000058
  IP: [<c10b9d76>] kmem_cache_alloc+0x26/0xb0

I confirmed (on x86_32) that

  kmalloc(64 * 1024 * 1024, GFP_KERNEL);

causes compile time warning

  include/linux/slab_def.h:136: warning: array subscript is above array bounds

and runtime bug.

  Kernel BUG at c10b9c5b [verbose debug info unavailable]
  invalid opcode: 0000 [#1] SMP

Also,

  volatile unsigned int size = 64 * 1024 * 1024;
  kmalloc(size, GFP_KERNEL);

causes no warning at compile time but runtime bug.

  Kernel BUG at c10b9c5b [verbose debug info unavailable]
  invalid opcode: 0000 [#1] SMP

There are kernel modules which expect kmalloc() to return NULL rather than
oops when the requested size is too large.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-25 12:20 ` [linux-next-20130422] Bug in SLAB? Tetsuo Handa
@ 2013-04-29  2:56     ` Zhan Jianyu
  2013-04-29  2:56     ` Zhan Jianyu
  1 sibling, 0 replies; 57+ messages in thread
From: Zhan Jianyu @ 2013-04-29  2:56 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, cl, penberg, linux-kernel, linux-mm

On Thu, Apr 25, 2013 at 8:20 PM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Bisection (with a build fix from commit db845067 "slab: Fixup
> CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections") reached commit e3366016
> "slab: Use common kmalloc_index/kmalloc_size functions".
> Would you have a look at commit e3366016?


Cc:   linux-mm@kvack.org



--

Regards,
Zhan Jianyu

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
@ 2013-04-29  2:56     ` Zhan Jianyu
  0 siblings, 0 replies; 57+ messages in thread
From: Zhan Jianyu @ 2013-04-29  2:56 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, cl, penberg, linux-kernel, linux-mm

On Thu, Apr 25, 2013 at 8:20 PM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Bisection (with a build fix from commit db845067 "slab: Fixup
> CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections") reached commit e3366016
> "slab: Use common kmalloc_index/kmalloc_size functions".
> Would you have a look at commit e3366016?


Cc:   linux-mm@kvack.org



--

Regards,
Zhan Jianyu

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29  2:56     ` Zhan Jianyu
  (?)
@ 2013-04-29  3:56     ` Tetsuo Handa
  -1 siblings, 0 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-29  3:56 UTC (permalink / raw)
  To: linux-mm

Zhan Jianyu wrote:
> > Bisection (with a build fix from commit db845067 "slab: Fixup
> > CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections") reached commit e3366016
> > "slab: Use common kmalloc_index/kmalloc_size functions".
> > Would you have a look at commit e3366016?
> 
> 
> Cc:   linux-mm@kvack.org
> 

Thanks for ML for reporting, Zhan.

The thread starts from https://lkml.org/lkml/2013/4/24/234 .
This regression is about to be merged into linux.git.

Problem with debug options turned on:

  It hangs (with CPU#0 spinning) immediately after printing

    Decompressing Linux... Parsing ELF... done.
    Booting the kernel.

  lines.

Problem with debug options turned off:

  kmalloc() triggers oops when the requested size is too large.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29  2:40   ` Tetsuo Handa
@ 2013-04-29 10:12     ` Pekka Enberg
  2013-04-29 14:44       ` Glauber Costa
  0 siblings, 1 reply; 57+ messages in thread
From: Pekka Enberg @ 2013-04-29 10:12 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Glauber Costa, Christoph Lameter, LKML

On Mon, Apr 29, 2013 at 5:40 AM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Tetsuo Handa wrote:
>> Also, kmalloc_index() in include/linux/slab.h can return 0 to 26.
>>
>> If (MAX_ORDER + PAGE_SHIFT - 1) > 25 is true and
>> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
>> happens), kmalloc_caches[26] is beyond the array, for kmalloc_caches[26]
>> allows 0 to 25.
>>
>> If (MAX_ORDER + PAGE_SHIFT - 1) <= 25 is true and
>> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
>> happens), kmalloc_caches[26] is beyond the array, for
>> kmalloc_caches[MAX_ORDER + PAGE_SHIFT] allows 0 to MAX_ORDER + PAGE_SHIFT - 1.
>>
>> Would you recheck that the array size is correct?
>>
>
> I confirmed (on x86_32) that
>
>   volatile unsigned int size = 8 * 1024 * 1024;
>   kmalloc(size, GFP_KERNEL);
>
> causes no warning at compile time and returns NULL at runtime. But
>
>   unsigned int size = 8 * 1024 * 1024;
>   kmalloc(size, GFP_KERNEL);
>
> causes compile time warning
>
>   include/linux/slab_def.h:136: warning: array subscript is above array bounds
>
> and runtime bug.
>
>   BUG: unable to handle kernel NULL pointer dereference at 00000058
>   IP: [<c10b9d76>] kmem_cache_alloc+0x26/0xb0
>
> I confirmed (on x86_32) that
>
>   kmalloc(64 * 1024 * 1024, GFP_KERNEL);
>
> causes compile time warning
>
>   include/linux/slab_def.h:136: warning: array subscript is above array bounds
>
> and runtime bug.
>
>   Kernel BUG at c10b9c5b [verbose debug info unavailable]
>   invalid opcode: 0000 [#1] SMP
>
> Also,
>
>   volatile unsigned int size = 64 * 1024 * 1024;
>   kmalloc(size, GFP_KERNEL);
>
> causes no warning at compile time but runtime bug.
>
>   Kernel BUG at c10b9c5b [verbose debug info unavailable]
>   invalid opcode: 0000 [#1] SMP
>
> There are kernel modules which expect kmalloc() to return NULL rather than
> oops when the requested size is too large.

Christoph, Glauber, it seems like commit e3366016 ("slab: Use common
kmalloc_index/kmalloc_size functions") is causing some problems here.
Can you please take a look?

                        Pekka

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 10:12     ` Pekka Enberg
@ 2013-04-29 14:44       ` Glauber Costa
  2013-04-29 14:59         ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Glauber Costa @ 2013-04-29 14:44 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Tetsuo Handa, Christoph Lameter, LKML

[-- Attachment #1: Type: text/plain, Size: 2516 bytes --]

On 04/29/2013 02:12 PM, Pekka Enberg wrote:
> On Mon, Apr 29, 2013 at 5:40 AM, Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>> Tetsuo Handa wrote:
>>> Also, kmalloc_index() in include/linux/slab.h can return 0 to 26.
>>>
>>> If (MAX_ORDER + PAGE_SHIFT - 1) > 25 is true and
>>> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
>>> happens), kmalloc_caches[26] is beyond the array, for kmalloc_caches[26]
>>> allows 0 to 25.
>>>
>>> If (MAX_ORDER + PAGE_SHIFT - 1) <= 25 is true and
>>> kmalloc_index(64 * 1024 * 1024) is requested (I don't know whether such case
>>> happens), kmalloc_caches[26] is beyond the array, for
>>> kmalloc_caches[MAX_ORDER + PAGE_SHIFT] allows 0 to MAX_ORDER + PAGE_SHIFT - 1.
>>>
>>> Would you recheck that the array size is correct?
>>>
>>
>> I confirmed (on x86_32) that
>>
>>   volatile unsigned int size = 8 * 1024 * 1024;
>>   kmalloc(size, GFP_KERNEL);
>>
>> causes no warning at compile time and returns NULL at runtime. But
>>
>>   unsigned int size = 8 * 1024 * 1024;
>>   kmalloc(size, GFP_KERNEL);
>>
>> causes compile time warning
>>
>>   include/linux/slab_def.h:136: warning: array subscript is above array bounds
>>
>> and runtime bug.
>>
>>   BUG: unable to handle kernel NULL pointer dereference at 00000058
>>   IP: [<c10b9d76>] kmem_cache_alloc+0x26/0xb0
>>
>> I confirmed (on x86_32) that
>>
>>   kmalloc(64 * 1024 * 1024, GFP_KERNEL);
>>
>> causes compile time warning
>>
>>   include/linux/slab_def.h:136: warning: array subscript is above array bounds
>>
>> and runtime bug.
>>
>>   Kernel BUG at c10b9c5b [verbose debug info unavailable]
>>   invalid opcode: 0000 [#1] SMP
>>
>> Also,
>>
>>   volatile unsigned int size = 64 * 1024 * 1024;
>>   kmalloc(size, GFP_KERNEL);
>>
>> causes no warning at compile time but runtime bug.
>>
>>   Kernel BUG at c10b9c5b [verbose debug info unavailable]
>>   invalid opcode: 0000 [#1] SMP
>>
>> There are kernel modules which expect kmalloc() to return NULL rather than
>> oops when the requested size is too large.
> 
> Christoph, Glauber, it seems like commit e3366016 ("slab: Use common
> kmalloc_index/kmalloc_size functions") is causing some problems here.
> Can you please take a look?
> 
>                         Pekka
> 
I believe this is because the code now always assume that the cache is
found when a constant is passed. Before this patch, we had a "found"
statement that was mistakenly removed.

If I am right, the following (untested) patch should solve the problem.


[-- Attachment #2: 0001-slab-fix-kmalloc-regression-with-big-constant-alloca.patch --]
[-- Type: text/x-patch, Size: 1551 bytes --]

>From 45ad685c47e4c6bba7d772dbd298d321a73dc78a Mon Sep 17 00:00:00 2001
From: Glauber Costa <glommer@openvz.org>
Date: Mon, 29 Apr 2013 18:36:59 +0400
Subject: [PATCH] slab: fix kmalloc regression with big constant allocations

kmalloc have a maximum allocation size, but we are currently not
respecting it. We create a list of kmalloc sizes and return an array
index up to 26, which may or may not be within our limits, since this is
architecture dependent.

This patch fix this by making the slab code do the same as slub does: An
explicit check for the maximum size before the call to kmalloc_index,
and the use of the kmalloc non-constant fallback function in that case.

Signed-off-by: Glauber Costa <glommer@openvz.org>
---
 include/linux/slab_def.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 113ec08..3a240c8 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -172,8 +172,10 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 		if (!size)
 			return ZERO_SIZE_PTR;
 
-		i = kmalloc_index(size);
+		if (size > KMALLOC_MAX_SIZE)
+			goto not_found;
 
+		i = kmalloc_index(size);
 #ifdef CONFIG_ZONE_DMA
 		if (flags & GFP_DMA)
 			cachep = kmalloc_dma_caches[i];
@@ -183,6 +185,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 
 		return kmem_cache_alloc_node_trace(cachep, flags, node, size);
 	}
+not_found:
 	return __kmalloc_node(size, flags, node);
 }
 
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 14:44       ` Glauber Costa
@ 2013-04-29 14:59         ` Christoph Lameter
  2013-04-29 15:16           ` Glauber Costa
  2013-04-29 15:28           ` Tetsuo Handa
  0 siblings, 2 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-04-29 14:59 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, Tetsuo Handa, LKML

On Mon, 29 Apr 2013, Glauber Costa wrote:

> >> causes no warning at compile time and returns NULL at runtime. But
> >>
> >>   unsigned int size = 8 * 1024 * 1024;
> >>   kmalloc(size, GFP_KERNEL);
> >>
> >> causes compile time warning
> >>
> >>   include/linux/slab_def.h:136: warning: array subscript is above array bounds
> >>
> >> and runtime bug.

SLAB should have support up to 2 << 25 = 1 mb << 5 = 32M

> I believe this is because the code now always assume that the cache is
> found when a constant is passed. Before this patch, we had a "found"
> statement that was mistakenly removed.

The code in kmalloc_index() creates a BUG() and preferentially should
create a compile time failure when a number that is too big is passed to it.

What is MAX_ORDER on the architecture?

An allocation size of more than MAX_ORDER is not supported by the page
allocator or by slab. It is safe to return NULL in that case.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 14:59         ` Christoph Lameter
@ 2013-04-29 15:16           ` Glauber Costa
  2013-04-29 15:46             ` Christoph Lameter
  2013-04-29 15:28           ` Tetsuo Handa
  1 sibling, 1 reply; 57+ messages in thread
From: Glauber Costa @ 2013-04-29 15:16 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Tetsuo Handa, LKML

On 04/29/2013 06:59 PM, Christoph Lameter wrote:
> The code in kmalloc_index() creates a BUG() and preferentially should
> create a compile time failure when a number that is too big is passed to it.
> 
> What is MAX_ORDER on the architecture?

Returning NULL is fine, but kmalloc_index currently does not check for
MAX_ORDER at all.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 14:59         ` Christoph Lameter
  2013-04-29 15:16           ` Glauber Costa
@ 2013-04-29 15:28           ` Tetsuo Handa
  2013-04-29 16:16             ` Tetsuo Handa
  2013-04-29 17:48             ` Christoph Lameter
  1 sibling, 2 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-29 15:28 UTC (permalink / raw)
  To: cl, glommer; +Cc: penberg, linux-kernel

Glauber Costa wrote:
> If I am right, the following (untested) patch should solve the problem.

This patch did not help;

  kmalloc(8 * 1024 * 1024, GFP_KERNEL)

still causes both

  include/linux/slab_def.h:136: warning: array subscript is above array bounds

and

  BUG: unable to handle kernel NULL pointer dereference at 00000058
  IP: [<c10b9d76>] kmem_cache_alloc+0x26/0xb0

.

Christoph Lameter wrote:
> What is MAX_ORDER on the architecture?

In my environment (x86_32), the constants are

  MAX_ORDER=11 PAGE_SHIFT=12 KMALLOC_SHIFT_HIGH=22 KMALLOC_MAX_SIZE=4194304

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 15:16           ` Glauber Costa
@ 2013-04-29 15:46             ` Christoph Lameter
  2013-04-29 15:54               ` Glauber Costa
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-04-29 15:46 UTC (permalink / raw)
  To: Glauber Costa; +Cc: Pekka Enberg, Tetsuo Handa, LKML

On Mon, 29 Apr 2013, Glauber Costa wrote:

> On 04/29/2013 06:59 PM, Christoph Lameter wrote:
> > The code in kmalloc_index() creates a BUG() and preferentially should
> > create a compile time failure when a number that is too big is passed to it.
> >
> > What is MAX_ORDER on the architecture?
>
> Returning NULL is fine, but kmalloc_index currently does not check for
> MAX_ORDER at all.

Could easily add that and BUG() on it?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 15:46             ` Christoph Lameter
@ 2013-04-29 15:54               ` Glauber Costa
  0 siblings, 0 replies; 57+ messages in thread
From: Glauber Costa @ 2013-04-29 15:54 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Tetsuo Handa, LKML

On 04/29/2013 07:46 PM, Christoph Lameter wrote:
> On Mon, 29 Apr 2013, Glauber Costa wrote:
> 
>> On 04/29/2013 06:59 PM, Christoph Lameter wrote:
>>> The code in kmalloc_index() creates a BUG() and preferentially should
>>> create a compile time failure when a number that is too big is passed to it.
>>>
>>> What is MAX_ORDER on the architecture?
>>
>> Returning NULL is fine, but kmalloc_index currently does not check for
>> MAX_ORDER at all.
> 
> Could easily add that and BUG() on it?
> 
We could, but now I am intrigued by the fact that the patch I sent does
not fix Tetsuo's problem. 8M should be well within MAX_ORDER

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 15:28           ` Tetsuo Handa
@ 2013-04-29 16:16             ` Tetsuo Handa
  2013-05-09 12:25               ` Tetsuo Handa
  2013-04-29 17:48             ` Christoph Lameter
  1 sibling, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-29 16:16 UTC (permalink / raw)
  To: cl, glommer; +Cc: penberg, linux-kernel

Tetsuo Handa wrote:
> Glauber Costa wrote:
> > If I am right, the following (untested) patch should solve the problem.
> 
> This patch did not help;
> 
>   kmalloc(8 * 1024 * 1024, GFP_KERNEL)
> 
> still causes both
> 
>   include/linux/slab_def.h:136: warning: array subscript is above array bounds
> 
> and
> 
>   BUG: unable to handle kernel NULL pointer dereference at 00000058
>   IP: [<c10b9d76>] kmem_cache_alloc+0x26/0xb0
> 
> .

I copied this patch (which modifies "static __always_inline void *kmalloc_node
(size_t size, gfp_t flags, int node)") to "static __always_inline void *kmalloc
(size_t size, gfp_t flags)", but it didn't help.

-------- test cases --------
        volatile unsigned int size = 0;
        void *ptr = kmalloc(size, GFP_KERNEL);
        printk("kmalloc(0)=%p\n", ptr);
        kfree(ptr);
        for (size = 1; size <= 256 * 1024 * 1024; size *= 2) {
                ptr = kmalloc(size, GFP_KERNEL);
                printk("kmalloc(%u)=%p\n", size, ptr);
                kfree(ptr);
        }
-------- test cases --------

kmalloc(0)=00000010
kmalloc(1)=de7eb840
kmalloc(2)=de7eb840
kmalloc(4)=de7eb840
kmalloc(8)=de7eb840
kmalloc(16)=de7eb840
kmalloc(32)=de7eb840
kmalloc(64)=de28ae40
kmalloc(128)=de5ba140
kmalloc(256)=de69e180
kmalloc(512)=dea14600
kmalloc(1024)=de522400
kmalloc(2048)=de1e4000
kmalloc(4096)=de9b3000
kmalloc(8192)=de24a000
kmalloc(16384)=de444000
kmalloc(32768)=de9b8000
kmalloc(65536)=dea20000
kmalloc(131072)=de980000
kmalloc(262144)=deb00000
kmalloc(524288)=dea80000
kmalloc(1048576)=de800000
kmalloc(2097152)=dde00000
kmalloc(4194304)=d5800000
kmalloc(8388608)=  (null)
kmalloc(16777216)=  (null)

Got BUG() at 32 * 1024 * 1024 bytes.

  Kernel BUG at c10b9c9b [verbose debug info unavailable]
  invalid opcode: 0000 [#1] SMP

There still seems to be a bug in the non-constant case.



> Christoph Lameter wrote:
> > What is MAX_ORDER on the architecture?
> 
> In my environment (x86_32), the constants are
> 
>   MAX_ORDER=11 PAGE_SHIFT=12 KMALLOC_SHIFT_HIGH=22 KMALLOC_MAX_SIZE=4194304
> 

I don't know if any, but on an architecture with PAGE_SHIFT + MAX_ORDER > 26,

  static void init_node_lock_keys(int q)
  {
          int i;
  
          if (slab_state < UP)
                  return;
  
          for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
                  struct kmem_cache_node *n;
                  struct kmem_cache *cache = kmalloc_caches[i];

looks like out of bounds access due to

  #define KMALLOC_SHIFT_HIGH      ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
                                  (MAX_ORDER + PAGE_SHIFT - 1) : 25)

and

  struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];

.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 15:28           ` Tetsuo Handa
  2013-04-29 16:16             ` Tetsuo Handa
@ 2013-04-29 17:48             ` Christoph Lameter
  2013-04-29 21:45               ` Tetsuo Handa
  1 sibling, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-04-29 17:48 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel


On Tue, 30 Apr 2013, Tetsuo Handa wrote:

> Glauber Costa wrote:
> > If I am right, the following (untested) patch should solve the problem.
>
> This patch did not help;
>
>   kmalloc(8 * 1024 * 1024, GFP_KERNEL)
>
> still causes both
>
>   include/linux/slab_def.h:136: warning: array subscript is above array bounds
>
> and
>
>   BUG: unable to handle kernel NULL pointer dereference at 00000058
>   IP: [<c10b9d76>] kmem_cache_alloc+0x26/0xb0
>
> .
>
> Christoph Lameter wrote:
> > What is MAX_ORDER on the architecture?
>
> In my environment (x86_32), the constants are
>
>   MAX_ORDER=11 PAGE_SHIFT=12 KMALLOC_SHIFT_HIGH=22 KMALLOC_MAX_SIZE=4194304
>

Ok so the maximum allocation is 11+12=23 which is 8M. KMALLOC_MAX_SIZE
amd KMALLOC_SHIFT_HIGH are wrong.

Take the -1 off the constants under #ifdef CONFIG_SLAB in

include/linux/slab.h
Index: linux/include/linux/slab.h
===================================================================
--- linux.orig/include/linux/slab.h	2013-04-29 12:44:42.339011800 -0500
+++ linux/include/linux/slab.h	2013-04-29 12:48:11.446435859 -0500
@@ -176,8 +176,8 @@ struct kmem_cache {
  * to do various tricks to work around compiler limitations in order to
  * ensure proper constant folding.
  */
-#define KMALLOC_SHIFT_HIGH	((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
-				(MAX_ORDER + PAGE_SHIFT - 1) : 25)
+#define KMALLOC_SHIFT_HIGH	((MAX_ORDER + PAGE_SHIFT) <= 26 ? \
+				(MAX_ORDER + PAGE_SHIFT) : 26)
 #define KMALLOC_SHIFT_MAX	KMALLOC_SHIFT_HIGH
 #define KMALLOC_SHIFT_LOW	5
 #else
@@ -206,9 +206,9 @@ struct kmem_cache {
 #define KMALLOC_MIN_SIZE (1 << KMALLOC_SHIFT_LOW)
 #endif

-extern struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
+extern struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH];
 #ifdef CONFIG_ZONE_DMA
-extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
+extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH];
 #endif

 /*


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 17:48             ` Christoph Lameter
@ 2013-04-29 21:45               ` Tetsuo Handa
  2013-04-30 14:26                 ` Christoph Lameter
                                   ` (2 more replies)
  0 siblings, 3 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-29 21:45 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Christoph Lameter wrote:
> Ok so the maximum allocation is 11+12=23 which is 8M. KMALLOC_MAX_SIZE
> amd KMALLOC_SHIFT_HIGH are wrong.
> 
> Take the -1 off the constants under #ifdef CONFIG_SLAB in

Current diff is:

---------- patch start ----------
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 0c62175..889d6ef 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -189,8 +189,8 @@ struct kmem_cache {
  * to do various tricks to work around compiler limitations in order to
  * ensure proper constant folding.
  */
-#define KMALLOC_SHIFT_HIGH	((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
-				(MAX_ORDER + PAGE_SHIFT - 1) : 25)
+#define KMALLOC_SHIFT_HIGH	((MAX_ORDER + PAGE_SHIFT) <= 26 ? \
+				(MAX_ORDER + PAGE_SHIFT) : 26)
 #define KMALLOC_SHIFT_MAX	KMALLOC_SHIFT_HIGH
 #ifndef KMALLOC_SHIFT_LOW
 #define KMALLOC_SHIFT_LOW	5
@@ -221,9 +221,9 @@ struct kmem_cache {
 #define KMALLOC_MIN_SIZE (1 << KMALLOC_SHIFT_LOW)
 #endif
 
-extern struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
+extern struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH];
 #ifdef CONFIG_ZONE_DMA
-extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
+extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH];
 #endif
 
 /*
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 113ec08..be1446a 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -126,6 +126,9 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
 		if (!size)
 			return ZERO_SIZE_PTR;
 
+		if (size > KMALLOC_MAX_SIZE)
+			goto not_found;
+
 		i = kmalloc_index(size);
 
 #ifdef CONFIG_ZONE_DMA
@@ -139,6 +142,7 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
 
 		return ret;
 	}
+not_found:
 	return __kmalloc(size, flags);
 }
 
@@ -172,8 +176,10 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 		if (!size)
 			return ZERO_SIZE_PTR;
 
-		i = kmalloc_index(size);
+		if (size > KMALLOC_MAX_SIZE)
+			goto not_found;
 
+		i = kmalloc_index(size);
 #ifdef CONFIG_ZONE_DMA
 		if (flags & GFP_DMA)
 			cachep = kmalloc_dma_caches[i];
@@ -183,6 +189,7 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 
 		return kmem_cache_alloc_node_trace(cachep, flags, node, size);
 	}
+not_found:
 	return __kmalloc_node(size, flags, node);
 }
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 2f0e7d5..083e7c7 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -319,11 +319,11 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name, size_t size,
 	return s;
 }
 
-struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
+struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH];
 EXPORT_SYMBOL(kmalloc_caches);
 
 #ifdef CONFIG_ZONE_DMA
-struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
+struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH];
 EXPORT_SYMBOL(kmalloc_dma_caches);
 #endif
 
---------- patch end ----------

Current testcases are:

---------- testcases start ----------
	volatile unsigned int size;
	void *ptr;
	ptr = kmalloc(0, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 0, ptr);
	kfree(ptr);
	ptr = kmalloc(1, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 1, ptr);
	kfree(ptr);
	ptr = kmalloc(2, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 2, ptr);
	kfree(ptr);
	ptr = kmalloc(4, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 4, ptr);
	kfree(ptr);
	ptr = kmalloc(8, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 8, ptr);
	kfree(ptr);
	ptr = kmalloc(16, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 16, ptr);
	kfree(ptr);
	ptr = kmalloc(32, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 32, ptr);
	kfree(ptr);
	ptr = kmalloc(64, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 64, ptr);
	kfree(ptr);
	ptr = kmalloc(128, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 128, ptr);
	kfree(ptr);
	ptr = kmalloc(256, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 256, ptr);
	kfree(ptr);
	ptr = kmalloc(512, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 512, ptr);
	kfree(ptr);
	ptr = kmalloc(1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(2 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 2 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(4 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 4 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(8 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 8 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(16 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 16 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(32 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 32 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(64 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 64 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(128 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 128 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(256 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 256 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(512 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 512 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(2 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 2 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(4 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 4 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(8 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 8 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(16 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 16 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(32 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 32 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(64 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 64 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(128 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 128 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(256 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 256 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(512 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 512 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(1024 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 1024 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(2 * 1024 * 1024 * 1024, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 2 * 1024 * 1024 * 1024, ptr);
	kfree(ptr);
	ptr = kmalloc(2 * 1024 * 1024 * 1024 + 1, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", 2 * 1024 * 1024 * 1024 + 1, ptr);
	kfree(ptr);
	for (size = 0; size; size <<= 1) {
		ptr = kmalloc(size, GFP_KERNEL);
		printk("kmalloc(%u)=%p\n", size, ptr);
		kfree(ptr);
		if (!size)
			size = 1;
	}
	for (size = 1; size; size <<= 1) {
		ptr = kmalloc(size + 1, GFP_KERNEL);
		printk("kmalloc(%u)=%p\n", size + 1, ptr);
		kfree(ptr);
	}
---------- testcases end ----------

The testcases still trigger BUG() at 32M:

---------- dmesg start ----------
(...snipped...)
kmalloc(2097152)=dde00000
kmalloc(4194304)=d9c00000
------------[ cut here ]------------
WARNING: at mm/page_alloc.c:2410 __alloc_pages_nodemask+0x179/0x650()
(...snipped...)
---[ end trace c08f36179e2d8ff2 ]---
SLAB: Unable to allocate memory on node 0 (gfp=0xd0)
  cache: kmalloc-8388608, object size: 8388608, order: 11
  node 0: slabs: 0/0, objs: 0/0, free: 0
kmalloc(8388608)=  (null)
kmalloc(16777216)=  (null)
------------[ cut here ]------------
Kernel BUG at c10b9c9b [verbose debug info unavailable]
invalid opcode: 0000 [#1] SMP
(...snipped...)
---------- dmesg end ----------

This means that redirecting to !__builtin_constant_p(size) case does not help.

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 21:45               ` Tetsuo Handa
@ 2013-04-30 14:26                 ` Christoph Lameter
  2013-04-30 16:01                   ` Tetsuo Handa
  2013-04-30 14:34                 ` Christoph Lameter
  2013-04-30 15:16                 ` Fix off by one error in slab.h Christoph Lameter
  2 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-04-30 14:26 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Tue, 30 Apr 2013, Tetsuo Handa wrote:

> The testcases still trigger BUG() at 32M:

I thought we established that MAX_ORDER only allows a maximum of 8M sized
allocations? Why are you trying 32M?

The BUG() should trigger for these allocations.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 21:45               ` Tetsuo Handa
  2013-04-30 14:26                 ` Christoph Lameter
@ 2013-04-30 14:34                 ` Christoph Lameter
  2013-05-01  8:03                   ` Pekka Enberg
  2013-04-30 15:16                 ` Fix off by one error in slab.h Christoph Lameter
  2 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-04-30 14:34 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Tue, 30 Apr 2013, Tetsuo Handa wrote:

> Current diff is:

[off by one stuff okay]

> diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
> index 113ec08..be1446a 100644
> --- a/include/linux/slab_def.h
> +++ b/include/linux/slab_def.h
> @@ -126,6 +126,9 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
>  		if (!size)
>  			return ZERO_SIZE_PTR;
>
> +		if (size > KMALLOC_MAX_SIZE)
> +			goto not_found;
> +
>  		i = kmalloc_index(size);

Why is this needed? kmalloc_index should BUG() for too large allocs.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Fix off by one error in slab.h
  2013-04-29 21:45               ` Tetsuo Handa
  2013-04-30 14:26                 ` Christoph Lameter
  2013-04-30 14:34                 ` Christoph Lameter
@ 2013-04-30 15:16                 ` Christoph Lameter
  2 siblings, 0 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-04-30 15:16 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

Subject: Fix off by one error in slab.h

We ran into some strange issues as a result of an off by one isse in slab.h

The root of the issue is the treatment of KMALLOC_SHIFT_HIGH that is confusing.

Make KMALLOC_SHIFT_HIGH the first unsupported size instead of the last supported.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/slab.h
===================================================================
--- linux.orig/include/linux/slab.h	2013-04-30 09:54:23.636533564 -0500
+++ linux/include/linux/slab.h	2013-04-30 10:10:35.676932866 -0500
@@ -176,8 +176,8 @@ struct kmem_cache {
  * to do various tricks to work around compiler limitations in order to
  * ensure proper constant folding.
  */
-#define KMALLOC_SHIFT_HIGH	((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
-				(MAX_ORDER + PAGE_SHIFT - 1) : 25)
+#define KMALLOC_SHIFT_HIGH	((MAX_ORDER + PAGE_SHIFT) <= 26 ? \
+				(MAX_ORDER + PAGE_SHIFT) : 26)
 #define KMALLOC_SHIFT_MAX	KMALLOC_SHIFT_HIGH
 #define KMALLOC_SHIFT_LOW	5
 #else
@@ -185,7 +185,7 @@ struct kmem_cache {
  * SLUB allocates up to order 2 pages directly and otherwise
  * passes the request to the page allocator.
  */
-#define KMALLOC_SHIFT_HIGH	(PAGE_SHIFT + 1)
+#define KMALLOC_SHIFT_HIGH	(PAGE_SHIFT + 2)
 #define KMALLOC_SHIFT_MAX	(MAX_ORDER + PAGE_SHIFT)
 #define KMALLOC_SHIFT_LOW	3
 #endif
@@ -193,7 +193,7 @@ struct kmem_cache {
 /* Maximum allocatable size */
 #define KMALLOC_MAX_SIZE	(1UL << KMALLOC_SHIFT_MAX)
 /* Maximum size for which we actually use a slab cache */
-#define KMALLOC_MAX_CACHE_SIZE	(1UL << KMALLOC_SHIFT_HIGH)
+#define KMALLOC_MAX_CACHE_SIZE	((1UL << (KMALLOC_SHIFT_HIGH -1)))
 /* Maximum order allocatable via the slab allocagtor */
 #define KMALLOC_MAX_ORDER	(KMALLOC_SHIFT_MAX - PAGE_SHIFT)

@@ -206,9 +206,9 @@ struct kmem_cache {
 #define KMALLOC_MIN_SIZE (1 << KMALLOC_SHIFT_LOW)
 #endif

-extern struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
+extern struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH];
 #ifdef CONFIG_ZONE_DMA
-extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
+extern struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH];
 #endif

 /*
Index: linux/mm/slab_common.c
===================================================================
--- linux.orig/mm/slab_common.c	2013-04-30 09:54:23.636533564 -0500
+++ linux/mm/slab_common.c	2013-04-30 09:54:53.693039252 -0500
@@ -319,11 +319,11 @@ struct kmem_cache *__init create_kmalloc
 	return s;
 }

-struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
+struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH];
 EXPORT_SYMBOL(kmalloc_caches);

 #ifdef CONFIG_ZONE_DMA
-struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH + 1];
+struct kmem_cache *kmalloc_dma_caches[KMALLOC_SHIFT_HIGH];
 EXPORT_SYMBOL(kmalloc_dma_caches);
 #endif

@@ -446,7 +446,7 @@ void __init create_kmalloc_caches(unsign
 	if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2])
 		kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);

-	for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++)
+	for (i = KMALLOC_SHIFT_LOW; i < KMALLOC_SHIFT_HIGH; i++)
 		if (!kmalloc_caches[i])
 			kmalloc_caches[i] = create_kmalloc_cache(NULL,
 							1 << i, flags);
@@ -454,7 +454,7 @@ void __init create_kmalloc_caches(unsign
 	/* Kmalloc array is now usable */
 	slab_state = UP;

-	for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+	for (i = 0; i < KMALLOC_SHIFT_HIGH; i++) {
 		struct kmem_cache *s = kmalloc_caches[i];
 		char *n;

@@ -467,7 +467,7 @@ void __init create_kmalloc_caches(unsign
 	}

 #ifdef CONFIG_ZONE_DMA
-	for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+	for (i = 0; i < KMALLOC_SHIFT_HIGH; i++) {
 		struct kmem_cache *s = kmalloc_caches[i];

 		if (s) {

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-30 14:26                 ` Christoph Lameter
@ 2013-04-30 16:01                   ` Tetsuo Handa
  2013-04-30 17:27                     ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-04-30 16:01 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Christoph Lameter wrote:
> On Tue, 30 Apr 2013, Tetsuo Handa wrote:
> 
> > The testcases still trigger BUG() at 32M:
> 
> I thought we established that MAX_ORDER only allows a maximum of 8M sized
> allocations? Why are you trying 32M?

Only for regression testing. At least until Linux 3.9, requesting too large
size didn't trigger oops, did it? I'm not expecting kmalloc() to trigger oops
for Linux 3.10 and future kernels.

> 
> The BUG() should trigger for these allocations.
> 

"kmalloc() returning NULL for these allocations" is needed by "try kmalloc()
first, fallback to vmalloc()" allocation. There are kernel modules which expect
kmalloc() to return NULL rather than oops when the requested size is larger
than KMALLOC_MAX_SIZE bytes. If kmalloc() suddenly starts triggering oops, such
modules will break.



Anyway, there is a regression we want to fix : we won't be able to boot
Linux 3.10-rc1 for x86_32 built with CONFIG_DEBUG_SLAB=y &&
CONFIG_DEBUG_SPINLOCK=y && CONFIG_DEBUG_PAGEALLOC=y .
("Fix off by one error in slab.h" did not fix the regression.)

Regards.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-30 16:01                   ` Tetsuo Handa
@ 2013-04-30 17:27                     ` Christoph Lameter
  2013-05-01  8:05                       ` Pekka Enberg
  2013-05-01 12:14                       ` Tetsuo Handa
  0 siblings, 2 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-04-30 17:27 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel


On Wed, 1 May 2013, Tetsuo Handa wrote:

> Christoph Lameter wrote:
> > On Tue, 30 Apr 2013, Tetsuo Handa wrote:
> >
> > > The testcases still trigger BUG() at 32M:
> >
> > I thought we established that MAX_ORDER only allows a maximum of 8M sized
> > allocations? Why are you trying 32M?
>
> Only for regression testing. At least until Linux 3.9, requesting too large
> size didn't trigger oops, did it? I'm not expecting kmalloc() to trigger oops
> for Linux 3.10 and future kernels.

It did for SLUB. SLAB returned NULL for some cases.

> "kmalloc() returning NULL for these allocations" is needed by "try kmalloc()
> first, fallback to vmalloc()" allocation. There are kernel modules which expect
> kmalloc() to return NULL rather than oops when the requested size is larger
> than KMALLOC_MAX_SIZE bytes. If kmalloc() suddenly starts triggering oops, such
> modules will break.

This behavior has been in there for years. Why try a kmalloc that
always fails since the size is too big?

> Anyway, there is a regression we want to fix : we won't be able to boot
> Linux 3.10-rc1 for x86_32 built with CONFIG_DEBUG_SLAB=y &&
> CONFIG_DEBUG_SPINLOCK=y && CONFIG_DEBUG_PAGEALLOC=y .
> ("Fix off by one error in slab.h" did not fix the regression.)

Hmm... Where does this fail? In slab?



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-30 14:34                 ` Christoph Lameter
@ 2013-05-01  8:03                   ` Pekka Enberg
  2013-05-02 15:13                     ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Pekka Enberg @ 2013-05-01  8:03 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tetsuo Handa, glommer, linux-kernel

On 4/30/13 5:34 PM, Christoph Lameter wrote:
> On Tue, 30 Apr 2013, Tetsuo Handa wrote:
>
>> Current diff is:
>
> [off by one stuff okay]
>
>> diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
>> index 113ec08..be1446a 100644
>> --- a/include/linux/slab_def.h
>> +++ b/include/linux/slab_def.h
>> @@ -126,6 +126,9 @@ static __always_inline void *kmalloc(size_t size, gfp_t flags)
>>   		if (!size)
>>   			return ZERO_SIZE_PTR;
>>
>> +		if (size > KMALLOC_MAX_SIZE)
>> +			goto not_found;
>> +
>>   		i = kmalloc_index(size);
>
> Why is this needed? kmalloc_index should BUG() for too large allocs.

Why is that? Historically it has returned NULL, hasn't it? We have had 
cases where kernel code (naively) uses size directly from userspace and 
we definitely don't want to BUG_ON on it.

			Pekka

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-30 17:27                     ` Christoph Lameter
@ 2013-05-01  8:05                       ` Pekka Enberg
  2013-05-02 15:12                         ` Christoph Lameter
  2013-05-01 12:14                       ` Tetsuo Handa
  1 sibling, 1 reply; 57+ messages in thread
From: Pekka Enberg @ 2013-05-01  8:05 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tetsuo Handa, glommer, linux-kernel

On 4/30/13 8:27 PM, Christoph Lameter wrote:
>> "kmalloc() returning NULL for these allocations" is needed by "try kmalloc()
>> first, fallback to vmalloc()" allocation. There are kernel modules which expect
>> kmalloc() to return NULL rather than oops when the requested size is larger
>> than KMALLOC_MAX_SIZE bytes. If kmalloc() suddenly starts triggering oops, such
>> modules will break.
>
> This behavior has been in there for years. Why try a kmalloc that
> always fails since the size is too big?

...because want the extra protection for cases where size is controlled 
by userspace. This is consistent with kcalloc() that returns NULL on 
integer overflow.

			Pekka

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-30 17:27                     ` Christoph Lameter
  2013-05-01  8:05                       ` Pekka Enberg
@ 2013-05-01 12:14                       ` Tetsuo Handa
  2013-05-02 11:51                         ` Tetsuo Handa
  2013-05-02 20:53                         ` Christoph Lameter
  1 sibling, 2 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-01 12:14 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Christoph Lameter wrote:
> > "kmalloc() returning NULL for these allocations" is needed by "try kmalloc()
> > first, fallback to vmalloc()" allocation. There are kernel modules which expect
> > kmalloc() to return NULL rather than oops when the requested size is larger
> > than KMALLOC_MAX_SIZE bytes. If kmalloc() suddenly starts triggering oops, such
> > modules will break.
> 
> This behavior has been in there for years. Why try a kmalloc that
> always fails since the size is too big?
> 

This is nothing but a testcase. Size argument is sometimes unknown and/or
user-controlled. We expect that not only kmalloc() etc. but also kstrdup(),
kmemdup(), krealloc() etc. do not trigger oops. I think that checking the size
in SLAB/SLUB is the only safe way.

> > Anyway, there is a regression we want to fix : we won't be able to boot
> > Linux 3.10-rc1 for x86_32 built with CONFIG_DEBUG_SLAB=y &&
> > CONFIG_DEBUG_SPINLOCK=y && CONFIG_DEBUG_PAGEALLOC=y .
> > ("Fix off by one error in slab.h" did not fix the regression.)
> 
> Hmm... Where does this fail? In slab?
> 
It hangs (with CPU#0 spinning) immediately after printing

  Decompressing Linux... Parsing ELF... done.
  Booting the kernel.

lines. Today I heard that gdb can be used if I use qemu, but I doubt that I can
manage time to understand and find the exact location within a few days.

The culprit location is possibly in SLAB because the kernel boots if built with
CONFIG_DEBUG_SLAB=n || CONFIG_DEBUG_SPINLOCK=n || CONFIG_DEBUG_PAGEALLOC=n.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-01 12:14                       ` Tetsuo Handa
@ 2013-05-02 11:51                         ` Tetsuo Handa
  2013-05-02 20:53                         ` Christoph Lameter
  1 sibling, 0 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-02 11:51 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Tetsuo Handa wrote:
> > Hmm... Where does this fail? In slab?
> > 
> It hangs (with CPU#0 spinning) immediately after printing
> 
>   Decompressing Linux... Parsing ELF... done.
>   Booting the kernel.
> 
> lines. Today I heard that gdb can be used if I use qemu, but I doubt that I can
> manage time to understand and find the exact location within a few days.
> 
> The culprit location is possibly in SLAB because the kernel boots if built with
> CONFIG_DEBUG_SLAB=n || CONFIG_DEBUG_SPINLOCK=n || CONFIG_DEBUG_PAGEALLOC=n.
> 
It turned out that cachep->slabp_cache == NULL is the cause of boot failure.
cachep->slabp_cache == NULL is caused by kmalloc_caches[5] == NULL. Any clue?

int __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
{
        (...snipped...)
        if (flags & CFLGS_OFF_SLAB) {
                cachep->slabp_cache = kmalloc_slab(slab_size, 0u);
                /*
                 * This is a possibility for one of the malloc_sizes caches.
                 * But since we go off slab only for object size greater than
                 * PAGE_SIZE/8, and malloc_sizes gets created in ascending order,
                 * this should not happen at all.
                 * But leave a BUG_ON for some lucky dude.
                 */
                BUG_ON(ZERO_OR_NULL_PTR(cachep->slabp_cache));
        }
        (...snipped...)
}

(gdb) b __find_general_cachep
Breakpoint 1 at 0xc10b6030: file mm/slab.c, line 679.
(gdb) c
Continuing.

Breakpoint 1, __find_general_cachep (size=32, gfpflags=0) at mm/slab.c:679
679             BUG_ON(kmalloc_caches[INDEX_AC] == NULL);
(gdb) s
671     {
(gdb)
679             BUG_ON(kmalloc_caches[INDEX_AC] == NULL);
(gdb)
681             if (!size)
(gdb)
684             i = kmalloc_index(size);
(gdb)
kmalloc_index (size=32, gfpflags=0) at include/linux/slab.h:208
208             if (size <= KMALLOC_MIN_SIZE)
(gdb)
__find_general_cachep (size=32, gfpflags=0) at mm/slab.c:692
692             if (unlikely(gfpflags & GFP_DMA))
(gdb)
695             return kmalloc_caches[i];
(gdb) print i
$1 = 5
(gdb) print kmalloc_caches[i]
$2 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[0]
$3 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[1]
$4 = (struct kmem_cache *) 0xf6800120
(gdb) print kmalloc_caches[2]
$5 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[3]
$6 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[4]
$7 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[5]
$8 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[6]
$9 = (struct kmem_cache *) 0xf6800080
(gdb) print kmalloc_caches[7]
$10 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[8]
$11 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[9]
$12 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[10]
$13 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[11]
$14 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[12]
$15 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[13]
$16 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[14]
$17 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[15]
$18 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[16]
$19 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[17]
$20 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[18]
$21 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[19]
$22 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[20]
$23 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[21]
$24 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[22]
$25 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[23]
$26 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[24]
$27 = (struct kmem_cache *) 0x0
(gdb) print kmalloc_caches[25]
$28 = (struct kmem_cache *) 0xf68001c0
(gdb) print kmalloc_caches[26]
$29 = (struct kmem_cache *) 0x0
(gdb) s
696     }
(gdb)
__kmem_cache_create (cachep=0xf6800260, flags=2147493888) at mm/slab.c:2493
2493                    BUG_ON(ZERO_OR_NULL_PTR(cachep->slabp_cache));
(gdb)
2485                    cachep->slabp_cache = kmem_find_general_cachep(slab_size, 0u);
(gdb)
2493                    BUG_ON(ZERO_OR_NULL_PTR(cachep->slabp_cache));
(gdb)
^C
Program received signal SIGINT, Interrupt.
0xc1409afc in panic (fmt=0xc1511274 "Attempted to kill the idle task!") at kernel/panic.c:182
182                     mdelay(PANIC_TIMER_STEP);
(gdb) bt
#0  0xc1409afc in panic (fmt=0xc1511274 "Attempted to kill the idle task!") at kernel/panic.c:182
#1  0xc1034a5f in do_exit (code=11) at kernel/exit.c:718
#2  0xc1005887 in oops_end (flags=70, regs=0xc158def0, signr=11) at arch/x86/kernel/dumpstack.c:249
#3  0xc10059af in die (str=0xc1502c45 "invalid opcode", regs=0xc158def0, err=0) at arch/x86/kernel/dumpstack.c:310
#4  0xc1002e25 in do_trap_no_signal (trapnr=6, signr=4, str=0xc1502c45 "invalid opcode", regs=0xc158def0, error_code=0, info=0xc158de60)
    at arch/x86/kernel/traps.c:130
#5  do_trap (trapnr=6, signr=4, str=0xc1502c45 "invalid opcode", regs=0xc158def0, error_code=0, info=0xc158de60) at arch/x86/kernel/traps.c:145
#6  0xc10032a6 in do_invalid_op (regs=0xc158def0, error_code=0) at arch/x86/kernel/traps.c:213
#7  0xc140c742 in ?? () at arch/x86/kernel/entry_32.S:1318
#8  0xc10b91d5 in __kmem_cache_create (cachep=0xf6800260, flags=2147493888) at mm/slab.c:2493
#9  0xc15dfcde in create_boot_cache (s=0xf6800260, name=0xc15148be "kmalloc", size=192, flags=8192) at mm/slab_common.c:299
#10 0xc15dfd4b in create_kmalloc_cache (name=0xc15148be "kmalloc", size=192, flags=8192) at mm/slab_common.c:316
#11 0xc15e0aa2 in kmem_cache_init () at mm/slab.c:1652
#12 0xc15c972a in mm_init () at init/main.c:462
#13 start_kernel () at init/main.c:527
#14 0xc15c929f in i386_start_kernel () at arch/x86/kernel/head32.c:66
#15 0x00000000 in ?? ()

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-01  8:05                       ` Pekka Enberg
@ 2013-05-02 15:12                         ` Christoph Lameter
  0 siblings, 0 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-05-02 15:12 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Tetsuo Handa, glommer, linux-kernel

On Wed, 1 May 2013, Pekka Enberg wrote:

> > This behavior has been in there for years. Why try a kmalloc that
> > always fails since the size is too big?
>
> ...because want the extra protection for cases where size is controlled by
> userspace. This is consistent with kcalloc() that returns NULL on integer
> overflow.

The slab allocator behavior since SLUB was introduced has been to BUG() on
allocations that can never be satisfied.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-01  8:03                   ` Pekka Enberg
@ 2013-05-02 15:13                     ` Christoph Lameter
  0 siblings, 0 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-05-02 15:13 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Tetsuo Handa, glommer, linux-kernel

On Wed, 1 May 2013, Pekka Enberg wrote:

	> Why is that? Historically it has returned NULL, hasn't it? We have had cases
> where kernel code (naively) uses size directly from userspace and we
> definitely don't want to BUG_ON on it.

In that case the size is dynamic and then we return an error. In this case
the size is static and known at compile time. The allocation can never
succeed.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-01 12:14                       ` Tetsuo Handa
  2013-05-02 11:51                         ` Tetsuo Handa
@ 2013-05-02 20:53                         ` Christoph Lameter
  2013-05-03  8:26                           ` Tetsuo Handa
  1 sibling, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-02 20:53 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Wed, 1 May 2013, Tetsuo Handa wrote:

> The culprit location is possibly in SLAB because the kernel boots if built with
> CONFIG_DEBUG_SLAB=n || CONFIG_DEBUG_SPINLOCK=n || CONFIG_DEBUG_PAGEALLOC=n.

I have booted such a configuration just fine. Please have a look at the
kernel config that I send or send me yours.

Here is a patch that restores the old behavior for SLAB



Subject: slab: Return NULL for oversized allocations

The inline path seems to have changed the SLAB behavior for very large
kmalloc allocations. This patch restores the old behavior.

Signed-off-by: Christoph Lameter <cl@linux.com>


Index: linux/include/linux/slab_def.h
===================================================================
--- linux.orig/include/linux/slab_def.h	2013-05-02 15:02:45.864728115 -0500
+++ linux/include/linux/slab_def.h	2013-05-02 15:06:14.940474110 -0500
@@ -126,6 +126,9 @@ static __always_inline void *kmalloc(siz
 		if (!size)
 			return ZERO_SIZE_PTR;

+		if (size >= KMALLOC_MAX_SIZE)
+			return NULL;
+
 		i = kmalloc_index(size);

 #ifdef CONFIG_ZONE_DMA
@@ -172,6 +175,9 @@ static __always_inline void *kmalloc_nod
 		if (!size)
 			return ZERO_SIZE_PTR;

+		if (size > KMALLOC_MAX_SIZE)
+			return NULL;
+
 		i = kmalloc_index(size);

 #ifdef CONFIG_ZONE_DMA

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-02 20:53                         ` Christoph Lameter
@ 2013-05-03  8:26                           ` Tetsuo Handa
  2013-05-03 15:43                             ` Christoph Lameter
  2013-05-03 18:04                             ` Christoph Lameter
  0 siblings, 2 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-03  8:26 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Christoph Lameter wrote:
> I have booted such a configuration just fine. Please have a look at the
> kernel config that I send or send me yours.

OK. Here are complete steps to reproduce using QEMU on CentOS 6.4.

(1) Compile the kernel for x86_32 and run it on QEMU.

  $ cd /tmp/
  $ wget -O - https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/snapshot/next-20130426.tar.bz2 | tar -jxf -
  $ cd next-20130426/
  $ wget -O .config http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc8-next-20130422
  $ make -sj2 CONFIG_FRAME_POINTER=y CONFIG_DEBUG_INFO=y
  $ /usr/libexec/qemu-kvm -no-kvm -cpu pentium3 -m 512 -nographic -kernel arch/x86/boot/bzImage -S -gdb tcp::1234

(2) Attach gdb to the QEMU process from another terminal and continue.

  $ gdb /tmp/next-20130426/vmlinux
  (gdb) target remote :1234
  (gdb) c

    Press Ctrl-C after waiting for ten seconds.
    You will find that the kernel has panic()ed.

If one of CONFIG_DEBUG_SLAB/CONFIG_DEBUG_SPINLOCK/CONFIG_DEBUG_PAGEALLOC is
changed to n from the config above, the kernel boots fine. But the kernel
triggers oops for very large kmalloc allocations.

> 
> Here is a patch that restores the old behavior for SLAB
> 
> 
> 
> Subject: slab: Return NULL for oversized allocations
> 
> The inline path seems to have changed the SLAB behavior for very large
> kmalloc allocations. This patch restores the old behavior.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>
> 

I don't think this patch is sufficient. There are functions (e.g. kstrdup())
which call variant functions (e.g. kmalloc_track_caller()). I think we need to
check size at functions which determine index from size (e.g. kmalloc_slab()).

> 
> Index: linux/include/linux/slab_def.h
> ===================================================================
> --- linux.orig/include/linux/slab_def.h	2013-05-02 15:02:45.864728115 -0500
> +++ linux/include/linux/slab_def.h	2013-05-02 15:06:14.940474110 -0500
> @@ -126,6 +126,9 @@ static __always_inline void *kmalloc(siz
>  		if (!size)
>  			return ZERO_SIZE_PTR;
> 
> +		if (size >= KMALLOC_MAX_SIZE)
> +			return NULL;
> +

Why not (size > KMALLOC_MAX_SIZE) ?

>  		i = kmalloc_index(size);
> 
>  #ifdef CONFIG_ZONE_DMA

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03  8:26                           ` Tetsuo Handa
@ 2013-05-03 15:43                             ` Christoph Lameter
  2013-05-06  6:59                               ` Geert Uytterhoeven
  2013-05-03 18:04                             ` Christoph Lameter
  1 sibling, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-03 15:43 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Fri, 3 May 2013, Tetsuo Handa wrote:

> I don't think this patch is sufficient. There are functions (e.g. kstrdup())
> which call variant functions (e.g. kmalloc_track_caller()). I think we need to
> check size at functions which determine index from size (e.g. kmalloc_slab()).

Right.

> > Index: linux/include/linux/slab_def.h
> > ===================================================================
> > --- linux.orig/include/linux/slab_def.h	2013-05-02 15:02:45.864728115 -0500
> > +++ linux/include/linux/slab_def.h	2013-05-02 15:06:14.940474110 -0500
> > @@ -126,6 +126,9 @@ static __always_inline void *kmalloc(siz
> >  		if (!size)
> >  			return ZERO_SIZE_PTR;
> >
> > +		if (size >= KMALLOC_MAX_SIZE)
> > +			return NULL;
> > +
>
> Why not (size > KMALLOC_MAX_SIZE) ?

Correct.

We also would want some diagnostics as to who is doing these large allocs
so that these issues can be fixed. Updated patch:

Subject: slab: Return NULL for oversized allocations

The inline path seems to have changed the SLAB behavior for very large
kmalloc allocations. This patch restores the old behavior but also
adds diagnostics so that we can figure where in the code these
large allocations occur.

Signed-off-by: Christoph Lameter <cl@linux.com>


Index: linux/include/linux/slab_def.h
===================================================================
--- linux.orig/include/linux/slab_def.h	2013-05-03 10:36:46.019564801 -0500
+++ linux/include/linux/slab_def.h	2013-05-03 10:37:28.860302188 -0500
@@ -126,6 +126,11 @@ static __always_inline void *kmalloc(siz
 		if (!size)
 			return ZERO_SIZE_PTR;

+		if (size > KMALLOC_MAX_SIZE) {
+			WARN_ON(1);
+			return NULL;
+		}
+
 		i = kmalloc_index(size);

 #ifdef CONFIG_ZONE_DMA
@@ -172,6 +177,11 @@ static __always_inline void *kmalloc_nod
 		if (!size)
 			return ZERO_SIZE_PTR;

+		if (size > KMALLOC_MAX_SIZE) {
+			WARN_ON(1);
+			return NULL;
+		}
+
 		i = kmalloc_index(size);

 #ifdef CONFIG_ZONE_DMA
Index: linux/mm/slab_common.c
===================================================================
--- linux.orig/mm/slab_common.c	2013-05-03 10:36:46.019564801 -0500
+++ linux/mm/slab_common.c	2013-05-03 10:38:29.045351837 -0500
@@ -373,6 +373,11 @@ struct kmem_cache *kmalloc_slab(size_t s
 {
 	int index;

+	if (size > KMALLOC_MAX_SIZE) {
+		WARN_ON(1);
+		return NULL;
+	}
+
 	if (size <= 192) {
 		if (!size)
 			return ZERO_SIZE_PTR;



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03  8:26                           ` Tetsuo Handa
  2013-05-03 15:43                             ` Christoph Lameter
@ 2013-05-03 18:04                             ` Christoph Lameter
  2013-05-03 18:48                               ` Tetsuo Handa
                                                 ` (2 more replies)
  1 sibling, 3 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-05-03 18:04 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

Enabling various debugging options increases the size of structures and
the subslab handling in SLABs kmem_cache_create will start to fail.


Here is a fix for that:

Subject: Fix bootstrap creation of kmalloc caches

For SLAB the kmalloc caches must be created in ascending sizes
in order for the OFF_SLAB sub-slab cache to work properly.

Create the non power of two caches immediately after the prior power
of two kmalloc cache. Do not create the non power of two caches
before all other caches.

Signed-off-by: Christoph Lamete <cl@linux.com>

Index: linux/mm/slab_common.c
===================================================================
--- linux.orig/mm/slab_common.c	2013-05-03 11:16:45.764382372 -0500
+++ linux/mm/slab_common.c	2013-05-03 12:59:40.431797020 -0500
@@ -444,18 +444,24 @@ void __init create_kmalloc_caches(unsign
 		for (i = 128 + 8; i <= 192; i += 8)
 			size_index[size_index_elem(i)] = 8;
 	}
-	/* Caches that are not of the two-to-the-power-of size */
-	if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1])
-		kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
-
-	if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2])
-		kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
-
-	for (i = KMALLOC_SHIFT_LOW; i < KMALLOC_SHIFT_HIGH; i++)
-		if (!kmalloc_caches[i])
+	for (i = KMALLOC_SHIFT_LOW; i < KMALLOC_SHIFT_HIGH; i++) {
+		if (!kmalloc_caches[i]) {
 			kmalloc_caches[i] = create_kmalloc_cache(NULL,
 							1 << i, flags);

+			/*
+			 * Caches that are not of the two-to-the-power-of size.
+			 * These have to be created immediately after the
+			 * earlier power of two caches
+			 */
+			if (KMALLOC_MIN_SIZE <= 32 && !kmalloc_caches[1] && i == 6)
+				kmalloc_caches[1] = create_kmalloc_cache(NULL, 96, flags);
+
+			if (KMALLOC_MIN_SIZE <= 64 && !kmalloc_caches[2] && i == 7)
+				kmalloc_caches[2] = create_kmalloc_cache(NULL, 192, flags);
+		}
+	}
+
 	/* Kmalloc array is now usable */
 	slab_state = UP;


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03 18:04                             ` Christoph Lameter
@ 2013-05-03 18:48                               ` Tetsuo Handa
  2013-05-03 19:21                                 ` Christoph Lameter
  2013-05-06  6:32                               ` Pekka Enberg
  2013-05-06 20:24                               ` Pekka Enberg
  2 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-03 18:48 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Applying

  Subject: slab: Return NULL for oversized allocations
  (Date: Fri, 3 May 2013 15:43:18 +0000)

and

  Subject: Fix bootstrap creation of kmalloc caches
  (Date: Fri, 3 May 2013 18:04:18 +0000)

on linux-next-20130426 made my kernel boots fine and passes the allocation
testcases. Thank you.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03 18:48                               ` Tetsuo Handa
@ 2013-05-03 19:21                                 ` Christoph Lameter
  2013-05-04  0:15                                   ` Tetsuo Handa
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-03 19:21 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Sat, 4 May 2013, Tetsuo Handa wrote:

>   Subject: slab: Return NULL for oversized allocations
>   (Date: Fri, 3 May 2013 15:43:18 +0000)
>
> and
>
>   Subject: Fix bootstrap creation of kmalloc caches
>   (Date: Fri, 3 May 2013 18:04:18 +0000)
>
> on linux-next-20130426 made my kernel boots fine and passes the allocation
> testcases. Thank you.

Ok could I see the kernel logs with the warnings?


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03 19:21                                 ` Christoph Lameter
@ 2013-05-04  0:15                                   ` Tetsuo Handa
  2013-05-06 13:46                                     ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-04  0:15 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Christoph Lameter wrote:
> Ok could I see the kernel logs with the warnings?
Sure.

---------- testcase ----------
volatile unsigned int size;
for (size = 4 * 1024 * 1024; size <= 32 * 1024 * 1024; size <<= 1) {
	void *ptr = kmalloc(size, GFP_KERNEL);
	printk("kmalloc(%u)=%p\n", size, ptr);
	kfree(ptr);
}
---------- testcase ----------

---------- before applying this patch ----------
kmalloc(4194304)=d9c00000
kmalloc(8388608)=  (null)
kmalloc(16777216)=  (null)
------------[ cut here ]------------
Kernel BUG at c10b9c5b [verbose debug info unavailable]
invalid opcode: 0000 [#1] SMP
Modules linked in: test(OF+) binfmt_misc
CPU: 0 PID: 3881 Comm: insmod Tainted: GF          O 3.9.0-rc8-next-20130426 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
task: df2a3b30 ti: de0a0000 task.ti: de0a0000
EIP: 0060:[<c10b9c5b>] EFLAGS: 00010202 CPU: 0
EIP is at cache_alloc_refill+0x56b/0x590
EAX: 00000040 EBX: df002680 ECX: 00000000 EDX: 00000010
ESI: df000ac0 EDI: 00000000 EBP: 00000001 ESP: de0a1e08
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: b769f3e0 CR3: 1f0fc000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Stack:
 00000000 00000000 00000000 000000d0 00000000 00000010 00000040 00000246
 000000d0 00000000 00000000 df000ac0 00000202 000000d0 c10b9d0e ffffffff
 00000000 e0838040 00000001 00000000 e083a01c e0838024 01000000 00000000
Call Trace:
 [<c10b9d0e>] ? __kmalloc+0x8e/0xd0
 [<e083a01c>] ? test_init+0x1c/0x5c [test]
 [<c1000102>] ? do_one_initcall+0x32/0x170
 [<c10b3098>] ? __vunmap+0xa8/0xe0
 [<e083a000>] ? 0xe0839fff
 [<c106e066>] ? load_module+0x1196/0x1560
 [<c106e7e0>] ? SyS_init_module+0xb0/0xd0
 [<c143273a>] ? sysenter_do_call+0x12/0x22
Code: 85 c0 0f 84 05 ff ff ff 31 c9 89 ea 89 f0 e8 8d f9 ff ff e9 f5 fe ff ff 0f 0b eb fe 8b 6e 20 f7 c5 01 00 00 00 0f 84 95 fd ff ff <0f> 0b eb fe 0f 0b eb fe 0f b6 44 24 1f 89 da 8b 4c 24 20 89 04
EIP: [<c10b9c5b>] cache_alloc_refill+0x56b/0x590 SS:ESP 0068:de0a1e08
---[ end trace f0be5ebb0e2c1371 ]---
---------- before applying this patch ----------

---------- after applying this patch ----------
kmalloc(4194304)=d7000000
------------[ cut here ]------------
WARNING: at mm/slab_common.c:377 kmalloc_slab+0x57/0x70()
Modules linked in: test(OF+) binfmt_misc
CPU: 1 PID: 4337 Comm: insmod Tainted: GF       W  O 3.9.0-rc8-next-20130426-dirty #2
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
 c142f6a1 c102fbcb c1538e8c c153f938 00000179 c10a5197 c10a5197 d7000000
 e1584040 00000001 000000d0 c102fc1b 00000009 00000000 c10a5197 d7000000
 c10ba3da 00000286 d7000000 e1584040 00000001 00000000 e158601c e1584024
Call Trace:
 [<c142f6a1>] ? dump_stack+0xa/0x19
 [<c102fbcb>] ? warn_slowpath_common+0x6b/0xa0
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c102fc1b>] ? warn_slowpath_null+0x1b/0x20
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c10ba3da>] ? __kmalloc+0x1a/0xd0
 [<e158601c>] ? test_init+0x1c/0x5c [test]
 [<c1000102>] ? do_one_initcall+0x32/0x170
 [<c10b30b8>] ? __vunmap+0xa8/0xe0
 [<e1586000>] ? 0xe1585fff
 [<c106e066>] ? load_module+0x1196/0x1560
 [<c106e7e0>] ? SyS_init_module+0xb0/0xd0
 [<c143273a>] ? sysenter_do_call+0x12/0x22
---[ end trace b6ab930b322ad480 ]---
kmalloc(8388608)=  (null)
------------[ cut here ]------------
WARNING: at mm/slab_common.c:377 kmalloc_slab+0x57/0x70()
Modules linked in: test(OF+) binfmt_misc
CPU: 1 PID: 4337 Comm: insmod Tainted: GF       W  O 3.9.0-rc8-next-20130426-dirty #2
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
 c142f6a1 c102fbcb c1538e8c c153f938 00000179 c10a5197 c10a5197 00000000
 e1584040 00000001 000000d0 c102fc1b 00000009 00000000 c10a5197 00000000
 c10ba3da ffffffff 00000000 e1584040 00000001 00000000 e158601c e1584024
Call Trace:
 [<c142f6a1>] ? dump_stack+0xa/0x19
 [<c102fbcb>] ? warn_slowpath_common+0x6b/0xa0
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c102fc1b>] ? warn_slowpath_null+0x1b/0x20
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c10ba3da>] ? __kmalloc+0x1a/0xd0
 [<e158601c>] ? test_init+0x1c/0x5c [test]
 [<c1000102>] ? do_one_initcall+0x32/0x170
 [<c10b30b8>] ? __vunmap+0xa8/0xe0
 [<e1586000>] ? 0xe1585fff
 [<c106e066>] ? load_module+0x1196/0x1560
 [<c106e7e0>] ? SyS_init_module+0xb0/0xd0
 [<c143273a>] ? sysenter_do_call+0x12/0x22
---[ end trace b6ab930b322ad481 ]---
kmalloc(16777216)=  (null)
------------[ cut here ]------------
WARNING: at mm/slab_common.c:377 kmalloc_slab+0x57/0x70()
Modules linked in: test(OF+) binfmt_misc
CPU: 1 PID: 4337 Comm: insmod Tainted: GF       W  O 3.9.0-rc8-next-20130426-dirty #2
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
 c142f6a1 c102fbcb c1538e8c c153f938 00000179 c10a5197 c10a5197 00000000
 e1584040 00000001 000000d0 c102fc1b 00000009 00000000 c10a5197 00000000
 c10ba3da ffffffff 00000000 e1584040 00000001 00000000 e158601c e1584024
Call Trace:
 [<c142f6a1>] ? dump_stack+0xa/0x19
 [<c102fbcb>] ? warn_slowpath_common+0x6b/0xa0
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c102fc1b>] ? warn_slowpath_null+0x1b/0x20
 [<c10a5197>] ? kmalloc_slab+0x57/0x70
 [<c10ba3da>] ? __kmalloc+0x1a/0xd0
 [<e158601c>] ? test_init+0x1c/0x5c [test]
 [<c1000102>] ? do_one_initcall+0x32/0x170
 [<c10b30b8>] ? __vunmap+0xa8/0xe0
 [<e1586000>] ? 0xe1585fff
 [<c106e066>] ? load_module+0x1196/0x1560
 [<c106e7e0>] ? SyS_init_module+0xb0/0xd0
 [<c143273a>] ? sysenter_do_call+0x12/0x22
---[ end trace b6ab930b322ad482 ]---
kmalloc(33554432)=  (null)
---------- after applying this patch ----------

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03 18:04                             ` Christoph Lameter
  2013-05-03 18:48                               ` Tetsuo Handa
@ 2013-05-06  6:32                               ` Pekka Enberg
  2013-05-06 13:47                                 ` Christoph Lameter
  2013-05-06 20:24                               ` Pekka Enberg
  2 siblings, 1 reply; 57+ messages in thread
From: Pekka Enberg @ 2013-05-06  6:32 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tetsuo Handa, Glauber Costa, LKML

On Fri, May 3, 2013 at 9:04 PM, Christoph Lameter <cl@linux.com> wrote:
> Enabling various debugging options increases the size of structures and
> the subslab handling in SLABs kmem_cache_create will start to fail.
>
>
> Here is a fix for that:
>
> Subject: Fix bootstrap creation of kmalloc caches
>
> For SLAB the kmalloc caches must be created in ascending sizes
> in order for the OFF_SLAB sub-slab cache to work properly.
>
> Create the non power of two caches immediately after the prior power
> of two kmalloc cache. Do not create the non power of two caches
> before all other caches.
>
> Signed-off-by: Christoph Lamete <cl@linux.com>

This doesn't seem to apply against slab/next branch. What tree did you
use to generate the patch?

                        Pekka

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03 15:43                             ` Christoph Lameter
@ 2013-05-06  6:59                               ` Geert Uytterhoeven
  2013-05-06  7:27                                 ` Pekka Enberg
  0 siblings, 1 reply; 57+ messages in thread
From: Geert Uytterhoeven @ 2013-05-06  6:59 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tetsuo Handa, glommer, Pekka Enberg, linux-kernel

On Fri, May 3, 2013 at 5:43 PM, Christoph Lameter <cl@linux.com> wrote:
> Subject: slab: Return NULL for oversized allocations
>
> The inline path seems to have changed the SLAB behavior for very large
> kmalloc allocations. This patch restores the old behavior but also
> adds diagnostics so that we can figure where in the code these
> large allocations occur.
>
> Signed-off-by: Christoph Lameter <cl@linux.com>
>
>
> Index: linux/include/linux/slab_def.h
> ===================================================================
> --- linux.orig/include/linux/slab_def.h 2013-05-03 10:36:46.019564801 -0500
> +++ linux/include/linux/slab_def.h      2013-05-03 10:37:28.860302188 -0500
> @@ -126,6 +126,11 @@ static __always_inline void *kmalloc(siz
>                 if (!size)
>                         return ZERO_SIZE_PTR;
>
> +               if (size > KMALLOC_MAX_SIZE) {
> +                       WARN_ON(1);

As we were worried about this being triggered frm userspace, this needs
some rate limiting, to avoid flooding the kernel logs.

> +                       return NULL;
> +               }
> +
>                 i = kmalloc_index(size);
>
>  #ifdef CONFIG_ZONE_DMA
> @@ -172,6 +177,11 @@ static __always_inline void *kmalloc_nod
>                 if (!size)
>                         return ZERO_SIZE_PTR;
>
> +               if (size > KMALLOC_MAX_SIZE) {
> +                       WARN_ON(1);

Idem ditto.

> +                       return NULL;
> +               }
> +
>                 i = kmalloc_index(size);

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-06  6:59                               ` Geert Uytterhoeven
@ 2013-05-06  7:27                                 ` Pekka Enberg
  0 siblings, 0 replies; 57+ messages in thread
From: Pekka Enberg @ 2013-05-06  7:27 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Christoph Lameter, Tetsuo Handa, Glauber Costa, linux-kernel

On Mon, May 6, 2013 at 9:59 AM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Fri, May 3, 2013 at 5:43 PM, Christoph Lameter <cl@linux.com> wrote:
>> Subject: slab: Return NULL for oversized allocations
>>
>> The inline path seems to have changed the SLAB behavior for very large
>> kmalloc allocations. This patch restores the old behavior but also
>> adds diagnostics so that we can figure where in the code these
>> large allocations occur.
>>
>> Signed-off-by: Christoph Lameter <cl@linux.com>
>>
>>
>> Index: linux/include/linux/slab_def.h
>> ===================================================================
>> --- linux.orig/include/linux/slab_def.h 2013-05-03 10:36:46.019564801 -0500
>> +++ linux/include/linux/slab_def.h      2013-05-03 10:37:28.860302188 -0500
>> @@ -126,6 +126,11 @@ static __always_inline void *kmalloc(siz
>>                 if (!size)
>>                         return ZERO_SIZE_PTR;
>>
>> +               if (size > KMALLOC_MAX_SIZE) {
>> +                       WARN_ON(1);
>
> As we were worried about this being triggered frm userspace, this needs
> some rate limiting, to avoid flooding the kernel logs.

I changed it to WARN_ON_ONCE():

https://git.kernel.org/cgit/linux/kernel/git/penberg/linux.git/commit/?h=slab/next

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-04  0:15                                   ` Tetsuo Handa
@ 2013-05-06 13:46                                     ` Christoph Lameter
  2013-05-07 10:38                                       ` Tetsuo Handa
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-06 13:46 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Sat, 4 May 2013, Tetsuo Handa wrote:

> Christoph Lameter wrote:
> > Ok could I see the kernel logs with the warnings?
> Sure.

These are exclusively from the module load. So the kernel seems to be
clean of large kmalloc's ?



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-06  6:32                               ` Pekka Enberg
@ 2013-05-06 13:47                                 ` Christoph Lameter
  0 siblings, 0 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-05-06 13:47 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Tetsuo Handa, Glauber Costa, LKML

On Mon, 6 May 2013, Pekka Enberg wrote:

> This doesn't seem to apply against slab/next branch. What tree did you
> use to generate the patch?

Slab/next from a couple of weeks ago.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-03 18:04                             ` Christoph Lameter
  2013-05-03 18:48                               ` Tetsuo Handa
  2013-05-06  6:32                               ` Pekka Enberg
@ 2013-05-06 20:24                               ` Pekka Enberg
  2013-05-07 14:23                                 ` Christoph Lameter
  2 siblings, 1 reply; 57+ messages in thread
From: Pekka Enberg @ 2013-05-06 20:24 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tetsuo Handa, Glauber Costa, LKML

On Fri, May 3, 2013 at 9:04 PM, Christoph Lameter <cl@linux.com> wrote:
> -       for (i = KMALLOC_SHIFT_LOW; i < KMALLOC_SHIFT_HIGH; i++)

This didn't match what I had in my tree. I fixed it by hand but please
verify the end result:

https://git.kernel.org/cgit/linux/kernel/git/penberg/linux.git/commit/?h=slab/next&id=8a965b3baa89ffedc73c0fbc750006c631012ced

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-06 13:46                                     ` Christoph Lameter
@ 2013-05-07 10:38                                       ` Tetsuo Handa
  2013-05-07 14:28                                         ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-07 10:38 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Christoph Lameter wrote:
> On Sat, 4 May 2013, Tetsuo Handa wrote:
> 
> > Christoph Lameter wrote:
> > > Ok could I see the kernel logs with the warnings?
> > Sure.
> 
> These are exclusively from the module load. So the kernel seems to be
> clean of large kmalloc's ?
> 
There are modules (e.g. TOMOYO) which do not check for KMALLOC_MAX_SIZE limit
and expect kmalloc() larger than KMALLOC_MAX_SIZE bytes to return NULL.

As far as I know, such modules sequentially double the buffer size. Therefore,
as long as request for KMALLOC_MAX_SIZE * 2 bytes returns NULL, they won't
trigger oops by requesting for KMALLOC_MAX_SIZE * 8 bytes.

The testcase I wrote is for module (I don't know if there is one) which
requests for KMALLOC_MAX_SIZE * 8 bytes without requesting for
KMALLOC_MAX_SIZE * 2 bytes and/or KMALLOC_MAX_SIZE * 4 bytes.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-06 20:24                               ` Pekka Enberg
@ 2013-05-07 14:23                                 ` Christoph Lameter
  2013-05-07 14:44                                   ` Pekka Enberg
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-07 14:23 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Tetsuo Handa, Glauber Costa, LKML

On Mon, 6 May 2013, Pekka Enberg wrote:

> On Fri, May 3, 2013 at 9:04 PM, Christoph Lameter <cl@linux.com> wrote:
> > -       for (i = KMALLOC_SHIFT_LOW; i < KMALLOC_SHIFT_HIGH; i++)
>
> This didn't match what I had in my tree. I fixed it by hand but please
> verify the end result:
>
> https://git.kernel.org/cgit/linux/kernel/git/penberg/linux.git/commit/?h=slab/next&id=8a965b3baa89ffedc73c0fbc750006c631012ced
>
Well this is because you did not take the patch that changed the way
KMALLOC_SHIFT_HIGH is treated.

The patch above looks fine though.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-07 10:38                                       ` Tetsuo Handa
@ 2013-05-07 14:28                                         ` Christoph Lameter
  2013-07-01 20:09                                           ` Andrew Morton
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-07 14:28 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Tue, 7 May 2013, Tetsuo Handa wrote:

> > These are exclusively from the module load. So the kernel seems to be
> > clean of large kmalloc's ?
> >
> There are modules (e.g. TOMOYO) which do not check for KMALLOC_MAX_SIZE limit
> and expect kmalloc() larger than KMALLOC_MAX_SIZE bytes to return NULL.

Dont do that. Please fix these things. The slab allocators are for small
allocations not large ones like these. There are page allocator functions
that allow higher order allocations if contiguous memory is needed and
there is vmalloc that makes it safe. We  recently added a CMA allocator
specifically for these large contiguous physical allocations.

> As far as I know, such modules sequentially double the buffer size. Therefore,
> as long as request for KMALLOC_MAX_SIZE * 2 bytes returns NULL, they won't
> trigger oops by requesting for KMALLOC_MAX_SIZE * 8 bytes.

Please use vmalloc for these large allocs.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-07 14:23                                 ` Christoph Lameter
@ 2013-05-07 14:44                                   ` Pekka Enberg
  2013-05-07 15:40                                     ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Pekka Enberg @ 2013-05-07 14:44 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tetsuo Handa, Glauber Costa, LKML

On Tue, May 7, 2013 at 5:23 PM, Christoph Lameter <cl@linux.com> wrote:
> Well this is because you did not take the patch that changed the way
> KMALLOC_SHIFT_HIGH is treated.

Is that still needed? I only took the ones Tetsuo said were needed to
fix the issue at hand.

                        Pekka

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-07 14:44                                   ` Pekka Enberg
@ 2013-05-07 15:40                                     ` Christoph Lameter
  0 siblings, 0 replies; 57+ messages in thread
From: Christoph Lameter @ 2013-05-07 15:40 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Tetsuo Handa, Glauber Costa, LKML

On Tue, 7 May 2013, Pekka Enberg wrote:

> On Tue, May 7, 2013 at 5:23 PM, Christoph Lameter <cl@linux.com> wrote:
> > Well this is because you did not take the patch that changed the way
> > KMALLOC_SHIFT_HIGH is treated.
>
> Is that still needed? I only took the ones Tetsuo said were needed to
> fix the issue at hand.

If his tests work then I guess not. Queue for later.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-04-29 16:16             ` Tetsuo Handa
@ 2013-05-09 12:25               ` Tetsuo Handa
  2013-05-09 14:14                 ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-09 12:25 UTC (permalink / raw)
  To: cl, glommer; +Cc: penberg, linux-kernel, penguin-kernel

Tetsuo Handa wrote:
> > Christoph Lameter wrote:
> > > What is MAX_ORDER on the architecture?
> > 
> > In my environment (x86_32), the constants are
> > 
> >   MAX_ORDER=11 PAGE_SHIFT=12 KMALLOC_SHIFT_HIGH=22 KMALLOC_MAX_SIZE=4194304
> > 
> 
> I don't know if any, but on an architecture with PAGE_SHIFT + MAX_ORDER > 26,
> 
>   static void init_node_lock_keys(int q)
>   {
>           int i;
>   
>           if (slab_state < UP)
>                   return;
>   
>           for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
>                   struct kmem_cache_node *n;
>                   struct kmem_cache *cache = kmalloc_caches[i];
> 
> looks like out of bounds access due to
> 
>   #define KMALLOC_SHIFT_HIGH      ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
>                                   (MAX_ORDER + PAGE_SHIFT - 1) : 25)
> 
> and
> 
>   struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];
> 
> .
> 

As of commit e0fd9aff on linux.git#master,
CONFIG_PPC_256K_PAGES=y (which makes PAGE_SHIFT 18) &&
CONFIG_FORCE_MAX_ZONEORDER=11 (which makes MAX_ORDER 11) &&
CONFIG_PROVE_LOCKING=y with below assertion

----------
diff --git a/mm/slab.c b/mm/slab.c
index 8ccd296..0401982 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -565,6 +565,7 @@ static void init_node_lock_keys(int q)
        if (slab_state < UP)
                return;

+       BUILD_BUG_ON(PAGE_SHIFT + MAX_ORDER != KMALLOC_SHIFT_HIGH + 1);
        for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
                struct kmem_cache_node *n;
                struct kmem_cache *cache = kmalloc_caches[i];
----------

on powerpc triggers below error.

  CC      mm/slab.o
mm/slab.c: In function 'init_node_lock_keys':
mm/slab.c:568:2: error: call to '__compiletime_assert_568' declared with attribute error: BUILD_BUG_ON failed: PAGE_SHIFT + MAX_ORDER != KMALLOC_SHIFT_HIGH + 1
make[1]: *** [mm/slab.o] Error 1
make: *** [mm/slab.o] Error 2

This is an example of overrun at kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1]
range which the compiler may not be able to detect.

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-09 12:25               ` Tetsuo Handa
@ 2013-05-09 14:14                 ` Christoph Lameter
  2013-05-09 21:54                   ` Tetsuo Handa
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-09 14:14 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Thu, 9 May 2013, Tetsuo Handa wrote:

> +       BUILD_BUG_ON(PAGE_SHIFT + MAX_ORDER != KMALLOC_SHIFT_HIGH + 1);
>         for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {

Yea looping to PAGE_SHIFT + MAX_ORDER is fundamentally wrong.


Subject: SLAB: Fix init_lock_keys()

init_lock_keys goes too far in initializing values in kmalloc_caches because
it assumed that the size of the kmalloc array goes up to MAX_ORDER. However, the size
of the kmalloc array for SLAB may be restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER.

Reported-by: Tetsuo Handa <penguin-kernel@I-Love.SAKURA.ne.jp>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/mm/slab.c
===================================================================
--- linux.orig/mm/slab.c	2013-05-09 09:06:20.000000000 -0500
+++ linux/mm/slab.c	2013-05-09 09:08:08.338606055 -0500
@@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
 	if (slab_state < UP)
 		return;

-	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
+	for (i = 1; i =< KMALLOC_SHIFT_HIGH; i++) {
 		struct kmem_cache_node *n;
 		struct kmem_cache *cache = kmalloc_caches[i];


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-09 14:14                 ` Christoph Lameter
@ 2013-05-09 21:54                   ` Tetsuo Handa
  2013-05-10 12:38                     ` Tetsuo Handa
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-09 21:54 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Christoph Lameter wrote:
> -	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> +	for (i = 1; i =< KMALLOC_SHIFT_HIGH; i++) {

  mm/slab.c: In function 'init_node_lock_keys':
  mm/slab.c:568:17: error: expected expression before '<' token

What I'm worrying is:

  /*
   * The largest kmalloc size supported by the SLAB allocators is
   * 32 megabyte (2^25) or the maximum allocatable page order if that is
   * less than 32 MB.
   *
   * WARNING: Its not easy to increase this value since the allocators have
   * to do various tricks to work around compiler limitations in order to
   * ensure proper constant folding.
   */
  #define KMALLOC_SHIFT_HIGH      ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
                                  (MAX_ORDER + PAGE_SHIFT - 1) : 25)

  extern struct kmem_cache *kmalloc_caches[KMALLOC_SHIFT_HIGH + 1];

Can we manage with allocating only 26 elements when MAX_ORDER + PAGE_SHIFT > 26
(e.g. PAGE_SIZE == 256 * 1024) ?

Can kmalloc_index()/kmalloc_size()/kmalloc_slab() etc. work correctly when
MAX_ORDER + PAGE_SHIFT > 26 (e.g. PAGE_SIZE == 256 * 1024) ?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-09 21:54                   ` Tetsuo Handa
@ 2013-05-10 12:38                     ` Tetsuo Handa
  2013-05-10 15:57                       ` Christoph Lameter
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-05-10 12:38 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Tetsuo Handa wrote:
> Can we manage with allocating only 26 elements when MAX_ORDER + PAGE_SHIFT > 26
> (e.g. PAGE_SIZE == 256 * 1024) ?
> 
> Can kmalloc_index()/kmalloc_size()/kmalloc_slab() etc. work correctly when
> MAX_ORDER + PAGE_SHIFT > 26 (e.g. PAGE_SIZE == 256 * 1024) ?
> 
Today I compared SLAB/SLUB code. If I understood correctly, the line

  if (size <=  64 * 1024 * 1024) return 26;

in kmalloc_index() is redundant (in fact, kmalloc_caches[26] is out of range)
and conflicts with what the comment

  * The largest kmalloc size supported by the SLAB allocators is
  * 32 megabyte (2^25) or the maximum allocatable page order if that is
  * less than 32 MB.

says, and 0 <= kmalloc_index() <= 25 is always true for SLAB and
0 <= kmalloc_index() <= PAGE_SHIFT+1 is always true for SLUB.

Therefore, towards 3.10-rc1,

> > -	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> > +	for (i = 1; i =< KMALLOC_SHIFT_HIGH; i++) {
> 
-+	for (i = 1; i =< KMALLOC_SHIFT_HIGH; i++) {
++	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {

would be the last fix for me. (I don't know why kmalloc_caches[0] is excluded.)

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-10 12:38                     ` Tetsuo Handa
@ 2013-05-10 15:57                       ` Christoph Lameter
       [not found]                         ` <201305272153.HEH51089.OHFQMOOtFLSFVJ@I-love.SAKURA.ne.jp>
  0 siblings, 1 reply; 57+ messages in thread
From: Christoph Lameter @ 2013-05-10 15:57 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: glommer, penberg, linux-kernel

On Fri, 10 May 2013, Tetsuo Handa wrote:

> Tetsuo Handa wrote:
> > Can we manage with allocating only 26 elements when MAX_ORDER + PAGE_SHIFT > 26
> > (e.g. PAGE_SIZE == 256 * 1024) ?
> >
> > Can kmalloc_index()/kmalloc_size()/kmalloc_slab() etc. work correctly when
> > MAX_ORDER + PAGE_SHIFT > 26 (e.g. PAGE_SIZE == 256 * 1024) ?
> >
> Today I compared SLAB/SLUB code. If I understood correctly, the line
>
>   if (size <=  64 * 1024 * 1024) return 26;
>
> in kmalloc_index() is redundant (in fact, kmalloc_caches[26] is out of range)
> and conflicts with what the comment

True we could remove it but it does not hurt. There is a bounding of size
before any call to kmalloc_index.

>   * The largest kmalloc size supported by the SLAB allocators is
>   * 32 megabyte (2^25) or the maximum allocatable page order if that is
>   * less than 32 MB.
>
> says, and 0 <= kmalloc_index() <= 25 is always true for SLAB and
> 0 <= kmalloc_index() <= PAGE_SHIFT+1 is always true for SLUB.
>
> Therefore, towards 3.10-rc1,
>
> > > -	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> > > +	for (i = 1; i =< KMALLOC_SHIFT_HIGH; i++) {
> >
> -+	for (i = 1; i =< KMALLOC_SHIFT_HIGH; i++) {
> ++	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
>
> would be the last fix for me. (I don't know why kmalloc_caches[0] is excluded.)

Yep. kmalloc[0] is not used. The first cache to be used is 1 and 2 which
are the non power of two caches. 3 and higher are the power of two caches.


Subject: SLAB: Fix init_lock_keys

init_lock_keys goes too far in initializing values in kmalloc_caches because
it assumed that the size of the kmalloc array goes up to MAX_ORDER. However, the size
of the kmalloc array for SLAB may be restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER.

Reported-by: Tetsuo Handa <penguin-kernel@I-Love.SAKURA.ne.jp>
Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/mm/slab.c
===================================================================
--- linux.orig/mm/slab.c	2013-05-09 09:06:20.000000000 -0500
+++ linux/mm/slab.c	2013-05-09 09:08:08.338606055 -0500
@@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
 	if (slab_state < UP)
 		return;

-	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
+	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
 		struct kmem_cache_node *n;
 		struct kmem_cache *cache = kmalloc_caches[i];


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
       [not found]                         ` <201305272153.HEH51089.OHFQMOOtFLSFVJ@I-love.SAKURA.ne.jp>
@ 2013-06-17 11:59                           ` Tetsuo Handa
  0 siblings, 0 replies; 57+ messages in thread
From: Tetsuo Handa @ 2013-06-17 11:59 UTC (permalink / raw)
  To: cl; +Cc: glommer, penberg, linux-kernel

Tetsuo Handa wrote:
> Christoph Lameter wrote:
> > Subject: SLAB: Fix init_lock_keys
> > 
> > init_lock_keys goes too far in initializing values in kmalloc_caches because
> > it assumed that the size of the kmalloc array goes up to MAX_ORDER. However, the size
> > of the kmalloc array for SLAB may be restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER.
> > 
> > Reported-by: Tetsuo Handa <penguin-kernel@I-Love.SAKURA.ne.jp>
> > Signed-off-by: Christoph Lameter <cl@linux.com>
> > 
> > Index: linux/mm/slab.c
> > ===================================================================
> > --- linux.orig/mm/slab.c	2013-05-09 09:06:20.000000000 -0500
> > +++ linux/mm/slab.c	2013-05-09 09:08:08.338606055 -0500
> > @@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
> >  	if (slab_state < UP)
> >  		return;
> > 
> > -	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> > +	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
> >  		struct kmem_cache_node *n;
> >  		struct kmem_cache *cache = kmalloc_caches[i];
> > 
> > 
> Looks OK to me. Please send this one to 3.10-rcX.
> 
It's already 3.10-rc6. Please be sure to send.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-05-07 14:28                                         ` Christoph Lameter
@ 2013-07-01 20:09                                           ` Andrew Morton
  2013-07-01 21:45                                             ` Tetsuo Handa
  0 siblings, 1 reply; 57+ messages in thread
From: Andrew Morton @ 2013-07-01 20:09 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Tetsuo Handa, glommer, penberg, linux-kernel

On Tue, 7 May 2013 14:28:49 +0000 Christoph Lameter <cl@linux.com> wrote:

> On Tue, 7 May 2013, Tetsuo Handa wrote:
> 
> > > These are exclusively from the module load. So the kernel seems to be
> > > clean of large kmalloc's ?
> > >
> > There are modules (e.g. TOMOYO) which do not check for KMALLOC_MAX_SIZE limit
> > and expect kmalloc() larger than KMALLOC_MAX_SIZE bytes to return NULL.
> 
> Dont do that. Please fix these things.

Slab should return NULL for a request greater than KMALLOC_MAX_SIZE. 
For heaven's sake don't break that!

What's going on with this bug, btw?  This:

--- a/mm/slab.c~slab-fix-init_lock_keys
+++ a/mm/slab.c
@@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
 	if (slab_state < UP)
 		return;
 
-	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
+	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
 		struct kmem_cache_node *n;
 		struct kmem_cache *cache = kmalloc_caches[i];
 

still seems to be unapplied.

I've read through the thread trying to work out what the end-user
impact of that fix is, but it's all clear as mud.  It's possible that
the end-user effect is `kernel locks up after printing "Booting the
kernel"'.  Or maybe not.

And if the above patch does indeed fix something significant, we might
need a -stable backport.

Can we get some clarity here please?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-07-01 20:09                                           ` Andrew Morton
@ 2013-07-01 21:45                                             ` Tetsuo Handa
  2013-07-01 21:53                                               ` Andrew Morton
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-07-01 21:45 UTC (permalink / raw)
  To: akpm, cl; +Cc: glommer, penberg, linux-kernel

Andrew Morton wrote:
> On Tue, 7 May 2013 14:28:49 +0000 Christoph Lameter <cl@linux.com> wrote:
> 
> > On Tue, 7 May 2013, Tetsuo Handa wrote:
> > 
> > > > These are exclusively from the module load. So the kernel seems to be
> > > > clean of large kmalloc's ?
> > > >
> > > There are modules (e.g. TOMOYO) which do not check for KMALLOC_MAX_SIZE limit
> > > and expect kmalloc() larger than KMALLOC_MAX_SIZE bytes to return NULL.
> > 
> > Dont do that. Please fix these things.
> 
> Slab should return NULL for a request greater than KMALLOC_MAX_SIZE. 
> For heaven's sake don't break that!

The patch that fixes above things (commit 6286ae97) went to 3.10.



> What's going on with this bug, btw?  This:
> 
> --- a/mm/slab.c~slab-fix-init_lock_keys
> +++ a/mm/slab.c
> @@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
>  	if (slab_state < UP)
>  		return;
>  
> -	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> +	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
>  		struct kmem_cache_node *n;
>  		struct kmem_cache *cache = kmalloc_caches[i];
>  
> 
> still seems to be unapplied.
> 
The patch that fixes oops and panic on early boot on architectures with
PAGE_SHIFT + MAX_ORDER > 26 missed 3.10.

> I've read through the thread trying to work out what the end-user
> impact of that fix is, but it's all clear as mud.  It's possible that
> the end-user effect is `kernel locks up after printing "Booting the
> kernel"'.  Or maybe not.
> 
> And if the above patch does indeed fix something significant, we might
> need a -stable backport.
> 

Somebody needs this patch when debugging with CONFIG_LOCKDEP=y on
architectures with PAGE_SHIFT + MAX_ORDER > 26 .

> Can we get some clarity here please?
> 

Thank you for adding to -mm. But please delete

  Tetsuo said:
  
  : It hangs (with CPU#0 spinning) immediately after printing
  : 
  :   Decompressing Linux... Parsing ELF... done.
  :   Booting the kernel.
  : 
  : lines.

lines from "+ slab-fix-init_lock_keys.patch added to -mm tree", for
these lines are fixed by commit 8a965b3b. Though the same symptom would
appear if hitting this PAGE_SHIFT + MAX_ORDER > 26 bug, I can't confirm
the symptom for environments which I don't have.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-07-01 21:45                                             ` Tetsuo Handa
@ 2013-07-01 21:53                                               ` Andrew Morton
  2013-07-02 12:49                                                 ` Tetsuo Handa
  0 siblings, 1 reply; 57+ messages in thread
From: Andrew Morton @ 2013-07-01 21:53 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: cl, glommer, penberg, linux-kernel

On Tue, 2 Jul 2013 06:45:27 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:

> 
> > I've read through the thread trying to work out what the end-user
> > impact of that fix is, but it's all clear as mud.  It's possible that
> > the end-user effect is `kernel locks up after printing "Booting the
> > kernel"'.  Or maybe not.
> > 
> > And if the above patch does indeed fix something significant, we might
> > need a -stable backport.
> > 
> 
> Somebody needs this patch when debugging with CONFIG_LOCKDEP=y on
> architectures with PAGE_SHIFT + MAX_ORDER > 26 .

Well *why* do they need it?  What happens without the patch?  How would
a person determine whether their kernel needs this patch?

When this patch crosses Greg's desk for -stable inclusion he's going to
wonder "why do users of -stable kernels need this", and you guys
haven't told him!

Grumble.  Why is it so hard to get a simple and decent changelog for
this patch?


Look, I'll make this easier:

: Subject: slab: fix init_lock_keys
: 
: In 3.10 kernels with CONFIG_LOCKDEP=y on architectures with
: PAGE_SHIFT + MAX_ORDER > 26 such as [architecture goes here], the kernel does
: [x] when the user does [y].
:
: init_lock_keys() goes too far in initializing values in kmalloc_caches
: because it assumed that the size of the kmalloc array goes up to
: MAX_ORDER.  However, the size of the kmalloc array for SLAB may be
: restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER.
:
: Fix this by [z].


Please fill in the text within [].

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-07-01 21:53                                               ` Andrew Morton
@ 2013-07-02 12:49                                                 ` Tetsuo Handa
  2013-07-02 19:12                                                   ` Andrew Morton
  0 siblings, 1 reply; 57+ messages in thread
From: Tetsuo Handa @ 2013-07-02 12:49 UTC (permalink / raw)
  To: akpm; +Cc: cl, glommer, penberg, linux-kernel

Andrew Morton wrote:
> Look, I'll make this easier:
> 
> : Subject: slab: fix init_lock_keys
> : 
> : In 3.10 kernels with CONFIG_LOCKDEP=y on architectures with
> : PAGE_SHIFT + MAX_ORDER > 26 such as [architecture goes here], the kernel does
> : [x] when the user does [y].
> :
> : init_lock_keys() goes too far in initializing values in kmalloc_caches
> : because it assumed that the size of the kmalloc array goes up to
> : MAX_ORDER.  However, the size of the kmalloc array for SLAB may be
> : restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER.
> :
> : Fix this by [z].
> 
> 
> Please fill in the text within [].
> 
OK. I made from http://marc.info/?l=linux-kernel&m=136810234704350&w=2 .
-----
Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.

In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
init_lock_keys() dereference beyond kmalloc_caches[26].
This leads to an unbootable system (kernel panic at initializing SLAB)
if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.

Fix this by making sure that init_lock_keys() does not dereference beyond
kmalloc_caches[26] arrays.
-----

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-07-02 12:49                                                 ` Tetsuo Handa
@ 2013-07-02 19:12                                                   ` Andrew Morton
  2013-07-07 15:58                                                     ` Pekka Enberg
  0 siblings, 1 reply; 57+ messages in thread
From: Andrew Morton @ 2013-07-02 19:12 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: cl, glommer, penberg, linux-kernel

On Tue, 2 Jul 2013 21:49:26 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:

> Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
> CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.
> 
> In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
> init_lock_keys() dereference beyond kmalloc_caches[26].
> This leads to an unbootable system (kernel panic at initializing SLAB)
> if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.
> 
> Fix this by making sure that init_lock_keys() does not dereference beyond
> kmalloc_caches[26] arrays.

Nice, thanks.  Pekka, please grab.


From: Christoph Lameter <cl@linux.com>
Subject: slab: fix init_lock_keys

Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.

In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
init_lock_keys() dereference beyond kmalloc_caches[26].
This leads to an unbootable system (kernel panic at initializing SLAB)
if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.

Fix this by making sure that init_lock_keys() does not dereference beyond
kmalloc_caches[26] arrays.

Signed-off-by: Christoph Lameter <cl@linux.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-Love.SAKURA.ne.jp>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: <stable@vger.kernel.org>	[3.10.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN mm/slab.c~slab-fix-init_lock_keys mm/slab.c
--- a/mm/slab.c~slab-fix-init_lock_keys
+++ a/mm/slab.c
@@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
 	if (slab_state < UP)
 		return;
 
-	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
+	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
 		struct kmem_cache_node *n;
 		struct kmem_cache *cache = kmalloc_caches[i];
 
_


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [linux-next-20130422] Bug in SLAB?
  2013-07-02 19:12                                                   ` Andrew Morton
@ 2013-07-07 15:58                                                     ` Pekka Enberg
  0 siblings, 0 replies; 57+ messages in thread
From: Pekka Enberg @ 2013-07-07 15:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Tetsuo Handa, cl, glommer, linux-kernel

On 7/2/13 10:12 PM, Andrew Morton wrote:
> On Tue, 2 Jul 2013 21:49:26 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
>
>> Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
>> CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.
>>
>> In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
>> init_lock_keys() dereference beyond kmalloc_caches[26].
>> This leads to an unbootable system (kernel panic at initializing SLAB)
>> if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.
>>
>> Fix this by making sure that init_lock_keys() does not dereference beyond
>> kmalloc_caches[26] arrays.
>
> Nice, thanks.  Pekka, please grab.
>
>
> From: Christoph Lameter <cl@linux.com>
> Subject: slab: fix init_lock_keys
>
> Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
> CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.
>
> In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
> init_lock_keys() dereference beyond kmalloc_caches[26].
> This leads to an unbootable system (kernel panic at initializing SLAB)
> if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.
>
> Fix this by making sure that init_lock_keys() does not dereference beyond
> kmalloc_caches[26] arrays.
>
> Signed-off-by: Christoph Lameter <cl@linux.com>
> Reported-by: Tetsuo Handa <penguin-kernel@I-Love.SAKURA.ne.jp>
> Cc: Pekka Enberg <penberg@kernel.org>
> Cc: <stable@vger.kernel.org>	[3.10.x]
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>   mm/slab.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -puN mm/slab.c~slab-fix-init_lock_keys mm/slab.c
> --- a/mm/slab.c~slab-fix-init_lock_keys
> +++ a/mm/slab.c
> @@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
>   	if (slab_state < UP)
>   		return;
>
> -	for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> +	for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
>   		struct kmem_cache_node *n;
>   		struct kmem_cache *cache = kmalloc_caches[i];
>
> _
>

Grabbed, thanks a lot Andrew!

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2013-07-07 15:58 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-24 12:08 [linux-next-20130422] Bug in bootup code or debug code? Tetsuo Handa
2013-04-25 12:20 ` [linux-next-20130422] Bug in SLAB? Tetsuo Handa
2013-04-29  2:40   ` Tetsuo Handa
2013-04-29 10:12     ` Pekka Enberg
2013-04-29 14:44       ` Glauber Costa
2013-04-29 14:59         ` Christoph Lameter
2013-04-29 15:16           ` Glauber Costa
2013-04-29 15:46             ` Christoph Lameter
2013-04-29 15:54               ` Glauber Costa
2013-04-29 15:28           ` Tetsuo Handa
2013-04-29 16:16             ` Tetsuo Handa
2013-05-09 12:25               ` Tetsuo Handa
2013-05-09 14:14                 ` Christoph Lameter
2013-05-09 21:54                   ` Tetsuo Handa
2013-05-10 12:38                     ` Tetsuo Handa
2013-05-10 15:57                       ` Christoph Lameter
     [not found]                         ` <201305272153.HEH51089.OHFQMOOtFLSFVJ@I-love.SAKURA.ne.jp>
2013-06-17 11:59                           ` Tetsuo Handa
2013-04-29 17:48             ` Christoph Lameter
2013-04-29 21:45               ` Tetsuo Handa
2013-04-30 14:26                 ` Christoph Lameter
2013-04-30 16:01                   ` Tetsuo Handa
2013-04-30 17:27                     ` Christoph Lameter
2013-05-01  8:05                       ` Pekka Enberg
2013-05-02 15:12                         ` Christoph Lameter
2013-05-01 12:14                       ` Tetsuo Handa
2013-05-02 11:51                         ` Tetsuo Handa
2013-05-02 20:53                         ` Christoph Lameter
2013-05-03  8:26                           ` Tetsuo Handa
2013-05-03 15:43                             ` Christoph Lameter
2013-05-06  6:59                               ` Geert Uytterhoeven
2013-05-06  7:27                                 ` Pekka Enberg
2013-05-03 18:04                             ` Christoph Lameter
2013-05-03 18:48                               ` Tetsuo Handa
2013-05-03 19:21                                 ` Christoph Lameter
2013-05-04  0:15                                   ` Tetsuo Handa
2013-05-06 13:46                                     ` Christoph Lameter
2013-05-07 10:38                                       ` Tetsuo Handa
2013-05-07 14:28                                         ` Christoph Lameter
2013-07-01 20:09                                           ` Andrew Morton
2013-07-01 21:45                                             ` Tetsuo Handa
2013-07-01 21:53                                               ` Andrew Morton
2013-07-02 12:49                                                 ` Tetsuo Handa
2013-07-02 19:12                                                   ` Andrew Morton
2013-07-07 15:58                                                     ` Pekka Enberg
2013-05-06  6:32                               ` Pekka Enberg
2013-05-06 13:47                                 ` Christoph Lameter
2013-05-06 20:24                               ` Pekka Enberg
2013-05-07 14:23                                 ` Christoph Lameter
2013-05-07 14:44                                   ` Pekka Enberg
2013-05-07 15:40                                     ` Christoph Lameter
2013-04-30 14:34                 ` Christoph Lameter
2013-05-01  8:03                   ` Pekka Enberg
2013-05-02 15:13                     ` Christoph Lameter
2013-04-30 15:16                 ` Fix off by one error in slab.h Christoph Lameter
2013-04-29  2:56   ` [linux-next-20130422] Bug in SLAB? Zhan Jianyu
2013-04-29  2:56     ` Zhan Jianyu
2013-04-29  3:56     ` Tetsuo Handa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.