linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
@ 2020-03-04  0:23 Jann Horn
  2020-03-04  1:26 ` David Rientjes
  2020-03-04 13:17 ` Vlastimil Babka
  0 siblings, 2 replies; 8+ messages in thread
From: Jann Horn @ 2020-03-04  0:23 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton
  Cc: Linux-MM, kernel list, Kees Cook, Matthew Garrett

Hi!

FYI, I noticed that if you do something like the following as root,
the system blows up pretty quickly with error messages about stuff
like corrupt freelist pointers because SLUB actually allows root to
force a page order that is smaller than what is required to store a
single object:

    echo 0 > /sys/kernel/slab/task_struct/order

The other SLUB debugging options, like red_zone, also look kind of
suspicious with regards to races (either racing with other writes to
the SLUB debugging options, or with object allocations).


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
  2020-03-04  0:23 SLUB: sysfs lets root force slab order below required minimum, causing memory corruption Jann Horn
@ 2020-03-04  1:26 ` David Rientjes
  2020-03-04  2:22   ` Kees Cook
  2020-03-04 14:57   ` Pekka Enberg
  2020-03-04 13:17 ` Vlastimil Babka
  1 sibling, 2 replies; 8+ messages in thread
From: David Rientjes @ 2020-03-04  1:26 UTC (permalink / raw)
  To: Jann Horn
  Cc: Christoph Lameter, Pekka Enberg, Joonsoo Kim, Andrew Morton,
	Linux-MM, kernel list, Kees Cook, Matthew Garrett

On Wed, 4 Mar 2020, Jann Horn wrote:

> Hi!
> 
> FYI, I noticed that if you do something like the following as root,
> the system blows up pretty quickly with error messages about stuff
> like corrupt freelist pointers because SLUB actually allows root to
> force a page order that is smaller than what is required to store a
> single object:
> 
>     echo 0 > /sys/kernel/slab/task_struct/order
> 
> The other SLUB debugging options, like red_zone, also look kind of
> suspicious with regards to races (either racing with other writes to
> the SLUB debugging options, or with object allocations).
> 

Thanks for the report, Jann.  To address the most immediate issue, 
allowing a smaller order than allowed, I think we'd need something like 
this.

I can propose it as a formal patch if nobody has any alternate 
suggestions?
---
 mm/slub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3598,7 +3598,7 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
 	 */
 	size = ALIGN(size, s->align);
 	s->size = size;
-	if (forced_order >= 0)
+	if (forced_order >= slab_order(size, 1, MAX_ORDER, 1))
 		order = forced_order;
 	else
 		order = calculate_order(size);


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
  2020-03-04  1:26 ` David Rientjes
@ 2020-03-04  2:22   ` Kees Cook
  2020-03-04 17:26     ` Vlastimil Babka
  2020-03-04 14:57   ` Pekka Enberg
  1 sibling, 1 reply; 8+ messages in thread
From: Kees Cook @ 2020-03-04  2:22 UTC (permalink / raw)
  To: David Rientjes
  Cc: Jann Horn, Christoph Lameter, Pekka Enberg, Joonsoo Kim,
	Andrew Morton, Linux-MM, kernel list, Matthew Garrett

On Tue, Mar 03, 2020 at 05:26:14PM -0800, David Rientjes wrote:
> On Wed, 4 Mar 2020, Jann Horn wrote:
> 
> > Hi!
> > 
> > FYI, I noticed that if you do something like the following as root,
> > the system blows up pretty quickly with error messages about stuff
> > like corrupt freelist pointers because SLUB actually allows root to
> > force a page order that is smaller than what is required to store a
> > single object:
> > 
> >     echo 0 > /sys/kernel/slab/task_struct/order
> > 
> > The other SLUB debugging options, like red_zone, also look kind of
> > suspicious with regards to races (either racing with other writes to
> > the SLUB debugging options, or with object allocations).
> > 
> 
> Thanks for the report, Jann.  To address the most immediate issue, 
> allowing a smaller order than allowed, I think we'd need something like 
> this.
> 
> I can propose it as a formal patch if nobody has any alternate 
> suggestions?
> ---
>  mm/slub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3598,7 +3598,7 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
>  	 */
>  	size = ALIGN(size, s->align);
>  	s->size = size;
> -	if (forced_order >= 0)
> +	if (forced_order >= slab_order(size, 1, MAX_ORDER, 1))
>  		order = forced_order;
>  	else
>  		order = calculate_order(size);

Seems reasonable!

For the race concerns, should this logic just make sure the resulting
order can never shrink? Or does it need much stronger atomicity?

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
  2020-03-04  0:23 SLUB: sysfs lets root force slab order below required minimum, causing memory corruption Jann Horn
  2020-03-04  1:26 ` David Rientjes
@ 2020-03-04 13:17 ` Vlastimil Babka
  1 sibling, 0 replies; 8+ messages in thread
From: Vlastimil Babka @ 2020-03-04 13:17 UTC (permalink / raw)
  To: Jann Horn, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton
  Cc: Linux-MM, kernel list, Kees Cook, Matthew Garrett

On 3/4/20 1:23 AM, Jann Horn wrote:
> Hi!
> 
> FYI, I noticed that if you do something like the following as root,
> the system blows up pretty quickly with error messages about stuff
> like corrupt freelist pointers because SLUB actually allows root to
> force a page order that is smaller than what is required to store a
> single object:
> 
>     echo 0 > /sys/kernel/slab/task_struct/order
> 
> The other SLUB debugging options, like red_zone, also look kind of
> suspicious with regards to races (either racing with other writes to
> the SLUB debugging options, or with object allocations).

Yeah I also wondered last week that there seems to be no sychronization with
alloc/free activity. Increasing order is AFAICS also dangerous with freelist
randomization:

https://lore.kernel.org/linux-mm/d3acc069-a5c6-f40a-f95c-b546664bc4ee@suse.cz/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
  2020-03-04  1:26 ` David Rientjes
  2020-03-04  2:22   ` Kees Cook
@ 2020-03-04 14:57   ` Pekka Enberg
  1 sibling, 0 replies; 8+ messages in thread
From: Pekka Enberg @ 2020-03-04 14:57 UTC (permalink / raw)
  To: David Rientjes, Jann Horn
  Cc: Christoph Lameter, Pekka Enberg, Joonsoo Kim, Andrew Morton,
	Linux-MM, kernel list, Kees Cook, Matthew Garrett



On 3/4/20 3:26 AM, David Rientjes wrote:
> On Wed, 4 Mar 2020, Jann Horn wrote:
> 
>> Hi!
>>
>> FYI, I noticed that if you do something like the following as root,
>> the system blows up pretty quickly with error messages about stuff
>> like corrupt freelist pointers because SLUB actually allows root to
>> force a page order that is smaller than what is required to store a
>> single object:
>>
>>      echo 0 > /sys/kernel/slab/task_struct/order
>>
>> The other SLUB debugging options, like red_zone, also look kind of
>> suspicious with regards to races (either racing with other writes to
>> the SLUB debugging options, or with object allocations).
>>
> 
> Thanks for the report, Jann.  To address the most immediate issue,
> allowing a smaller order than allowed, I think we'd need something like
> this.
> 
> I can propose it as a formal patch if nobody has any alternate
> suggestions?
> ---
>   mm/slub.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3598,7 +3598,7 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
>   	 */
>   	size = ALIGN(size, s->align);
>   	s->size = size;
> -	if (forced_order >= 0)
> +	if (forced_order >= slab_order(size, 1, MAX_ORDER, 1))
>   		order = forced_order;
>   	else
>   		order = calculate_order(size);
> 

Reviewed-by: Pekka Enberg <penberg@iki.fi>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
  2020-03-04  2:22   ` Kees Cook
@ 2020-03-04 17:26     ` Vlastimil Babka
  2020-03-04 20:39       ` David Rientjes
  0 siblings, 1 reply; 8+ messages in thread
From: Vlastimil Babka @ 2020-03-04 17:26 UTC (permalink / raw)
  To: Kees Cook, David Rientjes
  Cc: Jann Horn, Christoph Lameter, Pekka Enberg, Joonsoo Kim,
	Andrew Morton, Linux-MM, kernel list, Matthew Garrett,
	Vijayanand Jitta

On 3/4/20 3:22 AM, Kees Cook wrote:
> On Tue, Mar 03, 2020 at 05:26:14PM -0800, David Rientjes wrote:
> 
> Seems reasonable!
> 
> For the race concerns, should this logic just make sure the resulting
> order can never shrink? Or does it need much stronger atomicity?

If order grows, I think we also need to recalculate the random sequence for
freelist randomization [1]. I expect that would be rather problematic with
parallel allocations/freeing going on.

As was also noted, the any_slab_objects(s) checks are racy - might return false
and immediately some other CPU can allocate some.

I wonder if this race window could be fixed at all without introducing extra
locking in the fast path? Which means it's probably not worth the trouble of
having these runtime knobs. How about making the files read-only (if not remove
completely). Vijayanand described a use case in [2], shouldn't it be possible to
implement that scenario (all caches have debugging enabled except zram cache)
with kernel parameters only?

Thanks,
Vlastimil

[1] https://lore.kernel.org/linux-mm/d3acc069-a5c6-f40a-f95c-b546664bc4ee@suse.cz/
[2]
https://lore.kernel.org/linux-mm/1383cd32-1ddc-4dac-b5f8-9c42282fa81c@codeaurora.org/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
  2020-03-04 17:26     ` Vlastimil Babka
@ 2020-03-04 20:39       ` David Rientjes
  2020-03-08 19:34         ` Christopher Lameter
  0 siblings, 1 reply; 8+ messages in thread
From: David Rientjes @ 2020-03-04 20:39 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Kees Cook, Jann Horn, Christoph Lameter, Pekka Enberg,
	Joonsoo Kim, Andrew Morton, Linux-MM, kernel list,
	Matthew Garrett, Vijayanand Jitta

On Wed, 4 Mar 2020, Vlastimil Babka wrote:

> > Seems reasonable!
> > 
> > For the race concerns, should this logic just make sure the resulting
> > order can never shrink? Or does it need much stronger atomicity?
> 
> If order grows, I think we also need to recalculate the random sequence for
> freelist randomization [1]. I expect that would be rather problematic with
> parallel allocations/freeing going on.
> 
> As was also noted, the any_slab_objects(s) checks are racy - might return false
> and immediately some other CPU can allocate some.
> 
> I wonder if this race window could be fixed at all without introducing extra
> locking in the fast path? Which means it's probably not worth the trouble of
> having these runtime knobs. How about making the files read-only (if not remove
> completely). Vijayanand described a use case in [2], shouldn't it be possible to
> implement that scenario (all caches have debugging enabled except zram cache)
> with kernel parameters only?
> 

I'm not sure how dependent the CONFIG_SLUB_DEBUG users are on being able 
to modify these are runtime (they've been around for 12+ years) but I 
agree that it seems particularly dangerous.

I think they can be fixed by freezing allocations and frees for the 
particular kmem_cache on all cpus which would add the additional 
conditional in the fastpath and that's going to be required in the very 
small minority of cases where an admin actually wants to change these.

The slub_debug kernel command line options are already pretty 
comprehensive as described by Documentation/vm/slub.rst.  I *think* these 
tunables were primarily introduced for kernel debugging and not general 
purpose, perhaps with the exception of "order".

So I think we may be able to fix "order" with a combination of my patch as 
well as a fix to the freelist randomization and that the others should 
likely be made read only.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: SLUB: sysfs lets root force slab order below required minimum, causing memory corruption
  2020-03-04 20:39       ` David Rientjes
@ 2020-03-08 19:34         ` Christopher Lameter
  0 siblings, 0 replies; 8+ messages in thread
From: Christopher Lameter @ 2020-03-08 19:34 UTC (permalink / raw)
  To: David Rientjes
  Cc: Vlastimil Babka, Kees Cook, Jann Horn, Pekka Enberg, Joonsoo Kim,
	Andrew Morton, Linux-MM, kernel list, Matthew Garrett,
	Vijayanand Jitta

On Wed, 4 Mar 2020, David Rientjes wrote:

> I'm not sure how dependent the CONFIG_SLUB_DEBUG users are on being able
> to modify these are runtime (they've been around for 12+ years) but I
> agree that it seems particularly dangerous.

The order of each individual slab page is stored in struct page. That is
why every slub slab page can have a different order. This enabled fallback
to order 0 allocations and also allows a dynamic configuration of the
order at runtime.

> The slub_debug kernel command line options are already pretty
> comprehensive as described by Documentation/vm/slub.rst.  I *think* these
> tunables were primarily introduced for kernel debugging and not general
> purpose, perhaps with the exception of "order".

What do you mean by "general purpose? Certainly the allocator should not
blow up when forcing zero order allocations.

> So I think we may be able to fix "order" with a combination of my patch as
> well as a fix to the freelist randomization and that the others should
> likely be made read only.

Hmmm. races increases as more metadata is added that is depending on the
size of the slab page and the number of objects in it.



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-03-08 19:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-04  0:23 SLUB: sysfs lets root force slab order below required minimum, causing memory corruption Jann Horn
2020-03-04  1:26 ` David Rientjes
2020-03-04  2:22   ` Kees Cook
2020-03-04 17:26     ` Vlastimil Babka
2020-03-04 20:39       ` David Rientjes
2020-03-08 19:34         ` Christopher Lameter
2020-03-04 14:57   ` Pekka Enberg
2020-03-04 13:17 ` Vlastimil Babka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).