linux-kernel.vger.kernel.org archive mirror
* Unbounded growth of slab caches and how to shrink them
@ 2016-06-29 10:34 Nikolay Borisov
  2016-06-29 14:00 ` Christoph Lameter
  0 siblings, 1 reply; 4+ messages in thread
From: Nikolay Borisov @ 2016-06-29 10:34 UTC
  To: Christoph Lameter; +Cc: linux-kernel@vger.kernel.org

Hello Christoph, 

I've observed a rather strange unbounded growth of the kmalloc-192 
slab cache: 

OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
711124869 411527215   3%    0.19K 16934908       42 135479264K kmalloc-192
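
A quick sanity check of the row above, as a minimal sketch. The helper
names parse_slabtop_row and footprint are hypothetical, and the column
layout is assumed to match the header shown. It derives the slab size,
the object capacity, and the footprint in GiB from the SLABS, OBJ/SLAB
and CACHE SIZE columns only:

```python
def parse_slabtop_row(row):
    """Split one slabtop row into named fields (assumed 8-column layout)."""
    objs, active, use, obj_size, slabs, per_slab, cache_size, name = row.split()
    return {
        "name": name,
        "slabs": int(slabs),
        "objs_per_slab": int(per_slab),
        "cache_kib": int(cache_size.rstrip("K")),
    }


def footprint(row):
    """Derive per-slab size, object capacity and total size from a parsed row."""
    return {
        "kib_per_slab": row["cache_kib"] / row["slabs"],
        "capacity": row["slabs"] * row["objs_per_slab"],
        "size_gib": row["cache_kib"] / 2**20,
    }


if __name__ == "__main__":
    line = ("711124869 411527215   3%    0.19K 16934908       42 "
            "135479264K kmalloc-192")
    print(footprint(parse_slabtop_row(line)))
```

With these figures each slab works out to 8 KiB (two 4 KiB pages) and the
cache to roughly 129 GiB, which agrees with the "around 130 GB" estimate.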

Essentially the kmalloc-192 cache is around 130 GB, yet only 3 percent of
it is being used. In this case I'd like to shrink the overall size of the
cache. How can I achieve that? I tried echoing '1' to
/sys/kernel/slab/kmalloc-192/shrink but nothing changed.
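
For reference, the shrink attempt can be scripted so that the slab count
is captured before and after. This is only a sketch: shrink_cache and
read_total are hypothetical helpers, the "slabs" attribute is assumed to
start with the total count as it does under SLUB, and the write needs
root on a real system.

```python
from pathlib import Path


def read_total(attr_path):
    """Return the leading integer of a SLUB sysfs attribute such as 'slabs'."""
    return int(Path(attr_path).read_text().split()[0])


def shrink_cache(cache, root="/sys/kernel/slab"):
    """Write '1' to <root>/<cache>/shrink; return (slabs_before, slabs_after)."""
    cache_dir = Path(root) / cache
    before = read_total(cache_dir / "slabs")
    # Same effect as: echo 1 > /sys/kernel/slab/<cache>/shrink
    (cache_dir / "shrink").write_text("1\n")
    after = read_total(cache_dir / "slabs")
    return before, after


if __name__ == "__main__":
    print(shrink_cache("kmalloc-192"))
```

An unchanged count is consistent with shrinking only being able to release
slabs that are completely empty: a slab that still holds even one live
object stays allocated.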

This is on 3.12, which is a rather old kernel, but I still believe it is
entirely possible for someone to find a way to flood a machine with
network requests that cause a lot of objects to be allocated, growing a
particular slab cache. Later, when the request flood stops, the cache
would be almost empty, yet the memory would not be usable for anything
other than satisfying allocations from this cache.

Regards, 
Nikolay 


* Re: Unbounded growth of slab caches and how to shrink them
  2016-06-29 10:34 Unbounded growth of slab caches and how to shrink them Nikolay Borisov
@ 2016-06-29 14:00 ` Christoph Lameter
  2016-06-29 14:17   ` Nikolay Borisov
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Lameter @ 2016-06-29 14:00 UTC
  To: Nikolay Borisov; +Cc: linux-kernel@vger.kernel.org

On Wed, 29 Jun 2016, Nikolay Borisov wrote:

> I've observed a rather strange unbounded growth of the kmalloc-192
> slab cache:
>
> OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 711124869 411527215   3%    0.19K 16934908       42 135479264K kmalloc-192
>
> Essentially the kmalloc-192 cache is around 130 GB, yet only 3 percent of
> it is being used. In this case I'd like to shrink the overall size of the
> cache. How can I achieve that? I tried echoing '1' to
> /sys/kernel/slab/kmalloc-192/shrink but nothing changed.

OK, this probably means that most slabs hold just one or a few objects.
Some workloads can result in situations like that. Can you enable
debugging and get a list of the functions where these objects are allocated?
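
The arithmetic behind that guess can be sketched as follows, using only
the 3% utilization and the 42 objects per slab from the report. The
function names are hypothetical.

```python
def avg_live_objects_per_slab(utilization, objs_per_slab):
    """Average number of in-use objects per slab at a given utilization."""
    return utilization * objs_per_slab


def min_utilization(objs_per_slab):
    """Utilization when every slab is pinned by exactly one live object."""
    return 1.0 / objs_per_slab


if __name__ == "__main__":
    # About 1.26 live objects per slab on average at 3% utilization.
    print(avg_live_objects_per_slab(0.03, 42))
    # About 2.4% utilization if each slab holds a single object.
    print(min_utilization(42))
```

A reported 3% sits close to the 2.4% floor for 42-object slabs, so nearly
every slab would be held in memory by one or two live objects.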

> This is on 3.12, which is a rather old kernel, but I still believe it is
> entirely possible for someone to find a way to flood a machine with
> network requests that cause a lot of objects to be allocated, growing a
> particular slab cache. Later, when the request flood stops, the cache
> would be almost empty, yet the memory would not be usable for anything
> other than satisfying allocations from this cache.

True. Long known problem and all my attempts to facilitate a solution here
did not go anywhere. The essential solution would require objects being
movable or removable from the sparsely allocated page frames. And this
goes way beyond my subsystem.

If you can figure out which subsystem allocates or frees these objects
(through the call traces) then we may find a knob in the subsystem to
clear those out once in a while.


* Re: Unbounded growth of slab caches and how to shrink them
  2016-06-29 14:00 ` Christoph Lameter
@ 2016-06-29 14:17   ` Nikolay Borisov
  2016-06-29 14:34     ` Christoph Lameter
  0 siblings, 1 reply; 4+ messages in thread
From: Nikolay Borisov @ 2016-06-29 14:17 UTC
  To: Christoph Lameter; +Cc: linux-kernel@vger.kernel.org



On 06/29/2016 05:00 PM, Christoph Lameter wrote:
> On Wed, 29 Jun 2016, Nikolay Borisov wrote:
> 
>> I've observed a rather strange unbounded growth of the kmalloc-192
>> slab cache:
>>
>> OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
>> 711124869 411527215   3%    0.19K 16934908       42 135479264K kmalloc-192
>>
>> Essentially the kmalloc-192 cache is around 130 GB, yet only 3 percent of
>> it is being used. In this case I'd like to shrink the overall size of the
>> cache. How can I achieve that? I tried echoing '1' to
>> /sys/kernel/slab/kmalloc-192/shrink but nothing changed.
> 
> OK, this probably means that most slabs hold just one or a few objects.
> Some workloads can result in situations like that. Can you enable
> debugging and get a list of the functions where these objects are allocated?

Right, so what debugging concretely do you have in mind? So far what I
did was reboot the machine with SLUB merging disabled, since quite a lot
of slab caches are being merged into that particular one:

:t-0000192   <- cred_jar pid_3 inet_peer_cache request_sock_TCPv6
kmalloc-192 file_lock_cache bio-0 ip_dst_cache key_jar

I'm fairly sure it's either one of the networking caches or the bio-0
slab cache, since the others generally seem to be lightly used.
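
A list like the one above can also be rebuilt from sysfs. The sketch
below assumes that merged caches show up under /sys/kernel/slab as
symlinks pointing at a shared directory such as :t-0000192. The function
name merged_caches is hypothetical.

```python
import os
from collections import defaultdict


def merged_caches(root="/sys/kernel/slab"):
    """Group cache aliases by the directory their symlink points to."""
    groups = defaultdict(list)
    for entry in os.scandir(root):
        if entry.is_symlink():
            # The link target's last component is the shared cache id.
            target = os.path.basename(os.readlink(entry.path))
            groups[target].append(entry.name)
    return {target: sorted(names) for target, names in groups.items()}


if __name__ == "__main__":
    for target, names in sorted(merged_caches().items()):
        print(target, "<-", " ".join(names))
```

With merging disabled at boot, each cache should get its own directory and
this function should return few or no groups.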

> 
>> This is on 3.12, which is a rather old kernel, but I still believe it is
>> entirely possible for someone to find a way to flood a machine with
>> network requests that cause a lot of objects to be allocated, growing a
>> particular slab cache. Later, when the request flood stops, the cache
>> would be almost empty, yet the memory would not be usable for anything
>> other than satisfying allocations from this cache.
> 
> True. Long known problem and all my attempts to facilitate a solution here
> did not go anywhere. The essential solution would require objects being
> movable or removable from the sparsely allocated page frames. And this
> goes way beyond my subsystem.
> 
> If you can figure out which subsystem allocates or frees these objects
> (through the call traces) then we may find a knob in the subsystem to
> clear those out once in a while.
> 
> 


* Re: Unbounded growth of slab caches and how to shrink them
  2016-06-29 14:17   ` Nikolay Borisov
@ 2016-06-29 14:34     ` Christoph Lameter
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Lameter @ 2016-06-29 14:34 UTC
  To: Nikolay Borisov; +Cc: linux-kernel@vger.kernel.org

On Wed, 29 Jun 2016, Nikolay Borisov wrote:

> Right, so what debugging concretely do you have in mind? So far what I
> did was reboot the machine with SLUB merging disabled, since quite a lot
> of slab caches are being merged into that particular one:
>
> :t-0000192   <- cred_jar pid_3 inet_peer_cache request_sock_TCPv6
> kmalloc-192 file_lock_cache bio-0 ip_dst_cache key_jar
>
> I'm fairly sure it's either one of the networking caches or the bio-0
> slab cache, since the others generally seem to be lightly used.

Reboot the box with "slub_debug" on the kernel command line. Then post
the output of /sys/kernel/slab/kmalloc-192/alloc_calls and free_calls after
you have recreated the situation.
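
Once that output is available, the heaviest call sites can be ranked with
a few lines of Python. This is a sketch under the assumption that each
line starts with an object count followed by the calling symbol. Any
further fields are ignored, and top_call_sites is a hypothetical helper.

```python
def top_call_sites(text, limit=5):
    """Return up to `limit` (count, symbol) pairs, largest count first."""
    sites = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or unexpected lines instead of failing on them.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        sites.append((int(fields[0]), fields[1]))
    sites.sort(key=lambda site: site[0], reverse=True)
    return sites[:limit]


if __name__ == "__main__":
    # Path is an example; substitute the cache being investigated.
    with open("/sys/kernel/slab/kmalloc-192/alloc_calls") as f:
        for count, symbol in top_call_sites(f.read()):
            print(count, symbol)
```

The symbol at the top of the list is the first candidate for the subsystem
that keeps the sparsely used slabs alive.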
