* Reading /proc/slabinfo causes stalls
From: Avleen Vig @ 2012-10-16 17:51 UTC
  To: linux-kernel

I *think* this is the right place to ask this, and apologies if it's not
(is there a better place?).

We have checks which read /proc/slabinfo once a minute, and have noticed
that this causes the entire system to stall for a few milliseconds.
It's long enough that it causes noticeable delays in latency-sensitive
applications (between 10ms and 100ms).

Is this a known condition? Are there workarounds or other ways to get the
slab allocation data which don't cause stalls?


Thanks :-)
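
A minimal sketch of how the cost of such a check could be measured, assuming
Python is available on the box. The function name `time_read` is hypothetical.
It only times how long each full read of /proc/slabinfo takes for the reader
itself, which is a rough proxy and not a direct measurement of the stall seen
by other applications.

```python
import time


def time_read(path="/proc/slabinfo", repeats=5):
    """Return wall-clock durations in seconds, one per full read of `path`."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as f:
            f.read()  # read the whole file, as a monitoring check would
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    try:
        samples = time_read()
    except OSError as exc:
        # Reading /proc/slabinfo may require root, depending on the kernel.
        print(f"could not read /proc/slabinfo: {exc}")
    else:
        print(f"min {min(samples) * 1e3:.3f} ms, max {max(samples) * 1e3:.3f} ms")
```

Comparing these numbers against latency spikes logged by the affected
applications at the same timestamps would show whether the two line up.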


* Re: Reading /proc/slabinfo causes stalls
From: Pekka Enberg @ 2012-10-16 18:04 UTC
  To: Avleen Vig; +Cc: linux-kernel, Christoph Lameter, David Rientjes, Andrew Morton

Hi Avleen,

On Tue, Oct 16, 2012 at 8:51 PM, Avleen Vig <avleen@gmail.com> wrote:
> I *think* this is the right place to ask this, and apologies if it's not
> (is there a better place?).
>
> We have checks which read /proc/slabinfo once a minute, and have noticed
> that this causes the entire system to stall for a few milliseconds.
> It's long enough that it causes noticeable delays in latency-sensitive
> applications (between 10ms and 100ms).
>
> Is this a known condition? Are there workarounds or other ways to get the
> slab allocation data which don't cause stalls?

What kernel version are you using? What does your .config look like?
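
Both questions can be answered from the running system. The sketch below is
one possible way to do it, not an official tool: the helper names
`slab_allocator` and `read_kernel_config` are hypothetical, and it assumes the
config is available either as /boot/config-<release> (a common distro
convention) or as /proc/config.gz (only present if the kernel was built with
that option). It looks for the CONFIG_SLAB, CONFIG_SLUB and CONFIG_SLOB
choices that kernels of this era offer.

```python
import gzip
import os
import platform


def slab_allocator(config_text):
    """Return 'SLAB', 'SLUB' or 'SLOB' if the config selects one, else None."""
    for line in config_text.splitlines():
        line = line.strip()
        for name in ("SLAB", "SLUB", "SLOB"):
            # Exact match, so CONFIG_SLUB_DEBUG=y or "# ... is not set" are ignored.
            if line == f"CONFIG_{name}=y":
                return name
    return None


def read_kernel_config():
    """Return the running kernel's config text, or None if it cannot be read."""
    candidates = [f"/boot/config-{platform.release()}", "/proc/config.gz"]
    for path in candidates:
        if not os.path.exists(path):
            continue
        opener = gzip.open if path.endswith(".gz") else open
        try:
            with opener(path, "rt") as f:
                return f.read()
        except OSError:
            continue  # unreadable here, try the next location
    return None


if __name__ == "__main__":
    print("kernel release:", platform.release())
    text = read_kernel_config()
    print("slab allocator:", slab_allocator(text) if text else "config not found")
```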


* Re: Reading /proc/slabinfo causes stalls
From: Avleen Vig @ 2012-10-16 18:25 UTC
  To: Pekka Enberg
  Cc: linux-kernel, Christoph Lameter, David Rientjes, Andrew Morton

By default we use the 2.6.32-279.5.2 kernel provided by CentOS.
However, I also tried the 3.0.46 kernel from elrepo:
http://elrepo.org/linux/kernel/el6/x86_64/RPMS/

The config for the 3.0 kernel is:
https://gist.github.com/3901068

They both seem to exhibit the same problem.

Thanks :)

On Tue, Oct 16, 2012 at 1:04 PM, Pekka Enberg <penberg@kernel.org> wrote:
> Hi Avleen,
>
> On Tue, Oct 16, 2012 at 8:51 PM, Avleen Vig <avleen@gmail.com> wrote:
>> I *think* this is the right place to ask this, and apologies if it's not
>> (is there a better place?).
>>
>> We have checks which read /proc/slabinfo once a minute, and have noticed
>> that this causes the entire system to stall for a few milliseconds.
>> It's long enough that it causes noticeable delays in latency-sensitive
>> applications (between 10ms and 100ms).
>>
>> Is this a known condition? Are there workarounds or other ways to get the
>> slab allocation data which don't cause stalls?
>
> What kernel version are you using? What does your .config look like?


* Re: Reading /proc/slabinfo causes stalls
From: Christoph Lameter @ 2012-10-16 18:29 UTC
  To: Pekka Enberg; +Cc: Avleen Vig, linux-kernel, David Rientjes, Andrew Morton

On Tue, 16 Oct 2012, Pekka Enberg wrote:

> > Is this a known condition? Are there workarounds or other ways to get the
> > slab allocation data which don't cause stalls?
>
> What kernel version are you using? What does your .config look like?

Reading slabinfo takes a mutex, which stalls certain slab operations
(shrinking caches, retuning caches, cleaning caches in SLAB) while it is held.

If the allocator is SLAB then I would think that this is contention on the
slabinfo mutex between the read and the slab clean operations (this gets more
frequent the more processors are in the box).

If this is the case then using the SLUB allocator will fix the issue since
SLUB does not need to perform cache cleaning.
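
The effect described above can be illustrated with a purely user-space
analogy. This is not kernel code and every name in it is hypothetical: one
thread stands in for the slabinfo reader and holds a lock for a while, and a
periodic worker stands in for cache maintenance that needs the same lock. The
worker's longest wait ends up close to the time the reader held the lock.

```python
import threading
import time


def longest_wait(hold_seconds, period_seconds=0.005, run_seconds=0.3):
    """Return the longest time a periodic worker waited for a shared lock
    while another thread held that lock once for `hold_seconds`."""
    lock = threading.Lock()
    waits = []
    stop = threading.Event()

    def periodic_worker():
        # Stands in for periodic maintenance that must take the lock.
        while not stop.is_set():
            start = time.perf_counter()
            with lock:
                waits.append(time.perf_counter() - start)
            time.sleep(period_seconds)

    worker = threading.Thread(target=periodic_worker)
    worker.start()
    time.sleep(run_seconds / 3)   # let the worker run uncontended first
    with lock:                    # stands in for the slabinfo reader
        time.sleep(hold_seconds)
    time.sleep(run_seconds / 3)   # let the worker recover
    stop.set()
    worker.join()
    return max(waits)


if __name__ == "__main__":
    print(f"longest wait: {longest_wait(0.1) * 1e3:.1f} ms")
```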



* Re: Reading /proc/slabinfo causes stalls
From: David Rientjes @ 2012-10-16 18:47 UTC
  To: Christoph Lameter; +Cc: Pekka Enberg, Avleen Vig, linux-kernel, Andrew Morton

On Tue, 16 Oct 2012, Christoph Lameter wrote:

> If the allocator is SLAB then I would think that this is contention on the
> slabinfo mutex between the read and the slab clean operations (this gets more
> frequent the more processors are in the box).
> 
> If this is the case then using the SLUB allocator will fix the issue since
> SLUB does not need to perform cache cleaning.
> 

Or convert slab_mutex to be a rwlock?


* Re: Reading /proc/slabinfo causes stalls
From: Christoph Lameter @ 2012-10-16 18:55 UTC
  To: David Rientjes; +Cc: Pekka Enberg, Avleen Vig, linux-kernel, Andrew Morton

On Tue, 16 Oct 2012, David Rientjes wrote:

> Or convert slab_mutex to be a rwlock?

The RT folks just converted it to a mutex.... ;-)



* Re: Reading /proc/slabinfo causes stalls
From: Avleen Vig @ 2012-10-16 19:04 UTC
  To: Christoph Lameter
  Cc: David Rientjes, Pekka Enberg, linux-kernel, Andrew Morton

On Tue, Oct 16, 2012 at 1:55 PM, Christoph Lameter <cl@linux.com> wrote:
> On Tue, 16 Oct 2012, David Rientjes wrote:
>
>> Or convert slab_mutex to be a rwlock?
>
> The RT folks just converted it to a mutex.... ;-)

Damn :-)

I'll play around with SLUB.
There were some concerns that it doesn't perform as well as SLAB on
large hardware (we're almost always running >16G RAM, and regularly
>96G RAM on things like memcache servers).
If the performance pans out, SLUB would be great.
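
When trying SLUB, per-cache counters are also exposed under /sys/kernel/slab/,
one directory per cache, with files such as `objects` and `object_size`. The
sketch below shows one way to read them. It is an assumption-laden example:
the helper names are hypothetical, some files are only present with SLUB
debugging support and may need root, and nothing in this thread establishes
whether reading them avoids the stall seen with /proc/slabinfo.

```python
import os


def parse_count(text):
    """Parse a SLUB sysfs counter such as '1234 N0=1000 N1=234'.

    Returns (total, per_node), where per_node maps node labels to counts.
    A plain integer such as '256' yields (256, {}).
    """
    fields = text.split()
    total = int(fields[0])
    per_node = {}
    for field in fields[1:]:
        node, _, value = field.partition("=")
        per_node[node] = int(value)
    return total, per_node


def read_cache(name, root="/sys/kernel/slab"):
    """Return {'objects': ..., 'object_size': ...} totals for one cache."""
    stats = {}
    for attr in ("objects", "object_size"):
        with open(os.path.join(root, name, attr)) as f:
            stats[attr] = parse_count(f.read())[0]
    return stats


if __name__ == "__main__":
    try:
        for cache in sorted(os.listdir("/sys/kernel/slab"))[:5]:
            print(cache, read_cache(cache))
    except OSError as exc:
        # Not a SLUB kernel, or the files are not readable by this user.
        print(f"could not read SLUB sysfs data: {exc}")
```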


* Re: Reading /proc/slabinfo causes stalls
From: Christoph Lameter @ 2012-10-16 19:27 UTC
  To: Avleen Vig; +Cc: David Rientjes, Pekka Enberg, linux-kernel, Andrew Morton

On Tue, 16 Oct 2012, Avleen Vig wrote:

> There were some concerns that it doesn't perform as well as SLAB on
> large hardware (we're almost always running >16G RAM, and regularly
> >96G RAM on things like memcache servers).
> If the performance pans out, SLUB would be great.

We have been using SLUB here for years with servers up to 256G RAM and
64 processors in both HPC and Enterprise environments.


