linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christoph Lameter <clameter@sgi.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matt Mackall <mpm@selenic.com>, "Rafael J. Wysocki" <rjw@sisk.pl>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: tipc_init(), WARNING: at arch/x86/mm/highmem_32.c:52, [2.6.24-rc4-git5: Reported regressions from 2.6.23]
Date: Mon, 17 Dec 2007 11:54:46 -0800 (PST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0712171147580.13147@schroedinger.engr.sgi.com> (raw)
In-Reply-To: <20071214124900.GB31931@elte.hu>

On Fri, 14 Dec 2007, Ingo Molnar wrote:

> which is of little help if it regresses on other workloads. As we've 
> seen it, SLUB can be more than 10 times slower on hackbench. You can 
> tune SLUB to use 2MB pages but of course that's not a production level 
> system. OTOH, have you tried to tune SLAB in the above benchmark?

Hackbench is one special use case and I was not aware of there being an 
issue there. AFAICT other workloads are fine. I still do not understand 
why the measures in SLUB to avoid lock contention do not take in this 
case. Need to run some more tests.

> > - Single threaded allocation speed is up to double that of SLAB
> 
> link?

Same as link as for the earlier numbers.

> > - Debugging on SLAB is difficult. Requires recompile of the kernel
> >   and the resulting output is difficult to interpret. SLUB can apply
> >   debugging options to a subset of the slabcaches in order to allow
> >   the system to work with maximum speed. This is necessary to detect
> >   difficult to reproduce race conditions.
> 
> that's not a fundamental property of SLAB. It would be an about 10 lines 
> hack to enable SLAB debugging switchable-on runtime, with the boot flag 
> defaulting to 'off'.

Well try it. Note that you need to avoid the runtime debugging result in a 
negative performance impact.
 
> > - SLAB can capture huge amounts of memory in its queues. The problem
> >   gets worse the more processors and NUMA nodes are in the system. The 
> >   amount of memory limits the number of per cpu objects one can 
> >   configure.
> 
> well that's the nature of caches, but it could be improved: restrict 
> alien caches along cpusets and demand-allocate them.

Maybe but that adds additional complexity. There are other issues with 
queues too.

> > - SLAB requires a pass through all slab caches every 2 seconds to
> >   expire objects. This is a problem both for realtime and MPI jobs 
> >   that cannot take such a processor outage.
> 
> the moment you start capturing more memory in SLUB's per cpu queues 
> (which do exist), you will have the same sort of problem.

There are no queues and thus no problem in SLUB. The per cpu slab is 
exactly one slab and cannot grow beyond that.

> > - SLAB requires the update of two words for freeing
> >   and allocation. SLUB can do that by updating a single word which 
> >   allows to avoid enabling and disabling interrupts if the processor 
> >   supports an atomic instruction for that purpose. This is important 
> >   for realtime kernels where special measures may have to be 
> >   implemented if one wants to disable interrupts.
> 
> i do appreciate that :-) SLUB was rather easy to "port" to PREEMPT_RT: 
> it did not need a single line of change. The SLAB portion is a lot 
> scarier:

Finally something positive. I think we can get to a point where SLUB can 
be the same on RT and non RT. 
 
> How about renaming it to SLAB2 instead of SLUB? The "unqueued" bit is 
> just stupid NIH syndrome. It's _of course_ queued because it has to. "It 
> does not have _THAT_ queue as SLAB used to have" is just a silly excuse.

Hmmm yes. At some point I want to remove SLAB and rename SLUB SLAB. Note 
that the queues (if you want to call the per slab page freelist queues) 
are significantly different.

> > - SLUB creates rarely used DMA caches on demand instead of creating
> >   them all on bootup (SLAB).
> 
> actually, this might be a bug. the DMA caches should be created right 
> away and filled with a small amount of objects due to stupid 16MB 
> limitations with certain hardware. Later on a GFP_DMA request might not 
> be fulfillable. (because that zone is filled up pretty quickly)

Use of SLAB DMA memory are exceedingly rare. Andi Kleen has removed 
almost all uses of slab DMA. The DMA must remain allocatable in order to 
allow allocations for legacy device drivers. If it fills up then we will 
have other issues.


  reply	other threads:[~2007-12-17 20:14 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-12-08  2:40 2.6.24-rc4-git5: Reported regressions from 2.6.23 Rafael J. Wysocki
2007-12-08  6:53 ` Fabio Comolli
2007-12-08  8:28   ` Ingo Molnar
2007-12-08  9:23     ` Andrew Morton
2007-12-08 22:11       ` Rafael J. Wysocki
2007-12-08  9:29 ` Andrew Morton
2007-12-08 22:17   ` Rafael J. Wysocki
2007-12-08  9:30 ` tipc_init(), WARNING: at arch/x86/mm/highmem_32.c:52, [2.6.24-rc4-git5: Reported regressions from 2.6.23] Ingo Molnar
2007-12-08 10:11   ` Andrew Morton
2007-12-08 16:37   ` Matt Mackall
2007-12-08 17:47     ` Linus Torvalds
2007-12-08 17:54       ` Linus Torvalds
2007-12-08 18:09         ` Andrew Morton
2007-12-08 18:37           ` Linus Torvalds
2007-12-08 19:52             ` Ingo Molnar
2007-12-08 20:29               ` Ingo Molnar
2007-12-09  8:20                 ` Pekka Enberg
2007-12-09  8:50                   ` Ingo Molnar
2007-12-09  9:18                     ` Pekka Enberg
2007-12-09 11:51                       ` Ingo Molnar
2007-12-09 12:34                         ` Ingo Molnar
2007-12-13 22:07                     ` Christoph Lameter
2007-12-09 15:59                   ` Arjan van de Ven
2007-12-11  6:27               ` Dave Jones
2007-12-11  8:52                 ` Ingo Molnar
2007-12-11 19:03                   ` Peter Zijlstra
2007-12-14  4:07               ` Christoph Lameter
2007-12-14  7:16                 ` Eric Dumazet
2007-12-14 12:49                 ` Ingo Molnar
2007-12-17 19:54                   ` Christoph Lameter [this message]
2007-12-09  7:58           ` Ingo Molnar
2007-12-09 14:17             ` Rafael J. Wysocki
2007-12-08 18:33         ` Matt Mackall
2007-12-08 19:00         ` Matt Mackall
2007-12-09  8:33         ` Pekka Enberg
2007-12-13 22:03       ` Christoph Lameter
2007-12-08  9:36 ` 2.6.24-rc4-git5: Reported regressions from 2.6.23 Andrew Morton
2007-12-08 10:12   ` Andreas Mohr
2007-12-08 10:20     ` Andrew Morton
2007-12-08 10:28       ` Matthew Garrett
2007-12-08 10:55       ` Andreas Mohr
2007-12-09 15:46         ` Tejun Heo
2007-12-09 19:59           ` Andreas Mohr
2007-12-09  6:52   ` Tejun Heo
2007-12-09 14:20     ` Rafael J. Wysocki
2007-12-09 15:11       ` Tejun Heo
2007-12-08  9:42 ` Andrew Morton
2007-12-08 18:57   ` Roland Dreier
2007-12-08 19:40   ` Theodore Tso
2007-12-08 19:55     ` Ingo Molnar
2007-12-08 22:30     ` Rafael J. Wysocki
2007-12-09  2:15       ` Theodore Tso
2007-12-13 10:49         ` Takashi Iwai
2007-12-20 15:42           ` Takashi Iwai
2007-12-08  9:46 ` Andrew Morton
2007-12-08 15:49   ` Alan Stern
2007-12-08  9:52 ` Andrew Morton
2007-12-09  7:00   ` Tejun Heo
2007-12-09 13:42     ` Alan Cox
2007-12-09 15:09       ` Tejun Heo
2007-12-09 15:25         ` Alan Cox
2007-12-09 15:39           ` Tejun Heo
2007-12-09 18:36       ` Linus Torvalds
2007-12-09 21:54         ` Alan Cox
2007-12-09 18:41       ` Linus Torvalds
2007-12-09 22:01         ` Alan Cox
2007-12-09 22:51           ` Ray Lee
2007-12-10  1:57           ` Linus Torvalds
2007-12-10  3:28             ` Alan Cox
2007-12-10  3:38             ` Alan Cox
2007-12-10 15:38               ` Linus Torvalds
2007-12-10  8:21             ` Ingo Molnar
2007-12-10  8:27               ` Tejun Heo
2007-12-10  8:41                 ` Ingo Molnar
2007-12-08 10:44 ` Richard Purdie
2007-12-08 22:32   ` Rafael J. Wysocki
2007-12-09 11:54 ` Andrew Morton
2007-12-09 12:05   ` Ingo Molnar
2007-12-09 14:24   ` Rafael J. Wysocki
2007-12-10 20:42 ` Ingo Molnar
2007-12-10 20:57   ` Guillaume Chazarain
2007-12-10 20:59   ` Andrew Morton
2007-12-10 22:45     ` Ingo Molnar
2007-12-10 23:04       ` Ingo Molnar
2007-12-10 23:34         ` Stefano Brivio
2007-12-10 23:53           ` Guillaume Chazarain
2007-12-11  8:48             ` Ingo Molnar
2007-12-10 23:56           ` Arjan van de Ven
2007-12-11  0:01             ` Guillaume Chazarain
2007-12-11  1:06               ` Arjan van de Ven
2007-12-11  8:43                 ` Ingo Molnar
2007-12-11  9:01           ` Ingo Molnar
2007-12-11 21:10             ` Stefano Brivio
2007-12-19  0:58             ` Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0712171147580.13147@schroedinger.engr.sgi.com \
    --to=clameter@sgi.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=mpm@selenic.com \
    --cc=rjw@sisk.pl \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).