* [PATCH] procfs: provide slub's /proc/slabinfo
@ 2008-01-02 18:43 Hugh Dickins
  2008-01-02 18:53 ` Christoph Lameter
  2008-01-02 19:09 ` Pekka Enberg
  0 siblings, 2 replies; 69+ messages in thread
From: Hugh Dickins @ 2008-01-02 18:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka Enberg, Ingo Molnar, Andi Kleen, Christoph Lameter,
	Peter Zijlstra, linux-kernel

SLUB's new slabinfo isn't there: it looks as if a last-minute change
to Pekka's patch left it dependent on CONFIG_SLAB at the procfs end.
Allow for CONFIG_SLUB too.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
---
To minimize ifdeffery, this leaves it with S_IWUSR though unwritable:
I'm assuming that's acceptable.

 fs/proc/proc_misc.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- 2.6.24-rc6-git/fs/proc/proc_misc.c	2007-10-20 08:04:13.000000000 +0100
+++ linux/fs/proc/proc_misc.c	2008-01-02 17:55:57.000000000 +0000
@@ -410,15 +410,18 @@ static const struct file_operations proc
 };
 #endif
 
-#ifdef CONFIG_SLAB
+#if defined(CONFIG_SLAB) || defined(CONFIG_SLUB)
 static int slabinfo_open(struct inode *inode, struct file *file)
 {
 	return seq_open(file, &slabinfo_op);
 }
+
 static const struct file_operations proc_slabinfo_operations = {
 	.open		= slabinfo_open,
 	.read		= seq_read,
+#ifdef CONFIG_SLAB
 	.write		= slabinfo_write,
+#endif
 	.llseek		= seq_lseek,
 	.release	= seq_release,
 };
@@ -728,7 +731,7 @@ void __init proc_misc_init(void)
 #endif
 	create_seq_entry("stat", 0, &proc_stat_operations);
 	create_seq_entry("interrupts", 0, &proc_interrupts_operations);
-#ifdef CONFIG_SLAB
+#if defined(CONFIG_SLAB) || defined(CONFIG_SLUB)
 	create_seq_entry("slabinfo",S_IWUSR|S_IRUGO,&proc_slabinfo_operations);
 #ifdef CONFIG_DEBUG_SLAB_LEAK
 	create_seq_entry("slab_allocators", 0 ,&proc_slabstats_operations);

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-02 18:43 [PATCH] procfs: provide slub's /proc/slabinfo Hugh Dickins
@ 2008-01-02 18:53 ` Christoph Lameter
  2008-01-02 19:09 ` Pekka Enberg
  1 sibling, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-02 18:53 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Pekka Enberg, Ingo Molnar, Andi Kleen,
	Peter Zijlstra, linux-kernel

On Wed, 2 Jan 2008, Hugh Dickins wrote:

> SLUB's new slabinfo isn't there: it looks as if a last-minute change
> to Pekka's patch left it dependent on CONFIG_SLAB at the procfs end.
> Allow for CONFIG_SLUB too.
> 
> Signed-off-by: Hugh Dickins <hugh@veritas.com>

I just saw the patch in Linus' tree and wondered how this was going to work 
without the piece that you are providing here.

Acked-by: Christoph Lameter <clameter@sgi.com>


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-02 18:43 [PATCH] procfs: provide slub's /proc/slabinfo Hugh Dickins
  2008-01-02 18:53 ` Christoph Lameter
@ 2008-01-02 19:09 ` Pekka Enberg
  2008-01-02 19:35   ` Linus Torvalds
  1 sibling, 1 reply; 69+ messages in thread
From: Pekka Enberg @ 2008-01-02 19:09 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Ingo Molnar, Andi Kleen, Christoph Lameter,
	Peter Zijlstra, linux-kernel

Hi,

On Jan 2, 2008 8:43 PM, Hugh Dickins <hugh@veritas.com> wrote:
> SLUB's new slabinfo isn't there: it looks as if a last-minute change
> to Pekka's patch left it dependent on CONFIG_SLAB at the procfs end.
> Allow for CONFIG_SLUB too.
>
> Signed-off-by: Hugh Dickins <hugh@veritas.com>
> ---
> To minimize ifdeffery, this leaves it with S_IWUSR though unwritable:
> I'm assuming that's acceptable.

I already sent the remaining bits to Linus but this looks much
cleaner. Thanks Hugh!

Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-02 19:09 ` Pekka Enberg
@ 2008-01-02 19:35   ` Linus Torvalds
  2008-01-02 19:45     ` Linus Torvalds
                       ` (2 more replies)
  0 siblings, 3 replies; 69+ messages in thread
From: Linus Torvalds @ 2008-01-02 19:35 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Hugh Dickins, Ingo Molnar, Andi Kleen, Christoph Lameter,
	Peter Zijlstra, Linux Kernel Mailing List



On Wed, 2 Jan 2008, Pekka Enberg wrote:
> 
> I already sent the remaining bits to Linus but this looks much
> cleaner. Thanks Hugh!
> 
> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>

Actually, I'd much rather just do this instead (on top of your patch)

It just creates a new CONFIG_SLABINFO that automatically has the right 
dependencies (ie depends on PROC being on, and either SLAB or SLUB), and 
then both SLAB and SLUB just have the exact same interfaces.

Which means that SLOB could also trivially implement the same thing, with 
no new #ifdef'fery or other crud.

It's totally untested, but looks fairly obvious.

		Linus

---
 fs/proc/proc_misc.c      |   21 ++-------------------
 include/linux/slab.h     |    5 +++++
 include/linux/slab_def.h |    3 ---
 include/linux/slub_def.h |    2 --
 init/Kconfig             |    6 ++++++
 mm/slab.c                |    2 +-
 mm/slub.c                |   11 +++++++++--
 7 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/fs/proc/proc_misc.c b/fs/proc/proc_misc.c
index a11968b..3462bfd 100644
--- a/fs/proc/proc_misc.c
+++ b/fs/proc/proc_misc.c
@@ -410,7 +410,7 @@ static const struct file_operations proc_modules_operations = {
 };
 #endif
 
-#ifdef CONFIG_SLAB
+#ifdef CONFIG_SLABINFO
 static int slabinfo_open(struct inode *inode, struct file *file)
 {
 	return seq_open(file, &slabinfo_op);
@@ -451,20 +451,6 @@ static const struct file_operations proc_slabstats_operations = {
 #endif
 #endif
 
-#ifdef CONFIG_SLUB
-static int slabinfo_open(struct inode *inode, struct file *file)
-{
-	return seq_open(file, &slabinfo_op);
-}
-
-static const struct file_operations proc_slabinfo_operations = {
-	.open		= slabinfo_open,
-	.read		= seq_read,
-	.llseek		= seq_lseek,
-	.release	= seq_release,
-};
-#endif
-
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i;
@@ -742,15 +728,12 @@ void __init proc_misc_init(void)
 #endif
 	create_seq_entry("stat", 0, &proc_stat_operations);
 	create_seq_entry("interrupts", 0, &proc_interrupts_operations);
-#ifdef CONFIG_SLAB
+#ifdef CONFIG_SLABINFO
 	create_seq_entry("slabinfo",S_IWUSR|S_IRUGO,&proc_slabinfo_operations);
 #ifdef CONFIG_DEBUG_SLAB_LEAK
 	create_seq_entry("slab_allocators", 0 ,&proc_slabstats_operations);
 #endif
 #endif
-#ifdef CONFIG_SLUB
-	create_seq_entry("slabinfo", S_IWUSR|S_IRUGO, &proc_slabinfo_operations);
-#endif
 	create_seq_entry("buddyinfo",S_IRUGO, &fragmentation_file_operations);
 	create_seq_entry("pagetypeinfo", S_IRUGO, &pagetypeinfo_file_ops);
 	create_seq_entry("vmstat",S_IRUGO, &proc_vmstat_file_operations);
diff --git a/include/linux/slab.h b/include/linux/slab.h
index f3a8eec..f62caaa 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -271,5 +271,10 @@ static inline void *kzalloc(size_t size, gfp_t flags)
 	return kmalloc(size, flags | __GFP_ZERO);
 }
 
+#ifdef CONFIG_SLABINFO
+extern const struct seq_operations slabinfo_op;
+ssize_t slabinfo_write(struct file *, const char __user *, size_t, loff_t *);
+#endif
+
 #endif	/* __KERNEL__ */
 #endif	/* _LINUX_SLAB_H */
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 32bdc2f..fcc4809 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -95,7 +95,4 @@ found:
 
 #endif	/* CONFIG_NUMA */
 
-extern const struct seq_operations slabinfo_op;
-ssize_t slabinfo_write(struct file *, const char __user *, size_t, loff_t *);
-
 #endif	/* _LINUX_SLAB_DEF_H */
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index b7d9408..40801e7 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -200,6 +200,4 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 }
 #endif
 
-extern const struct seq_operations slabinfo_op;
-
 #endif /* _LINUX_SLUB_DEF_H */
diff --git a/init/Kconfig b/init/Kconfig
index 404bbf3..b9d11a8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -658,6 +658,12 @@ endchoice
 
 endmenu		# General setup
 
+config SLABINFO
+	bool
+	depends on PROC_FS
+	depends on SLAB || SLUB
+	default y
+
 config RT_MUTEXES
 	boolean
 	select PLIST
diff --git a/mm/slab.c b/mm/slab.c
index 2e338a5..aebb9f6 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4105,7 +4105,7 @@ out:
 	schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_CPUC));
 }
 
-#ifdef CONFIG_PROC_FS
+#ifdef CONFIG_SLABINFO
 
 static void print_slabinfo_header(struct seq_file *m)
 {
diff --git a/mm/slub.c b/mm/slub.c
index 903dabd..474945e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4127,7 +4127,14 @@ __initcall(slab_sysfs_init);
 /*
  * The /proc/slabinfo ABI
  */
-#ifdef CONFIG_PROC_FS
+#ifdef CONFIG_SLABINFO
+
+ssize_t slabinfo_write(struct file *file, const char __user * buffer,
+                       size_t count, loff_t *ppos)
+{
+	return -EINVAL;
+}
+
 
 static void print_slabinfo_header(struct seq_file *m)
 {
@@ -4201,4 +4208,4 @@ const struct seq_operations slabinfo_op = {
 	.show = s_show,
 };
 
-#endif /* CONFIG_PROC_FS */
+#endif /* CONFIG_SLABINFO */

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-02 19:35   ` Linus Torvalds
@ 2008-01-02 19:45     ` Linus Torvalds
  2008-01-02 19:49     ` Pekka Enberg
  2008-01-02 22:50     ` Matt Mackall
  2 siblings, 0 replies; 69+ messages in thread
From: Linus Torvalds @ 2008-01-02 19:45 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Hugh Dickins, Ingo Molnar, Andi Kleen, Christoph Lameter,
	Peter Zijlstra, Linux Kernel Mailing List



On Wed, 2 Jan 2008, Linus Torvalds wrote:
> 
> Actually, I'd much rather just do this instead (on top of your patch)

Side note - and I didn't do this - this also allows you to disable 
slabinfo for those people who think it is pointless.

In particular, you could just make that

> +config SLABINFO
> +	bool
> +	depends on PROC_FS
> +	depends on SLAB || SLUB
> +	default y

Kconfig entry use something like

	bool "Enable /proc/slabinfo support"

(perhaps adding an "if EMBEDDED" at the end), and now people can decide 
whether they actually want it or not (independently of whether they use 
SLUB or SLAB or whatever). But I left it hardcoded to on, just because I'm 
not sure it's worth it.
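
Spelled out, the user-visible variant might look something like this
(untested, same caveat as the rest of the patch):

	config SLABINFO
		bool "Enable /proc/slabinfo support" if EMBEDDED
		depends on PROC_FS
		depends on SLAB || SLUB
		default y

The "if EMBEDDED" on the prompt keeps the question hidden from normal
configurations while still letting the embedded people turn it off.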

		Linus

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-02 19:35   ` Linus Torvalds
  2008-01-02 19:45     ` Linus Torvalds
@ 2008-01-02 19:49     ` Pekka Enberg
  2008-01-02 22:50     ` Matt Mackall
  2 siblings, 0 replies; 69+ messages in thread
From: Pekka Enberg @ 2008-01-02 19:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hugh Dickins, Ingo Molnar, Andi Kleen, Christoph Lameter,
	Peter Zijlstra, Linux Kernel Mailing List

Hi Linus,

On Jan 2, 2008 9:35 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> Actually, I'd much rather just do this instead (on top of your patch)
>
> It just creates a new CONFIG_SLABINFO that automatically has the right
> dependencies (ie depends on PROC being on, and either SLAB or SLUB), and
> then both SLAB and SLUB just have the exact same interfaces.
>
> Which means that SLOB could also trivially implement the same thing, with
> no new #ifdef'fery or other crud.
>
> It's totally untested, but looks fairly obvious.

Looks good to me. Thanks! :-)

Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-02 19:35   ` Linus Torvalds
  2008-01-02 19:45     ` Linus Torvalds
  2008-01-02 19:49     ` Pekka Enberg
@ 2008-01-02 22:50     ` Matt Mackall
  2008-01-03  8:52       ` Ingo Molnar
  2 siblings, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-02 22:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka Enberg, Hugh Dickins, Ingo Molnar, Andi Kleen,
	Christoph Lameter, Peter Zijlstra, Linux Kernel Mailing List


On Wed, 2008-01-02 at 11:35 -0800, Linus Torvalds wrote:
> 
> On Wed, 2 Jan 2008, Pekka Enberg wrote:
> > 
> > I already sent the remaining bits to Linus but this looks much
> > cleaner. Thanks Hugh!
> > 
> > Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
> 
> Actually, I'd much rather just do this instead (on top of your patch)
> 
> It just creates a new CONFIG_SLABINFO that automatically has the right 
> dependencies (ie depends on PROC being on, and either SLAB or SLUB), and 
> then both SLAB and SLUB just have the exact same interfaces.
> 
> Which means that SLOB could also trivially implement the same thing, with 
> no new #ifdef'fery or other crud.

Except SLOB's emulation of slabs is so thin, it doesn't have the
relevant information. We have a very small struct kmem_cache, which I
suppose could contain a counter. But we don't have anything like the
kmalloc slabs, so you'd only be getting half the picture anyway.
The output of slabtop would simply be misleading because there are no
underlying "slabs" in the first place.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-02 22:50     ` Matt Mackall
@ 2008-01-03  8:52       ` Ingo Molnar
  2008-01-03 16:46         ` Matt Mackall
  2008-01-03 20:31         ` [PATCH] procfs: provide slub's /proc/slabinfo Christoph Lameter
  0 siblings, 2 replies; 69+ messages in thread
From: Ingo Molnar @ 2008-01-03  8:52 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka Enberg, Hugh Dickins, Andi Kleen,
	Christoph Lameter, Peter Zijlstra, Linux Kernel Mailing List


* Matt Mackall <mpm@selenic.com> wrote:

> > Which means that SLOB could also trivially implement the same thing, 
> > with no new #ifdef'fery or other crud.
> 
> Except SLOB's emulation of slabs is so thin, it doesn't have the 
> relevant information. We have a very small struct kmem_cache, which I 
> suppose could contain a counter. But we don't have anything like the 
> kmalloc slabs, so you'd only be getting half the picture anyway. The 
> output of slabtop would simply be misleading because there are no 
> underlying "slabs" in the first place.

I think SLOB/embedded is sufficiently special that a "no /proc/slabinfo" 
restriction is perfectly supportable (for instance, only selectable if 
CONFIG_EMBEDDED=y). If a SLOB user has any memory allocation problems, 
it's worth going to the bigger allocators anyway, to get all the 
debugging goodies.

btw., do you think it would be worth/possible to have a build mode for 
SLUB that is acceptably close to the memory efficiency of SLOB (and 
hence work towards unifying all three allocators into SLUB, in essence)?

right now we are far away from it - SLUB has an order of magnitude 
larger .o than SLOB, even on UP. I'm wondering why that is so - SLUB's 
data structures _are_ quite compact and could in theory be used in a 
SLOB-alike way. Perhaps one problem is that much of SLUB's debugging 
code is always built in?

	Ingo

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-03  8:52       ` Ingo Molnar
@ 2008-01-03 16:46         ` Matt Mackall
  2008-01-04  2:21           ` Christoph Lameter
  2008-01-03 20:31         ` [PATCH] procfs: provide slub's /proc/slabinfo Christoph Lameter
  1 sibling, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-03 16:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Pekka Enberg, Hugh Dickins, Andi Kleen,
	Christoph Lameter, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-03 at 09:52 +0100, Ingo Molnar wrote:
> * Matt Mackall <mpm@selenic.com> wrote:
> 
> > > Which means that SLOB could also trivially implement the same thing, 
> > > with no new #ifdef'fery or other crud.
> > 
> > Except SLOB's emulation of slabs is so thin, it doesn't have the 
> > relevant information. We have a very small struct kmem_cache, which I 
> > suppose could contain a counter. But we don't have anything like the 
> > kmalloc slabs, so you'd only be getting half the picture anyway. The 
> > output of slabtop would simply be misleading because there are no 
> > underlying "slabs" in the first place.
> 
> I think SLOB/embedded is sufficiently special that a "no /proc/slabinfo" 
> restriction is perfectly supportable (for instance, only selectable if 
> CONFIG_EMBEDDED=y). If a SLOB user has any memory allocation problems, 
> it's worth going to the bigger allocators anyway, to get all the 
> debugging goodies.
> 
> btw., do you think it would be worth/possible to have a build mode for 
> SLUB that is acceptably close to the memory efficiency of SLOB (and 
> hence work towards unifying all three allocators into SLUB, in essence)?

There are three downsides to the slab-like approach: internal
fragmentation, under-utilized slabs, and pinning.

The first is the situation where we ask for a kmalloc of 33 bytes and
get 64. I think the average kmalloc wastes about 30% trying to fit into
power-of-two buckets. We can tune our buckets a bit, but I think in
general trying to back kmalloc with slabs is problematic. SLOB has a
2-byte granularity up to the point where it just hands things off to the
page allocator.
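
(If anyone wants to sanity-check that 30% figure, here's a userspace
back-of-the-envelope; the uniform size distribution and the 32-byte
minimum bucket are my assumptions, so treat the output as a ballpark:)

	#include <stdio.h>

	int main(void)
	{
		unsigned long waste = 0, total = 0;
		int size;

		for (size = 1; size <= 4096; size++) {
			unsigned long bucket = 32;	/* smallest kmalloc cache */

			while (bucket < size)
				bucket *= 2;
			waste += bucket - size;
			total += bucket;
		}
		/* prints roughly 25% of allocated / 33% of requested */
		printf("waste: %.1f%% of allocated, %.1f%% of requested\n",
		       100.0 * waste / total,
		       100.0 * waste / (total - waste));
		return 0;
	}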

If we tried to add more slabs to fill the gaps, we'd exacerbate the
second problem: because only one type of object can go on a slab, a lot
of slabs are half-full. SLUB's automerging of slabs helps some here, but
is still restricted to objects of the same size.

And finally, there's the whole pinning problem: we can have a cache like
the dcache grow very large and then contract, but still have most of its
slabs used by pinned dentries. Christoph has some rather hairy patches
to address this, but SLOB doesn't have much of a problem here - those
pages are still available to allocate other objects on.

By comparison, SLOB's big downsides are that it's not O(1) and it has a
single lock. But it's currently fast enough to keep up with SLUB on
kernel compiles on my 2G box and Nick had an allocator benchmark where
scalability didn't fall off until beyond 4 CPUs.

> right now we are far away from it - SLUB has an order of magnitude 
> larger .o than SLOB, even on UP. I'm wondering why that is so - SLUB's 
> data structures _are_ quite compact and could in theory be used in a 
> SLOB-alike way. Perhaps one problem is that much of SLUB's debugging 
> code is always built in?

I think we should probably just accept that it makes sense to have more
than one allocator. A 64MB single CPU machine is very, very different
than a 64TB 4096-CPU machine. On one of those, it probably makes some
sense to burn some memory for maximum scalability.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-03  8:52       ` Ingo Molnar
  2008-01-03 16:46         ` Matt Mackall
@ 2008-01-03 20:31         ` Christoph Lameter
  1 sibling, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-03 20:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Matt Mackall, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 3 Jan 2008, Ingo Molnar wrote:

> right now we are far away from it - SLUB has an order of magnitude 
> larger .o than SLOB, even on UP. I'm wondering why that is so - SLUB's 
> data structures _are_ quite compact and could in theory be used in a 
> SLOB-alike way. Perhaps one problem is that much of SLUB's debugging 
> code is always built in?

In embedded mode you can switch the debugging code off.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-03 16:46         ` Matt Mackall
@ 2008-01-04  2:21           ` Christoph Lameter
  2008-01-04  2:45             ` Andi Kleen
  2008-01-04  4:11             ` Matt Mackall
  0 siblings, 2 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-04  2:21 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Ingo Molnar, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 3 Jan 2008, Matt Mackall wrote:

> There are three downsides to the slab-like approach: internal
> fragmentation, under-utilized slabs, and pinning.
> 
> The first is the situation where we ask for a kmalloc of 33 bytes and
> get 64. I think the average kmalloc wastes about 30% trying to fit into
> power-of-two buckets. We can tune our buckets a bit, but I think in
> general trying to back kmalloc with slabs is problematic. SLOB has a
> 2-byte granularity up to the point where it just hands things off to the
> page allocator.

The 2-byte overhead of SLOB becomes a liability when it comes to correctly 
aligning power-of-two sized objects. SLOB has to add two bytes and then 
align the combined object (argh!). SLUB can align these without a 2-byte 
overhead. In some configurations this results in SLUB using even less 
memory than SLOB. See f.e. Pekka's test at 
http://marc.info/?l=linux-kernel&m=118405559214029&w=2

> If we tried to add more slabs to fill the gaps, we'd exacerbate the
> second problem: because only one type of object can go on a slab, a lot
> of slabs are half-full. SLUB's automerging of slabs helps some here, but
> is still restricted to objects of the same size.

The advantage of SLOB is to be able to put objects of multiple sizes into 
the same slab page. That advantage goes away once we have more than a few 
objects per slab because SLUB can store objects in a denser way than SLOB.
 
> And finally, there's the whole pinning problem: we can have a cache like
> the dcache grow very large and then contract, but still have most of its
> slabs used by pinned dentries. Christoph has some rather hairy patches
> to address this, but SLOB doesn't have much of a problem here - those
> pages are still available to allocate other objects on.

Well, if you just have a few dentries then they are likely all pinned. A 
large number of dentries will typically result in reclaimable slabs.
The slab defrag patchset not only deals with the dcache issue but provides 
similar solutions for inodes and buffer_heads. Support for other slabs that 
defragment can be added by providing two hooks per slab, as sketched below.
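
The two hooks are along these lines (a sketch from memory of the defrag
patchset's interface; the exact names and signatures there may differ):

	struct kmem_cache;	/* opaque here */

	/*
	 * get()  - pin the given objects so they cannot be freed while
	 *          we try to move them out of a sparsely used slab.
	 * kick() - relocate or evict the objects, then drop the
	 *          references that get() took.
	 */
	struct kmem_cache_defrag_ops {
		void *(*get)(struct kmem_cache *s, int nr, void **objects);
		void (*kick)(struct kmem_cache *s, int nr, void **objects,
			     void *private);
	};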

> By comparison, SLOB's big downsides are that it's not O(1) and it has a
> single lock. But it's currently fast enough to keep up with SLUB on
> kernel compiles on my 2G box and Nick had an allocator benchmark where
> scalability didn't fall off until beyond 4 CPUs.

Both SLOB and SLAB suffer from the single-lock problem. SLOB takes the lock 
for every item allocated; SLAB takes it for every nth item allocated. Given 
fast allocations from multiple processors, both will generate a bouncing 
cacheline. SLUB can take pages from the page allocator pools and allocate 
all objects from them without taking a lock.
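
Roughly, that fast path is just a per-CPU freelist pop, something like
this (heavily simplified sketch, not the real slab_alloc() from mm/slub.c):

	/* The freelist link lives inside each free object, so the hot
	 * path touches only this CPU's slab: no shared lock, no
	 * bouncing cacheline. */
	struct kmem_cache_cpu_sketch {
		void **freelist;	/* first free object on the cpu slab */
	};

	static void *fastpath_alloc(struct kmem_cache_cpu_sketch *c)
	{
		void **object = c->freelist;

		if (!object)
			return NULL;	/* cpu slab exhausted: go slow path */
		c->freelist = *object;	/* next free object */
		return object;
	}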

> > right now we are far away from it - SLUB has an order of magnitude 
> > larger .o than SLOB, even on UP. I'm wondering why that is so - SLUB's 
> > data structures _are_ quite compact and could in theory be used in a 
> > SLOB-alike way. Perhaps one problem is that much of SLUB's debugging 
> > code is always built in?
> 
> I think we should probably just accept that it makes sense to have more
> than one allocator. A 64MB single CPU machine is very, very different
> than a 64TB 4096-CPU machine. On one of those, it probably makes some
> sense to burn some memory for maximum scalability.

I still have trouble seeing that SLOB has much to offer: an embedded 
allocator that in many cases has more allocation overhead than the default 
one? OK, you still have advantages where kmalloc would round allocations up 
to the next power of two, and from combining different types of allocations 
in a single slab when the overall number of allocations is small. If one 
were to create a custom slab for the worst problems there, then even that 
may go away.

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04  2:21           ` Christoph Lameter
@ 2008-01-04  2:45             ` Andi Kleen
  2008-01-04  4:34               ` Matt Mackall
  2008-01-04  9:17               ` Peter Zijlstra
  2008-01-04  4:11             ` Matt Mackall
  1 sibling, 2 replies; 69+ messages in thread
From: Andi Kleen @ 2008-01-04  2:45 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Matt Mackall, Ingo Molnar, Linus Torvalds, Pekka Enberg,
	Hugh Dickins, Andi Kleen, Peter Zijlstra,
	Linux Kernel Mailing List

> I still have trouble seeing that SLOB has much to offer: an embedded 
> allocator that in many cases has more allocation overhead than the default 
> one? OK, you still have advantages where kmalloc would round allocations up 
> to the next power of two, and from combining different types of allocations 
> in a single slab when the overall number of allocations is small. If one 
> were to create a custom slab for the worst problems there, then even that 
> may go away.

I suspect it would be a good idea anyway to reevaluate the power-of-two
slabs. Perhaps a better distribution can be found based on some profiling?
I did profile kmalloc using a systemtap script some time ago; I don't
remember the results exactly, but IIRC it looked like it could be improved.

A long time ago I also had some code to let the network stack give hints
about its MTUs to slab, to create fitting slabs for packets. But that
was never really pushed forward because it turned out it didn't help
much for the most common 1.5K MTU -- always only two packets fit into
a page.

-Andi
> 

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04  2:21           ` Christoph Lameter
  2008-01-04  2:45             ` Andi Kleen
@ 2008-01-04  4:11             ` Matt Mackall
  2008-01-04 20:34               ` Christoph Lameter
  2008-01-05 16:21               ` Pekka J Enberg
  1 sibling, 2 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-04  4:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Ingo Molnar, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-03 at 18:21 -0800, Christoph Lameter wrote:
> On Thu, 3 Jan 2008, Matt Mackall wrote:
> 
> > There are three downsides to the slab-like approach: internal
> > fragmentation, under-utilized slabs, and pinning.
> > 
> > The first is the situation where we ask for a kmalloc of 33 bytes and
> > get 64. I think the average kmalloc wastes about 30% trying to fit into
> > power-of-two buckets. We can tune our buckets a bit, but I think in
> > general trying to back kmalloc with slabs is problematic. SLOB has a
> > 2-byte granularity up to the point where it just hands things off to the
> > page allocator.
> 
> The 2-byte overhead of SLOB becomes a liability when it comes to correctly 
> aligning power-of-two sized objects. SLOB has to add two bytes and then 
> align the combined object (argh!).

It does no such thing. Only kmallocs have 2-byte headers and kmalloc is
never aligned. RTFS, it's quite short.

> SLUB can align these without a 2-byte 
> overhead. In some configurations this results in SLUB using even less 
> memory than SLOB. See f.e. Pekka's test at 
> http://marc.info/?l=linux-kernel&m=118405559214029&w=2

Available memory after boot is not a particularly stable measurement and
not valid if there's memory pressure. At any rate, I wasn't able to
reproduce this.

> > If we tried to add more slabs to fill the gaps, we'd exacerbate the
> > second problem: because only one type of object can go on a slab, a lot
> > of slabs are half-full. SLUB's automerging of slabs helps some here, but
> > is still restricted to objects of the same size.
> 
> > The advantage of SLOB is to be able to put objects of multiple sizes into 
> > the same slab page. That advantage goes away once we have more than a few 
> > objects per slab because SLUB can store objects in a denser way than SLOB.

Ugh, Christoph. Can you please stop repeating this falsehood? I'm sick
and tired of debunking it. There is no overhead for any objects with
externally-known size. So unless SLUB actually has negative overhead,
this just isn't true.

> > And finally, there's the whole pinning problem: we can have a cache like
> > the dcache grow very large and then contract, but still have most of its
> > slabs used by pinned dentries. Christoph has some rather hairy patches
> > to address this, but SLOB doesn't have much of a problem here - those
> > pages are still available to allocate other objects on.
> 
> Well, if you just have a few dentries then they are likely all pinned. A 
> large number of dentries will typically result in reclaimable slabs.
> The slab defrag patchset not only deals with the dcache issue but provides 
> similar solutions for inodes and buffer_heads. Support for other slabs that 
> defragment can be added by providing two hooks per slab.

What's your point? Slabs have an inherent pinning problem that's ugly to
combat. SLOB doesn't.

> > By comparison, SLOB's big downsides are that it's not O(1) and it has a
> > single lock. But it's currently fast enough to keep up with SLUB on
> > kernel compiles on my 2G box and Nick had an allocator benchmark where
> > scalability didn't fall off until beyond 4 CPUs.
> 
> Both SLOB and SLAB suffer from the single-lock problem. SLOB takes the lock 
> for every item allocated; SLAB takes it for every nth item allocated. Given 
> fast allocations from multiple processors, both will generate a bouncing 
> cacheline. SLUB can take pages from the page allocator pools and allocate 
> all objects from them without taking a lock.

That's very nice, but SLOB already runs lockless on most of its target
machines by virtue of them being UP. Lock contention just isn't
interesting to SLOB. 

> > > right now we are far away from it - SLUB has an order of magnitude 
> > > larger .o than SLOB, even on UP. I'm wondering why that is so - SLUB's 
> > > data structures _are_ quite compact and could in theory be used in a 
> > > SLOB-alike way. Perhaps one problem is that much of SLUB's debugging 
> > > code is always built in?
> > 
> > I think we should probably just accept that it makes sense to have more
> > than one allocator. A 64MB single CPU machine is very, very different
> > than a 64TB 4096-CPU machine. On one of those, it probably makes some
> > sense to burn some memory for maximum scalability.
> 
> I still have trouble seeing that SLOB has much to offer: an embedded 
> allocator that in many cases has more allocation overhead than the 
> default one?

For the benefit of anyone who didn't read this the last few times I
rehashed this with you:

SLOB:
- internal overhead for kmalloc is 2 bytes (or 3 for odd-sized objects)
- internal overhead for kmem_cache_alloc is 0 bytes (or 1 for odd-sized
objects)
- any unused space down to 2 bytes on any SLOB page can be allocated by
any object that will fit

SLAB/SLUB:
- internal overhead for kmalloc averages about 30%
- internal overhead for kmem_cache_alloc is (slab-size % object-size) /
objects-per-slab, which can be quite large for things like SKBs and
task_structs and is made worse by alignment
- unused space on slabs can't be allocated by objects of other
sizes/types

The only time SLAB/SLUB can win in efficiency (assuming they're using
the same page size) is when all your kmallocs just happen to be powers
of two. Which, assuming any likely distribution of string or other
object sizes, isn't often.
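
A worked example, using the numbers above (the granularity and rounding 
details are from the lists, the arithmetic is mine): for a 33-byte 
kmalloc, SLAB/SLUB round up to the 64-byte bucket and waste 31 bytes per 
object; SLOB prepends its 2-byte header, rounds the resulting 35 bytes up 
to 36 for its 2-byte granularity (odd size, hence 3 bytes of overhead), 
and the remaining space on the page stays allocatable by anything else 
that fits.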

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04  2:45             ` Andi Kleen
@ 2008-01-04  4:34               ` Matt Mackall
  2008-01-04  9:17               ` Peter Zijlstra
  1 sibling, 0 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-04  4:34 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Pekka Enberg,
	Hugh Dickins, Peter Zijlstra, Linux Kernel Mailing List


On Fri, 2008-01-04 at 03:45 +0100, Andi Kleen wrote:
> > I still have trouble seeing that SLOB has much to offer: an embedded 
> > allocator that in many cases has more allocation overhead than the default 
> > one? OK, you still have advantages where kmalloc would round allocations up 
> > to the next power of two, and from combining different types of allocations 
> > in a single slab when the overall number of allocations is small. If one 
> > were to create a custom slab for the worst problems there, then even that 
> > may go away.
> 
> I suspect it would be a good idea anyway to reevaluate the power-of-two
> slabs. Perhaps a better distribution can be found based on some profiling?
> I did profile kmalloc using a systemtap script some time ago; I don't
> remember the results exactly, but IIRC it looked like it could be improved.

We can roughly group kmalloced objects into two classes:

a) intrinsically variable-sized (strings, etc.)
b) fixed-sized objects that nonetheless don't have their own caches

For (a), we can expect the size distribution to be approximately a
scale-invariant power distribution. So buckets of the form n**x make a
fair amount of sense. We might consider n less than 2 though.
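
As a toy illustration of n < 2 (the factor 3/2 here is an arbitrary
choice of mine, purely to show the shape of such a series):

	#include <stdio.h>

	/* Geometric buckets with factor 3/2 instead of 2, rounded up
	 * to 8-byte alignment: 32, 48, 72, 112, 168, 256, 384, ... */
	int main(void)
	{
		unsigned int size = 32;

		while (size <= 4096) {
			printf("%u\n", size);
			size = (size * 3 / 2 + 7) & ~7u;
		}
		return 0;
	}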

For objects of type (b) that occur in significant numbers, well, we
might just want to add more caches. SLUB's merging of same-sized caches
will reduce the pain here.

> A long time ago I also had some code to let the network stack give hints
> about its MTUs to slab, to create fitting slabs for packets. But that
> was never really pushed forward because it turned out it didn't help
> much for the most common 1.5K MTU -- always only two packets fit into
> a page.

Yes, that and task_struct kinda make you want to cry. Large-order
SLAB/SLUB/SLOB would go a long way to fix that, but has its own problems
of course.

One could imagine restructuring things so that the buddy allocator only
extended down to 64k or so and below that, gfp and friends called
through SLAB/SLUB/SLOB.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04  2:45             ` Andi Kleen
  2008-01-04  4:34               ` Matt Mackall
@ 2008-01-04  9:17               ` Peter Zijlstra
  2008-01-04 20:37                 ` Christoph Lameter
  1 sibling, 1 reply; 69+ messages in thread
From: Peter Zijlstra @ 2008-01-04  9:17 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Christoph Lameter, Matt Mackall, Ingo Molnar, Linus Torvalds,
	Pekka Enberg, Hugh Dickins, Linux Kernel Mailing List,
	William Lee Irwin III

On Fri, 2008-01-04 at 03:45 +0100, Andi Kleen wrote:
> > I still have trouble seeing that SLOB has much to offer: an embedded 
> > allocator that in many cases has more allocation overhead than the default 
> > one? OK, you still have advantages where kmalloc would round allocations up 
> > to the next power of two, and from combining different types of allocations 
> > in a single slab when the overall number of allocations is small. If one 
> > were to create a custom slab for the worst problems there, then even that 
> > may go away.
> 
> I suspect it would be a good idea anyway to reevaluate the power-of-two
> slabs. Perhaps a better distribution can be found based on some profiling?
> I did profile kmalloc using a systemtap script some time ago; I don't
> remember the results exactly, but IIRC it looked like it could be improved.

I remember wli trying to work out a series that had minimal
fragmentation. IIRC he was mixing a Fibonacci series with the power-of-two
series.

Bill, do you remember getting anywhere?


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04  4:11             ` Matt Mackall
@ 2008-01-04 20:34               ` Christoph Lameter
  2008-01-04 20:55                 ` Matt Mackall
  2008-01-05 16:21               ` Pekka J Enberg
  1 sibling, 1 reply; 69+ messages in thread
From: Christoph Lameter @ 2008-01-04 20:34 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Ingo Molnar, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 3 Jan 2008, Matt Mackall wrote:

> > The advantage of SLOB is to be able to put objects of multiple sizes into 
> > the same slab page. That advantage goes away once we have more than a few 
> > objects per slab because SLUB can store objects in a denser way than SLOB.
> 
> Ugh, Christoph. Can you please stop repeating this falsehood? I'm sick
> and tired of debunking it. There is no overhead for any objects with
> externally-known size. So unless SLUB actually has negative overhead,
> this just isn't true.

Hmmm.. Seems that I still do not understand how it is possible then to mix 
objects of different sizes in the same slab page. Somehow the allocator 
needs to know the size. So it is not possible in SLOB to use 
kmem_cache_alloc on an object and then free it using kfree?

> > Well, if you just have a few dentries then they are likely all pinned. A 
> > large number of dentries will typically result in reclaimable slabs.
> > The slab defrag patchset not only deals with the dcache issue but provides 
> > similar solutions for inodes and buffer_heads. Support for other slabs that 
> > defragment can be added by providing two hooks per slab.
> 
> > What's your point? Slabs have an inherent pinning problem that's ugly to
> > combat. SLOB doesn't.

I thought we were talking about pinning problems of dentries. How are 
slabs pinned and why does it matter? If slabs are pinned by a dentry that 
is pinned, then the slab page will be filled up with other dentries that 
are not pinned. The slab defrag approach causes a coalescing of objects 
around slabs that have pinned objects.

> SLOB:
> - internal overhead for kmalloc is 2 bytes (or 3 for odd-sized objects)

Well, that increases if you need to align the object. For kmalloc this 
usually means cache-line aligning a power-of-two object, right? So we have 
a cacheline's worth of overhead?

> - internal overhead for kmem_cache_alloc is 0 bytes (or 1 for odd-sized
> objects)

You are not aligning to a double word boundary? This will create issues on 
certain platforms.

> SLAB/SLUB
> - internal overhead for kmalloc averages about 30%

I think that is valid for a random object size distribution?

> - internal overhead for kmem_cache_alloc is (slab-size % object-size) /
> objects-per-slab, which can be quite large for things like SKBs and
> task_structs and is made worse by alignment

Good, so SLOB can fit small objects into those holes.

The calculation for SLAB is different since it also typically places its 
management structure in the slab. The management structure needs at least 
2 bytes per object.

So the per-object overhead in SLAB is

((slab-size - management-structure-overhead) % object-size) / objects-per-slab

> The only time SLAB/SLUB can win in efficiency (assuming they're using
> the same page size) is when all your kmallocs just happen to be powers
> of two. Which, assuming any likely distribution of string or other
> object sizes, isn't often.

In case of SLAB that is true. In case of SLUB we could convert the 
kmallocs to kmem_cache_alloc. The newly created slab would in all 
likelihood be an alias of an already existing structure and thus be 
essentially free. In that fashion SLUB can (in a limited way) put objects 
for different slab caches into the same slab page too.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04  9:17               ` Peter Zijlstra
@ 2008-01-04 20:37                 ` Christoph Lameter
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-04 20:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andi Kleen, Matt Mackall, Ingo Molnar, Linus Torvalds,
	Pekka Enberg, Hugh Dickins, Linux Kernel Mailing List,
	William Lee Irwin III

On Fri, 4 Jan 2008, Peter Zijlstra wrote:

> I remember wli trying to work out a series that had minimal
> fragmentation. IIRC he was mixing a Fibonacci series with the power-of-two
> series.
> 
> Bill, do you remember getting anywhere?

I tried various approaches to reduce the overhead that power-of-two slab 
sizes cause in SLUB during the initial comparison of memory use with SLOB. 
This involved creating slabs in 32-byte increments and trying to add a few 
additional extra slabs in between the power-of-two sizes. None of that led 
to convincing results.
 
I found that the SLAB scheme with power-of-two caches and two extra ones 
(96 and 192 bytes) was optimal.
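
For reference, that scheme yields general-purpose caches along these
lines (list reconstructed from the description above, not copied from
the kernel sources):

	static const unsigned int kmalloc_sizes[] = {
		32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096,
	};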

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04 20:34               ` Christoph Lameter
@ 2008-01-04 20:55                 ` Matt Mackall
  2008-01-04 21:36                   ` Christoph Lameter
  0 siblings, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-04 20:55 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Ingo Molnar, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Fri, 2008-01-04 at 12:34 -0800, Christoph Lameter wrote:
> On Thu, 3 Jan 2008, Matt Mackall wrote:
> 
> > > The advantage of SLOB is to be able to put objects of multiple sizes into 
> > > the same slab page. That advantage goes away once we have more than a few 
> > > objects per slab because SLUB can store objects in a denser way than SLOB.
> > 
> > Ugh, Christoph. Can you please stop repeating this falsehood? I'm sick
> > and tired of debunking it. There is no overhead for any objects with
> > externally-known size. So unless SLUB actually has negative overhead,
> > this just isn't true.
> 
> Hmmm.. Seems that I still do not understand how it is possible then to mix 
> objects of different sizes in the same slab page. Somehow the allocator 
> needs to know the size. So it is not possible in SLOB to use 
> kmem_cache_alloc on an object and then free it using kfree?

Indeed. Mismatching allocator and deallocator is a bug, even if it happens
to work for SLAB/SLUB.

> > > Well, if you just have a few dentries then they are likely all pinned. A 
> > > large number of dentries will typically result in reclaimable slabs.
> > > The slab defrag patchset not only deals with the dcache issue but provides 
> > > similar solutions for inodes and buffer_heads. Support for other slabs that 
> > > defragment can be added by providing two hooks per slab.
> > 
> > What's your point? Slabs have an inherent pinning problem that's ugly to
> > combat. SLOB doesn't.
> 
> I thought we were talking about pinning problems of dentries. How are 
> slabs pinned and why does it matter?

If a slab contains a dentry that is pinned, it can only be used for
other dentries and cannot be recycled for other allocations. If updatedb
comes along and fills memory with dentry slabs, many of which get
permanently pinned, then you have wasted memory.

> If slabs are pinned by a dentry that 
> is pinned, then the slab page will be filled up with other dentries that 
> are not pinned. The slab defrag approach causes a coalescing of objects 
> around slabs that have pinned objects.

Yes. You've got (most of) a fix. It's overly-complicated and SLOB
doesn't need it. How many ways do I need to say this?


> > SLOB:
> > - internal overhead for kmalloc is 2 bytes (or 3 for odd-sized objects)
> 
> Well, that increases if you need to align the object. For kmalloc this 
> usually means cache-line aligning a power-of-two object, right? So we have 
> a cacheline's worth of overhead?

a) alignment doesn't increase memory use because the memory before the
object is still allocatable
b) kmallocs aren't aligned!

> > - internal overhead for kmem_cache_alloc is 0 bytes (or 1 for odd-sized
> > objects)
> 
> You are not aligning to a double word boundary? This will create issues on 
> certain platforms.

The alignment minimum varies per arch. SLOB can go down to 2 bytes.

> > SLAB/SLUB
> > - internal overhead for kmalloc averages about 30%
> 
> I think that is valid for a random object size distribution?

It's a measurement from memory. But it roughly agrees with what you'd
expect from a random distribution.

> > The only time SLAB/SLUB can win in efficiency (assuming they're using
> > the same page size) is when all your kmallocs just happen to be powers
> > of two. Which, assuming any likely distribution of string or other
> > object sizes, isn't often.
> 
> In case of SLAB that is true. In case of SLUB we could convert the 
> kmallocs to kmem_cache_alloc. The newly created slab would in all 
> likelihood be an alias of an already existing structure and thus be 
> essentially free. In that fashion SLUB can (in a limited way) put objects 
> for different slab caches into the same slab page too.

Uh, no. You'd need a new slab for every multiple of 2 bytes. And then
you'd just be making the underused and pinned slab problems worse.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04 20:55                 ` Matt Mackall
@ 2008-01-04 21:36                   ` Christoph Lameter
  2008-01-04 22:30                     ` Matt Mackall
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Lameter @ 2008-01-04 21:36 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Ingo Molnar, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Fri, 4 Jan 2008, Matt Mackall wrote:

> > needs to know the size. So it is not possible in SLOB to use 
> > kmem_cache_alloc on an object and then free it using kfree?
> 
> Indeed. Mismatching allocator and deallocator is a bug, even if it happens
> to work for SLAB/SLUB.

Was the kernel audited for this case? I saw some rather scary uses of slab 
objects for I/O purposes come up during SLUB development.

> Yes. You've got (most of) a fix. It's overly-complicated and SLOB
> doesn't need it. How many ways do I need to say this?

So SLOB is then not able to compact memory after an updatedb run? The 
memory must stay dedicated to slab uses. SLOB memory can be filled 
up with objects of other slab types, but it cannot be reused for page 
cache, anonymous pages, etc. With SLUB, slab defrag frees memory back to 
the page allocator. That is *not* provided by SLOB, though it could be 
made to work.

> > Well, that increases if you need to align the object. For kmalloc this 
> > usually means cache-line aligning a power-of-two object, right? So we 
> > have a cacheline's worth of overhead?
> 
> a) alignment doesn't increase memory use because the memory before the
> object is still allocatable

OK, so we end up with lots of small holes on a list that has to be scanned 
to find free memory?

> b) kmallocs aren't aligned!

From mm/slob.c:

#ifndef ARCH_KMALLOC_MINALIGN
#define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long)
#endif

void *__kmalloc_node(size_t size, gfp_t gfp, int node)
{
        unsigned int *m;
        int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);

(IMHO it would be safer to set the minimum default to __alignof__(unsigned 
long long) like SLAB/SLUB).



OK. So let's try a worst-case scenario. If we do a 128-byte kmalloc then we 
can allocate the following number of objects from one 4k slab:

SLUB 32	    (all memory of the 4k page is used for 128-byte objects)
SLAB 29/30  (management structure occupies first two/three objects)
SLOB 30(?)  (alignment results in the object being 136 bytes of effective
		size; we have 16 bytes left over that could be used for a
		very small allocation. Right?)

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04 21:36                   ` Christoph Lameter
@ 2008-01-04 22:30                     ` Matt Mackall
  2008-01-05 20:16                       ` Christoph Lameter
  0 siblings, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-04 22:30 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Ingo Molnar, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Fri, 2008-01-04 at 13:36 -0800, Christoph Lameter wrote:
> OK. So let's try a worst-case scenario. If we do a 128-byte kmalloc then we 
> can allocate the following number of objects from one 4k slab:
> 
> SLUB 32	    (all memory of the 4k page is used for 128-byte objects)
> SLAB 29/30  (management structure occupies first two/three objects)
> SLOB 30(?)  (alignment results in the object being 136 bytes of effective
> 		size; we have 16 bytes left over that could be used for a
> 		very small allocation. Right?)

Don't know how you got to 136; the minimum alignment is 4 on x86. But I
already said in my last email that SLUB would win for the special case
of power-of-two allocations. But as long as we're looking at worst
cases, let's consider an alloc of 257 bytes...

SLUB  8 (1016 bytes wasted)
SLOB 15 (105 bytes wasted, with 136 bytes still usable)

Can we be done with this now, please?

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04  4:11             ` Matt Mackall
  2008-01-04 20:34               ` Christoph Lameter
@ 2008-01-05 16:21               ` Pekka J Enberg
  2008-01-05 17:14                 ` Andi Kleen
                                   ` (2 more replies)
  1 sibling, 3 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-05 16:21 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List, zanussi

Hi Matt,

On Thu, 3 Jan 2008, Matt Mackall wrote:
> > SLUB can align these without a 2-byte 
> > overhead. In some configurations this results in SLUB using even less 
> > memory than SLOB. See f.e. Pekka's test at 
> > http://marc.info/?l=linux-kernel&m=118405559214029&w=2
> 
> Available memory after boot is not a particularly stable measurement and
> not valid if there's memory pressure. At any rate, I wasn't able to
> reproduce this.

So, I have this silly memory profiler derived from the kleak patches by 
the relayfs people and would love to try it out on an embedded workload 
where SLUB's memory footprint is terrible. Any suggestions?

			Pekka

---
 Documentation/kmemprof/Makefile    |    5 +
 Documentation/kmemprof/kmemprof.c  |  136 +++++++++++++++++++++++++++++++++++++
 Documentation/kmemprof/kmemprof.pl |  125 ++++++++++++++++++++++++++++++++++
 include/linux/slab.h               |   16 ++++
 include/linux/slub_def.h           |   27 +++++--
 lib/Kconfig.debug                  |    8 ++
 mm/Makefile                        |    2 
 mm/kmemprof.c                      |  124 +++++++++++++++++++++++++++++++++
 mm/slub.c                          |   34 +++++++--
 9 files changed, 463 insertions(+), 14 deletions(-)

Index: linux-2.6/lib/Kconfig.debug
===================================================================
--- linux-2.6.orig/lib/Kconfig.debug	2008-01-05 12:13:46.000000000 +0200
+++ linux-2.6/lib/Kconfig.debug	2008-01-05 18:11:53.000000000 +0200
@@ -173,6 +173,14 @@
 	  off in a kernel built with CONFIG_SLUB_DEBUG_ON by specifying
 	  "slub_debug=-".
 
+config KMEMPROF
+	bool "Kernel memory profiling support"
+	depends on SLUB
+	default n
+	help
+	  Say Y here to have the kernel track every memory allocation in
+	  the kernel.
+
 config DEBUG_PREEMPT
 	bool "Debug preemptible kernel"
 	depends on DEBUG_KERNEL && PREEMPT && (TRACE_IRQFLAGS_SUPPORT || PPC64)
Index: linux-2.6/mm/Makefile
===================================================================
--- linux-2.6.orig/mm/Makefile	2008-01-05 12:13:46.000000000 +0200
+++ linux-2.6/mm/Makefile	2008-01-05 18:12:12.000000000 +0200
@@ -30,4 +30,4 @@
 obj-$(CONFIG_MIGRATION) += migrate.o
 obj-$(CONFIG_SMP) += allocpercpu.o
 obj-$(CONFIG_QUICKLIST) += quicklist.o
-
+obj-$(CONFIG_KMEMPROF) += kmemprof.o
Index: linux-2.6/include/linux/slab.h
===================================================================
--- linux-2.6.orig/include/linux/slab.h	2008-01-05 12:13:46.000000000 +0200
+++ linux-2.6/include/linux/slab.h	2008-01-05 18:13:12.000000000 +0200
@@ -95,6 +95,22 @@
 void kfree(const void *);
 size_t ksize(const void *);
 
+#ifdef CONFIG_KMEMPROF
+void kmem_track_alloc(void *, const void *, unsigned long, unsigned long, gfp_t);
+void kmem_track_free(void *, const void *);
+#else
+static inline void kmem_track_alloc(void *call_site, const void *p,
+				    unsigned long nr_req,
+				    unsigned long nr_allocated,
+				    gfp_t flags)
+{
+}
+
+static inline void kmem_track_free(void *call_site, const void *p)
+{
+}
+#endif /* CONFIG_KMEMPROF */
+
 /*
  * Allocator specific definitions. These are mainly used to establish optimized
  * ways to convert kmalloc() calls to kmem_cache_alloc() invocations by
Index: linux-2.6/include/linux/slub_def.h
===================================================================
--- linux-2.6.orig/include/linux/slub_def.h	2008-01-05 12:13:46.000000000 +0200
+++ linux-2.6/include/linux/slub_def.h	2008-01-05 12:14:15.000000000 +0200
@@ -162,20 +162,37 @@
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t);
 void *__kmalloc(size_t size, gfp_t flags);
 
+static __always_inline void *__kmalloc_pagealloc(size_t size, gfp_t flags)
+{
+	unsigned long order = get_order(size);
+	void *p;
+
+	p = (void *)__get_free_pages(flags | __GFP_COMP, order);
+	kmem_track_alloc(__builtin_return_address(0), p, size,
+			 PAGE_SIZE << order, flags);
+	return p;
+}
+
 static __always_inline void *kmalloc(size_t size, gfp_t flags)
 {
 	if (__builtin_constant_p(size)) {
 		if (size > PAGE_SIZE / 2)
-			return (void *)__get_free_pages(flags | __GFP_COMP,
-							get_order(size));
+			return __kmalloc_pagealloc(size, flags);
 
 		if (!(flags & SLUB_DMA)) {
 			struct kmem_cache *s = kmalloc_slab(size);
+			void *p;
 
-			if (!s)
+			if (!s) {
+				kmem_track_alloc(__builtin_return_address(0),
+						 ZERO_SIZE_PTR, size, 0,
+						 flags);
 				return ZERO_SIZE_PTR;
-
-			return kmem_cache_alloc(s, flags);
+			}
+			p = kmem_cache_alloc(s, flags);
+			kmem_track_alloc(__builtin_return_address(0), p, size,
+					 s->size, flags);
+			return p;
 		}
 	}
 	return __kmalloc(size, flags);
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2008-01-05 12:13:46.000000000 +0200
+++ linux-2.6/mm/slub.c	2008-01-05 15:45:12.000000000 +0200
@@ -1566,14 +1566,22 @@
 
 void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
 {
-	return slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
+	void *p;
+
+	p = slab_alloc(s, gfpflags, -1, __builtin_return_address(0));
+	kmem_track_alloc(__builtin_return_address(0), p, s->size, s->size, gfpflags);
+	return p;
 }
 EXPORT_SYMBOL(kmem_cache_alloc);
 
 #ifdef CONFIG_NUMA
 void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node)
 {
-	return slab_alloc(s, gfpflags, node, __builtin_return_address(0));
+	void *p;
+
+	p = slab_alloc(s, gfpflags, node, __builtin_return_address(0));
+	kmem_track_alloc(__builtin_return_address(0), p, s->size, s->size, gfpflags);
+	return p;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);
 #endif
@@ -1670,6 +1678,8 @@
 {
 	struct page *page;
 
+	kmem_track_free(__builtin_return_address(0), x);
+
 	page = virt_to_head_page(x);
 
 	slab_free(s, page, x, __builtin_return_address(0));
@@ -2516,17 +2526,20 @@
 void *__kmalloc(size_t size, gfp_t flags)
 {
 	struct kmem_cache *s;
+	void *p;
 
 	if (unlikely(size > PAGE_SIZE / 2))
-		return (void *)__get_free_pages(flags | __GFP_COMP,
-							get_order(size));
+		return __kmalloc_pagealloc(size, flags);
 
 	s = get_slab(size, flags);
 
 	if (unlikely(ZERO_OR_NULL_PTR(s)))
 		return s;
 
-	return slab_alloc(s, flags, -1, __builtin_return_address(0));
+	p = slab_alloc(s, flags, -1, __builtin_return_address(0));
+	kmem_track_alloc(__builtin_return_address(0), p, size, s->size,
+			 flags);
+	return p;
 }
 EXPORT_SYMBOL(__kmalloc);
 
@@ -2534,17 +2547,20 @@
 void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
 	struct kmem_cache *s;
+	void *p;
 
 	if (unlikely(size > PAGE_SIZE / 2))
-		return (void *)__get_free_pages(flags | __GFP_COMP,
-							get_order(size));
+		return __kmalloc_pagealloc(size, flags);
 
 	s = get_slab(size, flags);
 
 	if (unlikely(ZERO_OR_NULL_PTR(s)))
 		return s;
 
-	return slab_alloc(s, flags, node, __builtin_return_address(0));
+	p = slab_alloc(s, flags, node, __builtin_return_address(0));
+	kmem_track_alloc(__builtin_return_address(0), p, size, s->size,
+			 flags);
+	return p;
 }
 EXPORT_SYMBOL(__kmalloc_node);
 #endif
@@ -2593,6 +2609,8 @@
 {
 	struct page *page;
 
+	kmem_track_free(__builtin_return_address(0), x);
+
 	if (unlikely(ZERO_OR_NULL_PTR(x)))
 		return;
 
Index: linux-2.6/Documentation/kmemprof/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/Documentation/kmemprof/Makefile	2008-01-05 18:11:25.000000000 +0200
@@ -0,0 +1,5 @@
+default:
+	$(CC) -Wall kmemprof.c -o kmemprof -lpthread
+
+clean:
+	rm -rf kmemprof cpu*.out kmemprof.all
Index: linux-2.6/Documentation/kmemprof/kmemprof.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/Documentation/kmemprof/kmemprof.c	2008-01-05 18:11:33.000000000 +0200
@@ -0,0 +1,136 @@
+#include <errno.h>
+#include <fcntl.h>
+#include <limits.h>
+#include <pthread.h>
+#include <poll.h>
+#include <signal.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
+static void panic(const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	vprintf(fmt, args);
+	va_end(args);
+	exit(EXIT_FAILURE);
+}
+
+static void write_str(const char *filename, const char *value)
+{
+	int fd;
+
+	fd = open(filename, O_RDWR);
+	if (fd < 0)
+		panic("Could not open() file %s: %s\n", filename, strerror(errno));
+
+	if (write(fd, value, strlen(value)) < 0)
+		panic("Could not write() to file %s: %s\n", filename, strerror(errno));
+
+	close(fd);
+}
+
+static int open_channel(int cpu)
+{
+	char filename[PATH_MAX];
+	int fd;
+
+	sprintf(filename, "/debug/kmemprof/cpu%d", cpu);
+	fd = open(filename, O_RDONLY | O_NONBLOCK);
+	if (fd < 0)
+		panic("Could not open() file %s: %s\n", filename, strerror(errno));
+	return fd;
+}
+
+static int open_log(int cpu)
+{
+	char filename[PATH_MAX];
+	int fd;
+
+	sprintf(filename, "cpu%d.out", cpu);
+	fd = open(filename, O_CREAT | O_RDWR | O_TRUNC,
+			S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+	if (fd < 0)
+		panic("Could not open() file %s: %s\n", filename, strerror(errno));
+	return fd;
+}
+
+static void *reader_thread(void *data)
+{
+	unsigned long cpu = (unsigned long) data;
+	struct pollfd pollfd;
+	int relay_fd, log_fd;
+	char buf[4096];
+	int retval;
+
+	relay_fd = open_channel(cpu);
+	log_fd = open_log(cpu);
+
+	do {
+		pollfd.fd = relay_fd;
+		pollfd.events = POLLIN;
+		retval = poll(&pollfd, 1, 1);
+		if (retval < 0)
+			panic("poll() failed: %s\n", strerror(errno));
+
+		retval = read(relay_fd, buf, 4096);
+		if (!retval)
+			continue;
+		if (retval < 0) {
+			if (errno == EAGAIN)
+				continue;
+			perror("read");
+			break;
+		}
+		if (write(log_fd, buf, retval) < 0)
+			panic("Could not write() for cpu %lu: %s\n", cpu, strerror(errno));
+	} while (1);
+
+	return NULL;
+}
+
+int main(int argc, char *argv[])
+{
+	unsigned long nr_cpus;
+	pthread_t *readers;
+	sigset_t signals;
+	unsigned long i;
+	int signal;
+
+	sigemptyset(&signals);
+	sigaddset(&signals, SIGINT);
+	sigaddset(&signals, SIGTERM);
+	pthread_sigmask(SIG_BLOCK, &signals, NULL);
+
+	nr_cpus = sysconf(_SC_NPROCESSORS_ONLN);
+
+	readers = calloc(nr_cpus, sizeof(*readers));
+	if (!readers)
+		panic("out of memory\n");
+
+	for (i = 0; i < nr_cpus; i++) {
+		int err;
+
+		err = pthread_create(&readers[i], NULL, reader_thread,
+				(void *) i);
+		if (err)
+			panic("Could not pthread_create(): %s\n", strerror(errno));
+	}
+
+	write_str("/debug/kmemprof/enabled", "1");
+	printf("Logging... Press Control-C to stop.\n");
+
+	while (sigwait(&signals, &signal) == 0) {
+		if (signal == SIGINT || signal == SIGTERM)
+			break;
+	}
+	write_str("/debug/kmemprof/enabled", "0");
+
+	return EXIT_SUCCESS;
+}
Index: linux-2.6/Documentation/kmemprof/kmemprof.pl
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/Documentation/kmemprof/kmemprof.pl	2008-01-05 18:11:25.000000000 +0200
@@ -0,0 +1,125 @@
+#!/usr/bin/perl
+
+$tracefile = "kmemprof.all";
+$symbolfile = "/proc/kallsyms";
+
+my $nr_allocs;
+my $nr_frees;
+
+use constant KMALLOC_ID => 0x1;
+use constant KFREE_ID => 0x2;
+
+open(SYMBOLFILE, $symbolfile) or die "can't open $symbolfile: $!";
+while (<SYMBOLFILE>) {
+    chomp;
+    @fields = split;
+    $symbols{hex $fields[0]} = $fields[2];
+}
+@sorted_keys = sort { $a <=> $b } keys(%symbols);
+
+system("cat cpu* > $tracefile");
+open(TRACEFILE, $tracefile) or die "can't open $tracefile: $!";
+binmode(TRACEFILE);
+while (read(TRACEFILE, $buf, 8)) {
+    ($eventid) = unpack("C", $buf);
+    read(TRACEFILE, $buf, 40);
+    if ($eventid == KMALLOC_ID) {
+	($caller, $addr, $nr_req, $nr_alloc) = unpack("QQQQ", $buf);
+#	printf("%x %x %x %d %d\n", $eventid, $addr, $caller, $nr_req, $nr_alloc);
+	$nr_allocs++;
+	$bytes_requested+=$nr_req;
+	$bytes_allocated+=$nr_alloc;
+	$nr_allocs_by_caller{$caller}++;
+	$nr_bytes_requested_by_caller{$caller}+=$nr_req;
+	$nr_bytes_allocated_by_caller{$caller}+=$nr_alloc;
+	$nr_bytes_wasted_by_caller{$caller}+=($nr_alloc-$nr_req);
+	$alloc_addrs{$addr} = $caller;
+    } else {
+	($caller, $addr) = unpack("QQ", $buf);
+#	printf("%x %x %x %d %d\n", $eventid, $addr, $caller, $objsize);
+	$nr_frees++;
+	$nr_frees_by_caller{$caller}++;
+	$nr_bytes_freed_by_caller{$caller}+=0;
+	$free_addrs{$addr} = $caller;
+    }
+}
+summarize();
+
+sub lookup_symbol {
+    my ($caller) = @_;
+    $symbol = $cached_callers{$caller};
+    if ($symbol) {
+	return $symbol;
+    }
+    for($i = 0; $i < scalar(@sorted_keys) - 1; $i++) {
+	if (($caller >= $sorted_keys[$i]) && ($caller <= $sorted_keys[$i+1])) {
+	    $offset = $caller - $sorted_keys[$i];
+	    $symbol = sprintf("%s+0x%x", $symbols{$sorted_keys[$i]}, $offset);
+	    $cached_callers{$caller} = $symbol;
+	    last;
+	}
+    }
+    if (!$symbol) {
+	$symbol = "unknown";
+    }
+
+    return $symbol;
+}
+
+sub summarize {
+    print "Total number of allocations: $nr_allocs\n";
+    print "Total number of frees: $nr_frees\n";
+
+    print "\nTotal bytes requested: $bytes_requested [$bytes_allocated allocated]\n";
+    printf("Total bytes wasted: %d\n", $bytes_allocated-$bytes_requested);
+
+    print "\nTotal number of allocations by caller:\n";
+    while (($caller, $count) = each %nr_allocs_by_caller) {
+	$symbol = lookup_symbol($caller);
+	printf("  %x [%s]: %d\n", $caller, $symbol, $count);
+    }
+    print "\nTotal number of frees by caller:\n";
+    while (($caller, $count) = each %nr_frees_by_caller) {
+	$symbol = lookup_symbol($caller);
+	printf("  %x [%s]: %d\n", $caller, $symbol, $count);
+    }
+
+    print "\nTotal bytes requested by caller:\n";
+    while (($caller, $count) = each %nr_bytes_requested_by_caller) {
+	$symbol = lookup_symbol($caller);
+	printf("  %x [%s]: %d [%d]\n", $caller, $symbol, $count, $nr_bytes_allocated_by_caller{$caller});
+    }
+    print "\nTotal bytes wasted by caller:\n";
+    while (($caller, $count) = each %nr_bytes_wasted_by_caller) {
+	$symbol = lookup_symbol($caller);
+	printf("  %x [%s]: %d [average %d]\n", $caller, $symbol, $count, $count/$nr_allocs_by_caller{$caller});
+    }
+#    print "\nTotal bytes freed by caller:\n";
+#    while (($caller, $count) = each %nr_bytes_freed_by_caller) {
+#	$symbol = lookup_symbol($caller);
+#	printf("  %x [%s]: %d\n", $caller, $symbol, $count);
+#    }
+
+    print "\nUnfreed number of allocations:\n";
+    while (($addr, $caller) = each %alloc_addrs) {
+	if (!$free_addrs{$addr}) {
+	    $symbol = lookup_symbol($caller);
+	    $unfreed_nr_allocs{$symbol}++;
+	}
+    }
+    while (($symbol, $count) = each %unfreed_nr_allocs) {
+	    printf("  %s: %d\n", $symbol, $count);
+    }
+    print "\nUnmalloced number of frees:\n";
+    while (($addr, $caller) = each %free_addrs) {
+	if (!$alloc_addrs{$addr}) {
+	    $symbol = lookup_symbol($caller);
+	    $unmalloced_nr_frees{$symbol}++;
+	}
+    }
+    while (($symbol, $count) = each %unmalloced_nr_frees) {
+	    printf("  %s: %d\n", $symbol, $count);
+    }
+    close(TRACEFILE);
+    close(SYMBOLFILE);
+}
Index: linux-2.6/mm/kmemprof.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6/mm/kmemprof.c	2008-01-05 17:08:56.000000000 +0200
@@ -0,0 +1,124 @@
+/*
+ * Copyright (C) 2008  Pekka Enberg
+ *
+ * This file is released under the GPL version 2.
+ */
+
+#include <linux/debugfs.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/relay.h>
+#include <linux/slab.h>
+
+static struct rchan *kmemprof_chan;
+static u32 kmemprof_enabled;
+
+/*
+ * This is the user-space visible ABI for kmem events.
+ */
+enum kmem_event_id {
+	KMEM_ALLOC	= 0x01,
+	KMEM_FREE	= 0x02,
+};
+
+struct kmem_event {
+	u64	event_id;
+	u64	call_site;
+	u64	ptr;
+	u64	nr_req;
+	u64	nr_alloc;
+	u64	gfp_flags;
+};
+
+static void kmemprof_log_event(struct kmem_event *e)
+{
+	relay_write(kmemprof_chan, e, sizeof(*e));
+}
+
+void kmem_track_alloc(void *call_site, const void *p, unsigned long nr_req,
+		      unsigned long nr_alloc, gfp_t flags)
+{
+	if (kmemprof_enabled) {
+		struct kmem_event e = {
+			.event_id	= KMEM_ALLOC,
+			.call_site	= (u64)(unsigned long) call_site,
+			.ptr		= (u64)(unsigned long) p,
+			.nr_req		= nr_req,
+			.nr_alloc	= nr_alloc,
+			.gfp_flags	= flags,
+		};
+		kmemprof_log_event(&e);
+	}
+}
+EXPORT_SYMBOL_GPL(kmem_track_alloc);
+
+void kmem_track_free(void *call_site, const void *p)
+{
+	if (kmemprof_enabled) {
+		struct kmem_event e = {
+			.event_id	= KMEM_FREE,
+			.call_site	= (u64)(unsigned long) call_site,
+			.ptr		= (u64)(unsigned long) p,
+		};
+		kmemprof_log_event(&e);
+	}
+}
+EXPORT_SYMBOL_GPL(kmem_track_free);
+
+/*
+ * The debugfs ABI for kmemprof
+ */
+#define KMEMPROF_MODE (S_IFREG | S_IRUSR | S_IWUSR)
+
+static struct dentry *kmemprof_dir;
+static struct dentry *kmemprof_enabled_file;
+
+#define KMEMPROF_SUBBUF_SIZE 262144
+#define KMEMPROF_NR_SUBBUFS  4
+
+static struct dentry *
+kmemprof_create_buf_file(const char *filename, struct dentry *parent,
+			 int mode, struct rchan_buf *buf, int *is_global)
+{
+	return debugfs_create_file(filename, mode, parent, buf,
+				   &relay_file_operations);
+}
+
+static int kmemprof_remove_buf_file(struct dentry *dentry)
+{
+	debugfs_remove(dentry);
+
+	return 0;
+}
+
+static struct rchan_callbacks relay_callbacks = {
+	.create_buf_file = kmemprof_create_buf_file,
+	.remove_buf_file = kmemprof_remove_buf_file,
+};
+
+static int __init kmemprof_debugfs_init(void)
+{
+	kmemprof_dir = debugfs_create_dir("kmemprof", NULL);
+	if (!kmemprof_dir)
+		goto failed;
+
+	kmemprof_chan = relay_open("cpu", kmemprof_dir, KMEMPROF_SUBBUF_SIZE,
+				KMEMPROF_NR_SUBBUFS, &relay_callbacks, NULL);
+	if (!kmemprof_chan)
+		goto failed;
+
+	kmemprof_enabled_file = debugfs_create_bool("enabled", KMEMPROF_MODE,
+					kmemprof_dir, &kmemprof_enabled);
+	if (!kmemprof_enabled_file)
+		goto failed;
+
+	return 0;
+failed:
+	printk(KERN_ERR "kmemprof: failed to initialize debugfs.\n");
+	debugfs_remove(kmemprof_enabled_file);
+	relay_close(kmemprof_chan);
+	debugfs_remove(kmemprof_dir);
+	return -ENOMEM;
+}
+
+late_initcall(kmemprof_debugfs_init);

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-05 16:21               ` Pekka J Enberg
@ 2008-01-05 17:14                 ` Andi Kleen
  2008-01-05 20:05                 ` Christoph Lameter
  2008-01-06 17:51                 ` Matt Mackall
  2 siblings, 0 replies; 69+ messages in thread
From: Andi Kleen @ 2008-01-05 17:14 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Matt Mackall, Christoph Lameter, Ingo Molnar, Linus Torvalds,
	Hugh Dickins, Andi Kleen, Peter Zijlstra,
	Linux Kernel Mailing List, zanussi

> So, I have this silly memory profiler derived from the kleak patches by 
> the relayfs people and would love to try it out on an embedded workload 
> where SLUB memory footprint is terrible. Any suggestions?

FWIW this can all be done in a few lines of systemtap. The only
change needed to the kernel is to not inline kmalloc().

-Andi

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-05 16:21               ` Pekka J Enberg
  2008-01-05 17:14                 ` Andi Kleen
@ 2008-01-05 20:05                 ` Christoph Lameter
  2008-01-07 20:12                   ` Pekka J Enberg
  2008-01-06 17:51                 ` Matt Mackall
  2 siblings, 1 reply; 69+ messages in thread
From: Christoph Lameter @ 2008-01-05 20:05 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Matt Mackall, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List, zanussi

On Sat, 5 Jan 2008, Pekka J Enberg wrote:

> So, I have this silly memory profiler derived from the kleak patches by 
> the relayfs people and would love to try it out on an embedded workload 
> where SLUB memory footprint is terrible. Any suggestions?

Good idea. But have you tried to look at slabinfo?

Try to run

	slabinfo -t

which will calculate the allocation overhead of the currently allocated 
objects in all slab caches.

One problem: the size actually requested from kmalloc() is not recorded, so
it cannot calculate the overhead that comes about because of rounding. Your
approach would cover that as well, but I think we could also add a debug
mode in which we store the requested size of each kmalloc object and export
the information via sysfs. That would be nicer than adding this whole
additional layer.
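
Roughly, the bookkeeping such a debug mode needs is small. A userspace
sketch of the idea (with a hypothetical fixed-size table standing in for
real per-object debug metadata, and the sysfs export left out):

#include <stdio.h>
#include <stdlib.h>

/* Model: in a debug mode, a kmalloc cache remembers the size each
 * caller actually requested, so the rounding overhead (object size
 * minus requested size) can be summed up and exported. */
struct debug_cache {
	size_t objsize;		/* size class, e.g. 512 */
	size_t requested[64];	/* shadow copy of each request */
	size_t used;
};

static void *debug_alloc(struct debug_cache *c, size_t req)
{
	c->requested[c->used++] = req;
	return malloc(c->objsize);
}

static size_t rounding_overhead(const struct debug_cache *c)
{
	size_t waste = 0;
	size_t i;

	for (i = 0; i < c->used; i++)
		waste += c->objsize - c->requested[i];
	return waste;
}

int main(void)
{
	struct debug_cache kmalloc_512 = { .objsize = 512 };
	void *a = debug_alloc(&kmalloc_512, 257);
	void *b = debug_alloc(&kmalloc_512, 300);

	/* (512 - 257) + (512 - 300) = 467 bytes lost to rounding */
	printf("rounding overhead: %zu bytes\n",
	       rounding_overhead(&kmalloc_512));
	free(a);
	free(b);
	return 0;
}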


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-04 22:30                     ` Matt Mackall
@ 2008-01-05 20:16                       ` Christoph Lameter
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-05 20:16 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Ingo Molnar, Linus Torvalds, Pekka Enberg, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Fri, 4 Jan 2008, Matt Mackall wrote:

> > SLUB 32	    (all memory of the 4k page is used for 128 byte objects)
> > SLAB 29/30  (management structure occupies first two/three objects)
> > SLOB 30(?)  (Alignment results in object being 136 byte of effective size,
> > 		we have 16 bytes leftover that could be used for a
> > 		very small allocation. Right?)
> 
> Don't know how you got to 136, the minimum alignment is 4 on x86. But I

Right I am thinking about 64 bit systems where the alignment is 8 bytes.

> already said in my last email that SLUB would win for the special case
> of power of two allocations. But as long as we're looking at worst
> cases, let's consider an alloc of 257 bytes..

Yup that hits it by forcing a rounding up to a size of 512 bytes, because
there is no intermediate cache size between 256 and 512. The rounding up is
a pretty weak spot in terms of memory use.

> SLUB  8 (1016 bytes wasted)
> SLOB 15 (105 bytes wasted, with 136 bytes still usable)

Well we can actually turn this around. What I gave was not actually the 
worst case for SLOB. The worst case is an 8 byte allocation where SLOB 
needs double the memory of SLUB.

SLUB 	512	(Nothing wasted)
SLOB 	256	(Half of the page wasted for metadata)
SLAB	119	(32 byte minimum alloc size + management struct needs)

But these are all extreme cases. Which allocator wins depends on the mix of
allocs, and from what I can tell, avoiding the rounding up to a power of two
gives SLOB a key advantage. If we found the worst offenders there and used
kmem_cache_alloc() instead then we might be able to offset that advantage.
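
To make the per-page numbers above easy to recheck, a quick userspace
sketch. It assumes 4k pages, SLUB's power-of-two kmalloc size classes,
and an 8 byte SLOB per-object header with 8 byte alignment (the 64-bit
case above); SLAB's in-page management structures are ignored:

#include <stdio.h>

/* Round a request up to the next power of two, as SLUB's kmalloc
 * size classes do. */
static unsigned long pow2_roundup(unsigned long n)
{
	unsigned long size = 1;

	while (size < n)
		size <<= 1;
	return size;
}

int main(void)
{
	unsigned long page = 4096;
	unsigned long reqs[] = { 8, 257 };
	unsigned int i;

	for (i = 0; i < sizeof(reqs) / sizeof(reqs[0]); i++) {
		unsigned long req = reqs[i];
		unsigned long slub = pow2_roundup(req);
		/* assumed SLOB cost: request aligned to 8 bytes, plus
		 * an 8 byte header in front of each object */
		unsigned long slob = ((req + 7) & ~7UL) + 8;

		printf("%4lu bytes: SLUB %3lu objs/page (%3lu wasted each), "
		       "SLOB ~%3lu objs/page\n",
		       req, page / slub, slub - req, page / slob);
	}
	return 0;
}

This prints 512 objects per page for SLUB and ~256 for SLOB at 8 bytes,
and 8 objects (255 bytes wasted each) versus ~15 at 257 bytes, matching
the figures above.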




^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-05 16:21               ` Pekka J Enberg
  2008-01-05 17:14                 ` Andi Kleen
  2008-01-05 20:05                 ` Christoph Lameter
@ 2008-01-06 17:51                 ` Matt Mackall
  2008-01-07 18:06                   ` Pekka J Enberg
  2 siblings, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-06 17:51 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List, zanussi


On Sat, 2008-01-05 at 18:21 +0200, Pekka J Enberg wrote:
> Hi Matt,
> 
> On Thu, 3 Jan 2008, Matt Mackall wrote:
> > >  SLUB can align these without a 2 byte 
> > > overhead. In some configurations this results in SLUB using even less 
> > > memory than SLOB. See f.e. Pekka's test at 
> > > http://marc.info/?l=linux-kernel&m=118405559214029&w=2
> > 
> > Available memory after boot is not a particularly stable measurement and
> > not valid if there's memory pressure. At any rate, I wasn't able to
> > reproduce this.
> 
> So, I have this silly memory profiler derived from the kleak patches by 
> the relayfs people and would love to try it out on an embedded workload 
> where SLUB memory footprint is terrible. Any suggestions?

Or you could use this (which is a bit broken on modern kernels, but
provides lots of interesting detail):

http://lwn.net/Articles/124374/

I don't have any particular "terrible" workloads for SLUB. But my
attempts to simply boot with all three allocators to init=/bin/bash in,
say, lguest show a fair margin for SLOB.

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-06 17:51                 ` Matt Mackall
@ 2008-01-07 18:06                   ` Pekka J Enberg
  2008-01-07 19:03                     ` Matt Mackall
  2008-01-09 19:15                     ` [RFC PATCH] greatly reduce SLOB external fragmentation Matt Mackall
  0 siblings, 2 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-07 18:06 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

Hi Matt,

On Sun, 6 Jan 2008, Matt Mackall wrote:
> I don't have any particular "terrible" workloads for SLUB. But my
> attempts to simply boot with all three allocators to init=/bin/bash in,
> say, lguest show a fair margin for SLOB.

Sorry, I once again have bad news ;-). I did some testing with

  lguest --block=<rootfile> 32 /boot/vmlinuz-2.6.24-rc6 root=/dev/vda init=doit

where rootfile is

  http://uml.nagafix.co.uk/BusyBox-1.5.0/BusyBox-1.5.0-x86-root_fs.bz2

and the "doit" script in the guest passed as init= is just

  #!/bin/sh
  mount -t proc proc /proc
  cat /proc/meminfo | grep MemTotal
  cat /proc/meminfo | grep MemFree
  cat /proc/meminfo | grep Slab

and the results are:

[ the minimum, maximum, and average are captured from 10 individual runs ]

                                 Free (kB)             Used (kB)
                    Total (kB)   min   max   average   min  max  average
  SLUB (no debug)   26536        23868 23892 23877.6   2644 2668 2658.4
  SLOB              26548        23472 23640 23579.6   2908 3076 2968.4
  SLAB (no debug)   26544        23316 23364 23343.2   3180 3228 3200.8
  SLUB (with debug) 26484        23120 23136 23127.2   3348 3364 3356.8

So it seems that on average SLUB uses about 300 kilobytes *less memory* (!)
after boot than SLOB for my configuration (23877.6 kB - 23579.6 kB = 298 kB,
which is roughly 1% of the total memory available).

One possible explanation is that the high internal fragmentation (space
allocated but not used) of SLUB kmalloc() only affects short-lived allocations
and thus does not show up in the more permanent memory footprint.  Likewise, it
could be that SLOB has higher external fragmentation (small free blocks that
are unavailable for allocation) from which SLUB does not suffer.  Dunno,
haven't investigated, as my results contradict yours.

I am beginning to think this is highly dependent on .config, so would you mind
sending me the one you're using for testing, Matt?

			Pekka

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-07 18:06                   ` Pekka J Enberg
@ 2008-01-07 19:03                     ` Matt Mackall
  2008-01-07 19:53                       ` Pekka J Enberg
                                         ` (2 more replies)
  2008-01-09 19:15                     ` [RFC PATCH] greatly reduce SLOB external fragmentation Matt Mackall
  1 sibling, 3 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-07 19:03 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Mon, 2008-01-07 at 20:06 +0200, Pekka J Enberg wrote:
> Hi Matt,
> 
> On Sun, 6 Jan 2008, Matt Mackall wrote:
> > I don't have any particular "terrible" workloads for SLUB. But my
> > attempts to simply boot with all three allocators to init=/bin/bash in,
> > say, lguest show a fair margin for SLOB.
> 
> Sorry, I once again have bad news ;-). I did some testing with
> 
>   lguest --block=<rootfile> 32 /boot/vmlinuz-2.6.24-rc6 root=/dev/vda init=doit
> 
> where rootfile is
> 
>   http://uml.nagafix.co.uk/BusyBox-1.5.0/BusyBox-1.5.0-x86-root_fs.bz2
> 
> and the "doit" script in the guest passed as init= is just
> 
>   #!/bin/sh
>   mount -t proc proc /proc
>   cat /proc/meminfo | grep MemTotal
>   cat /proc/meminfo | grep MemFree
>   cat /proc/meminfo | grep Slab
> 
> and the results are:
> 
> [ the minimum, maximum, and average are captured from 10 individual runs ]
> 
>                                  Free (kB)             Used (kB)
>                     Total (kB)   min   max   average   min  max  average
>   SLUB (no debug)   26536        23868 23892 23877.6   2644 2668 2658.4
>   SLOB              26548        23472 23640 23579.6   2908 3076 2968.4
>   SLAB (no debug)   26544        23316 23364 23343.2   3180 3228 3200.8
>   SLUB (with debug) 26484        23120 23136 23127.2   3348 3364 3356.8
> 
> So it seems that on average SLUB uses about 300 kilobytes *less memory* (!)
> after boot than SLOB for my configuration (23877.6 kB - 23579.6 kB = 298 kB,
> which is roughly 1% of the total memory available).

Fascinating. Which kernel version are you using? This patch doesn't seem
to have made it to mainline:

---

slob: fix free block merging at head of subpage

We weren't merging freed blocks at the beginning of the free list.
Fixing this showed a 2.5% efficiency improvement in a userspace test
harness.

Signed-off-by: Matt Mackall <mpm@selenic.com>

diff -r 5374012889d6 mm/slob.c
--- a/mm/slob.c	Wed Dec 05 09:27:46 2007 -0800
+++ b/mm/slob.c	Wed Dec 05 16:10:37 2007 -0600
@@ -398,6 +398,10 @@ static void slob_free(void *block, int s
 	sp->units += units;
 
 	if (b < sp->free) {
+		if (b + units == sp->free) {
+			units += slob_units(sp->free);
+			sp->free = slob_next(sp->free);
+		}
 		set_slob(b, units, sp->free);
 		sp->free = b;
 	} else {

---
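
To make the merge case concrete, a toy sketch (SLOB itself packs the
size and next offset into the free block, so this is only the shape of
the logic): freeing a block that ends exactly where the current list
head begins should produce one larger block, not two adjacent ones.

#include <stdio.h>
#include <stddef.h>

/* Toy free list keyed by offset within a page. */
struct blk {
	size_t off;
	size_t size;
	struct blk *next;
};

/* Free a block in front of the current head, merging with the head
 * when the two are adjacent (the case the patch above adds). */
static struct blk *free_before_head(struct blk *head, struct blk *b)
{
	if (head && b->off + b->size == head->off) {
		b->size += head->size;	/* absorb the old head */
		b->next = head->next;
	} else {
		b->next = head;
	}
	return b;
}

int main(void)
{
	struct blk tail = { 300, 100, NULL };	/* free: [300, 400) */
	struct blk b = { 200, 100, NULL };	/* freeing: [200, 300) */
	struct blk *head = free_before_head(&tail, &b);

	/* prints "off=200 size=200": one merged block, not two */
	printf("off=%zu size=%zu\n", head->off, head->size);
	return 0;
}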

> One possible explanation is that the high internal fragmentation (space
> allocated but not used) of SLUB kmalloc() only affects short-lived allocations
> and thus does not show up in the more permanent memory footprint.  Likewise, it
> could be that SLOB has higher external fragmentation (small free blocks that
> are unavailable for allocation) from which SLUB does not suffer.  Dunno,
> haven't investigated, as my results contradict yours.

I suppose that's possible.

> I am beginning to think this is highly dependent on .config, so would you mind
> sending me the one you're using for testing, Matt?

I'm sure I don't have it any more, as that was back in July or so. How
about you send me your config and I'll try to figure out what's going
on?

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-07 19:03                     ` Matt Mackall
@ 2008-01-07 19:53                       ` Pekka J Enberg
  2008-01-07 20:44                       ` Pekka J Enberg
  2008-01-10 10:04                       ` Pekka J Enberg
  2 siblings, 0 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-07 19:53 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

Hi Matt,

On Mon, 7 Jan 2008, Matt Mackall wrote:
> Fascinating. Which kernel version are you using? This patch doesn't seem
> to have made it to mainline:

It's a git pull from yesterday so that's 2.6.24-rc6-something. I'll give 
your patch a spin.

On Mon, 7 Jan 2008, Matt Mackall wrote:
> > I am beginning to think this is highly dependent on .config so would you mind
> > sending me one you're using for testing, Matt?
> 
> I'm sure I don't have it any more, as that was back in July or so. How
> about you send me your config and I'll try to figure out what's going
> on?

Sure. It's the config of my trusty old 32-bit x86 development laptop. This 
one has SLUB with debugging disabled. I didn't save the other ones but 
they're the same except for SLUB/SLOB options.

			Pekka

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc6
# Mon Jan  7 21:47:08 2008
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_SUPPORTS_OPROFILE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
CONFIG_AUDIT=y
# CONFIG_AUDITSYSCALL is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_FAIR_USER_SCHED=y
# CONFIG_FAIR_CGROUP_SCHED is not set
CONFIG_SYSFS_DEPRECATED=y
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
# CONFIG_EPOLL is not set
# CONFIG_SIGNALFD is not set
# CONFIG_EVENTFD is not set
# CONFIG_SHMEM is not set
# CONFIG_VM_EVENT_COUNTERS is not set
# CONFIG_SLUB_DEBUG is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_TINY_SHMEM=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set
# CONFIG_BLK_DEV_BSG is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_PREEMPT_NOTIFIERS=y

#
# Processor type and features
#
# CONFIG_TICK_ONESHOT is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_X86_VSMP is not set
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
CONFIG_PARAVIRT=y
CONFIG_PARAVIRT_GUEST=y
# CONFIG_VMI is not set
CONFIG_LGUEST_GUEST=y
# CONFIG_M386 is not set
# CONFIG_M486 is not set
CONFIG_M586=y
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_X86_PPRO_FENCE=y
CONFIG_X86_F00F_BUG=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_ALIGNMENT_16=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_NR_CPUS=8
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_MCE is not set
CONFIG_VM86=y
CONFIG_TOSHIBA=m
CONFIG_I8K=m
CONFIG_X86_REBOOTFIXUPS=y
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_VMSPLIT_3G=y
# CONFIG_VMSPLIT_3G_OPT is not set
# CONFIG_VMSPLIT_2G is not set
# CONFIG_VMSPLIT_2G_OPT is not set
# CONFIG_VMSPLIT_1G is not set
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
# CONFIG_SPARSEMEM_VMEMMAP_ENABLE is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_RESOURCES_64BIT is not set
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_NR_QUICK=1
CONFIG_VIRT_TO_BUS=y
CONFIG_HIGHPTE=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_EFI=y
CONFIG_IRQBALANCE=y
CONFIG_BOOT_IOREMAP=y
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PHYSICAL_START=0x100000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x100000
CONFIG_HOTPLUG_CPU=y
CONFIG_COMPAT_VDSO=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management options
#
CONFIG_PM=y
CONFIG_PM_LEGACY=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_VERBOSE is not set
CONFIG_PM_TRACE=y
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM_SLEEP=y
CONFIG_SUSPEND_SMP_POSSIBLE=y
CONFIG_SUSPEND=y
CONFIG_HIBERNATION_SMP_POSSIBLE=y
# CONFIG_HIBERNATION is not set
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
# CONFIG_ACPI_PROCFS is not set
CONFIG_ACPI_PROCFS_POWER=y
CONFIG_ACPI_PROC_EVENT=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=m
CONFIG_ACPI_DOCK=m
# CONFIG_ACPI_BAY is not set
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=m
CONFIG_ACPI_ASUS=m
CONFIG_ACPI_TOSHIBA=m
CONFIG_ACPI_BLACKLIST_YEAR=2000
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=m
# CONFIG_ACPI_SBS is not set
CONFIG_APM=m
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_ALLOW_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=m
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=m
CONFIG_CPU_FREQ_STAT_DETAILS=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m

#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_X86_POWERNOW_K6=m
CONFIG_X86_POWERNOW_K7=m
CONFIG_X86_POWERNOW_K7_ACPI=y
CONFIG_X86_POWERNOW_K8=m
CONFIG_X86_POWERNOW_K8_ACPI=y
CONFIG_X86_GX_SUSPMOD=m
CONFIG_X86_SPEEDSTEP_CENTRINO=m
CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y
CONFIG_X86_SPEEDSTEP_ICH=m
CONFIG_X86_SPEEDSTEP_SMI=m
CONFIG_X86_P4_CLOCKMOD=m
CONFIG_X86_CPUFREQ_NFORCE2=m
CONFIG_X86_LONGRUN=m
CONFIG_X86_LONGHAUL=m
# CONFIG_X86_E_POWERSAVER is not set

#
# shared options
#
# CONFIG_X86_ACPI_CPUFREQ_PROC_INTF is not set
CONFIG_X86_SPEEDSTEP_LIB=m
CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK=y
# CONFIG_CPU_IDLE is not set

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=m
CONFIG_PCIEAER=y
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
CONFIG_HT_IRQ=y
CONFIG_ISA_DMA_API=y
CONFIG_ISA=y
CONFIG_EISA=y
CONFIG_EISA_VLB_PRIMING=y
CONFIG_EISA_PCI_EISA=y
CONFIG_EISA_VIRTUAL_ROOT=y
CONFIG_EISA_NAMES=y
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
CONFIG_K8_NB=y
CONFIG_PCCARD=m
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_PCMCIA=m
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_PCMCIA_IOCTL=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=m
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
CONFIG_PD6729=m
CONFIG_I82092=m
CONFIG_I82365=m
CONFIG_TCIC=m
CONFIG_PCMCIA_PROBE=y
CONFIG_PCCARD_NONSTATIC=m
CONFIG_HOTPLUG_PCI=m
CONFIG_HOTPLUG_PCI_FAKE=m
CONFIG_HOTPLUG_PCI_COMPAQ=m
CONFIG_HOTPLUG_PCI_COMPAQ_NVRAM=y
CONFIG_HOTPLUG_PCI_IBM=m
CONFIG_HOTPLUG_PCI_ACPI=m
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
CONFIG_HOTPLUG_PCI_CPCI=y
CONFIG_HOTPLUG_PCI_CPCI_ZT5550=m
CONFIG_HOTPLUG_PCI_CPCI_GENERIC=m
CONFIG_HOTPLUG_PCI_SHPC=m

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_MISC=m

#
# Networking
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=m
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
CONFIG_NET_KEY=m
# CONFIG_NET_KEY_MIGRATE is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
CONFIG_INET_XFRM_MODE_BEET=m
# CONFIG_INET_LRO is not set
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
CONFIG_IP_VS=m
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IPV6=m
CONFIG_IPV6_PRIVACY=y
# CONFIG_IPV6_ROUTER_PREF is not set
# CONFIG_IPV6_OPTIMISTIC_DAD is not set
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
# CONFIG_IPV6_MIP6 is not set
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=m
CONFIG_INET6_XFRM_MODE_TRANSPORT=m
CONFIG_INET6_XFRM_MODE_TUNNEL=m
CONFIG_INET6_XFRM_MODE_BEET=m
CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
CONFIG_IPV6_SIT=m
CONFIG_IPV6_TUNNEL=m
# CONFIG_IPV6_MULTIPLE_TABLES is not set
# CONFIG_NETLABEL is not set
CONFIG_NETWORK_SECMARK=y
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK_ENABLED=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CT_ACCT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_SECMARK=y
CONFIG_NF_CONNTRACK_EVENTS=y
CONFIG_NF_CT_PROTO_GRE=m
CONFIG_NF_CT_PROTO_SCTP=m
# CONFIG_NF_CT_PROTO_UDPLITE is not set
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_PPTP=m
# CONFIG_NF_CONNTRACK_SANE is not set
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NETFILTER_XTABLES=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
# CONFIG_NETFILTER_XT_TARGET_CONNMARK is not set
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set
# CONFIG_NETFILTER_XT_TARGET_TRACE is not set
CONFIG_NETFILTER_XT_TARGET_SECMARK=m
CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
# CONFIG_NETFILTER_XT_MATCH_CONNLIMIT is not set
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_SCTP=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
# CONFIG_NETFILTER_XT_MATCH_TIME is not set
# CONFIG_NETFILTER_XT_MATCH_U32 is not set
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m

#
# IP: Netfilter Configuration
#
CONFIG_NF_CONNTRACK_IPV4=m
CONFIG_NF_CONNTRACK_PROC_COMPAT=y
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
CONFIG_NF_NAT_SNMP_BASIC=m
CONFIG_NF_NAT_PROTO_GRE=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
CONFIG_NF_NAT_TFTP=m
CONFIG_NF_NAT_AMANDA=m
CONFIG_NF_NAT_PPTP=m
CONFIG_NF_NAT_H323=m
CONFIG_NF_NAT_SIP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m

#
# IPv6: Netfilter Configuration (EXPERIMENTAL)
#
CONFIG_NF_CONNTRACK_IPV6=m
CONFIG_IP6_NF_QUEUE=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_OWNER=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_AH=m
# CONFIG_IP6_NF_MATCH_MH is not set
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_LOG=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_RAW=m

#
# DECnet: Netfilter Configuration
#
CONFIG_DECNET_NF_GRABULATOR=m

#
# Bridge: Netfilter Configuration
#
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_ULOG=m
CONFIG_IP_DCCP=m
CONFIG_INET_DCCP_DIAG=m
CONFIG_IP_DCCP_ACKVEC=y

#
# DCCP CCIDs Configuration (EXPERIMENTAL)
#
CONFIG_IP_DCCP_CCID2=m
# CONFIG_IP_DCCP_CCID2_DEBUG is not set
CONFIG_IP_DCCP_CCID3=m
CONFIG_IP_DCCP_TFRC_LIB=m
# CONFIG_IP_DCCP_CCID3_DEBUG is not set
CONFIG_IP_DCCP_CCID3_RTO=100

#
# DCCP Kernel Hacking
#
# CONFIG_IP_DCCP_DEBUG is not set
CONFIG_NET_DCCPPROBE=m
CONFIG_IP_SCTP=m
# CONFIG_SCTP_DBG_MSG is not set
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_HMAC_NONE is not set
# CONFIG_SCTP_HMAC_SHA1 is not set
CONFIG_SCTP_HMAC_MD5=y
CONFIG_TIPC=m
# CONFIG_TIPC_ADVANCED is not set
# CONFIG_TIPC_DEBUG is not set
CONFIG_ATM=y
CONFIG_ATM_CLIP=y
# CONFIG_ATM_CLIP_NO_ICMP is not set
CONFIG_ATM_LANE=m
CONFIG_ATM_MPOA=m
CONFIG_ATM_BR2684=m
# CONFIG_ATM_BR2684_IPFILTER is not set
CONFIG_BRIDGE=m
CONFIG_VLAN_8021Q=m
CONFIG_DECNET=m
# CONFIG_DECNET_ROUTER is not set
CONFIG_LLC=y
CONFIG_LLC2=m
CONFIG_IPX=m
# CONFIG_IPX_INTERN is not set
CONFIG_ATALK=m
CONFIG_DEV_APPLETALK=m
CONFIG_LTPC=m
CONFIG_COPS=m
CONFIG_COPS_DAYNA=y
CONFIG_COPS_TANGENT=y
CONFIG_IPDDP=m
CONFIG_IPDDP_ENCAP=y
CONFIG_IPDDP_DECAP=y
CONFIG_X25=m
CONFIG_LAPB=m
CONFIG_ECONET=m
CONFIG_ECONET_AUNUDP=y
CONFIG_ECONET_NATIVE=y
CONFIG_WAN_ROUTER=m
CONFIG_NET_SCHED=y

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_ATM=m
CONFIG_NET_SCH_PRIO=m
# CONFIG_NET_SCH_RR is not set
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=m
CONFIG_NET_SCH_INGRESS=m

#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
# CONFIG_CLS_U32_PERF is not set
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=m
CONFIG_NET_EMATCH_NBYTE=m
CONFIG_NET_EMATCH_U32=m
CONFIG_NET_EMATCH_META=m
CONFIG_NET_EMATCH_TEXT=m
CONFIG_NET_CLS_ACT=y
CONFIG_NET_ACT_POLICE=y
# CONFIG_NET_ACT_GACT is not set
# CONFIG_NET_ACT_MIRRED is not set
# CONFIG_NET_ACT_IPT is not set
# CONFIG_NET_ACT_NAT is not set
# CONFIG_NET_ACT_PEDIT is not set
# CONFIG_NET_ACT_SIMP is not set
CONFIG_NET_CLS_POLICE=y
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_SCH_FIFO=y

#
# Network testing
#
CONFIG_NET_PKTGEN=m
CONFIG_NET_TCPPROBE=m
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
CONFIG_BT=m
CONFIG_BT_L2CAP=m
CONFIG_BT_SCO=m
CONFIG_BT_RFCOMM=m
CONFIG_BT_RFCOMM_TTY=y
CONFIG_BT_BNEP=m
CONFIG_BT_BNEP_MC_FILTER=y
CONFIG_BT_BNEP_PROTO_FILTER=y
CONFIG_BT_HIDP=m

#
# Bluetooth device drivers
#
CONFIG_BT_HCIUSB=m
CONFIG_BT_HCIUSB_SCO=y
# CONFIG_BT_HCIBTSDIO is not set
CONFIG_BT_HCIUART=m
CONFIG_BT_HCIUART_H4=y
CONFIG_BT_HCIUART_BCSP=y
# CONFIG_BT_HCIUART_LL is not set
CONFIG_BT_HCIBCM203X=m
CONFIG_BT_HCIBPA10X=m
CONFIG_BT_HCIBFUSB=m
CONFIG_BT_HCIDTL1=m
CONFIG_BT_HCIBT3C=m
CONFIG_BT_HCIBLUECARD=m
CONFIG_BT_HCIBTUART=m
CONFIG_BT_HCIVHCI=m
# CONFIG_AF_RXRPC is not set
CONFIG_FIB_RULES=y

#
# Wireless
#
CONFIG_CFG80211=m
CONFIG_NL80211=y
CONFIG_WIRELESS_EXT=y
CONFIG_MAC80211=m
CONFIG_MAC80211_RCSIMPLE=y
# CONFIG_MAC80211_DEBUGFS is not set
# CONFIG_MAC80211_DEBUG is not set
CONFIG_IEEE80211=m
# CONFIG_IEEE80211_DEBUG is not set
CONFIG_IEEE80211_CRYPT_WEP=m
CONFIG_IEEE80211_CRYPT_CCMP=m
CONFIG_IEEE80211_CRYPT_TKIP=m
CONFIG_IEEE80211_SOFTMAC=m
# CONFIG_IEEE80211_SOFTMAC_DEBUG is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
CONFIG_CONNECTOR=m
# CONFIG_MTD is not set
# CONFIG_PARPORT is not set
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set

#
# Protocols
#
CONFIG_ISAPNP=y
CONFIG_PNPBIOS=y
CONFIG_PNPBIOS_PROC_FS=y
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_FD=m
CONFIG_BLK_DEV_XD=m
CONFIG_BLK_CPQ_DA=m
CONFIG_BLK_CPQ_CISS_DA=m
CONFIG_CISS_SCSI_TAPE=y
CONFIG_BLK_DEV_DAC960=m
CONFIG_BLK_DEV_UMEM=m
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_SX8=m
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=65536
CONFIG_BLK_DEV_RAM_BLOCKSIZE=1024
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_VIRTIO_BLK=y
CONFIG_MISC_DEVICES=y
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_SGI_IOC4 is not set
CONFIG_TIFM_CORE=m
# CONFIG_TIFM_7XX1 is not set
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_FUJITSU_LAPTOP is not set
# CONFIG_MSI_LAPTOP is not set
# CONFIG_SONY_LAPTOP is not set
CONFIG_THINKPAD_ACPI=m
# CONFIG_THINKPAD_ACPI_DEBUG is not set
CONFIG_THINKPAD_ACPI_BAY=y
# CONFIG_IDE is not set

#
# SCSI device support
#
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=m
CONFIG_SCSI_DMA=y
CONFIG_SCSI_TGT=m
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
CONFIG_CHR_DEV_SG=m
CONFIG_CHR_DEV_SCH=m

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
# CONFIG_SCSI_FC_TGT_ATTRS is not set
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
CONFIG_SCSI_SAS_LIBSAS=m
# CONFIG_SCSI_SAS_ATA is not set
# CONFIG_SCSI_SAS_LIBSAS_DEBUG is not set
# CONFIG_SCSI_SRP_ATTRS is not set
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_SEAGATE is not set
# CONFIG_SCSI_SIM710 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_NSP32 is not set
CONFIG_SCSI_DEBUG=m
# CONFIG_SCSI_SRP is not set
# CONFIG_SCSI_LOWLEVEL_PCMCIA is not set
CONFIG_ATA=m
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_ACPI=y
# CONFIG_SATA_AHCI is not set
# CONFIG_SATA_SVW is not set
CONFIG_ATA_PIIX=m
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIL24 is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_PATA_ACPI is not set
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CS5535 is not set
# CONFIG_PATA_CS5536 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
CONFIG_ATA_GENERIC=m
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_ISAPNP is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_LEGACY is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PCMCIA is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_QDI is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RZ1000 is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
# CONFIG_PATA_WINBOND_VLB is not set
# CONFIG_PATA_PLATFORM is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=m
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_MD_RAID5_RESHAPE=y
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_DEBUG is not set
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_MIRROR=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_EMC=m
# CONFIG_DM_MULTIPATH_RDAC is not set
# CONFIG_DM_MULTIPATH_HP is not set
# CONFIG_DM_DELAY is not set
# CONFIG_DM_UEVENT is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_IEEE1394 is not set
CONFIG_I2O=m
CONFIG_I2O_LCT_NOTIFY_ON_CHANGES=y
CONFIG_I2O_EXT_ADAPTEC=y
CONFIG_I2O_CONFIG=m
CONFIG_I2O_CONFIG_OLD_IOCTL=y
CONFIG_I2O_BUS=m
CONFIG_I2O_BLOCK=m
CONFIG_I2O_SCSI=m
CONFIG_I2O_PROC=m
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
# CONFIG_NETDEVICES_MULTIQUEUE is not set
# CONFIG_IFB is not set
CONFIG_DUMMY=m
CONFIG_BONDING=m
# CONFIG_MACVLAN is not set
CONFIG_EQUALIZER=m
CONFIG_TUN=m
# CONFIG_VETH is not set
CONFIG_NET_SB1000=m
# CONFIG_IP1000 is not set
CONFIG_ARCNET=m
CONFIG_ARCNET_1201=m
CONFIG_ARCNET_1051=m
CONFIG_ARCNET_RAW=m
CONFIG_ARCNET_CAP=m
CONFIG_ARCNET_COM90xx=m
CONFIG_ARCNET_COM90xxIO=m
CONFIG_ARCNET_RIM_I=m
CONFIG_ARCNET_COM20020=m
CONFIG_ARCNET_COM20020_ISA=m
CONFIG_ARCNET_COM20020_PCI=m
# CONFIG_NET_ETHERNET is not set
CONFIG_NETDEV_1000=y
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
CONFIG_E1000=m
CONFIG_E1000_NAPI=y
# CONFIG_E1000_DISABLE_PACKET_SPLIT is not set
# CONFIG_E1000E is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
# CONFIG_NETDEV_10000 is not set
CONFIG_TR=y
CONFIG_IBMTR=m
CONFIG_IBMOL=m
CONFIG_IBMLS=m
CONFIG_3C359=m
CONFIG_TMS380TR=m
CONFIG_TMSPCI=m
CONFIG_SKISA=m
CONFIG_PROTEON=m
CONFIG_ABYSS=m
CONFIG_SMCTR=m

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
CONFIG_WLAN_80211=y
# CONFIG_PCMCIA_RAYCS is not set
# CONFIG_IPW2100 is not set
CONFIG_IPW2200=m
CONFIG_IPW2200_MONITOR=y
# CONFIG_IPW2200_RADIOTAP is not set
# CONFIG_IPW2200_PROMISCUOUS is not set
# CONFIG_IPW2200_QOS is not set
# CONFIG_IPW2200_DEBUG is not set
# CONFIG_LIBERTAS is not set
# CONFIG_AIRO is not set
# CONFIG_HERMES is not set
# CONFIG_ATMEL is not set
# CONFIG_AIRO_CS is not set
# CONFIG_PCMCIA_WL3501 is not set
# CONFIG_PRISM54 is not set
# CONFIG_USB_ZD1201 is not set
# CONFIG_RTL8187 is not set
# CONFIG_ADM8211 is not set
# CONFIG_P54_COMMON is not set
# CONFIG_IWLWIFI is not set
# CONFIG_HOSTAP is not set
# CONFIG_BCM43XX is not set
# CONFIG_B43 is not set
# CONFIG_B43LEGACY is not set
# CONFIG_ZD1211RW is not set
# CONFIG_RT2X00 is not set

#
# USB Network Adapters
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
# CONFIG_NET_PCMCIA is not set
# CONFIG_WAN is not set
# CONFIG_ATM_DRIVERS is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPP_MPPE=m
CONFIG_PPPOE=m
CONFIG_PPPOATM=m
# CONFIG_PPPOL2TP is not set
CONFIG_SLIP=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLHC=m
CONFIG_SLIP_SMART=y
CONFIG_SLIP_MODE_SLIP6=y
CONFIG_NET_FC=y
CONFIG_SHAPER=m
CONFIG_NETCONSOLE=m
# CONFIG_NETCONSOLE_DYNAMIC is not set
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_TRAP is not set
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_VIRTIO_NET is not set
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=m
CONFIG_INPUT_POLLDEV=m

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
CONFIG_INPUT_EVDEV=m
CONFIG_INPUT_EVBUG=m

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_KEYBOARD_SUNKBD=m
CONFIG_KEYBOARD_LKKBD=m
CONFIG_KEYBOARD_XTKBD=m
CONFIG_KEYBOARD_NEWTON=m
CONFIG_KEYBOARD_STOWAWAY=m
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=m
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_SERIAL=m
# CONFIG_MOUSE_APPLETOUCH is not set
CONFIG_MOUSE_INPORT=m
# CONFIG_MOUSE_ATIXL is not set
CONFIG_MOUSE_LOGIBM=m
CONFIG_MOUSE_PC110PAD=m
CONFIG_MOUSE_VSXXXAA=m
CONFIG_INPUT_JOYSTICK=y
CONFIG_JOYSTICK_ANALOG=m
CONFIG_JOYSTICK_A3D=m
CONFIG_JOYSTICK_ADI=m
CONFIG_JOYSTICK_COBRA=m
CONFIG_JOYSTICK_GF2K=m
CONFIG_JOYSTICK_GRIP=m
CONFIG_JOYSTICK_GRIP_MP=m
CONFIG_JOYSTICK_GUILLEMOT=m
CONFIG_JOYSTICK_INTERACT=m
CONFIG_JOYSTICK_SIDEWINDER=m
CONFIG_JOYSTICK_TMDC=m
CONFIG_JOYSTICK_IFORCE=m
CONFIG_JOYSTICK_IFORCE_USB=y
CONFIG_JOYSTICK_IFORCE_232=y
CONFIG_JOYSTICK_WARRIOR=m
CONFIG_JOYSTICK_MAGELLAN=m
CONFIG_JOYSTICK_SPACEORB=m
CONFIG_JOYSTICK_SPACEBALL=m
CONFIG_JOYSTICK_STINGER=m
CONFIG_JOYSTICK_TWIDJOY=m
CONFIG_JOYSTICK_JOYDUMP=m
# CONFIG_JOYSTICK_XPAD is not set
# CONFIG_INPUT_TABLET is not set
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_ADS7846=m
# CONFIG_TOUCHSCREEN_FUJITSU is not set
CONFIG_TOUCHSCREEN_GUNZE=m
CONFIG_TOUCHSCREEN_ELO=m
CONFIG_TOUCHSCREEN_MTOUCH=m
CONFIG_TOUCHSCREEN_MK712=m
CONFIG_TOUCHSCREEN_PENMOUNT=m
CONFIG_TOUCHSCREEN_TOUCHRIGHT=m
CONFIG_TOUCHSCREEN_TOUCHWIN=m
CONFIG_TOUCHSCREEN_UCB1400=m
# CONFIG_TOUCHSCREEN_USB_COMPOSITE is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=m
CONFIG_INPUT_WISTRON_BTNS=m
# CONFIG_INPUT_ATLAS_BTNS is not set
# CONFIG_INPUT_ATI_REMOTE is not set
# CONFIG_INPUT_ATI_REMOTE2 is not set
# CONFIG_INPUT_KEYSPAN_REMOTE is not set
# CONFIG_INPUT_POWERMATE is not set
# CONFIG_INPUT_YEALINK is not set
CONFIG_INPUT_UINPUT=m

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=m
CONFIG_SERIO_CT82C710=m
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
CONFIG_GAMEPORT=m
CONFIG_GAMEPORT_NS558=m
CONFIG_GAMEPORT_L4=m
CONFIG_GAMEPORT_EMU10K1=m
CONFIG_GAMEPORT_FM801=m

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
CONFIG_ROCKETPORT=m
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
CONFIG_DIGIEPCA=m
# CONFIG_ESPSERIAL is not set
# CONFIG_MOXA_INTELLIO is not set
CONFIG_MOXA_SMARTIO=m
CONFIG_MOXA_SMARTIO_NEW=m
# CONFIG_ISI is not set
CONFIG_SYNCLINK=m
CONFIG_SYNCLINKMP=m
CONFIG_SYNCLINK_GT=m
CONFIG_N_HDLC=m
CONFIG_SPECIALIX=m
# CONFIG_SPECIALIX_RTSCTS is not set
CONFIG_SX=m
# CONFIG_RIO is not set
CONFIG_STALDRV=y

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_CS=m
CONFIG_SERIAL_8250_NR_UARTS=48
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_FOURPORT=m
CONFIG_SERIAL_8250_ACCENT=m
CONFIG_SERIAL_8250_BOCA=m
CONFIG_SERIAL_8250_EXAR_ST16C554=m
CONFIG_SERIAL_8250_HUB6=m
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
CONFIG_SERIAL_8250_RSA=y

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_JSM=m
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_HVC_DRIVER=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
CONFIG_HW_RANDOM=y
CONFIG_HW_RANDOM_INTEL=m
CONFIG_HW_RANDOM_AMD=m
CONFIG_HW_RANDOM_GEODE=m
CONFIG_HW_RANDOM_VIA=m
CONFIG_NVRAM=m
CONFIG_RTC=y
CONFIG_DTLK=m
CONFIG_R3964=m
CONFIG_APPLICOM=m
CONFIG_SONYPI=m

#
# PCMCIA character devices
#
CONFIG_SYNCLINK_CS=m
CONFIG_CARDMAN_4000=m
CONFIG_CARDMAN_4040=m
CONFIG_MWAVE=m
CONFIG_PC8736x_GPIO=m
CONFIG_NSC_GPIO=m
CONFIG_CS5535_GPIO=m
CONFIG_RAW_DRIVER=m
CONFIG_MAX_RAW_DEVS=256
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_HPET_MMAP=y
CONFIG_HANGCHECK_TIMER=m
CONFIG_TCG_TPM=m
CONFIG_TCG_TIS=m
CONFIG_TCG_NSC=m
CONFIG_TCG_ATMEL=m
CONFIG_TCG_INFINEON=m
CONFIG_TELCLOCK=m
CONFIG_DEVPORT=y
CONFIG_I2C=m
CONFIG_I2C_BOARDINFO=y
CONFIG_I2C_CHARDEV=m

#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
CONFIG_I2C_ALGOPCA=m

#
# I2C Hardware Bus support
#
CONFIG_I2C_ALI1535=m
CONFIG_I2C_ALI1563=m
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD756_S4882=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_OCORES=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
# CONFIG_I2C_SIMTEC is not set
CONFIG_SCx200_ACB=m
CONFIG_I2C_SIS5595=m
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=m
# CONFIG_I2C_TAOS_EVM is not set
CONFIG_I2C_STUB=m
# CONFIG_I2C_TINY_USB is not set
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
CONFIG_I2C_PCA_ISA=m

#
# Miscellaneous I2C Chip support
#
CONFIG_SENSORS_DS1337=m
CONFIG_SENSORS_DS1374=m
# CONFIG_DS1682 is not set
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
CONFIG_SENSORS_PCA9539=m
CONFIG_SENSORS_PCF8591=m
CONFIG_SENSORS_MAX6875=m
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set

#
# SPI support
#
CONFIG_SPI=y
# CONFIG_SPI_DEBUG is not set
CONFIG_SPI_MASTER=y

#
# SPI Master Controller Drivers
#
CONFIG_SPI_BITBANG=m

#
# SPI Protocol Masters
#
# CONFIG_SPI_AT25 is not set
# CONFIG_SPI_SPIDEV is not set
# CONFIG_SPI_TLE62X0 is not set
# CONFIG_W1 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
# CONFIG_BATTERY_DS2760 is not set
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
CONFIG_SENSORS_ABITUGURU=m
# CONFIG_SENSORS_ABITUGURU3 is not set
# CONFIG_SENSORS_AD7418 is not set
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
# CONFIG_SENSORS_ADM1029 is not set
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ADM9240=m
# CONFIG_SENSORS_ADT7470 is not set
CONFIG_SENSORS_K8TEMP=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS1621=m
# CONFIG_SENSORS_I5K_AMB is not set
CONFIG_SENSORS_F71805F=m
# CONFIG_SENSORS_F71882FG is not set
# CONFIG_SENSORS_F75375S is not set
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_FSCPOS=m
# CONFIG_SENSORS_FSCHMD is not set
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_GL520SM=m
# CONFIG_SENSORS_CORETEMP is not set
# CONFIG_SENSORS_IBMPEX is not set
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM63=m
CONFIG_SENSORS_LM70=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_LM92=m
# CONFIG_SENSORS_LM93 is not set
CONFIG_SENSORS_MAX1619=m
# CONFIG_SENSORS_MAX6650 is not set
CONFIG_SENSORS_PC87360=m
CONFIG_SENSORS_PC87427=m
CONFIG_SENSORS_SIS5595=m
# CONFIG_SENSORS_DME1737 is not set
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47M192=m
CONFIG_SENSORS_SMSC47B397=m
# CONFIG_SENSORS_THMC50 is not set
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_VT1211=m
CONFIG_SENSORS_VT8231=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83791D=m
CONFIG_SENSORS_W83792D=m
CONFIG_SENSORS_W83793=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
CONFIG_SENSORS_HDAPS=m
CONFIG_SENSORS_APPLESMC=m
# CONFIG_HWMON_DEBUG_CHIP is not set
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
CONFIG_ACQUIRE_WDT=m
CONFIG_ADVANTECH_WDT=m
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
CONFIG_SC520_WDT=m
CONFIG_EUROTECH_WDT=m
CONFIG_IB700_WDT=m
CONFIG_IBMASR=m
CONFIG_WAFER_WDT=m
CONFIG_I6300ESB_WDT=m
CONFIG_ITCO_WDT=m
CONFIG_ITCO_VENDOR_SUPPORT=y
# CONFIG_IT8712F_WDT is not set
CONFIG_SC1200_WDT=m
CONFIG_PC87413_WDT=m
CONFIG_60XX_WDT=m
CONFIG_SBC8360_WDT=m
# CONFIG_SBC7240_WDT is not set
CONFIG_CPU5_WDT=m
CONFIG_SMSC37B787_WDT=m
CONFIG_W83627HF_WDT=m
CONFIG_W83697HF_WDT=m
CONFIG_W83877F_WDT=m
CONFIG_W83977F_WDT=m
CONFIG_MACHZ_WDT=m
CONFIG_SBC_EPX_C3_WATCHDOG=m

#
# ISA-based Watchdog Cards
#
CONFIG_PCWATCHDOG=m
CONFIG_MIXCOMWD=m
CONFIG_WDT=m
CONFIG_WDT_501=y

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m
CONFIG_WDT_501_PCI=y

#
# USB-based Watchdog Cards
#
CONFIG_USBPCWATCHDOG=m

#
# Sonics Silicon Backplane
#
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_SM501 is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
# CONFIG_DVB_CORE is not set
# CONFIG_DAB is not set

#
# Graphics support
#
CONFIG_AGP=m
CONFIG_AGP_ALI=m
CONFIG_AGP_ATI=m
CONFIG_AGP_AMD=m
CONFIG_AGP_AMD64=m
CONFIG_AGP_INTEL=m
CONFIG_AGP_NVIDIA=m
CONFIG_AGP_SIS=m
CONFIG_AGP_SWORKS=m
CONFIG_AGP_VIA=m
CONFIG_AGP_EFFICEON=m
CONFIG_DRM=m
CONFIG_DRM_TDFX=m
CONFIG_DRM_R128=m
CONFIG_DRM_RADEON=m
CONFIG_DRM_I810=m
CONFIG_DRM_I830=m
CONFIG_DRM_I915=m
CONFIG_DRM_MGA=m
CONFIG_DRM_SIS=m
CONFIG_DRM_VIA=m
CONFIG_DRM_SAVAGE=m
CONFIG_VGASTATE=m
CONFIG_VIDEO_OUTPUT_CONTROL=m
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
CONFIG_FB_DDC=m
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
# CONFIG_FB_SYS_FOPS is not set
CONFIG_FB_DEFERRED_IO=y
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
CONFIG_FB_BACKLIGHT=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=m
# CONFIG_FB_UVESA is not set
# CONFIG_FB_VESA is not set
# CONFIG_FB_EFI is not set
CONFIG_FB_IMAC=y
# CONFIG_FB_HECUBA is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I810 is not set
# CONFIG_FB_LE80578 is not set
# CONFIG_FB_INTEL is not set
# CONFIG_FB_MATROX is not set
CONFIG_FB_RADEON=m
CONFIG_FB_RADEON_I2C=y
CONFIG_FB_RADEON_BACKLIGHT=y
# CONFIG_FB_RADEON_DEBUG is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_CYBLA is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_GEODE is not set
# CONFIG_FB_VIRTUAL is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=m
# CONFIG_LCD_LTV350QV is not set
CONFIG_BACKLIGHT_CLASS_DEVICE=y
# CONFIG_BACKLIGHT_CORGI is not set
# CONFIG_BACKLIGHT_PROGEAR is not set

#
# Display device support
#
# CONFIG_DISPLAY_SUPPORT is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_VGACON_SOFT_SCROLLBACK is not set
CONFIG_VIDEO_SELECT=y
CONFIG_MDA_CONSOLE=m
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=m
# CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY is not set
# CONFIG_FRAMEBUFFER_CONSOLE_ROTATION is not set
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_LOGO is not set

#
# Sound
#
CONFIG_SOUND=m

#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_PCM_OSS_PLUGINS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
CONFIG_SND_SEQ_RTCTIMER_DEFAULT=y
CONFIG_SND_DYNAMIC_MINORS=y
CONFIG_SND_SUPPORT_OLD_API=y
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set

#
# Generic devices
#
CONFIG_SND_MPU401_UART=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_DUMMY=m
CONFIG_SND_VIRMIDI=m
CONFIG_SND_MTPAV=m
CONFIG_SND_SERIAL_U16550=m
CONFIG_SND_MPU401=m

#
# ISA devices
#
# CONFIG_SND_ADLIB is not set
# CONFIG_SND_AD1816A is not set
# CONFIG_SND_AD1848 is not set
# CONFIG_SND_ALS100 is not set
# CONFIG_SND_AZT2320 is not set
# CONFIG_SND_CMI8330 is not set
# CONFIG_SND_CS4231 is not set
# CONFIG_SND_CS4232 is not set
# CONFIG_SND_CS4236 is not set
# CONFIG_SND_DT019X is not set
# CONFIG_SND_ES968 is not set
# CONFIG_SND_ES1688 is not set
# CONFIG_SND_ES18XX is not set
# CONFIG_SND_SC6000 is not set
# CONFIG_SND_GUSCLASSIC is not set
# CONFIG_SND_GUSEXTREME is not set
# CONFIG_SND_GUSMAX is not set
# CONFIG_SND_INTERWAVE is not set
# CONFIG_SND_INTERWAVE_STB is not set
# CONFIG_SND_OPL3SA2 is not set
# CONFIG_SND_OPTI92X_AD1848 is not set
# CONFIG_SND_OPTI92X_CS4231 is not set
# CONFIG_SND_OPTI93X is not set
# CONFIG_SND_MIRO is not set
# CONFIG_SND_SB8 is not set
# CONFIG_SND_SB16 is not set
# CONFIG_SND_SBAWE is not set
# CONFIG_SND_SGALAXY is not set
# CONFIG_SND_SSCAPE is not set
# CONFIG_SND_WAVEFRONT is not set

#
# PCI devices
#
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS5530 is not set
# CONFIG_SND_CS5535AUDIO is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_HDA_INTEL is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
CONFIG_SND_INTEL8X0=m
CONFIG_SND_INTEL8X0M=m
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_AC97_POWER_SAVE is not set

#
# SPI devices
#

#
# USB devices
#
# CONFIG_SND_USB_AUDIO is not set
# CONFIG_SND_USB_USX2Y is not set
# CONFIG_SND_USB_CAIAQ is not set

#
# PCMCIA devices
#
# CONFIG_SND_VXPOCKET is not set
# CONFIG_SND_PDAUDIOCF is not set

#
# System on Chip audio support
#
# CONFIG_SND_SOC is not set

#
# SoC Audio support for SuperH
#

#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
CONFIG_AC97_BUS=m
CONFIG_HID_SUPPORT=y
CONFIG_HID=m
CONFIG_HID_DEBUG=y
# CONFIG_HIDRAW is not set

#
# USB Input Devices
#
CONFIG_USB_HID=m
CONFIG_USB_HIDINPUT_POWERBOOK=y
# CONFIG_HID_FF is not set
CONFIG_USB_HIDDEV=y

#
# USB HID Boot Protocol drivers
#
CONFIG_USB_KBD=m
CONFIG_USB_MOUSE=m
CONFIG_USB_SUPPORT=y
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
CONFIG_USB_DEVICE_CLASS=y
# CONFIG_USB_DYNAMIC_MINORS is not set
CONFIG_USB_SUSPEND=y
# CONFIG_USB_PERSIST is not set
# CONFIG_USB_OTG is not set

#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
CONFIG_USB_EHCI_SPLIT_ISO=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_EHCI_TT_NEWSCHED=y
# CONFIG_USB_ISP116X_HCD is not set
CONFIG_USB_OHCI_HCD=m
# CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set
# CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=m
CONFIG_USB_U132_HCD=m
CONFIG_USB_SL811_HCD=m
CONFIG_USB_SL811_CS=m
# CONFIG_USB_R8A66597_HCD is not set

#
# USB Device Class drivers
#
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m

#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
#

#
# may also be needed; see USB_STORAGE Help for more information
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
# CONFIG_USB_STORAGE_ISD200 is not set
CONFIG_USB_STORAGE_DPCM=y
CONFIG_USB_STORAGE_USBAT=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_STORAGE_ALAUDA=y
CONFIG_USB_STORAGE_KARMA=y
CONFIG_USB_LIBUSUAL=y

#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m
CONFIG_USB_MON=y

#
# USB port drivers
#

#
# USB Serial Converter support
#
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
CONFIG_USB_SERIAL_AIRCABLE=m
CONFIG_USB_SERIAL_AIRPRIME=m
CONFIG_USB_SERIAL_ARK3116=m
CONFIG_USB_SERIAL_BELKIN=m
# CONFIG_USB_SERIAL_CH341 is not set
CONFIG_USB_SERIAL_WHITEHEAT=m
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
CONFIG_USB_SERIAL_CP2101=m
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_FUNSOFT=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=m
CONFIG_USB_SERIAL_KEYSPAN_MPR=y
CONFIG_USB_SERIAL_KEYSPAN_USA28=y
CONFIG_USB_SERIAL_KEYSPAN_USA28X=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y
CONFIG_USB_SERIAL_KEYSPAN_USA19=y
CONFIG_USB_SERIAL_KEYSPAN_USA18X=y
CONFIG_USB_SERIAL_KEYSPAN_USA19W=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y
CONFIG_USB_SERIAL_KEYSPAN_USA49W=y
CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
CONFIG_USB_SERIAL_MOS7720=m
CONFIG_USB_SERIAL_MOS7840=m
CONFIG_USB_SERIAL_NAVMAN=m
CONFIG_USB_SERIAL_PL2303=m
# CONFIG_USB_SERIAL_OTI6858 is not set
CONFIG_USB_SERIAL_HP4X=m
CONFIG_USB_SERIAL_SAFE=m
# CONFIG_USB_SERIAL_SAFE_PADDED is not set
CONFIG_USB_SERIAL_SIERRAWIRELESS=m
CONFIG_USB_SERIAL_TI=m
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_OPTION=m
CONFIG_USB_SERIAL_OMNINET=m
CONFIG_USB_SERIAL_DEBUG=m
CONFIG_USB_EZUSB=y

#
# USB Miscellaneous drivers
#
CONFIG_USB_EMI62=m
CONFIG_USB_EMI26=m
CONFIG_USB_ADUTUX=m
CONFIG_USB_AUERSWALD=m
CONFIG_USB_RIO500=m
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
# CONFIG_USB_BERRY_CHARGE is not set
CONFIG_USB_LED=m
CONFIG_USB_CYPRESS_CY7C63=m
CONFIG_USB_CYTHERM=m
CONFIG_USB_PHIDGET=m
CONFIG_USB_PHIDGETKIT=m
CONFIG_USB_PHIDGETMOTORCONTROL=m
CONFIG_USB_PHIDGETSERVO=m
CONFIG_USB_IDMOUSE=m
CONFIG_USB_FTDI_ELAN=m
CONFIG_USB_APPLEDISPLAY=m
CONFIG_USB_SISUSBVGA=m
# CONFIG_USB_SISUSBVGA_CON is not set
CONFIG_USB_LD=m
CONFIG_USB_TRANCEVIBRATOR=m
# CONFIG_USB_IOWARRIOR is not set
CONFIG_USB_TEST=m

#
# USB DSL modem support
#
CONFIG_USB_ATM=m
CONFIG_USB_SPEEDTOUCH=m
CONFIG_USB_CXACRU=m
CONFIG_USB_UEAGLEATM=m
CONFIG_USB_XUSBATM=m

#
# USB Gadget Support
#
CONFIG_USB_GADGET=m
# CONFIG_USB_GADGET_DEBUG is not set
# CONFIG_USB_GADGET_DEBUG_FILES is not set
# CONFIG_USB_GADGET_DEBUG_FS is not set
CONFIG_USB_GADGET_SELECTED=y
# CONFIG_USB_GADGET_AMD5536UDC is not set
# CONFIG_USB_GADGET_ATMEL_USBA is not set
# CONFIG_USB_GADGET_FSL_USB2 is not set
CONFIG_USB_GADGET_NET2280=y
CONFIG_USB_NET2280=m
# CONFIG_USB_GADGET_PXA2XX is not set
# CONFIG_USB_GADGET_M66592 is not set
# CONFIG_USB_GADGET_GOKU is not set
# CONFIG_USB_GADGET_LH7A40X is not set
# CONFIG_USB_GADGET_OMAP is not set
# CONFIG_USB_GADGET_S3C2410 is not set
# CONFIG_USB_GADGET_AT91 is not set
# CONFIG_USB_GADGET_DUMMY_HCD is not set
CONFIG_USB_GADGET_DUALSPEED=y
CONFIG_USB_ZERO=m
CONFIG_USB_ETH=m
CONFIG_USB_ETH_RNDIS=y
CONFIG_USB_GADGETFS=m
CONFIG_USB_FILE_STORAGE=m
# CONFIG_USB_FILE_STORAGE_TEST is not set
CONFIG_USB_G_SERIAL=m
CONFIG_USB_MIDI_GADGET=m
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
# CONFIG_MMC_UNSAFE_RESUME is not set

#
# MMC/SD Card Drivers
#
CONFIG_MMC_BLOCK=m
CONFIG_MMC_BLOCK_BOUNCE=y
# CONFIG_SDIO_UART is not set

#
# MMC/SD Host Controller Drivers
#
CONFIG_MMC_SDHCI=m
# CONFIG_MMC_RICOH_MMC is not set
CONFIG_MMC_WBSD=m
CONFIG_MMC_TIFM_SD=m
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=m

#
# LED drivers
#

#
# LED Triggers
#
# CONFIG_LEDS_TRIGGERS is not set
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=m
CONFIG_RTC_CLASS=m

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
CONFIG_RTC_INTF_DEV_UIE_EMUL=y
CONFIG_RTC_DRV_TEST=m

#
# I2C RTC drivers
#
CONFIG_RTC_DRV_DS1307=m
# CONFIG_RTC_DRV_DS1374 is not set
CONFIG_RTC_DRV_DS1672=m
# CONFIG_RTC_DRV_MAX6900 is not set
CONFIG_RTC_DRV_RS5C372=m
CONFIG_RTC_DRV_ISL1208=m
CONFIG_RTC_DRV_X1205=m
CONFIG_RTC_DRV_PCF8563=m
CONFIG_RTC_DRV_PCF8583=m
# CONFIG_RTC_DRV_M41T80 is not set

#
# SPI RTC drivers
#
CONFIG_RTC_DRV_RS5C348=m
CONFIG_RTC_DRV_MAX6902=m

#
# Platform RTC drivers
#
# CONFIG_RTC_DRV_CMOS is not set
CONFIG_RTC_DRV_DS1553=m
# CONFIG_RTC_DRV_STK17TA8 is not set
CONFIG_RTC_DRV_DS1742=m
CONFIG_RTC_DRV_M48T86=m
# CONFIG_RTC_DRV_M48T59 is not set
CONFIG_RTC_DRV_V3020=m

#
# on-CPU RTC drivers
#
# CONFIG_DMADEVICES is not set
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
CONFIG_LGUEST=m

#
# Userspace I/O
#
# CONFIG_UIO is not set
CONFIG_VIRTIO=y
CONFIG_VIRTIO_RING=y

#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_EFI_VARS=y
CONFIG_DELL_RBU=m
CONFIG_DCDBAS=m
CONFIG_DMIID=y

#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_EXT4DEV_FS is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
CONFIG_MINIX_FS=m
CONFIG_ROMFS_FS=m
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_QUOTA=y
# CONFIG_QUOTA_NETLINK_INTERFACE is not set
CONFIG_PRINT_QUOTA_WARNING=y
CONFIG_QFMT_V1=m
CONFIG_QFMT_V2=m
CONFIG_QUOTACTL=y
CONFIG_DNOTIFY=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=m

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
# CONFIG_TMPFS_POSIX_ACL is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_CONFIGFS_FS=m

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
CONFIG_ECRYPT_FS=m
CONFIG_HFS_FS=m
CONFIG_HFSPLUS_FS=m
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=y
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
CONFIG_UFS_FS=m
# CONFIG_UFS_FS_WRITE is not set
# CONFIG_UFS_DEBUG is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
# CONFIG_NFS_V3_ACL is not set
CONFIG_NFS_V4=y
CONFIG_NFS_DIRECTIO=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V3_ACL is not set
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
# CONFIG_SUNRPC_BIND34 is not set
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_RPCSEC_GSS_SPKM3=m
# CONFIG_SMB_FS is not set
CONFIG_CIFS=m
# CONFIG_CIFS_STATS is not set
# CONFIG_CIFS_WEAK_PW_HASH is not set
# CONFIG_CIFS_XATTR is not set
# CONFIG_CIFS_DEBUG2 is not set
# CONFIG_CIFS_EXPERIMENTAL is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
CONFIG_ACORN_PARTITION=y
# CONFIG_ACORN_PARTITION_CUMANA is not set
# CONFIG_ACORN_PARTITION_EESOX is not set
CONFIG_ACORN_PARTITION_ICS=y
# CONFIG_ACORN_PARTITION_ADFS is not set
# CONFIG_ACORN_PARTITION_POWERTEC is not set
CONFIG_ACORN_PARTITION_RISCIX=y
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
CONFIG_ATARI_PARTITION=y
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
CONFIG_LDM_PARTITION=y
# CONFIG_LDM_DEBUG is not set
CONFIG_SGI_PARTITION=y
CONFIG_ULTRIX_PARTITION=y
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="cp437"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_UTF8=m
# CONFIG_DLM is not set
CONFIG_INSTRUMENTATION=y
CONFIG_PROFILING=y
CONFIG_OPROFILE=m
CONFIG_KPROBES=y
# CONFIG_MARKERS is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_PRINTK_TIME=y
CONFIG_ENABLE_WARN_DEPRECATED=y
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_UNUSED_SYMBOLS=y
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SHIRQ is not set
CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_SCHED_DEBUG=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_TIMER_STATS is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_LOCKDEP is not set
CONFIG_TRACE_IRQFLAGS=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
# CONFIG_DEBUG_BUGVERBOSE is not set
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_SG is not set
CONFIG_FRAME_POINTER=y
# CONFIG_FORCED_INLINING is not set
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_LKDTM is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_SAMPLES is not set
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_RODATA is not set
CONFIG_4KSTACKS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_DOUBLEFAULT=y

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_DEBUG_PROC_KEYS is not set
CONFIG_SECURITY=y
CONFIG_SECURITY_NETWORK=y
# CONFIG_SECURITY_NETWORK_XFRM is not set
# CONFIG_SECURITY_CAPABILITIES is not set
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=0
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_ENABLE_SECMARK_DEFAULT is not set
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
CONFIG_XOR_BLOCKS=m
CONFIG_ASYNC_CORE=m
CONFIG_ASYNC_MEMCPY=m
CONFIG_ASYNC_XOR=m
CONFIG_CRYPTO=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_BLKCIPHER=m
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=m
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=m
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_GF128MUL=m
CONFIG_CRYPTO_ECB=m
CONFIG_CRYPTO_CBC=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_LRW=m
# CONFIG_CRYPTO_XTS is not set
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_DES=m
# CONFIG_CRYPTO_FCRYPT is not set
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
CONFIG_CRYPTO_TWOFISH_586=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_AES=m
CONFIG_CRYPTO_AES_586=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_ANUBIS=m
# CONFIG_CRYPTO_SEED is not set
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_CRC32C=m
# CONFIG_CRYPTO_CAMELLIA is not set
CONFIG_CRYPTO_TEST=m
# CONFIG_CRYPTO_AUTHENC is not set
CONFIG_CRYPTO_HW=y
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=m
CONFIG_CRYPTO_DEV_PADLOCK_SHA=m
CONFIG_CRYPTO_DEV_GEODE=m

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
CONFIG_LIBCRC32C=m
CONFIG_AUDIT_GENERIC=y
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_PLIST=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_CHECK_SIGNATURE=y


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-05 20:05                 ` Christoph Lameter
@ 2008-01-07 20:12                   ` Pekka J Enberg
  0 siblings, 0 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-07 20:12 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Matt Mackall, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List, zanussi

Hi Christoph,

On Sat, 5 Jan 2008, Pekka J Enberg wrote:
> > So, I have this silly memory profiler derived from the kleak patches by 
> > the relayfs people and would love to try it out on an embedded workload 
> > where SLUB memory footprint is terrible. Any suggestions?

On Sat, 5 Jan 2008, Christoph Lameter wrote:
> Good idea. But have you tried to look at slabinfo?
> 
> Try to run
> 
> 	slabinfo -t
> 
> which will calculate the allocation overhead of the currently allocated 
> objects in all slab caches.

Oh, I didn't know about that. It's great for looking at the current memory 
footprint but not really suitable if you want to analyze allocation/free 
patterns.

And I think we need to look at the patterns to figure out *where* and 
*why* SLOB is better than SLUB and whether that can be fixed in SLUB or 
the callers. For example, while it's obviously true that SLUB has 
much higher internal fragmentation for kmalloc() than SLOB, it doesn't 
matter as much if the allocations are short-lived. I don't think you can 
extend the current SLUB debug mechanism to provide that; rather,
"kmalloc accounting" is needed.

The problem with my patch, though, is that due to relayfs I cannot capture 
events unless userspace is ready to read them, which is why I am missing 
all the data from boot (I think systemtap has this same limitation?). So 
we need something like what Matt has except that it needs to keep track of 
every *event* and userspace must be able to access them.

			Pekka

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-07 19:03                     ` Matt Mackall
  2008-01-07 19:53                       ` Pekka J Enberg
@ 2008-01-07 20:44                       ` Pekka J Enberg
  2008-01-10 10:04                       ` Pekka J Enberg
  2 siblings, 0 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-07 20:44 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

Hi Matt,

On Mon, 7 Jan 2008, Matt Mackall wrote:
> Fascinating. Which kernel version are you using? This patch doesn't seem
> to have made it to mainline:
> 
> ---
> 
> slob: fix free block merging at head of subpage
> 
> We weren't merging freed blocks at the beginning of the free list.
> Fixing this showed a 2.5% efficiency improvement in a userspace test
> harness.

Hmm, interesting, it definitely improves the best case (although not quite 
on a par with SLUB) but makes worst case and average case significantly 
worse (almost as bad as SLAB):

[ the minimum, maximum, and average are captured from 10 individual runs ]

                                 Free (kB)             Used (kB)
                    Total (kB)   min   max   average   min  max  average
  SLUB (no debug)   26536        23868 23892 23877.6   2644 2668 2658.4
  SLOB              26548        23472 23640 23579.6   2908 3076 2968.4
  SLOB (patched)    26548        23260 23728 23385.2   2820 3288 3162.8
  SLAB (no debug)   26544        23316 23364 23343.2   3180 3228 3200.8
  SLUB (with debug) 26484        23120 23136 23127.2   3348 3364 3356.8

			Pekka

^ permalink raw reply	[flat|nested] 69+ messages in thread

* [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-07 18:06                   ` Pekka J Enberg
  2008-01-07 19:03                     ` Matt Mackall
@ 2008-01-09 19:15                     ` Matt Mackall
  2008-01-09 22:43                       ` Pekka J Enberg
  2008-01-10 10:03                       ` Pekka J Enberg
  1 sibling, 2 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-09 19:15 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Mon, 2008-01-07 at 20:06 +0200, Pekka J Enberg wrote:
> Hi Matt,
> 
> On Sun, 6 Jan 2008, Matt Mackall wrote:
> > I don't have any particular "terrible" workloads for SLUB. But my
> > attempts to simply boot with all three allocators to init=/bin/bash in,
> > say, lguest show a fair margin for SLOB.
> 
> Sorry, I once again have bad news ;-). I did some testing with
> 
>   lguest --block=<rootfile> 32 /boot/vmlinuz-2.6.24-rc6 root=/dev/vda init=doit
> 
> where rootfile is
> 
>   http://uml.nagafix.co.uk/BusyBox-1.5.0/BusyBox-1.5.0-x86-root_fs.bz2
> 
> and the "doit" script in the guest passed as init= is just
> 
>   #!/bin/sh
>   mount -t proc proc /proc
>   cat /proc/meminfo | grep MemTotal
>   cat /proc/meminfo | grep MemFree
>   cat /proc/meminfo | grep Slab
> 
> and the results are:
> 
> [ the minimum, maximum, and average are captured from 10 individual runs ]
> 
>                                  Free (kB)             Used (kB)
>                     Total (kB)   min   max   average   min  max  average
>   SLUB (no debug)   26536        23868 23892 23877.6   2644 2668 2658.4
>   SLOB              26548        23472 23640 23579.6   2908 3076 2968.4
>   SLAB (no debug)   26544        23316 23364 23343.2   3180 3228 3200.8
>   SLUB (with debug) 26484        23120 23136 23127.2   3348 3364 3356.8
> 
> So it seems that on average SLUB uses 300 kilobytes *less memory* (!) (which is
> roughly 1% of total memory available) after boot than SLOB for my
> configuration.
> 
> One possible explanation is that the high internal fragmentation (space
> allocated but not used) of SLUB kmalloc() only affects short-lived allocations
> and thus does not show up in the more permanent memory footprint.  Likewise, it
> could be that SLOB has higher external fragmentation (small blocks that are
> unavailable for allocation) from which SLUB does not suffer.  Dunno, haven't
> investigated as my results are contradictory to yours.

Yep, you were definitely onto something here. Here's what I got with 10
runs of SLUB on my setup:

MemFree:         55780 kB
MemFree:         55780 kB
MemFree:         55780 kB
MemFree:         55784 kB
MemFree:         55788 kB
MemFree:         55788 kB
MemFree:         55788 kB
MemFree:         55792 kB
MemFree:         55796 kB
MemFree:         55800 kB
Avg:             55787.6

And with SLOB + my defrag fix:

MemFree:         55236 kB
MemFree:         55284 kB
MemFree:         55292 kB
MemFree:         55304 kB
MemFree:         55384 kB
MemFree:         55388 kB
MemFree:         55412 kB
MemFree:         55420 kB
MemFree:         55436 kB
MemFree:         55444 kB
Avg:             55360.0            

Ouch!

So I added a bunch of statistics gathering:

counted pages 409 unused 185242 biggest 1005 fragments 1416
slob pages 410 allocs 22528 frees 12109 active 10419 allocated 932647
      page scans 11249 block scans 40650
kmallocs 10247 active 5450 allocated 3484680 overhead 48836
bigpages 827 active 17
total 427 used 245

The first line tells us about SLOB's free list, which has 409 pages,
185k unused, spread into 1416 fragments. The average fragment is 130
bytes.

The next tells us that we've got 410 total SLOB pages (1 is fully
allocated), we've done 22k allocs, 12k frees, have 10k allocations
active, and 932k total memory allocated (including kmallocs). That means
our average SLOB allocation is ~90 bytes.

The kmallocs line tells us we've done 10k allocs, and 5k of them are not
yet freed. Since boot, we requested 3.48MiB of kmalloc (without padding)
and added on 49k of padding. Thus the average kmalloc is 340 bytes and
has 4.77 bytes of padding (1.2% overhead, quite good!).

Slab and kmalloc objects >= 4k are handed straight to the page allocator
(same as SLUB), of which there are 17 active pages.

So in total, SLOB is using 427 pages for what optimally could fit in 245
pages. In other words, external fragmentation is horrible.
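
(The derived figures above are just those counters divided out; a quick
standalone check, using the numbers exactly as printed:)

#include <stdio.h>

int main(void)
{
	/* free list: 185242 unused bytes spread over 1416 fragments */
	printf("avg fragment:        %.1f bytes\n", 185242.0 / 1416);
	/* 932647 live bytes across 10419 active allocations */
	printf("avg SLOB allocation: %.0f bytes\n", 932647.0 / 10419);
	/* kmallocs: 3484680 bytes requested plus 48836 bytes of
	 * padding, over 10247 allocations */
	printf("avg kmalloc:         %.0f bytes\n", 3484680.0 / 10247);
	printf("avg padding:         %.2f bytes\n", 48836.0 / 10247);
	/* 427 total pages used for 245 pages of actual data */
	printf("external overhead:   %.0f%%\n", 100.0 * (427 - 245) / 245);
	return 0;
}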

I kicked this around for a while, slept on it, and then came up with
this little hack first thing this morning:

------------
slob: split free list by size

diff -r 6901ca355181 mm/slob.c
--- a/mm/slob.c	Tue Jan 08 21:01:15 2008 -0600
+++ b/mm/slob.c	Wed Jan 09 12:31:59 2008 -0600
@@ -112,7 +112,9 @@ static inline void free_slob_page(struct
 /*
  * All (partially) free slob pages go on this list.
  */
-static LIST_HEAD(free_slob_pages);
+#define SLOB_BREAK_POINT 300
+static LIST_HEAD(free_slob_pages_big);
+static LIST_HEAD(free_slob_pages_small);
 
 /*
  * slob_page: True for all slob pages (false for bigblock pages)
@@ -140,9 +142,9 @@ static inline int slob_page_free(struct 
 	return test_bit(PG_private, &sp->flags);
 }
 
-static inline void set_slob_page_free(struct slob_page *sp)
+static inline void set_slob_page_free(struct slob_page *sp, struct list_head *list)
 {
-	list_add(&sp->list, &free_slob_pages);
+	list_add(&sp->list, list);
 	__set_bit(PG_private, &sp->flags);
 }
 
@@ -294,12 +296,18 @@ static void *slob_alloc(size_t size, gfp
 {
 	struct slob_page *sp;
 	struct list_head *prev;
+	struct list_head *slob_list;
 	slob_t *b = NULL;
 	unsigned long flags;
 
+	slob_list = &free_slob_pages_small;
+	if (size > SLOB_BREAK_POINT)
+		slob_list = &free_slob_pages_big;
+
 	spin_lock_irqsave(&slob_lock, flags);
 	/* Iterate through each partially free page, try to find room */
-	list_for_each_entry(sp, &free_slob_pages, list) {
+
+	list_for_each_entry(sp, slob_list, list) {
 #ifdef CONFIG_NUMA
 		/*
 		 * If there's a node specification, search for a partial
@@ -321,9 +329,9 @@ static void *slob_alloc(size_t size, gfp
 		/* Improve fragment distribution and reduce our average
 		 * search time by starting our next search here. (see
 		 * Knuth vol 1, sec 2.5, pg 449) */
-		if (prev != free_slob_pages.prev &&
-				free_slob_pages.next != prev->next)
-			list_move_tail(&free_slob_pages, prev->next);
+		if (prev != slob_list->prev &&
+				slob_list->next != prev->next)
+			list_move_tail(slob_list, prev->next);
 		break;
 	}
 	spin_unlock_irqrestore(&slob_lock, flags);
@@ -341,7 +349,7 @@ static void *slob_alloc(size_t size, gfp
 		sp->free = b;
 		INIT_LIST_HEAD(&sp->list);
 		set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
-		set_slob_page_free(sp);
+		set_slob_page_free(sp, slob_list);
 		b = slob_page_alloc(sp, size, align);
 		BUG_ON(!b);
 		spin_unlock_irqrestore(&slob_lock, flags);
@@ -357,6 +365,7 @@ static void slob_free(void *block, int s
 static void slob_free(void *block, int size)
 {
 	struct slob_page *sp;
+	struct list_head *slob_list;
 	slob_t *prev, *next, *b = (slob_t *)block;
 	slobidx_t units;
 	unsigned long flags;
@@ -364,6 +373,10 @@ static void slob_free(void *block, int s
 	if (unlikely(ZERO_OR_NULL_PTR(block)))
 		return;
 	BUG_ON(!size);
+
+	slob_list = &free_slob_pages_small;
+	if (size > SLOB_BREAK_POINT)
+		slob_list = &free_slob_pages_big;
 
 	sp = (struct slob_page *)virt_to_page(block);
 	units = SLOB_UNITS(size);
@@ -387,7 +400,7 @@ static void slob_free(void *block, int s
 		set_slob(b, units,
 			(void *)((unsigned long)(b +
 					SLOB_UNITS(PAGE_SIZE)) & PAGE_MASK));
-		set_slob_page_free(sp);
+		set_slob_page_free(sp, slob_list);
 		goto out;
 	}
------------

And the results are fairly miraculous, so please double-check them on
your setup. The resulting statistics change to this:

small list pages 107 unused 39622 biggest 1516 fragments 3511
big list pages 129 unused 23264 biggest 2076 fragments 232
slob pages 243 allocs 22528 frees 12108 active 10420 allocated 932079
     page scans 8074 block scans 481530
kmallocs 10248 active 5451 allocated 3484220 overhead 42054
bigpages 825 active 16
total 259 used 244

and 10 runs looks like this:

MemFree:         56056 kB
MemFree:         56064 kB
MemFree:         56064 kB
MemFree:         56068 kB
MemFree:         56068 kB
MemFree:         56076 kB
MemFree:         56084 kB
MemFree:         56084 kB
MemFree:         56088 kB
MemFree:         56092 kB
Avg:             56074.4

So the average jumped by 714k from before the patch, became much more
stable, and beat SLUB by 287k. There are also 7 perfectly filled pages
now, up from 1 before. And we can't get a whole lot better than this:
we're using 259 pages for 244 pages of actual data, so our total
overhead is only 6%! For comparison, SLUB's using about 70 pages more
for the same data, so its total overhead appears to be about 35%.

By the way, the break at 300 bytes was just the first number that came
to my head but moving it around didn't seem to help. It might want to
change with page size. Knuth suggests that empirically, arena size/10 is
about the maximum allocation size to avoid fragmentation.
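
(For 4k pages that heuristic lands in the same neighborhood as 300, and
the overhead figures above fall straight out of the page counts; a
quick check, assuming 4096-byte pages:)

#include <stdio.h>

#define PAGE_SIZE 4096

int main(void)
{
	/* Knuth's rule of thumb: max allocation ~ arena size / 10 */
	printf("suggested break point: ~%d bytes\n", PAGE_SIZE / 10);
	/* after the patch: 259 pages used for 244 pages of data */
	printf("SLOB overhead: %.0f%%\n", 100.0 * (259 - 244) / 244);
	/* SLUB: about 70 pages more for the same data */
	printf("SLUB overhead: %.0f%%\n", 100.0 * (259 + 70 - 244) / 244);
	return 0;
}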

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-09 19:15                     ` [RFC PATCH] greatly reduce SLOB external fragmentation Matt Mackall
@ 2008-01-09 22:43                       ` Pekka J Enberg
  2008-01-09 22:59                         ` Matt Mackall
  2008-01-10  2:46                         ` Matt Mackall
  2008-01-10 10:03                       ` Pekka J Enberg
  1 sibling, 2 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-09 22:43 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

Hi Matt,

On Wed, 9 Jan 2008, Matt Mackall wrote:
> I kicked this around for a while, slept on it, and then came up with
> this little hack first thing this morning:
> 
> ------------
> slob: split free list by size
> 

[snip]

> And the results are fairly miraculous, so please double-check them on
> your setup. The resulting statistics change to this:

[snip]
 
> So the average jumped by 714k from before the patch, became much more
> stable, and beat SLUB by 287k. There are also 7 perfectly filled pages
> now, up from 1 before. And we can't get a whole lot better than this:
> we're using 259 pages for 244 pages of actual data, so our total
> overhead is only 6%! For comparison, SLUB's using about 70 pages more
> for the same data, so its total overhead appears to be about 35%.

Unfortunately I only see a slight improvement to SLOB (but it still gets 
beaten by SLUB):

[ the minimum, maximum, and average are captured from 10 individual runs ]

                                 Free (kB)             Used (kB)
                    Total (kB)   min   max   average   min  max  average
  SLUB (no debug)   26536        23868 23892 23877.6   2644 2668 2658.4
  SLOB (patched)    26548        23456 23708 23603.2   2840 3092 2944.8
  SLOB (vanilla)    26548        23472 23640 23579.6   2908 3076 2968.4
  SLAB (no debug)   26544        23316 23364 23343.2   3180 3228 3200.8
  SLUB (with debug) 26484        23120 23136 23127.2   3348 3364 3356.8

What .config did you use? What kind of user-space do you have? (I am still 
using the exact same configuration I described in the first mail.)

			Pekka

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-09 22:43                       ` Pekka J Enberg
@ 2008-01-09 22:59                         ` Matt Mackall
  2008-01-10 10:02                           ` Pekka J Enberg
  2008-01-10  2:46                         ` Matt Mackall
  1 sibling, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-09 22:59 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-10 at 00:43 +0200, Pekka J Enberg wrote:
> Hi Matt,
> 
> On Wed, 9 Jan 2008, Matt Mackall wrote:
> > I kicked this around for a while, slept on it, and then came up with
> > this little hack first thing this morning:
> > 
> > ------------
> > slob: split free list by size
> > 
> 
> [snip]
> 
> > And the results are fairly miraculous, so please double-check them on
> > your setup. The resulting statistics change to this:
> 
> [snip]
>  
> > So the average jumped by 714k from before the patch, became much more
> > stable, and beat SLUB by 287k. There are also 7 perfectly filled pages
> > now, up from 1 before. And we can't get a whole lot better than this:
> > we're using 259 pages for 244 pages of actual data, so our total
> > overhead is only 6%! For comparison, SLUB's using about 70 pages more
> > for the same data, so its total overhead appears to be about 35%.
> 
> Unfortunately I only see a slight improvement to SLOB (but it still gets 
> beaten by SLUB):
> 
> [ the minimum, maximum, and average are captured from 10 individual runs ]
> 
>                                  Free (kB)             Used (kB)
>                     Total (kB)   min   max   average   min  max  average
>   SLUB (no debug)   26536        23868 23892 23877.6   2644 2668 2658.4
>   SLOB (patched)    26548        23456 23708 23603.2   2840 3092 2944.8
>   SLOB (vanilla)    26548        23472 23640 23579.6   2908 3076 2968.4
>   SLAB (no debug)   26544        23316 23364 23343.2   3180 3228 3200.8
>   SLUB (with debug) 26484        23120 23136 23127.2   3348 3364 3356.8

Huh, that's a fairly negligible change on your system. Is that with or
without the earlier patch? That doesn't appear to change much here.
Guess I'll have to clean up my stats patch and send it to you.

> What .config did you use? What kind of user-space do you have? (I am still 
> using the exact same configuration I described in the first mail.)

I'm using lguest with my Thinkpad config (I'll send that separately),
with busybox and init=doit like your setup. Was having trouble getting
lguest going with your config, but I may have found the problem.

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-09 22:43                       ` Pekka J Enberg
  2008-01-09 22:59                         ` Matt Mackall
@ 2008-01-10  2:46                         ` Matt Mackall
  1 sibling, 0 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-10  2:46 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-10 at 00:43 +0200, Pekka J Enberg wrote:
> Hi Matt,
> 
> On Wed, 9 Jan 2008, Matt Mackall wrote:
> > I kicked this around for a while, slept on it, and then came up with
> > this little hack first thing this morning:
> > 
> > ------------
> > slob: split free list by size
> > 
> 
> [snip]
> 
> > And the results are fairly miraculous, so please double-check them on
> > your setup. The resulting statistics change to this:
> 
> [snip]
>  
> > So the average jumped by 714k from before the patch, became much more
> > stable, and beat SLUB by 287k. There are also 7 perfectly filled pages
> > now, up from 1 before. And we can't get a whole lot better than this:
> > we're using 259 pages for 244 pages of actual data, so our total
> > overhead is only 6%! For comparison, SLUB's using about 70 pages more
> > for the same data, so its total overhead appears to be about 35%.
> 
> Unfortunately I only see a slight improvement to SLOB (but it still gets 
> beaten by SLUB):
> 
> [ the minimum, maximum, and average are captured from 10 individual runs ]
> 
>                                  Free (kB)             Used (kB)
>                     Total (kB)   min   max   average   min  max  average
>   SLUB (no debug)   26536        23868 23892 23877.6   2644 2668 2658.4
>   SLOB (patched)    26548        23456 23708 23603.2   2840 3092 2944.8
>   SLOB (vanilla)    26548        23472 23640 23579.6   2908 3076 2968.4
>   SLAB (no debug)   26544        23316 23364 23343.2   3180 3228 3200.8
>   SLUB (with debug) 26484        23120 23136 23127.2   3348 3364 3356.8

With your kernel config and my lguest+busybox setup, I get:

SLUB:
MemFree:         24208 kB
MemFree:         24212 kB
MemFree:         24212 kB
MemFree:         24212 kB
MemFree:         24216 kB
MemFree:         24216 kB
MemFree:         24220 kB
MemFree:         24220 kB
MemFree:         24224 kB
MemFree:         24232 kB
avg:             24217.2

SLOB with two lists:
MemFree:         24204 kB
MemFree:         24260 kB
MemFree:         24260 kB
MemFree:         24276 kB
MemFree:         24288 kB
MemFree:         24292 kB
MemFree:         24312 kB
MemFree:         24320 kB
MemFree:         24336 kB
MemFree:         24396 kB
avg:             24294.4

Not sure why this result is so different from yours.

Hacked this up to three lists to experiment and we now have:
MemFree:         24348 kB
MemFree:         24372 kB
MemFree:         24372 kB
MemFree:         24372 kB
MemFree:         24372 kB
MemFree:         24380 kB
MemFree:         24384 kB
MemFree:         24404 kB
MemFree:         24404 kB
MemFree:         24408 kB
avg:             24381.6

Even the last version is using about 250 pages of storage for 209
pages of data, so it still has about 20% overhead.



-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-09 22:59                         ` Matt Mackall
@ 2008-01-10 10:02                           ` Pekka J Enberg
  2008-01-10 10:54                             ` Pekka J Enberg
  0 siblings, 1 reply; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-10 10:02 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

Hi Matt,

On Wed, 9 Jan 2008, Matt Mackall wrote:
> Huh, that's a fairly negligible change on your system. Is that with or
> without the earlier patch? That doesn't appear to change much here.
> Guess I'll have to clean up my stats patch and send it to you.

Ok, if I apply both of the patches, I get better results for SLOB:

[ the minimum, maximum, and average are captured from 10 individual runs ]

                        Total   Free (kB)             Used (kB)
                        (kB)    min   max   average   min  max  average
  SLUB (no debug)       26536   23868 23892 23877.6   2644 2668 2658.4
  SLOB (both patches)   26548   23612 23860 23766.4   2688 2936 2781.6
  SLOB (two lists)      26548   23456 23708 23603.2   2840 3092 2944.8
  SLOB (vanilla)        26548   23472 23640 23579.6   2908 3076 2968.4
  SLAB (no debug)       26544   23316 23364 23343.2   3180 3228 3200.8
  SLOB (merge fix)      26548   23260 23728 23385.2   2820 3288 3162.8
  SLUB (with debug)     26484   23120 23136 23127.2   3348 3364 3356.8

I'll double check the results for SLUB next but it seems obvious that your 
patches are a net gain for SLOB and should be applied. One problem though 
with SLOB seems to be that its memory efficiency is not so stable. Any 
ideas why that is?

			Pekka

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-09 19:15                     ` [RFC PATCH] greatly reduce SLOB external fragmentation Matt Mackall
  2008-01-09 22:43                       ` Pekka J Enberg
@ 2008-01-10 10:03                       ` Pekka J Enberg
  1 sibling, 0 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-10 10:03 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Wed, 9 Jan 2008, Matt Mackall wrote:
> ------------
> slob: split free list by size
> 

[snip]

Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH] procfs: provide slub's /proc/slabinfo
  2008-01-07 19:03                     ` Matt Mackall
  2008-01-07 19:53                       ` Pekka J Enberg
  2008-01-07 20:44                       ` Pekka J Enberg
@ 2008-01-10 10:04                       ` Pekka J Enberg
  2 siblings, 0 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-10 10:04 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Mon, 7 Jan 2008, Matt Mackall wrote:
> slob: fix free block merging at head of subpage
> 
> We weren't merging freed blocks at the beginning of the free list.
> Fixing this showed a 2.5% efficiency improvement in a userspace test
> harness.

[snip]

Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 10:02                           ` Pekka J Enberg
@ 2008-01-10 10:54                             ` Pekka J Enberg
  2008-01-10 15:44                               ` Matt Mackall
  2008-01-10 16:13                               ` Linus Torvalds
  0 siblings, 2 replies; 69+ messages in thread
From: Pekka J Enberg @ 2008-01-10 10:54 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

Hi Matt,

On Thu, 10 Jan 2008, Pekka J Enberg wrote:
> I'll double check the results for SLUB next but it seems obvious that your 
> patches are a net gain for SLOB and should be applied. One problem though 
> with SLOB seems to be that its memory efficiency is not so stable. Any 
> ideas why that is?

Ok, I did that. The numbers are stable and reproducible. In fact, the 
average for SLUB is within 1 KB of the previous numbers. So, we have the 
same .config, the same userspace, and the same hypervisor, so what's the 
difference here?

We probably don't have the same version of GCC, which perhaps affects 
memory layout (struct sizes) and thus allocation patterns? I have included 
ver_linux from my laptop here:

Linux haji 2.6.24-rc6 #21 SMP Thu Jan 10 12:30:59 EET 2008 i686 GNU/Linux
 
Gnu C                  4.1.3
Gnu make               3.81
binutils               2.18
util-linux             2.13
mount                  2.13
module-init-tools      3.3-pre2
e2fsprogs              1.40.2
reiserfsprogs          3.6.19
pcmciautils            014
PPP                    2.4.4
Linux C Library        2.6.1
Dynamic linker (ldd)   2.6.1
Procps                 3.2.7
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               5.97
udev                   113
wireless-tools         29
Modules Loaded         af_packet ipv6 binfmt_misc rfcomm l2cap uinput radeon drm acpi_cpufreq cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand freq_table cpufreq_conservative dock container joydev snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device pcmcia psmouse snd ipw2200 serio_raw pcspkr hci_usb soundcore snd_page_alloc bluetooth ieee80211 ieee80211_crypt yenta_socket rsrc_nonstatic pcmcia_core iTCO_wdt iTCO_vendor_support video output shpchp pci_hotplug intel_agp button agpgart thinkpad_acpi nvram evdev sg sd_mod ata_piix floppy ata_generic libata scsi_mod e1000 ehci_hcd uhci_hcd usbcore thermal processor fan fuse

			Pekka

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 10:54                             ` Pekka J Enberg
@ 2008-01-10 15:44                               ` Matt Mackall
  2008-01-10 16:13                               ` Linus Torvalds
  1 sibling, 0 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-10 15:44 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Christoph Lameter, Ingo Molnar, Linus Torvalds, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-10 at 12:54 +0200, Pekka J Enberg wrote:
> Hi Matt,
> 
> On Thu, 10 Jan 2008, Pekka J Enberg wrote:
> > I'll double check the results for SLUB next but it seems obvious that your 
> > patches are a net gain for SLOB and should be applied. One problem though 
> > with SLOB seems to be that its memory efficiency is not so stable. Any 
> > ideas why that is?

We're seeing different numbers in each allocator indicating that the
ordering of allocations is not stable. When fragmentation occurs, it
magnifies the underlying instability. On my config, where the split list
combats fragmentation extremely effectively, the stability is quite
good.

Perhaps I'll add a printk to allocs and generate some size histograms.
lguest + grep = relayfs for dummies.

> Ok, I did that. The numbers are stable and reproducible. In fact, the 
> average for SLUB is within 1 KB of the previous numbers. So, we have the 
> same .config, the same userspace, and the same hypervisor, so what's the 
> difference here?

I've got:

gcc version 4.2.3 20080102 (prerelease) (Debian 4.2.2-5)
BusyBox v1.2.1 (2006.11.01-11:21+0000) Built-in shell (ash)

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 10:54                             ` Pekka J Enberg
  2008-01-10 15:44                               ` Matt Mackall
@ 2008-01-10 16:13                               ` Linus Torvalds
  2008-01-10 17:49                                 ` Matt Mackall
                                                   ` (2 more replies)
  1 sibling, 3 replies; 69+ messages in thread
From: Linus Torvalds @ 2008-01-10 16:13 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Matt Mackall, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List



On Thu, 10 Jan 2008, Pekka J Enberg wrote:
> 
> We probably don't have the same version of GCC which perhaps affects 
> memory layout (struct sizes) and thus allocation patterns?

No, struct sizes will not change with compiler versions - that would 
create binary incompatibilities for libraries etc.

So apart from the kernel itself working around some old gcc bugs by making 
spinlocks have different size depending on the compiler version, sizes of 
structures should be the same (as long as the configuration is the same, 
of course).

However, a greedy first-fit allocator will be *very* sensitive to 
allocation pattern differences, so timing will probably make a big 
difference. In contrast, SLUB and SLAB both use fixed sizes per allocation 
queue, which makes them almost totally impervious to any allocation 
pattern from different allocation sizes (they still end up caring about 
the pattern *within* one size, but those tend to be much stabler).

There really is a reason why traditional heaps with first-fit are almost 
never used for any real loads.

(I'm not a fan of slabs per se - I think all the constructor/destructor 
crap is just that: total crap - but the size/type binning is a big deal, 
and I think SLOB was naïve to think a pure first-fit makes any sense. Now 
you guys are size-binning by just two or three bins, and it seems to make 
a difference for some loads, but compared to SLUB/SLAB it's a total hack).

I would suggest that if you guys are really serious about memory use, try 
to do a size-based heap thing, and do best-fit in that heap. Or just some 
really simple size-based binning, eg

	if (size > 2*PAGE_SIZE)
		goto page_allocator;
	bin = lookup_bin[(size+31) >> 5];

or whatever. Because first-fit is *known* to be bad.
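
Spelled out a bit more (a sketch only - hypothetical names, 32-byte
bins as in the snippet above, a free list hanging off each bin):

	#define BIN_SHIFT	5	/* 32-byte bin granularity */
	#define NR_BINS		((2 * PAGE_SIZE) >> BIN_SHIFT)

	static struct list_head bins[NR_BINS];	/* free chunks, by size */

	static struct list_head *lookup_bin(size_t size)
	{
		/* sizes 1..32 share bin 0, 33..64 bin 1, and so on */
		return &bins[(size - 1) >> BIN_SHIFT];
	}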

And try to change it to address-ordered first-fit or something (which is 
much more complex than just plain LIFO, but hey, that's life).

I haven't checked much, but I *think* SLOB is just basic first-fit 
(perhaps the "next-fit" variation?) Next-fit is known to be EVEN WORSE 
than the simple first-fit when it comes to fragmentation (so no, Knuth was 
not always right - let's face it, much of Knuth is simply outdated).

			Linus

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 16:13                               ` Linus Torvalds
@ 2008-01-10 17:49                                 ` Matt Mackall
  2008-01-10 18:28                                   ` Linus Torvalds
                                                     ` (2 more replies)
  2008-01-10 18:13                                 ` Andi Kleen
  2008-07-30 21:51                                 ` Pekka J Enberg
  2 siblings, 3 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-10 17:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka J Enberg, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-10 at 08:13 -0800, Linus Torvalds wrote:
> 
> On Thu, 10 Jan 2008, Pekka J Enberg wrote:
> > 
> > We probably don't have the same version of GCC which perhaps affects 
> > memory layout (struct sizes) and thus allocation patterns?
> 
> No, struct sizes will not change with compiler versions - that would 
> create binary incompatibilities for libraries etc.
> 
> So apart from the kernel itself working around some old gcc bugs by making 
> spinlocks have different size depending on the compiler version, sizes of 
> structures should be the same (as long as the configuration is the same, 
> of course).
> 
> However, a greedy first-fit allocator will be *very* sensitive to 
> allocation pattern differences, so timing will probably make a big 
> difference. In contrast, SLUB and SLAB both use fixed sizes per allocation 
> queue, which makes them almost totally impervious to any allocation 
> pattern from different allocation sizes (they still end up caring about 
> the pattern *within* one size, but those tend to be much stabler).
> 
> There really is a reason why traditional heaps with first-fit are almost 
> never used for any real loads.
> 
> (I'm not a fan of slabs per se - I think all the constructor/destructor 
> crap is just that: total crap - but the size/type binning is a big deal, 
> and I think SLOB was naïve to think a pure first-fit makes any sense. Now 
> you guys are size-binning by just two or three bins, and it seems to make 
> a difference for some loads, but compared to SLUB/SLAB it's a total hack).

Here I'm going to differ with you. The premises of the SLAB concept
(from the original paper) are: 

a) fragmentation of conventional allocators gets worse over time
b) grouping objects of the same -type- (not size) together should mean
they have similar lifetimes and thereby keep fragmentation low
c) slabs can be O(1)
d) constructors and destructors are cache-friendly

There's some truth to (a), but the problem has been quite overstated,
pre-SLAB Linux kernels and countless other systems run for years. And
(b) is known to be false, you just have to look at our dcache and icache
pinning. (c) of course is a whole separate argument and often trumps the
others. And (d) is pretty much crap now too.

And as it happens, SLOB basically always beats SLAB on size.

SLUB only wins when it starts merging caches of different -types- based
on -size-, effectively throwing out the whole (b) concept. Which is
good, because its wrong. So SLUB starts to look a lot like a purely
size-binned allocator.

> I would suggest that if you guys are really serious about memory use, try 
> to do a size-based heap thing, and do best-fit in that heap. Or just some 
> really simple size-based binning, eg
> 
> 	if (size > 2*PAGE_SIZE)
> 		goto page_allocator;

I think both SLOB and SLUB punt everything >= PAGE_SIZE off to the page
allocator.
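
Something like this simplified sketch of the punt (not the exact code
in either allocator):

	if (size >= PAGE_SIZE)	/* too big for the slab layer */
		return (void *)__get_free_pages(gfpflags, get_order(size));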

> 	bin = lookup_bin[(size+31) >> 5]
> 
> or whatever. Because first-fit is *known* to be bad.

It is known to be crummy, but it is NOT known to actually be worse than
the SLAB/SLUB approach when you consider both internal and external
fragmentation. Power-of-two ala SLAB kmalloc basically guarantees your
internal fragmentation will be 30% or so.

In fact, first-fit is known to be pretty good if the ratio of object
size to arena size is reasonable. I've already shown a system booting
with 6% overhead compared to SLUB's 35% (and SLAB's nearly 70%). The
fragmentation measurements for the small object list are in fact quite
nice. Not the best benchmark, but it certainly shows that there's
substantial room for improvement.

Where it hurts is larger objects (task structs, SKBs), which are also a
problem for SLAB/SLUB and I don't think any variation on the search
order is going to help there. If we threw 64k pages at it, those
problems might very well disappear. Hell, it's pretty tempting to throw
vmalloc at it, especially on small boxes..

> And try to change it to address-ordered first-fit or something (which is 
> much more complex than just plain LIFO, but hey, that's life).

Will think about that. Most of the literature here is of limited
usefulness - even Knuth didn't look at 4k heaps, let alone collections
of them.

> I haven't checked much, but I *think* SLOB is just basic first-fit 
> (perhaps the "next-fit" variation?) Next-fit is known to be EVEN WORSE 
> than the simple first-fit when it comes to fragmentation (so no, Knuth was 
> not always right - let's face it, much of Knuth is simply outdated).

The SLOB allocator is inherently two-level. I'm using a next-fit-like
approach to decide which heap (page) to use and first-fit within that
heap.
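
Roughly like this sketch (not the actual slob.c code; the helpers here
are made up):

	static struct slob_page *cursor;	/* page we last allocated from */

	static void *two_level_alloc(size_t size)
	{
		struct slob_page *sp = cursor;

		do {	/* next-fit: resume scanning where we left off */
			void *obj = first_fit_in_page(sp, size); /* made up */

			if (obj) {
				cursor = sp;
				return obj;
			}
			sp = next_slob_page(sp);	/* made up, wraps around */
		} while (sp != cursor);

		return carve_from_new_page(size);	/* made up */
	}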

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 16:13                               ` Linus Torvalds
  2008-01-10 17:49                                 ` Matt Mackall
@ 2008-01-10 18:13                                 ` Andi Kleen
  2008-07-30 21:51                                 ` Pekka J Enberg
  2 siblings, 0 replies; 69+ messages in thread
From: Andi Kleen @ 2008-01-10 18:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka J Enberg, Matt Mackall, Christoph Lameter, Ingo Molnar,
	Hugh Dickins, Andi Kleen, Peter Zijlstra,
	Linux Kernel Mailing List

> I would suggest that if you guys are really serious about memory use, try 
> to do a size-based heap thing, and do best-fit in that heap. Or just some 

iirc best fit usually also has some nasty long term fragmentation behaviour.
That is why it is usually also not used.

-Andi

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 17:49                                 ` Matt Mackall
@ 2008-01-10 18:28                                   ` Linus Torvalds
  2008-01-10 18:42                                     ` Matt Mackall
  2008-01-10 19:16                                   ` Christoph Lameter
  2008-01-10 21:25                                   ` Jörn Engel
  2 siblings, 1 reply; 69+ messages in thread
From: Linus Torvalds @ 2008-01-10 18:28 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Pekka J Enberg, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List



On Thu, 10 Jan 2008, Matt Mackall wrote:
> > 
> > (I'm not a fan of slabs per se - I think all the constructor/destructor 
> > crap is just that: total crap - but the size/type binning is a big deal, 
> > and I think SLOB was naïve to think a pure first-fit makes any sense. Now 
> > you guys are size-binning by just two or three bins, and it seems to make 
> > a difference for some loads, but compared to SLUB/SLAB it's a total hack).
> 
> Here I'm going to differ with you. The premises of the SLAB concept
> (from the original paper) are: 

I really don't think we differ.

The advantage of slab was largely the binning by type. Everything else was 
just a big crock. SLUB does the binning better, by really just making the 
type binning be about what really matters - the *size* of the type.

So my argument was that the type/size binning makes sense (size more so 
than type), but the rest of the original Sun arguments for why slab was 
such a great idea were basically just the crap.

Hard type binning was a mistake (but needed by slab due to the idiotic 
notion that constructors/destructors are "good for caches" - bleargh). I 
suspect that hard size binning is a mistake too (ie there are probably 
cases where you do want to split unused bigger size areas), but the fact 
that all of our allocators are two-level (with the page allocator acting 
as a size-agnostic free space) may help it somewhat.

And yes, I do agree that any current allocator has problems with the big 
sizes that don't fit well into a page or two (like task_struct). That 
said, most of those don't have lots of allocations under many normal 
circumstances (even if there are uses that will really blow them up).

The *big* slab users at least for me tend to be ext3_inode_cache and 
dentry. Everything else is orders of magnitude less. And of the two bad 
ones, ext3_inode_cache is the bad one at 700+ bytes or whatever (resulting 
in ~10% fragmentation just due to the page thing, regardless of whether 
you use an order-0 or order-1 page allocation).

Of course, dentries fit better in a page (due to being smaller), but then 
the bigger number of dentries per page make it harder to actually free 
pages, so then you get fragmentation from that. Oh well. You can't win.

			Linus

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 18:28                                   ` Linus Torvalds
@ 2008-01-10 18:42                                     ` Matt Mackall
  2008-01-10 19:24                                       ` Christoph Lameter
  2008-01-10 19:41                                       ` Linus Torvalds
  0 siblings, 2 replies; 69+ messages in thread
From: Matt Mackall @ 2008-01-10 18:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka J Enberg, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-10 at 10:28 -0800, Linus Torvalds wrote:
> 
> On Thu, 10 Jan 2008, Matt Mackall wrote:
> > > 
> > > (I'm not a fan of slabs per se - I think all the constructor/destructor 
> > > crap is just that: total crap - but the size/type binning is a big deal, 
> > > and I think SLOB was naïve to think a pure first-fit makes any sense. Now 
> > > you guys are size-binning by just two or three bins, and it seems to make 
> > > a difference for some loads, but compared to SLUB/SLAB it's a total hack).
> > 
> > Here I'm going to differ with you. The premises of the SLAB concept
> > (from the original paper) are: 
> 
> I really don't think we differ.
> 
> The advantage of slab was largely the binning by type. Everything else was 
> just a big crock. SLUB does the binning better, by really just making the 
> type binning be about what really matters - the *size* of the type.
> 
> So my argument was that the type/size binning makes sense (size more so 
> than type), but the rest of the original Sun arguments for why slab was 
> such a great idea were basically just the crap.
> 
> Hard type binning was a mistake (but needed by slab due to the idiotic 
> notion that constructors/destructors are "good for caches" - bleargh). I 
> suspect that hard size binning is a mistake too (ie there are probably 
> cases where you do want to split unused bigger size areas), but the fact 
> that all of our allocators are two-level (with the page allocator acting 
> as a size-agnostic free space) may help it somewhat.
> 
> And yes, I do agree that any current allocator has problems with the big 
> sizes that don't fit well into a page or two (like task_struct). That 
> said, most of those don't have lots of allocations under many normal 
> circumstances (even if there are uses that will really blow them up).
> 
> The *big* slab users at least for me tend to be ext3_inode_cache and 
> dentry. Everything else is orders of magnitude less. And of the two bad 
> ones, ext3_inode_cache is the bad one at 700+ bytes or whatever (resulting 
> in ~10% fragmentation just due to the page thing, regardless of whether 
> you use an order-0 or order-1 page allocation).
> 
> Of course, dentries fit better in a page (due to being smaller), but then 
> the bigger number of dentries per page make it harder to actually free 
> pages, so then you get fragmentation from that. Oh well. You can't win.

One idea I've been kicking around is pushing the boundary for the buddy
allocator back a bit (to 64k, say) and using SL*B under that. The page
allocators would call into buddy for larger than 64k (rare!) and SL*B
otherwise. This would let us greatly improve our handling of things like
task structs and skbs and possibly also things like 8k stacks and jumbo
frames. As SL*B would never be competing with the page allocator for
contiguous pages (the buddy allocator's granularity would be 64k), I
don't think this would exacerbate the page-level fragmentation issues.

Crazy?

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 17:49                                 ` Matt Mackall
  2008-01-10 18:28                                   ` Linus Torvalds
@ 2008-01-10 19:16                                   ` Christoph Lameter
  2008-01-10 19:23                                     ` Matt Mackall
  2008-01-10 21:25                                   ` Jörn Engel
  2 siblings, 1 reply; 69+ messages in thread
From: Christoph Lameter @ 2008-01-10 19:16 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 10 Jan 2008, Matt Mackall wrote:

> Here I'm going to differ with you. The premises of the SLAB concept
> (from the original paper) are: 
> 
> a) fragmentation of conventional allocators gets worse over time

Even fragmentation of SLAB/SLUB gets worse over time. That is why we need 
a defrag solution.

> b) grouping objects of the same -type- (not size) together should mean
> they have similar lifetimes and thereby keep fragmentation low

I agree that is crap. The lifetimes argument is mostly only exploitable in 
benchmarks. That is why SLUB just groups them by size if possible.

> d) constructors and destructors are cache-friendly

I agree. Crap too. We removed the destructors. The constructors are needed 
so that objects in slab pages always have a definite state. That is f.e.
necessary for slab defragmentation because it has to be able to inspect an 
object at an arbitrary time and either remove it or move it to another 
slab page.

Constructors also make sense because the initialization of a cache object 
may be expensive. Initializing list heads and spinlocks can take some code 
and that code can be omitted if objects have a definite state when they 
are free. We saw that when measuring the buffer_head constructor's effect 
on performance.
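
To illustrate (only a sketch - the ctor signature has varied between
kernel versions, this uses the single-argument form, and "foo" is a
made-up cache):

	struct foo {
		spinlock_t lock;
		struct list_head list;
	};

	static struct kmem_cache *foo_cache;

	static void foo_ctor(void *obj)
	{
		struct foo *f = obj;

		/* freed objects keep this state, so allocation can skip it */
		spin_lock_init(&f->lock);
		INIT_LIST_HEAD(&f->list);
	}

	static int __init foo_init(void)
	{
		foo_cache = kmem_cache_create("foo", sizeof(struct foo),
					      0, 0, foo_ctor);
		return foo_cache ? 0 : -ENOMEM;
	}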

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 19:16                                   ` Christoph Lameter
@ 2008-01-10 19:23                                     ` Matt Mackall
  2008-01-10 19:31                                       ` Christoph Lameter
  0 siblings, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-10 19:23 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-10 at 11:16 -0800, Christoph Lameter wrote:
> On Thu, 10 Jan 2008, Matt Mackall wrote:
> 
> > Here I'm going to differ with you. The premises of the SLAB concept
> > (from the original paper) are: 
> > 
> > a) fragmentation of conventional allocators gets worse over time
> 
> Even fragmentation of SLAB/SLUB gets worse over time. That is why we need 
> a defrag solution.
> 
> > b) grouping objects of the same -type- (not size) together should mean
> > they have similar lifetimes and thereby keep fragmentation low
> 
> I agree that is crap. The lifetimes argument is mostly only exploitable in 
> benchmarks. That is why SLUB just groups them by size if possible.
> 
> > d) constructors and destructors are cache-friendly
> 
> I agree. Crap too. We removed the destructors. The constructors are needed 
> so that objects in slab pages always have a definite state. That is f.e.
> necessary for slab defragmentation because it has to be able to inspect an 
> object at an arbitrary time and either remove it or move it to another 
> slab page.

Are you saying that the state of -freed- objects matters for your active
defragmentation? That's odd.

> Constructors also make sense because the initialization of a cache object 
> may be expensive. Initializing list heads and spinlocks can take some code 
> and that code can be omitted if objects have a definite state when they 
> are free. We saw that when measuring the buffer_head constructor's effect 
> on performance.

Hmm. SLOB proves that you don't need to segregate objects based on
constructors, so you could combine even slabs that have constructors and
just delay construction until allocation. I'm surprised constructors
have measurable advantage..

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 18:42                                     ` Matt Mackall
@ 2008-01-10 19:24                                       ` Christoph Lameter
  2008-01-10 19:44                                         ` Matt Mackall
  2008-01-10 19:41                                       ` Linus Torvalds
  1 sibling, 1 reply; 69+ messages in thread
From: Christoph Lameter @ 2008-01-10 19:24 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 10 Jan 2008, Matt Mackall wrote:

> One idea I've been kicking around is pushing the boundary for the buddy
> allocator back a bit (to 64k, say) and using SL*B under that. The page
> allocators would call into buddy for larger than 64k (rare!) and SL*B
> otherwise. This would let us greatly improve our handling of things like
> task structs and skbs and possibly also things like 8k stacks and jumbo
> frames. As SL*B would never be competing with the page allocator for
> contiguous pages (the buddy allocator's granularity would be 64k), I
> don't think this would exacerbate the page-level fragmentation issues.

This would create another large page size (and that would have my 
enthusiastic support). It would decrease listlock effect drastically for 
SLUB. Even the initial simplistic implementation of SLUB was superior on 
the database transaction tests (I think it was up ~1%) on IA64 from the 
get go. Likely due to the larger 16k page size there. The larger page size 
could also be used for the page cache (ducking and running.....)? A 64k 
page size that could be allocated without zone locks would be very good 
for SLUB.

However, isn't this basically confessing that the page allocator is not 
efficient for 4k page allocations? I have seen some weaknesses there. The 
overhead in the allocation path in particular is bad and I was thinking 
about applying the same ideas used in SLUB also to the page allocator in 
order to bring the cycle count down from 500-1000 to 60 or so. Since both 
SLUB and SLOB use the page allocator for allocs >PAGE_SIZE this would not 
only benefit the general kernel but also the slab allocations.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 19:23                                     ` Matt Mackall
@ 2008-01-10 19:31                                       ` Christoph Lameter
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-10 19:31 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 10 Jan 2008, Matt Mackall wrote:

> > I agree. Crap too. We removed the destructors. The constructors are needed 
> > so that objects in slab pages always have a definite state. That is f.e.
> > necessary for slab defragmentation because it has to be able to inspect an 
> > object at an arbitrary time and either remove it or move it to another 
> > slab page.
> 
> Are you saying that the state of -freed- objects matters for your active
> defragmentation? That's odd.

The state of the object immediately after it is allocated matters for a 
defrag solution. A kmalloc leads to an object in an undetermined state if 
you have no constructor. Code will then initialize the object but defrag 
f.e. must be able to inspect the object before. This means either that the 
freed object has a defined state or that kmalloc establishes that state 
before the object is marked as allocated.

> > Constructors also make sense because the initialization of a cache object 
> > may be expensive. Initializing list heads and spinlocks can take some code 
> > and that code can be omitted if objects have a definite state when they 
> > are free. We saw that when measuring the buffer_head constructor's effect 
> > on performance.
> 
> Hmm. SLOB proves that you don't need to segregate objects based on
> constructors, so you could combine even slabs that have constructors and
> just delay construction until allocation. I'm surprised constructors
> have measurable advantage..

That does not work if you need to inspect allocated objects at any time 
for a defrag solution. All objects in a defragmentable slab need to have a 
consistent object state if allocated. If you have some without 
constructors then these object have no defined state and may contain 
arbitrary bytes.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 18:42                                     ` Matt Mackall
  2008-01-10 19:24                                       ` Christoph Lameter
@ 2008-01-10 19:41                                       ` Linus Torvalds
  2008-01-10 19:46                                         ` Christoph Lameter
  2008-01-10 19:53                                         ` Andi Kleen
  1 sibling, 2 replies; 69+ messages in thread
From: Linus Torvalds @ 2008-01-10 19:41 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Pekka J Enberg, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List



On Thu, 10 Jan 2008, Matt Mackall wrote:
> 
> One idea I've been kicking around is pushing the boundary for the buddy
> allocator back a bit (to 64k, say) and using SL*B under that. The page
> allocators would call into buddy for larger than 64k (rare!) and SL*B
> otherwise. This would let us greatly improve our handling of things like
> task structs and skbs and possibly also things like 8k stacks and jumbo
> frames.

Yes, something like that may well be reasonable. It could possibly solve 
some of the issues for bigger page cache sizes too, but one issue is that 
many things actually end up having those power-of-two alignment 
constraints too - so an 8kB allocation would often still have to be 
naturally aligned, which then removes some of the freedom.

> Crazy?

It sounds like it might be worth trying out - there's just no way to know 
how well it would work. Buddy allocators sure as hell have problems too, 
no question about that. It's not like the page allocator is perfect.

It's not even clear that a buddy allocator even for the high-order pages 
is at all the right choice. Almost nobody actually wants >64kB blocks, and 
the ones that *do* want bigger allocations tend to want *much* bigger 
ones, so it's quite possible that it could be worth it to have something 
like a three-level allocator:

 - huge pages (superpages for those crazy db people)

   Just a simple linked list of these things is fine, we'd never care 
   about coalescing large pages together anyway.

 - "large pages" (on the order of ~64kB) - with *perhaps* a buddy bitmap 
   setup to try to coalesce back into huge-pages, but more likely just 
   admitting that you'd need something like migration to ever get back a 
   hugepage that got split into large-pages.

   So maybe a simple bitmap allocator per huge-page for large pages. Say 
   you have a 4MB huge-page, and just a 64-bit free-bitmap per huge-page 
   when you split it into large pages.

 - slab/slub/slob for anything else, and "get_free_page()" ends up being 
   just a shorthand for saying "naturally aligned kmalloc of size 
   "PAGE_SIZE<<order"

and maybe it would all work out ok. 
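
For the middle level, something as dumb as a per-huge-page bitmap might
do (made-up types; assuming 4MB huge pages and 64kB large pages, so 64
slots per huge page):

	struct huge_page {
		void	*base;		/* 4MB region, naturally aligned */
		u64	free_mask;	/* bit set => 64kB slot is free */
	};

	static void *alloc_large_page(struct huge_page *hp)
	{
		unsigned int slot;

		if (!hp->free_mask)
			return NULL;		/* fully carved up and in use */
		slot = __ffs64(hp->free_mask);	/* lowest free slot */
		hp->free_mask &= ~(1ULL << slot);
		return hp->base + slot * SZ_64K;
	}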

			Linus

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 19:24                                       ` Christoph Lameter
@ 2008-01-10 19:44                                         ` Matt Mackall
  2008-01-10 19:51                                           ` Christoph Lameter
  0 siblings, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-01-10 19:44 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List


On Thu, 2008-01-10 at 11:24 -0800, Christoph Lameter wrote:
> On Thu, 10 Jan 2008, Matt Mackall wrote:
> 
> > One idea I've been kicking around is pushing the boundary for the buddy
> > allocator back a bit (to 64k, say) and using SL*B under that. The page
> > allocators would call into buddy for larger than 64k (rare!) and SL*B
> > otherwise. This would let us greatly improve our handling of things like
> > task structs and skbs and possibly also things like 8k stacks and jumbo
> > frames. As SL*B would never be competing with the page allocator for
> > contiguous pages (the buddy allocator's granularity would be 64k), I
> > don't think this would exacerbate the page-level fragmentation issues.
> 
> This would create another large page size (and that would have my 
> enthusiastic support).

Well, I think we'd still have the same page size, in the sense that we'd
have a struct page for every hardware page and we'd still have hardware
page-sized pages in the page cache. We'd just change how we allocated
them. Right now we've got a stack that looks like:

 buddy / page allocator
 SL*B allocator
 kmalloc

And we'd change that to:

 buddy allocator
 SL*B allocator
 page allocator / kmalloc

So get_free_page() would still hand you back a hardware page, it would
just do it through SL*B.
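
In other words, something like this sketch (kmalloc_aligned() is a
made-up helper that returns memory aligned to its second argument):

	unsigned long __get_free_pages(gfp_t gfpflags, unsigned int order)
	{
		size_t size = PAGE_SIZE << order;

		return (unsigned long)kmalloc_aligned(size, size, gfpflags);
	}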

>  It would decrease listlock effect drastically for SLUB.

Not sure what you're referring to here.

> However, isn't this basically confessing that the page allocator is not 
> efficient for 4k page allocations?

Well I wasn't thinking of doing this for any performance reasons. But
there certainly could be some.
-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 19:41                                       ` Linus Torvalds
@ 2008-01-10 19:46                                         ` Christoph Lameter
  2008-01-10 19:53                                         ` Andi Kleen
  1 sibling, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-10 19:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matt Mackall, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 10 Jan 2008, Linus Torvalds wrote:

> It's not even clear that a buddy allocator even for the high-order pages 
> is at all the right choice. Almost nobody actually wants >64kB blocks, and 
> the ones that *do* want bigger allocations tend to want *much* bigger 
> ones, so it's quite possible that it could be worth it to have something 
> like a three-level allocator:

Excellent! I am definitely on board with this.

>  - huge pages (superpages for those crazy db people)
> 
>    Just a simple linked list of these things is fine, we'd never care 
>    about coalescing large pages together anyway.
> 
>  - "large pages" (on the order of ~64kB) - with *perhaps* a buddy bitmap 
>    setup to try to coalesce back into huge-pages, but more likely just 
>    admitting that you'd need something like migration to ever get back a 
>    hugepage that got split into large-pages.
> 
>    So maybe a simple bitmap allocator per huge-page for large pages. Say 
>    you have a 4MB huge-page, and just a 64-bit free-bitmap per huge-page 
>    when you split it into large pages.
> 
>  - slab/slub/slob for anything else, and "get_free_page()" ends up being 
>    just a shorthand for saying "naturally aligned kmalloc of size 
>    "PAGE_SIZE<<order"
> 
> and maybe it would all work out ok. 

Hmmm... a 3-level allocator? Basically we would have BASE_PAGE, 
STANDARD_PAGE and HUGE_PAGE? We could simply extend the page allocator to 
have 3 pcp lists for these sizes and go from there?

Thinking about the arches this would mean

	BASE_PAGE	STANDARD_PAGE	HUGE_PAGE
x86_64	4k		64k		2M
i386	4k		16k		4M
ia64	16k		256k		1G

?


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 19:44                                         ` Matt Mackall
@ 2008-01-10 19:51                                           ` Christoph Lameter
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-10 19:51 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 10 Jan 2008, Matt Mackall wrote:

> Well, I think we'd still have the same page size, in the sense that we'd
> have a struct page for every hardware page and we'd still have hardware
> page-sized pages in the page cache. We'd just change how we allocated
> them. Right now we've got a stack that looks like:

We would not change the hardware page. Cannot do that. But we would have 
preferential treatment for 64k and 2M pages in the page allocator?

>  buddy / page allocator
>  SL*B allocator
>  kmalloc
> 
> And we'd change that to:
> 
>  buddy allocator
>  SL*B allocator
>  page allocator / kmalloc
> 
> So get_free_page() would still hand you back a hardware page, it would
> just do it through SL*B.

Hmm.... Not sure what effect this would have. We already have the pcp's 
that have a similar effect.
 
> >  It would decrease listlock effect drastically for SLUB.
> 
> Not sure what you're referring to here.

Allocating in 64k chunks means 16 times fewer basic allocation blocks to 
manage for the slab allocators, so the metadata to be maintained by the 
allocators is reduced by that factor. SLUB only needs to touch the 
list_lock (in some situations, like a free to a non-cpu slab) if a block 
becomes completely empty or goes from fully allocated to partially 
allocated. The larger the block size, the more objects there are in a 
block and the fewer of these actions needing a per-node lock.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 19:53                                         ` Andi Kleen
@ 2008-01-10 19:52                                           ` Christoph Lameter
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-01-10 19:52 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Linus Torvalds, Matt Mackall, Pekka J Enberg, Ingo Molnar,
	Hugh Dickins, Peter Zijlstra, Linux Kernel Mailing List

On Thu, 10 Jan 2008, Andi Kleen wrote:

> I did essentially that for my GBpages hugetlbfs patchkit. GB pages are already
> beyond MAX_ORDER and increasing MAX_ORDER didn't seem attractive because
> it would require aligning the zones all to 1GB, which would be quite nasty.

I am very very interested in that work and I could not find it when I 
looked for it a couple of weeks back. Can you send me a copy?


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 19:41                                       ` Linus Torvalds
  2008-01-10 19:46                                         ` Christoph Lameter
@ 2008-01-10 19:53                                         ` Andi Kleen
  2008-01-10 19:52                                           ` Christoph Lameter
  1 sibling, 1 reply; 69+ messages in thread
From: Andi Kleen @ 2008-01-10 19:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matt Mackall, Pekka J Enberg, Christoph Lameter, Ingo Molnar,
	Hugh Dickins, Andi Kleen, Peter Zijlstra,
	Linux Kernel Mailing List

>  - huge pages (superpages for those crazy db people)
> 
>    Just a simple linked list of these things is fine, we'd never care 
>    about coalescing large pages together anyway.

I did essentially that for my GBpages hugetlbfs patchkit. GB pages are already
beyond MAX_ORDER and increasing MAX_ORDER didn't seem attractive because
it would require aligning the zones all to 1GB, which would be quite nasty.

So it just grabbed them out of bootmem early and managed them in a 
per node list.

Not sure it's a good idea for 2MB pages though.

-Andi

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 17:49                                 ` Matt Mackall
  2008-01-10 18:28                                   ` Linus Torvalds
  2008-01-10 19:16                                   ` Christoph Lameter
@ 2008-01-10 21:25                                   ` Jörn Engel
  2 siblings, 0 replies; 69+ messages in thread
From: Jörn Engel @ 2008-01-10 21:25 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Christoph Lameter, Ingo Molnar,
	Hugh Dickins, Andi Kleen, Peter Zijlstra,
	Linux Kernel Mailing List

On Thu, 10 January 2008 11:49:25 -0600, Matt Mackall wrote:
> 
> b) grouping objects of the same -type- (not size) together should mean
> they have similar lifetimes and thereby keep fragmentation low
> 
> (b) is known to be false, you just have to look at our dcache and icache
> pinning.

(b) is half-true, actually.  The grouping by lifetime makes a lot of
sense.  LogFS has a similar problem to slabs (only full segments are
useful, a single object can pin the segment).  And when I grouped my
objects very roughly by their life expectancy, the impact was *HUGE*!

In both cases, you want slabs/segments that are either close to 100%
full or close to 0% full.  It matters a lot when you have to move
objects around and I would bet it matters even more when you cannot move
objects and the slab just remains pinned.

So just because the type alone is a relatively bad heuristic for life
expectancy does not make the concept false.  Bonwick was onto something.
He just failed in picking a good heuristic.  Quite likely spreading by
type was even a bonus when slab was developed, because even such a crude
heuristic is slightly better than completely randomized lifetimes.

I've been meaning to split the dentry cache into 2-3 separate ones for a 
while and kept spending my time elsewhere.  But I remain convinced that
this will make a measurable difference.

Jörn

-- 
Never argue with idiots - first they drag you down to their level,
then they beat you with experience.
-- unknown

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-01-10 16:13                               ` Linus Torvalds
  2008-01-10 17:49                                 ` Matt Mackall
  2008-01-10 18:13                                 ` Andi Kleen
@ 2008-07-30 21:51                                 ` Pekka J Enberg
  2008-07-30 22:00                                   ` Linus Torvalds
  2 siblings, 1 reply; 69+ messages in thread
From: Pekka J Enberg @ 2008-07-30 21:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matt Mackall, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes

Hi Linus,

(I'm replying to an old thread.)

On Thu, 10 Jan 2008, Linus Torvalds wrote:
> I would suggest that if you guys are really serious about memory use, try 
> to do a size-based heap thing, and do best-fit in that heap. Or just some 
> really simple size-based binning, eg
> 
> 	if (size > 2*PAGE_SIZE)
> 		goto page_allocator;
> 	bin = lookup_bin[(size+31) >> 5];
> 
> or whatever. Because first-fit is *known* to be bad.
> 
> And try to change it to address-ordered first-fit or something (which is 
> much more complex than just plain LIFO, but hey, that's life).
> 
> I haven't checked much, but I *think* SLOB is just basic first-fit 
> (perhaps the "next-fit" variation?) Next-fit is known to be EVEN WORSE 
> than the simple first-fit when it comes to fragmentation (so no, Knuth was 
> not always right - let's face it, much of Knuth is simply outdated).

In case you're interested, it turns out that best-fit with binning gives 
roughly the same results as SLOB (or alternatively, I messed something up 
with the design and implementation). The interesting bit here is that 
BINALLOC is more stable than SLOB (almost as stable as SLUB in my tests).

		Pekka

Subject: [PATCH] binalloc: best-fit allocation with binning
From: Pekka Enberg <penberg@cs.helsinki.fi>

As suggested by Linus, to optimize memory use, I have implemented a best-fit
general purpose kernel memory allocator with binning. You can find the original
discussion here:

  http://lkml.org/lkml/2008/1/10/229

The results are as follows:

[ the minimum, maximum, and average are captured from 25 individual runs ]

                Total   Free (kB)                          Used (kB)
                (kB)    min     max     average    sd      min   max   average
SLOB		122372  117676  117768  117721.12  20.51   4604  4696  4650.88
BINALLOC        122368  117672  117732  117699.68  16.74   4636  4696  4668.32
SLUB (no debug) 122360  117284  117328  117308.96  15.27   5032  5076  5051.04

Thanks to Vegard Nossum for his help with debugging and testing the allocator
and to Johannes Weiner for fixing my bugs.

Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
---
diff --git a/include/linux/binalloc_def.h b/include/linux/binalloc_def.h
new file mode 100644
index 0000000..a6ea99e
--- /dev/null
+++ b/include/linux/binalloc_def.h
@@ -0,0 +1,34 @@
+#ifndef __LINUX_BINALLOC_DEF_H
+#define __LINUX_BINALLOC_DEF_H
+
+void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
+
+static inline void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
+{
+	return kmem_cache_alloc_node(cachep, flags, -1);
+}
+
+void *__kmalloc_node(size_t size, gfp_t flags, int node);
+
+static inline void *kmalloc_node(size_t size, gfp_t flags, int node)
+{
+	return __kmalloc_node(size, flags, node);
+}
+
+/**
+ * kmalloc - allocate memory
+ * @size: how many bytes of memory are required.
+ * @flags: the type of memory to allocate (see kcalloc).
+ *
+ * kmalloc is the normal method of allocating memory
+ * in the kernel.
+ */
+static inline void *kmalloc(size_t size, gfp_t flags)
+{
+	return __kmalloc_node(size, flags, -1);
+}
+
+void *__kmalloc(size_t size, gfp_t flags);
+
+#endif /* __LINUX_BINALLOC_DEF_H */
+
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 5ff9676..eeda03d 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -124,6 +124,8 @@ size_t ksize(const void *);
 #include <linux/slub_def.h>
 #elif defined(CONFIG_SLOB)
 #include <linux/slob_def.h>
+#elif defined(CONFIG_BINALLOC)
+#include <linux/binalloc_def.h>
 #else
 #include <linux/slab_def.h>
 #endif
@@ -186,7 +188,7 @@ static inline void *kcalloc(size_t n, size_t size, gfp_t flags)
 	return __kmalloc(n * size, flags | __GFP_ZERO);
 }
 
-#if !defined(CONFIG_NUMA) && !defined(CONFIG_SLOB)
+#if !defined(CONFIG_NUMA) && !defined(CONFIG_SLOB) && !defined(CONFIG_BINALLOC)
 /**
  * kmalloc_node - allocate memory from a specific node
  * @size: how many bytes of memory are required.
diff --git a/init/Kconfig b/init/Kconfig
index 250e02c..b9a6325 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -774,6 +774,13 @@ config SLOB
 	   allocator. SLOB is generally more space efficient but
 	   does not perform as well on large systems.
 
+config BINALLOC
+	depends on EMBEDDED
+	bool "BINALLOC"
+	help
+	   A best-fit general-purpose kernel memory allocator with binning
+	   that tries to be as memory efficient as possible.
+
 endchoice
 
 config PROFILING
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e1d4764..29e253a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -290,6 +290,13 @@ config SLUB_STATS
 	  out which slabs are relevant to a particular load.
 	  Try running: slabinfo -DA
 
+config BINALLOC_DEBUG
+	bool "Debug binalloc memory allocations"
+	default n
+	help
+	  Say Y here to have the memory allocator do sanity checks on
+	  allocation and deallocation.
+
 config DEBUG_PREEMPT
 	bool "Debug preemptible kernel"
 	depends on DEBUG_KERNEL && PREEMPT && (TRACE_IRQFLAGS_SUPPORT || PPC64)
diff --git a/mm/Makefile b/mm/Makefile
index da4ccf0..94ed767 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -28,6 +28,7 @@ obj-$(CONFIG_SLOB) += slob.o
 obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o
 obj-$(CONFIG_SLAB) += slab.o
 obj-$(CONFIG_SLUB) += slub.o
+obj-$(CONFIG_BINALLOC) += binalloc.o
 obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
 obj-$(CONFIG_FS_XIP) += filemap_xip.o
 obj-$(CONFIG_MIGRATION) += migrate.o
diff --git a/mm/binalloc.c b/mm/binalloc.c
new file mode 100644
index 0000000..ecdfd33
--- /dev/null
+++ b/mm/binalloc.c
@@ -0,0 +1,826 @@
+/*
+ * A best-fit general purpose kernel memory allocator with binning.
+ *
+ * Copyright (C) 2008  Pekka Enberg
+ *
+ * This file is released under the GPLv2
+ *
+ * I. Overview
+ *
+ * This is a best-fit general purpose kernel memory allocator with binning. We
+ * use the page allocator to allocate one big contiguous chunk that is split
+ * into smaller chunks for successive kmalloc() calls until we run out of
+ * space, after which a new page is allocated.
+ *
+ * This memory allocator bears close resemblance to the "Lea" memory allocator
+ * described in:
+ *
+ *   http://g.oswego.edu/dl/html/malloc.html
+ *
+ * II. Anatomy of a chunk
+ *
+ * To detect page boundaries, a zero-sized sentinel chunk is placed at the end
+ * of a page. Objects are aligned by padding bytes at the beginning of a chunk
+ * after the boundary tag. The padding is included in the allocated object size
+ * so that neighbor boundary tags can be calculated. Likewise, boundary tags
+ * are aligned by padding bytes at the end of a chunk when splitting the chunk
+ * so that the first non-allocated byte is properly aligned.
+ *
+ * III. Operation
+ *
+ * In the following example, we assume a 64-byte page although on real machines
+ * the page size ranges from 4 KB to 64 KB. The size of the boundary tag in
+ * this example is 4 bytes and the mandatory alignment required by the machine
+ * is 4 bytes.
+ *
+ * Initially, the kernel memory allocator allocates one page from the page
+ * allocator and turns that into contiguous chunk with sentinel. You can see
+ * the boundary tag bytes marked as 'B' and the sentinel chunk boundary tag
+ * bytes as 'S' in the following diagram:
+ *
+ *   0       8       16      24      32      40      48      56
+ *   +-------+-------+-------+-------+-------+-------+-------+-------
+ *   BBBB                                                        SSSS
+ *
+ * Now let's assume a kmalloc() call comes in and wants to allocate 3 bytes. As
+ * we find a big enough chunk, we simply split it in two parts as follows:
+ *
+ *   0       8       16      24      32      40      48      56
+ *   +-------+-------+-------+-------+-------+-------+-------+-------
+ *   BBBBOOOPBBBB                                                SSSS
+ *
+ * As the pointer after the boundary tag of the chunk is already aligned to the
+ * mandatory alignment 4, the object marked as 'O' starts immediately after the
+ * boundary tag. However, to make sure the boundary tag of the next chunk is
+ * also aligned properly, one byte of padding is added to the end of the object
+ * marked as 'P'.
+ *
+ * Now assume that a kmem_cache_alloc() call comes in to allocate 16 bytes of
+ * memory with mandatory alignment of 16. Now, as the location after the
+ * boundary tag of the second chunk is not aligned to 16 byte boundary, we add
+ * 8 bytes of padding in front of the object as illustrated in the following
+ * diagram:
+ *
+ *   0       8       16      24      32      40      48      56
+ *   +-------+-------+-------+-------+-------+-------+-------+-------
+ *   BBBBOOOPBBBBPPPPOOOOOOOOOOOOOOOOBBBB                        SSSS
+ *
+ * Note that the boundary tag is naturally aligned due to the fact that the
+ * object size is already aligned to a 4 byte boundary, which is the mandatory
+ * alignment for this machine.
+ */
+
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/rbtree.h>
+#include <linux/rcupdate.h>
+#include <linux/slab.h>
+
+/*
+ * Minimum alignments
+ */
+#ifndef ARCH_KMALLOC_MINALIGN
+#define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long)
+#endif
+
+#ifndef ARCH_SLAB_MINALIGN
+#define ARCH_SLAB_MINALIGN __alignof__(unsigned long)
+#endif
+
+#define MANDATORY_ALIGN max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN)
+
+struct kmem_rcu {
+	struct rcu_head		head;
+	size_t			size;
+};
+
+#define KMEM_CHUNK_RESERVED	0xababababUL
+#define KMEM_CHUNK_COALESCED	0xbcbcbcbcUL
+#define KMEM_CHUNK_FREE		0xfefefefeUL
+
+/*
+ * Each chunk has a boundary tag at the beginning of the chunk. The tag
+ * contains the size of this chunk and the size of the previous chunk, which is
+ * required by chunk coalescing when an object is freed.
+ */
+struct kmem_boundary_tag {
+#ifdef CONFIG_BINALLOC_DEBUG
+	unsigned long		magic;
+#endif
+	unsigned short		prev_size;
+	unsigned short		size;
+	unsigned short		reserved;
+	unsigned short		align;
+};
+
+struct kmem_chunk {
+	struct kmem_boundary_tag	tag;
+	/* The following fields are defined only if the chunk is available */
+	struct rb_node			bin_tree;
+};
+
+/*
+ * The code assumes that the end of a boundary tag is aligned by power of two
+ * for calculating the alignment of an object in a chunk.
+ */
+#define BOUNDARY_TAG_SIZE roundup_pow_of_two(sizeof(struct kmem_boundary_tag))
+
+struct kmem_bin {
+	struct rb_root		freelist;
+};
+
+/*
+ * The chunk needs to be big enough to fit the freelist node pointers when it's
+ * available.
+ */
+#define MIN_CHUNK_SIZE \
+	(sizeof(struct kmem_chunk) - sizeof(struct kmem_boundary_tag))
+
+#define BIN_SHIFT 5
+#define NR_BINS ((PAGE_SIZE) / (1 << BIN_SHIFT))
+
+static struct kmem_bin bins[NR_BINS];
+static DEFINE_SPINLOCK(kmem_lock);
+
+static unsigned long chunk_size(struct kmem_chunk *chunk)
+{
+	return chunk->tag.size;
+}
+
+static void __chunk_set_size(struct kmem_chunk *chunk, unsigned long size)
+{
+	chunk->tag.size = size;
+}
+
+static void ptr_set_align(void *p, unsigned short align)
+{
+	unsigned short *buf = (void *)p - sizeof(unsigned short);
+
+	*buf = align;
+}
+
+static unsigned short ptr_get_align(const void *p)
+{
+	unsigned short *buf = (void *)p - sizeof(unsigned short);
+
+	return *buf;
+}
+
+static struct kmem_chunk *rawptr_to_chunk(const void *p)
+{
+	void *q = (void *)p - BOUNDARY_TAG_SIZE;
+	return q;
+}
+
+static void *chunk_to_rawptr(struct kmem_chunk *chunk)
+{
+	return (void *)chunk + BOUNDARY_TAG_SIZE;
+}
+
+static struct kmem_chunk *ptr_to_chunk(const void *p, unsigned short align)
+{
+	return rawptr_to_chunk(p - align);
+}
+
+static void *chunk_to_ptr(struct kmem_chunk *chunk, unsigned short align)
+{
+	void *p;
+
+	p = chunk_to_rawptr(chunk);
+	return PTR_ALIGN(p, align);
+}
+
+static struct kmem_chunk *prev_chunk(struct kmem_chunk *chunk)
+{
+	void *p = rawptr_to_chunk((void *) chunk - chunk->tag.prev_size);
+
+	BUG_ON(!virt_addr_valid(p));
+
+	return p;
+}
+
+static struct kmem_chunk *next_chunk(struct kmem_chunk *chunk)
+{
+	return chunk_to_rawptr(chunk) + chunk_size(chunk);
+}
+
+static bool chunk_is_first(struct kmem_chunk *b)
+{
+	return b->tag.prev_size == 0;
+}
+
+static bool chunk_is_last(struct kmem_chunk *b)
+{
+	struct kmem_chunk *next = next_chunk(b);
+
+	return chunk_size(next) == 0;
+}
+
+static void chunk_set_size(struct kmem_chunk *chunk, unsigned long size)
+{
+	BUG_ON(size < MIN_CHUNK_SIZE);
+	BUG_ON(size > PAGE_SIZE);
+
+	__chunk_set_size(chunk, size);
+
+	if (!chunk_is_last(chunk)) {
+		struct kmem_chunk *next = next_chunk(chunk);
+
+		next->tag.prev_size = size;
+	}
+}
+
+#ifdef CONFIG_BINALLOC_DEBUG
+
+#define DUMP_BYTES	128
+
+static void kmem_dump_chunk(struct kmem_chunk *chunk)
+{
+	print_hex_dump(KERN_ERR, "kmem: ", DUMP_PREFIX_ADDRESS, 16, 1,
+		chunk, DUMP_BYTES, 1);
+}
+
+static void kmem_verify_chunk(struct kmem_chunk *chunk, unsigned long magic)
+{
+	if (chunk->tag.magic != magic) {
+		printk(KERN_ERR "kmem: bad chunk magic: %lx, expected: %lx\n",
+				chunk->tag.magic, magic);
+		kmem_dump_chunk(chunk);
+		BUG();
+	}
+}
+
+static void chunk_set_magic(struct kmem_chunk *chunk, unsigned long magic)
+{
+	chunk->tag.magic = magic;
+}
+
+static void kmem_verify_page(struct page *page)
+{
+	struct kmem_chunk *chunk = page_address(page);
+
+	do {
+		BUG_ON(chunk_size(chunk) < MIN_CHUNK_SIZE);
+		BUG_ON(virt_to_page(chunk) != page);
+		chunk = next_chunk(chunk);
+	} while (chunk_size(chunk) != 0);
+}
+#else
+
+static inline void
+kmem_verify_chunk(struct kmem_chunk *chunk, unsigned long magic)
+{
+}
+
+static inline void
+chunk_set_magic(struct kmem_chunk *chunk, unsigned long magic)
+{
+}
+
+static inline void kmem_verify_page(struct page *page)
+{
+}
+#endif /* CONFIG_BINALLOC_DEBUG */
+
+static bool chunk_is_available(struct kmem_chunk *chunk)
+{
+	return chunk->tag.reserved != 1;
+}
+
+static void chunk_mark_reserved(struct kmem_chunk *chunk)
+{
+	chunk_set_magic(chunk, KMEM_CHUNK_RESERVED);
+	chunk->tag.reserved = 1;
+}
+
+static void chunk_mark_available(struct kmem_chunk *chunk)
+{
+	chunk_set_magic(chunk, KMEM_CHUNK_FREE);
+	chunk->tag.reserved = 0;
+}
+
+#define ALLOC_MASK (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK)
+
+static void *kmem_alloc_large(size_t size, gfp_t gfpflags, int node)
+{
+	struct page *page;
+	int order;
+
+	order = get_order(size);
+	page = alloc_pages_node(node, gfpflags & ALLOC_MASK, order);
+	if (!page)
+		return NULL;
+	page->private = size;
+
+	return page_address(page);
+}
+
+static int lookup_bin_idx(unsigned long size)
+{
+	return (size-1) >> BIN_SHIFT;
+}
+
+static struct kmem_bin *lookup_bin(unsigned long size)
+{
+	int idx = lookup_bin_idx(size);
+
+	BUG_ON(idx >= NR_BINS);
+
+	return &bins[idx];
+}
+
+static struct kmem_bin *chunk_get_bin(struct kmem_chunk *chunk)
+{
+	unsigned long size = chunk_size(chunk);
+
+	return lookup_bin(size);
+}
+
+static void __insert_to_freelist(struct kmem_bin *bin, struct kmem_chunk *chunk)
+{
+	struct rb_node **p = &bin->freelist.rb_node;
+	struct rb_node *parent = NULL;
+
+	while (*p) {
+		struct kmem_chunk *this;
+		parent = *p;
+
+		this = rb_entry(parent, struct kmem_chunk, bin_tree);
+
+		if (chunk_size(chunk) < chunk_size(this))
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	rb_link_node(&chunk->bin_tree, parent, p);
+	rb_insert_color(&chunk->bin_tree, &bin->freelist);
+}
+
+static void insert_to_freelist(struct kmem_chunk *chunk)
+{
+	struct kmem_bin *bin;
+
+	kmem_verify_chunk(chunk, KMEM_CHUNK_FREE);
+
+	bin = chunk_get_bin(chunk);
+	__insert_to_freelist(bin, chunk);
+}
+
+static void remove_from_freelist(struct kmem_chunk *chunk)
+{
+	struct kmem_bin *bin = chunk_get_bin(chunk);
+
+	rb_erase(&chunk->bin_tree, &bin->freelist);
+}
+
+static struct kmem_chunk *chunk_page_alloc(gfp_t gfpflags)
+{
+	struct kmem_chunk *chunk, *sentinel;
+	struct page *page;
+
+	page = alloc_pages_node(-1, gfpflags & ALLOC_MASK, 0);
+	if (!page)
+		return NULL;
+
+	__SetPageSlab(page);
+
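+	/*
+	 * A fresh page holds a single free chunk whose usable size is the
+	 * page minus one boundary tag for itself and one for the sentinel.
+	 */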
+	chunk = page_address(page);
+	chunk->tag.prev_size = 0;
+	__chunk_set_size(chunk, PAGE_SIZE - BOUNDARY_TAG_SIZE*2);
+	chunk_mark_available(chunk);
+
+	/*
+	 * The sentinel chunk marks the end of a page and it's the only one
+	 * that can have size zero.
+	 */
+	sentinel = page_address(page) + PAGE_SIZE - BOUNDARY_TAG_SIZE;
+	sentinel->tag.prev_size = chunk_size(chunk);
+	__chunk_set_size(sentinel, 0);
+	chunk_mark_reserved(sentinel);
+
+	return chunk;
+}
+
+static void split_chunk(struct kmem_chunk *chunk, size_t new_size)
+{
+	struct kmem_chunk *upper_half;
+	size_t size, remaining;
+
+	BUG_ON(new_size < MIN_CHUNK_SIZE);
+	BUG_ON(new_size > PAGE_SIZE);
+
+	kmem_verify_chunk(chunk, KMEM_CHUNK_FREE);
+
+	size = chunk_size(chunk);
+	BUG_ON(size < new_size);
+
+	remaining = size - new_size;
+
+	/*
+	 * Don't split if remaining half would end up too small.
+	 */
+	if (remaining < (BOUNDARY_TAG_SIZE + MIN_CHUNK_SIZE))
+		return;
+
+	chunk_set_size(chunk, new_size);
+
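+	/* The upper half is carved out with a boundary tag of its own. */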
+	upper_half = next_chunk(chunk);
+	upper_half->tag.prev_size = chunk_size(chunk);
+	chunk_set_size(upper_half, remaining - BOUNDARY_TAG_SIZE);
+
+	chunk_mark_available(upper_half);
+	insert_to_freelist(upper_half);
+}
+
+static struct kmem_chunk *__kmem_alloc(struct kmem_bin *bin, size_t size)
+{
+	struct rb_node *n = bin->freelist.rb_node;
+	struct kmem_chunk *ret = NULL;
+
+	while (n) {
+		struct kmem_chunk *chunk = rb_entry(n, struct kmem_chunk, bin_tree);
+
+		if (chunk_size(chunk) < size) {
+			/*
+			 * The chunk is not big enough.
+			 */
+			n = n->rb_right;
+		} else {
+			/*
+			 * Look up the smallest possible chunk that is big
+			 * enough for us but bail out early if we find a
+			 * perfect fit.
+			 */
+			ret = chunk;
+			if (chunk_size(chunk) == size)
+				break;
+			n = n->rb_left;
+		}
+	}
+	if (ret)
+		remove_from_freelist(ret);
+
+	return ret;
+}
+
+/*
+ * This is the heart of the kernel memory allocator.
+ */
+static void *
+kmem_alloc(size_t size, gfp_t gfpflags, unsigned short align)
+{
+	struct kmem_chunk *chunk;
+	unsigned long flags;
+	void *p;
+	int i;
+
+	if (size < MIN_CHUNK_SIZE)
+		size = MIN_CHUNK_SIZE;
+
+	/*
+	 * The boundary tags are aligned to mandatory alignment so there's no
+	 * need to reserve extra space if the user also requested the same
+	 * mandatory alignment.
+	 */
+	if (align != MANDATORY_ALIGN)
+		size += align;
+
+	size = ALIGN(size, MANDATORY_ALIGN);
+
+	spin_lock_irqsave(&kmem_lock, flags);
+	/*
+	 * Look for available chunks from all bins starting from the smallest
+	 * one that is big enough for the requested size.
+	 */
+	for (i = lookup_bin_idx(size); i < NR_BINS; i++) {
+		struct kmem_bin *bin = &bins[i];
+
+		chunk = __kmem_alloc(bin, size);
+
+		/*
+		 * If we found a free chunk, return a pointer to it;
+		 * otherwise continue scanning.
+		 */
+		if (chunk)
+			goto allocate_obj;
+	}
+
+	/*
+	 * Ok, we need more pages.
+	 */
+	spin_unlock_irqrestore(&kmem_lock, flags);
+	chunk = chunk_page_alloc(gfpflags);
+	if (!chunk)
+		return NULL;
+	spin_lock_irqsave(&kmem_lock, flags);
+
+allocate_obj:
+	split_chunk(chunk, size);
+	chunk_mark_reserved(chunk);
+	spin_unlock_irqrestore(&kmem_lock, flags);
+
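+	/*
+	 * Remember how far the aligned pointer is from the raw chunk so
+	 * that kfree() and ksize() can recover the chunk header later.
+	 */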
+	p = chunk_to_ptr(chunk, align);
+	ptr_set_align(p, p - chunk_to_rawptr(chunk));
+
+	BUG_ON(!IS_ALIGNED((unsigned long) p, align));
+	return p;
+}
+
+static void *
+kmem_alloc_node(size_t size, gfp_t gfpflags, unsigned short align, int node)
+{
+	void *p;
+
+	if (unlikely(!size))
+		return ZERO_SIZE_PTR;
+
+	if (size < PAGE_SIZE)
+		p = kmem_alloc(size, gfpflags, align);
+	else
+		p = kmem_alloc_large(size, gfpflags, node);
+
+	if ((gfpflags & __GFP_ZERO) && p)
+		memset(p, 0, size);
+
+	if (p && size < PAGE_SIZE)
+		kmem_verify_page(virt_to_page(p));
+
+	return p;
+}
+
+void *__kmalloc_node(size_t size, gfp_t gfpflags, int node)
+{
+	return kmem_alloc_node(size, gfpflags, MANDATORY_ALIGN, node);
+}
+EXPORT_SYMBOL(__kmalloc_node);
+
+void *__kmalloc(size_t size, gfp_t gfpflags)
+{
+	return kmem_alloc_node(size, gfpflags, MANDATORY_ALIGN, -1);
+}
+EXPORT_SYMBOL(__kmalloc);
+
+static void
+coalesce_chunks(struct kmem_chunk *chunk, struct kmem_chunk *right_neighbor)
+{
+	unsigned long new_size;
+
+	kmem_verify_chunk(right_neighbor, KMEM_CHUNK_FREE);
+	kmem_verify_chunk(chunk, KMEM_CHUNK_FREE);
+
+	chunk_set_magic(right_neighbor, KMEM_CHUNK_COALESCED);
+
+	new_size = chunk_size(chunk) + chunk_size(right_neighbor);
+
+	/*
+	 * The boundary tag of the right-side neighbor is coalesced into the
+	 * chunk as well.
+	 */
+	chunk_set_size(chunk, new_size + BOUNDARY_TAG_SIZE);
+}
+
+static void kmem_free_chunk(struct kmem_chunk *chunk, struct page *page)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&kmem_lock, flags);
+
+	chunk_mark_available(chunk);
+
+	/*
+	 * Coalesce the chunk with any neighbor chunks that are not reserved.
+	 */
+	if (likely(!chunk_is_first(chunk))) {
+		struct kmem_chunk *prev = prev_chunk(chunk);
+
+		if (chunk_is_available(prev)) {
+			remove_from_freelist(prev);
+			coalesce_chunks(prev, chunk);
+			chunk = prev;
+		}
+	}
+	if (likely(!chunk_is_last(chunk))) {
+		struct kmem_chunk *next = next_chunk(chunk);
+
+		if (chunk_is_available(next)) {
+			remove_from_freelist(next);
+			coalesce_chunks(chunk, next);
+		}
+	}
+
+	/*
+	 * If the chunk now covers a whole page, give it back to the page
+	 * allocator; otherwise insert it to the freelist.
+	 */
+	if (unlikely(chunk_size(chunk) == PAGE_SIZE-BOUNDARY_TAG_SIZE*2)) {
+		__ClearPageSlab(page);
+		__free_pages(page, 0);
+	} else
+		insert_to_freelist(chunk);
+
+	spin_unlock_irqrestore(&kmem_lock, flags);
+}
+
+static void kmem_free(const void *p, struct page *page)
+{
+	struct kmem_chunk *chunk;
+	unsigned short align;
+
+	kmem_verify_page(page);
+
+	align = ptr_get_align(p);
+	chunk = ptr_to_chunk(p, align);
+
+	kmem_free_chunk(chunk, page);
+}
+
+static void __kfree(const void *p)
+{
+	struct page *page = virt_to_page(p);
+
+	if (PageSlab(page))
+		kmem_free(p, page);
+	else
+		__free_pages(page, get_order(page->private));
+}
+
+void kfree(const void *p)
+{
+	if (unlikely(ZERO_OR_NULL_PTR(p)))
+		return;
+
+	__kfree(p);
+}
+EXPORT_SYMBOL(kfree);
+
+static size_t __ksize(const void *p)
+{
+	struct kmem_chunk *chunk;
+	unsigned short align;
+
+	align = ptr_get_align(p);
+	chunk = ptr_to_chunk(p, align);
+
+	/*
+	 * No need for locking here: the size of a reserved chunk can never
+	 * change.
+	 */
+	return chunk_size(chunk);	/* XXX */
+}
+
+size_t ksize(const void *p)
+{
+	struct page *page;
+
+	BUG_ON(!p);
+
+	if (unlikely(ZERO_OR_NULL_PTR(p)))
+		return 0;
+
+	page = virt_to_page(p);
+
+	if (PageSlab(page))
+		return __ksize(p);
+
+	return page->private;
+}
+EXPORT_SYMBOL(ksize);
+
+struct kmem_cache {
+	unsigned int size, align;
+	unsigned long gfpflags;
+	const char *name;
+	void (*ctor)(void *);
+};
+
+struct kmem_cache *
+kmem_cache_create(const char *name, size_t size, size_t align,
+		  unsigned long gfpflags,
+		  void (*ctor)(void *))
+{
+	struct kmem_cache *cache;
+
+	BUG_ON(size == 0);
+
+	cache = kmalloc(sizeof(*cache), GFP_KERNEL);
+	if (cache) {
+		cache->size = size;
+		if (gfpflags & SLAB_DESTROY_BY_RCU) {
+			/* leave room for rcu footer at the end of chunk */
+			cache->size += sizeof(struct kmem_rcu);
+		}
+		cache->align = max(align, ARCH_SLAB_MINALIGN);
+		cache->gfpflags = gfpflags;
+		cache->name = name;
+		cache->ctor = ctor;
+	}
+	return cache;
+}
+EXPORT_SYMBOL(kmem_cache_create);
+
+void kmem_cache_destroy(struct kmem_cache *cache)
+{
+	kfree(cache);
+}
+EXPORT_SYMBOL(kmem_cache_destroy);
+
+void *
+kmem_cache_alloc_node(struct kmem_cache *cache, gfp_t gfpflags, int node)
+{
+	void *p;
+
+	p = kmem_alloc_node(cache->size, gfpflags, cache->align, node);
+	if (!p)
+		return NULL;
+
+	if (cache->ctor)
+		cache->ctor(p);
+	return p;
+}
+EXPORT_SYMBOL(kmem_cache_alloc_node);
+
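+/*
+ * The RCU footer lives at the tail of the object (see kmem_cache_free),
+ * so walk backwards to recover the pointer to free.
+ */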
+static void kmem_rcu_free(struct rcu_head *head)
+{
+	struct kmem_rcu *kmem_rcu = (struct kmem_rcu *)head;
+	void *p = (void *)kmem_rcu - (kmem_rcu->size - sizeof(struct kmem_rcu));
+
+	__kfree(p);
+}
+
+void kmem_cache_free(struct kmem_cache *cache, void *p)
+{
+	if (unlikely(ZERO_OR_NULL_PTR(p)))
+		return;
+
+	if (unlikely(cache->gfpflags & SLAB_DESTROY_BY_RCU)) {
+		struct kmem_rcu *kmem_rcu;
+
+		kmem_rcu = p + (cache->size - sizeof(struct kmem_rcu));
+		INIT_RCU_HEAD(&kmem_rcu->head);
+		kmem_rcu->size = cache->size;
+		call_rcu(&kmem_rcu->head, kmem_rcu_free);
+	} else
+		__kfree(p);
+}
+EXPORT_SYMBOL(kmem_cache_free);
+
+unsigned int kmem_cache_size(struct kmem_cache *cache)
+{
+	return cache->size;
+}
+EXPORT_SYMBOL(kmem_cache_size);
+
+const char *kmem_cache_name(struct kmem_cache *cache)
+{
+	return cache->name;
+}
+EXPORT_SYMBOL(kmem_cache_name);
+
+int kmem_cache_shrink(struct kmem_cache *cache)
+{
+	return 0;
+}
+EXPORT_SYMBOL(kmem_cache_shrink);
+
+int kmem_ptr_validate(struct kmem_cache *cache, const void *p)
+{
+	return 0;
+}
+
+static unsigned int kmem_ready __read_mostly;
+
+int slab_is_available(void)
+{
+	return kmem_ready;
+}
+
+static void kmem_init_bin(struct kmem_bin *bin)
+{
+	bin->freelist = RB_ROOT;
+}
+
+#define NR_ALLOCS 64
+#define ALLOC_SIZE 128
+
+static void kmem_selftest(void)
+{
+	void *objs[NR_ALLOCS];
+	int i;
+
+	for (i = 0; i < NR_ALLOCS; i++)
+		objs[i] = kmalloc(ALLOC_SIZE, GFP_KERNEL);
+
+	for (i = 0; i < NR_ALLOCS; i++)
+		kfree(objs[i]);
+}
+
+void __init kmem_cache_init(void)
+{
+	int i;
+
+	for (i = 0; i < NR_BINS; i++)
+		kmem_init_bin(&bins[i]);
+	kmem_ready = 1;
+
+	kmem_selftest();
+}

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-30 21:51                                 ` Pekka J Enberg
@ 2008-07-30 22:00                                   ` Linus Torvalds
  2008-07-30 22:22                                     ` Pekka Enberg
  2008-07-31  1:09                                     ` Matt Mackall
  0 siblings, 2 replies; 69+ messages in thread
From: Linus Torvalds @ 2008-07-30 22:00 UTC (permalink / raw)
  To: Pekka J Enberg
  Cc: Matt Mackall, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes



On Thu, 31 Jul 2008, Pekka J Enberg wrote:
> 
> Subject: [PATCH] binalloc: best-fit allocation with binning
> From: Pekka Enberg <penberg@cs.helsinki.fi>

Shoot me now. 

> As suggested by Linus,

I'm happy to hear that the thing worked, but I'm not sure how happy I 
should be about yet _another_ allocator. Will it ever end?

But seriously, it looks simple and small enough, so in that sense there 
doesn't seem to be a problem. But I really don't look forward to another 
one of these, at least not without somebody deciding that yes, we can 
prune one of the old ones as never being better. Hmm?

		Linus

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-30 22:00                                   ` Linus Torvalds
@ 2008-07-30 22:22                                     ` Pekka Enberg
  2008-07-30 22:35                                       ` Linus Torvalds
  2008-07-31  1:09                                     ` Matt Mackall
  1 sibling, 1 reply; 69+ messages in thread
From: Pekka Enberg @ 2008-07-30 22:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Matt Mackall, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes

Hi Linus,

On Thu, Jul 31, 2008 at 1:00 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> I'm happy to hear that the thing worked, but I'm not sure how happy I
> should be about yet _another_ allocator. Will it ever end?

Oh, I didn't suggest this for merging. Just thought you'd be
interested to know that best-fit doesn't really do that much better
than what we have in the tree now. (Well, I was kinda hoping you'd
tell me why my implementation is wrong and you were right all along.)

On Thu, Jul 31, 2008 at 1:00 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> But seriously, it looks simple and small enough, so in that sense there
> doesn't seem to be a problem. But I really don't look forward to another
> one of these, at least not without somebody deciding that yes, we can
> prune one of the old ones as never being better. Hmm?

I think the current situation is a bit unfortunate. SLOB doesn't want
to cater to everybody (and probably can't do that either), SLAB sucks
for NUMA and embedded, and SLUB is too much of a "memory pig"
(although much less so than SLAB) to replace SLOB and it has that TPC
regression we don't have a test case for.

So while SLAB is (slowly) on its way out, we really don't have a
strategy for SLOB/SLUB. I'm trying to come up with something that's
memory efficient first and then tune that for SMP and NUMA.
Looking at how tuned the fast-paths of SLAB and SLUB are (due to the
design), it seems unlikely that we can come up with anything that
could compete with them. But that doesn't mean we can't have fun
trying ;-).

                             Pekka

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-30 22:22                                     ` Pekka Enberg
@ 2008-07-30 22:35                                       ` Linus Torvalds
  2008-07-31  0:42                                         ` malc
  2008-07-31  1:03                                         ` Matt Mackall
  0 siblings, 2 replies; 69+ messages in thread
From: Linus Torvalds @ 2008-07-30 22:35 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Matt Mackall, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes



On Thu, 31 Jul 2008, Pekka Enberg wrote:
> 
> Oh, I didn't suggest this for merging. Just thought you'd be
> interested to know that best-fit doesn't really do that much better
> than what we have in the tree now. (Well, I was kinda hoping you'd
> tell me why my implementation is wrong and you were right all along.)

Heh. Most allocators tend to work pretty well under normal load, and the 
real fragmentation problems all tend to happen under special patterns. The 
one in glibc, for example, sucks donkey dick when using threading, but is 
apparently ok otherwise. 

I wouldn't actually expect most "normal" kernel use to show any really bad 
patterns on any normal loads. Google for

	worst-case first-fit fragmentation

(or 'next-fit' for that matter) to see some stuff. Of course, it is scary 
only if you can trigger it in practice (perhaps with certain games on
packet size, or creating/removing files with pathname size patterns etc).
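
Something like this toy user-space simulation (all sizes made up,
nothing kernel-specific) shows the kind of pattern that defeats a
non-compacting allocator:

	/* Fill a first-fit arena with small objects, then free every
	 * other one: half the space is free, yet a larger allocation
	 * still fails because no single hole is big enough. */
	#include <stdio.h>
	#include <string.h>

	#define ARENA 4096
	static char used[ARENA];	/* 1 = byte reserved */

	static int ff_alloc(int len)	/* lowest run of len free bytes */
	{
		int i, run = 0;

		for (i = 0; i < ARENA; i++) {
			run = used[i] ? 0 : run + 1;
			if (run == len) {
				memset(used + i - len + 1, 1, len);
				return i - len + 1;
			}
		}
		return -1;		/* no hole big enough */
	}

	int main(void)
	{
		int obj[64], i;

		for (i = 0; i < 64; i++)	/* 64 * 64 fills the arena */
			obj[i] = ff_alloc(64);
		for (i = 0; i < 64; i += 2)	/* punch 64-byte holes */
			memset(used + obj[i], 0, 64);
		/* 2048 bytes are free, but this prints -1 */
		printf("128-byte alloc: %d\n", ff_alloc(128));
		return 0;
	}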

[ Of course, google probably mostly returns hits from all those ACM 
  portals etc. I wonder why google does that - they're almost totally 
  useless search results. Sad. If somebody knows how to turn those ACM 
  pay-portals off in google, pls let me know ]

			Linus

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-30 22:35                                       ` Linus Torvalds
@ 2008-07-31  0:42                                         ` malc
  2008-07-31  1:03                                         ` Matt Mackall
  1 sibling, 0 replies; 69+ messages in thread
From: malc @ 2008-07-31  0:42 UTC (permalink / raw)
  To: linux-kernel

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, 31 Jul 2008, Pekka Enberg wrote:
>> 
>> Oh, I didn't suggest this for merging. Just thought you'd be
>> interested to know that best-fit doesn't really do that much better
>> than what we have in the tree now. (Well, I was kinda hoping you'd
>> tell me why my implementation is wrong and you were right all along.)
>
> Heh. Most allocators tend to work pretty well under normal load, and the 
> real fragmentation problems all tend to happen under special patterns. The 
> one in glibc, for example, sucks donkey dick when using threading, but is 
> apparently ok otherwise. 
>
> I wouldn't actually expect most "normal" kernel use to show any really bad 
> patterns on any normal loads. Google for
>
> 	worst-case first-fit fragmentation
>
> (or 'next-fit' for that matter) to see some stuff. Of course, it is scary 
> only if you can trigger it in practice (perhaps with certain games on
> packet size, or creating/removing files with pathname size patterns etc).
>
> [ Of course, google probably mostly returns hits from all those ACM 
>   portals etc. I wonder why google does that - they're almost totally 
>   useless search results. Sad. If somebody knows how to turn those ACM 
>   pay-portals off in google, pls let me know ]
>
> 			Linus

blahblah -site:portal.acm.org ?

-- 
mailto:av1474@comtv.ru


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-30 22:35                                       ` Linus Torvalds
  2008-07-31  0:42                                         ` malc
@ 2008-07-31  1:03                                         ` Matt Mackall
  1 sibling, 0 replies; 69+ messages in thread
From: Matt Mackall @ 2008-07-31  1:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka Enberg, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes


On Wed, 2008-07-30 at 15:35 -0700, Linus Torvalds wrote:

> [ Of course, google probably mostly returns hits from all those ACM 
>   portals etc. I wonder why google does that - they're almost totally 
>   useless search results. Sad. If somebody knows how to turn those ACM 
>   pay-portals off in google, pls let me know ]

Maybe you, as one of the better known developers in the world, could
write an open letter suggesting that ACM get a clue and open up their
archives. It's ridiculous that physicists, biologists, and historians
should have more and better web resources than the computer scientists.

I propose you start it with "Hey morons".

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-30 22:00                                   ` Linus Torvalds
  2008-07-30 22:22                                     ` Pekka Enberg
@ 2008-07-31  1:09                                     ` Matt Mackall
  2008-07-31 14:11                                       ` Andi Kleen
  2008-07-31 14:26                                       ` Christoph Lameter
  1 sibling, 2 replies; 69+ messages in thread
From: Matt Mackall @ 2008-07-31  1:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Pekka J Enberg, Christoph Lameter, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes


On Wed, 2008-07-30 at 15:00 -0700, Linus Torvalds wrote:
> 
> On Thu, 31 Jul 2008, Pekka J Enberg wrote:
> > 
> > Subject: [PATCH] binalloc: best-fit allocation with binning
> > From: Pekka Enberg <penberg@cs.helsinki.fi>
> 
> Shoot me now. 
> 
> > As suggested by Linus,
> 
> I'm happy to hear that the thing worked, but I'm not sure how happy I 
> should be about yet _another_ allocator. Will it ever end?

I think you can relax: the logical limit is probably two. We want an
allocator that is optimally fast and scalable on one end and one that
is optimally space-efficient on the other, and we're unlikely to find
a single allocator that is simultaneously both. But I don't think
there's much call for things in the middle of the spectrum.

So if this new one (which I haven't looked at yet) beats SLOB in space
usage and simplicity, I'll be happy to see it replace SLOB.

Finally getting rid of SLAB is a much trickier proposition because SLUB
still loses in a few important corner cases.

-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-31  1:09                                     ` Matt Mackall
@ 2008-07-31 14:11                                       ` Andi Kleen
  2008-07-31 15:25                                         ` Christoph Lameter
  2008-07-31 14:26                                       ` Christoph Lameter
  1 sibling, 1 reply; 69+ messages in thread
From: Andi Kleen @ 2008-07-31 14:11 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Christoph Lameter, Ingo Molnar,
	Hugh Dickins, Andi Kleen, Peter Zijlstra,
	Linux Kernel Mailing List, vegard.nossum, hannes

> Finally getting rid of SLAB is a much trickier proposition because SLUB
> still loses in a few important corner cases.

The big issue is that we haven't really made much progress on at least
some of these test cases (like the database benchmarks) for quite some
time (and that wasn't for a lack of trying) :-/ Might be a fundamental
problem.

-Andi

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-31  1:09                                     ` Matt Mackall
  2008-07-31 14:11                                       ` Andi Kleen
@ 2008-07-31 14:26                                       ` Christoph Lameter
  2008-07-31 15:38                                         ` Matt Mackall
  1 sibling, 1 reply; 69+ messages in thread
From: Christoph Lameter @ 2008-07-31 14:26 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes

Matt Mackall wrote:

> Finally getting rid of SLAB is a much trickier proposition because SLUB
> still loses in a few important corner cases.

Which corner cases? I know about the reports on the TPC issue which I
guess is a result of the avoidance of queueing in the cold free case.
It would be good to have a small C program (under an open source
license) that shows the issue.


 


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-31 14:11                                       ` Andi Kleen
@ 2008-07-31 15:25                                         ` Christoph Lameter
  2008-07-31 16:03                                           ` Andi Kleen
  0 siblings, 1 reply; 69+ messages in thread
From: Christoph Lameter @ 2008-07-31 15:25 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Matt Mackall, Linus Torvalds, Pekka J Enberg, Christoph Lameter,
	Ingo Molnar, Hugh Dickins, Peter Zijlstra,
	Linux Kernel Mailing List, vegard.nossum, hannes

Andi Kleen wrote:
>> Finally getting rid of SLAB is a much trickier proposition because SLUB
>> still loses in a few important corner cases.
> 
> The big issue is that we haven't really made much progress on at least
> some of these test cases (like the database benchmarks) for quite some
> time (and that wasn't for a lack of trying) :-/ Might be a fundamental
> problem.

It would be good to have more of these benchmark regressions than
just TPC which requires a database setup etc etc. Preferably
something that is easily runnable by everyone.

There is a fundamental difference in how frees are handled. Slub is
queueless so it must use an atomic operation on the page struct of the
slab that we free to (in the case that the freeing does not occur to
the current cpu slab) to serialize access.

SLAB can avoid that by just stuffing the pointer to the object to be
freed into a per cpu queue. Then later the queue is processed and the
objects are merged into the slabs. But the later workqueue processing
then causes run time variabilities which impact network performance
and cause latencies. And the queuing only works in the SMP case. In
the NUMA case we need to first check the locality of the object and
then stuff it into alien queues (if it's not the local node). Then we
need to expire the alien queues at some point. We have these alien
queues per node, which means they require locks, and we have
NODES * CPUS of them.
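
Roughly, in user-space sketch form (queue length and all names are
invented for illustration), the two free paths look like this:

	#include <stdatomic.h>

	#define QUEUE_LEN 16

	struct percpu_queue {
		void *obj[QUEUE_LEN];
		int nr;
	};

	/* SLAB-style: stash the pointer in a local queue and drain it
	 * in batches later, amortizing the locking */
	static void queued_free(struct percpu_queue *q, void *p,
				void (*drain)(void **, int))
	{
		q->obj[q->nr++] = p;
		if (q->nr == QUEUE_LEN) {
			drain(q->obj, q->nr);
			q->nr = 0;
		}
	}

	/* SLUB-style: queueless; a remote free pushes the object onto
	 * the slab's freelist with an atomic cmpxchg right away */
	static void direct_free(_Atomic(void *) *freelist, void *p)
	{
		void *old;

		do {
			old = atomic_load(freelist);
			*(void **)p = old;	/* link into the freelist */
		} while (!atomic_compare_exchange_weak(freelist, &old, p));
	}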




^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-31 14:26                                       ` Christoph Lameter
@ 2008-07-31 15:38                                         ` Matt Mackall
  2008-07-31 15:42                                           ` Christoph Lameter
  0 siblings, 1 reply; 69+ messages in thread
From: Matt Mackall @ 2008-07-31 15:38 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes


On Thu, 2008-07-31 at 09:26 -0500, Christoph Lameter wrote:
> Matt Mackall wrote:
> 
> > Finally getting rid of SLAB is a much trickier proposition because SLUB
> > still loses in a few important corner cases.
> 
> Which corner cases? I know about the reports on the TPC issue which I
> guess is a result of the avoidance of queueing in the cold free case.
> It would be good to have a small C program (under an open source
> license) that shows the issue.

That's the one I had in mind, which I have no personal insight into.
 
-- 
Mathematics is the supreme nostalgia of our time.


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-31 15:38                                         ` Matt Mackall
@ 2008-07-31 15:42                                           ` Christoph Lameter
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-07-31 15:42 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Linus Torvalds, Pekka J Enberg, Ingo Molnar, Hugh Dickins,
	Andi Kleen, Peter Zijlstra, Linux Kernel Mailing List,
	vegard.nossum, hannes

Matt Mackall wrote:
> On Thu, 2008-07-31 at 09:26 -0500, Christoph Lameter wrote:
>> Matt Mackall wrote:
>>
>>> Finally getting rid of SLAB is a much trickier proposition because SLUB
>>> still loses in a few important corner cases.
>> Which corner cases? I know about the reports on the TPC issue which I
>> guess is a result of the avoidance of queueing in the cold free case.
>> It would be good to have a small C program (under an open source
>> license) that shows the issue.
> 
> That's the one I had in mind, which I have no personal insight into.

So far I too have had to rely only on reports by others and guesswork
as to what is going on.



^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-31 15:25                                         ` Christoph Lameter
@ 2008-07-31 16:03                                           ` Andi Kleen
  2008-07-31 16:05                                             ` Christoph Lameter
  0 siblings, 1 reply; 69+ messages in thread
From: Andi Kleen @ 2008-07-31 16:03 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andi Kleen, Matt Mackall, Linus Torvalds, Pekka J Enberg,
	Christoph Lameter, Ingo Molnar, Hugh Dickins, Peter Zijlstra,
	Linux Kernel Mailing List, vegard.nossum, hannes, matthew

On Thu, Jul 31, 2008 at 10:25:45AM -0500, Christoph Lameter wrote:
> Andi Kleen wrote:
> >> Finally getting rid of SLAB is a much trickier proposition because SLUB
> >> still loses in a few important corner cases.
> > 
> > The big issue is that we haven't really made much progress on at least
> > some of these test cases (like the database benchmarks) for quite some
> > time (and that wasn't for a lack of trying) :-/ Might be a fundamental
> > problem.
> 
> It would be good to have more of these benchmark regressions than
> just TPC which requires a database setup etc etc. Preferably
> something that is easily runnable by everyone.

AFAIK willy had a small test case for at least one of them.

-Andi

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC PATCH] greatly reduce SLOB external fragmentation
  2008-07-31 16:03                                           ` Andi Kleen
@ 2008-07-31 16:05                                             ` Christoph Lameter
  0 siblings, 0 replies; 69+ messages in thread
From: Christoph Lameter @ 2008-07-31 16:05 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Matt Mackall, Linus Torvalds, Pekka J Enberg, Christoph Lameter,
	Ingo Molnar, Hugh Dickins, Peter Zijlstra,
	Linux Kernel Mailing List, vegard.nossum, hannes, matthew

Andi Kleen wrote:
> On Thu, Jul 31, 2008 at 10:25:45AM -0500, Christoph Lameter wrote:
>> Andi Kleen wrote:
>>>> Finally getting rid of SLAB is a much trickier proposition because SLUB
>>>> still loses in a few important corner cases.
>>> The big issue is that we haven't really made much progress on at least
>>> some of these test cases (like the database benchmarks) for quite some
>>> time (and that wasn't for a lack of trying) :-/ Might be a fundamental
>>> problem.
>> It would be good to have more of these benchmark regressions than
>> just TPC which requires a database setup etc etc. Preferably
>> something that is easily runnable by everyone.
> 
> AFAIK willy had a small test case for at least one of them.

He is developing a SCSI performance test program that showed a
regression, but it was proprietary last time I heard of it. Maybe it's
now available?


^ permalink raw reply	[flat|nested] 69+ messages in thread

end of thread, other threads:[~2008-07-31 17:55 UTC | newest]

Thread overview: 69+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-02 18:43 [PATCH] procfs: provide slub's /proc/slabinfo Hugh Dickins
2008-01-02 18:53 ` Christoph Lameter
2008-01-02 19:09 ` Pekka Enberg
2008-01-02 19:35   ` Linus Torvalds
2008-01-02 19:45     ` Linus Torvalds
2008-01-02 19:49     ` Pekka Enberg
2008-01-02 22:50     ` Matt Mackall
2008-01-03  8:52       ` Ingo Molnar
2008-01-03 16:46         ` Matt Mackall
2008-01-04  2:21           ` Christoph Lameter
2008-01-04  2:45             ` Andi Kleen
2008-01-04  4:34               ` Matt Mackall
2008-01-04  9:17               ` Peter Zijlstra
2008-01-04 20:37                 ` Christoph Lameter
2008-01-04  4:11             ` Matt Mackall
2008-01-04 20:34               ` Christoph Lameter
2008-01-04 20:55                 ` Matt Mackall
2008-01-04 21:36                   ` Christoph Lameter
2008-01-04 22:30                     ` Matt Mackall
2008-01-05 20:16                       ` Christoph Lameter
2008-01-05 16:21               ` Pekka J Enberg
2008-01-05 17:14                 ` Andi Kleen
2008-01-05 20:05                 ` Christoph Lameter
2008-01-07 20:12                   ` Pekka J Enberg
2008-01-06 17:51                 ` Matt Mackall
2008-01-07 18:06                   ` Pekka J Enberg
2008-01-07 19:03                     ` Matt Mackall
2008-01-07 19:53                       ` Pekka J Enberg
2008-01-07 20:44                       ` Pekka J Enberg
2008-01-10 10:04                       ` Pekka J Enberg
2008-01-09 19:15                     ` [RFC PATCH] greatly reduce SLOB external fragmentation Matt Mackall
2008-01-09 22:43                       ` Pekka J Enberg
2008-01-09 22:59                         ` Matt Mackall
2008-01-10 10:02                           ` Pekka J Enberg
2008-01-10 10:54                             ` Pekka J Enberg
2008-01-10 15:44                               ` Matt Mackall
2008-01-10 16:13                               ` Linus Torvalds
2008-01-10 17:49                                 ` Matt Mackall
2008-01-10 18:28                                   ` Linus Torvalds
2008-01-10 18:42                                     ` Matt Mackall
2008-01-10 19:24                                       ` Christoph Lameter
2008-01-10 19:44                                         ` Matt Mackall
2008-01-10 19:51                                           ` Christoph Lameter
2008-01-10 19:41                                       ` Linus Torvalds
2008-01-10 19:46                                         ` Christoph Lameter
2008-01-10 19:53                                         ` Andi Kleen
2008-01-10 19:52                                           ` Christoph Lameter
2008-01-10 19:16                                   ` Christoph Lameter
2008-01-10 19:23                                     ` Matt Mackall
2008-01-10 19:31                                       ` Christoph Lameter
2008-01-10 21:25                                   ` Jörn Engel
2008-01-10 18:13                                 ` Andi Kleen
2008-07-30 21:51                                 ` Pekka J Enberg
2008-07-30 22:00                                   ` Linus Torvalds
2008-07-30 22:22                                     ` Pekka Enberg
2008-07-30 22:35                                       ` Linus Torvalds
2008-07-31  0:42                                         ` malc
2008-07-31  1:03                                         ` Matt Mackall
2008-07-31  1:09                                     ` Matt Mackall
2008-07-31 14:11                                       ` Andi Kleen
2008-07-31 15:25                                         ` Christoph Lameter
2008-07-31 16:03                                           ` Andi Kleen
2008-07-31 16:05                                             ` Christoph Lameter
2008-07-31 14:26                                       ` Christoph Lameter
2008-07-31 15:38                                         ` Matt Mackall
2008-07-31 15:42                                           ` Christoph Lameter
2008-01-10  2:46                         ` Matt Mackall
2008-01-10 10:03                       ` Pekka J Enberg
2008-01-03 20:31         ` [PATCH] procfs: provide slub's /proc/slabinfo Christoph Lameter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).