* [PATCH] patch-slab-split-03-tail
@ 2002-10-04 17:04 Manfred Spraul
  2002-10-04 19:06 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread

From: Manfred Spraul @ 2002-10-04 17:04 UTC (permalink / raw)
To: akpm, linux-kernel; +Cc: mbligh

[-- Attachment #1: Type: text/plain, Size: 385 bytes --]

part 3:
[depends on -02-SMP]

If an object is freed from a slab, then move the slab to the tail of the
partial list - this should increase the probability that the other
objects from the same page are freed, too, and that a page can be
returned to gfp later.

The cpu arrays are now always in front of the list, i.e. cache hit rates
should not matter.

Please apply

--
	Manfred

[-- Attachment #2: patch-slab-split-03-tail --]
[-- Type: text/plain, Size: 331 bytes --]

--- 2.5/mm/slab.c	Fri Oct 4 18:59:01 2002
+++ build-2.5/mm/slab.c	Fri Oct 4 18:59:11 2002
@@ -1478,7 +1478,7 @@
 		} else if (unlikely(inuse == cachep->num)) {
 			/* Was full. */
 			list_del(&slabp->list);
-			list_add(&slabp->list, &cachep->slabs_partial);
+			list_add_tail(&slabp->list, &cachep->slabs_partial);
 		}
 	}
 }

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 17:04 [PATCH] patch-slab-split-03-tail Manfred Spraul
@ 2002-10-04 19:06 ` Andrew Morton
  2002-10-04 19:07   ` Martin J. Bligh
  2002-10-04 19:15   ` Manfred Spraul
  0 siblings, 2 replies; 10+ messages in thread

From: Andrew Morton @ 2002-10-04 19:06 UTC (permalink / raw)
To: Manfred Spraul; +Cc: linux-kernel, mbligh

Manfred Spraul wrote:
>
> part 3:
> [depends on -02-SMP]
>
> If an object is freed from a slab, then move the slab to the tail of the
> partial list - this should increase the probability that the other
> objects from the same page are freed, too, and that a page can be
> returned to gfp later.
>
> The cpu arrays are now always in front of the list, i.e. cache hit rates
> should not matter.
>

Run that by me again?  So we're saying "if we just freed an
object from this page then make this page be the *last* page
which is eligible for new allocations"?  Under the assumption
that other objects in that same page are about to be freed
up as well?

Makes sense.  It would be nice to get this confirmed in
targeted testing ;)

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 19:06 ` Andrew Morton
@ 2002-10-04 19:07   ` Martin J. Bligh
  2002-10-04 19:15     ` Andrew Morton
  2002-10-04 19:15   ` Manfred Spraul
  1 sibling, 1 reply; 10+ messages in thread

From: Martin J. Bligh @ 2002-10-04 19:07 UTC (permalink / raw)
To: Andrew Morton, Manfred Spraul; +Cc: linux-kernel

> Run that by me again?  So we're saying "if we just freed an
> object from this page then make this page be the *last* page
> which is eligible for new allocations"?  Under the assumption
> that other objects in that same page are about to be freed
> up as well?
>
> Makes sense.  It would be nice to get this confirmed in
> targeted testing ;)

Just doing my normal boring kernel compile suggests Manfred's
last big rollup performs exactly the same as without it. Not
sure if that's any help or not ....

M.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 19:07 ` Martin J. Bligh
@ 2002-10-04 19:15   ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread

From: Andrew Morton @ 2002-10-04 19:15 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Manfred Spraul, linux-kernel

"Martin J. Bligh" wrote:
>
> > Run that by me again?  So we're saying "if we just freed an
> > object from this page then make this page be the *last* page
> > which is eligible for new allocations"?  Under the assumption
> > that other objects in that same page are about to be freed
> > up as well?
> >
> > Makes sense.  It would be nice to get this confirmed in
> > targeted testing ;)
>
> Just doing my normal boring kernel compile suggests Manfred's
> last big rollup performs exactly the same as without it. Not
> sure if that's any help or not ....
>

Well.  This patch is supposed to decrease internal fragmentation.
We need to prove that theory.

An appropriate test would be:

- boot with `mem=48m'
- untar kernel
- build kernel
- capture /proc/slabinfo
- apply patch
- repeat
- compare and explain.

I know what your reboot times are like ;)  I'll do it.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 19:06 ` Andrew Morton
  2002-10-04 19:07   ` Martin J. Bligh
@ 2002-10-04 19:15   ` Manfred Spraul
  2002-10-04 20:22     ` Randy.Dunlap
  1 sibling, 1 reply; 10+ messages in thread

From: Manfred Spraul @ 2002-10-04 19:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, mbligh

Andrew Morton wrote:
>
> Makes sense.  It would be nice to get this confirmed in
> targeted testing ;)
>
Not yet done.

The right way to test it would be to collect data in kernel about
alloc/free, and then run that data against both versions, and check
which version gives less internal fragmentation.

Or perhaps Bonwick has done that for his slab paper, but I don't have it :-(

 * An implementation of the Slab Allocator as described in outline in;
 *	UNIX Internals: The New Frontiers by Uresh Vahalia
 *	Pub: Prentice Hall	ISBN 0-13-101908-2
 * or with a little more detail in;
 *	The Slab Allocator: An Object-Caching Kernel Memory Allocator
 *	Jeff Bonwick (Sun Microsystems).
 *	Presented at: USENIX Summer 1994 Technical Conference

--
	Manfred

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 19:15 ` Manfred Spraul
@ 2002-10-04 20:22   ` Randy.Dunlap
  2002-10-04 21:25     ` Manfred Spraul
  0 siblings, 1 reply; 10+ messages in thread

From: Randy.Dunlap @ 2002-10-04 20:22 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Andrew Morton, linux-kernel, mbligh

On Fri, 4 Oct 2002, Manfred Spraul wrote:

| Andrew Morton wrote:
| >
| > Makes sense.  It would be nice to get this confirmed in
| > targeted testing ;)
| >
| Not yet done.
|
| The right way to test it would be to collect data in kernel about
| alloc/free, and then run that data against both versions, and check
| which version gives less internal fragmentation.
|
| Or perhaps Bonwick has done that for his slab paper, but I don't have it :-(

Did you look at http://www.usenix.org/events/usenix01/bonwick.html
for it?

| * An implementation of the Slab Allocator as described in outline in;
| *	UNIX Internals: The New Frontiers by Uresh Vahalia
| *	Pub: Prentice Hall	ISBN 0-13-101908-2
| * or with a little more detail in;
| *	The Slab Allocator: An Object-Caching Kernel Memory Allocator
| *	Jeff Bonwick (Sun Microsystems).
| *	Presented at: USENIX Summer 1994 Technical Conference
| --

--
~Randy

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 20:22 ` Randy.Dunlap
@ 2002-10-04 21:25   ` Manfred Spraul
  2002-10-04 21:43     ` Robert Love
  2002-10-05  0:14     ` Anton Blanchard
  0 siblings, 2 replies; 10+ messages in thread

From: Manfred Spraul @ 2002-10-04 21:25 UTC (permalink / raw)
To: Randy.Dunlap; +Cc: Andrew Morton, linux-kernel, mbligh

Randy.Dunlap wrote:
>
> Did you look at http://www.usenix.org/events/usenix01/bonwick.html
> for it?
>

Thanks for the link - that describes the newer, per-cpu extensions to
slab. Quite similar to the Linux implementation.

The text also contains a link to the original paper:

http://www.usenix.org/publications/library/proceedings/bos94/bonwick.html

Bonwick used one partially sorted list [as linux in 2.2, and 2.4.<10],
instead of separate lists - move tail was not an option.

The new paper contains one interesting comment:

<<<<<<<
An object cache's CPU layer contains per-CPU state that must be
protected either by per-CPU locking or by disabling interrupts. We
selected per-CPU locking for several reasons:
[...]
x Performance. On most modern processors, grabbing an uncontended
lock is cheaper than modifying the processor interrupt level.
<<<<<<<<

Which cpus have slow local_irq_disable() implementations? At least for
my Duron, this doesn't seem to be the case [~ 4 cpu cycles for cli]

--
	Manfred

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 21:25 ` Manfred Spraul
@ 2002-10-04 21:43   ` Robert Love
  2002-10-04 22:30     ` Manfred Spraul
  0 siblings, 1 reply; 10+ messages in thread

From: Robert Love @ 2002-10-04 21:43 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Randy.Dunlap, Andrew Morton, linux-kernel, mbligh

On Fri, 2002-10-04 at 17:25, Manfred Spraul wrote:

> Which cpus have slow local_irq_disable() implementations? At least
> for my Duron, this doesn't seem to be the case [~ 4 cpu cycles
> for cli]

I believe there are pipeline effects to disabling interrupts, e.g. the
pipeline has to be flushed?

	Robert Love

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 21:43 ` Robert Love
@ 2002-10-04 22:30   ` Manfred Spraul
  0 siblings, 0 replies; 10+ messages in thread

From: Manfred Spraul @ 2002-10-04 22:30 UTC (permalink / raw)
To: Robert Love; +Cc: Randy.Dunlap, Andrew Morton, linux-kernel, mbligh

[-- Attachment #1: Type: text/plain, Size: 495 bytes --]

Robert Love wrote:
> On Fri, 2002-10-04 at 17:25, Manfred Spraul wrote:
>
>> Which cpus have slow local_irq_disable() implementations? At least
>> for my Duron, this doesn't seem to be the case [~ 4 cpu cycles
>> for cli]
>
> I believe there are pipeline effects to disabling interrupts, e.g. it
> has to be flushed?
>

At least my Duron [700 MHz] obviously doesn't flush the pipeline.
If the Pentium 4 flushes its pipeline, it could mean 20+ cycles - test
app is attached.

--
	Manfred

[-- Attachment #2: cli.cpp --]
[-- Type: text/plain, Size: 3296 bytes --]

/*
 * cli.cpp: RDTSC based performance tester.
 *
 * Copyright (C) 1999, 2001, 2002 by Manfred Spraul.
 * All rights reserved except the rights granted by the GPL.
 *
 * Redistribution of this file is permitted under the terms of the GNU
 * General Public License (GPL) version 2 or later.
 * $Header: /pub/home/manfred/cvs-tree/timetest/cli.cpp,v 1.4 2002/10/04 21:22:09 manfred Exp $
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>

// define a cache flushing function
#undef CACHE_FLUSH

// Intel recommends that a serializing instruction
// should be called before and after rdtsc.
// CPUID is a serializing instruction.
// ".align 128": P4 L2 cache line size
#define read_rdtsc_before(time) \
	__asm__ __volatile__( \
		".align 128\n\t" \
		"xor %%eax,%%eax\n\t" \
		"cpuid\n\t" \
		"rdtsc\n\t" \
		"mov %%eax,(%0)\n\t" \
		"mov %%edx,4(%0)\n\t" \
		"xor %%eax,%%eax\n\t" \
		"cpuid\n\t" \
		: /* no output */ \
		: "S"(&time) \
		: "eax", "ebx", "ecx", "edx", "memory")

#define read_rdtsc_after(time) \
	__asm__ __volatile__( \
		"xor %%eax,%%eax\n\t" \
		"cpuid\n\t" \
		"rdtsc\n\t" \
		"mov %%eax,(%0)\n\t" \
		"mov %%edx,4(%0)\n\t" \
		"xor %%eax,%%eax\n\t" \
		"cpuid\n\t" \
		"sti\n\t" \
		: /* no output */ \
		: "S"(&time) \
		: "eax", "ebx", "ecx", "edx", "memory")

#define BUILD_TESTFNC(name, text, instructions) \
void name##_dummy(void) \
{ \
	__asm__ __volatile__( \
		".align 4096\n\t" \
		"xor %%eax, %%eax\n\t" \
		: : : "eax"); \
} \
static unsigned long name##_best = 1024*1024*1024; \
\
static void name(void) \
{ \
	unsigned long long time; \
	unsigned long long time2; \
\
	read_rdtsc_before(time); \
	instructions; \
	read_rdtsc_after(time2); \
	if (time2-time < name##_best) { \
		printf(text ":\t%10Ld ticks;\n", \
			time2-time-zerotest_best); \
		name##_best = time2-time; \
	} \
}

void filler(void)
{
	static int i = 3;
	static int j;
	j = i*i;
}

#define DO_3(x) \
	do { x; x; x; } while (0)
#define DO_10(x) \
	do { x; x; x; x; x; x; x; x; x; x; } while (0)
#define DO_50(x) \
	do { DO_10(x); DO_10(x); DO_10(x); DO_10(x); DO_10(x); } while (0)

#define DO_T(y) do { \
	DO_3(filler()); \
	y; \
	DO_3(filler()); } while (0)

#ifdef CACHE_FLUSH
#define DRAIN_SZ (4*1024*1024)
int other[3*DRAIN_SZ] __attribute__ ((aligned (4096)));
static inline void drain_cache(void)
{
	int i;
	for (i = 0; i < DRAIN_SZ; i++)
		other[DRAIN_SZ+i] = 0;
	for (i = 0; i < DRAIN_SZ; i++)
		if (other[DRAIN_SZ+i] != 0)
			break;
}
#else
static inline void drain_cache(void)
{
}
#endif

#define DO_TEST(x) \
	do { \
		int i; \
		for (i = 0; i < 500000; i++) \
			x; \
	} while (0)

//////////////////////////////////////////////////////////////////////////////
static inline void nothing()
{
	__asm__ __volatile__("nop": : : "memory");
}

BUILD_TESTFNC(zerotest, "zerotest", DO_T(nothing()));

//////////////////////////////////////////////////////////////////////////////
static inline void test0()
{
	__asm__ __volatile__("cli": : : "memory");
}

BUILD_TESTFNC(test_0, "cli", DO_T(test0()))

//////////////////////////////////////////////////////////////////////////////
extern "C" int iopl __P ((int __level));

int main()
{
	printf("CLI bench\n");
	iopl(3);
	for (;;) {
		DO_TEST(zerotest());
		DO_TEST(test_0());
	}
	return 0;
}

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] patch-slab-split-03-tail
  2002-10-04 21:25 ` Manfred Spraul
  2002-10-04 21:43   ` Robert Love
@ 2002-10-05  0:14   ` Anton Blanchard
  1 sibling, 0 replies; 10+ messages in thread

From: Anton Blanchard @ 2002-10-05 0:14 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Randy.Dunlap, Andrew Morton, linux-kernel, mbligh

> <<<<<<<
> An object cache's CPU layer contains per-CPU state that must be
> protected either by per-CPU locking or by disabling interrupts. We
> selected per-CPU locking for several reasons:
> [...]
> x Performance. On most modern processors, grabbing an uncontended
> lock is cheaper than modifying the processor interrupt level.
> <<<<<<<<
>
> Which cpus have slow local_irq_disable() implementations? At least for
> my Duron, this doesn't seem to be the case [~ 4 cpu cycles for cli]

Rusty did some tests and found on the intel chips he tested
local_irq_disable was slower. He posted the results to lkml a few
weeks ago.

On ppc64 it varies between chips.

Anton

^ permalink raw reply	[flat|nested] 10+ messages in thread
end of thread, other threads:[~2002-10-05  1:22 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-10-04 17:04 [PATCH] patch-slab-split-03-tail Manfred Spraul
2002-10-04 19:06 ` Andrew Morton
2002-10-04 19:07   ` Martin J. Bligh
2002-10-04 19:15     ` Andrew Morton
2002-10-04 19:15   ` Manfred Spraul
2002-10-04 20:22     ` Randy.Dunlap
2002-10-04 21:25       ` Manfred Spraul
2002-10-04 21:43         ` Robert Love
2002-10-04 22:30           ` Manfred Spraul
2002-10-05  0:14         ` Anton Blanchard
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).