Support worst case cache line sizes as config option

Message ID 20030427022346.GA27933@averell
State New, archived

Commit Message

Andi Kleen April 27, 2003, 2:23 a.m. UTC
This mirrors a change that has been in the SuSE/aa 2.4 kernel for a long time.

For a generic binary kernel you really want to assume the worst case
cache line size.  That's the P4's 128 byte currently.

The overhead of having the cache line size bigger on other CPUs is not
that bad, but if it is too small it will cost you dearly on SMP and
even a bit on UP in device drivers. 

This patch adds a new CONFIG_X86_GENERIC option for this. It currently
only forces 128byte cache lines, but could be used for more in the future.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Andi Kleen April 27, 2003, 12:52 p.m. UTC | #1
On Sun, Apr 27, 2003 at 01:36:45PM +0200, Riley Williams wrote:
> Does the order of those "default" lines actually matter?
> Your moving that one to the top of the list appears to imply that it does.

I don't know if it matters, probably not.

-Andi
Adrian Bunk April 28, 2003, 9:16 a.m. UTC | #2
On Sun, Apr 27, 2003 at 04:23:46AM +0200, Andi Kleen wrote:
> 
> This mirrors a change that has been in the SuSE/aa 2.4 kernel for a long time.
> 
> For a generic binary kernel you really want to assume the worst case
> cache line size.  That's the P4's 128 byte currently.
> 
> The overhead of having the cache line size bigger on other CPUs is not
> that bad, but if it is too small it will cost you dearly on SMP and
> even a bit on UP in device drivers. 
> 
> This patch adds a new CONFIG_X86_GENERIC option for this. It currently
> only forces 128byte cache lines, but could be used for more in the future.
> 
> diff -u linux-gencpu/arch/i386/Kconfig-o linux-gencpu/arch/i386/Kconfig
> --- linux-gencpu/arch/i386/Kconfig-o	2003-04-27 02:40:32.000000000 +0200
> +++ linux-gencpu/arch/i386/Kconfig	2003-04-27 03:50:08.000000000 +0200
> @@ -273,6 +273,13 @@
>  
>  endchoice
>  
> +config X86_GENERIC
> +       bool "Generic x86 support" 
> +       help
> +       	  Include some tuning for non selected x86 CPUs too.
> +	  when it has moderate overhead. This is intended for generic 
> +	  distributions kernels.
> +
>  #
>  # Define implied options from the CPU selection here
>  #


Your X86_GENERIC is semantically equivalent to M386.



> @@ -288,10 +295,10 @@
>  
>  config X86_L1_CACHE_SHIFT
>  	int
> +	default "7" if MPENTIUM4 || X86_GENERIC
>  	default "4" if MELAN || M486 || M386
>  	default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2
>  	default "6" if MK7 || MK8
> -	default "7" if MPENTIUM4
>  
>  config RWSEM_GENERIC_SPINLOCK
>  	bool


This doesn't work. E.g. MPENTIUMIII has the semantics of "support 
Pentium-III and above". If you want to compile a kernel that runs on 
both a Pentium-III and a Pentium-4 you choose MPENTIUMIII which implies 
X86_L1_CACHE_SHIFT=5 ...


I'm currently working on changing the "Processor family" options from
the current "select the minimum processor you want to support" to 
"select all processors you want to support:
  [ ] 386
  [ ] 486
  ...
  [ ] VIA C3-2"
with the possibility to select one or more processors from the list.

X86_L1_CACHE_SHIFT will simply work with the following (using the 
Kconfig feature that the first "default" with fulfilled "if" is used):

config X86_L1_CACHE_SHIFT
        int
        default "7" if MPENTIUM4
        default "6" if MK7 || MK8
        default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2
        default "4" if MELAN || M486 || M386


Additionally this will make it possible to solve cases where users 
configuring the kernel currently ask "Which CPU should I select for a 
kernel that runs on both a K6 and a Pentium-III?" automatically inside 
arch/i386/Makefile.

I'll send a patch within the next days.


cu
Adrian
Andi Kleen April 28, 2003, 11:47 a.m. UTC | #3
On Mon, Apr 28, 2003 at 11:16:16AM +0200, Adrian Bunk wrote:
> Your X86_GENERIC is semantically equivalent to M386.

M386 is tuning for the Intel 386

X86_GENERIC is "try to tune for all CPUs if possible" 

> This doesn't work. E.g. MPENTIUMIII has the semantics of "support 
> Pentium-III and above". If you want to compile a kernel that runs on 
> both a Pentium-III and a Pentium-4 you choose MPENTIUMIII which implies 
> X86_L1_CACHE_SHIFT=5 ...

Admittedly the other options could be changed to 

default "4" if (MELAN || M486 || M386) && !X86_GENERIC

but that looked a bit too ugly and it seems to work even without.
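
A minimal sketch of why the ordering in the patch appears to work, assuming (as stated earlier in the thread) that Kconfig uses the first "default" whose "if" condition is fulfilled. The struct and function names are hypothetical and only a subset of the CPU symbols is modelled; this is an illustration in C, not how Kconfig itself is implemented.

```c
#include <stdbool.h>

/* Hypothetical subset of the CPU selection symbols from the thread. */
struct cpu_config {
    bool m386;
    bool mpentiumiii;
    bool mk7;
    bool mpentium4;
    bool x86_generic;
};

/*
 * Mirrors the order of the "default" lines in the patch:
 * the first condition that holds decides the value.
 */
int l1_cache_shift(const struct cpu_config *c)
{
    if (c->mpentium4 || c->x86_generic)
        return 7;
    if (c->m386)
        return 4;
    if (c->mpentiumiii)
        return 5;
    if (c->mk7)
        return 6;
    return 0; /* hypothetical sentinel: no default matched */
}
```

With MPENTIUMIII and X86_GENERIC both set, the first line already matches and the result is 7; with MPENTIUMIII alone it stays 5. Because the X86_GENERIC line comes first, the later lines do not need an explicit "&& !X86_GENERIC".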


-Andi
Adrian Bunk April 28, 2003, 12:19 p.m. UTC | #4
On Mon, Apr 28, 2003 at 01:47:17PM +0200, Andi Kleen wrote:
> On Mon, Apr 28, 2003 at 11:16:16AM +0200, Adrian Bunk wrote:
> > Your X86_GENERIC is semantically equivalent to M386.
> 
> M386 is tuning for the Intel 386
> 
> X86_GENERIC is "try to tune for all CPUs if possible" 

M386 says that the minimum CPU supported is the 386 - and all CPUs 
above are supported, too. E.g.:

config X86_PPRO_FENCE
        bool
        depends on M686 || M586MMX || M586TSC || M586 || M486 || M386
        default y

config X86_F00F_BUG
        bool
        depends on M586MMX || M586TSC || M586 || M486 || M386
        default y


> > This doesn't work. E.g. MPENTIUMIII has the semantics of "support 
> > Pentium-III and above". If you want to compile a kernel that runs on 
> > both a Pentium-III and a Pentium-4 you choose MPENTIUMIII which implies 
> > X86_L1_CACHE_SHIFT=5 ...
> 
> Admittedly the other options could be changed to 
> 
> default "4" if (MELAN || M486 || M386) && !X86_GENERIC
> 
> but that looked a bit too ugly and it seems to work even without.

Your approach as well as the approach I'm currently working on breaks
the current semantics that a plain M386 produces a kernel that runs on
all CPUs.

> -Andi

cu
Adrian
Jamie Lokier April 28, 2003, 12:48 p.m. UTC | #5
Adrian Bunk wrote:
> Your approach as well as the approach I'm currently working on breaks
> the current semantics that a plain M386 produces a kernel that runs on
> all CPUs.

Good, because I _have_ a 386 and want a kernel that is tuned for it,
not a generic kernel.

cheers,
-- Jamie
Andi Kleen April 28, 2003, 10 p.m. UTC | #6
On Mon, Apr 28, 2003 at 02:19:20PM +0200, Adrian Bunk wrote:
> Your approach as well as the approach I'm currently working on breaks
> the current semantics that a plain M386 produces a kernel that runs on
> all CPUs.

You seem to be completely confused about what the patch is doing. 
CONFIG_X86_GENERIC does not break anything.

A kernel compiled with it will still run fine on the 386 (or whatever
the main CPU selection was). All it does is make the worst case, the
kernel running on a CPU with a larger cache line size than your "main"
CPU, not as bad.

In fact if you read my patchkits in the last days they are all aimed
at making kernels run on more CPUs, not less. The eventual 
goal is to make Athlon kernels run on all 686+ class CPUs, and P4 kernels run
on all 686+ CPUs, and 386 kernels run well on all CPUs without bad
performance penalties.

M386 is quite a bad example here anyway. The standard situation
is that people compile an SMP kernel for the P2 (which seems to be "the generic cpu"
these days[1]). That kernel is compiled with a cache line size of 32 bytes.

This 32 byte cache line size is used to avoid false sharing in a lot of data
structures; e.g. arrays of per CPU data are usually padded to cache line size
to make sure each CPU has its own cache line in the array. This unfortunately
does not work when you run it on a CPU with a bigger cache line, like an Athlon
with a 64 byte cache line or a P4 with a 128 byte cache line. In this case a lot
of performance will be lost on SMP because multiple CPUs will fight
for the data on a single cache line ("false sharing"). Always padding
to the worst case cache line size avoids this problem.
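
The padding described above can be sketched roughly as follows. This is an illustration only: it uses C11 _Alignas instead of the kernel's own alignment macros, and NR_CPUS, struct cpu_stat, stats and slot_offset() are hypothetical names. The shift of 7 is the value the patch selects for X86_GENERIC.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: worst case line size, shift 7 as selected by X86_GENERIC. */
#define L1_CACHE_SHIFT 7
#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/*
 * Hypothetical per-CPU statistics slot. Aligning the first member to the
 * line size raises the alignment of the whole struct, so sizeof() is
 * rounded up to a full cache line as well.
 */
struct cpu_stat {
    _Alignas(L1_CACHE_BYTES) unsigned long count;
};

/* One slot per CPU; no two slots share a 128 byte line. */
struct cpu_stat stats[NR_CPUS];

static_assert(sizeof(struct cpu_stat) == L1_CACHE_BYTES,
              "each slot should occupy exactly one cache line");

/* Byte offset of a CPU's slot from the start of the array. */
size_t slot_offset(int cpu)
{
    return (size_t)((char *)&stats[cpu] - (char *)stats);
}
```

If the same array were built with a shift of 5, four 32 byte slots would fit into a single 128 byte P4 line, which is the false sharing case described above.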

The issue is not an SMP only problem. Some device drivers already use
the cache line size to optimize PCI bus performance, and they have penalties
when the data is incorrectly padded.

Increasing the cache line size costs a bit of memory for more padding,
but overall the overhead is quite reasonable.
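
To put a rough number on that memory cost, here is a small sketch that rounds an object size up to the line size and multiplies by the CPU count. The helper names pad_to_line() and percpu_array_bytes() are hypothetical, and the sizes in the note below are made-up inputs rather than measurements from a real kernel.

```c
#include <stddef.h>

/* Round size up to the next multiple of the line size (1 << shift). */
size_t pad_to_line(size_t size, unsigned shift)
{
    size_t line = (size_t)1 << shift;
    return (size + line - 1) & ~(line - 1);
}

/* Total bytes used by an array with one padded element per CPU. */
size_t percpu_array_bytes(size_t elem_size, unsigned nr_cpus, unsigned shift)
{
    return pad_to_line(elem_size, shift) * nr_cpus;
}
```

For a hypothetical 40 byte element on 8 CPUs this gives 512 bytes with a shift of 5 and 1024 bytes with a shift of 7, so the extra cost is paid once per padded structure.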

As for Jamie's point: if you don't want your 386 kernel to be optimized
for the worst case, just don't enable the X86_GENERIC option.

-Andi

[1] ignoring K6 and C3, which are too poor to have CMOV.


Patch

diff -u linux-gencpu/arch/i386/Kconfig-o linux-gencpu/arch/i386/Kconfig
--- linux-gencpu/arch/i386/Kconfig-o	2003-04-27 02:40:32.000000000 +0200
+++ linux-gencpu/arch/i386/Kconfig	2003-04-27 03:50:08.000000000 +0200
@@ -273,6 +273,13 @@ 
 
 endchoice
 
+config X86_GENERIC
+	bool "Generic x86 support"
+	help
+	  Include some tuning for non-selected x86 CPUs too,
+	  when it has moderate overhead. This is intended for generic
+	  distribution kernels.
+
 #
 # Define implied options from the CPU selection here
 #
@@ -288,10 +295,10 @@ 
 
 config X86_L1_CACHE_SHIFT
 	int
+	default "7" if MPENTIUM4 || X86_GENERIC
 	default "4" if MELAN || M486 || M386
 	default "5" if MWINCHIP3D || MWINCHIP2 || MWINCHIPC6 || MCRUSOE || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2
 	default "6" if MK7 || MK8
-	default "7" if MPENTIUM4
 
 config RWSEM_GENERIC_SPINLOCK
 	bool