linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Update on AMD Athlon/Opteron/Athlon64 Prefetch Errata
@ 2003-09-11  0:56 richard.brunner
  2003-09-11  1:27 ` [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata Andi Kleen
  0 siblings, 1 reply; 55+ messages in thread
From: richard.brunner @ 2003-09-11  0:56 UTC (permalink / raw)
  To: linux-kernel

Dear LKML,

Continuing my yearly tradition of posting just one long
novel to LKML every year, here is the literary update on the
Prefetch Errata that the early 2.6 Kernels hit on AMD Athlon
Processors.

This previously published errata can occur infrequently and
is present in all AMD Athlon processors and earlier AMD
Opteron/Athlon64 processors.  See [1] and [2].

The full details are below, but the key point is that under
certain circumstances, prefetch instructions can get memory
management faults for addresses which would fault if they
were accessed by a load or store instruction.  We plan to
revise our published errata with the new information below.

The errata requires a kernel workaround, but the good news
is that it is:

  - Harmless in most cases where it could occur. Most of the
    time the prefetch will be targeting memory that is
    accessible under the current privilege mode. So the page
    will simply be "faulted in" slightly earlier than
    needed.

  - Rare and Infrequent. AMD Athlon processors have been
    available for years running numerous Operating Systems
    and only recently have we hit this errata outside of
    code specifically designed to target the errata --
    requiring tens of thousands of iterations to cause it.

  - It can be worked around. Andi Kleen has a 2.6 and a 2.4
    Kernel patches that we have tested at AMD on a large
    number of AMD Athlon processors and AMD Opteron/Athlon64
    processors (both legacy x86 and x86-64 long mode).  It
    works just fine. (Andi will be posting them soon when he 
    wakes up ;-)

  - AMD is fixing this in future revisions of AMD
    Opteron/Athlon64 processors.

  - Andi's kernel patches will not be needed on future
    AMD processors but it is forward compatible and so
    won't break on them either.
 


The Details
===========
Software prefetch instructions are defined to ignore page
faults. Under highly specific and detailed internal
circumstances, the following conditions may cause the
PREFETCH instruction to report a page fault.

+ The target address of the PREFETCH would cause a page
  fault if the address was accessed by an actual memory load
  or store instruction under the current privilege mode.

+ The instruction is a PREFETCH or PREFETCHNTA/0/1/2
  followed in execution-order by an actual or speculative
  byte-sized load to the same address.

  In this case, the page fault exception error code bits for
  the faulting PREFETCH would be identical to that for a
  byte-sized load to the same address.

+ The instruction is a PREFETCHW followed in execution-order
  by an actual or speculative byte-sized store to the same
  address.
  
  In this case, the page fault exception error code bits for
  the faulting PREFETCHW would be identical to that for a
  byte-sized store to the same address.

Note that some misaligned accesses can be broken up by the
processor into multiple accesses where at least one of the
accesses is a byte-sized access.

If the target address of the subsequent memory load or store
is aligned and not byte-sized, this errata does not occur
and no work-around is needed.

So the net effect is that an unexpected page fault may occur
infrequently on a PREFETCH instruction.


Kernel Work-around
=================
The kernel can work around the errata by modifying the Page
Fault Handler in the following way.  This is what Andi
Kleen's patches do.  Because the actual errata is infrequent
it does not produce an excessive number of page faults that
affect system performance.
   
+ Continue to allow the page fault handler to satisfy the
  page fault.  If the faulting instruction is permitted
  access to the page, return to it as usual.

+ If the faulting instruction is not permitted access to the
  page, scan the instruction stream bytes at the faulting
  Instruction Pointer to determine if the instruction is a
  PREFETCH. 

+ If it is not a PREFETCH instruction, generate the
  appropriate memory access control violation as
  appropriate.

+ If the faulting instruction is a PREFETCH instruction,
  simply return back to it; the internal hardware conditions
  that caused the PREFETCH to fault should be removed and
  operation should continue normally.


General Work-around
===================
If the page-fault handler for a kernel can be patched as
described above, no further action by software is
required. The following general work-arounds should only be
considered for kernels where the page-fault handler can not
be patched and a PREFETCH instruction could end up targeting
an address in an "inaccessible" page. (An "inaccessible"
page is one for which memory accesses are not allowed under
the current privilege mode.)

Because the actual errata is infrequent, it does not produce
an excessive number of page faults that affect system
performance.  Therefore a page fault from a PREFETCH
instruction for an address within an "accessible" page does
not require any general work-around.  (An "accessible" page
is one for which memory accesses are allowed under the
current privilege mode once the page is resident in memory)

Software can minimize the occurrence of the errata by
issuing only one PREFETCH instruction per cache-line (a
naturally-aligned 64-byte quantity on AMD Athlon and AMD
Opteron/Athlon64) and ensuring one of the following:

+ In many cases, if a particular target address of a
  prefetch is known to encounter this errata, simply change
  the prefetch to target the next byte.

+ Avoid prefetching inaccessible memory locations, when
  possible.

+ In the general case, ensure that the address used by the
  PREFETCH is offset into the middle of an aligned quadword
  near the end of the cache-line. For example, if the
  address desired to be prefetched is "ADDR", use an offset
  of 0x33 to compute the address used by the actual PREFETCH
  instruction as: "(ADDR & ~0x3f) + 0x33"




Footnotes
=========
[1] AMD Athlon(tm) Processor Model 6 Revision Guide 24332F June 2003.
    
www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24332.pdf


[2] Revision Guide for AMD Opteron(tm) Processors 25759 Rev. 3.07 Aug 2003

www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.PDF


] -Rich ...
] AMD Fellow


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11  0:56 Update on AMD Athlon/Opteron/Athlon64 Prefetch Errata richard.brunner
@ 2003-09-11  1:27 ` Andi Kleen
  2003-09-11  1:44   ` Andrew Morton
                     ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-11  1:27 UTC (permalink / raw)
  To: richard.brunner; +Cc: linux-kernel, akpm, torvalds

>     works just fine. (Andi will be posting them soon when he 
>     wakes up ;-)

He's still awake ;-)

Here is a patch to detect a prefect exception for 2.6.0test5/i386

I'm posting the versions for 2.4.22/x86-64 and 2.4.22/i386 in separate
mail. 2.6/x86-64 is not done yet, but will be in my next merge.

The patches only change the slow path in the page fault handler - 
PREFETCH is only checked for when the kernel would send a signal
anyways or print an oops. The check is also rather cheap so it is
unconditionally enabled.

It handles SSE2 prefetches and 3dnow! style prefetches.

The only tricky bit is that it has to be careful to avoid recursive
segfaults. The prefetch checker can cause another page fault when
it does __get_user, but these should not recurse more than once.

This is insured by the placement of the checks in the page fault handler and
only checking for prefetches if the fault came from user space
or the exception tables have been already checked.
To make this work without a per task exception nest counter
I had to change the SIGBUS handling path slightly. Now when
an get/put_user in kernel causes a SIGBUS it is not delivered
to user space. Instead you just get the standard EFAULT back.

It also removed Andrew's old workaround for the problem.

-Andi

--- linux-2.6.0test5-work/arch/i386/mm/fault.c-o	2003-05-27 03:00:20.000000000 +0200
+++ linux-2.6.0test5-work/arch/i386/mm/fault.c	2003-09-10 23:58:52.000000000 +0200
@@ -55,6 +55,77 @@
 	console_loglevel = loglevel_save;
 }
 
+/* Sometimes the CPU reports invalid exceptions on prefetch.
+   Check that here and ignore.
+   Opcode checker based on code by Richard Brunner */
+static int is_prefetch(struct pt_regs *regs, unsigned long addr)
+{ 
+	unsigned char *instr = (unsigned char *)(regs->eip);
+	int scan_more = 1;
+	int prefetch = 0; 
+	unsigned char *max_instr = instr + 15;
+
+	/* Avoid recursive faults. This makes kernel jumping to nirvana
+	   reporting work better. */
+	if (regs->eip == addr)
+		return 0; 
+
+	while (scan_more && instr < max_instr) { 
+		unsigned char opcode;
+		unsigned char instr_hi;
+		unsigned char instr_lo;
+
+		if (__get_user(opcode, instr))
+			break; 
+
+		instr_hi = opcode & 0xf0; 
+		instr_lo = opcode & 0x0f; 
+		instr++;
+
+		switch (instr_hi) { 
+		case 0x20:
+		case 0x30:
+			/* Values 0x26,0x2E,0x36,0x3E are valid x86
+			   prefixes.  In long mode, the CPU will signal
+			   invalid opcode if some of these prefixes are
+			   present so we will never get here anyway */
+			scan_more = ((instr_lo & 7) == 0x6);
+			break;
+			
+		case 0x40:
+			/* May be valid in long mode (REX prefixes) */
+			break; 
+			
+		case 0x60:
+			/* 0x64 thru 0x67 are valid prefixes in all modes. */
+			scan_more = (instr_lo & 0xC) == 0x4;
+			break;		
+		case 0xF0:
+			/* 0xF0, 0xF2, and 0xF3 are valid prefixes in all modes. */
+			scan_more = !instr_lo || (instr_lo>>1) == 1;
+			break;			
+		case 0x00:
+			/* Prefetch instruction is 0x0F0D or 0x0F18 */
+			scan_more = 0;
+			if (__get_user(opcode, instr)) 
+				break;
+			prefetch = (instr_lo == 0xF) &&
+				(opcode == 0x0D || opcode == 0x18);
+			break;			
+		default:
+			scan_more = 0;
+			break;
+		} 
+	}
+
+#if 0
+	if (prefetch)
+		printk("%s: prefetch caused page fault at %lx/%lx\n", current->comm,
+		       regs->eip, addr);
+#endif
+	return prefetch;
+}
+
 asmlinkage void do_invalid_op(struct pt_regs *, unsigned long);
 
 /*
@@ -110,7 +181,7 @@
 	 * atomic region then we must not take the fault..
 	 */
 	if (in_atomic() || !mm)
-		goto no_context;
+		goto bad_area_nosemaphore;
 
 	down_read(&mm->mmap_sem);
 
@@ -198,8 +269,12 @@
 bad_area:
 	up_read(&mm->mmap_sem);
 
+bad_area_nosemaphore:
 	/* User mode accesses just cause a SIGSEGV */
 	if (error_code & 4) {
+		if (is_prefetch(regs, address))
+			return;
+
 		tsk->thread.cr2 = address;
 		tsk->thread.error_code = error_code;
 		tsk->thread.trap_no = 14;
@@ -232,6 +307,9 @@
 	if (fixup_exception(regs))
 		return;
 
+ 	if (is_prefetch(regs, address))
+ 		return;
+
 /*
  * Oops. The kernel tried to access some bad page. We'll have to
  * terminate things with extreme prejudice.
@@ -286,10 +364,13 @@
 do_sigbus:
 	up_read(&mm->mmap_sem);
 
-	/*
-	 * Send a sigbus, regardless of whether we were in kernel
-	 * or user mode.
-	 */
+	/* Kernel mode? Handle exceptions or die */
+	if (!(error_code & 4))
+		goto no_context;
+
+	if (is_prefetch(regs, address))
+		return;
+
 	tsk->thread.cr2 = address;
 	tsk->thread.error_code = error_code;
 	tsk->thread.trap_no = 14;
@@ -298,10 +379,6 @@
 	info.si_code = BUS_ADRERR;
 	info.si_addr = (void *)address;
 	force_sig_info(SIGBUS, &info, tsk);
-
-	/* Kernel mode? Handle exceptions or die */
-	if (!(error_code & 4))
-		goto no_context;
 	return;
 
 vmalloc_fault:
--- linux-2.6.0test5-work/include/asm-i386/processor.h-o	2003-09-09 20:55:46.000000000 +0200
+++ linux-2.6.0test5-work/include/asm-i386/processor.h	2003-09-11 03:22:16.000000000 +0200
@@ -578,8 +578,6 @@
 #define ARCH_HAS_PREFETCH
 extern inline void prefetch(const void *x)
 {
-	if (cpu_data[0].x86_vendor == X86_VENDOR_AMD)
-		return;		/* Some athlons fault if the address is bad */
 	alternative_input(ASM_NOP4,
 			  "prefetchnta (%1)",
 			  X86_FEATURE_XMM,

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11  1:27 ` [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata Andi Kleen
@ 2003-09-11  1:44   ` Andrew Morton
  2003-09-11  1:47     ` Andi Kleen
  2003-09-11 13:54   ` Linus Torvalds
  2003-09-11 16:58   ` Jamie Lokier
  2 siblings, 1 reply; 55+ messages in thread
From: Andrew Morton @ 2003-09-11  1:44 UTC (permalink / raw)
  To: Andi Kleen; +Cc: richard.brunner, linux-kernel, torvalds

Andi Kleen <ak@suse.de> wrote:
>
>  +static int is_prefetch(struct pt_regs *regs, unsigned long addr)

Can we make this code go away if the configured CPU is, say, Intel?
(I couldn't find a sane CONFIG_ setting to use for this).

It might be vaguely interesting to add a user-visible counter for this
event? If someone somehow comes up with an application which hits the fault
frequently they will take a big performance hit.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11  1:44   ` Andrew Morton
@ 2003-09-11  1:47     ` Andi Kleen
  2003-09-11 14:15       ` Jeff Garzik
  0 siblings, 1 reply; 55+ messages in thread
From: Andi Kleen @ 2003-09-11  1:47 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andi Kleen, richard.brunner, linux-kernel, torvalds

On Wed, Sep 10, 2003 at 06:44:14PM -0700, Andrew Morton wrote:
> Andi Kleen <ak@suse.de> wrote:
> >
> >  +static int is_prefetch(struct pt_regs *regs, unsigned long addr)
> 
> Can we make this code go away if the configured CPU is, say, Intel?
> (I couldn't find a sane CONFIG_ setting to use for this).

It could be done but ... we are moving more and more to generic kernels
(e.g. see the alternative patch code which is enabled unconditionally)
So that when you have a kernel it will boot on near all modern CPUs.
Currently Athlon and P4 kernels run on each other for example.

I would hate to break this again just to save a few hundred bytes in 
this function. Also the overhead is very low so it is also not 
interesting to make it conditional for speed reasons.

> 
> It might be vaguely interesting to add a user-visible counter for this
> event? If someone somehow comes up with an application which hits the fault
> frequently they will take a big performance hit.

In that case they can just profile the kernel. is_prefetch should
show it then.

Of course someone can still add the counter if they want, I'm not
opposed to it.

-Andi


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11  1:27 ` [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata Andi Kleen
  2003-09-11  1:44   ` Andrew Morton
@ 2003-09-11 13:54   ` Linus Torvalds
  2003-09-11 14:01     ` Andi Kleen
  2003-09-11 14:14     ` Alan Cox
  2003-09-11 16:58   ` Jamie Lokier
  2 siblings, 2 replies; 55+ messages in thread
From: Linus Torvalds @ 2003-09-11 13:54 UTC (permalink / raw)
  To: Andi Kleen; +Cc: richard.brunner, linux-kernel, akpm


This patch is fragile and looks pointless.

What's wrong with the current status quo that just says "Athlon prefetch
is broken"?

		Linus


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 13:54   ` Linus Torvalds
@ 2003-09-11 14:01     ` Andi Kleen
  2003-09-11 14:14       ` Dave Jones
  2003-09-11 14:14     ` Alan Cox
  1 sibling, 1 reply; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 14:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: richard.brunner, linux-kernel, akpm

On Thu, 11 Sep 2003 06:54:53 -0700 (PDT)
Linus Torvalds <torvalds@osdl.org> wrote:

> 
> This patch is fragile and looks pointless.

Why do you think it is fragile?  I don't see anything fragile in it.

> 
> What's wrong with the current status quo that just says "Athlon prefetch
> is broken"?

It doesn't fix user space for once.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 13:54   ` Linus Torvalds
  2003-09-11 14:01     ` Andi Kleen
@ 2003-09-11 14:14     ` Alan Cox
  2003-09-11 14:24       ` Andi Kleen
  1 sibling, 1 reply; 55+ messages in thread
From: Alan Cox @ 2003-09-11 14:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, richard.brunner, Linux Kernel Mailing List, akpm

On Iau, 2003-09-11 at 14:54, Linus Torvalds wrote:
> This patch is fragile and looks pointless.
> 
> What's wrong with the current status quo that just says "Athlon prefetch
> is broken"?

Firstly performance, secondly user space exceptions. We work around lots
of other broken CPU things this one triggers a path that only matters on
Athlon.

There is *one* change needed. We shouldnt call is_prefetch unless the
current CPU is an Athlon. That way its a performance improvement for
PIV, and Athlon in general, fixes the bug and causes no fancy athlon
fixup code for non AMD processors.

Alan


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:01     ` Andi Kleen
@ 2003-09-11 14:14       ` Dave Jones
  2003-09-11 14:29         ` Andi Kleen
  0 siblings, 1 reply; 55+ messages in thread
From: Dave Jones @ 2003-09-11 14:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Linus Torvalds, richard.brunner, linux-kernel, akpm

On Thu, Sep 11, 2003 at 04:01:08PM +0200, Andi Kleen wrote:

 > > What's wrong with the current status quo that just says "Athlon prefetch
 > > is broken"?
 > It doesn't fix user space for once.

And for another, it cripples the earlier athlons which don't have this
errata. Andi's fix at least makes prefetch work again on those boxes.
It's also arguable that prefetch() helps the older K7's more than the
affected ones.

		Dave

-- 
 Dave Jones     http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11  1:47     ` Andi Kleen
@ 2003-09-11 14:15       ` Jeff Garzik
  2003-09-11 14:26         ` Andi Kleen
  0 siblings, 1 reply; 55+ messages in thread
From: Jeff Garzik @ 2003-09-11 14:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, richard.brunner, linux-kernel, torvalds

Andi Kleen wrote:
> It could be done but ... we are moving more and more to generic kernels
> (e.g. see the alternative patch code which is enabled unconditionally)
> So that when you have a kernel it will boot on near all modern CPUs.


Only with CONFIG_X86_GENERIC.  That's precisely why CONFIG_X86_GENERIC 
was created.

If I disabled CONFIG_X86_GENERIC and select CONFIG_MPENTIUM4, I darned 
well better not get any Athlon code.  The cpu setup code in particular I 
want to conditionalize, and there are other bits that need work... but 
for the most part it works as intended.

	Jeff




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:14     ` Alan Cox
@ 2003-09-11 14:24       ` Andi Kleen
  2003-09-11 14:28         ` Dave Jones
  2003-09-11 20:14         ` bill davidsen
  0 siblings, 2 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 14:24 UTC (permalink / raw)
  To: Alan Cox; +Cc: torvalds, richard.brunner, linux-kernel, akpm

On Thu, 11 Sep 2003 15:14:02 +0100
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> 
> There is *one* change needed. We shouldnt call is_prefetch unless the
> current CPU is an Athlon. That way its a performance improvement for
> PIV, and Athlon in general, fixes the bug and causes no fancy athlon
> fixup code for non AMD processors.

I considered that when writing the patch, but: is_prefetch is a single byte 
memory access for something already in cache. Checking for an Athlon
CPU needs two memory accesses in boot_cpu_data at least (checking vendor
and model) 

So checking for Athlon first would be actually more expensive for other CPUs.

The switch is left in the noise, gcc compiles it to very efficient code.

This ignores the possibility of a recursive fault in the __get_user, but these
are rather unlikely. Normal page-on-demand code faults are handled 
before is_prefetch is reached.

-Andi 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:15       ` Jeff Garzik
@ 2003-09-11 14:26         ` Andi Kleen
  2003-09-11 14:34           ` Jeff Garzik
  0 siblings, 1 reply; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 14:26 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: akpm, richard.brunner, linux-kernel, torvalds

On Thu, 11 Sep 2003 10:15:25 -0400
Jeff Garzik <jgarzik@pobox.com> wrote:

> Andi Kleen wrote:
> > It could be done but ... we are moving more and more to generic kernels
> > (e.g. see the alternative patch code which is enabled unconditionally)
> > So that when you have a kernel it will boot on near all modern CPUs.
> 
> 
> Only with CONFIG_X86_GENERIC.  That's precisely why CONFIG_X86_GENERIC 
> was created.

It was not created for that (I know that because I created it ;-)

X86_GENERIC is merely an optimization hint (currently it only changes the cache
line size hint) It does not change anything related to correctness. Everything
that handles correctness is checked unconditionally.

is_prefetch is a correctness thing.

> 
> If I disabled CONFIG_X86_GENERIC and select CONFIG_MPENTIUM4, I darned 
> well better not get any Athlon code.  The cpu setup code in particular I 
> want to conditionalize, and there are other bits that need work... but 
> for the most part it works as intended.

Now that's becomming silly. It's alttogether only a few KB and all
__init code anyways.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:24       ` Andi Kleen
@ 2003-09-11 14:28         ` Dave Jones
  2003-09-11 14:32           ` Andi Kleen
  2003-09-11 20:14         ` bill davidsen
  1 sibling, 1 reply; 55+ messages in thread
From: Dave Jones @ 2003-09-11 14:28 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Alan Cox, torvalds, richard.brunner, linux-kernel, akpm

On Thu, Sep 11, 2003 at 04:24:21PM +0200, Andi Kleen wrote:
 > I considered that when writing the patch, but: is_prefetch is a single byte 
 > memory access for something already in cache. Checking for an Athlon
 > CPU needs two memory accesses in boot_cpu_data at least (checking vendor
 > and model) 

You only need to check it once when the path is first taken, and then
set a variable that makes you exit as soon as you enter it again.

		Dave

-- 
 Dave Jones     http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:14       ` Dave Jones
@ 2003-09-11 14:29         ` Andi Kleen
  0 siblings, 0 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 14:29 UTC (permalink / raw)
  To: Dave Jones; +Cc: torvalds, richard.brunner, linux-kernel, akpm

On Thu, 11 Sep 2003 15:14:51 +0100
Dave Jones <davej@redhat.com> wrote:

> On Thu, Sep 11, 2003 at 04:01:08PM +0200, Andi Kleen wrote:
> 
>  > > What's wrong with the current status quo that just says "Athlon prefetch
>  > > is broken"?
>  > It doesn't fix user space for once.
> 
> And for another, it cripples the earlier athlons which don't have this
> errata. Andi's fix at least makes prefetch work again on those boxes.
> It's also arguable that prefetch() helps the older K7's more than the
> affected ones.

All Athlons have this Errata. I can trigger it on an old
900Mhz pre XP Athlon too. You just have to use 3dnow prefetch
instead of SSE prefetch.

BTW the older Athlons currently don't use prefetch because the alternative
patcher does not handle 3dnow style prefetch.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:28         ` Dave Jones
@ 2003-09-11 14:32           ` Andi Kleen
  0 siblings, 0 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 14:32 UTC (permalink / raw)
  To: Dave Jones; +Cc: alan, torvalds, richard.brunner, linux-kernel, akpm

On Thu, 11 Sep 2003 15:28:09 +0100
Dave Jones <davej@redhat.com> wrote:

> On Thu, Sep 11, 2003 at 04:24:21PM +0200, Andi Kleen wrote:
>  > I considered that when writing the patch, but: is_prefetch is a single byte 
>  > memory access for something already in cache. Checking for an Athlon
>  > CPU needs two memory accesses in boot_cpu_data at least (checking vendor
>  > and model) 
> 
> You only need to check it once when the path is first taken, and then
> set a variable that makes you exit as soon as you enter it again.

Checking the variable also an memory access. 

is_prefetch does a few more instructions around the memory access, but these
are completely left in the noise.

The is_prefetch check is likely faster even than checking that variable because
the chances that the EIP is already in cache are much higher than some rarely
used variable.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:26         ` Andi Kleen
@ 2003-09-11 14:34           ` Jeff Garzik
  2003-09-11 14:58             ` Andi Kleen
                               ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Jeff Garzik @ 2003-09-11 14:34 UTC (permalink / raw)
  To: Andi Kleen; +Cc: akpm, richard.brunner, linux-kernel, torvalds

Andi Kleen wrote:
> It was not created for that (I know that because I created it ;-)

hehe


> X86_GENERIC is merely an optimization hint (currently it only changes the cache
> line size hint) It does not change anything related to correctness. Everything
> that handles correctness is checked unconditionally.

When, building non-Pentium4-related code when CONFIG_MPENTIUM4 && 
!CONFIG_X86_GENERIC, it's OK that the code is incorrect for (picking 
example) AMD processors.

It would be a user bug to boot that on an AMD box, just like it would be 
user bug to boot a CONFIG_M586 kernel on an ancient 386.


> is_prefetch is a correctness thing.

When we know at compile time it's not needed, it should not be enabled.


>>If I disabled CONFIG_X86_GENERIC and select CONFIG_MPENTIUM4, I darned 
>>well better not get any Athlon code.  The cpu setup code in particular I 
>>want to conditionalize, and there are other bits that need work... but 
>>for the most part it works as intended.
> 
> 
> Now that's becomming silly. It's alttogether only a few KB and all
> __init code anyways.


If you're doing crazy LinuxBIOS stuff where flash size is limited, it 
makes a lot of sense.  (and I do such crazy things)  The core 2.6 kernel 
has really bloated with optional features, IMO.

	Jeff




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:34           ` Jeff Garzik
@ 2003-09-11 14:58             ` Andi Kleen
  2003-09-11 15:06               ` Jeff Garzik
  2003-09-11 20:08               ` bill davidsen
  2003-09-11 19:56             ` bill davidsen
  2003-09-12 17:32             ` Eric W. Biederman
  2 siblings, 2 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 14:58 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: akpm, richard.brunner, linux-kernel, torvalds

On Thu, 11 Sep 2003 10:34:36 -0400
Jeff Garzik <jgarzik@pobox.com> wrote:

> 
> > X86_GENERIC is merely an optimization hint (currently it only changes the cache
> > line size hint) It does not change anything related to correctness. Everything
> > that handles correctness is checked unconditionally.
> 
> When, building non-Pentium4-related code when CONFIG_MPENTIUM4 && 
> !CONFIG_X86_GENERIC, it's OK that the code is incorrect for (picking 
> example) AMD processors.

No 2.6 changed that. On 2.6 you can exchange the kernels.

[that was mainly done for distributions, but helps other users too]
 
> If you're doing crazy LinuxBIOS stuff where flash size is limited, it 
> makes a lot of sense.  (and I do such crazy things)  The core 2.6 kernel 
> has really bloated with optional features, IMO.

If you really want to save text space just use .bz2 compression 
or compile the kernel with -Os. There are also other subsystems
that would benefit much more (better effort/cost ratio) than adding
micro #ifdefs to core code.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:58             ` Andi Kleen
@ 2003-09-11 15:06               ` Jeff Garzik
  2003-09-11 20:08               ` bill davidsen
  1 sibling, 0 replies; 55+ messages in thread
From: Jeff Garzik @ 2003-09-11 15:06 UTC (permalink / raw)
  To: Andi Kleen; +Cc: akpm, richard.brunner, linux-kernel, torvalds

Andi Kleen wrote:
> On Thu, 11 Sep 2003 10:34:36 -0400
> Jeff Garzik <jgarzik@pobox.com> wrote:
> 
> 
>>>X86_GENERIC is merely an optimization hint (currently it only changes the cache
>>>line size hint) It does not change anything related to correctness. Everything
>>>that handles correctness is checked unconditionally.
>>
>>When, building non-Pentium4-related code when CONFIG_MPENTIUM4 && 
>>!CONFIG_X86_GENERIC, it's OK that the code is incorrect for (picking 
>>example) AMD processors.
> 
> 
> No 2.6 changed that. On 2.6 you can exchange the kernels.
> 
> [that was mainly done for distributions, but helps other users too]


If distributions are not building with CONFIG_X86_GENERIC, then they're 
broken.  So it was rather pointless for 2.6 to "change that."

Luckily, CONFIG_X86_GENERIC allows us the opportunity to be _less_ 
generic when it's not defined, regardless of what you originally 
intended ;-)

	Jeff




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11  1:27 ` [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata Andi Kleen
  2003-09-11  1:44   ` Andrew Morton
  2003-09-11 13:54   ` Linus Torvalds
@ 2003-09-11 16:58   ` Jamie Lokier
  2003-09-11 17:05     ` Andi Kleen
  2 siblings, 1 reply; 55+ messages in thread
From: Jamie Lokier @ 2003-09-11 16:58 UTC (permalink / raw)
  To: Andi Kleen; +Cc: richard.brunner, linux-kernel, akpm, torvalds

Andi Kleen wrote:
> +static int is_prefetch(struct pt_regs *regs, unsigned long addr)

Do I understand that certain values of "addr" can't be due to the
erratum?

In which case, could you skip most of the is_prefetch() instruction
decoder with a test like this?:

	if ((addr & 3) == 0)
		return 0;

I'm not sure from the description of the erratum what, exactly, are
the possible addresses which can appear in the fault information.

-- Jamie

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 16:58   ` Jamie Lokier
@ 2003-09-11 17:05     ` Andi Kleen
  2003-09-11 17:32       ` Jamie Lokier
  0 siblings, 1 reply; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 17:05 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: richard.brunner, linux-kernel, akpm, torvalds

On Thu, 11 Sep 2003 17:58:45 +0100
Jamie Lokier <jamie@shareable.org> wrote:

> Andi Kleen wrote:
> > +static int is_prefetch(struct pt_regs *regs, unsigned long addr)
> 
> Do I understand that certain values of "addr" can't be due to the
> erratum?
> 
> In which case, could you skip most of the is_prefetch() instruction
> decoder with a test like this?:
> 
> 	if ((addr & 3) == 0)
> 		return 0;

Maybe. But gcc generates quite good code (binary decision tree) code for
the switch() statement already so I don't think it is worth it to go 
through complications just to avoid that.

The decoder looks big in C, but when you take a look at its hot path in 
assembly it is quite fast.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 17:05     ` Andi Kleen
@ 2003-09-11 17:32       ` Jamie Lokier
  2003-09-11 17:39         ` Andi Kleen
  0 siblings, 1 reply; 55+ messages in thread
From: Jamie Lokier @ 2003-09-11 17:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: richard.brunner, linux-kernel, akpm, torvalds

Andi Kleen wrote:
> > 	if ((addr & 3) == 0)
> > 		return 0;
> 
> Maybe. But gcc generates quite good code (binary decision tree) code for
> the switch() statement already so I don't think it is worth it to go 
> through complications just to avoid that.
> 
> The decoder looks big in C, but when you take a look at its hot path in 
> assembly it is quite fast.

I wonder.  No part of the signal path looks especially slow, and yet
as a whole it isn't that fast.  The odd percentage slowdown here and
there is quite bothersome.

GCC generates a decision tree.  You realise that about half of the
branches in that tree will be mispredicted?

-- Jamie

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 17:32       ` Jamie Lokier
@ 2003-09-11 17:39         ` Andi Kleen
  2003-09-11 17:48           ` Jamie Lokier
  0 siblings, 1 reply; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 17:39 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: richard.brunner, linux-kernel, akpm, torvalds

On Thu, 11 Sep 2003 18:32:45 +0100
Jamie Lokier <jamie@shareable.org> wrote:

> Andi Kleen wrote:
> > > 	if ((addr & 3) == 0)
> > > 		return 0;
> > 
> > Maybe. But gcc generates quite good code (binary decision tree) code for
> > the switch() statement already so I don't think it is worth it to go 
> > through complications just to avoid that.
> > 
> > The decoder looks big in C, but when you take a look at its hot path in 
> > assembly it is quite fast.
> 
> I wonder.  No part of the signal path looks especially slow, and yet
> as a whole it isn't that fast.  The odd percentage slowdown here and
> there is quite bothersome.
> 
> GCC generates a decision tree.  You realise that about half of the
> branches in that tree will be mispredicted?

True. Doing a BTSL in a bitmap would be probably faster to check
bytes in a switch. Or maybe gcc should do that, but it doesn't.

But I doubt the small performance advantage of that for an still quite 
obscure path is worth the code uglification that would be require to do
it without gcc support.

To put it into perspective:

signal exception path is thousands of cycles, we're talking about tens
of cycles here.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 17:39         ` Andi Kleen
@ 2003-09-11 17:48           ` Jamie Lokier
  2003-09-11 18:18             ` Andi Kleen
  0 siblings, 1 reply; 55+ messages in thread
From: Jamie Lokier @ 2003-09-11 17:48 UTC (permalink / raw)
  To: Andi Kleen; +Cc: richard.brunner, linux-kernel, akpm, torvalds

Andi Kleen wrote:
> signal exception path is thousands of cycles, we're talking about tens
> of cycles here.

<hand-waving>

Tens vs thousands == percentage points.

Isn't it about 20 cycles per mispredicted branch on a P4?

Five of those and we're talking several percent slowdown, ridiculous
as it seems.

</hand-waving>

-- Jamie

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 17:48           ` Jamie Lokier
@ 2003-09-11 18:18             ` Andi Kleen
  2003-09-11 18:59               ` Jamie Lokier
  0 siblings, 1 reply; 55+ messages in thread
From: Andi Kleen @ 2003-09-11 18:18 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: richard.brunner, linux-kernel, akpm, torvalds

On Thu, 11 Sep 2003 18:48:39 +0100
Jamie Lokier <jamie@shareable.org> wrote:

> Andi Kleen wrote:
> > signal exception path is thousands of cycles, we're talking about tens
> > of cycles here.
> 
> <hand-waving>
> 
> Tens vs thousands == percentage points.

It is more thousands than tens :-)

[just the page fault alone is quite costly]

> 
> Isn't it about 20 cycles per mispredicted branch on a P4?
> 
> Five of those and we're talking several percent slowdown, ridiculous
> as it seems.

There are a lot of conditional branches in the signal
path. If you don't believe me I can send you simics full instruction traces of it.
I'm not going to believe that 2-3 more make a significant difference.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 18:18             ` Andi Kleen
@ 2003-09-11 18:59               ` Jamie Lokier
  0 siblings, 0 replies; 55+ messages in thread
From: Jamie Lokier @ 2003-09-11 18:59 UTC (permalink / raw)
  To: Andi Kleen; +Cc: richard.brunner, linux-kernel, akpm, torvalds

Andi Kleen wrote:
> There are a lot of conditional branches in the signal path. If you
> don't believe me I can send you simics full instruction traces of it.

Oh, I know.  The signal machinery is disturbingly branch-heavy.

-- Jamie

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:34           ` Jeff Garzik
  2003-09-11 14:58             ` Andi Kleen
@ 2003-09-11 19:56             ` bill davidsen
  2003-09-11 20:44               ` Alan Cox
  2003-09-12 17:32             ` Eric W. Biederman
  2 siblings, 1 reply; 55+ messages in thread
From: bill davidsen @ 2003-09-11 19:56 UTC (permalink / raw)
  To: linux-kernel

In article <3F6087FC.7090508@pobox.com>,
Jeff Garzik  <jgarzik@pobox.com> wrote:
| Andi Kleen wrote:
| > It was not created for that (I know that because I created it ;-)
| 
| hehe
| 
| 
| > X86_GENERIC is merely an optimization hint (currently it only changes the cache
| > line size hint) It does not change anything related to correctness. Everything
| > that handles correctness is checked unconditionally.
| 
| When, building non-Pentium4-related code when CONFIG_MPENTIUM4 && 
| !CONFIG_X86_GENERIC, it's OK that the code is incorrect for (picking 
| example) AMD processors.
| 
| It would be a user bug to boot that on an AMD box, just like it would be 
| user bug to boot a CONFIG_M586 kernel on an ancient 386.
| 
| 
| > is_prefetch is a correctness thing.
| 
| When we know at compile time it's not needed, it should not be enabled.

Clearly that's right. This buys nothing on CPUs which don't have the
problem, why have *any* overhead in size and speed? It's too bad that
people have to read around all that code, they don't need to give it a
home in their RAM and execute it as well.
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:58             ` Andi Kleen
  2003-09-11 15:06               ` Jeff Garzik
@ 2003-09-11 20:08               ` bill davidsen
  1 sibling, 0 replies; 55+ messages in thread
From: bill davidsen @ 2003-09-11 20:08 UTC (permalink / raw)
  To: linux-kernel

In article <20030911165826.06f2fd16.ak@suse.de>,
Andi Kleen  <ak@suse.de> wrote:

| If you really want to save text space just use .bz2 compression 
| or compile the kernel with -Os. There are also other subsystems
| that would benefit much more (better effort/cost ratio) than adding
| micro #ifdefs to core code.

Good idea, let's put stuff like this in hardware-dependent includes, and
just have a single line in the core code  like
  check_special_pfault_cases;
and that documents what is happening as well as avoiding reading around
it. It seems silly to leave a big hunk of code in when developers are
making efforts to drop cruft and keep Linux practical for embedded
systems.

People were willing to drop the whole prefetch feature, I don't see that
micro ifdefs are a bad thing, it's just that thought needs to go into
making the code readable.
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:24       ` Andi Kleen
  2003-09-11 14:28         ` Dave Jones
@ 2003-09-11 20:14         ` bill davidsen
  1 sibling, 0 replies; 55+ messages in thread
From: bill davidsen @ 2003-09-11 20:14 UTC (permalink / raw)
  To: linux-kernel

In article <20030911162421.419f4432.ak@suse.de>,
Andi Kleen  <ak@suse.de> wrote:
| On Thu, 11 Sep 2003 15:14:02 +0100
| Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
| 
| > 
| > There is *one* change needed. We shouldnt call is_prefetch unless the
| > current CPU is an Athlon. That way its a performance improvement for
| > PIV, and Athlon in general, fixes the bug and causes no fancy athlon
| > fixup code for non AMD processors.
| 
| I considered that when writing the patch, but: is_prefetch is a single byte 
| memory access for something already in cache. Checking for an Athlon
| CPU needs two memory accesses in boot_cpu_data at least (checking vendor
| and model) 

You are still missing the point, what's needed is not a better way to do
useless work, it's a way to avoid doing it at all. This code should only
be built for CPU's which need it, no accesses needed except in .config.
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 19:56             ` bill davidsen
@ 2003-09-11 20:44               ` Alan Cox
  2003-09-11 21:29                 ` Mike Fedyk
  2003-09-11 21:38                 ` bill davidsen
  0 siblings, 2 replies; 55+ messages in thread
From: Alan Cox @ 2003-09-11 20:44 UTC (permalink / raw)
  To: bill davidsen; +Cc: Linux Kernel Mailing List

On Iau, 2003-09-11 at 20:56, bill davidsen wrote:
> | When we know at compile time it's not needed, it should not be enabled.
> 
> Clearly that's right. This buys nothing on CPUs which don't have the
> problem, why have *any* overhead in size and speed? It's too bad that
> people have to read around all that code, they don't need to give it a
> home in their RAM and execute it as well.

We have always built kernels that contained the support for older chips.
A 586 kernel for example is minutely slowed by the fact it will run on
the Pentium Pro.

If someone wants to put in clear "this CPU only" stuff as well then
fine, but with my distributor hat on I defy you to benchmark the
difference in the real world between PIV and PIV with 100 bytes of extra
workaround code.

You could get that much code back by spending 10 minutes tidying some
random other piece of code you use, or shortening a couple of printk
messages.

 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 20:44               ` Alan Cox
@ 2003-09-11 21:29                 ` Mike Fedyk
  2003-09-11 21:38                 ` bill davidsen
  1 sibling, 0 replies; 55+ messages in thread
From: Mike Fedyk @ 2003-09-11 21:29 UTC (permalink / raw)
  To: Alan Cox; +Cc: bill davidsen, Linux Kernel Mailing List

On Thu, Sep 11, 2003 at 09:44:35PM +0100, Alan Cox wrote:
> On Iau, 2003-09-11 at 20:56, bill davidsen wrote:
> > | When we know at compile time it's not needed, it should not be enabled.
> > 
> > Clearly that's right. This buys nothing on CPUs which don't have the
> > problem, why have *any* overhead in size and speed? It's too bad that
> > people have to read around all that code, they don't need to give it a
> > home in their RAM and execute it as well.
> 
> We have always built kernels that contained the support for older chips.
> A 586 kernel for example is minutely slowed by the fact it will run on
> the Pentium Pro.
> 
> If someone wants to put in clear "this CPU only" stuff as well then
> fine, but with my distributor hat on I defy you to benchmark the
> difference in the real world between PIV and PIV with 100 bytes of extra
> workaround code.
> 
> You could get that much code back by spending 10 minutes tidying some
> random other piece of code you use, or shortening a couple of printk
> messages.

True, but then why have config options at all?

Why not just put this patch in as is, since it fits with the rest of the way
things are done now, and put it in the to do list for the kernel janitors to
split up the workarounds per cpu, and make config options so they can be
enabled for the specific cpu, and make the generic case just enable a bunch
of those individual cpu config options.

This change won't make it in 2.6, so it belongs in the to do until it has
action taken on it.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 20:44               ` Alan Cox
  2003-09-11 21:29                 ` Mike Fedyk
@ 2003-09-11 21:38                 ` bill davidsen
  1 sibling, 0 replies; 55+ messages in thread
From: bill davidsen @ 2003-09-11 21:38 UTC (permalink / raw)
  To: linux-kernel

In article <1063313075.3881.4.camel@dhcp23.swansea.linux.org.uk>,
Alan Cox  <alan@lxorguk.ukuu.org.uk> wrote:
| On Iau, 2003-09-11 at 20:56, bill davidsen wrote:
| > | When we know at compile time it's not needed, it should not be enabled.
| > 
| > Clearly that's right. This buys nothing on CPUs which don't have the
| > problem, why have *any* overhead in size and speed? It's too bad that
| > people have to read around all that code, they don't need to give it a
| > home in their RAM and execute it as well.
| 
| We have always built kernels that contained the support for older chips.
| A 586 kernel for example is minutely slowed by the fact it will run on
| the Pentium Pro.
| 
| If someone wants to put in clear "this CPU only" stuff as well then
| fine, but with my distributor hat on I defy you to benchmark the
| difference in the real world between PIV and PIV with 100 bytes of extra
| workaround code.

I was not even thinking of CPU overhead, embedded applications benchmark
changes with "size" rather than "/bin/time" and this code is relatively
large and unlikely to be useful on any embedded system.

| You could get that much code back by spending 10 minutes tidying some
| random other piece of code you use, or shortening a couple of printk
| messages.

Perhaps for 2.7 a good case could be made for i18n on some of the
printk's, including some very small messages indeed for the embedded
folks. Most are not overly verbose now, but could be replaced by
abbreviations, or a magic message number.

I'm not against the fix, I do 5-6 hours a week on a K7-1400 original
Athlon, and I'll be running applications on some Duron boxen late this
year. I just find the work done to support embedded applications
evidence that 2.6 will be better than 2.6 without nearly as much
hacking. This code is large enough that a few ifdefs will have less
effect size on the kernel source than the lack of them will on the
binary.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-11 14:34           ` Jeff Garzik
  2003-09-11 14:58             ` Andi Kleen
  2003-09-11 19:56             ` bill davidsen
@ 2003-09-12 17:32             ` Eric W. Biederman
  2003-09-12 17:56               ` Andi Kleen
  2003-09-12 18:03               ` Mike Fedyk
  2 siblings, 2 replies; 55+ messages in thread
From: Eric W. Biederman @ 2003-09-12 17:32 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Andi Kleen, akpm, richard.brunner, linux-kernel, torvalds

Jeff Garzik <jgarzik@pobox.com> writes:

> >> If I disabled CONFIG_X86_GENERIC and select CONFIG_MPENTIUM4, I darned well
> >> better not get any Athlon code.  The cpu setup code in particular I want to
> >> conditionalize, and there are other bits that need work... but for the most
> >> part it works as intended.
> > Now that's becomming silly. It's alttogether only a few KB and all
> > __init code anyways.
> 
> 
> If you're doing crazy LinuxBIOS stuff where flash size is limited, it makes a
> lot of sense.  (and I do such crazy things)  The core 2.6 kernel has really
> bloated with optional features, IMO.

The size of a minimal bzImage (the stuff you can't compile out) has
increased by roughly 400KB since 1.0 200KB since 2.2 and 100KB since 2.4.

So please pardon those of us who complain at things that can't be
configured out.  The x86 kernel is on the edge of becoming useless
in some embedded scenarios because it is so fat.

When we can compile out code, we can at least narrow down where the
problems are.

I have to agree with Jeff the LinuxBIOS stuff is  crippled right now
because of this.

There may be better places to attack.  But new code is what is up for
examination and is easiest to fix.

Eric

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 17:32             ` Eric W. Biederman
@ 2003-09-12 17:56               ` Andi Kleen
  2003-09-12 17:59                 ` Jeff Garzik
  2003-09-14 23:47                 ` bill davidsen
  2003-09-12 18:03               ` Mike Fedyk
  1 sibling, 2 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-12 17:56 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: jgarzik, akpm, richard.brunner, linux-kernel, torvalds

On 12 Sep 2003 11:32:42 -0600
ebiederm@xmission.com (Eric W. Biederman) wrote:


> There may be better places to attack.  But new code is what is up for
> examination and is easiest to fix.

With is_prefetch:

   text    data     bss     dec     hex filename
   2782       4       0    2786     ae2 arch/i386/mm/fault.o

Without is_prefetch:

 text    data     bss     dec     hex filename
   2446       4       0    2450     992 arch/i386/mm/fault.o

Difference 332 bytes

If you start your attack on 332 bytes then IMHO you have your priorities wrong ;-)

The main reason I'm really against this is that currently the P4 kernels work
fine on Athlon. Just when is_prefetch is not integrated in them there will 
be an mysterious oops once every three months in the kernel in prefetch
on Athlon.
 
That would be bad. The alternative would be to prevent the P4 kernel
from booting on the Athlon at all, but doing that for 332 bytes
would seem a bit silly.

-Andi

P.S.: If you really want to shrink 2.6 I would start with making sysfs optional.
That would likely help much more than micro optimizing non bloated parts.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 17:56               ` Andi Kleen
@ 2003-09-12 17:59                 ` Jeff Garzik
  2003-09-12 18:22                   ` Adrian Bunk
  2003-09-14 23:47                 ` bill davidsen
  1 sibling, 1 reply; 55+ messages in thread
From: Jeff Garzik @ 2003-09-12 17:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Eric W. Biederman, akpm, richard.brunner, linux-kernel, torvalds

Andi Kleen wrote:
> The main reason I'm really against this is that currently the P4 kernels work
> fine on Athlon. Just when is_prefetch is not integrated in them there will 
> be an mysterious oops once every three months in the kernel in prefetch
> on Athlon.


Booting a P4 kernel _without_ CONFIG_X86_GENERIC on an Athlon would be a 
user bug.

	Jeff




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 17:32             ` Eric W. Biederman
  2003-09-12 17:56               ` Andi Kleen
@ 2003-09-12 18:03               ` Mike Fedyk
  1 sibling, 0 replies; 55+ messages in thread
From: Mike Fedyk @ 2003-09-12 18:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jeff Garzik, Andi Kleen, akpm, richard.brunner, linux-kernel, torvalds

On Fri, Sep 12, 2003 at 11:32:42AM -0600, Eric W. Biederman wrote:
> The size of a minimal bzImage (the stuff you can't compile out) has
> increased by roughly 400KB since 1.0 200KB since 2.2 and 100KB since 2.4.
> 
> So please pardon those of us who complain at things that can't be
> configured out.  The x86 kernel is on the edge of becoming useless
> in some embedded scenarios because it is so fat.
> 
> When we can compile out code, we can at least narrow down where the
> problems are.
> 
> I have to agree with Jeff the LinuxBIOS stuff is  crippled right now
> because of this.
> 
> There may be better places to attack.  But new code is what is up for
> examination and is easiest to fix.

Yes, but is there enough framework to make this reasonable?

Is config generic with its new semantics (different from the origional
author) good enough for this?

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 17:59                 ` Jeff Garzik
@ 2003-09-12 18:22                   ` Adrian Bunk
  2003-09-12 18:28                     ` Andi Kleen
  2003-09-14 23:49                     ` bill davidsen
  0 siblings, 2 replies; 55+ messages in thread
From: Adrian Bunk @ 2003-09-12 18:22 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Andi Kleen, Eric W. Biederman, akpm, richard.brunner,
	linux-kernel, torvalds

On Fri, Sep 12, 2003 at 01:59:43PM -0400, Jeff Garzik wrote:
> Andi Kleen wrote:
> >The main reason I'm really against this is that currently the P4 kernels 
> >work
> >fine on Athlon. Just when is_prefetch is not integrated in them there will 
> >be an mysterious oops once every three months in the kernel in prefetch
> >on Athlon.
> 
> 
> Booting a P4 kernel _without_ CONFIG_X86_GENERIC on an Athlon would be a 
> user bug.

But even CONFIG_X86_GENERIC doesn't do what you expect. A kernel 
compiled for Athlon wouldn't run on a Pentium 4 even with 
CONFIG_X86_GENERIC.

Quoting arch/i386/Kconfig in -test5:

<--  snip  -->

config X86_USE_3DNOW
        bool
        depends on MCYRIXIII || MK7
        default y

<--  snip  -->

My patch in the mail

  RFC: [2.6 patch] better i386 CPU selection

tries to solve these problem with a different approach (the user selects 
all CPUs he wants to support).

> 	Jeff

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:22                   ` Adrian Bunk
@ 2003-09-12 18:28                     ` Andi Kleen
  2003-09-12 18:39                       ` Jeff Garzik
                                         ` (4 more replies)
  2003-09-14 23:49                     ` bill davidsen
  1 sibling, 5 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-12 18:28 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: jgarzik, ebiederm, akpm, richard.brunner, linux-kernel, torvalds

On Fri, 12 Sep 2003 20:22:16 +0200
Adrian Bunk <bunk@fs.tum.de> wrote:


> 
> But even CONFIG_X86_GENERIC doesn't do what you expect. A kernel 
> compiled for Athlon wouldn't run on a Pentium 4 even with 
> CONFIG_X86_GENERIC.

It does. Just try it.

> 
> Quoting arch/i386/Kconfig in -test5:
> 
> <--  snip  -->
> 
> config X86_USE_3DNOW
>         bool
>         depends on MCYRIXIII || MK7
>         default y

That's obsolete and could be removed. All 3dnow! code is dynamically patched depending on the CPUID.

-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:28                     ` Andi Kleen
@ 2003-09-12 18:39                       ` Jeff Garzik
  2003-09-12 18:45                       ` Jeff Garzik
                                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 55+ messages in thread
From: Jeff Garzik @ 2003-09-12 18:39 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Adrian Bunk, ebiederm, akpm, richard.brunner, linux-kernel, torvalds

Andi Kleen wrote:
> That's obsolete and could be removed. All 3dnow! code is dynamically patched depending on the CPUID.


Ug.  Why compile in 3dnow code when you don't need it?  There is a 
sizeable number of processors without 3dnow...

	Jeff




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:28                     ` Andi Kleen
  2003-09-12 18:39                       ` Jeff Garzik
@ 2003-09-12 18:45                       ` Jeff Garzik
  2003-09-12 18:48                       ` Adrian Bunk
                                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 55+ messages in thread
From: Jeff Garzik @ 2003-09-12 18:45 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Adrian Bunk, ebiederm, akpm, richard.brunner, linux-kernel, torvalds

Andi Kleen wrote:
>>config X86_USE_3DNOW
>>        bool
>>        depends on MCYRIXIII || MK7
>>        default y
> 
> 
> All 3dnow! code is dynamically patched depending on the CPUID.


Note that the people who care most about x86 kernel size, such as people 
running on brand new 486 and 586 embedded processors (AMD Elan, ...) are 
precisely the people that don't need the "convenience" of having the 
feature dynamically patched into the kernel.

Removal of CONFIG_SSE2 was wrong for this same reason.

300 bytes here, 300 bytes there, and all this code is adding up quick.

We have CONFIG_EMBEDDED to hide such things from your sight, so you 
don't even have to worry about them...  they will always be dynamically 
patched in your kernels :)

	Jeff




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:28                     ` Andi Kleen
  2003-09-12 18:39                       ` Jeff Garzik
  2003-09-12 18:45                       ` Jeff Garzik
@ 2003-09-12 18:48                       ` Adrian Bunk
  2003-09-12 19:07                         ` Andi Kleen
  2003-09-12 19:05                       ` Martin Schlemmer
  2003-09-14 23:51                       ` bill davidsen
  4 siblings, 1 reply; 55+ messages in thread
From: Adrian Bunk @ 2003-09-12 18:48 UTC (permalink / raw)
  To: Andi Kleen
  Cc: jgarzik, ebiederm, akpm, richard.brunner, linux-kernel, torvalds

On Fri, Sep 12, 2003 at 08:28:51PM +0200, Andi Kleen wrote:
> On Fri, 12 Sep 2003 20:22:16 +0200
> Adrian Bunk <bunk@fs.tum.de> wrote:
> 
> 
> > 
> > But even CONFIG_X86_GENERIC doesn't do what you expect. A kernel 
> > compiled for Athlon wouldn't run on a Pentium 4 even with 
> > CONFIG_X86_GENERIC.
> 
> It does. Just try it.
> 
> > 
> > Quoting arch/i386/Kconfig in -test5:
> > 
> > <--  snip  -->
> > 
> > config X86_USE_3DNOW
> >         bool
> >         depends on MCYRIXIII || MK7
> >         default y
> 
> That's obsolete and could be removed. All 3dnow! code is dynamically patched depending on the CPUID.

Quoting e.g. arch/i386/lib/memcpy.c:

<--  snip  -->

void * memcpy(void * to, const void * from, size_t n)
{
#ifdef CONFIG_X86_USE_3DNOW
        return __memcpy3d(to, from, n);
#else
        return __memcpy(to, from, n);
#endif
}

<--  snip  -->

This is hardly dynamic.

> -Andi

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:28                     ` Andi Kleen
                                         ` (2 preceding siblings ...)
  2003-09-12 18:48                       ` Adrian Bunk
@ 2003-09-12 19:05                       ` Martin Schlemmer
  2003-09-12 19:30                         ` Andi Kleen
  2003-09-14 23:51                       ` bill davidsen
  4 siblings, 1 reply; 55+ messages in thread
From: Martin Schlemmer @ 2003-09-12 19:05 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Adrian Bunk, jgarzik, ebiederm, akpm, richard.brunner, LKML, torvalds

On Fri, 2003-09-12 at 20:28, Andi Kleen wrote:
> On Fri, 12 Sep 2003 20:22:16 +0200
> Adrian Bunk <bunk@fs.tum.de> wrote:
> 
> 
> > 
> > But even CONFIG_X86_GENERIC doesn't do what you expect. A kernel 
> > compiled for Athlon wouldn't run on a Pentium 4 even with 
> > CONFIG_X86_GENERIC.
> 
> It does. Just try it.
> 
> > 
> > Quoting arch/i386/Kconfig in -test5:
> > 
> > <--  snip  -->
> > 
> > config X86_USE_3DNOW
> >         bool
> >         depends on MCYRIXIII || MK7
> >         default y
> 
> That's obsolete and could be removed. All 3dnow! code is dynamically patched depending on the CPUID.
> 

Ok, so how many instructions was added by this ?  Or is it
just in the init code ?  What else just add 'just another
one or two instructions' to common paths because of this?

Which ever way, the point I and some of the others (besides the
additional one from the embedded guys) want to make, is if I
select the CPU to be Pentium4, it means I want a kernel that is
optimised for my P4, without extra crap that I do not need.  Sure,
its an extra instruction here, two there, etc - but when will it
be too much ?  Is this not maybe the fabled 'slight slowdown' that
so many  people complain about from round the 2.5.[67]'s ?

Ok, so maybe my opinion about X86_GENERIC is not as intended, but
then IMHO, it should be 'fixed'.  I could not care less if my kernel
only boot just on my box, never mind on another P4 - I just want
the most optimised on possible.  Sure, some guys want a more generic
kernel - get X86_GENERIC to work for them.  Same for distro's.

I have long wondered if everything in arch/i386/kernel/cpu/ is
really linked in (meaning with no #ifdef as it now looks to be
at a quick peek), or if it was just easier to link them all,
but have non generic stuff (amd/cyrix/whatever specific code)
filtered by ifdef's.

This is just me, but why then don't we then just drop the specific
arch selection, and just have generics instead of pulling a sock
over the user's eyes ?


Thanks,

-- 
Martin Schlemmer



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:48                       ` Adrian Bunk
@ 2003-09-12 19:07                         ` Andi Kleen
  0 siblings, 0 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-12 19:07 UTC (permalink / raw)
  To: Adrian Bunk
  Cc: jgarzik, ebiederm, akpm, richard.brunner, linux-kernel, torvalds

On Fri, 12 Sep 2003 20:48:01 +0200
Adrian Bunk <bunk@fs.tum.de> wrote:


> > That's obsolete and could be removed. All 3dnow! code is dynamically patched depending on the CPUID.
> 
> Quoting e.g. arch/i386/lib/memcpy.c:
> 
> <--  snip  -->
> 
> void * memcpy(void * to, const void * from, size_t n)
> {
> #ifdef CONFIG_X86_USE_3DNOW
>         return __memcpy3d(to, from, n);
> #else
>         return __memcpy(to, from, n);
> #endif


No, it really works. The "3d" copy code uses actually MMX, which both the P4 and the K7 support fine

The only 3dnow! thing in there is that it uses 3dnow style prefetches (original MMX didn't 
have prefetches), but these are handled in a transparent way since a long time. The code just 
has an exception handler and patches them away if the prefetch ever faulted with an illegal 
instruction:


arch/i386/lib/mmx.c:

...
        __asm__ __volatile__ (
                "1: prefetch (%0)\n"            /* This set is 28 bytes */
                "   prefetch 64(%0)\n"
                "   prefetch 128(%0)\n"
                "   prefetch 192(%0)\n"
                "   prefetch 256(%0)\n"
                "2:  \n"
                ".section .fixup, \"ax\"\n"
                "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
                "   jmp 2b\n"
                ".previous\n"

I admit the remaining #ifdefs and comments are a bit misleading though.


-Andi

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 19:05                       ` Martin Schlemmer
@ 2003-09-12 19:30                         ` Andi Kleen
  2003-09-12 19:58                           ` Martin Schlemmer
                                             ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Andi Kleen @ 2003-09-12 19:30 UTC (permalink / raw)
  To: Martin Schlemmer
  Cc: bunk, jgarzik, ebiederm, akpm, richard.brunner, linux-kernel, torvalds

On Fri, 12 Sep 2003 21:05:06 +0200
Martin Schlemmer <azarah@gentoo.org> wrote:


> Ok, so how many instructions was added by this ?  Or is it

None at all, Mr Inquisitor. It is all patched at early boot time.

> Ok, so maybe my opinion about X86_GENERIC is not as intended, but
> then IMHO, it should be 'fixed'.  I could not care less if my kernel
> only boot just on my box, never mind on another P4 - I just want
> the most optimised on possible.  Sure, some guys want a more generic
> kernel - get X86_GENERIC to work for them.  Same for distro's.

X86_GENERIC has nothing to do with all this. All it does is 
to always force the cache line padding to 128 byte. 

This means that when you have an SMP kernel, no matter compiled
for what CPU, it will not run like crap on a P4 Xeon based SMP system.
The cost is some more memory usage for more padding (Athlon is fine 
with 64 byte padding, P3 is fine with 32byte padding). If you don't
want that just don't enable it.
 
> I have long wondered if everything in arch/i386/kernel/cpu/ is
> really linked in (meaning with no #ifdef as it now looks to be
> at a quick peek), or if it was just easier to link them all,
> but have non generic stuff (amd/cyrix/whatever specific code)
> filtered by ifdef's.

It is (in MTRR drivers etc.), but the resulting overhead is small.

Currently I even link in Intel and Cyrix specific MTRR drivers on x86-64
just to keep the common code common and clean.

Really, when you want to save code size you should look elsewhere. All the 
CPU support code is pretty lean and in many cases is __init code anyways
(= is discarded after boot time) 

I can offer my old bloat-o-meter tool (ftp://ftp.firstfloor.org/pub/ak/perl/bloat-o-meter)
It is great to look for bloat. Just run it with a 2.4 kernel and 2.6 kernel and compare
the results. Then look at the top 50 bloat increasers. I bet with you that 
you won't find anything touched in this thread among them.

I suspect for example if you just worked on making sysfs optional you would
save a lot more.
 
> This is just me, but why then don't we then just drop the specific
> arch selection, and just have generics instead of pulling a sock
> over the user's eyes ?

It is doing a lot of optimizations for the specific CPU. For example
it tells gcc to compile for that CPU which can make a big difference (P4 prefers
very different code compared to P3 or Athlon). Or it sets the paddings correctly
for the CPU, which can make a very big difference in .text size. So when you
select CONFIG_MPENTIUM4 you will get a kernel that will perform optimally for P4.

To give a bit of perspective:

On 2.4 the situation was: 

- Athlon kernel would only boot on Athlon/K6 due to 3dnow prefetches
- P4 kernel would boot everywhere
- P3 kernel would boot everywhere

(ignoring really old CPUs or oddballs like C3 without CMOV support) 

(also note booting just means booting without oops, not being optimal for the specific CPU.
Being optimal is very different)

Then 2.6 added SSE1 prefetch support which made the P4 kernel not boot on
anything that didn't support SSE1 (like older Athlon before XP). I fixed
that then with dynamically patching the prefetches. The overhead at runtime
is zero because it is patched at boot, the .text overhead for the patch 
tables is minimal.

So basically the 2.6 alternative() stuff just restored the 2.4 de-facto situation 
in 2.6, and improved it slightly because the Athlon kernel also now works everywhere.

I think it's useful to keep kernels booting everywhere, it makes it a lot easier
to test a single kernel on multiple systems.

-Andi


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 19:30                         ` Andi Kleen
@ 2003-09-12 19:58                           ` Martin Schlemmer
  2003-09-12 20:00                           ` Adrian Bunk
  2003-09-15  0:15                           ` bill davidsen
  2 siblings, 0 replies; 55+ messages in thread
From: Martin Schlemmer @ 2003-09-12 19:58 UTC (permalink / raw)
  To: Andi Kleen; +Cc: bunk, jgarzik, ebiederm, akpm, richard.brunner, LKML, torvalds

On Fri, 2003-09-12 at 21:30, Andi Kleen wrote:
> On Fri, 12 Sep 2003 21:05:06 +0200
> Martin Schlemmer <azarah@gentoo.org> wrote:

> Ok, so how many instructions was added by this ?  Or is it
> 
> None at all, Mr Inquisitor. It is all patched at early boot time.
> 

Thanks :P

> > Ok, so maybe my opinion about X86_GENERIC is not as intended, but
> > then IMHO, it should be 'fixed'.  I could not care less if my kernel
>
> X86_GENERIC has nothing to do with all this. All it does is 
> to always force the cache line padding to 128 byte. 
> 

Ok, thanks

>  
> > I have long wondered if everything in arch/i386/kernel/cpu/ is
> > really linked in 
> 
> It is (in MTRR drivers etc.), but the resulting overhead is small.
> 
> Really, when you want to save code size you should look elsewhere. All the 
> CPU support code is pretty lean and in many cases is __init code anyways
> (= is discarded after boot time) 
> 
> I can offer my old bloat-o-meter tool (ftp://ftp.firstfloor.org/pub/ak/perl/bloat-o-meter)


I am not really bothered about code size, as I tried (?) to say.

>  
> > This is just me, but why then don't we then just drop the specific
> > arch selection, and just have generics instead of pulling a sock
> > over the user's eyes ?
> 
> It is doing a lot of optimizations for the specific CPU. For example
> it tells gcc to compile for that CPU which can make a big difference (P4 prefers
> very different code compared to P3 or Athlon). Or it sets the paddings correctly
> for the CPU, which can make a very big difference in .text size. So when you
> select CONFIG_MPENTIUM4 you will get a kernel that will perform optimally for P4.
> 

> Then 2.6 added SSE1 prefetch support which made the P4 kernel not boot on
> anything that didn't support SSE1 (like older Athlon before XP). I fixed
> that then with dynamically patching the prefetches. The overhead at runtime
> is zero because it is patched at boot, the .text overhead for the patch 
> tables is minimal.
> 

Ok, thanks.

> So basically the 2.6 alternative() stuff just restored the 2.4 de-facto situation 
> in 2.6, and improved it slightly because the Athlon kernel also now works everywhere.
> 
> I think it's useful to keep kernels booting everywhere, it makes it a lot easier
> to test a single kernel on multiple systems.
> 

Yes, given.  I just don't think most people know or look at it the
same.  I for one usually just try not to add anything not needed
at the cost of some minor speed regression, as things many times
especially in a big project sometimes tends to get out of hand,
as all those minor regressions added is one big regression.  I do
however admit that I will be a bit out of my league trying to
judge in this case, but it still would be interesting ....

I however will not keep this up if you guys say its is fine.
Two things however does still does bother:

1)  What will it look like after being put through some benchmarks
just to verify.  I may however just need to get to bed early for
a change =)

2)  Will it be that difficult to also patch it in at boot like
the rest.  Once again it might just be paranoia from my side 8)


Thanks for the more in depth info.

Regards,

-- 
Martin Schlemmer



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 19:30                         ` Andi Kleen
  2003-09-12 19:58                           ` Martin Schlemmer
@ 2003-09-12 20:00                           ` Adrian Bunk
  2003-09-15  0:15                           ` bill davidsen
  2 siblings, 0 replies; 55+ messages in thread
From: Adrian Bunk @ 2003-09-12 20:00 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Martin Schlemmer, jgarzik, ebiederm, akpm, richard.brunner,
	linux-kernel, torvalds

On Fri, Sep 12, 2003 at 09:30:16PM +0200, Andi Kleen wrote:
>...
> I think it's useful to keep kernels booting everywhere, it makes it a lot easier
> to test a single kernel on multiple systems.

Different people have different needs:

Sometimes you want kernels booting everywhere, e.g. a distribution might
want to support all CPUs from an 386 to an Opteron with one kernel for
their boot floppies.

For a system administrator with only Pentium 3 and Pentum 4 machines
support for 386 and Opteron isn't of much worth.

In some embedded systems people are happy about every kB their kernel is 
smaller.

> -Andi

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 17:56               ` Andi Kleen
  2003-09-12 17:59                 ` Jeff Garzik
@ 2003-09-14 23:47                 ` bill davidsen
  2003-09-15  1:02                   ` Zwane Mwaikambo
  1 sibling, 1 reply; 55+ messages in thread
From: bill davidsen @ 2003-09-14 23:47 UTC (permalink / raw)
  To: linux-kernel

In article <20030912195606.24e73086.ak@suse.de>,
Andi Kleen  <ak@suse.de> wrote:
| On 12 Sep 2003 11:32:42 -0600
| ebiederm@xmission.com (Eric W. Biederman) wrote:
| 
| 
| > There may be better places to attack.  But new code is what is up for
| > examination and is easiest to fix.
| 
| With is_prefetch:
| 
|    text    data     bss     dec     hex filename
|    2782       4       0    2786     ae2 arch/i386/mm/fault.o
| 
| Without is_prefetch:
| 
|  text    data     bss     dec     hex filename
|    2446       4       0    2450     992 arch/i386/mm/fault.o
| 
| Difference 332 bytes
| 
| If you start your attack on 332 bytes then IMHO you have your priorities wrong ;-)
| 
| The main reason I'm really against this is that currently the P4 kernels work
| fine on Athlon. Just when is_prefetch is not integrated in them there will 
| be an mysterious oops once every three months in the kernel in prefetch
| on Athlon.
|  
| That would be bad. The alternative would be to prevent the P4 kernel
| from booting on the Athlon at all, but doing that for 332 bytes
| would seem a bit silly.

I am really missing something here, why is it you want to run a P4
kernel on Athlon? And why is it good to push bloat into the kernel and
then tell people who really care about size to go peddle their papers
elsewhere? There is perfectly good code in the kernel now to disable
prefetch on Athlon, leave it in unless the kernel is built for Athlon
support. A P4 kernel should run fine on an Athlon already.

We just got a start on making Linux smaller to encourage embedded use, I
don't see adding 300+ bytes of wasted code so people can run
misconfigured kernels.

I rather have to patch this in for my Athlon kernels than have people
who aren't cutting corners trying to avoid building matching kernels
have to live with the overhead.
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:22                   ` Adrian Bunk
  2003-09-12 18:28                     ` Andi Kleen
@ 2003-09-14 23:49                     ` bill davidsen
  1 sibling, 0 replies; 55+ messages in thread
From: bill davidsen @ 2003-09-14 23:49 UTC (permalink / raw)
  To: linux-kernel

In article <20030912182216.GK27368@fs.tum.de>,
Adrian Bunk  <bunk@fs.tum.de> wrote:
                [...]

| But even CONFIG_X86_GENERIC doesn't do what you expect. A kernel 
| compiled for Athlon wouldn't run on a Pentium 4 even with 
| CONFIG_X86_GENERIC.
| 
| Quoting arch/i386/Kconfig in -test5:
| 
| <--  snip  -->
| 
| config X86_USE_3DNOW
|         bool
|         depends on MCYRIXIII || MK7
|         default y
| 
| <--  snip  -->
| 
| My patch in the mail
| 
|   RFC: [2.6 patch] better i386 CPU selection
| 
| tries to solve these problem with a different approach (the user selects 
| all CPUs he wants to support).

If you have managed to work around all the conflicting capabilities
without leaving the kernel way suboptimal on some, I salute your better
solution.
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 18:28                     ` Andi Kleen
                                         ` (3 preceding siblings ...)
  2003-09-12 19:05                       ` Martin Schlemmer
@ 2003-09-14 23:51                       ` bill davidsen
  4 siblings, 0 replies; 55+ messages in thread
From: bill davidsen @ 2003-09-14 23:51 UTC (permalink / raw)
  To: linux-kernel

In article <20030912202851.3529e7e7.ak@suse.de>,
Andi Kleen  <ak@suse.de> wrote:
| On Fri, 12 Sep 2003 20:22:16 +0200
| Adrian Bunk <bunk@fs.tum.de> wrote:
| 
| 
| > 
| > But even CONFIG_X86_GENERIC doesn't do what you expect. A kernel 
| > compiled for Athlon wouldn't run on a Pentium 4 even with 
| > CONFIG_X86_GENERIC.
| 
| It does. Just try it.
| 
| > 
| > Quoting arch/i386/Kconfig in -test5:
| > 
| > <--  snip  -->
| > 
| > config X86_USE_3DNOW
| >         bool
| >         depends on MCYRIXIII || MK7
| >         default y
| 
| That's obsolete and could be removed. All 3dnow! code is dynamically patched depending on the CPUID.

And if that isn't all in init it's another good target for getting rid
of bloat. You suggested people look for other places, here's one.
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-12 19:30                         ` Andi Kleen
  2003-09-12 19:58                           ` Martin Schlemmer
  2003-09-12 20:00                           ` Adrian Bunk
@ 2003-09-15  0:15                           ` bill davidsen
  2 siblings, 0 replies; 55+ messages in thread
From: bill davidsen @ 2003-09-15  0:15 UTC (permalink / raw)
  To: linux-kernel

In article <20030912213016.47a4e5de.ak@suse.de>,
Andi Kleen  <ak@suse.de> wrote:
| On Fri, 12 Sep 2003 21:05:06 +0200
| Martin Schlemmer <azarah@gentoo.org> wrote:
| 
| > I have long wondered if everything in arch/i386/kernel/cpu/ is
| > really linked in (meaning with no #ifdef as it now looks to be
| > at a quick peek), or if it was just easier to link them all,
| > but have non generic stuff (amd/cyrix/whatever specific code)
| > filtered by ifdef's.
| 
| It is (in MTRR drivers etc.), but the resulting overhead is small.
| 
| Currently I even link in Intel and Cyrix specific MTRR drivers on x86-64
| just to keep the common code common and clean.
| 
| Really, when you want to save code size you should look elsewhere. All the 
| CPU support code is pretty lean and in many cases is __init code anyways
| (= is discarded after boot time) 

If you want lean code you should look everywhere. I agree that init code
is free, and that there are worse offenders. But adding additional bloat
that the people who actually builds a kernel properly to fit their
hardware can't eliminate without patching, then perhaps that shouldn't
happen.

I'm not at all against having an option in the menu to "eliminate code
to support non-targeted processors." That could remove the code you so
want to have in so you don't have to build a properly configured kernel
for each machine.

How does "#if !defined(NO_BLOAT) || WANT_ARCH_IN_QUESTION feel for a
good thing to surround many parts of the kernel.

| 
| I can offer my old bloat-o-meter tool (ftp://ftp.firstfloor.org/pub/ak/perl/bloat-o-meter)
| It is great to look for bloat. Just run it with a 2.4 kernel and 2.6 kernel and compare
| the results. Then look at the top 50 bloat increasers. I bet with you that 
| you won't find anything touched in this thread among them.

The term "grandfathered" comes to mind, you are proposing to add new
bloat, and you justify that so you don't have to build a properly
configures kernel for each hardware configuration. I don't think anyone
will argue that all such code should be found and treated in the near
future, but if we did have a flag to prevent eggregious support for
non-selected hardware, I bet the embedded guys would contribute patches
over time to clean things up. And shouldn't 2.6 better than 2.4, instead
of "not much worse?"
| 
| I suspect for example if you just worked on making sysfs optional you would
| save a lot more.

As long as you don't have to add new code to support the functionality
in another way. The advantage is that if you can remove it you can
probably also make it modular, and that might be a nice thing during
development of embedded systems. Somewhat smaller but available?
-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-14 23:47                 ` bill davidsen
@ 2003-09-15  1:02                   ` Zwane Mwaikambo
  2003-09-15  2:08                     ` Nick Piggin
  2003-09-15  3:55                     ` Bill Davidsen
  0 siblings, 2 replies; 55+ messages in thread
From: Zwane Mwaikambo @ 2003-09-15  1:02 UTC (permalink / raw)
  To: bill davidsen; +Cc: linux-kernel

On Sun, 14 Sep 2003, bill davidsen wrote:

> We just got a start on making Linux smaller to encourage embedded use, I
> don't see adding 300+ bytes of wasted code so people can run
> misconfigured kernels.
> 
> I rather have to patch this in for my Athlon kernels than have people
> who aren't cutting corners trying to avoid building matching kernels
> have to live with the overhead.

Overhead? Really you could save more memory by cleaning up a lot of 
drivers. Andi already said it before, there are better places to be 
looking at.

Also 'patching' for Athlon kernels doesn't cut it for people who need to 
distribute kernels which run on various hardware (such as distros). This 
alone is benefit enough to justify this supposed 'bloat'.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-15  1:02                   ` Zwane Mwaikambo
@ 2003-09-15  2:08                     ` Nick Piggin
  2003-09-15  3:55                     ` Bill Davidsen
  1 sibling, 0 replies; 55+ messages in thread
From: Nick Piggin @ 2003-09-15  2:08 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: bill davidsen, linux-kernel



Zwane Mwaikambo wrote:

>On Sun, 14 Sep 2003, bill davidsen wrote:
>
>
>>We just got a start on making Linux smaller to encourage embedded use, I
>>don't see adding 300+ bytes of wasted code so people can run
>>misconfigured kernels.
>>
>>I rather have to patch this in for my Athlon kernels than have people
>>who aren't cutting corners trying to avoid building matching kernels
>>have to live with the overhead.
>>
>
>Overhead? Really you could save more memory by cleaning up a lot of 
>drivers. Andi already said it before, there are better places to be 
>looking at.
>
>Also 'patching' for Athlon kernels doesn't cut it for people who need to 
>distribute kernels which run on various hardware (such as distros). This 
>alone is benefit enough to justify this supposed 'bloat'.
>

Hi Zwane,

I still don't mind Adrian's patch (at least the concept). Its true, the
Athlon workaround is insignificant, but Adrian's patch allows the
possibility of compiling things like it out. Its also clearer to me than
the current scheme.

The only objections I have seen are from those who don't understand what
it does or implmentation issues. It might be the case that it can't be
done "nicely" without more support from kconfig and/or kbuild though.



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-15  1:02                   ` Zwane Mwaikambo
  2003-09-15  2:08                     ` Nick Piggin
@ 2003-09-15  3:55                     ` Bill Davidsen
  2003-09-15  7:45                       ` Alan Cox
  1 sibling, 1 reply; 55+ messages in thread
From: Bill Davidsen @ 2003-09-15  3:55 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: linux-kernel

On Sun, 14 Sep 2003, Zwane Mwaikambo wrote:

> On Sun, 14 Sep 2003, bill davidsen wrote:
> 
> > We just got a start on making Linux smaller to encourage embedded use, I
> > don't see adding 300+ bytes of wasted code so people can run
> > misconfigured kernels.
> > 
> > I rather have to patch this in for my Athlon kernels than have people
> > who aren't cutting corners trying to avoid building matching kernels
> > have to live with the overhead.
> 
> Overhead? Really you could save more memory by cleaning up a lot of 
> drivers. Andi already said it before, there are better places to be 
> looking at.

The best way to remove overhead is not to add it. Putting in 300+ bytes of
code which is pure overhead on anything but an Athlon is silly. And to
justify it because it save effort for people who don't want to bother
building the correct distribution is pure hubris. The Athlon is not so
important that making it run a tiny bit faster justifies taking extra
space on every machine which doesn't benefit.
> 
> Also 'patching' for Athlon kernels doesn't cut it for people who need to 
> distribute kernels which run on various hardware (such as distros). This 
> alone is benefit enough to justify this supposed 'bloat'.

You clearly think that a kernel will not run on Athlon without this
patch. In real life kernel built for 386, 486, Ppro, etc will run
nicely on every CPU except those less featureful, including the
precious Athlon.

That being the case, there is no justification for forcing the code on
every build rather than putting ifdefs around it. Any vendor who wants
to make Athlon a tiny bit faster at the expense of making every other
system use more memory for code which is never executed can just use the
config option to include the support.

Let's try the facts agin:
1 - the code is not needed for Athlon, prefetch is turned off on broken
    CPUs now. A generic kernel runs fine on Athlon.
2 - the discussion is not on having the code or not, but on putting
    ifdefs around this code so it is only generated if wanted.

Let's not continue to pollute the kernel with architecture dependent
code, we have subtrees for that.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-15  3:55                     ` Bill Davidsen
@ 2003-09-15  7:45                       ` Alan Cox
  2003-09-15 12:11                         ` Bill Davidsen
  0 siblings, 1 reply; 55+ messages in thread
From: Alan Cox @ 2003-09-15  7:45 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Zwane Mwaikambo, Linux Kernel Mailing List

On Llu, 2003-09-15 at 04:55, Bill Davidsen wrote:
> 1 - the code is not needed for Athlon, prefetch is turned off on broken
>     CPUs now. A generic kernel runs fine on Athlon.

That disable you talk about is bloat. It also trashes the performance of
PIV boxes. In fact I checked out of interest - the disable hack
currently being used is adding *over* 300 bytes to my kernel as its
inlined repeatedly. So its larger, and it ruins performance for all
processors.

You also need it for userspace prefetch fault fixup for a kernel without
CONFIG_MK7 to run stuff perfectly on Athlon.



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-15  7:45                       ` Alan Cox
@ 2003-09-15 12:11                         ` Bill Davidsen
  2003-09-15 13:48                           ` Alan Cox
  0 siblings, 1 reply; 55+ messages in thread
From: Bill Davidsen @ 2003-09-15 12:11 UTC (permalink / raw)
  To: Alan Cox; +Cc: Zwane Mwaikambo, Linux Kernel Mailing List

On Mon, 15 Sep 2003, Alan Cox wrote:

> On Llu, 2003-09-15 at 04:55, Bill Davidsen wrote:
> > 1 - the code is not needed for Athlon, prefetch is turned off on broken
> >     CPUs now. A generic kernel runs fine on Athlon.
> 
> That disable you talk about is bloat. It also trashes the performance of
> PIV boxes. In fact I checked out of interest - the disable hack
> currently being used is adding *over* 300 bytes to my kernel as its
> inlined repeatedly. So its larger, and it ruins performance for all
> processors.

The code to disable prefetch on Athlon is 300 bytes and hurts your PIV?
Really? I'll dig back through the code, but I recall it as adding or
deleting an entry in a table to enable prefetch. If it's affecting PIV the
code to use prefetch is seriously broken.

And since this only buys 2-5% in the kernel, I really question your "ruins
performance" claim.

> 
> You also need it for userspace prefetch fault fixup for a kernel without
> CONFIG_MK7 to run stuff perfectly on Athlon.

You mean you have a program which enables userspace prefetch and doesn't
handle the botch? Or that you have programs which you compiled for PIV and
which don't run properly on Athlon. It's called misconfigured, they won't
run on 386 either.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-15 12:11                         ` Bill Davidsen
@ 2003-09-15 13:48                           ` Alan Cox
  2003-09-15 18:50                             ` Bill Davidsen
  0 siblings, 1 reply; 55+ messages in thread
From: Alan Cox @ 2003-09-15 13:48 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Zwane Mwaikambo, Linux Kernel Mailing List

On Llu, 2003-09-15 at 13:11, Bill Davidsen wrote:
> The code to disable prefetch on Athlon is 300 bytes and hurts your PIV?
> Really? I'll dig back through the code, but I recall it as adding or
> deleting an entry in a table to enable prefetch. If it's affecting PIV the
> code to use prefetch is seriously broken.

Big time. Its over 300 bytes long because its embedded in each inline
prefetch and it causes a branch misprediction almost every time. Amazing
what you find when you actually measure stuff 8)

So right now, its faster on PIV to delete the prefetch than use the
current hack, and adding the Athlon fix makes Athlon and PIV faster and
total memory size lower.



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata
  2003-09-15 13:48                           ` Alan Cox
@ 2003-09-15 18:50                             ` Bill Davidsen
  0 siblings, 0 replies; 55+ messages in thread
From: Bill Davidsen @ 2003-09-15 18:50 UTC (permalink / raw)
  To: Alan Cox; +Cc: Zwane Mwaikambo, Linux Kernel Mailing List

On Mon, 15 Sep 2003, Alan Cox wrote:

> On Llu, 2003-09-15 at 13:11, Bill Davidsen wrote:
> > The code to disable prefetch on Athlon is 300 bytes and hurts your PIV?
> > Really? I'll dig back through the code, but I recall it as adding or
> > deleting an entry in a table to enable prefetch. If it's affecting PIV the
> > code to use prefetch is seriously broken.
> 
> Big time. Its over 300 bytes long because its embedded in each inline
> prefetch and it causes a branch misprediction almost every time. Amazing
> what you find when you actually measure stuff 8)
> 
> So right now, its faster on PIV to delete the prefetch than use the
> current hack, and adding the Athlon fix makes Athlon and PIV faster and
> total memory size lower.

Of course if you have PIV kernel you don't need any of the Athlon code at
all, so the size and performance penalty is zero. And if you build for a
CPU which doesn't have prefetch you don't need any of that code at all.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2003-09-15 18:59 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-09-11  0:56 Update on AMD Athlon/Opteron/Athlon64 Prefetch Errata richard.brunner
2003-09-11  1:27 ` [PATCH] 2.6 workaround for Athlon/Opteron prefetch errata Andi Kleen
2003-09-11  1:44   ` Andrew Morton
2003-09-11  1:47     ` Andi Kleen
2003-09-11 14:15       ` Jeff Garzik
2003-09-11 14:26         ` Andi Kleen
2003-09-11 14:34           ` Jeff Garzik
2003-09-11 14:58             ` Andi Kleen
2003-09-11 15:06               ` Jeff Garzik
2003-09-11 20:08               ` bill davidsen
2003-09-11 19:56             ` bill davidsen
2003-09-11 20:44               ` Alan Cox
2003-09-11 21:29                 ` Mike Fedyk
2003-09-11 21:38                 ` bill davidsen
2003-09-12 17:32             ` Eric W. Biederman
2003-09-12 17:56               ` Andi Kleen
2003-09-12 17:59                 ` Jeff Garzik
2003-09-12 18:22                   ` Adrian Bunk
2003-09-12 18:28                     ` Andi Kleen
2003-09-12 18:39                       ` Jeff Garzik
2003-09-12 18:45                       ` Jeff Garzik
2003-09-12 18:48                       ` Adrian Bunk
2003-09-12 19:07                         ` Andi Kleen
2003-09-12 19:05                       ` Martin Schlemmer
2003-09-12 19:30                         ` Andi Kleen
2003-09-12 19:58                           ` Martin Schlemmer
2003-09-12 20:00                           ` Adrian Bunk
2003-09-15  0:15                           ` bill davidsen
2003-09-14 23:51                       ` bill davidsen
2003-09-14 23:49                     ` bill davidsen
2003-09-14 23:47                 ` bill davidsen
2003-09-15  1:02                   ` Zwane Mwaikambo
2003-09-15  2:08                     ` Nick Piggin
2003-09-15  3:55                     ` Bill Davidsen
2003-09-15  7:45                       ` Alan Cox
2003-09-15 12:11                         ` Bill Davidsen
2003-09-15 13:48                           ` Alan Cox
2003-09-15 18:50                             ` Bill Davidsen
2003-09-12 18:03               ` Mike Fedyk
2003-09-11 13:54   ` Linus Torvalds
2003-09-11 14:01     ` Andi Kleen
2003-09-11 14:14       ` Dave Jones
2003-09-11 14:29         ` Andi Kleen
2003-09-11 14:14     ` Alan Cox
2003-09-11 14:24       ` Andi Kleen
2003-09-11 14:28         ` Dave Jones
2003-09-11 14:32           ` Andi Kleen
2003-09-11 20:14         ` bill davidsen
2003-09-11 16:58   ` Jamie Lokier
2003-09-11 17:05     ` Andi Kleen
2003-09-11 17:32       ` Jamie Lokier
2003-09-11 17:39         ` Andi Kleen
2003-09-11 17:48           ` Jamie Lokier
2003-09-11 18:18             ` Andi Kleen
2003-09-11 18:59               ` Jamie Lokier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).