* K7/Athlon optimizations and Sacrifices to the Great Ones.
From: Nicholas Knight @ 2001-09-06 19:51 UTC (permalink / raw)
To: linux-kernel
I'm upset, and angry, and could go into more detail but a future employer
might read this.
The 133MHz FSB + KT133A chipset theory has been officially shot to hell.
Not only that, but 6-4-4 (family/model/stepping) processors don't seem to
be the culprit. I've now had reports of 6-4-2 experiencing problems, and
6-4-4 NOT experiencing problems, even on KT133A @ 133MHz.
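For readers comparing their own hardware against these reports, the
family/model/stepping triple can be decoded from the EAX value that CPUID
leaf 1 returns. The following is only a sketch: the struct and function
names are made up for illustration, and the handling of the extended
fields follows the AMD convention (they count only when the base family
is 0xF).

```c
#include <stdint.h>

/* Family/model/stepping triple, as in reports such as "6-4-2".
 * (Hypothetical helper type, not taken from the kernel.) */
struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode the EAX value returned by CPUID leaf 1.
 * Bits 3:0 = stepping, 7:4 = base model, 11:8 = base family,
 * 19:16 = extended model, 27:20 = extended family. */
struct cpu_sig decode_cpu_signature(uint32_t eax)
{
    struct cpu_sig s;
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base_model = (eax >> 4) & 0xF;

    s.stepping = eax & 0xF;
    s.family = base_family;
    s.model = base_model;

    /* AMD convention: extended fields apply only to base family 0xF.
     * K7 parts are family 6, so this branch is not taken for them. */
    if (base_family == 0xF) {
        s.family += (eax >> 20) & 0xFF;
        s.model |= ((eax >> 16) & 0xF) << 4;
    }
    return s;
}
```

With this sketch, a raw signature of 0x642 decodes to family 6, model 4,
stepping 2, which is the "6-4-2" notation used in this thread.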
At this point, I can't even isolate a MOTHERBOARD that could be the
culprit, and I don't think it's the power supply.
The *only* other theory I have left in my arsenal is cooling.
Unfortunately this data is more difficult to obtain from users, and thus
wouldn't gather as many responses and in the end it'd probably be
useless. It would be best tested in controlled conditions, but
unfortunately I don't have the money to purchase the necessary hardware
to test these issues. Any companies want to sponsor tests? I'm serious,
if someone wanted to, I'd be willing to test this possibility.
There is another remote possibility, and that's the fab plant that the
processors came from. I didn't ask for CPU serial numbers, so I can't
speak to that effect.
It's also possible that this is related to a specific batch or batches of
KT133A chipsets. However, I currently have one report of a guy seeing this
problem on the SAME physical board, just with two different processors.
Both are 6-4-2; the only difference is that one is 1.13GHz and doesn't
have the problem, and the other is 1.2GHz and DOES have the problem. This
of course leads me back to the clock speed theory, but again it doesn't
make any SENSE, because the FSB on both of them is 133MHz and I've got at
least two reports of 1.33GHz chips running FINE! ARG!
At this point, I'm giving up on collecting data, as I just don't see a
definitive pattern. All I can say for sure is that the "majority" of
KT133A-based motherboards seem to have problems, but not ALL. I don't
know of a single report of these problems outside of the KT133A chipset.
If anyone wants to keep collecting data, I'd be happy to send all the
information I have so far to that person, and any that filters in over
the next couple days.
Now for the Sacrifices.
At this point, I'd like to sacrifice a Red Hat Linux 6.2 CD to Alan Cox.
I would also like to sacrifice Minix 1.3(?) installation diskettes to
Linus Torvalds.
I perform these sacrifices in the hope that enlightenment comes to me.
* Re: K7/Athlon optimizations and Sacrifices to the Great Ones.
From: Alan Cox @ 2001-09-06 20:21 UTC (permalink / raw)
To: tegeran; +Cc: linux-kernel
> At this point, I'd like to sacrifice a Red Hat Linux 6.2 CD to Alan Cox.
> I would also like to sacrifice Minix 1.3(?) installation diskettes to
> Linus Torvalds.
>
> I perform these sacrifices in the hope that enlightenment comes to me.
A deep booming voice says "I have no idea either"
* Re: K7/Athlon optimizations and Sacrifices to the Great Ones.
From: Dan Hollis @ 2001-09-06 20:32 UTC (permalink / raw)
To: Alan Cox; +Cc: tegeran, linux-kernel
On Thu, 6 Sep 2001, Alan Cox wrote:
> > At this point, I'd like to sacrifice a Red Hat Linux 6.2 CD to Alan Cox.
> > I would also like to sacrifice Minix 1.3(?) installation diskettes to
> > Linus Torvalds.
> > I perform these sacrifices in the hope that enlightenment comes to me.
> A deep booming voice says "I have no idea either"
We need a good tester (floppy-bootable k7-killer, something along the
lines of memtest86) and many more data points.
Anyone yet verified if burnMMX2 causes the same failures the
athlon-optimized kernel does?
-Dan
--
[-] Omae no subete no kichi wa ore no mono da. [-]
* Re: K7/Athlon optimizations and Sacrifices to the Great Ones.
From: David Hollister @ 2001-09-06 23:24 UTC (permalink / raw)
To: Dan Hollis; +Cc: Alan Cox, tegeran, linux-kernel
Dan Hollis wrote:
> On Thu, 6 Sep 2001, Alan Cox wrote:
>
>>>At this point, I'd like to sacrifice a Red Hat Linux 6.2 CD to Alan Cox.
>>>I would also like to sacrifice Minix 1.3(?) installation diskettes to
>>>Linus Torvalds.
>>>I perform these sacrifices in the hope that enlightenment comes to me.
>>>
>>A deep booming voice says "I have no idea either"
>>
>
> We need a good tester (floppy-bootable k7-killer, something along the
> lines of memtest86) and many more data points.
>
> Anyone yet verified if burnMMX2 causes the same failures the
> athlon-optimized kernel does?
>
> -Dan
burnMMX2 does not cause any problems for me. Robert (the guy who wrote
these) has provided me with two more versions that mimic the
Athlon-optimized fast_copy_page and fast_clear_page functions in mmx.c.
They aren't exact copies, but are close. One fails for me consistently,
the other does not. The one that fails consistently is the one that mimics
the fast_copy_page code. I'm still trying to provide him with more data
points about the failures to see if we can uncover anything.
--
David Hollister
Driversoft Engineering: http://devicedrivers.com
Digital Audio Resources: http://digitalaudioresources.org
* Re: K7/Athlon optimizations and Sacrifices to the Great Ones.
From: VDA @ 2001-09-07 8:30 UTC (permalink / raw)
To: linux-kernel
Thursday, September 06, 2001, 10:51:36 PM, Nicholas Knight <tegeran@home.com> wrote:
NK> 133Mhz FSB + KT133A chipset theory has been officially shot to hell.
NK> Not only that, but 6-4-4 (family/model/stepping) processors don't seem to
NK> be the culprit. I've now had reports of 6-4-2 experiencing problems, and
NK> 6-4-4 NOT experiencing problems, even on KT133A @ 133Mhz.
NK> At this point, I can't even isolate a MOTHERBOARD that could be the
NK> culprit, and I don't think it's the power supply.
...
NK> At this point, I'm giving up on collecting data, as I just don't see a
NK> definitive pattern, all I can say for sure is that the "majority"
NK> KT133A-based motherboards seem to have problems, but not ALL. I don't
NK> know of a single report outside of the KT133A chipset of these problems.
Well, why can't guys with Athlons and KT133As who did enable K7
optimizations just open their boxes and report to Nicholas:
* processor and chipset markings
* bus speed
* CPU core/bus multiplier
* motherboard model
* BIOS manufacturer, version, date
  This is important. The BIOS might fix/mask chipset bugs
  by programming the chipset to a stable but slow configuration
* do they see K7-related oopses
  ("I don't see oops" is a valuable report too!)
* did the oops go away with 100MHz FSB
* did the oops go away with a different CPU voltage
* did the oops go away with a smaller multiplier
* did the oops go away with a BIOS update
* did the oops go away with any trick with mmx.c - see below.
  More advanced reporters might try to fiddle with
  arch/i386/lib/mmx.c and try to make the oops disappear
* did the oops go away with K7 optimization off
* results of memtest86
* results of running burnK7/burnMMX
(The last 3 *'s are to make sure it is the K7 bug, not bad memory or
something)
Need to check for oops? "Simen Thoresen" <simen-tt@online.no>:
>I've determined that with the Athlon-optimized fast_copy_page,
>the machine is easy to push into oopsing. Just starting
>a dd with blocksize 128M (half available ram) provokes an oops.
>This is repeatable, consistent and almost fun.
Since fast_copy_page() from arch/i386/lib/mmx.c has been isolated as
the code which triggers the oops, it can be instrumented to check whether
the page is indeed copied correctly by the questionable K7 code, and to
barf loudly if it is not.
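The kind of instrumentation described here can be sketched in user space
roughly as follows. This is not the kernel code: verify_page_copy(),
reference_copy() and broken_copy() are hypothetical names, memcpy() stands
in for the copy routine under test, and a real kernel version would report
through printk() or BUG() rather than a return value.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096

/* Signature of a page copy routine under test (hypothetical). */
typedef void (*page_copy_fn)(void *to, const void *from);

/* Run the copy, then compare source and destination byte by byte.
 * Returns -1 if the page was copied correctly, otherwise the offset
 * of the first byte that differs. */
long verify_page_copy(page_copy_fn copy, void *to, const void *from)
{
    const unsigned char *src = from;
    const unsigned char *dst = to;
    size_t i;

    copy(to, from);
    for (i = 0; i < PAGE_BYTES; i++) {
        if (src[i] != dst[i])
            return (long)i;
    }
    return -1;
}

/* Known-good copy, standing in for the plain MMX path. */
void reference_copy(void *to, const void *from)
{
    memcpy(to, from, PAGE_BYTES);
}

/* Deliberately corrupts one byte, to show that the check fires. */
void broken_copy(void *to, const void *from)
{
    memcpy(to, from, PAGE_BYTES);
    ((unsigned char *)to)[100] ^= 0xFF;
}
```

Knowing the offset of the first mismatch might also hint at whether the
damage clusters at particular positions within the page, though that is
speculation until someone collects real data.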
Since the oopses are not instant, it looks like interrupts might interfere
with the movntq instruction... On the other hand, fast_clear_page()
isn't triggering oopses (right?), so maybe mixing normal and
cache-bypassing instructions is what triggers the oops...
Comparing the K7 and MMX versions of fast_copy_page...
Does replacing movntq with movq make the oops go away?
If not, does this (or a similar) change make the oops go away?
    movq (%0), %%mm0        ->  movq (%0), %%mm0
    movntq %%mm0, (%1)      ->  movq 8(%0), %%mm1
    movq 8(%0), %%mm1       ->  movq 16(%0), %%mm2
    movntq %%mm1, 8(%1)     ->  movq 24(%0), %%mm3
    movq 16(%0), %%mm2      ->  movntq %%mm0, (%1)
    movntq %%mm2, 16(%1)    ->  movntq %%mm1, 8(%1)
    movq 24(%0), %%mm3      ->  movntq %%mm2, 16(%1)
    movntq %%mm3, 24(%1)    ->  movntq %%mm3, 24(%1)
    movq 32(%0), %%mm4      ->  movq 32(%0), %%mm4
    movntq %%mm4, 32(%1)    ->  movq 40(%0), %%mm5
    movq 40(%0), %%mm5      ->  movq 48(%0), %%mm6
    movntq %%mm5, 40(%1)    ->  movq 56(%0), %%mm7
    movq 48(%0), %%mm6      ->  movntq %%mm4, 32(%1)
    movntq %%mm6, 48(%1)    ->  movntq %%mm5, 40(%1)
    movq 56(%0), %%mm7      ->  movntq %%mm6, 48(%1)
    movntq %%mm7, 56(%1)    ->  movntq %%mm7, 56(%1)
No? Then change the first for() loop from
    for(i=0; i<(4096-320)/64; i++) into
    for(i=0; i<4096/64; i++) and eliminate the second for() -
does this help?
One of the above changes HAS to fix the K7 oops, because that way you
convert the K7 fast_copy_page into the MMX fast_copy_page :-)
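The loop bounds mentioned above work out as in the small sketch below. It
assumes that the 320 in the first loop's bound is the distance the loop
prefetches ahead of the current position (the loop body itself is not
shown in this thread); the constant and function names are made up for
illustration.

```c
/* Constants taken from the loop bounds quoted above. The meaning of
 * 320 as a prefetch distance is an assumption, not confirmed here. */
enum {
    PAGE_BYTES = 4096,
    CHUNK_BYTES = 64,
    PREFETCH_AHEAD = 320
};

/* Iterations of the first loop: for(i=0; i<(4096-320)/64; i++) */
int first_loop_iterations(void)
{
    return (PAGE_BYTES - PREFETCH_AHEAD) / CHUNK_BYTES;
}

/* Iterations of the second loop: for(i=(4096-320)/64; i<4096/64; i++) */
int second_loop_iterations(void)
{
    return PAGE_BYTES / CHUNK_BYTES - first_loop_iterations();
}

/* Iterations of the proposed single loop: for(i=0; i<4096/64; i++) */
int merged_loop_iterations(void)
{
    return PAGE_BYTES / CHUNK_BYTES;
}
```

So the first loop covers 59 chunks (3776 bytes), the second covers the
last 5 chunks (320 bytes), and the merged loop covers all 64. If the
prefetch assumption holds, a merged loop that kept the prefetch would
reach up to 320 bytes past the end of the source page on its last 5
iterations.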
So if you have an Athlon - try these and report. I don't have the
hardware.
David Hollister <david@digitalaudioresources.org> wrote:
>burnMMX2 does not cause any problems for me. Robert (the guy who wrote
>these) has provided me with two more versions that mimic the
>Athlon-optimized fast_copy_page and fast_clear_page functions in mmx.c.
>They aren't exact copies, but are close. One fails for me consistently,
>the other does not. The one that fails consistently is the one that mimics
>the fast_copy_page code.
Robert Redelmeier: redelm@ev1.net http://users.ev1.net/~redelm/
Although this tester is not on his page (yet?).
--
Best regards, VDA
mailto:VDA@port.imtp.ilyichevsk.odessa.ua
http://port.imtp.ilyichevsk.odessa.ua/vda/
* Re: K7/Athlon optimizations and Sacrifices to the Great Ones.
From: Jussi Laako @ 2001-09-07 16:25 UTC (permalink / raw)
To: tegeran; +Cc: linux-kernel
Nicholas Knight wrote:
>
> problem on the SAME physical board, just two different processors. Both
> 6-4-2, the only difference is that one is 1.13Ghz and doesn't have the
> problem, and the other is 1.2Ghz and DOES have the problem. This of
> course leads me back to the clock speed theory, but again it doesn't make
> any SENSE because the FSB on both of them is 133Mhz and I've got at least
> two reports of 1.33Ghz chips running FINE! ARG!
How about the synchronization issue that came up in the power saving thread?
Some bus synchronization problem (integer/non-integer multiplier)?
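To make the integer/non-integer question concrete, the multipliers for the
three clock speeds mentioned above can be worked out as in this sketch. It
assumes nominal ratings (1133, 1200 and 1333 MHz cores on a 133.33 MHz
FSB, written as 400/3 MHz); the function names are hypothetical.

```c
/* Core/FSB multiplier expressed in half steps, so 17 means 8.5x.
 * The FSB is given as a rational number of MHz (fsb_num / fsb_den)
 * so that 133.33 MHz can be written exactly as 400/3.
 * The result is rounded to the nearest half step. */
unsigned multiplier_half_steps(unsigned core_mhz,
                               unsigned fsb_num, unsigned fsb_den)
{
    unsigned long scaled = 2UL * core_mhz * fsb_den;

    return (unsigned)((scaled + fsb_num / 2) / fsb_num);
}

/* An even number of half steps is an integer multiplier. */
int is_integer_multiplier(unsigned half_steps)
{
    return half_steps % 2 == 0;
}
```

Under these assumptions the 1.13GHz part runs at 8.5x, the 1.2GHz part at
9x and the 1.33GHz part at 10x. In the report quoted above it is the 8.5x
part that works and the 9x part that fails, for whatever that single data
point is worth.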
- Jussi Laako
--
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B 39DD A4DE 63EB C216 1E4B
Available at PGP keyservers
* Re: K7/Athlon optimizations and Sacrifices to the Great Ones.
From: Heinz Deinhart @ 2001-09-07 17:03 UTC (permalink / raw)
To: Dan Hollis; +Cc: linux-kernel
On Thu, 6 Sep 2001, Dan Hollis wrote:
> Anyone yet verified if burnMMX2 causes the same failures the
> athlon-optimized kernel does?
Several versions of Robert's burnMMX2 run on my problematic Athlons
for several hours without failing.
I did some trial-and-error modifications to mmx.c and found out
that this one makes my Athlons happy (but I must admit I have
no clue why). It seems to run stable now.
--- linux-2.4.9/arch/i386/lib/mmx.c Tue May 22 19:23:16 2001
+++ linux-2.4.9-ac6-hack/arch/i386/lib/mmx.c Sat Sep 8 00:51:33 2001
@@ -194,6 +194,9 @@
 		: : "r" (from), "r" (to) : "memory");
 		from+=64;
 		to+=64;
+		__asm__ __volatile__ (
+		" sfence \n" : :
+		);
 	}
 	for(i=(4096-320)/64; i<4096/64; i++)
 	{
maybe someone with more knowledge can take a look..
ciao,
heinz
--
Heinz Deinhart <heinz@auto.tuwien.ac.at>
+43 1 58801-18321
Technische Universitaet Wien, Dept. E183/1
* Re: K7/Athlon optimizations and Sacrifices to the Great Ones.
From: Alan Cox @ 2001-09-07 20:52 UTC (permalink / raw)
To: Heinz Deinhart; +Cc: Dan Hollis, linux-kernel
> I did some trial-and-error modifications to mmx.c and found out
> that this one makes my Athlons happy (but I must admit I have
> no clue why). It seems to run stable now.
>
> --- linux-2.4.9/arch/i386/lib/mmx.c Tue May 22 19:23:16 2001
> +++ linux-2.4.9-ac6-hack/arch/i386/lib/mmx.c Sat Sep 8 00:51:33 2001
> @@ -194,6 +194,9 @@
> : : "r" (from), "r" (to) : "memory");
> from+=64;
> to+=64;
> + __asm__ __volatile__ (
> + " sfence \n" : :
> + );
> }
> for(i=(4096-320)/64; i<4096/64; i++)
> {
>
You are effectively continually stalling the processor so that the fast
streaming memory transfers don't occur. Instead you start, block, start,
block, ...
Alan
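The difference between fencing after every 64-byte chunk (as the patch
above does) and fencing once after the whole page can be sketched in user
space as follows. This is not the kernel routine: it uses the SSE2
intrinsic _mm_stream_si128 (movntdq) as a stand-in for the MMX movntq,
falls back to memcpy() where SSE2 is unavailable, and the function name
and the fence_every_chunk flag are made up for illustration.

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_load_si128, _mm_stream_si128, _mm_sfence */
#endif

#define PAGE_BYTES 4096

/* Copy one page with non-temporal (cache-bypassing) stores.
 * Both pointers must be 16-byte aligned.
 * fence_every_chunk != 0 mimics the patch: one sfence per 64 bytes.
 * fence_every_chunk == 0 lets the stores stream and fences once. */
void stream_copy_page(void *to, const void *from, int fence_every_chunk)
{
#if defined(__SSE2__)
    const __m128i *src = (const __m128i *)from;
    __m128i *dst = (__m128i *)to;
    size_t i, j;

    for (i = 0; i < PAGE_BYTES / 64; i++) {
        /* Four 16-byte moves make up one 64-byte chunk. */
        for (j = 0; j < 4; j++) {
            __m128i v = _mm_load_si128(src + 4 * i + j);
            _mm_stream_si128(dst + 4 * i + j, v);
        }
        if (fence_every_chunk)
            _mm_sfence(); /* start, block, start, block, ... */
    }
    /* Non-temporal stores are weakly ordered; a final fence makes
     * them globally visible before the function returns. */
    _mm_sfence();
#else
    (void)fence_every_chunk;
    memcpy(to, from, PAGE_BYTES); /* portable fallback, no streaming */
#endif
}
```

Both variants should produce the same page contents; only the timing
behaviour differs. Whether the per-chunk fence hides the K7 problem by
slowing things down or by actually fixing an ordering issue is exactly
what this thread has not established.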