* 2.4.19/20, 2.5 missing P4 ifdef ?
@ 2002-11-28 14:17 Margit Schubert-While
2002-11-28 14:24 ` Dave Jones
0 siblings, 1 reply; 8+ messages in thread
From: Margit Schubert-While @ 2002-11-28 14:17 UTC (permalink / raw)
To: linux-kernel
Just noticed this in "include/asm-i386/processor.h" :
--- snip ---
/* Prefetch instructions for Pentium III and AMD Athlon */
#ifdef CONFIG_MPENTIUMIII
#define ARCH_HAS_PREFETCH
extern inline void prefetch(const void *x)
{
__asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
}
#elif CONFIG_X86_USE_3DNOW
--- end snip ---
The P4 has SSE and prefetch or no ?
Margit
* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
2002-11-28 14:17 2.4.19/20, 2.5 missing P4 ifdef ? Margit Schubert-While
@ 2002-11-28 14:24 ` Dave Jones
2002-11-28 17:12 ` Bill Davidsen
2002-11-29 0:08 ` J.A. Magallon
0 siblings, 2 replies; 8+ messages in thread
From: Dave Jones @ 2002-11-28 14:24 UTC (permalink / raw)
To: Margit Schubert-While; +Cc: linux-kernel
On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
> Just noticed this in "include/asm-i386/processor.h" :
>
> --- snip ---
> /* Prefetch instructions for Pentium III and AMD Athlon */
> #ifdef CONFIG_MPENTIUMIII
> #define ARCH_HAS_PREFETCH
> extern inline void prefetch(const void *x)
> {
> __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
> }
> #elif CONFIG_X86_USE_3DNOW
> --- end snip ---
>
> The P4 has SSE and prefetch or no ?
It does. You seem to have found a bug.
Dave
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
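Editor's sketch, not part of the thread: one possible shape for a fix is to widen the guard so that a P4 build also picks up the SSE prefetch. This assumes the P4 option is named CONFIG_MPENTIUM4, which is defined by hand below to stand in for a kernel config. The `sum_with_prefetch` helper and the non-x86 fallback are hypothetical additions so the fragment compiles on its own in userspace. It is not the change that was actually merged.

```c
/* Assumption: CONFIG_MPENTIUM4 stands in for a P4 kernel configuration.
 * In a real tree it would come from the generated config, not a #define. */
#define CONFIG_MPENTIUM4 1

/* Prefetch instructions for Pentium III, Pentium 4 (both have SSE) */
#if defined(CONFIG_MPENTIUMIII) || defined(CONFIG_MPENTIUM4)
#define ARCH_HAS_PREFETCH
static inline void prefetch(const void *x)
{
#if defined(__i386__) || defined(__x86_64__)
	/* Same instruction as the snippet quoted in the thread. */
	__asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
#else
	/* Fallback so the sketch still builds on other architectures. */
	__builtin_prefetch(x, 0, 0);
#endif
}
#endif

/* Generic no-op when no architecture-specific prefetch was selected. */
#ifndef ARCH_HAS_PREFETCH
static inline void prefetch(const void *x) { (void)x; }
#endif

/* Hypothetical caller: prefetch a few elements ahead while summing. */
static int sum_with_prefetch(const int *v, int n)
{
	int i, total = 0;

	for (i = 0; i < n; i++) {
		if (i + 16 < n)
			prefetch(&v[i + 16]);
		total += v[i];
	}
	return total;
}
```

A prefetch is only a hint, so the result of `sum_with_prefetch` is the same whichever branch of the guard is taken. That matches the later point in the thread that the missing ifdef costs performance rather than correctness.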
* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
2002-11-28 14:24 ` Dave Jones
@ 2002-11-28 17:12 ` Bill Davidsen
2002-11-29 4:51 ` GrandMasterLee
2002-11-29 0:08 ` J.A. Magallon
1 sibling, 1 reply; 8+ messages in thread
From: Bill Davidsen @ 2002-11-28 17:12 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux-Kernel Mailing List
On Thu, 28 Nov 2002, Dave Jones wrote:
> On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
> > Just noticed this in "include/asm-i386/processor.h" :
> >
> > --- snip ---
> > /* Prefetch instructions for Pentium III and AMD Athlon */
> > #ifdef CONFIG_MPENTIUMIII
> > #define ARCH_HAS_PREFETCH
> > extern inline void prefetch(const void *x)
> > {
> > __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
> > }
> > #elif CONFIG_X86_USE_3DNOW
> > --- end snip ---
> >
> > The P4 has SSE and prefetch or no ?
>
> It does. You seem to have found a bug.
A bug? An inefficiency, obviously, but it should be functionally correct,
no? Or is there a problem I've missed other than performance?
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
2002-11-28 14:24 ` Dave Jones
2002-11-28 17:12 ` Bill Davidsen
@ 2002-11-29 0:08 ` J.A. Magallon
2002-12-02 13:24 ` Dave Jones
1 sibling, 1 reply; 8+ messages in thread
From: J.A. Magallon @ 2002-11-29 0:08 UTC (permalink / raw)
To: Dave Jones; +Cc: Margit Schubert-While, linux-kernel
On 2002.11.28 Dave Jones wrote:
>On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
> > Just noticed this in "include/asm-i386/processor.h" :
> >
> > --- snip ---
> > /* Prefetch instructions for Pentium III and AMD Athlon */
> > #ifdef CONFIG_MPENTIUMIII
> > #define ARCH_HAS_PREFETCH
> > extern inline void prefetch(const void *x)
> > {
> > __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
> > }
> > #elif CONFIG_X86_USE_3DNOW
> > --- end snip ---
> >
> > The P4 has SSE and prefetch or no ?
>
>It does. You seem to have found a bug.
>
Two questions:
- I am trying to use gcc's __builtin_prefetch, and it is able to
spit different prefetch instructions depending on 'temporal
locality' of the data:
prefetchnta, prefetcht2, prefetcht1, prefetcht0
temp-loc: 0 1 2 3
0 means you can just discard after r or w, and 3 means you
are really interested in data lasting in cache.
Do not know if the use of prefetch in kernel is extensive,
but perhaps this is something to investigate...
- PII also supports the prefetches. Is it worth to add it ?
TIA
--
J.A. Magallon <jamagallon@able.es> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.20-rc4-jam0 (gcc 3.2 (Mandrake Linux 9.1 3.2-4mdk))
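Editor's sketch, not part of the thread: the temporal-locality argument described above is the third parameter of gcc's `__builtin_prefetch(addr, rw, locality)`. Both `rw` and `locality` must be compile-time constants. The two helper functions below are hypothetical and only illustrate choosing locality 0 for data touched once and locality 3 for data that is reused. Which instruction is emitted depends on the target and compiler flags.

```c
#include <stddef.h>

/* Hypothetical helper: each element is read once, so hint locality 0
 * (on x86 with SSE, gcc typically emits prefetchnta for this). */
static long sum_stream(const long *v, size_t n)
{
	long total = 0;

	for (size_t i = 0; i < n; i++) {
		if (i + 8 < n)
			__builtin_prefetch(&v[i + 8], 0 /* read */, 0 /* locality */);
		total += v[i];
	}
	return total;
}

/* Hypothetical helper: the same data is walked several times, so hint
 * locality 3 (typically prefetcht0) to ask for it to stay in cache. */
static long sum_hot(const long *v, size_t n, int passes)
{
	long total = 0;

	for (int p = 0; p < passes; p++) {
		for (size_t i = 0; i < n; i++) {
			if (i + 8 < n)
				__builtin_prefetch(&v[i + 8], 0 /* read */, 3 /* locality */);
			total += v[i];
		}
	}
	return total;
}
```

On targets without a prefetch instruction the builtin expands to nothing, so code written this way stays portable.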
* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
2002-11-28 17:12 ` Bill Davidsen
@ 2002-11-29 4:51 ` GrandMasterLee
0 siblings, 0 replies; 8+ messages in thread
From: GrandMasterLee @ 2002-11-29 4:51 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Dave Jones, Linux-Kernel Mailing List
On Thu, 2002-11-28 at 11:12, Bill Davidsen wrote:
> On Thu, 28 Nov 2002, Dave Jones wrote:
>
> > On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
> > > Just noticed this in "include/asm-i386/processor.h" :
> > >
> > > --- snip ---
> > > /* Prefetch instructions for Pentium III and AMD Athlon */
> > > #ifdef CONFIG_MPENTIUMIII
> > > #define ARCH_HAS_PREFETCH
> > > extern inline void prefetch(const void *x)
> > > {
> > > __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
> > > }
> > > #elif CONFIG_X86_USE_3DNOW
> > > --- end snip ---
> > >
> > > The P4 has SSE and prefetch or no ?
> >
> > It does. You seem to have found a bug.
>
> A bug? An inefficiency, obviously, but it should be functionally correct,
> no? Or is there a problem I've missed other than performance?
IMHO, when building systems, any deficiency I find is logged as a bug.
I'd imagine anything perceived as a problem should be treated this
way. IMHO, performance, or the lack of it, is a bug if a potential fix is
available.
--The GrandMaster
* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
2002-11-29 0:08 ` J.A. Magallon
@ 2002-12-02 13:24 ` Dave Jones
0 siblings, 0 replies; 8+ messages in thread
From: Dave Jones @ 2002-12-02 13:24 UTC (permalink / raw)
To: J.A. Magallon; +Cc: Margit Schubert-While, linux-kernel
On Fri, Nov 29, 2002 at 01:08:59AM +0100, J.A. Magallon wrote:
> - PII also supports the prefetches. Is it worth to add it ?
I think you are mistaken. The prefetch instructions came to
Intel CPUs with SSE. There are (afair) no SSE Pentium II's.
Dave
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs
* 2.4.19/20, 2.5 missing P4 ifdef ?
@ 2002-12-03 9:01 Margit Schubert-While
0 siblings, 0 replies; 8+ messages in thread
From: Margit Schubert-While @ 2002-12-03 9:01 UTC (permalink / raw)
To: linux-kernel
> I think you are mistaken. The prefetch instructions came to
> Intel CPUs with SSE. There are (afair) no SSE Pentium II's.
Correct.
And while we are at it, any kernel guru like to take a squint at
include/asm-i386 and arch/i386 ?
It seems to me that it should be possible to get a lot more out
of P3's and P4's.
Taking the example of the (misnamed) 3DNOW page_clear code,
for the P4(SSE2) it could be implemented as :
	__asm__ __volatile__ (
		" pxor %%xmm0, %%xmm0\n" : :
	);

	for(i=0;i<4096/128;i++)
	{
		__asm__ __volatile__ (
		" movntdq %%xmm0, (%0)\n"
		" movntdq %%xmm0, 16(%0)\n"
		" movntdq %%xmm0, 32(%0)\n"
		" movntdq %%xmm0, 48(%0)\n"
		" movntdq %%xmm0, 64(%0)\n"
		" movntdq %%xmm0, 80(%0)\n"
		" movntdq %%xmm0, 96(%0)\n"
		" movntdq %%xmm0, 112(%0)\n"
		: : "r" (page) : "memory");
		page+=128;
	}
	/* since movntdq is weakly-ordered, a "sfence" is needed to become
	 * ordered again.
	 */
	__asm__ __volatile__ (
		" sfence \n" : :
	);
I'm quite willing to be flamed on this :-)
Margit
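Editor's sketch, not part of the thread: the same non-temporal page clear can be written in userspace with the SSE2 intrinsics from `<emmintrin.h>`, where `_mm_stream_si128` corresponds to `movntdq` and `_mm_sfence` to `sfence`. The names `clear_page_nt` and `PAGE_BYTES` are hypothetical, the 4096-byte size is taken from the loop above, and the `memset` fallback is an addition so the fragment builds where SSE2 is unavailable. In-kernel use would also need the FPU/SSE state to be saved and restored around the XMM register usage, which this sketch does not attempt.

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define PAGE_BYTES 4096	/* assumption: 4 KiB page, as in the loop above */

/* Hypothetical helper: zero one page with non-temporal stores.
 * The pointer must be 16-byte aligned, as movntdq requires. */
static void clear_page_nt(void *page)
{
#if defined(__SSE2__)
	const __m128i zero = _mm_setzero_si128();	/* pxor xmm, xmm */
	__m128i *p = (__m128i *)page;
	size_t i;

	for (i = 0; i < PAGE_BYTES / sizeof(__m128i); i++)
		_mm_stream_si128(&p[i], zero);		/* movntdq */

	/* Non-temporal stores are weakly ordered; fence before returning. */
	_mm_sfence();
#else
	/* Fallback for targets without SSE2. */
	memset(page, 0, PAGE_BYTES);
#endif
}
```

Leaving the unrolling to the compiler keeps the sketch short. Whether the non-temporal path is actually faster than a plain clear depends on the CPU and on whether the page is used again soon, so it would need measuring.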
* 2.4.19/20, 2.5 missing P4 ifdef ?
@ 2002-11-29 7:42 Margit Schubert-While
0 siblings, 0 replies; 8+ messages in thread
From: Margit Schubert-While @ 2002-11-29 7:42 UTC (permalink / raw)
To: linux-kernel
Here is the link to the Intel IA32 Software Developers Manual (Includes P4) :
http://developer.intel.com/design/pentium4/manuals/245470.htm
Margit