linux-kernel.vger.kernel.org archive mirror
* 2.4.19/20, 2.5 missing P4 ifdef ?
@ 2002-11-28 14:17 Margit Schubert-While
  2002-11-28 14:24 ` Dave Jones
  0 siblings, 1 reply; 8+ messages in thread
From: Margit Schubert-While @ 2002-11-28 14:17 UTC (permalink / raw)
  To: linux-kernel

Just noticed this in "include/asm-i386/processor.h" :

--- snip ---
/* Prefetch instructions for Pentium III and AMD Athlon */
#ifdef  CONFIG_MPENTIUMIII
#define ARCH_HAS_PREFETCH
extern inline void prefetch(const void *x)
{
         __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
}
#elif CONFIG_X86_USE_3DNOW
--- end snip ---

Doesn't the P4 have SSE and prefetch as well?

Margit 



* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
  2002-11-28 14:17 2.4.19/20, 2.5 missing P4 ifdef ? Margit Schubert-While
@ 2002-11-28 14:24 ` Dave Jones
  2002-11-28 17:12   ` Bill Davidsen
  2002-11-29  0:08   ` J.A. Magallon
  0 siblings, 2 replies; 8+ messages in thread
From: Dave Jones @ 2002-11-28 14:24 UTC (permalink / raw)
  To: Margit Schubert-While; +Cc: linux-kernel

On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
 > Just noticed this in "include/asm-i386/processor.h" :
 > 
 > --- snip ---
 > /* Prefetch instructions for Pentium III and AMD Athlon */
 > #ifdef  CONFIG_MPENTIUMIII
 > #define ARCH_HAS_PREFETCH
 > extern inline void prefetch(const void *x)
 > {
 >         __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
 > }
 > #elif CONFIG_X86_USE_3DNOW
 > --- end snip ---
 > 
 > Doesn't the P4 have SSE and prefetch as well?

It does. You seem to have found a bug.

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
  2002-11-28 14:24 ` Dave Jones
@ 2002-11-28 17:12   ` Bill Davidsen
  2002-11-29  4:51     ` GrandMasterLee
  2002-11-29  0:08   ` J.A. Magallon
  1 sibling, 1 reply; 8+ messages in thread
From: Bill Davidsen @ 2002-11-28 17:12 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux-Kernel Mailing List

On Thu, 28 Nov 2002, Dave Jones wrote:

> On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
>  > Just noticed this in "include/asm-i386/processor.h" :
>  > 
>  > --- snip ---
>  > /* Prefetch instructions for Pentium III and AMD Athlon */
>  > #ifdef  CONFIG_MPENTIUMIII
>  > #define ARCH_HAS_PREFETCH
>  > extern inline void prefetch(const void *x)
>  > {
>  >         __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
>  > }
>  > #elif CONFIG_X86_USE_3DNOW
>  > --- end snip ---
>  > 
>  > Doesn't the P4 have SSE and prefetch as well?
> 
> It does. You seem to have found a bug.

A bug? An inefficiency, obviously, but it should be functionally correct,
no? Or is there a problem I've missed other than performance?
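
To make the distinction concrete, here is a minimal sketch of the two
halves of this exchange: the conditional widened to cover a P4 build, and
a no-op fallback that keeps callers correct when no prefetch instruction
is selected. It is a userspace illustration, not the kernel's actual
header. CONFIG_MPENTIUM4 is assumed to be the name of the P4 config
option, the extra x86 guard and the sum_list() helper are made up for the
example, and the empty fallback is written from memory of how the generic
code behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for a .config that selects a Pentium 4 build. */
#define CONFIG_MPENTIUM4 1

/* Use the SSE prefetch for Pentium III *and* Pentium 4 builds.
 * The __i386__/__x86_64__ check is only here so the sketch still
 * compiles on non-x86 hosts; it is not part of the quoted header. */
#if (defined(CONFIG_MPENTIUMIII) || defined(CONFIG_MPENTIUM4)) && \
    (defined(__i386__) || defined(__x86_64__))
#define ARCH_HAS_PREFETCH
static inline void prefetch(const void *x)
{
        __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
}
#endif

/* Fallback: without an arch-specific version, prefetch() does nothing.
 * Callers stay correct; they only lose the cache hint. */
#ifndef ARCH_HAS_PREFETCH
static inline void prefetch(const void *x)
{
        (void)x;
}
#endif

/* Hypothetical caller: sum a linked list, hinting the next node early. */
struct node {
        struct node *next;
        int value;
};

static int sum_list(const struct node *head)
{
        int sum = 0;

        for (const struct node *n = head; n != NULL; n = n->next) {
                if (n->next != NULL)
                        prefetch(n->next);  /* a hint, never required */
                sum += n->value;
        }
        return sum;
}
```

sum_list() returns the same value whichever prefetch() is compiled in,
which is the sense in which the missing P4 case costs performance rather
than correctness.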

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
  2002-11-28 14:24 ` Dave Jones
  2002-11-28 17:12   ` Bill Davidsen
@ 2002-11-29  0:08   ` J.A. Magallon
  2002-12-02 13:24     ` Dave Jones
  1 sibling, 1 reply; 8+ messages in thread
From: J.A. Magallon @ 2002-11-29  0:08 UTC (permalink / raw)
  To: Dave Jones; +Cc: Margit Schubert-While, linux-kernel


On 2002.11.28 Dave Jones wrote:
>On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
> > Just noticed this in "include/asm-i386/processor.h" :
> > 
> > --- snip ---
> > /* Prefetch instructions for Pentium III and AMD Athlon */
> > #ifdef  CONFIG_MPENTIUMIII
> > #define ARCH_HAS_PREFETCH
> > extern inline void prefetch(const void *x)
> > {
> >         __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
> > }
> > #elif CONFIG_X86_USE_3DNOW
> > --- end snip ---
> > 
> > Doesn't the P4 have SSE and prefetch as well?
>
>It does. You seem to have found a bug.
>

Two questions:
- I am trying to use gcc's __builtin_prefetch, and it can emit
  different prefetch instructions depending on the 'temporal
  locality' of the data:
    temp-loc:     0            1           2           3
    instruction:  prefetchnta  prefetcht2  prefetcht1  prefetcht0
  0 means the data can be discarded right after the read or write,
  and 3 means you really want the data to stay in cache.
  I do not know whether prefetch is used extensively in the kernel,
  but perhaps this is something to investigate...

- PII also supports the prefetches. Is it worth adding it?
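
The temporal-locality hint from the first question can be sketched as
follows. This is only an illustration of the builtin's three arguments
(address, read/write flag, locality 0..3). The function names and the
PREFETCH_AHEAD distance are made up for the example and are not tuned for
any particular CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical look-ahead distance, in elements. */
#define PREFETCH_AHEAD 16

/* Data is read once and not needed again: locality 0
 * (gcc emits prefetchnta for this on x86 with SSE enabled). */
static long sum_stream(const int *data, size_t n)
{
        long sum = 0;

        assert(data != NULL || n == 0);
        for (size_t i = 0; i < n; i++) {
                if (i + PREFETCH_AHEAD < n)
                        __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 0);
                sum += data[i];
        }
        return sum;
}

/* Data is written and expected to be reused soon: rw = 1, locality 3
 * (the "keep it in all cache levels" end of the scale). */
static void scale_hot(int *data, size_t n, int factor)
{
        for (size_t i = 0; i < n; i++) {
                if (i + PREFETCH_AHEAD < n)
                        __builtin_prefetch(&data[i + PREFETCH_AHEAD], 1, 3);
                data[i] *= factor;
        }
}
```

The second and third arguments must be compile-time constants, and on
targets without a prefetch instruction gcc simply emits nothing for the
call, so the code stays portable.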

TIA

-- 
J.A. Magallon <jamagallon@able.es>      \                 Software is like sex:
werewolf.able.es                         \           It's better when it's free
Mandrake Linux release 9.1 (Cooker) for i586
Linux 2.4.20-rc4-jam0 (gcc 3.2 (Mandrake Linux 9.1 3.2-4mdk))


* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
  2002-11-28 17:12   ` Bill Davidsen
@ 2002-11-29  4:51     ` GrandMasterLee
  0 siblings, 0 replies; 8+ messages in thread
From: GrandMasterLee @ 2002-11-29  4:51 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Dave Jones, Linux-Kernel Mailing List

On Thu, 2002-11-28 at 11:12, Bill Davidsen wrote:
> On Thu, 28 Nov 2002, Dave Jones wrote:
> 
> > On Thu, Nov 28, 2002 at 03:17:53PM +0100, Margit Schubert-While wrote:
> >  > Just noticed this in "include/asm-i386/processor.h" :
> >  > 
> >  > --- snip ---
> >  > /* Prefetch instructions for Pentium III and AMD Athlon */
> >  > #ifdef  CONFIG_MPENTIUMIII
> >  > #define ARCH_HAS_PREFETCH
> >  > extern inline void prefetch(const void *x)
> >  > {
> >  >         __asm__ __volatile__ ("prefetchnta (%0)" : : "r"(x));
> >  > }
> >  > #elif CONFIG_X86_USE_3DNOW
> >  > --- end snip ---
> >  > 
> >  > Doesn't the P4 have SSE and prefetch as well?
> > 
> > It does. You seem to have found a bug.
> 
> A bug? An inefficiency, obviously, but it should be functionally correct,
> no? Or is there a problem I've missed other than performance?

IMHO, when building systems, any deficiency I find is logged as a bug.
I'd imagine anything perceived as a problem should be treated this
way. Imho, performance, or the lack of it, is a bug if a potential fix
is available.

  --The GrandMaster


* Re: 2.4.19/20, 2.5 missing P4 ifdef ?
  2002-11-29  0:08   ` J.A. Magallon
@ 2002-12-02 13:24     ` Dave Jones
  0 siblings, 0 replies; 8+ messages in thread
From: Dave Jones @ 2002-12-02 13:24 UTC (permalink / raw)
  To: J.A. Magallon; +Cc: Margit Schubert-While, linux-kernel

On Fri, Nov 29, 2002 at 01:08:59AM +0100, J.A. Magallon wrote:

 > - PII also supports the prefetches. Is it worth adding it?

I think you are mistaken. The prefetch instructions came to
Intel CPUs with SSE. There are (afair) no SSE Pentium IIs.

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


* 2.4.19/20, 2.5 missing P4 ifdef ?
@ 2002-12-03  9:01 Margit Schubert-While
  0 siblings, 0 replies; 8+ messages in thread
From: Margit Schubert-While @ 2002-12-03  9:01 UTC (permalink / raw)
  To: linux-kernel

 > I think you are mistaken. The prefetch instructions came to
 > Intel CPUs with SSE. There are (afair) no SSE Pentium IIs.

Correct.
And while we are at it, would any kernel guru like to take a look at
include/asm-i386 and arch/i386?
It seems to me that it should be possible to get a lot more out
of P3s and P4s.
Taking the (misnamed) 3DNOW page_clear code as an example,
for the P4 (SSE2) it could be implemented as:

         __asm__ __volatile__ (
                 "  pxor %%xmm0, %%xmm0\n" : :
         );

         for(i=0;i<4096/128;i++)
         {
                 __asm__ __volatile__ (
                 "  movntdq %%xmm0, (%0)\n"
                 "  movntdq %%xmm0, 16(%0)\n"
                 "  movntdq %%xmm0, 32(%0)\n"
                 "  movntdq %%xmm0, 48(%0)\n"
                 "  movntdq %%xmm0, 64(%0)\n"
                 "  movntdq %%xmm0, 80(%0)\n"
                 "  movntdq %%xmm0, 96(%0)\n"
                 "  movntdq %%xmm0, 112(%0)\n"
                 : : "r" (page) : "memory");
                 page+=128;
         }
         /* since movntdq is weakly-ordered, a "sfence" is needed to become
          * ordered again.
          */
         __asm__ __volatile__ (
                 "  sfence \n" : :
         );
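
For comparison, the same idea can be sketched in userspace with the SSE2
compiler intrinsics instead of hand-written asm. This is an assumption-laden
illustration, not a proposed kernel patch: clear_page_nt() and PAGE_BYTES
are names made up here, the memset() branch exists only so the sketch
builds on non-SSE2 targets, and real kernel code would also have to save
and restore the FPU/SSE state around the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_setzero_si128, _mm_stream_si128, _mm_sfence */
#endif

#define PAGE_BYTES 4096  /* hypothetical name; matches the 4096 above */

/* Zero one page with non-temporal 16-byte stores (movntdq).
 * The page must be 16-byte aligned, as movntdq requires. */
static void clear_page_nt(void *page)
{
        assert(((uintptr_t)page % 16) == 0);
#if defined(__SSE2__)
        const __m128i zero = _mm_setzero_si128();
        __m128i *p = (__m128i *)page;

        /* 8 stores of 16 bytes = 128 bytes per iteration, as in the asm. */
        for (size_t i = 0; i < PAGE_BYTES / 128; i++) {
                _mm_stream_si128(p + 0, zero);
                _mm_stream_si128(p + 1, zero);
                _mm_stream_si128(p + 2, zero);
                _mm_stream_si128(p + 3, zero);
                _mm_stream_si128(p + 4, zero);
                _mm_stream_si128(p + 5, zero);
                _mm_stream_si128(p + 6, zero);
                _mm_stream_si128(p + 7, zero);
                p += 8;
        }
        /* Non-temporal stores are weakly ordered; fence before returning. */
        _mm_sfence();
#else
        memset(page, 0, PAGE_BYTES);  /* portable fallback for the sketch */
#endif
}
```

With intrinsics the compiler tracks the zeroed register itself, so nothing
depends on xmm0 surviving between separate asm statements.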

I'm quite willing to be flamed on this :-)

Margit 



* 2.4.19/20, 2.5 missing P4 ifdef ?
@ 2002-11-29  7:42 Margit Schubert-While
  0 siblings, 0 replies; 8+ messages in thread
From: Margit Schubert-While @ 2002-11-29  7:42 UTC (permalink / raw)
  To: linux-kernel

Here is the link to the Intel IA-32 Software Developer's Manual (includes P4):
http://developer.intel.com/design/pentium4/manuals/245470.htm

Margit 




Thread overview: 8+ messages
2002-11-28 14:17 2.4.19/20, 2.5 missing P4 ifdef ? Margit Schubert-While
2002-11-28 14:24 ` Dave Jones
2002-11-28 17:12   ` Bill Davidsen
2002-11-29  4:51     ` GrandMasterLee
2002-11-29  0:08   ` J.A. Magallon
2002-12-02 13:24     ` Dave Jones
2002-11-29  7:42 Margit Schubert-While
2002-12-03  9:01 Margit Schubert-While
