linux-kernel.vger.kernel.org archive mirror
* Re: [patch 0/2] sLeAZY FPU feature
@ 2006-07-02  0:57 Chuck Ebbert
  0 siblings, 0 replies; 4+ messages in thread
From: Chuck Ebbert @ 2006-07-02  0:57 UTC
  To: Arjan van de Ven; +Cc: Andrew Morton, linux-kernel, Andi Kleen, Nick Piggin

In-Reply-To: <1151782942.3195.56.camel@laptopd505.fenrus.org>

On Sat, 01 Jul 2006 21:42:22 +0200, Arjan van de Ven wrote:

> > What sort of test?
>
> the one I did was a long-running FPU app (calculating pi using the FPU)

Mine was just running a program that loops doing getpid() in one window
and this in another:

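#include <stdio.h>

/*
 * Read the CPU timestamp counter. Reading EAX and EDX separately works on
 * both i386 and x86-64; the "=A" constraint only names the EDX:EAX pair
 * on i386. "volatile" keeps the compiler from moving or merging the reads.
 */
static inline unsigned long long rdtscll(void)
{
        unsigned int lo, hi;

        asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
        return ((unsigned long long)hi << 32) | lo;
}

int main(void)
{
        unsigned long long tsc1, tsc2;
        long double ld = 0.0;
        int i, iters = 999999999;

        tsc1 = rdtscll();
        /* One FPU add per iteration, so every time slice uses the FPU. */
        for (i = 0; i < iters; i++)
                ld += 1.0;
        tsc2 = rdtscll();

        printf("count: %Lf, clocks: %llu\n", ld, tsc2 - tsc1);

        return 0;
}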

So the ~0.4% gain I saw (averaging 10 tests) was likely the minimum
and Arjan's 8.5% gain when switching tasks after every FPU operation
is the max.

-- 
Chuck
 "You can't read a newspaper if you can't read."  --George W. Bush


* Re: [patch 0/2] sLeAZY FPU feature
  2006-07-01 17:40 ` Nick Piggin
@ 2006-07-01 19:42   ` Arjan van de Ven
  0 siblings, 0 replies; 4+ messages in thread
From: Arjan van de Ven @ 2006-07-01 19:42 UTC
  To: Nick Piggin; +Cc: linux-kernel, akpm, ak

On Sun, 2006-07-02 at 03:40 +1000, Nick Piggin wrote:

> What sort of test?

the one I did was a long-running FPU app (calculating pi using the FPU)

>  Any idea of the results for a best case microbenchmark
> (something like two threads ping-pong a couple of futexes between them,
> in between doing a single FPU op)

OK, I wrote a test scenario for this; the performance gain I get with
it is 8.5%.

the FPU part of the hot loop I used is
                A = 0.3 * (A+B);
with A and B doubles
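
A minimal sketch of a ping-pong test of this shape follows. It is not the
test used for the 8.5% figure, which was not posted. As an assumption made
for portability, it bounces a byte over two pipes between a parent and a
child process instead of ping-ponging futexes between two threads; the
function name pingpong() and the starting values of A and B are
hypothetical. Each side performs the single FPU operation quoted above and
then blocks, so the scheduler has to switch tasks between FPU uses.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Bounce one byte between parent and child `iters` times. Each side does
 * one FPU operation per round trip, then blocks in read() so the other
 * side can run. Returns the parent's final value of A, or -1.0 on error.
 */
static double pingpong(int iters)
{
        int to_child[2], to_parent[2];
        volatile double A = 1.0, B = 2.0;   /* illustrative start values */
        char c = 0;
        pid_t pid;
        int i;

        if (pipe(to_child) < 0 || pipe(to_parent) < 0)
                return -1.0;

        pid = fork();
        if (pid < 0)
                return -1.0;

        if (pid == 0) {
                /* Child: wait for the ball, do one FPU op, send it back. */
                close(to_child[1]);
                close(to_parent[0]);
                for (i = 0; i < iters; i++) {
                        if (read(to_child[0], &c, 1) != 1)
                                _exit(1);
                        A = 0.3 * (A + B);
                        if (write(to_parent[1], &c, 1) != 1)
                                _exit(1);
                }
                _exit(0);
        }

        /* Parent: do one FPU op, send the ball, wait for it to return. */
        close(to_child[0]);
        close(to_parent[1]);
        for (i = 0; i < iters; i++) {
                A = 0.3 * (A + B);
                if (write(to_child[1], &c, 1) != 1)
                        break;
                if (read(to_parent[0], &c, 1) != 1)
                        break;
        }
        close(to_child[1]);
        close(to_parent[0]);
        waitpid(pid, NULL, 0);

        return A;
}
```

To compare kernels, time a call such as pingpong(1000000) on each one. For
the two tasks to alternate on a single CPU they would have to be pinned to
it; the sketch leaves that out.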





* Re: [patch 0/2] sLeAZY FPU feature
  2006-07-01 17:11 Arjan van de Ven
@ 2006-07-01 17:40 ` Nick Piggin
  2006-07-01 19:42   ` Arjan van de Ven
  0 siblings, 1 reply; 4+ messages in thread
From: Nick Piggin @ 2006-07-01 17:40 UTC
  To: Arjan van de Ven; +Cc: linux-kernel, akpm, ak

Arjan van de Ven wrote:
> Hi,
> 
> the two patches in this series (the x86-64 one by me, the i386 one by
> Chuck Ebbert) change how the lazy FPU feature works. Currently we are
> 100% lazy: after every context switch, the application takes a trap on
> its first FPU use, and the trap handler then restores the FPU context.
> 
> The sLeAZY FPU patch changes this behavior: if a process has used the
> FPU for 5 stints in a row, the behavior becomes proactive and the FPU
> context is restored during the regular context switch itself. This
> means we can avoid the trap.
> 
> The underlying assumption is that if a process uses the FPU 5 consecutive
> times, it's likely to do so the 6th and later times as well (i.e. it's
> not a one-off behavior).
> 
> There is a limit built in: this proactive behavior resets after 255
> times, so that when a process is long-lived and changes behavior, it'll
> still get the right behavior (for performance) after some time.
> 
> Chuck measured a performance gain of roughly 0.4%, and my experiments
> show a similar improvement.

What sort of test? Any idea of the results for a best case microbenchmark
(something like two threads ping-pong a couple of futexes between them,
in between doing a single FPU op)

-- 
SUSE Labs, Novell Inc.


* [patch 0/2] sLeAZY FPU feature
@ 2006-07-01 17:11 Arjan van de Ven
  2006-07-01 17:40 ` Nick Piggin
  0 siblings, 1 reply; 4+ messages in thread
From: Arjan van de Ven @ 2006-07-01 17:11 UTC
  To: linux-kernel; +Cc: akpm, ak

Hi,

the two patches in this series (the x86-64 one by me, the i386 one by
Chuck Ebbert) change how the lazy FPU feature works. Currently we are
100% lazy: after every context switch, the application takes a trap on
its first FPU use, and the trap handler then restores the FPU context.

The sLeAZY FPU patch changes this behavior: if a process has used the
FPU for 5 stints in a row, the behavior becomes proactive and the FPU
context is restored during the regular context switch itself. This
means we can avoid the trap.

The underlying assumption is that if a process uses the FPU 5 consecutive
times, it's likely to do so the 6th and later times as well (i.e. it's
not a one-off behavior).

There is a limit built in: this proactive behavior resets after 255
times, so that when a process is long-lived and changes behavior, it'll
still get the right behavior (for performance) after some time.
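
The heuristic described above can be sketched as a small user-space
simulation. This is not the patch code: the struct, the function name
run_slice() and the exact comparison against the threshold are
assumptions for illustration. It also assumes the 255 limit comes from
an 8-bit counter wrapping back to zero, which is one plausible way to
get the reset described here.

```c
#include <stdbool.h>
#include <stdint.h>

#define SLEAZY_THRESHOLD 5   /* consecutive FPU stints before going proactive */

/* Hypothetical per-task state; the real patch may name and store it differently. */
struct task_fpu {
        uint8_t fpu_counter; /* consecutive time slices with FPU state loaded */
};

/*
 * Simulate one time slice of a task. `task_uses_fpu` says whether the task
 * executes any FPU instruction during the slice. Returns true if the slice
 * incurred an FPU trap (the lazy restore path).
 */
static bool run_slice(struct task_fpu *t, bool task_uses_fpu)
{
        bool trapped = false;
        bool state_loaded = false;

        /* Switch in: restore proactively once the counter is high enough. */
        if (t->fpu_counter >= SLEAZY_THRESHOLD) {
                state_loaded = true;
        } else if (task_uses_fpu) {
                /* Lazy path: the first FPU instruction traps and restores. */
                trapped = true;
                state_loaded = true;
        }

        /*
         * Switch out: with the state loaded, the slice counts as an FPU
         * stint whether or not the task really used it. The 8-bit counter
         * wraps from 255 to 0, which sends the task back to lazy mode.
         */
        if (state_loaded)
                t->fpu_counter++;
        else
                t->fpu_counter = 0;

        return trapped;
}
```

In this model a task that uses the FPU in every slice traps in its first
5 slices, runs trap-free until the counter wraps, and then has to earn
the proactive behavior again. A task that stopped using the FPU falls back
to fully lazy switching at that point.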

Chuck measured a performance gain of roughly 0.4%, and my experiments
show a similar improvement.

Greetings,
   Arjan van de Ven


