* Re: [patch 0/2] sLeAZY FPU feature
@ 2006-07-02 0:57 Chuck Ebbert
0 siblings, 0 replies; 4+ messages in thread
From: Chuck Ebbert @ 2006-07-02 0:57 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Andrew Morton, linux-kernel, Andi Kleen, Nick Piggin
In-Reply-To: <1151782942.3195.56.camel@laptopd505.fenrus.org>
On Sat, 01 Jul 2006 21:42:22 +0200, Arjan van de Ven wrote:
> > What sort of test?
>
> the one I did was long running FPU app (calculating PI using FPU)
Mine was just running a program that loops doing getpid() in one window
and this in another:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

/* Read the TSC; "=A" yields edx:eax as one 64-bit value on i386 only. */
#define rdtscll(t) asm volatile("rdtsc" : "=A" (t))

int main(int argc, char * const argv[])
{
	unsigned long long tsc1, tsc2;
	long double ld = 0.0;
	int i, iters = 999999999;

	rdtscll(tsc1);
	/* one long double add per iteration keeps the FPU busy throughout */
	for (i = 0; i < iters; i++)
		ld += 1.0;
	rdtscll(tsc2);

	printf("count: %Lf, clocks: %llu\n", ld, tsc2 - tsc1);
	return 0;
}
So the ~0.4% gain I saw (averaging 10 tests) was likely the minimum
and Arjan's 8.5% gain when switching tasks after every FPU operation
is the max.
--
Chuck
"You can't read a newspaper if you can't read." --George W. Bush
* Re: [patch 0/2] sLeAZY FPU feature
2006-07-01 17:40 ` Nick Piggin
@ 2006-07-01 19:42 ` Arjan van de Ven
0 siblings, 0 replies; 4+ messages in thread
From: Arjan van de Ven @ 2006-07-01 19:42 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-kernel, akpm, ak
On Sun, 2006-07-02 at 03:40 +1000, Nick Piggin wrote:
> What sort of test?
the one I did was long running FPU app (calculating PI using FPU)
> Any idea of the results for a best case microbenchmark
> (something like two threads ping-pong a couple of futexes between them,
> in between doing a single FPU op)?
OK, I wrote a test scenario for this; the performance gain I get with
it is 8.5%.
the FPU part of the hot loop I used is
A = 0.3 * (A+B);
with A and B doubles
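
A benchmark along these lines could look like the sketch below. It is
not the program behind the 8.5% figure, which was not posted. It assumes
two processes bouncing a one-byte token over a pair of pipes rather than
two threads and futexes, and the names pingpong() and fpu_step() are
made up for illustration. Each side does exactly one FPU operation per
round trip, so nearly every FPU use follows a context switch.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The FPU part of the hot loop quoted above: A = 0.3 * (A+B). */
static double fpu_step(double a, double b)
{
	return 0.3 * (a + b);
}

/*
 * Bounce a token between parent and child `iters` times (hypothetical
 * helper). Returns the parent's final A, or -1.0 on error.
 */
double pingpong(int iters)
{
	int to_child[2], to_parent[2];
	volatile double a = 0.0;	/* volatile: keep the FPU op alive */
	double b = 1.0;
	char token = 'x';
	pid_t pid;
	int i;

	assert(iters >= 0);
	if (pipe(to_child) < 0 || pipe(to_parent) < 0)
		return -1.0;

	pid = fork();
	if (pid < 0)
		return -1.0;

	if (pid == 0) {
		/* child: wait for the token, do one FPU op, send it back */
		close(to_child[1]);
		close(to_parent[0]);
		for (i = 0; i < iters; i++) {
			if (read(to_child[0], &token, 1) != 1)
				_exit(1);
			a = fpu_step(a, b);
			if (write(to_parent[1], &token, 1) != 1)
				_exit(1);
		}
		_exit(0);
	}

	/* parent: send the token, do one FPU op, wait for it to return */
	close(to_child[0]);
	close(to_parent[1]);
	for (i = 0; i < iters; i++) {
		if (write(to_child[1], &token, 1) != 1)
			break;
		a = fpu_step(a, b);
		if (read(to_parent[0], &token, 1) != 1)
			break;
	}
	close(to_child[1]);
	close(to_parent[0]);
	waitpid(pid, NULL, 0);
	return i == iters ? a : -1.0;
}
```

To compare kernels, time a call such as pingpong(1000000) with and
without the patch. With B fixed at 1.0, A converges to 3/7, which gives
a cheap sanity check that the loop really ran.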
* Re: [patch 0/2] sLeAZY FPU feature
2006-07-01 17:11 Arjan van de Ven
@ 2006-07-01 17:40 ` Nick Piggin
2006-07-01 19:42 ` Arjan van de Ven
0 siblings, 1 reply; 4+ messages in thread
From: Nick Piggin @ 2006-07-01 17:40 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: linux-kernel, akpm, ak
Arjan van de Ven wrote:
> Hi,
>
> the two patches in this series (the x86-64 one by me, the i386 one by
> Chuck Ebbert) change how the lazy fpu feature works. In the current
> situation, we are 100% lazy, meaning that after every context switch,
> the application takes a trap on the first FPU use, which then restores
> the FPU context.
>
> The sLeAZY FPU patch changes this behavior; if a process has used the
> FPU for 5 stints in a row, the behavior becomes proactive and the FPU
> context is restored during the regular context switch already. This
> means we can avoid the trap.
>
> The underlying assumption is that if a process uses the FPU 5 times in
> a row, it's likely to do so the 6th and later times as well (i.e. it's
> not a one-off behavior).
>
> There is a limit built in; this proactive behavior resets after 255
> times, so that when a process is long lived and changes behavior, it'll
> still get the right behavior (for performance) after some time.
>
> Chuck measured a performance gain of roughly 0.4%, and my experiments show a
> similar improvement.
What sort of test? Any idea of the results for a best case microbenchmark
(something like two threads ping-pong a couple of futexes between them,
in between doing a single FPU op)?
--
SUSE Labs, Novell Inc.
* [patch 0/2] sLeAZY FPU feature
@ 2006-07-01 17:11 Arjan van de Ven
2006-07-01 17:40 ` Nick Piggin
0 siblings, 1 reply; 4+ messages in thread
From: Arjan van de Ven @ 2006-07-01 17:11 UTC (permalink / raw)
To: linux-kernel; +Cc: akpm, ak
Hi,
the two patches in this series (the x86-64 one by me, the i386 one by
Chuck Ebbert) change how the lazy fpu feature works. In the current
situation, we are 100% lazy, meaning that after every context switch,
the application takes a trap on the first FPU use, which then restores
the FPU context.
The sLeAZY FPU patch changes this behavior; if a process has used the
FPU for 5 stints in a row, the behavior becomes proactive and the FPU
context is restored during the regular context switch already. This
means we can avoid the trap.
The underlying assumption is that if a process uses the FPU 5 times in
a row, it's likely to do so the 6th and later times as well (i.e. it's
not a one-off behavior).
There is a limit built in; this proactive behavior resets after 255
times, so that when a process is long lived and changes behavior, it'll
still get the right behavior (for performance) after some time.
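
The heuristic described above can be modelled in user space roughly as
follows. This is a minimal sketch and not the patch itself: the struct
and function names are hypothetical, and it assumes the streak is kept
in an 8-bit per-task counter so that the "resets after 255 times" limit
falls out of the counter wrapping to 0. In the model, a proactive
restore counts as FPU use, so only the wrap ends a proactive streak.

```c
#include <assert.h>
#include <stdbool.h>

#define EAGER_THRESHOLD 5	/* "5 stints in a row" from the description */

struct task_sim {
	unsigned char fpu_counter;	/* 8 bits wide: 255 + 1 wraps to 0 */
	bool fpu_loaded;		/* FPU state is live in this timeslice */
};

/* Restore FPU state, either from the trap or from the context switch. */
static void restore_fpu(struct task_sim *t)
{
	t->fpu_loaded = true;
	t->fpu_counter++;
}

/* Switch the task in; restore proactively once the streak is long enough. */
static void switch_in(struct task_sim *t)
{
	t->fpu_loaded = false;
	if (t->fpu_counter >= EAGER_THRESHOLD)
		restore_fpu(t);
}

/* First FPU instruction of a timeslice. Returns true if it had to trap. */
static bool use_fpu(struct task_sim *t)
{
	if (t->fpu_loaded)
		return false;
	restore_fpu(t);
	return true;
}

/* Switch the task out; a timeslice without FPU state breaks the streak. */
static void switch_out(struct task_sim *t)
{
	if (!t->fpu_loaded)
		t->fpu_counter = 0;
}

/* Run `slices` timeslices in which the task always uses the FPU. */
int count_traps(struct task_sim *t, int slices)
{
	int traps = 0;

	assert(slices >= 0);
	for (int i = 0; i < slices; i++) {
		switch_in(t);
		if (use_fpu(t))
			traps++;
		switch_out(t);
	}
	return traps;
}
```

For a fresh task, count_traps(&t, 10) returns 5: the first five
timeslices trap and the remaining five are restored at switch time.
Once the counter wraps, the task goes through five lazy timeslices
again before turning proactive.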
Chuck measured a performance gain of roughly 0.4%, and my experiments show a
similar improvement.
Greetings,
Arjan van de Ven