* get_cycles() on i386
@ 2003-11-04 23:22 Joel Becker
2003-11-04 23:27 ` john stultz
0 siblings, 1 reply; 7+ messages in thread
From: Joel Becker @ 2003-11-04 23:22 UTC
To: linux-kernel, Marcelo Tosatti, john stultz
Folks,
Certain distributions are building all of their SMP kernels
NUMA-aware. This is great, as the kernels support boxes like the x440
with no trouble. However, this implicitly disables CONFIG_X86_TSC.
While that is good for NUMA systems, and fine from a kernel timing
standpoint, it also eliminates any generic access to the TSC via
get_cycles(). With CONFIG_X86_TSC not defined, get_cycles() always
returns 0.
Given that >95% of machines will not be x440s, this means that a
user of that kernel cannot access a high resolution timer via
get_cycles(). I don't want to have to litter my code with rdtscll()
when I managed to remove it!
The proposed patch is trivial. If the system has a TSC, it is
available via get_cycles(). This makes no change to the other parts of the
kernel protected by CONFIG_X86_TSC.
Joel
diff -uNr ../kernel-2.4.21-4.0.1.EL/linux-2.4.21/include/asm-i386/timex.h linux-2.4.21/include/asm-i386/timex.h
--- ../kernel-2.4.21-4.0.1.EL/linux-2.4.21/include/asm-i386/timex.h 2002-11-28 15:53:15.000000000 -0800
+++ linux-2.4.21/include/asm-i386/timex.h 2003-11-04 11:33:08.000000000 -0800
@@ -40,7 +40,7 @@
 
 static inline cycles_t get_cycles (void)
 {
-#ifndef CONFIG_X86_TSC
+#ifndef CONFIG_X86_HAS_TSC
 	return 0;
 #else
 	unsigned long long ret;
--
"Hey mister if you're gonna walk on water,
Could you drop a line my way?"
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
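
To make the problem described above concrete, here is a minimal user-space sketch of a get_cycles() gated by a compile-time option. It is not the kernel code: DEMO_CONFIG_TSC, demo_read_counter(), demo_static_get_cycles(), demo_elapsed() and demo_work() are all hypothetical names invented for this illustration, and the counter is a fake stand-in for rdtscll(). With the option left undefined, every interval measured through the helper comes out as zero, regardless of what the hardware could provide.

```c
#include <stdint.h>

typedef uint64_t cycles_t;

/* Hypothetical stand-in for rdtscll(): a counter that advances on each read. */
static uint64_t demo_counter;

uint64_t demo_read_counter(void)
{
	demo_counter += 100;
	return demo_counter;
}

/* Leave DEMO_CONFIG_TSC undefined to model a build without the TSC option. */
cycles_t demo_static_get_cycles(void)
{
#ifndef DEMO_CONFIG_TSC
	return 0;
#else
	return demo_read_counter();
#endif
}

/* A caller that times a piece of work purely through the helper above. */
cycles_t demo_elapsed(void (*work)(void))
{
	cycles_t start = demo_static_get_cycles();

	work();
	return demo_static_get_cycles() - start;
}

/* Placeholder workload for the sketch. */
void demo_work(void)
{
}
```

Building the same file with -DDEMO_CONFIG_TSC makes demo_elapsed() return a non-zero delta, which is the behaviour the patch aims to restore for machines that do have a TSC.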
* Re: get_cycles() on i386
2003-11-04 23:22 get_cycles() on i386 Joel Becker
@ 2003-11-04 23:27 ` john stultz
2003-11-04 23:54 ` Linus Torvalds
2003-12-05 15:55 ` Marcelo Tosatti
0 siblings, 2 replies; 7+ messages in thread
From: john stultz @ 2003-11-04 23:27 UTC
To: Joel Becker; +Cc: lkml, Marcelo Tosatti
On Tue, 2003-11-04 at 15:22, Joel Becker wrote:
> Folks,
> Certain distributions are building all of their SMP kernels
> NUMA-aware. This is great, as the kernels support boxes like the x440
> with no trouble. However, this implicitly disables CONFIG_X86_TSC.
> While that is good for NUMA systems, and fine from a kernel timing
> standpoint, it also eliminates any generic access to the TSC via
> get_cycles(). With CONFIG_X86_TSC not defined, get_cycles() always
> returns 0.
> Given that >95% of machines will not be x440s, this means that a
> user of that kernel cannot access a high resolution timer via
> get_cycles(). I don't want to have to litter my code with rdtscll()
> when I managed to remove it!
> The proposed patch is trivial. If the system has a TSC, it is
> available via get_cycles(). This makes no change to the other parts of the
> kernel protected by CONFIG_X86_TSC.
CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
compile time option and using dynamic detection. Something like (not
recently tested and I believe against 2.5.something, but you get the
idea):
diff -Nru a/include/asm-i386/timex.h b/include/asm-i386/timex.h
--- a/include/asm-i386/timex.h Mon Feb 24 21:09:32 2003
+++ b/include/asm-i386/timex.h Mon Feb 24 21:09:32 2003
@@ -40,14 +40,10 @@
 
 static inline cycles_t get_cycles (void)
 {
-#ifndef CONFIG_X86_TSC
-	return 0;
-#else
-	unsigned long long ret;
-
-	rdtscll(ret);
+	unsigned long long ret = 0;
+	if(cpu_has_tsc)
+		rdtscll(ret);
 	return ret;
-#endif
 }
 
 extern unsigned long cpu_khz;
thanks
-john
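
For readers who want to try the dynamic-detection idea outside the kernel, the following is a rough user-space analogue, not the patch itself. It assumes GCC or Clang on x86 (for <cpuid.h> and <x86intrin.h>); demo_cpu_has_tsc() and demo_get_cycles() are hypothetical names for this sketch. The TSC feature flag is CPUID leaf 1, EDX bit 4. On non-x86 builds the sketch simply reports no TSC and returns 0.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>      /* __get_cpuid(), GCC/Clang only */
#include <x86intrin.h>  /* __rdtsc() */
#endif

typedef uint64_t cycles_t;

/* User-space stand-in for cpu_has_tsc: CPUID leaf 1, EDX bit 4. */
int demo_cpu_has_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 0;
	return (edx >> 4) & 1;
#else
	return 0;	/* not x86: report no TSC */
#endif
}

/* Same shape as the patched get_cycles(): 0 unless a TSC is present. */
cycles_t demo_get_cycles(void)
{
	cycles_t ret = 0;

#if defined(__i386__) || defined(__x86_64__)
	if (demo_cpu_has_tsc())
		ret = __rdtsc();
#endif
	return ret;
}
```

Note that this sketch executes CPUID on every call, which is slow. In the kernel, cpu_has_tsc is a test against feature bits already collected at boot, so the runtime check there is far cheaper than it is here.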
* Re: get_cycles() on i386
2003-11-04 23:27 ` john stultz
@ 2003-11-04 23:54 ` Linus Torvalds
2003-11-05 2:53 ` Nick Piggin
2003-11-05 13:35 ` Marcelo Tosatti
2003-12-05 15:55 ` Marcelo Tosatti
1 sibling, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2003-11-04 23:54 UTC
To: john stultz; +Cc: Joel Becker, lkml, Marcelo Tosatti
On 4 Nov 2003, john stultz wrote:
>
> CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
> compile time option and using dynamic detection. Something like (not
> recently tested and i believe against 2.5.something, but you get the
> idea):
Some of the users are really timing-critical (eg scheduler).
How about just using the "alternative()" infrastructure that we already
have in 2.6.x for this? See <asm-i386/system.h> for details.
We don't have an "alternative_output()" available yet, but using that it
would look something like:
	static inline unsigned long long get_cycle(void)
	{
		unsigned long long tsc;

		alternative_output(
			"xorl %%eax,%%eax ; xorl %%edx,%%edx",
			"rdtsc",
			X86_FEATURE_TSC,
			"=A" (tsc));
		return tsc;
	}
which should allow for "perfect" code (well, gcc tends to mess up 64-bit
stuff, but you get the idea).
We use the "alternative_input()" thing for prefetch() handling (see
<asm-i386/processor.h>).
Linus
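
The idea behind alternative() is that the default instructions are overwritten once, at boot, when the CPU advertises the feature, so no branch remains in the hot path. The sketch below models only that bookkeeping on a plain byte buffer; it does not patch executable code and is not the kernel's implementation. struct demo_alt, demo_apply_alternative() and DEMO_FEATURE_TSC are hypothetical names for this illustration. The byte values are the standard encodings: 31 c0 / 31 d2 for the two xorl instructions, 0f 31 for rdtsc, and 90 for nop padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_FEATURE_TSC 4	/* hypothetical feature bit number for this sketch */

/* Simplified, hypothetical record describing one patch site. */
struct demo_alt {
	unsigned char *site;		/* default instruction bytes */
	size_t site_len;
	const unsigned char *repl;	/* replacement instruction bytes */
	size_t repl_len;
	unsigned int feature;		/* feature bit that enables the replacement */
};

/* Copy the replacement over the site if the feature is set, padding with nops. */
void demo_apply_alternative(const struct demo_alt *a, unsigned long features)
{
	assert(a->repl_len <= a->site_len);
	if (!(features & (1UL << a->feature)))
		return;
	memcpy(a->site, a->repl, a->repl_len);
	memset(a->site + a->repl_len, 0x90, a->site_len - a->repl_len);
}

/* xorl %eax,%eax ; xorl %edx,%edx */
unsigned char demo_site[4] = { 0x31, 0xc0, 0x31, 0xd2 };

/* rdtsc */
const unsigned char demo_rdtsc[2] = { 0x0f, 0x31 };

const struct demo_alt demo_entry = {
	demo_site, sizeof demo_site,
	demo_rdtsc, sizeof demo_rdtsc,
	DEMO_FEATURE_TSC
};
```

Calling demo_apply_alternative(&demo_entry, 1UL << DEMO_FEATURE_TSC) leaves demo_site holding rdtsc followed by two nops; calling it with a feature mask of 0 leaves the zeroing instructions in place, which corresponds to get_cycles() returning 0.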
* Re: get_cycles() on i386
2003-11-04 23:54 ` Linus Torvalds
@ 2003-11-05 2:53 ` Nick Piggin
2003-11-05 3:02 ` Nick Piggin
2003-11-05 13:35 ` Marcelo Tosatti
1 sibling, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2003-11-05 2:53 UTC
To: Linus Torvalds; +Cc: john stultz, Joel Becker, lkml, Marcelo Tosatti
Linus Torvalds wrote:
>On 4 Nov 2003, john stultz wrote:
>
>>CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
>>compile time option and using dynamic detection. Something like (not
>>recently tested and i believe against 2.5.something, but you get the
>>idea):
>>
>
>Some of the users are really timing-critical (eg scheduler).
>
The scheduler uses its own sched_clock which only gives jiffies
resolution if CONFIG_NUMA is defined. Unfortunate because I think
its interactive behaviour isn't so good with ms resolution.
The scheduler does not need to have synchronised TSCs though, I think.
It just means 2 more calls to sched_clock in a slow path (smp migration).
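
To put rough numbers on the resolution gap mentioned above, here is a small arithmetic sketch. It assumes HZ = 1000 (consistent with the "ms resolution" remark) and a hypothetical 2 GHz CPU; demo_ns_per_tick() and demo_cycles_to_ns() are names invented for this illustration, and the cpu_khz parameter merely mirrors the kernel variable of the same name.

```c
#include <stdint.h>

/* Nanoseconds per tick of a counter running at ticks_per_sec (integer division). */
uint64_t demo_ns_per_tick(uint64_t ticks_per_sec)
{
	return 1000000000ULL / ticks_per_sec;
}

/* Convert a TSC delta to nanoseconds, given the CPU frequency in kHz. */
uint64_t demo_cycles_to_ns(uint64_t cycles, uint64_t cpu_khz)
{
	return cycles * 1000000ULL / cpu_khz;
}
```

Under these assumptions demo_ns_per_tick(1000) is 1000000, i.e. one jiffy per millisecond, while demo_cycles_to_ns(2000000, 2000000) shows that the same millisecond spans two million TSC cycles on the hypothetical 2 GHz part.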
* Re: get_cycles() on i386
2003-11-05 2:53 ` Nick Piggin
@ 2003-11-05 3:02 ` Nick Piggin
0 siblings, 0 replies; 7+ messages in thread
From: Nick Piggin @ 2003-11-05 3:02 UTC
To: Linus Torvalds; +Cc: john stultz, Joel Becker, lkml, Marcelo Tosatti
Nick Piggin wrote:
>
>
> Linus Torvalds wrote:
>
>> On 4 Nov 2003, john stultz wrote:
>>
>>> CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
>>> compile time option and using dynamic detection. Something like (not
>>> recently tested and i believe against 2.5.something, but you get the
>>> idea):
>>>
>>
>> Some of the users are really timing-critical (eg scheduler).
>>
>
> The scheduler uses its own sched_clock which only gives jiffies
> resolution if CONFIG_NUMA is defined. Unfortunate because I think
> its interactive behaviour isn't so good with ms resolution.
>
> The scheduler does not need to have synchronised TSCs though, I think.
> It just means 2 more calls to sched_clock in a slow path (smp migration).
>
Well no, it's much trickier than that I think :(
* Re: get_cycles() on i386
2003-11-04 23:54 ` Linus Torvalds
2003-11-05 2:53 ` Nick Piggin
@ 2003-11-05 13:35 ` Marcelo Tosatti
1 sibling, 0 replies; 7+ messages in thread
From: Marcelo Tosatti @ 2003-11-05 13:35 UTC
To: Linus Torvalds; +Cc: john stultz, Joel Becker, lkml, Marcelo Tosatti
On Tue, 4 Nov 2003, Linus Torvalds wrote:
>
> On 4 Nov 2003, john stultz wrote:
> >
> > CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
> > compile time option and using dynamic detection. Something like (not
> > recently tested and i believe against 2.5.something, but you get the
> > idea):
>
> Some of the users are really timing-critical (eg scheduler).
>
> How about just using the "alternative()" infrastructure that we already
> have in 2.6.x for this? See <asm-i386/system.h> for details.
>
> We don't have an "alternative_output()" available yet, but using that it
> would look something like:
>
> 	static inline unsigned long long get_cycle(void)
> 	{
> 		unsigned long long tsc;
>
> 		alternative_output(
> 			"xorl %%eax,%%eax ; xorl %%edx,%%edx",
> 			"rdtsc",
> 			X86_FEATURE_TSC,
> 			"=A" (tsc));
> 		return tsc;
> 	}
>
> which should allow for "perfect" code (well, gcc tends to mess up 64-bit
> stuff, but you get the idea).
>
> We use the "alternative_input()" thing for prefetch() handling (see
> <asm-i386/processor.h>).
I'm not confident this is something for 2.4.
The "if (cpu_has_tsc)" fix from John sounds fine.
* Re: get_cycles() on i386
2003-11-04 23:27 ` john stultz
2003-11-04 23:54 ` Linus Torvalds
@ 2003-12-05 15:55 ` Marcelo Tosatti
1 sibling, 0 replies; 7+ messages in thread
From: Marcelo Tosatti @ 2003-12-05 15:55 UTC
To: john stultz; +Cc: Joel Becker, lkml, Marcelo Tosatti
Any concerns?
On 4 Nov 2003, john stultz wrote:
> On Tue, 2003-11-04 at 15:22, Joel Becker wrote:
> > Folks,
> > Certain distributions are building all of their SMP kernels
> > NUMA-aware. This is great, as the kernels support boxes like the x440
> > with no trouble. However, this implicitly disables CONFIG_X86_TSC.
> > While that is good for NUMA systems, and fine from a kernel timing
> > standpoint, it also eliminates any generic access to the TSC via
> > get_cycles(). With CONFIG_X86_TSC not defined, get_cycles() always
> > returns 0.
> > Given that >95% of machines will not be x440s, this means that a
> > user of that kernel cannot access a high resolution timer via
> > get_cycles(). I don't want to have to litter my code with rdtscll()
> > when I managed to remove it!
> > The proposed patch is trivial. If the system has a TSC, it is
> > available via get_cycles(). This makes no change to the other parts of the
> > kernel protected by CONFIG_X86_TSC.
>
> CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
> compile time option and using dynamic detection. Something like (not
> recently tested and i believe against 2.5.something, but you get the
> idea):
>
>
> diff -Nru a/include/asm-i386/timex.h b/include/asm-i386/timex.h
> --- a/include/asm-i386/timex.h Mon Feb 24 21:09:32 2003
> +++ b/include/asm-i386/timex.h Mon Feb 24 21:09:32 2003
> @@ -40,14 +40,10 @@
>
>  static inline cycles_t get_cycles (void)
>  {
> -#ifndef CONFIG_X86_TSC
> -	return 0;
> -#else
> -	unsigned long long ret;
> -
> -	rdtscll(ret);
> +	unsigned long long ret = 0;
> +	if(cpu_has_tsc)
> +		rdtscll(ret);
>  	return ret;
> -#endif
>  }
>
>  extern unsigned long cpu_khz;
John, Joel,
I believe this is reliable. I'll apply it.
Any concerns?