linux-kernel.vger.kernel.org archive mirror
* get_cycles() on i386
@ 2003-11-04 23:22 Joel Becker
  2003-11-04 23:27 ` john stultz
  0 siblings, 1 reply; 7+ messages in thread
From: Joel Becker @ 2003-11-04 23:22 UTC
  To: linux-kernel, Marcelo Tosatti, john stultz

Folks,
	Certain distributions are building all of their SMP kernels
NUMA-aware.  This is great, as the kernels support boxes like the x440
with no trouble.  However, this implicitly disables CONFIG_X86_TSC.
While that is good for NUMA systems, and fine from a kernel timing
standpoint, it also eliminates any generic access to the TSC via
get_cycles().  With CONFIG_X86_TSC not defined, get_cycles() always
returns 0.
	Given that >95% of machines will not be x440s, this means that a
user of that kernel cannot access a high resolution timer via
get_cycles().  I don't want to have to litter my code with rdtscll()
when I managed to remove it!
	The proposed patch is trivial.  If the system has a TSC, it is
available via get_cycles().  This makes no change to the other parts of the
kernel protected by CONFIG_X86_TSC.

Joel

diff -uNr ../kernel-2.4.21-4.0.1.EL/linux-2.4.21/include/asm-i386/timex.h linux-2.4.21/include/asm-i386/timex.h
--- ../kernel-2.4.21-4.0.1.EL/linux-2.4.21/include/asm-i386/timex.h	2002-11-28 15:53:15.000000000 -0800
+++ linux-2.4.21/include/asm-i386/timex.h	2003-11-04 11:33:08.000000000 -0800
@@ -40,7 +40,7 @@
 
 static inline cycles_t get_cycles (void)
 {
-#ifndef CONFIG_X86_TSC
+#ifndef CONFIG_X86_HAS_TSC
 	return 0;
 #else
 	unsigned long long ret;


-- 

"Hey mister if you're gonna walk on water,
 Could you drop a line my way?"

Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127


* Re: get_cycles() on i386
  2003-11-04 23:22 get_cycles() on i386 Joel Becker
@ 2003-11-04 23:27 ` john stultz
  2003-11-04 23:54   ` Linus Torvalds
  2003-12-05 15:55   ` Marcelo Tosatti
  0 siblings, 2 replies; 7+ messages in thread
From: john stultz @ 2003-11-04 23:27 UTC
  To: Joel Becker; +Cc: lkml, Marcelo Tosatti

On Tue, 2003-11-04 at 15:22, Joel Becker wrote:
> Folks,
> 	Certain distributions are building all of their SMP kernels
> NUMA-aware.  This is great, as the kernels support boxes like the x440
> with no trouble.  However, this implicitly disables CONFIG_X86_TSC.
> While that is good for NUMA systems, and fine from a kernel timing
> standpoint, it also eliminates any generic access to the TSC via
> get_cycles().  With CONFIG_X86_TSC not defined, get_cycles() always
> returns 0.
> 	Given that >95% of machines will not be x440s, this means that a
> user of that kernel cannot access a high resolution timer via
> get_cycles().  I don't want to have to litter my code with rdtscll()
> when I managed to remove it!
> 	The proposed patch is trivial.  If the system has a TSC, it is
> available via get_cycles().  This makes no change to the other parts of the
> kernel protected by CONFIG_X86_TSC.

CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
compile time option and using dynamic detection. Something like (not
recently tested and i believe against 2.5.something, but you get the
idea):


diff -Nru a/include/asm-i386/timex.h b/include/asm-i386/timex.h
--- a/include/asm-i386/timex.h	Mon Feb 24 21:09:32 2003
+++ b/include/asm-i386/timex.h	Mon Feb 24 21:09:32 2003
@@ -40,14 +40,10 @@
 
 static inline cycles_t get_cycles (void)
 {
-#ifndef CONFIG_X86_TSC
-	return 0;
-#else
-	unsigned long long ret;
-
-	rdtscll(ret);
+	unsigned long long ret = 0;
+	if(cpu_has_tsc)
+		rdtscll(ret);
 	return ret;
-#endif
 }
 
 extern unsigned long cpu_khz;


thanks
-john
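
For anyone who wants to try the runtime-detection idea outside the kernel, here is a minimal userspace sketch in C. It is not the patch above: demo_cpu_has_tsc() and demo_get_cycles() are hypothetical names, the CPUID check is only a stand-in for the kernel's cpu_has_tsc, and it assumes GCC or Clang for <cpuid.h> and <x86intrin.h>. On non-x86 targets it returns 0, mirroring the no-TSC case.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>      /* __get_cpuid(), provided by GCC and Clang */
#include <x86intrin.h>  /* __rdtsc() */
#endif

/*
 * Hypothetical userspace stand-in for the kernel's cpu_has_tsc:
 * CPUID leaf 1 reports TSC support in bit 4 of EDX.
 */
static int demo_cpu_has_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the requested leaf is unsupported */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 0;
	return (edx >> 4) & 1;
#else
	return 0;
#endif
}

/* Same shape as the patched get_cycles(): 0 unless a TSC is present */
static uint64_t demo_get_cycles(void)
{
	uint64_t ret = 0;

#if defined(__i386__) || defined(__x86_64__)
	if (demo_cpu_has_tsc())
		ret = __rdtsc();
#endif
	return ret;
}
```

As with the patch, a return value of 0 means "no cycle counter available", so callers still have to handle that case. The sketch also repeats the CPUID query on every call, which the kernel avoids by caching the feature bits at boot.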




* Re: get_cycles() on i386
  2003-11-04 23:27 ` john stultz
@ 2003-11-04 23:54   ` Linus Torvalds
  2003-11-05  2:53     ` Nick Piggin
  2003-11-05 13:35     ` Marcelo Tosatti
  2003-12-05 15:55   ` Marcelo Tosatti
  1 sibling, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2003-11-04 23:54 UTC
  To: john stultz; +Cc: Joel Becker, lkml, Marcelo Tosatti


On 4 Nov 2003, john stultz wrote:
> 
> CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
> compile time option and using dynamic detection. Something like (not
> recently tested and i believe against 2.5.something, but you get the
> idea):

Some of the users are really timing-critical (eg scheduler).

How about just using the "alternative()" infrastructure that we already 
have in 2.6.x for this? See <asm-i386/system.h> for details.

We don't have an "alternative_output()" available yet, but using that it
would look something like:

	static inline unsigned long long get_cycles(void)
	{
		unsigned long long tsc;

		alternative_output(
			"xorl %%eax,%%eax ; xorl %%edx,%%edx",
			"rdtsc",
			X86_FEATURE_TSC,
			"=A" (tsc));
		return tsc;
	}

which should allow for "perfect" code (well, gcc tends to mess up 64-bit 
stuff, but you get the idea).

We use the "alternative_input()" thing for prefetch() handling (see 
<asm-i386/processor.h>).

		Linus
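
On the 64-bit output remark: the "=A" constraint names the edx:eax pair on 32-bit i386, but on x86-64 GCC treats "A" as either rax or rdx, so it does not assemble the two halves there. A rough userspace sketch that sidesteps this by taking the halves as separate outputs is shown below. demo_rdtsc() is a hypothetical name, the code assumes a TSC that is present and readable from user mode, and it does not attempt the boot-time patching that alternative() performs.

```c
#include <stdint.h>

/* Hypothetical helper: read the TSC with explicit low/high outputs */
static inline uint64_t demo_rdtsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t lo, hi;

	/* rdtsc puts the low 32 bits in EAX and the high 32 bits in EDX */
	__asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
	return ((uint64_t)hi << 32) | lo;
#else
	/* No TSC on this architecture: same result as the xorl fallback */
	return 0;
#endif
}
```

Splitting the outputs costs nothing on i386, where the compiler can keep the pair in edx:eax, and gives the same result on x86-64.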



* Re: get_cycles() on i386
  2003-11-04 23:54   ` Linus Torvalds
@ 2003-11-05  2:53     ` Nick Piggin
  2003-11-05  3:02       ` Nick Piggin
  2003-11-05 13:35     ` Marcelo Tosatti
  1 sibling, 1 reply; 7+ messages in thread
From: Nick Piggin @ 2003-11-05  2:53 UTC
  To: Linus Torvalds; +Cc: john stultz, Joel Becker, lkml, Marcelo Tosatti



Linus Torvalds wrote:

>On 4 Nov 2003, john stultz wrote:
>
>>CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
>>compile time option and using dynamic detection. Something like (not
>>recently tested and i believe against 2.5.something, but you get the
>>idea):
>>
>
>Some of the users are really timing-critical (eg scheduler).
>

The scheduler uses its own sched_clock which only gives jiffies
resolution if CONFIG_NUMA is defined. Unfortunate because I think
its interactive behaviour isn't so good with ms resolution.

The scheduler does not need to have synchronised TSCs though, I think.
It just means 2 more calls to sched_clock in a slow path (smp migration).
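
To put numbers on the resolution gap, here is a small arithmetic sketch in C. The names are hypothetical and DEMO_HZ = 1000 is an assumption about the tick rate, chosen to match the "ms resolution" mentioned above. It is not the kernel's sched_clock(), only the unit conversion behind the comparison.

```c
#include <stdint.h>

#define DEMO_HZ 1000ULL  /* assumed tick rate: one jiffy per millisecond */

/* Jiffies-based clock: the smallest step is one full tick */
static uint64_t demo_jiffies_to_ns(uint64_t jiffies)
{
	return jiffies * (1000000000ULL / DEMO_HZ);
}

/*
 * TSC-based clock: one cycle lasts 1e6 / cpu_khz nanoseconds.
 * The multiply can overflow for very large cycle counts; that is
 * ignored here to keep the sketch short.
 */
static uint64_t demo_cycles_to_ns(uint64_t cycles, uint64_t cpu_khz)
{
	return cycles * 1000000ULL / cpu_khz;
}
```

With these assumptions one jiffy is 1,000,000 ns, while 2000 cycles on a 2 GHz CPU (cpu_khz = 2000000) come to 1000 ns, so the jiffies-based clock is coarser by several orders of magnitude.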




* Re: get_cycles() on i386
  2003-11-05  2:53     ` Nick Piggin
@ 2003-11-05  3:02       ` Nick Piggin
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Piggin @ 2003-11-05  3:02 UTC
  To: Linus Torvalds; +Cc: john stultz, Joel Becker, lkml, Marcelo Tosatti



Nick Piggin wrote:

>
>
> Linus Torvalds wrote:
>
>> On 4 Nov 2003, john stultz wrote:
>>
>>> CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
>>> compile time option and using dynamic detection. Something like (not
>>> recently tested and i believe against 2.5.something, but you get the
>>> idea):
>>>
>>
>> Some of the users are really timing-critical (eg scheduler).
>>
>
> The scheduler uses its own sched_clock which only gives jiffies
> resolution if CONFIG_NUMA is defined. Unfortunate because I think
> its interactive behaviour isn't so good with ms resolution.
>
> The scheduler does not need to have synchronised TSCs though, I think.
> It just means 2 more calls to sched_clock in a slow path (smp migration).
>
Well no, it's much trickier than that I think :(





* Re: get_cycles() on i386
  2003-11-04 23:54   ` Linus Torvalds
  2003-11-05  2:53     ` Nick Piggin
@ 2003-11-05 13:35     ` Marcelo Tosatti
  1 sibling, 0 replies; 7+ messages in thread
From: Marcelo Tosatti @ 2003-11-05 13:35 UTC
  To: Linus Torvalds; +Cc: john stultz, Joel Becker, lkml, Marcelo Tosatti



On Tue, 4 Nov 2003, Linus Torvalds wrote:

> 
> On 4 Nov 2003, john stultz wrote:
> > 
> > CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
> > compile time option and using dynamic detection. Something like (not
> > recently tested and i believe against 2.5.something, but you get the
> > idea):
> 
> Some of the users are really timing-critical (eg scheduler).
> 
> How about just using the "alternative()" infrastructure that we already 
> have in 2.6.x for this? See <asm-i386/system.h> for details.
> 
> We don't have an "alternative_output()" available yet, but using that it
> would look something like:
> 
> 	static inline unsigned long long get_cycles(void)
> 	{
> 		unsigned long long tsc;
> 
> 		alternative_output(
> 			"xorl %%eax,%%eax ; xorl %%edx,%%edx",
> 			"rdtsc",
> 			X86_FEATURE_TSC,
> 			"=A" (tsc));
> 		return tsc;
> 	}
> 
> which should allow for "perfect" code (well, gcc tends to mess up 64-bit 
> stuff, but you get the idea).
> 
> We use the "alternative_input()" thing for prefetch() handling (see 
> <asm-i386/processor.h>).

I'm not confident this is something for 2.4.

The "if (cpu_has_tsc)" fix from John sounds fine. 



* Re: get_cycles() on i386
  2003-11-04 23:27 ` john stultz
  2003-11-04 23:54   ` Linus Torvalds
@ 2003-12-05 15:55   ` Marcelo Tosatti
  1 sibling, 0 replies; 7+ messages in thread
From: Marcelo Tosatti @ 2003-12-05 15:55 UTC
  To: john stultz; +Cc: Joel Becker, lkml, Marcelo Tosatti




Any concerns? 

On 4 Nov 2003, john stultz wrote:

> On Tue, 2003-11-04 at 15:22, Joel Becker wrote:
> > Folks,
> > 	Certain distributions are building all of their SMP kernels
> > NUMA-aware.  This is great, as the kernels support boxes like the x440
> > with no trouble.  However, this implicitly disables CONFIG_X86_TSC.
> > While that is good for NUMA systems, and fine from a kernel timing
> > standpoint, it also eliminates any generic access to the TSC via
> > get_cycles().  With CONFIG_X86_TSC not defined, get_cycles() always
> > returns 0.
> > 	Given that >95% of machines will not be x440s, this means that a
> > user of that kernel cannot access a high resolution timer via
> > get_cycles().  I don't want to have to litter my code with rdtscll()
> > when I managed to remove it!
> > 	The proposed patch is trivial.  If the system has a TSC, it is
> > available via get_cycles().  This makes no change to the other parts of the
> > kernel protected by CONFIG_X86_TSC.
> 
> CONFIG_X86_TSC be the devil. Personally, I'd much prefer dropping the
> compile time option and using dynamic detection. Something like (not
> recently tested and i believe against 2.5.something, but you get the
> idea):
> 
> 
> diff -Nru a/include/asm-i386/timex.h b/include/asm-i386/timex.h
> --- a/include/asm-i386/timex.h	Mon Feb 24 21:09:32 2003
> +++ b/include/asm-i386/timex.h	Mon Feb 24 21:09:32 2003
> @@ -40,14 +40,10 @@
>  
>  static inline cycles_t get_cycles (void)
>  {
> -#ifndef CONFIG_X86_TSC
> -	return 0;
> -#else
> -	unsigned long long ret;
> -
> -	rdtscll(ret);
> +	unsigned long long ret = 0;
> +	if(cpu_has_tsc)
> +		rdtscll(ret);
>  	return ret;
> -#endif
>  }
>  
>  extern unsigned long cpu_khz;

John, Joel, 

I believe this is reliable. I'll apply it.

Any concerns? 



