linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* x86_32 tsc/pit and hrtimers
@ 2008-10-07 22:41 Jeff Hansen
  2008-10-08 18:41 ` Chris Snook
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff Hansen @ 2008-10-07 22:41 UTC (permalink / raw)
  To: torvalds, linux-kernel, mingo

Linus, Ingo, All,

I've been struggling with hrtimer support in 2.6.26.5 on an older 
x86_32/i386 system, and I'm wondering if there are any easy fixes that you
(or anyone else) would suggest.

Basically, this system does not print out the message:

"Switched to high resolution mode on CPU 0"

indicating that one-shot, hrtimers, etc. won't work, since high resolution
mode has not been enabled.  I've verified that hrtimers started with
hrtimer_start do not have the expected resolution further than 1/HZ.

This system does not have LAPIC, ACPI, or HPET, so really the only
clocksources I can use are TSC and PIT.  This should be fine (in theory,
unless it wasn't designed like that), but apparently the clocksource flags
are not initialized in such a way that one of them ever gets marked as
CLOCK_SOURCE_VALID_FOR_HRES.

The flow of the flags on each of these clocksources is as follows:

1) The flags on the TSC clocksource are CLOCK_SOURCE_IS_CONTINUOUS |
     CLOCK_SOURCE_MUST_VERIFY, which causes PIT to be used as the watchdog
     clocksource. (see kernel/time/clocksource.c:~171)
2) Around line 122 in kernel/time/clocksource.c, where most clocksources'
     flags usually get ORed with CLOCK_SOURCE_VALID_FOR_HRES, the PIT's do
     not because it is not CLOCK_SOURCE_IS_CONTINUOUS, and the TSC's do not
     also because the PIT (as the watchdog) is not
     CLOCK_SOURCE_IS_CONTINUOUS.

I get the same results on a new laptop booting into 32-bit Linux with hpet
and acpi disabled.

Can you please tell me if this is supposed to work, and I just have a
poorly configured kernel; or if TSC/PIT drivers were not designed to work
this way in the first place.  If it wasn't designed to do this, do you
have any tips on implementing this, since I'll be needing to do that?

-Jeff Hansen

---------------------------------------------------
"If someone's gotta do it, it might as well be me."

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: x86_32 tsc/pit and hrtimers
  2008-10-07 22:41 x86_32 tsc/pit and hrtimers Jeff Hansen
@ 2008-10-08 18:41 ` Chris Snook
  2008-10-08 19:46   ` Jeff Hansen
  2008-10-09 17:20   ` Pavel Machek
  0 siblings, 2 replies; 23+ messages in thread
From: Chris Snook @ 2008-10-08 18:41 UTC (permalink / raw)
  To: Jeff Hansen; +Cc: torvalds, linux-kernel, mingo

Jeff Hansen wrote:
> Linus, Ingo, All,
> 
> I've been struggling with hrtimer support in 2.6.26.5 on an older 
> x86_32/i386 system, and I'm wondering if there are any easy fixes that you
> (or anyone else) would suggest.
> 
> Basically, this system does not print out the message:
> 
> "Switched to high resolution mode on CPU 0"
> 
> indicating that one-shot, hrtimers, etc. won't work, since high resolution
> mode has not been enabled.  I've verified that hrtimers started with
> hrtimer_start do not have the expected resolution further than 1/HZ.
> 
> This system does not have LAPIC, ACPI, or HPET, so really the only
> clocksources I can use are TSC and PIT.  This should be fine (in theory,
> unless it wasn't designed like that), but apparently the clocksource flags
> are not initialized in such a way that one of them ever gets marked as
> CLOCK_SOURCE_VALID_FOR_HRES.
> 
> The flow of the flags on each of these clocksources is as follows:
> 
> 1) The flags on the TSC clocksource are CLOCK_SOURCE_IS_CONTINUOUS |
>     CLOCK_SOURCE_MUST_VERIFY, which causes PIT to be used as the watchdog
>     clocksource. (see kernel/time/clocksource.c:~171)
> 2) Around line 122 in kernel/time/clocksource.c, where most clocksources'
>     flags usually get ORed with CLOCK_SOURCE_VALID_FOR_HRES, the PIT's do
>     not because it is not CLOCK_SOURCE_IS_CONTINUOUS, and the TSC's do not
>     also because the PIT (as the watchdog) is not
>     CLOCK_SOURCE_IS_CONTINUOUS.
> 
> I get the same results on a new laptop booting into 32-bit Linux with hpet
> and acpi disabled.
> 
> Can you please tell me if this is supposed to work, and I just have a
> poorly configured kernel; or if TSC/PIT drivers were not designed to work
> this way in the first place.  If it wasn't designed to do this, do you
> have any tips on implementing this, since I'll be needing to do that?
> 
> -Jeff Hansen

This is not supposed to work, but it might be worthwhile to add a boot option to 
force the kernel to trust the TSC, as hardware that lacks any high-res timers 
also tends to be primitive enough that the TSC can be trusted, if it exists.  If 
you patch out the CLOCK_SOURCE_MUST_VERIFY flag on the TSC, do you get 
correctly-functioning high-res timers on this system?

-- Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: x86_32 tsc/pit and hrtimers
  2008-10-08 18:41 ` Chris Snook
@ 2008-10-08 19:46   ` Jeff Hansen
  2008-10-08 20:25     ` Chris Snook
  2008-10-09 17:20   ` Pavel Machek
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff Hansen @ 2008-10-08 19:46 UTC (permalink / raw)
  To: Chris Snook; +Cc: torvalds, linux-kernel, mingo

Chris,

This worked perfectly!  I second adding a kernel option that forces 
trusting the TSC.  I can make a patch if you'd like.  Should the option be 
something like "trusttsc" or "tsc=noverify"?

-Jeff

---------------------------------------------------
"If someone's gotta do it, it might as well be me."

On Wed, 8 Oct 2008, Chris Snook wrote:

> Jeff Hansen wrote:
>>  Linus, Ingo, All,
>>
>>  I've been struggling with hrtimer support in 2.6.26.5 on an older
>>  x86_32/i386 system, and I'm wondering if there are any easy fixes that you
>>  (or anyone else) would suggest.
>>
>>  Basically, this system does not print out the message:
>>
>>  "Switched to high resolution mode on CPU 0"
>>
>>  indicating that one-shot, hrtimers, etc. won't work, since high resolution
>>  mode has not been enabled.  I've verified that hrtimers started with
>>  hrtimer_start do not have the expected resolution further than 1/HZ.
>>
>>  This system does not have LAPIC, ACPI, or HPET, so really the only
>>  clocksources I can use are TSC and PIT.  This should be fine (in theory,
>>  unless it wasn't designed like that), but apparently the clocksource flags
>>  are not initialized in such a way that one of them ever gets marked as
>>  CLOCK_SOURCE_VALID_FOR_HRES.
>>
>>  The flow of the flags on each of these clocksources is as follows:
>>
>>  1) The flags on the TSC clocksource are CLOCK_SOURCE_IS_CONTINUOUS |
>>      CLOCK_SOURCE_MUST_VERIFY, which causes PIT to be used as the watchdog
>>      clocksource. (see kernel/time/clocksource.c:~171)
>>  2) Around line 122 in kernel/time/clocksource.c, where most clocksources'
>>      flags usually get ORed with CLOCK_SOURCE_VALID_FOR_HRES, the PIT's do
>>      not because it is not CLOCK_SOURCE_IS_CONTINUOUS, and the TSC's do not
>>      also because the PIT (as the watchdog) is not
>>      CLOCK_SOURCE_IS_CONTINUOUS.
>>
>>  I get the same results on a new laptop booting into 32-bit Linux with hpet
>>  and acpi disabled.
>>
>>  Can you please tell me if this is supposed to work, and I just have a
>>  poorly configured kernel; or if TSC/PIT drivers were not designed to work
>>  this way in the first place.  If it wasn't designed to do this, do you
>>  have any tips on implementing this, since I'll be needing to do that?
>>
>>  -Jeff Hansen
>
> This is not supposed to work, but it might be worthwhile to add a boot option 
> to force the kernel to trust the TSC, as hardware that lacks any high-res 
> timers also tends to be primitive enough that the TSC can be trusted, if it 
> exists.  If you patch out the CLOCK_SOURCE_MUST_VERIFY flag on the TSC, do 
> you get correctly-functioning high-res timers on this system?
>
> -- Chris
>
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: x86_32 tsc/pit and hrtimers
  2008-10-08 19:46   ` Jeff Hansen
@ 2008-10-08 20:25     ` Chris Snook
  2008-10-08 21:43       ` [PATCH] " Jeff Hansen
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Snook @ 2008-10-08 20:25 UTC (permalink / raw)
  To: Jeff Hansen; +Cc: torvalds, linux-kernel, mingo

Jeff Hansen wrote:
> This worked perfectly!  I second adding a kernel option that forces 
> trusting the TSC.  I can make a patch if you'd like.  Should the option 
> be something like "trusttsc" or "tsc=noverify"?

How about a "highres=noverify" flag, to disable all CLOCK_SOURCE_MUST_VERIFY 
checking? People might want to use this with other timers too.

-- Chris

> On Wed, 8 Oct 2008, Chris Snook wrote:
> 
>> Jeff Hansen wrote:
>>>  Linus, Ingo, All,
>>>
>>>  I've been struggling with hrtimer support in 2.6.26.5 on an older
>>>  x86_32/i386 system, and I'm wondering if there are any easy fixes 
>>> that you
>>>  (or anyone else) would suggest.
>>>
>>>  Basically, this system does not print out the message:
>>>
>>>  "Switched to high resolution mode on CPU 0"
>>>
>>>  indicating that one-shot, hrtimers, etc. won't work, since high 
>>> resolution
>>>  mode has not been enabled.  I've verified that hrtimers started with
>>>  hrtimer_start do not have the expected resolution further than 1/HZ.
>>>
>>>  This system does not have LAPIC, ACPI, or HPET, so really the only
>>>  clocksources I can use are TSC and PIT.  This should be fine (in 
>>> theory,
>>>  unless it wasn't designed like that), but apparently the clocksource 
>>> flags
>>>  are not initialized in such a way that one of them ever gets marked as
>>>  CLOCK_SOURCE_VALID_FOR_HRES.
>>>
>>>  The flow of the flags on each of these clocksources is as follows:
>>>
>>>  1) The flags on the TSC clocksource are CLOCK_SOURCE_IS_CONTINUOUS |
>>>      CLOCK_SOURCE_MUST_VERIFY, which causes PIT to be used as the 
>>> watchdog
>>>      clocksource. (see kernel/time/clocksource.c:~171)
>>>  2) Around line 122 in kernel/time/clocksource.c, where most 
>>> clocksources'
>>>      flags usually get ORed with CLOCK_SOURCE_VALID_FOR_HRES, the 
>>> PIT's do
>>>      not because it is not CLOCK_SOURCE_IS_CONTINUOUS, and the TSC's 
>>> do not
>>>      also because the PIT (as the watchdog) is not
>>>      CLOCK_SOURCE_IS_CONTINUOUS.
>>>
>>>  I get the same results on a new laptop booting into 32-bit Linux 
>>> with hpet
>>>  and acpi disabled.
>>>
>>>  Can you please tell me if this is supposed to work, and I just have a
>>>  poorly configured kernel; or if TSC/PIT drivers were not designed to 
>>> work
>>>  this way in the first place.  If it wasn't designed to do this, do you
>>>  have any tips on implementing this, since I'll be needing to do that?
>>>
>>>  -Jeff Hansen
>>
>> This is not supposed to work, but it might be worthwhile to add a boot 
>> option to force the kernel to trust the TSC, as hardware that lacks 
>> any high-res timers also tends to be primitive enough that the TSC can 
>> be trusted, if it exists.  If you patch out the 
>> CLOCK_SOURCE_MUST_VERIFY flag on the TSC, do you get 
>> correctly-functioning high-res timers on this system?
>>
>> -- Chris
>>
>>
>>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-08 20:25     ` Chris Snook
@ 2008-10-08 21:43       ` Jeff Hansen
  2008-10-08 21:47         ` Randy.Dunlap
  2008-10-08 21:47         ` Chris Snook
  0 siblings, 2 replies; 23+ messages in thread
From: Jeff Hansen @ 2008-10-08 21:43 UTC (permalink / raw)
  To: Chris Snook; +Cc: torvalds, linux-kernel, mingo

[HRTIMER]: Add highres=noverify option to bypass clocksource verification

This disregards the CLOCK_SOURCE_MUST_VERIFY flag on all clocksources.
This is particularly useful on legacy x86_32 systems that have no ACPI,
LAPIC, or HPET timers, where only TSC and PIT are available.

Thanks to Chris Snook for suggesting this.
---
  include/linux/clocksource.h |    2 ++
  kernel/hrtimer.c            |    2 ++
  kernel/time/clocksource.c   |    3 ++-
  3 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 55e434f..90ae835 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -104,6 +104,8 @@ extern struct clocksource *clock;	/* current clocksource */
  #define CLOCK_SOURCE_WATCHDOG			0x10
  #define CLOCK_SOURCE_VALID_FOR_HRES		0x20

+extern int clocksource_noverify;
+
  /* simplify initialization of mask field */
  #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? ((1ULL<<(bits))-1) : -1)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ab80515..2fdf59f 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -476,6 +476,8 @@ static int __init setup_hrtimer_hres(char *str)
  		hrtimer_hres_enabled = 0;
  	else if (!strcmp(str, "on"))
  		hrtimer_hres_enabled = 1;
+	else if (!strcmp(str, "noverify"))
+		clocksource_noverify = 1;
  	else
  		return 0;
  	return 1;
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index dadde53..d3c14c1 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -54,6 +54,7 @@ static LIST_HEAD(clocksource_list);
  static DEFINE_SPINLOCK(clocksource_lock);
  static char override_name[32];
  static int finished_booting;
+int clocksource_noverify;

  /* clocksource_done_booting - Called near the end of core bootup
   *
@@ -165,7 +166,7 @@ static void clocksource_check_watchdog(struct clocksource *cs)
  	unsigned long flags;

  	spin_lock_irqsave(&watchdog_lock, flags);
-	if (cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
+	if (!clocksource_noverify && cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
  		int started = !list_empty(&watchdog_list);

  		list_add(&cs->wd_list, &watchdog_list);
-- 
1.5.6.4

On Wed, 8 Oct 2008, Chris Snook wrote:

> Jeff Hansen wrote:
>>  This worked perfectly!  I second adding a kernel option that forces
>>  trusting the TSC.  I can make a patch if you'd like.  Should the option be
>>  something like "trusttsc" or "tsc=noverify"?
>
> How about a "highres=noverify" flag, to disable all CLOCK_SOURCE_MUST_VERIFY 
> checking? People might want to use this with other timers too.
>
> -- Chris
>
>>  On Wed, 8 Oct 2008, Chris Snook wrote:
>> 
>> >  Jeff Hansen wrote:
>> > >   Linus, Ingo, All,
>> > > 
>> > >   I've been struggling with hrtimer support in 2.6.26.5 on an older
>> > >   x86_32/i386 system, and I'm wondering if there are any easy fixes 
>> > >  that you
>> > >   (or anyone else) would suggest.
>> > > 
>> > >   Basically, this system does not print out the message:
>> > > 
>> > >   "Switched to high resolution mode on CPU 0"
>> > > 
>> > >  indicating that one-shot, hrtimers, etc. won't work, since high 
>> > >  resolution
>> > >   mode has not been enabled.  I've verified that hrtimers started with
>> > >   hrtimer_start do not have the expected resolution further than 1/HZ.
>> > > 
>> > >   This system does not have LAPIC, ACPI, or HPET, so really the only
>> > >   clocksources I can use are TSC and PIT.  This should be fine (in 
>> > >  theory,
>> > >  unless it wasn't designed like that), but apparently the clocksource 
>> > >  flags
>> > >   are not initialized in such a way that one of them ever gets marked 
>> > >   as
>> > >   CLOCK_SOURCE_VALID_FOR_HRES.
>> > > 
>> > >   The flow of the flags on each of these clocksources is as follows:
>> > > 
>> > >   1) The flags on the TSC clocksource are CLOCK_SOURCE_IS_CONTINUOUS |
>> > >       CLOCK_SOURCE_MUST_VERIFY, which causes PIT to be used as the 
>> > >  watchdog
>> > >       clocksource. (see kernel/time/clocksource.c:~171)
>> > >  2) Around line 122 in kernel/time/clocksource.c, where most 
>> > >  clocksources'
>> > >       flags usually get ORed with CLOCK_SOURCE_VALID_FOR_HRES, the 
>> > >  PIT's do
>> > >       not because it is not CLOCK_SOURCE_IS_CONTINUOUS, and the TSC's 
>> > >  do not
>> > >       also because the PIT (as the watchdog) is not
>> > >       CLOCK_SOURCE_IS_CONTINUOUS.
>> > > 
>> > >  I get the same results on a new laptop booting into 32-bit Linux with 
>> > >  hpet
>> > >   and acpi disabled.
>> > > 
>> > >   Can you please tell me if this is supposed to work, and I just have a
>> > >   poorly configured kernel; or if TSC/PIT drivers were not designed to 
>> > >  work
>> > >   this way in the first place.  If it wasn't designed to do this, do 
>> > >   you
>> > >   have any tips on implementing this, since I'll be needing to do that?
>> > > 
>> > >   -Jeff Hansen
>> > 
>> >  This is not supposed to work, but it might be worthwhile to add a boot 
>> >  option to force the kernel to trust the TSC, as hardware that lacks any 
>> >  high-res timers also tends to be primitive enough that the TSC can be 
>> >  trusted, if it exists.  If you patch out the CLOCK_SOURCE_MUST_VERIFY 
>> >  flag on the TSC, do you get correctly-functioning high-res timers on 
>> >  this system?
>> > 
>> >  -- Chris
>> > 
>> > 
>> > 
>
>
>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-08 21:43       ` [PATCH] " Jeff Hansen
@ 2008-10-08 21:47         ` Randy.Dunlap
  2008-10-08 21:47         ` Chris Snook
  1 sibling, 0 replies; 23+ messages in thread
From: Randy.Dunlap @ 2008-10-08 21:47 UTC (permalink / raw)
  To: Jeff Hansen; +Cc: Chris Snook, torvalds, linux-kernel, mingo

On Wed, 8 Oct 2008, Jeff Hansen wrote:

> [HRTIMER]: Add highres=noverify option to bypass clocksource verification
> 
> This disregards the CLOCK_SOURCE_MUST_VERIFY flag on all clocksources.
> This is particularly useful on legacy x86_32 systems that have no ACPI,
> LAPIC, or HPET timers, where only TSC and PIT are available.
> 
> Thanks to Chris Snook for suggesting this.
> ---
>  include/linux/clocksource.h |    2 ++
>  kernel/hrtimer.c            |    2 ++
>  kernel/time/clocksource.c   |    3 ++-
>  3 files changed, 6 insertions(+), 1 deletions(-)

Oops, missing update to Documentation/kernel-parameters.txt.
Please see Documentation/SubmitChecklist :)
Thanks.

> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 55e434f..90ae835 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -104,6 +104,8 @@ extern struct clocksource *clock;	/* current clocksource
> */
>  #define CLOCK_SOURCE_WATCHDOG			0x10
>  #define CLOCK_SOURCE_VALID_FOR_HRES		0x20
> 
> +extern int clocksource_noverify;
> +
>  /* simplify initialization of mask field */
>  #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? ((1ULL<<(bits))-1) :
> -1)
> 
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index ab80515..2fdf59f 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -476,6 +476,8 @@ static int __init setup_hrtimer_hres(char *str)
>  		hrtimer_hres_enabled = 0;
>  	else if (!strcmp(str, "on"))
>  		hrtimer_hres_enabled = 1;
> +	else if (!strcmp(str, "noverify"))
> +		clocksource_noverify = 1;
>  	else
>  		return 0;
>  	return 1;
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index dadde53..d3c14c1 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -54,6 +54,7 @@ static LIST_HEAD(clocksource_list);
>  static DEFINE_SPINLOCK(clocksource_lock);
>  static char override_name[32];
>  static int finished_booting;
> +int clocksource_noverify;
> 
>  /* clocksource_done_booting - Called near the end of core bootup
>   *
> @@ -165,7 +166,7 @@ static void clocksource_check_watchdog(struct clocksource
> *cs)
>  	unsigned long flags;
> 
>  	spin_lock_irqsave(&watchdog_lock, flags);
> -	if (cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
> +	if (!clocksource_noverify && cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
>  		int started = !list_empty(&watchdog_list);
> 
>  		list_add(&cs->wd_list, &watchdog_list);
> 

-- 
~Randy

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-08 21:43       ` [PATCH] " Jeff Hansen
  2008-10-08 21:47         ` Randy.Dunlap
@ 2008-10-08 21:47         ` Chris Snook
  2008-10-08 21:56           ` Jeff Hansen
  1 sibling, 1 reply; 23+ messages in thread
From: Chris Snook @ 2008-10-08 21:47 UTC (permalink / raw)
  To: Jeff Hansen; +Cc: torvalds, linux-kernel, mingo

Jeff Hansen wrote:
> [HRTIMER]: Add highres=noverify option to bypass clocksource verification
> 
> This disregards the CLOCK_SOURCE_MUST_VERIFY flag on all clocksources.
> This is particularly useful on legacy x86_32 systems that have no ACPI,
> LAPIC, or HPET timers, where only TSC and PIT are available.
> 
> Thanks to Chris Snook for suggesting this.
> ---
>  include/linux/clocksource.h |    2 ++
>  kernel/hrtimer.c            |    2 ++
>  kernel/time/clocksource.c   |    3 ++-
>  3 files changed, 6 insertions(+), 1 deletions(-)

Please also update Documentation/kernel-parameters.txt, and resubmit with your 
Signed-off-by tag.

-- Chris

> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 55e434f..90ae835 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -104,6 +104,8 @@ extern struct clocksource *clock;    /* current 
> clocksource */
>  #define CLOCK_SOURCE_WATCHDOG            0x10
>  #define CLOCK_SOURCE_VALID_FOR_HRES        0x20
> 
> +extern int clocksource_noverify;
> +
>  /* simplify initialization of mask field */
>  #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? 
> ((1ULL<<(bits))-1) : -1)
> 
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index ab80515..2fdf59f 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -476,6 +476,8 @@ static int __init setup_hrtimer_hres(char *str)
>          hrtimer_hres_enabled = 0;
>      else if (!strcmp(str, "on"))
>          hrtimer_hres_enabled = 1;
> +    else if (!strcmp(str, "noverify"))
> +        clocksource_noverify = 1;
>      else
>          return 0;
>      return 1;
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index dadde53..d3c14c1 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -54,6 +54,7 @@ static LIST_HEAD(clocksource_list);
>  static DEFINE_SPINLOCK(clocksource_lock);
>  static char override_name[32];
>  static int finished_booting;
> +int clocksource_noverify;
> 
>  /* clocksource_done_booting - Called near the end of core bootup
>   *
> @@ -165,7 +166,7 @@ static void clocksource_check_watchdog(struct 
> clocksource *cs)
>      unsigned long flags;
> 
>      spin_lock_irqsave(&watchdog_lock, flags);
> -    if (cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
> +    if (!clocksource_noverify && cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
>          int started = !list_empty(&watchdog_list);
> 
>          list_add(&cs->wd_list, &watchdog_list);


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-08 21:47         ` Chris Snook
@ 2008-10-08 21:56           ` Jeff Hansen
  2008-10-09  7:14             ` Thomas Gleixner
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff Hansen @ 2008-10-08 21:56 UTC (permalink / raw)
  To: Chris Snook; +Cc: torvalds, linux-kernel, mingo

[HRTIMER]: Add highres=noverify option to bypass clocksource verification

This disregards the CLOCK_SOURCE_MUST_VERIFY flag on all clocksources.
This is particularly useful on legacy x86_32 systems that have no ACPI,
LAPIC, or HPET timers, where only TSC and PIT are available.

Thanks to Chris Snook for suggesting this.

Signed-off-by: Jeff Hansen <jhansen@cardaccess-inc.com>
---
  Documentation/kernel-parameters.txt |    7 ++++---
  include/linux/clocksource.h         |    2 ++
  kernel/hrtimer.c                    |    2 ++
  kernel/time/clocksource.c           |    3 ++-
  4 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b52f47d..9611505 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -730,9 +730,10 @@ and is between 256 and 4096 characters. It is defined in the file
  			highmem otherwise. This also works to reduce highmem
  			size on bigger boxes.

-	highres=	[KNL] Enable/disable high resolution timer mode.
-			Valid parameters: "on", "off"
-			Default: "on"
+	highres=	[KNL] Modify high resolution timer parameters.
+			highres=on: Enable high-resolution timer mode (default)
+			highres=off: Disable high-resolution timer mode
+			highres=noverify: Assume all clocksources are stable

  	hisax=		[HW,ISDN]
  			See Documentation/isdn/README.HiSax.
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 55e434f..90ae835 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -104,6 +104,8 @@ extern struct clocksource *clock;	/* current clocksource */
  #define CLOCK_SOURCE_WATCHDOG			0x10
  #define CLOCK_SOURCE_VALID_FOR_HRES		0x20

+extern int clocksource_noverify;
+
  /* simplify initialization of mask field */
  #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ? ((1ULL<<(bits))-1) : -1)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ab80515..2fdf59f 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -476,6 +476,8 @@ static int __init setup_hrtimer_hres(char *str)
  		hrtimer_hres_enabled = 0;
  	else if (!strcmp(str, "on"))
  		hrtimer_hres_enabled = 1;
+	else if (!strcmp(str, "noverify"))
+		clocksource_noverify = 1;
  	else
  		return 0;
  	return 1;
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index dadde53..d3c14c1 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -54,6 +54,7 @@ static LIST_HEAD(clocksource_list);
  static DEFINE_SPINLOCK(clocksource_lock);
  static char override_name[32];
  static int finished_booting;
+int clocksource_noverify;

  /* clocksource_done_booting - Called near the end of core bootup
   *
@@ -165,7 +166,7 @@ static void clocksource_check_watchdog(struct clocksource *cs)
  	unsigned long flags;

  	spin_lock_irqsave(&watchdog_lock, flags);
-	if (cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
+	if (!clocksource_noverify && cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
  		int started = !list_empty(&watchdog_list);

  		list_add(&cs->wd_list, &watchdog_list);
-- 
1.5.6.4

On Wed, 8 Oct 2008, Chris Snook wrote:

> Jeff Hansen wrote:
>>  [HRTIMER]: Add highres=noverify option to bypass clocksource verification
>>
>>  This disregards the CLOCK_SOURCE_MUST_VERIFY flag on all clocksources.
>>  This is particularly useful on legacy x86_32 systems that have no ACPI,
>>  LAPIC, or HPET timers, where only TSC and PIT are available.
>>
>>  Thanks to Chris Snook for suggesting this.
>>  ---
>>   include/linux/clocksource.h |    2 ++
>>   kernel/hrtimer.c            |    2 ++
>>   kernel/time/clocksource.c   |    3 ++-
>>   3 files changed, 6 insertions(+), 1 deletions(-)
>
> Please also update Documentation/kernel-parameters.txt, and resubmit with 
> your Signed-off-by tag.
>
> -- Chris
>
>>  diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
>>  index 55e434f..90ae835 100644
>>  --- a/include/linux/clocksource.h
>>  +++ b/include/linux/clocksource.h
>>  @@ -104,6 +104,8 @@ extern struct clocksource *clock;    /* current
>>  clocksource */
>>   #define CLOCK_SOURCE_WATCHDOG            0x10
>>   #define CLOCK_SOURCE_VALID_FOR_HRES        0x20
>>
>>  +extern int clocksource_noverify;
>>  +
>>   /* simplify initialization of mask field */
>>   #define CLOCKSOURCE_MASK(bits) (cycle_t)((bits) < 64 ?
>>  ((1ULL<<(bits))-1) : -1)
>>
>>  diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
>>  index ab80515..2fdf59f 100644
>>  --- a/kernel/hrtimer.c
>>  +++ b/kernel/hrtimer.c
>>  @@ -476,6 +476,8 @@ static int __init setup_hrtimer_hres(char *str)
>>           hrtimer_hres_enabled = 0;
>>       else if (!strcmp(str, "on"))
>>           hrtimer_hres_enabled = 1;
>>  +    else if (!strcmp(str, "noverify"))
>>  +        clocksource_noverify = 1;
>>       else
>>           return 0;
>>       return 1;
>>  diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
>>  index dadde53..d3c14c1 100644
>>  --- a/kernel/time/clocksource.c
>>  +++ b/kernel/time/clocksource.c
>>  @@ -54,6 +54,7 @@ static LIST_HEAD(clocksource_list);
>>   static DEFINE_SPINLOCK(clocksource_lock);
>>   static char override_name[32];
>>   static int finished_booting;
>>  +int clocksource_noverify;
>>
>>   /* clocksource_done_booting - Called near the end of core bootup
>>    *
>>  @@ -165,7 +166,7 @@ static void clocksource_check_watchdog(struct
>>  clocksource *cs)
>>       unsigned long flags;
>>
>>       spin_lock_irqsave(&watchdog_lock, flags);
>>  -    if (cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
>>  +    if (!clocksource_noverify && cs->flags & CLOCK_SOURCE_MUST_VERIFY) {
>>           int started = !list_empty(&watchdog_list);
>>
>>           list_add(&cs->wd_list, &watchdog_list);
>
>
>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-08 21:56           ` Jeff Hansen
@ 2008-10-09  7:14             ` Thomas Gleixner
  2008-10-09 18:39               ` Chris Snook
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2008-10-09  7:14 UTC (permalink / raw)
  To: Jeff Hansen; +Cc: Chris Snook, torvalds, linux-kernel, mingo

On Wed, 8 Oct 2008, Jeff Hansen wrote:

> [HRTIMER]: Add highres=noverify option to bypass clocksource verification
> 
> This disregards the CLOCK_SOURCE_MUST_VERIFY flag on all clocksources.
> This is particularly useful on legacy x86_32 systems that have no ACPI,
> LAPIC, or HPET timers, where only TSC and PIT are available.

While I agree in principle, adding this to highres is the wrong thing
to do. This is a property of clocksources and not of high resolution
timer support.

The affected clocksource is TSC and it should go there as a command
line option e.g. tsc=stable, which in turn clears the
CLOCK_SOURCE_MUST_VERIFY flag in the tsc clocksource.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: x86_32 tsc/pit and hrtimers
  2008-10-08 18:41 ` Chris Snook
  2008-10-08 19:46   ` Jeff Hansen
@ 2008-10-09 17:20   ` Pavel Machek
  1 sibling, 0 replies; 23+ messages in thread
From: Pavel Machek @ 2008-10-09 17:20 UTC (permalink / raw)
  To: Chris Snook; +Cc: Jeff Hansen, torvalds, linux-kernel, mingo

Hi!

>> Can you please tell me if this is supposed to work, and I just have a
>> poorly configured kernel; or if TSC/PIT drivers were not designed to work
>> this way in the first place.  If it wasn't designed to do this, do you
>> have any tips on implementing this, since I'll be needing to do that?
>>
> This is not supposed to work, but it might be worthwhile to add a boot 
> option to force the kernel to trust the TSC, as hardware that lacks any 
> high-res timers also tends to be primitive enough that the TSC can be 
> trusted, if it exists.  If you patch out the CLOCK_SOURCE_MUST_VERIFY 
> flag on the TSC, do you get correctly-functioning high-res timers on this 
> system?

Untrue. Unstable tsc dates back to pentium MMX times.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09  7:14             ` Thomas Gleixner
@ 2008-10-09 18:39               ` Chris Snook
  2008-10-09 19:18                 ` Thomas Gleixner
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Snook @ 2008-10-09 18:39 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Jeff Hansen, torvalds, linux-kernel, mingo

Thomas Gleixner wrote:
> On Wed, 8 Oct 2008, Jeff Hansen wrote:
> 
>> [HRTIMER]: Add highres=noverify option to bypass clocksource verification
>>
>> This disregards the CLOCK_SOURCE_MUST_VERIFY flag on all clocksources.
>> This is particularly useful on legacy x86_32 systems that have no ACPI,
>> LAPIC, or HPET timers, where only TSC and PIT are available.
> 
> While I agree in principle, adding this to highres is the wrong thing
> to do. This is a property of clocksources and not of high resolution
> timer support.
> 
> The affected clocksource is TSC and it should go there as a command
> line option e.g. tsc=stable, which in turn clears the
> CLOCK_SOURCE_MUST_VERIFY flag in the tsc clocksource.
> 
> Thanks,
> 
> 	tglx

Fair enough, but do you think it's worthwhile to have an option to 
disable CLOCK_SOURCE_MUST_VERIFY checking for all timers, or should we 
implement this on a case-by-case basis when there's legitimate cause, 
like with TSC?

-- Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 18:39               ` Chris Snook
@ 2008-10-09 19:18                 ` Thomas Gleixner
  2008-10-09 19:45                   ` Jeff Hansen
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2008-10-09 19:18 UTC (permalink / raw)
  To: Chris Snook; +Cc: Jeff Hansen, torvalds, linux-kernel, mingo

On Thu, 9 Oct 2008, Chris Snook wrote:
> 
> Fair enough, but do you think it's worthwhile to have an option to disable
> CLOCK_SOURCE_MUST_VERIFY checking for all timers, or should we implement this
> on a case-by-case basis when there's legitimate cause, like with TSC?

Case-by-case is preferred.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 19:18                 ` Thomas Gleixner
@ 2008-10-09 19:45                   ` Jeff Hansen
  2008-10-09 19:53                     ` Thomas Gleixner
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff Hansen @ 2008-10-09 19:45 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Chris Snook, torvalds, linux-kernel, mingo

OK, so are we all agreed that something like clocksource_trust=tsc 
would be the best?

-Jeff

---------------------------------------------------
"If someone's gotta do it, it might as well be me."

On Thu, 9 Oct 2008, Thomas Gleixner wrote:

> On Thu, 9 Oct 2008, Chris Snook wrote:
>>
>> Fair enough, but do you think it's worthwhile to have an option to disable
>> CLOCK_SOURCE_MUST_VERIFY checking for all timers, or should we implement this
>> on a case-by-case basis when there's legitimate cause, like with TSC?
>
> Case-by-case is preferred.
>
> Thanks,
>
> 	tglx
>
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 19:45                   ` Jeff Hansen
@ 2008-10-09 19:53                     ` Thomas Gleixner
  2008-10-09 20:45                       ` Alok kataria
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2008-10-09 19:53 UTC (permalink / raw)
  To: Jeff Hansen; +Cc: Chris Snook, torvalds, linux-kernel, mingo

On Thu, 9 Oct 2008, Jeff Hansen wrote:

> OK, so are we all agreed that something like clocksource_trust=tsc would be
> the best?

No, it's per affected device: tsc=trust or tsc=stable or whatever
unintuitive name we want to come up. And it is a modification to TSC
not to the clocksource layer.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 19:53                     ` Thomas Gleixner
@ 2008-10-09 20:45                       ` Alok kataria
  2008-10-09 21:03                         ` Chris Snook
  0 siblings, 1 reply; 23+ messages in thread
From: Alok kataria @ 2008-10-09 20:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jeff Hansen, Chris Snook, torvalds, linux-kernel, mingo, akataria

On Thu, Oct 9, 2008 at 12:53 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 9 Oct 2008, Jeff Hansen wrote:
>
>> OK, so are we all agreed that something like clocksource_trust=tsc would be
>> the best?
>
> No, it's per affected device: tsc=trust or tsc=stable or whatever
> unintuitive name we want to come up. And it is a modification to TSC
> not to the clocksource layer.

Yep, this is cool. I too have a patch in my local tree which does a
similar thing i have a tsc_reliable flag which is set right now only
when we are running under a VMware hypervisor.
Along with marking the no_verify flag for TSC, this patch of mine also
skips the TSC synchornization checks.

        The TSC synchronization loop which is run whenever a new cpu is
brought up is not actually needed on systems which are known to have a
reliable TSC. TSC between 2 cpus can be off by a marginal value on such
systems and thats okay for timekeeping, since we do check for tsc going
back in read_tsc.

Can this reasoning be included and synchronization skipped for all
these systems with reliable aka trustworthy TSC's  ?

Thanks,
Alok
>
> Thanks,
>
>        tglx
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 20:45                       ` Alok kataria
@ 2008-10-09 21:03                         ` Chris Snook
  2008-10-09 21:18                           ` Jeff Hansen
  2008-10-09 21:53                           ` Alok Kataria
  0 siblings, 2 replies; 23+ messages in thread
From: Chris Snook @ 2008-10-09 21:03 UTC (permalink / raw)
  To: Alok kataria
  Cc: Thomas Gleixner, Jeff Hansen, torvalds, linux-kernel, mingo, akataria

Alok kataria wrote:
> On Thu, Oct 9, 2008 at 12:53 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Thu, 9 Oct 2008, Jeff Hansen wrote:
>>
>>> OK, so are we all agreed that something like clocksource_trust=tsc would be
>>> the best?
>> No, it's per affected device: tsc=trust or tsc=stable or whatever
>> unintuitive name we want to come up. And it is a modification to TSC
>> not to the clocksource layer.
> 
> Yep, this is cool. I too have a patch in my local tree which does a
> similar thing i have a tsc_reliable flag which is set right now only
> when we are running under a VMware hypervisor.
> Along with marking the no_verify flag for TSC, this patch of mine also
> skips the TSC synchornization checks.
> 
>         The TSC synchronization loop which is run whenever a new cpu is
> brought up is not actually needed on systems which are known to have a
> reliable TSC. TSC between 2 cpus can be off by a marginal value on such
> systems and thats okay for timekeeping, since we do check for tsc going
> back in read_tsc.
> 
> Can this reasoning be included and synchronization skipped for all
> these systems with reliable aka trustworthy TSC's  ?

In general, no.  Not all hardware/hypervisors behave this way, even when the TSC 
is otherwise stable once synchronized.

-- Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 21:03                         ` Chris Snook
@ 2008-10-09 21:18                           ` Jeff Hansen
  2008-10-09 22:03                             ` Alok Kataria
  2008-10-09 21:53                           ` Alok Kataria
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff Hansen @ 2008-10-09 21:18 UTC (permalink / raw)
  To: Chris Snook
  Cc: Alok kataria, Thomas Gleixner, torvalds, linux-kernel, mingo, akataria

[X86] Add tsc=stable option for marking TSC as stable

This enables legacy hardware or VMs without HPET, LAPIC, or ACPI
timers to enter high-resolution timer mode.

Signed-off-by: Jeff Hansen <jhansen@cardaccess-inc.com>
---
  Documentation/kernel-parameters.txt |    4 ++++
  arch/x86/kernel/tsc_32.c            |   12 +++++++++++-
  arch/x86/kernel/tsc_64.c            |    9 +++++++++
  3 files changed, 24 insertions(+), 1 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9611505..8488074 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2084,6 +2084,10 @@ and is between 256 and 4096 characters. It is defined in the file
  			Format:
  			<io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq>

+	tsc=		[X86-32,X86-64]
+			tsc=stable: Mark TSC clocksource as stable, enabling
+			        high-resolution timer mode on older hardware.
+
  	turbografx.map[2|3]=	[HW,JOY]
  			TurboGraFX parallel port interface
  			Format:
diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c
index 65b7063..f6e8f71 100644
--- a/arch/x86/kernel/tsc_32.c
+++ b/arch/x86/kernel/tsc_32.c
@@ -49,6 +49,17 @@ static int __init tsc_setup(char *str)

  __setup("notsc", tsc_setup);

+static struct clocksource clocksource_tsc;
+
+static int __init tscx_setup(char *str)
+{
+	if (!strcmp(str, "stable"))
+		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+	return 1;
+}
+
+__setup("tsc=", tscx_setup);
+
  /*
   * code to mark and check if the TSC is unstable
   * due to cpufreq or due to unsynced TSCs
@@ -287,7 +298,6 @@ core_initcall(cpufreq_tsc);
  /* clock source code */

  static unsigned long current_tsc_khz;
-static struct clocksource clocksource_tsc;

  /*
   * We compare the TSC to the cycle_last value in the clocksource
diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c
index 1784b80..4a3e555 100644
--- a/arch/x86/kernel/tsc_64.c
+++ b/arch/x86/kernel/tsc_64.c
@@ -291,6 +291,15 @@ __setup("notsc", notsc_setup);

  static struct clocksource clocksource_tsc;

+static int __init tscx_setup(char *str)
+{
+	if (!strcmp(str, "stable"))
+		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+	return 1;
+}
+
+__setup("tsc=", tscx_setup);
+
  /*
   * We compare the TSC to the cycle_last value in the clocksource
   * structure to avoid a nasty time-warp. This can be observed in a
-- 
1.5.6.4

---------------------------------------------------
"If someone's gotta do it, it might as well be me."

On Thu, 9 Oct 2008, Chris Snook wrote:

> Alok kataria wrote:
>>  On Thu, Oct 9, 2008 at 12:53 PM, Thomas Gleixner <tglx@linutronix.de>
>>  wrote:
>> >  On Thu, 9 Oct 2008, Jeff Hansen wrote:
>> > 
>> > >  OK, so are we all agreed that something like clocksource_trust=tsc 
>> > >  would be
>> > >  the best?
>> >  No, it's per affected device: tsc=trust or tsc=stable or whatever
>> >  unintuitive name we want to come up. And it is a modification to TSC
>> >  not to the clocksource layer.
>>
>>  Yep, this is cool. I too have a patch in my local tree which does a
>>  similar thing i have a tsc_reliable flag which is set right now only
>>  when we are running under a VMware hypervisor.
>>  Along with marking the no_verify flag for TSC, this patch of mine also
>>  skips the TSC synchornization checks.
>>
>>          The TSC synchronization loop which is run whenever a new cpu is
>>  brought up is not actually needed on systems which are known to have a
>>  reliable TSC. TSC between 2 cpus can be off by a marginal value on such
>>  systems and thats okay for timekeeping, since we do check for tsc going
>>  back in read_tsc.
>>
>>  Can this reasoning be included and synchronization skipped for all
>>  these systems with reliable aka trustworthy TSC's  ?
>
> In general, no.  Not all hardware/hypervisors behave this way, even when the 
> TSC is otherwise stable once synchronized.
>
> -- Chris
>
>
>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 21:03                         ` Chris Snook
  2008-10-09 21:18                           ` Jeff Hansen
@ 2008-10-09 21:53                           ` Alok Kataria
  2008-10-09 22:50                             ` Chris Snook
  1 sibling, 1 reply; 23+ messages in thread
From: Alok Kataria @ 2008-10-09 21:53 UTC (permalink / raw)
  To: Chris Snook
  Cc: Alok kataria, Thomas Gleixner, Jeff Hansen, torvalds,
	linux-kernel, mingo

On Thu, 2008-10-09 at 14:03 -0700, Chris Snook wrote:
> Alok kataria wrote:
> > On Thu, Oct 9, 2008 at 12:53 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >> On Thu, 9 Oct 2008, Jeff Hansen wrote:
> >>
> >>> OK, so are we all agreed that something like clocksource_trust=tsc would be
> >>> the best?
> >> No, it's per affected device: tsc=trust or tsc=stable or whatever
> >> unintuitive name we want to come up. And it is a modification to TSC
> >> not to the clocksource layer.
> >
> > Yep, this is cool. I too have a patch in my local tree which does a
> > similar thing i have a tsc_reliable flag which is set right now only
> > when we are running under a VMware hypervisor.
> > Along with marking the no_verify flag for TSC, this patch of mine also
> > skips the TSC synchornization checks.
> >
> >         The TSC synchronization loop which is run whenever a new cpu is
> > brought up is not actually needed on systems which are known to have a
> > reliable TSC. TSC between 2 cpus can be off by a marginal value on such
> > systems and thats okay for timekeeping, since we do check for tsc going
> > back in read_tsc.
> >
> > Can this reasoning be included and synchronization skipped for all
> > these systems with reliable aka trustworthy TSC's  ?
> 
> In general, no.  Not all hardware/hypervisors behave this way, even when the TSC
> is otherwise stable once synchronized.

I agree that in general this should be no, but since this is a
commandline variable it will be normally set for only those systems
which have only TSC as a option or know that the TSC is reliable.
wouldn't doing this be ok for such systems ? 

Thanks,
Alok

> 
> -- Chris


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 21:18                           ` Jeff Hansen
@ 2008-10-09 22:03                             ` Alok Kataria
  0 siblings, 0 replies; 23+ messages in thread
From: Alok Kataria @ 2008-10-09 22:03 UTC (permalink / raw)
  To: Jeff Hansen
  Cc: Chris Snook, Alok kataria, Thomas Gleixner, torvalds,
	linux-kernel, mingo

On Thu, 2008-10-09 at 14:18 -0700, Jeff Hansen wrote:
> [X86] Add tsc=stable option for marking TSC as stable
> 
> This enables legacy hardware or VMs without HPET, LAPIC, or ACPI
> timers to enter high-resolution timer mode.
> 
> Signed-off-by: Jeff Hansen <jhansen@cardaccess-inc.com>
> ---
>   Documentation/kernel-parameters.txt |    4 ++++
>   arch/x86/kernel/tsc_32.c            |   12 +++++++++++-
>   arch/x86/kernel/tsc_64.c            |    9 +++++++++
>   3 files changed, 24 insertions(+), 1 deletions(-)

Hi Jeff,

I see that these changes are not for the mainline tree. Do you plan to
do something of this sought for mainline (.28/tip) ?

My comments below are mainly for generalizing this stuff, don't know how
useful these (my comments) will be for 26.5


> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 9611505..8488074 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2084,6 +2084,10 @@ and is between 256 and 4096 characters. It is defined in the file
>                         Format:
>                         <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq>
> 
> +       tsc=            [X86-32,X86-64]
> +                       tsc=stable: Mark TSC clocksource as stable, enabling
> +                               high-resolution timer mode on older hardware.
> +
>         turbografx.map[2|3]=    [HW,JOY]
>                         TurboGraFX parallel port interface
>                         Format:
> diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c
> index 65b7063..f6e8f71 100644
> --- a/arch/x86/kernel/tsc_32.c
> +++ b/arch/x86/kernel/tsc_32.c
> @@ -49,6 +49,17 @@ static int __init tsc_setup(char *str)
> 
>   __setup("notsc", tsc_setup);
> 
> +static struct clocksource clocksource_tsc;
> +
> +static int __init tscx_setup(char *str)
> +{
> +       if (!strcmp(str, "stable"))
> +               clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;

Should we add a flag here instead, say tsc_reliable ? or whatever else
you want to call that. 

Also, there is code in check_geode_tsc_reliable which too sets this flag
for the TSC clocksource. 
Maybe you can just set the tsc_reliable flag from tscx_setup and
check_geode_tsc_reliable, and then check "tsc_reliable" value and set
the verify flag in init_tsc_clocksource. 

Thanks,
Alok




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 21:53                           ` Alok Kataria
@ 2008-10-09 22:50                             ` Chris Snook
  2008-10-09 23:22                               ` Alok Kataria
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Snook @ 2008-10-09 22:50 UTC (permalink / raw)
  To: akataria
  Cc: Alok kataria, Thomas Gleixner, Jeff Hansen, torvalds,
	linux-kernel, mingo

Alok Kataria wrote:
> On Thu, 2008-10-09 at 14:03 -0700, Chris Snook wrote:
>> Alok kataria wrote:
>>> On Thu, Oct 9, 2008 at 12:53 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>>> On Thu, 9 Oct 2008, Jeff Hansen wrote:
>>>>
>>>>> OK, so are we all agreed that something like clocksource_trust=tsc would be
>>>>> the best?
>>>> No, it's per affected device: tsc=trust or tsc=stable or whatever
>>>> unintuitive name we want to come up. And it is a modification to TSC
>>>> not to the clocksource layer.
>>> Yep, this is cool. I too have a patch in my local tree which does a
>>> similar thing i have a tsc_reliable flag which is set right now only
>>> when we are running under a VMware hypervisor.
>>> Along with marking the no_verify flag for TSC, this patch of mine also
>>> skips the TSC synchornization checks.
>>>
>>>         The TSC synchronization loop which is run whenever a new cpu is
>>> brought up is not actually needed on systems which are known to have a
>>> reliable TSC. TSC between 2 cpus can be off by a marginal value on such
>>> systems and thats okay for timekeeping, since we do check for tsc going
>>> back in read_tsc.
>>>
>>> Can this reasoning be included and synchronization skipped for all
>>> these systems with reliable aka trustworthy TSC's  ?
>> In general, no.  Not all hardware/hypervisors behave this way, even when the TSC
>> is otherwise stable once synchronized.
> 
> I agree that in general this should be no, but since this is a
> commandline variable it will be normally set for only those systems
> which have only TSC as a option or know that the TSC is reliable.
> wouldn't doing this be ok for such systems ? 

Hardware doesn't deliberately do any TSC synchronization, though you might get 
it by accident in some configurations.  A VMware guest gets it for free thanks 
to the hypervisor doing it in software, but we need to run the check when we're 
booting on bare metal.  Jeff's patch enables the feature with as little risk as 
possible.  TSC sync is pretty cheap, so it's not worth it to add a more 
dangerous special case just for guests of certain hypervisors.

-- Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 22:50                             ` Chris Snook
@ 2008-10-09 23:22                               ` Alok Kataria
  2008-10-09 23:37                                 ` Chris Snook
  0 siblings, 1 reply; 23+ messages in thread
From: Alok Kataria @ 2008-10-09 23:22 UTC (permalink / raw)
  To: Chris Snook
  Cc: Alok kataria, Thomas Gleixner, Jeff Hansen, torvalds,
	linux-kernel, mingo

On Thu, 2008-10-09 at 15:50 -0700, Chris Snook wrote:
> Alok Kataria wrote:
> > On Thu, 2008-10-09 at 14:03 -0700, Chris Snook wrote:
> >
> > I agree that in general this should be no, but since this is a
> > commandline variable it will be normally set for only those systems
> > which have only TSC as a option or know that the TSC is reliable.
> > wouldn't doing this be ok for such systems ?
> 
> Hardware doesn't deliberately do any TSC synchronization, though you might get
> it by accident in some configurations.  A VMware guest gets it for free thanks
> to the hypervisor doing it in software, but we need to run the check when we're
> booting on bare metal. 

The TSC sync algorithm right now expects that TSC are perfectly in sync
between cpus.
But, the hardware doesn't deliberately do any synchronization, so we can
have situations where TSC was (accidently ? )off by a marginal value
during boot and as a result we mark TSC as unstable and don't use it as
a clocksource at all. For systems like the ones Jeff is using wouldn't
that be a problem. IOW, even though the TSC was *marginally* off during
bootup it should still be used as a clocksource, since you have no other
option, no ? 

>  Jeff's patch enables the feature with as little risk as
> possible.  TSC sync is pretty cheap, so it's not worth it to add a more
> dangerous special case just for guests of certain hypervisors.

I am not saying that we should go ahead and disable TSC sync for all
cases, we can very well do that just when running on a particular type
of system. The point here is to understand what should we do in a case
like above, on physical hardware.

Thanks,
Alok

> -- Chris


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 23:22                               ` Alok Kataria
@ 2008-10-09 23:37                                 ` Chris Snook
  2008-10-10 14:24                                   ` Jeff Hansen
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Snook @ 2008-10-09 23:37 UTC (permalink / raw)
  To: akataria
  Cc: Alok kataria, Thomas Gleixner, Jeff Hansen, torvalds,
	linux-kernel, mingo

Alok Kataria wrote:
> On Thu, 2008-10-09 at 15:50 -0700, Chris Snook wrote:
>> Alok Kataria wrote:
>>> On Thu, 2008-10-09 at 14:03 -0700, Chris Snook wrote:
>>>
>>> I agree that in general this should be no, but since this is a
>>> commandline variable it will be normally set for only those systems
>>> which have only TSC as a option or know that the TSC is reliable.
>>> wouldn't doing this be ok for such systems ?
>> Hardware doesn't deliberately do any TSC synchronization, though you might get
>> it by accident in some configurations.  A VMware guest gets it for free thanks
>> to the hypervisor doing it in software, but we need to run the check when we're
>> booting on bare metal. 
> 
> The TSC sync algorithm right now expects that TSC are perfectly in sync
> between cpus.
> But, the hardware doesn't deliberately do any synchronization, so we can
> have situations where TSC was (accidently ? )off by a marginal value
> during boot and as a result we mark TSC as unstable and don't use it as
> a clocksource at all. For systems like the ones Jeff is using wouldn't
> that be a problem. IOW, even though the TSC was *marginally* off during
> bootup it should still be used as a clocksource, since you have no other
> option, no ? 

You seem to be conflating position and rate.  When we mark TSC as stable, we're 
saying it will always advance at a known rate on all CPUs, but this says nothing 
about the relative positions on the different CPUs.  That skew can be huge on 
some hardware, not just marginal, so we still need to synchronize them at boot 
time, even though we don't need to (and can't, in this case) verify stability 
with another continuous clock source.

-- Chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] Re: x86_32 tsc/pit and hrtimers
  2008-10-09 23:37                                 ` Chris Snook
@ 2008-10-10 14:24                                   ` Jeff Hansen
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff Hansen @ 2008-10-10 14:24 UTC (permalink / raw)
  To: Chris Snook
  Cc: akataria, Alok kataria, Thomas Gleixner, torvalds, linux-kernel, mingo

This one is against 2.6.27.


[X86] Add tsc=stable option for marking TSC as stable

This enables legacy hardware or VMs without HPET, LAPIC, or ACPI timers
to enter high-resolution timer mode.

Signed-off-by: Jeff Hansen <jhansen@cardaccess-inc.com>
---
  Documentation/kernel-parameters.txt |    4 ++++
  arch/x86/kernel/tsc.c               |   11 +++++++++++
  2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 1150444..0528bcb 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2174,6 +2174,10 @@ and is between 256 and 4096 characters. It is defined in the file
  			Format:
  			<io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq>

+	tsc=		[X86-32,X86-64]
+			tsc=stable: Mark TSC clocksource as stable, enabling
+			        high-resolution timer mode on older hardware.
+
  	turbografx.map[2|3]=	[HW,JOY]
  			TurboGraFX parallel port interface
  			Format:
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 8f98e9d..70e485e 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -98,6 +98,17 @@ int __init notsc_setup(char *str)

  __setup("notsc", notsc_setup);

+static struct clocksource clocksource_tsc;
+
+static int __init tscx_setup(char *str)
+{
+	if (!strcmp(str, "stable"))
+		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+	return 1;
+}
+
+__setup("tsc=", tscx_setup);
+
  #define MAX_RETRIES     5
  #define SMI_TRESHOLD    50000

-- 
1.5.6.4

---------------------------------------------------
"If someone's gotta do it, it might as well be me."

On Thu, 9 Oct 2008, Chris Snook wrote:

> Alok Kataria wrote:
>>  On Thu, 2008-10-09 at 15:50 -0700, Chris Snook wrote:
>> >  Alok Kataria wrote:
>> > >  On Thu, 2008-10-09 at 14:03 -0700, Chris Snook wrote:
>> > > 
>> > >  I agree that in general this should be no, but since this is a
>> > >  commandline variable it will be normally set for only those systems
>> > >  which have only TSC as a option or know that the TSC is reliable.
>> > >  wouldn't doing this be ok for such systems ?
>> >  Hardware doesn't deliberately do any TSC synchronization, though you 
>> >  might get
>> >  it by accident in some configurations.  A VMware guest gets it for free 
>> >  thanks
>> >  to the hypervisor doing it in software, but we need to run the check 
>> >  when we're
>> >  booting on bare metal.
>>
>>  The TSC sync algorithm right now expects that TSC are perfectly in sync
>>  between cpus.
>>  But, the hardware doesn't deliberately do any synchronization, so we can
>>  have situations where TSC was (accidently ? )off by a marginal value
>>  during boot and as a result we mark TSC as unstable and don't use it as
>>  a clocksource at all. For systems like the ones Jeff is using wouldn't
>>  that be a problem. IOW, even though the TSC was *marginally* off during
>>  bootup it should still be used as a clocksource, since you have no other
>>  option, no ? 
>
> You seem to be conflating position and rate.  When we mark TSC as stable, 
> we're saying it will always advance at a known rate on all CPUs, but this 
> says nothing about the relative positions on the different CPUs.  That skew 
> can be huge on some hardware, not just marginal, so we still need to 
> synchronize them at boot time, even though we don't need to (and can't, in 
> this case) verify stability with another continuous clock source.
>
> -- Chris
>
>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2008-10-10 14:25 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-07 22:41 x86_32 tsc/pit and hrtimers Jeff Hansen
2008-10-08 18:41 ` Chris Snook
2008-10-08 19:46   ` Jeff Hansen
2008-10-08 20:25     ` Chris Snook
2008-10-08 21:43       ` [PATCH] " Jeff Hansen
2008-10-08 21:47         ` Randy.Dunlap
2008-10-08 21:47         ` Chris Snook
2008-10-08 21:56           ` Jeff Hansen
2008-10-09  7:14             ` Thomas Gleixner
2008-10-09 18:39               ` Chris Snook
2008-10-09 19:18                 ` Thomas Gleixner
2008-10-09 19:45                   ` Jeff Hansen
2008-10-09 19:53                     ` Thomas Gleixner
2008-10-09 20:45                       ` Alok kataria
2008-10-09 21:03                         ` Chris Snook
2008-10-09 21:18                           ` Jeff Hansen
2008-10-09 22:03                             ` Alok Kataria
2008-10-09 21:53                           ` Alok Kataria
2008-10-09 22:50                             ` Chris Snook
2008-10-09 23:22                               ` Alok Kataria
2008-10-09 23:37                                 ` Chris Snook
2008-10-10 14:24                                   ` Jeff Hansen
2008-10-09 17:20   ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).