* [PATCH] clocksource: document some basic concepts
@ 2010-11-15 10:33 Linus Walleij
  2010-11-15 10:48 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Linus Walleij @ 2010-11-15 10:33 UTC
  To: linux-kernel
  Cc: Linus Walleij, Thomas Gleixner, Nicolas Pitre, Colin Cross,
	John Stultz, Peter Zijlstra, Ingo Molnar, Rabin Vincent

This adds some documentation about clock sources and the weak
sched_clock() function that answers questions that repeatedly
arise on the mailing lists.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Colin Cross <ccross@google.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
---
 Documentation/timers/00-INDEX        |    2 +
 Documentation/timers/clocksource.txt |  106 ++++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/timers/clocksource.txt

diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
index a9248da..fb88065 100644
--- a/Documentation/timers/00-INDEX
+++ b/Documentation/timers/00-INDEX
@@ -1,5 +1,7 @@
 00-INDEX
 	- this file
+clocksource.txt
+	- Clock sources and sched_clock() notes
 highres.txt
 	- High resolution timers and dynamic ticks design notes
 hpet.txt
diff --git a/Documentation/timers/clocksource.txt b/Documentation/timers/clocksource.txt
new file mode 100644
index 0000000..cf4ab9e
--- /dev/null
+++ b/Documentation/timers/clocksource.txt
@@ -0,0 +1,106 @@
+Clock sources and sched_clock()
+-------------------------------
+
+If you grep through the kernel source you will find a number of architecture-
+specific implementations of clock sources and several likewise architecture-
+specific overrides of the sched_clock() function.
+
+To provide timekeeping for your platform, the clock source provides
+the basic timeline, whereas clock events fire interrupts at certain points
+on this timeline, providing facilities such as high-resolution timers.
+sched_clock() is used for scheduling and timestamping.
+
+
+Clock sources
+-------------
+
+The purpose of the clock source is to provide a timeline for the system that
+tells you where you are in time. For example issuing the command 'date' on
+a Linux system will eventually read the clock source to determine exactly
+what time it is.
+
+Typically the clock source is a monotonic, atomic counter which will provide
+n bits which count from 0 to (2^n-1) and then wrap around to 0 and start over.
+
+The clock source shall have as high resolution as possible, and shall be as
+stable and correct as possible as compared to a real-world wall clock. It
+should not move unpredictably back and forth in time or miss a few cycles
+here and there.
+
+It must be immune to the kind of effects that occur in hardware where e.g. the
+counter register is read in two phases on the bus (lowest 16 bits first and
+the higher 16 bits in a second bus cycle) with the counter bits potentially
+being updated in between, leading to the risk of very strange values from the
+counter.
+
+When the wall-clock accuracy of the clock source isn't satisfactory, there
+are various quirks and layers in the timekeeping code for e.g. synchronizing
+the user-visible time to RTC clocks in the system or against networked time
+servers using NTP, but all they do is basically to update an offset against
+the clock source, which provides the fundamental timeline for the system.
+These measures do not affect the clock source per se.
+
+The clock source struct shall provide means to translate the provided counter
+into a rough nanosecond value as an unsigned long long (unsigned 64-bit) number.
+Since this operation may be invoked very often, doing this in a strict
+mathematical sense is not desirable: instead the number is taken as close as
+possible to a nanosecond value using only the arithmetic operations
+mult and shift, so in clocksource_cyc2ns() you find:
+
+  ns ~= (clocksource * mult) >> shift
+
+You will find a number of helper functions in the clock source code intended
+to aid in providing these mult and shift values, such as
+clocksource_khz2mult(), clocksource_hz2mult() that help determine the
+mult factor from a fixed shift, and clocksource_calc_mult_shift() and
+clocksource_register_hz() which will help out assigning both shift and mult
+factors using the frequency of the clock source and desirable minimum idle
+time as the only input. In the past, the timekeeping authors would come up with
+these values by hand, which is why you will sometimes find hard-coded shift
+and mult values in the code.
+
+Since a 32-bit counter at say 100 MHz will wrap around to zero after some 43
+seconds, the code handling the clock source will have to compensate for this.
+That is the reason why the clock source struct also contains a 'mask'
+member telling how many bits of the source are valid. This way the timekeeping
+code knows when the counter will wrap around and can insert the necessary
+compensation code on both sides of the wrap point so that the system timeline
+remains monotonic. Note that the clocksource_cyc2ns() function will not
+compensate for wrap-arounds: it will return the rough number of nanoseconds
+since the last wrap-around.
+
+You will notice that the clock event device code is based on the same basic
+idea about translating counters to nanoseconds using mult and shift
+arithmetic, and you find the same family of helper functions again for
+assigning these values. The clock event driver does not need a 'mask'
+attribute however: the system will not try to plan events beyond the time
+horizon of the clock event.
+
+
+sched_clock()
+-------------
+
+In addition to the clock sources and clock events there is a special weak
+function in the kernel called sched_clock(). This function shall return the
+number of nanoseconds since the system was started. An architecture may or
+may not provide an implementation of sched_clock() on its own.
+
+As the name suggests, sched_clock() is used for scheduling the system,
+determining the absolute timeslice for a certain process in the CFS scheduler
+for example. It is also used for printk timestamps when you have selected to
+include time information in printk for things like bootcharts.
+
+Compared to clock sources, sched_clock() has to be very fast: it is called
+much more often, especially by the scheduler. If you have to trade off speed
+against accuracy relative to the clock source, you may sacrifice accuracy
+for speed in sched_clock(). It however requires the same basic characteristics
+as the clock source, i.e. it has to be monotonic.
+
+The sched_clock() function may wrap only on unsigned long long boundaries,
+i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
+after circa 585 years. (For most practical systems this means "never".)
+
+If an architecture does not provide its own implementation of this function,
+it will fall back to using jiffies, limiting its resolution to one jiffy, i.e.
+1/HZ seconds for the architecture. This will affect scheduling accuracy
+and will likely show up in system benchmarks.
-- 
1.6.3.3



* Re: [PATCH] clocksource: document some basic concepts
  2010-11-15 10:33 [PATCH] clocksource: document some basic concepts Linus Walleij
@ 2010-11-15 10:48 ` Peter Zijlstra
  2010-11-15 10:50   ` Peter Zijlstra
                     ` (2 more replies)
  2010-11-15 16:34 ` Randy Dunlap
  2010-11-15 19:45 ` john stultz
  2 siblings, 3 replies; 8+ messages in thread
From: Peter Zijlstra @ 2010-11-15 10:48 UTC
  To: Linus Walleij
  Cc: linux-kernel, Thomas Gleixner, Nicolas Pitre, Colin Cross,
	John Stultz, Ingo Molnar, Rabin Vincent

On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> +sched_clock()
> +-------------
> +
> +In addition to the clock sources and clock events there is a special weak
> +function in the kernel called sched_clock(). This function shall return the
> +number of nanoseconds since the system was started. An architecture may or
> +may not provide an implementation of sched_clock() on its own.
> +
> +As the name suggests, sched_clock() is used for scheduling the system,
> +determining the absolute timeslice for a certain process in the CFS scheduler
> +for example. It is also used for printk timestamps when you have selected to
> +include time information in printk for things like bootcharts.
> +
> +Compared to clock sources, sched_clock() has to be very fast: it is called
> +much more often, especially by the scheduler. If you have to do trade-offs
> +between accuracy compared to the clock source, you may sacrifice accuracy
> +for speed in sched_clock(). It however require the same basic characteristics
> +as the clock source, i.e. it has to be monotonic.

Not so: we prefer it to be synchronized and monotonic, but we don't require
it; see below.

> +The sched_clock() function may wrap only on unsigned long long boundaries,
> +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> +after circa 585 years. (For most practical systems this means "never".)

Currently true. John Stultz was going to look into amending this by
teaching the kernel/sched_clock.c bits about early wraps (and a way for
architectures to specify this):

#define SCHED_CLOCK_WRAP_BITS 48

...

#ifdef SCHED_CLOCK_WRAP_BITS
  /* handle short wraps */
#endif

foo for wrap_min/wrap_max and "delta = now - scd->tick_raw" like things
might work.
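
A minimal userspace sketch of the "delta = now - last" idea for a counter
that wraps early. WRAP_BITS, WRAP_MASK and wrap_delta() are hypothetical
names used for illustration only, not existing kernel interfaces, and the
48-bit width is just the value from the example above:

```c
#include <stdint.h>

/* Hypothetical counter width: the counter wraps at 2^WRAP_BITS. */
#define WRAP_BITS 48
#define WRAP_MASK ((UINT64_C(1) << WRAP_BITS) - 1)

/*
 * Cycles elapsed between two raw readings of the counter.
 * The unsigned subtraction is done modulo 2^64; masking the result
 * reduces it modulo 2^WRAP_BITS, so the delta stays correct when the
 * counter wrapped between 'last' and 'now'.
 */
static uint64_t wrap_delta(uint64_t now, uint64_t last)
{
    return (now - last) & WRAP_MASK;
}
```

This only holds as long as fewer than 2^WRAP_BITS cycles pass between the
two readings; a longer gap is indistinguishable from a short one.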

> +If an architecture does not provide its own implementation of this function,
> +it will fall back to using jiffies, making its maximum resolution 1/HZ of the
> +jiffy frequency for the architecture. This will affect scheduling accuracy
> +and will likely show up in system benchmarks. 

sched_clock() need not be synchronized between CPUs, nor even be
monotonic; we prefer a fast high-res clock over a slow one.
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK provides infrastructure to sanitize the
output of sched_clock().

[ of course we prefer a fast and synchronized clock, but we take fast
over synchronized ]

sched_clock() requires local IRQs to be disabled.

Therefore, sched_clock() shall not be used, see kernel/sched_clock.c for
detail and alternative interfaces.


* Re: [PATCH] clocksource: document some basic concepts
  2010-11-15 10:48 ` Peter Zijlstra
@ 2010-11-15 10:50   ` Peter Zijlstra
  2010-11-15 19:48   ` john stultz
  2010-11-15 20:06   ` Nicolas Pitre
  2 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2010-11-15 10:50 UTC
  To: Linus Walleij
  Cc: linux-kernel, Thomas Gleixner, Nicolas Pitre, Colin Cross,
	John Stultz, Ingo Molnar, Rabin Vincent

On Mon, 2010-11-15 at 11:48 +0100, Peter Zijlstra wrote:
> 
> Therefore, sched_clock() shall not be used, see kernel/sched_clock.c for
> detail and alternative interfaces. 

we should probably rename the thing to __arch_sched_clock() and migrate
people to the kernel/sched_clock.c interfaces.


* Re: [PATCH] clocksource: document some basic concepts
  2010-11-15 10:33 [PATCH] clocksource: document some basic concepts Linus Walleij
  2010-11-15 10:48 ` Peter Zijlstra
@ 2010-11-15 16:34 ` Randy Dunlap
  2010-11-15 19:45 ` john stultz
  2 siblings, 0 replies; 8+ messages in thread
From: Randy Dunlap @ 2010-11-15 16:34 UTC
  To: Linus Walleij
  Cc: linux-kernel, Thomas Gleixner, Nicolas Pitre, Colin Cross,
	John Stultz, Peter Zijlstra, Ingo Molnar, Rabin Vincent

On Mon, 15 Nov 2010 11:33:48 +0100 Linus Walleij wrote:

> This adds some documentation about clock sources and the weak
> sched_clock() function that answers questions that repeatedly
> arise on the mailing lists.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Cc: Colin Cross <ccross@google.com>
> Cc: John Stultz <johnstul@us.ibm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Rabin Vincent <rabin.vincent@stericsson.com>
> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
> ---
>  Documentation/timers/00-INDEX        |    2 +
>  Documentation/timers/clocksource.txt |  106 ++++++++++++++++++++++++++++++++++
>  2 files changed, 108 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/timers/clocksource.txt
> 
> diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
> index a9248da..fb88065 100644
> --- a/Documentation/timers/00-INDEX
> +++ b/Documentation/timers/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>  	- this file
> +clocksource.txt
> +	- Clock sources and sched_clock() notes
>  highres.txt
>  	- High resolution timers and dynamic ticks design notes
>  hpet.txt
> diff --git a/Documentation/timers/clocksource.txt b/Documentation/timers/clocksource.txt
> new file mode 100644
> index 0000000..cf4ab9e
> --- /dev/null
> +++ b/Documentation/timers/clocksource.txt
> @@ -0,0 +1,106 @@
> +Clock sources and sched_clock()
> +-------------------------------
> +
> +If you grep through the kernel source you will find a number of architecture-
> +specific implementations of clock sources and several likewise architecture-
> +specific overrides of the sched_clock() function.
> +
> +To provide timekeeping for your platform, the clock source provides
> +the basic timeline, whereas clock events shoot interrupts on certain points
> +on this timeline, providing facilities such as high-resolution timers.
> +sched_clock() is used for scheduling and timestamping.
> +
> +
> +Clock sources
> +-------------
> +
> +The purpose of the clock source is to provide a timeline for the system that
> +tells you where you are in time. For example issuing the command 'date' on
> +a Linux system will eventually read the clock source to determine exactly
> +what time it is.
> +
> +Typically the clock source is a monotonic, atomic counter which will provide
> +n bits which count from 0 to (2^n-1) and then wraps around to 0 and start over.
> +
> +The clock source shall have as high resolution as possible, and shall be as
> +stable and correct as possible as compared to a real-world wall clock. It
> +should not move unpredictably back and forth in time or miss a few cycles
> +here and there.
> +
> +It must be immune the kind of effects that occur in hardware where e.g. the

              immune from the

> +counter register is read in two phases on the bus lowest 16 bits first and

                                          on the bus (lowest

> +the higher 16 bits in a second bus cycle with the counter bits potentially

                                  bus cycle) with

> +being updated inbetween leading to the risk of very strange values from the
> +counter.
> +
> +When the wall-clock accuracy of the clock source isn't satisfactory, there
> +are various quirks and layers in the timekeeping code for e.g. synchronizing
> +the user-visible time to RTC clocks in the system or against networked time
> +servers using NTP, but all they do is basically to update an offset against
> +the clock source, which provides the fundamental timeline for the system.
> +These measures does not affect the clock source per se.
> +
> +The clock source struct shall provide means to translate the provided counter
> +into a rough nanosecond value as an unsigned long long (unsigned 64 bit) number.

                                                                    64-bit)

> +Since this operation may be invoked very often doing this in a strict
> +mathematical sense is not desireable: instead the number is taken as close as

                             desirable:

> +possible to a nanosecond value using only the arithmetic operations
> +mult and shift, so in clocksource_cyc2ns() you find:
> +
> +  ns ~= (clocksource * mult) >> shift
> +
> +You will find a number of helper functions in the clock source code intended
> +to aid in providing these mult and shift values, such as
> +clocksource_khz2mult(), clocksource_hz2mult() that help determinining the

                                                 that help determine

> +mult factor from a fixed shift, and clocksource_calc_mult_shift() and
> +clocksource_register_hz() which will help out assigning both shift and mult
> +factors using the frequency of the clock source and desirable minimum idle
> +time as the only input. In the past, the timekeeping authors would come up with
> +these values by hand, which is why you will sometimes find hard-coded shift
> +and mult values in the code.
> +
> +Since a 32 bit counter at say 100 MHz will wrap around to zero after some 43

           32-bit

> +seconds, the code handling the clock source will have to compensate for this.
> +That is the reason to why the clock source struct also contains a 'mask'
> +member telling how many bits of the source are valid. This way the timekeeping
> +code knows when the counter will wrap around and can insert the necessary
> +compensation code on both sides of the wrap point so that the system timeline
> +remains monotonic. Note that the clocksource_cyc2ns() function will not
> +compensate for wrap-arounds: it will return the rough number of nanoseconds
> +since the last wrap-around.
> +
> +You will notice that the clock event device code is based on the same basic
> +idea about translating counters to nanoseconds using mult and shift
> +arithmetics, and you find the same family of helper functions again for
> +assigning these values. The clock event driver does not need a 'mask'
> +attribute however: the system will not try to plan events beyond the time
> +horizon of the clock event.
> +
> +
> +sched_clock()
> +-------------
> +
> +In addition to the clock sources and clock events there is a special weak
> +function in the kernel called sched_clock(). This function shall return the
> +number of nanoseconds since the system was started. An architecture may or
> +may not provide an implementation of sched_clock() on its own.
> +
> +As the name suggests, sched_clock() is used for scheduling the system,
> +determining the absolute timeslice for a certain process in the CFS scheduler
> +for example. It is also used for printk timestamps when you have selected to
> +include time information in printk for things like bootcharts.
> +
> +Compared to clock sources, sched_clock() has to be very fast: it is called
> +much more often, especially by the scheduler. If you have to do trade-offs
> +between accuracy compared to the clock source, you may sacrifice accuracy
> +for speed in sched_clock(). It however require the same basic characteristics

                                          requires

> +as the clock source, i.e. it has to be monotonic.
> +
> +The sched_clock() function may wrap only on unsigned long long boundaries,
> +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> +after circa 585 years. (For most practical systems this means "never".)
> +
> +If an architecture does not provide its own implementation of this function,
> +it will fall back to using jiffies, making its maximum resolution 1/HZ of the
> +jiffy frequency for the architecture. This will affect scheduling accuracy
> +and will likely show up in system benchmarks.
> -- 


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


* Re: [PATCH] clocksource: document some basic concepts
  2010-11-15 10:33 [PATCH] clocksource: document some basic concepts Linus Walleij
  2010-11-15 10:48 ` Peter Zijlstra
  2010-11-15 16:34 ` Randy Dunlap
@ 2010-11-15 19:45 ` john stultz
  2 siblings, 0 replies; 8+ messages in thread
From: john stultz @ 2010-11-15 19:45 UTC
  To: Linus Walleij
  Cc: linux-kernel, Thomas Gleixner, Nicolas Pitre, Colin Cross,
	Peter Zijlstra, Ingo Molnar, Rabin Vincent

On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> This adds some documentation about clock sources and the weak
> sched_clock() function that answers questions that repeatedly
> arise on the mailing lists.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Cc: Colin Cross <ccross@google.com>
> Cc: John Stultz <johnstul@us.ibm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Rabin Vincent <rabin.vincent@stericsson.com>
> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
> ---
>  Documentation/timers/00-INDEX        |    2 +
>  Documentation/timers/clocksource.txt |  106 ++++++++++++++++++++++++++++++++++
>  2 files changed, 108 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/timers/clocksource.txt
> 
> diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
> index a9248da..fb88065 100644
> --- a/Documentation/timers/00-INDEX
> +++ b/Documentation/timers/00-INDEX
> @@ -1,5 +1,7 @@
>  00-INDEX
>  	- this file
> +clocksource.txt
> +	- Clock sources and sched_clock() notes
>  highres.txt
>  	- High resolution timers and dynamic ticks design notes
>  hpet.txt
> diff --git a/Documentation/timers/clocksource.txt b/Documentation/timers/clocksource.txt
> new file mode 100644
> index 0000000..cf4ab9e
> --- /dev/null
> +++ b/Documentation/timers/clocksource.txt
> @@ -0,0 +1,106 @@
> +Clock sources and sched_clock()
> +-------------------------------

Thanks for writing this up!

I do worry a little that by talking about the two subjects in the same
document, it creates an impression that the two infrastructures are
conceptually linked (even though this is mostly about the differences
between them).

> +If you grep through the kernel source you will find a number of architecture-
> +specific implementations of clock sources and several likewise architecture-
> +specific overrides of the sched_clock() function.
> +
> +To provide timekeeping for your platform, the clock source provides
> +the basic timeline, whereas clock events shoot interrupts on certain points
> +on this timeline, providing facilities such as high-resolution timers.
> +sched_clock() is used for scheduling and timestamping.
> +
> +
> +Clock sources
> +-------------
> +
> +The purpose of the clock source is to provide a timeline for the system that
> +tells you where you are in time. For example issuing the command 'date' on
> +a Linux system will eventually read the clock source to determine exactly
> +what time it is.
> +
> +Typically the clock source is a monotonic, atomic counter which will provide
> +n bits which count from 0 to (2^n-1) and then wraps around to 0 and start over.
> +
> +The clock source shall have as high resolution as possible, and shall be as
> +stable and correct as possible as compared to a real-world wall clock. It
> +should not move unpredictably back and forth in time or miss a few cycles
> +here and there.
> +
> +It must be immune the kind of effects that occur in hardware where e.g. the
> +counter register is read in two phases on the bus lowest 16 bits first and
> +the higher 16 bits in a second bus cycle with the counter bits potentially
> +being updated inbetween leading to the risk of very strange values from the
> +counter.
> +
> +When the wall-clock accuracy of the clock source isn't satisfactory, there
> +are various quirks and layers in the timekeeping code for e.g. synchronizing
> +the user-visible time to RTC clocks in the system or against networked time
> +servers using NTP, but all they do is basically to update an offset against
> +the clock source, which provides the fundamental timeline for the system.
> +These measures does not affect the clock source per se.

It's not so much updating an offset, but more adjusting the frequency to
steer the clocksource to NTP time.

Also, while syncing the RTC is something that the timekeeping code does,
it's not really connected to the clocksource code in particular.


> +
> +The clock source struct shall provide means to translate the provided counter
> +into a rough nanosecond value as an unsigned long long (unsigned 64 bit) number.
> +Since this operation may be invoked very often doing this in a strict
> +mathematical sense is not desireable: instead the number is taken as close as
> +possible to a nanosecond value using only the arithmetic operations
> +mult and shift, so in clocksource_cyc2ns() you find:
> +
> +  ns ~= (clocksource * mult) >> shift
> +
> +You will find a number of helper functions in the clock source code intended
> +to aid in providing these mult and shift values, such as
> +clocksource_khz2mult(), clocksource_hz2mult() that help determinining the
> +mult factor from a fixed shift, and clocksource_calc_mult_shift() and
> +clocksource_register_hz() which will help out assigning both shift and mult
> +factors using the frequency of the clock source and desirable minimum idle
> +time as the only input. In the past, the timekeeping authors would come up with
> +these values by hand, which is why you will sometimes find hard-coded shift
> +and mult values in the code.

Yea. I'm working on cleaning these out, so I'd recommend just pointing
to using clocksource_register_hz/khz(), to have a proper mult-shift pair
calculated out for you. The explanation about the hard-coded bit from
the past is good while we're in transition.
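
For readers wondering what such a calculated pair looks like, here is a
minimal userspace sketch, assuming a fixed shift is chosen first and mult is
derived from the frequency. hz_to_mult() and cyc_to_ns() are hypothetical
names for illustration; they are not the kernel helpers themselves:

```c
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/*
 * Pick mult so that (cycles * mult) >> shift approximates nanoseconds,
 * i.e. mult ~= (NSEC_PER_SEC << shift) / hz, rounded to nearest.
 * Assumes hz != 0 and a shift small enough that neither the left shift
 * nor the 32-bit result overflows.
 */
static uint32_t hz_to_mult(uint32_t hz, uint32_t shift)
{
    uint64_t tmp = (NSEC_PER_SEC << shift) + hz / 2;

    return (uint32_t)(tmp / hz);
}

/* The conversion from the patch text: ns ~= (cycles * mult) >> shift. */
static uint64_t cyc_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
    return (cycles * mult) >> shift;
}
```

With a 100 MHz counter and shift = 20 this gives mult = 10485760, and one
second's worth of cycles (100000000) converts to 1000000000 ns.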

> +Since a 32 bit counter at say 100 MHz will wrap around to zero after some 43
> +seconds, the code handling the clock source will have to compensate for this.
> +That is the reason to why the clock source struct also contains a 'mask'
> +member telling how many bits of the source are valid. This way the timekeeping
> +code knows when the counter will wrap around and can insert the necessary
> +compensation code on both sides of the wrap point so that the system timeline
> +remains monotonic. Note that the clocksource_cyc2ns() function will not
> +compensate for wrap-arounds: it will return the rough number of nanoseconds
> +since the last wrap-around.

Hrm. There are some more non-obvious conditions on this. In fact, for
clocksources that wrap at longer periods, you may hit a multiplication
overflow before the wrap boundary.
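
To illustrate the overflow condition, a small userspace sketch follows. The
function names are hypothetical and the mult/shift values in the note are
only example numbers for a 100 MHz counter, not taken from any driver:

```c
#include <stdint.h>

/* Largest cycle delta whose product with mult still fits in 64 bits. */
static uint64_t max_cycles_no_overflow(uint32_t mult)
{
    return mult ? UINT64_MAX / mult : UINT64_MAX;
}

/*
 * Convert a cycle delta to nanoseconds with the mult/shift scheme.
 * Returns 0 on success, or -1 if cycles * mult would overflow 64 bits,
 * in which case *ns is left untouched.
 */
static int cyc_to_ns_checked(uint64_t cycles, uint32_t mult, uint32_t shift,
                             uint64_t *ns)
{
    if (cycles > max_cycles_no_overflow(mult))
        return -1;
    *ns = (cycles * mult) >> shift;
    return 0;
}
```

With mult = 10485760 and shift = 20, the product overflows once the delta
exceeds 1759218604441 cycles, which at 100 MHz is a little under five hours,
far sooner than a 64-bit counter at that rate would wrap.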

I'm starting to feel like clocksource_cyc2ns() should be internalized to
the timekeeping code so its subtle limitations aren't accidentally
tripped over if it's incorrectly re-used for some other purpose.

In fact, as with the clocksource_register_hz/khz, I'm thinking we should
move more towards internalizing most of the complex bits of the
clocksource structure. I'm hoping a read(), freq_hz/khz value, rating
and flags would be all that's needed, hopefully simplifying things for
clocksource writers, and reducing the chance folks might get something
wrong.

thanks
-john



* Re: [PATCH] clocksource: document some basic concepts
  2010-11-15 10:48 ` Peter Zijlstra
  2010-11-15 10:50   ` Peter Zijlstra
@ 2010-11-15 19:48   ` john stultz
  2010-11-15 20:06   ` Nicolas Pitre
  2 siblings, 0 replies; 8+ messages in thread
From: john stultz @ 2010-11-15 19:48 UTC
  To: Peter Zijlstra
  Cc: Linus Walleij, linux-kernel, Thomas Gleixner, Nicolas Pitre,
	Colin Cross, Ingo Molnar, Rabin Vincent

On Mon, 2010-11-15 at 11:48 +0100, Peter Zijlstra wrote:
> On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> > +sched_clock()
> > +-------------
> > +
> > +In addition to the clock sources and clock events there is a special weak
> > +function in the kernel called sched_clock(). This function shall return the
> > +number of nanoseconds since the system was started. An architecture may or
> > +may not provide an implementation of sched_clock() on its own.
> > +
> > +As the name suggests, sched_clock() is used for scheduling the system,
> > +determining the absolute timeslice for a certain process in the CFS scheduler
> > +for example. It is also used for printk timestamps when you have selected to
> > +include time information in printk for things like bootcharts.
> > +
> > +Compared to clock sources, sched_clock() has to be very fast: it is called
> > +much more often, especially by the scheduler. If you have to do trade-offs
> > +between accuracy compared to the clock source, you may sacrifice accuracy
> > +for speed in sched_clock(). It however require the same basic characteristics
> > +as the clock source, i.e. it has to be monotonic.
> 
> Not so, we prefer it be synchronized and monotonic, but we don't require
> so, see below.
> 
> > +The sched_clock() function may wrap only on unsigned long long boundaries,
> > +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> > +after circa 585 years. (For most practical systems this means "never".)
> 
> Currently true, John Stultz was going to look into ammending this by
> teaching the kernel/sched_clock.c bits about early wraps (and a way for
> architectures to specify this)

I'd like to, although at the moment I don't have much space on my plate
to do this, so in the meantime, if someone has the time and interest to
look at this, ping me and I can lay out the basics of what likely
should be done.

thanks
-john




* Re: [PATCH] clocksource: document some basic concepts
  2010-11-15 10:48 ` Peter Zijlstra
  2010-11-15 10:50   ` Peter Zijlstra
  2010-11-15 19:48   ` john stultz
@ 2010-11-15 20:06   ` Nicolas Pitre
  2010-11-15 21:13     ` Peter Zijlstra
  2 siblings, 1 reply; 8+ messages in thread
From: Nicolas Pitre @ 2010-11-15 20:06 UTC
  To: Peter Zijlstra
  Cc: Linus Walleij, lkml, Thomas Gleixner, Colin Cross, John Stultz,
	Ingo Molnar, Rabin Vincent

On Mon, 15 Nov 2010, Peter Zijlstra wrote:

> On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> > +The sched_clock() function may wrap only on unsigned long long boundaries,
> > +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> > +after circa 585 years. (For most practical systems this means "never".)

This is not necessarily the case.  Some implementations require a 
scaling factor too, making the number of remaining bits smaller than 64.  
See arch/arm/mach-pxa/time.c:sched_clock() for example, which has a 
maximum range of 208 days.  Of course, in practice we don't really care 
if sched_clock() wraps each 208 days, unlike for clock-source.

> Currently true, John Stultz was going to look into ammending this by
> teaching the kernel/sched_clock.c bits about early wraps (and a way for
> architectures to specify this)
> 
> #define SCHED_CLOCK_WRAP_BITS 48
> 
> ...
> 
> #ifdef SCHED_CLOCK_WRAP_BITS
>   /* handle short wraps */
> #endif

Is this worth supporting?  I'd simply use the low 32 bits and extend it 
to 63 bits using cnt32_to_63(). If the low 32 bits are wrapping too 
fast, then just shifting them down a few positions first should do the 
trick.  That certainly would have a much faster result.
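
As a rough illustration of extending a narrow counter in software, here is a
simplified, single-threaded userspace sketch. It is not how cnt32_to_63() is
implemented; struct ext_counter and ext_counter_read() are hypothetical names,
and the sketch assumes the caller reads the counter at least once per wrap
period:

```c
#include <stdint.h>

/* State kept between reads of a free-running 32-bit hardware counter. */
struct ext_counter {
    uint32_t last_lo;  /* low word seen on the previous read */
    uint64_t hi;       /* accumulated wraps, kept in bits 32 and up */
};

/*
 * Extend the 32-bit reading 'lo' to 64 bits. If the new reading is
 * smaller than the previous one, the hardware counter has wrapped once
 * since the last call, so the upper part is advanced by 2^32.
 */
static uint64_t ext_counter_read(struct ext_counter *c, uint32_t lo)
{
    if (lo < c->last_lo)
        c->hi += UINT64_C(1) << 32;
    c->last_lo = lo;
    return c->hi | lo;
}
```

A real implementation also has to cope with concurrent readers and with
reads that are too far apart, which this sketch ignores.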


Nicolas


* Re: [PATCH] clocksource: document some basic concepts
  2010-11-15 20:06   ` Nicolas Pitre
@ 2010-11-15 21:13     ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2010-11-15 21:13 UTC
  To: Nicolas Pitre
  Cc: Linus Walleij, lkml, Thomas Gleixner, Colin Cross, John Stultz,
	Ingo Molnar, Rabin Vincent

On Mon, 2010-11-15 at 15:06 -0500, Nicolas Pitre wrote:
> On Mon, 15 Nov 2010, Peter Zijlstra wrote:
> 
> > On Mon, 2010-11-15 at 11:33 +0100, Linus Walleij wrote:
> > > +The sched_clock() function may wrap only on unsigned long long boundaries,
> > > +i.e. after 64 bits. Since this is a nanosecond value this will mean it wraps
> > > +after circa 585 years. (For most practical systems this means "never".)
> 
> This is not necessarily the case.  Some implementations require a 
> scaling factor too, making the number of remaining bits smaller than 64.  
> See arch/arm/mach-pxa/time.c:sched_clock() for example, which has a 
> maximum range of 208 days.  Of course, in practice we don't really care 
> if sched_clock() wraps each 208 days, unlike for clock-source.

Right, it's like sched_clock() would go backwards and we lose some
precision during that jiffy (assuming the arch uses
HAVE_UNSTABLE_SCHED_CLOCK), nothing too horrible.

> > Currently true, John Stultz was going to look into ammending this by
> > teaching the kernel/sched_clock.c bits about early wraps (and a way for
> > architectures to specify this)
> > 
> > #define SCHED_CLOCK_WRAP_BITS 48
> > 
> > ...
> > 
> > #ifdef SCHED_CLOCK_WRAP_BITS
> >   /* handle short wraps */
> > #endif
> 
> Is this worth supporting?  I'd simply use the low 32 bits and extend it 
> to 63 bits using cnt32_to_63(). If the low 32 bits are wrapping too 
> fast, then just shifting them down a few positions first should do the 
> trick.  That certainly would have a much faster result.

Whatever works, dealing with the wrap is only a few shifts.
