linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: john stultz <johnstul@us.ibm.com>
To: Roman Zippel <zippel@linux-m68k.org>
Cc: Valdis.Kletnieks@vt.edu, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org
Subject: Re: 2.6.17-mm2 hrtimer code wedges at boot?
Date: Thu, 06 Jul 2006 15:05:05 -0700	[thread overview]
Message-ID: <1152223506.24656.173.camel@cog.beaverton.ibm.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0607061912370.12900@scrub.home>

On Thu, 2006-07-06 at 22:33 +0200, Roman Zippel wrote:
> Hi,
> 
> On Wed, 5 Jul 2006, john stultz wrote:
> 
> > Roman: While I'm not 100% confident about my assessment above, I worry
> > this is mimicking the problems I had been seeing in my simulator w/ your
> > clocksource_adjustment algorithm when I simulated dropping many ticks.
> > While currently this behavior points to some other problem, with the
> > dynticks patch, its much more likely that we will see 100s of ticks
> > skipped.
> > 
> > I quickly revived my P-D adjustment patch and it does not appear to
> > suffer from the same problem with the above droptick change (although
> > its only been lightly tested). 
> 
> The PID concept had me confused for a while and I'm still not sure it 
> really applies, but there are similarities and what I want to do is not 
> that different from yours (limiting then the effect of the current error) 
> with the biggest difference that I keep everything in the big adjust 
> function, so the extra work is only done when needed.

Yea. I've been trying to map your method to the PID concept as well
(when all you have is a hammer... ;) and they are similar, however the
"future error" aspect of yours makes it a little more difficult to
grasp, but it is more compact.


> > +		int adj, multadj = 0;
> > +		s64 offset_update = 0, snsec_update = 0;
> > +		
> > +		/* First do the frequency adjustment:
> > +		 *   The idea here is to look at the error 
> > +		 *   accumulated since the last call to 
> > +		 *   update_wall_time to determine the 
> > +		 *   frequency adjustment needed so no new
> > +		 *   error will be incurred in the next
> > +		 *   interval.
> > +		 *
> > +		 *   This is basically derivative control
> > +		 *   using the PID terminology (we're calculating
> > +		 *   the derivative of the slope and correcting it).
> > +		 *
> > +		 *   The math is basically:
> > +		 *      multadj = interval_error/interval_cycles
> > +		 *   Which we fudge using binary approximation.
> > +		 */
> > +		if(interval_error >= 0) {
> > +			adj = error_aproximation(interval_error,
> > +						interval_cycs, 0);
> > +			multadj += 1 << adj;
> > +			snsec_update += interval << adj;
> > +			offset_update += offset << adj;
> > +		} else {
> > +			adj = error_aproximation(-interval_error, 
> > +						interval_cycs, 0);
> > +			multadj -= 1 << adj;
> > +			snsec_update -= interval << adj;
> > +			offset_update -= offset << adj;
> > +	        }
> 
> Here we actually don't derive very much, we already know how it will 
> change. :)

Correct, by looking one step ahead in your method that's where you're
taking the derivative into effect. Although keeping this future
reference does require some extra mental gymnastics for me.

> We basically want to keep the difference between tick length and xtime 
> interval as small as possible. Since most of the difference comes via 
> second_overflow(), we might want to do some precalculations near there, so 
> we avoid accumulating a lot of error in the first place, this is 
> especially important with only rare updates.

I don't think I disagree here, I can see how looking ahead in the future
is helpful to correct for error when the adjustment changes.

> > +		/* Now do the offset adjustment:
> > +		 *   Now that the frequncy is fixed, we
> > +		 *   want to look at the total error accumulated
> > +		 *   to move us back in sync using the same method.
> > +		 *   However, we must be careful as if we make too 
> > +		 *   sudden an adjustment we might overshoot. So we 
> > +		 *   limit the amount of change to spread the 
> > +		 *   adjustment (using MAXOFFADJ) over a longer 
> > +		 *   period of time.
> > +		 *
> > +		 *   This is basically proportional control
> > +		 *   using the PID terminology.
> > +		 *
> > +		 *   We use interval_cycs here as the divisor, which
> > +		 *   hopes that the next sample will be similar in 
> > +		 *   distance from the last.
> > +		 */
> > +		if(error >= 0) {
> > +			adj = error_aproximation(error, 
> > +					interval_cycs, MAXOFFADJ);
> > +			multadj += 1<<adj;
> > +			snsec_update += interval <<adj;
> > +			offset_update += offset << adj;
> > +		} else {
> > +			adj = error_aproximation(-error,
> > +						interval_cycs, MAXOFFADJ);
> > +			multadj -= 1<<adj;
> > +			snsec_update -= interval <<adj;
> > +			offset_update -= offset << adj;
> > +		}
> 
> The limitation avoids the biggest problems, but it also may slow down the 
> error reduction (the more the bigger the shift value is). The main problem 
> is really that we don't know how many ticks will be skipped and with 
> interval_cycs you only know what happened last and this won't help if 
> ticks are skipped in irregular intervals.

I think one of the important parts of my method is that I've broken up
the error into two different types of error, that can be manipulated
independently:

1) Error accumulated in the last period - the derivative part.
2) Total error - the proportional part.

Taking component #1, we directly correct the frequency to flatten the
error slope so in the next tick we should not accumulate any *extra*
error (ideally, of course, since our granularity won't quite allow for
that). Note that there is no "gain" (using the PID terminology) in that
sense being applied. Its just an absolute frequency correction.

Then taking component #2, if we calculated error/interval that would be
the ideal correction to the proportional component for the next tick.
However since we don't know when the next tick will occur, we use two
methods to dampen the effect to avoid overshoot: a) Use the value
interval/interval_cycs as our "gain" factor, thus if ticks are being
lost, we assume will will continue to lose ticks. You are correct here
in that irregular tick skipping could defeat this, but if a constant
gain factor (of say, 1<<8) was used, we would have a hard time
converging because the error/(interval*gain) would goto zero early. 

So instead, keeping the non-constant gain in (a), we use b) a limiter
(MAXOFFADJ) so any proportional change is very dampened. This does have
the effect that any large error takes awhile to reduce, but since we
correct the recent (derivative error) directly, we should not be
accumulating more.

The combination of these two effects allows us to converge reasonably
quickly when we are not skipping ticks, and controls overshoot when we
are skipping ticks (proportionally to the number of ticks skipped).


> Regarding dynticks it would help a lot here to know how many ticks are 
> skipped so you can spread the error correction evenly over this period 
> until the next adjustment and the correction is stopped in time.

I've considered alternatives, where we use a constant gain with the
proportional component (so any proportional change would be spread over
one second), but to avoid constant offsets when the total error was
smaller then the gain factor, we could also calculate a non-gain'ed
adjustment w/ a limiter and include that. This would be more ideal in
your irregular ticks example, but I'm not sure its really that much more
beneficial from what I proposed.

Anyway, I'm interested in what you are thinking for a more minimal
approach to what we have in mainline, since I have a harder time
wrapping my head around the future-error part (I can intuit what you
want to accomplish, and I understand the benefit of looking ahead, but I
just haven't been able to work out a proof for it :).

thanks
-john


  reply	other threads:[~2006-07-06 22:05 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-24 13:19 2.6.17-mm2 Andrew Morton
2006-06-24 15:53 ` 2.6.17-mm2 Rafael J. Wysocki
2006-06-24 17:20   ` 2.6.17-mm2 Dave Jones
2006-06-24 21:34     ` 2.6.17-mm2 Andrew Morton
2006-06-25  8:51       ` 2.6.17-mm2 Rafael J. Wysocki
2006-06-25 10:22         ` 2.6.17-mm2 Andrew Morton
2006-06-25 15:16           ` 2.6.17-mm2 Andrew Morton
2006-06-25 18:23             ` 2.6.17-mm2 Sam Ravnborg
2006-06-25 18:40               ` 2.6.17-mm2 Andrew Morton
2006-06-25 21:21                 ` 2.6.17-mm2 Sam Ravnborg
2006-06-30  7:38             ` 2.6.17-mm2 Randy.Dunlap
2006-07-02 10:11               ` 2.6.17-mm2 Russell King
2006-07-02 18:42                 ` 2.6.17-mm2 Randy.Dunlap
2006-07-02 18:47                   ` 2.6.17-mm2 Arjan van de Ven
2006-07-02 18:47                   ` 2.6.17-mm2 Sam Ravnborg
2006-07-03  5:50                 ` 2.6.17-mm2 Randy.Dunlap
2006-07-03 13:49                   ` 2.6.17-mm2 Russell King
2006-06-25 19:19           ` 2.6.17-mm2 Rafael J. Wysocki
2006-06-26 20:13           ` 2.6.17-mm2 Chandra Seetharaman
2006-06-24 19:41 ` 2.6.17-mm2 Dominik Karall
2006-06-24 21:43   ` 2.6.17-mm2 Andrew Morton
2006-06-25  6:06 ` 2.6.17-mm2 Reuben Farrelly
2006-06-25  9:37   ` 2.6.17-mm2 Barry K. Nathan
2006-06-25 10:29     ` 2.6.17-mm2 Reuben Farrelly
2006-06-25 11:19 ` 2.6.17-mm2 Michal Piotrowski
2006-06-25 11:40   ` 2.6.17-mm2 Andrew Morton
2006-06-25 12:18     ` 2.6.17-mm2 Michal Piotrowski
2006-06-25 16:25 ` 2.6.17-mm2 (NULL pointer dereference) Dominik Karall
2006-06-25 17:18   ` Andrew Morton
2006-06-25 18:11     ` Dominik Karall
2006-06-25 16:47 ` 2.6.17-mm2: no QLA3YYY_NAPI help text Adrian Bunk
2006-06-25 19:32 ` 2.6.17-mm2: BLK_CPQ_CISS_DA=m error Adrian Bunk
2006-06-26  0:41   ` Vivek Goyal
2006-06-25 23:13 ` [-mm patch] make drivers/scsi/pata_it821x.c:it821x_passthru_dev_select() static Adrian Bunk
2006-06-25 23:27   ` Alan Cox
2006-06-27  1:03   ` Jeff Garzik
2006-06-25 23:13 ` [-mm patch] fs/cifs/cifsproto.h: remove #ifdef around small_smb_init_no_tc() prototype Adrian Bunk
2006-06-26  4:05   ` Steven French
2006-06-26 15:17 ` [-mm patch] drivers/scsi/arcmsr/: cleanups Adrian Bunk
2006-06-26 20:27 ` [-mm patch] drivers/md/raid5.c: remove an unused variable Adrian Bunk
2006-06-26 21:41 ` 2.6.17-mm2 hrtimer code wedges at boot? Valdis.Kletnieks
2006-06-26 22:50   ` Valdis.Kletnieks
2006-06-26 23:02   ` john stultz
2006-06-26 23:27   ` Thomas Gleixner
2006-06-27  2:12     ` Valdis.Kletnieks
2006-06-27  5:54       ` Thomas Gleixner
2006-06-27 10:16   ` Roman Zippel
2006-06-27 16:43     ` Valdis.Kletnieks
2006-06-27 17:10       ` Roman Zippel
2006-06-27 17:23         ` Roman Zippel
2006-06-27 19:07           ` Valdis.Kletnieks
2006-06-28  0:07             ` john stultz
2006-06-28 10:35               ` Roman Zippel
2006-06-28 11:44                 ` Roman Zippel
2006-06-29 23:07                   ` Valdis.Kletnieks
2006-06-30 19:26                     ` john stultz
2006-06-30 21:04                       ` Valdis.Kletnieks
2006-07-03  1:13                         ` Roman Zippel
2006-07-03  1:56                           ` Daniel Walker
2006-07-03  2:20                             ` Valdis.Kletnieks
2006-07-03 20:08                               ` john stultz
2006-07-03 19:59                             ` john stultz
2006-07-04 22:21                               ` Valdis.Kletnieks
2006-07-05  4:29                           ` Valdis.Kletnieks
2006-07-06  0:37                             ` Roman Zippel
2006-07-06  0:56                               ` john stultz
2006-07-06  6:38                               ` Valdis.Kletnieks
2006-07-06  0:51                             ` john stultz
2006-07-06  1:12                               ` john stultz
2006-07-06  5:43                                 ` john stultz
2006-07-06 20:33                               ` Roman Zippel
2006-07-06 22:05                                 ` john stultz [this message]
2006-07-07 23:16                                   ` Roman Zippel
2006-07-08 20:02                                   ` [PATCH] adjust clock for lost ticks Roman Zippel
2006-07-09 21:25                                     ` john stultz
2006-06-28 23:41                 ` 2.6.17-mm2 hrtimer code wedges at boot? john stultz
2006-06-29 11:24                   ` Roman Zippel
2006-06-28 16:54 ` [-mm patch] include/asm-i386/acpi.h should #include <asm/processor.h> Adrian Bunk
2006-06-28 16:54 ` [-mm patch] fix sgivwfb compile Adrian Bunk
2006-06-28 16:54 ` [-mm patch] arch/i386/mach-visws/setup.c: remove dummy function calls Adrian Bunk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1152223506.24656.173.camel@cog.beaverton.ibm.com \
    --to=johnstul@us.ibm.com \
    --cc=Valdis.Kletnieks@vt.edu \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=zippel@linux-m68k.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).