* Timekeeping oddities on MacMini G4s
@ 2017-02-01  3:10 Hugh Blemings
  2017-02-01  3:34 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Hugh Blemings @ 2017-02-01  3:10 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devel, Hal Murray, Paul Mackerras, Benjamin Herrenschmidt,
	Michael Ellerman

Hi,

I've recently been lurking on the ntpsec project mailing list.

One of the developers (Hal Murray, CCd) has been doing some tests of the 
codebase using FreeBSD and Debian on a G4 based Mac Mini - this largely 
motivated by checking all is well on a big-endian system.

Hal has identified what appears to be a systemic inaccuracy in kernel 
timekeeping of around 2500ppm on both Linux and FreeBSD and on several 
different MacMini G4s (1.42GHz and 1.5GHz variants).

That it appears on both Linux and FreeBSD kernels and some other data 
points leads us to wonder if the CPU frequency reported by Open Firmware 
being different to the actual raw clock is the root cause, but I/we 
speculate at this point.

To be clear, we're not claiming "kernel bug", but rather something odd that 
perhaps the collective wisdom of y'all might be able to shed some light on, 
suggest other places to dig, etc.

Below is I hope sufficient salient information, copied from the various 
messages in the thread on the ntpsec mailing list.  If you'd like to 
refer to the original thread it's available in the archive starting here 
- https://lists.ntpsec.org/pipermail/devel/2017-January/003443.html

Would welcome any feedback/pointers you can provide.

Insights into how the various values from OpenFirmware interact with the 
derived figures in the kernel would be interesting too :)

Thanks,

Cheers,
Hugh


-------------------------
From Hal's original email:

I'm reasonably confident that the system doesn't keep reasonable time 
when ntpd isn't running.

Here is my test case:
   assuming you have a working ntp setup
   add "disable ntp" to ntp.conf
   make sure you are logging loopstats:
     statsdir /var/log/ntp or ntpstats or ...
     filegen loopstats  type day link
reboot the system to start clean

You will get things like this in loopstats.
57783 14135.823 0.684044126 0.000 0.000000954 0.000000 6
57783 14137.823 0.688713522 0.000 0.000000954 0.000000 6
57783 14139.823 0.692100011 0.000 0.000000954 0.000000 6
...
57783 16640.823 6.663863776 0.000 0.000000954 0.000000 6
57783 16692.823 6.743695119 0.000 0.000000954 0.000000 6
57783 16708.823 6.823918338 0.000 0.000000954 0.000000 6
The second column is the seconds this day.
The 3rd column is the offset from the servers you are using.
It should be changing slowly.  If it is slow enough, ntpd will correct 
it by adjusting the drift.

You can calculate the drift as
    (offset2-offset1)*1000000/(time2 - time1)

$ dc
16708.823 14135.823 - p
2573.000
6.823918338 0.684044126 - p
6.139874212
1000000 * p
6139874.212000000
2573.000 / p
2386

That's 2386 ppm.  "slow enough" is under 500 ppm.  Sane numbers are 
under 100.  (either sign)

Without the "disable ntp", the 4th column will be the drift.  It should 
vary with temperature.  Ballpark change is 1 ppm per C.
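
For anyone who would rather not use dc, the drift formula above can be 
sketched in Python.  This is only an illustration: the function names are 
made up here, and it assumes both loopstats lines fall on the same day 
(same MJD in the first column), so it does not handle a rollover at midnight.

```python
def parse_loopstats(line):
    """Return (seconds_of_day, offset_seconds) from one loopstats line."""
    fields = line.split()
    return float(fields[1]), float(fields[2])


def drift_ppm(first_line, last_line):
    """(offset2 - offset1) * 1000000 / (time2 - time1), as in the text."""
    t1, off1 = parse_loopstats(first_line)
    t2, off2 = parse_loopstats(last_line)
    return (off2 - off1) * 1_000_000 / (t2 - t1)


# First and last sample lines from the loopstats excerpt above.
first = "57783 14135.823 0.684044126 0.000 0.000000954 0.000000 6"
last = "57783 16708.823 6.823918338 0.000 0.000000954 0.000000 6"

print(f"{drift_ppm(first, last):.0f} ppm")
```

The result agrees with the 2386 ppm that dc produced.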

-------------------------

Collated system information;

$ uname -a
Linux deb-ppc.example.com 3.16.0-4-powerpc #1 Debian 3.16.39-1 
(2016-12-30) ppc GNU/Linux

$ cat /etc/issue
Debian GNU/Linux 8

--------------

First system - one used in sample case shown above, 1.5GHz CPU.

/proc/cpuinfo says:
processor       : 0
cpu             : 7447A, altivec supported
clock           : 1499.999994MHz
revision        : 1.2 (pvr 8003 0102)
bogomips        : 83.20
timebase        : 41600571
platform        : PowerMac
model           : PowerMac10,2
machine         : PowerMac10,2
motherboard     : PowerMac10,2 MacRISC3 Power Macintosh
detected as     : 287 (Mac mini (Late 2005))
pmac flags      : 00000010
L2 cache        : 512K unified
pmac-generation : NewWorld
Memory          : 512 MB

--------------

From a second system (labelled as a 1.42GHz machine by Apple):
Same install image as first system;

$ more /proc/cpuinfo
processor       : 0
cpu             : 7447A, altivec supported
clock           : 1416.666661MHz
revision        : 1.2 (pvr 8003 0102)
bogomips        : 83.24
timebase        : 41620907
platform        : PowerMac
model           : PowerMac10,1
machine         : PowerMac10,1
motherboard     : PowerMac10,1 MacRISC3 Power Macintosh
detected as     : 287 (Mac mini)
pmac flags      : 00000010
L2 cache        : 512K unified
pmac-generation : NewWorld
Memory          : 512 MB

From syslog:

30 Jan 19:46:40 ntpd[3773]: frequency error 2712 PPM exceeds tolerance 
500 PPM
30 Jan 19:46:43 ntpd[3773]: ntpd exiting on signal 15 (Terminated)

-----------------

Open Firmware provides:

1.42GHz machine
    41620997 027b1605 timebase-frequency
  1416666661 54709e25 clock-frequency
  1415000000 54572fc0 rounded-clock-frequency
  1415113906 5458ecb2 recalced-clock-frequency
   166483989 09ec5815 bus-frequency
   166666666 09ef21aa config-bus-frequency
          17 00000011 processor-to-bus-ratio*2

1.5GHz machine
    41600571 027ac63b timebase-frequency
  1499999994 59682efa clock-frequency
  1498000000 5949aa80 rounded-clock-frequency
  1497620583 5943e067 recalced-clock-frequency
   166402287 09eb18ef bus-frequency
   166666666 09ef21aa config-bus-frequency
          18 00000012 processor-to-bus-ratio*2
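
The properties above can be cross-checked with a little arithmetic.  The 
Python sketch here tests three things: that each hex value matches its 
decimal value, that clock-frequency equals config-bus-frequency times the 
processor-to-bus ratio, and that timebase-frequency equals bus-frequency 
divided by 4.  The last two relationships are assumptions inferred from 
these two dumps, not something documented by Apple or by the kernel.

```python
# Values copied from the Open Firmware dump above: (decimal, hex) per property.
MACHINES = {
    "1.42GHz": {
        "timebase-frequency": (41620997, "027b1605"),
        "clock-frequency": (1416666661, "54709e25"),
        "bus-frequency": (166483989, "09ec5815"),
        "config-bus-frequency": (166666666, "09ef21aa"),
        "processor-to-bus-ratio*2": (17, "00000011"),
    },
    "1.5GHz": {
        "timebase-frequency": (41600571, "027ac63b"),
        "clock-frequency": (1499999994, "59682efa"),
        "bus-frequency": (166402287, "09eb18ef"),
        "config-bus-frequency": (166666666, "09ef21aa"),
        "processor-to-bus-ratio*2": (18, "00000012"),
    },
}


def check(props):
    """Return a dict of named consistency checks for one machine."""
    dec = {name: value for name, (value, _) in props.items()}
    ratio_x2 = dec["processor-to-bus-ratio*2"]
    return {
        "hex matches decimal": all(int(h, 16) == d for d, h in props.values()),
        # Assumption: clock = config bus frequency * (ratio*2) / 2, truncated.
        "clock from config bus":
            dec["config-bus-frequency"] * ratio_x2 // 2 == dec["clock-frequency"],
        # Assumption: timebase = bus frequency / 4, truncated.
        "timebase is bus/4":
            dec["bus-frequency"] // 4 == dec["timebase-frequency"],
    }


for name, props in MACHINES.items():
    print(name, check(props))
```

If these relationships hold, clock-frequency is derived from the nominal 
config-bus-frequency, while timebase-frequency follows bus-frequency, which 
differs between the two units and so looks measured rather than fixed.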

-----------------

Reports of possibly similar issues elsewhere

"Clock Drift on Mac Mini (G4-based), adjtimex, ntp" - undated
http://i1.dk/misc/mac_mini_clock_drift_adjtimex_ntp.html

"System clock falls behind quickly on Mac mini G4" (2014 on FreeBSD)
https://lists.freebsd.org/pipermail/freebsd-ppc/2014-April/006931.html


* Re: Timekeeping oddities on MacMini G4s
  2017-02-01  3:10 Timekeeping oddities on MacMini G4s Hugh Blemings
@ 2017-02-01  3:34 ` Benjamin Herrenschmidt
  2017-02-01  6:59   ` Hal Murray
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2017-02-01  3:34 UTC (permalink / raw)
  To: Hugh Blemings, linuxppc-dev
  Cc: devel, Hal Murray, Paul Mackerras, Michael Ellerman

On Wed, 2017-02-01 at 14:10 +1100, Hugh Blemings wrote:
> Hi,
> 
> I've recently been lurking on the ntpsec project mailing list.
> 
> One of the developers (Hal Murray, CCd) has been doing some tests of the 
> codebase using FreeBSD and Debian on a G4 based Mac Mini - this largely 
> motivated by checking all is well on a big-endian system.
> 
> Hal has identified what appears to be a systemic inaccuracy in kernel 
> timekeeping of around 2500ppm on both Linux and FreeBSD and on several 
> different MacMini G4s (1.42GHz and 1.5GHz variants)
> 
> That it appears on both Linux and FreeBSD kernels and some other data 
> points leads us to wonder if the CPU frequency reported by Open Firmware 
> being different to the actual raw clock is the root cause, but I/we 
> speculate at this point.

Right, we just use the value provided by Open Firmware. Any chance you
can try with MacOS X ?

From the value in the properties you showed me (and the ones I have in
some DT snapshots) it looks like the value isn't fixed but somewhat
calibrated by Open Firmware during boot.

It could be that this calibration sucks. MacOS X seems to do its own
calibration at boot time based on the KeyLargo timer. It would be useful
to see if their stuff is more precise, we could write something similar
for Linux and BSD.

Cheers,
Ben.

 


* Re: Timekeeping oddities on MacMini G4s
  2017-02-01  3:34 ` Benjamin Herrenschmidt
@ 2017-02-01  6:59   ` Hal Murray
  2017-02-01  7:13     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Hal Murray @ 2017-02-01  6:59 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Hugh Blemings, linuxppc-dev, devel, Hal Murray, Paul Mackerras,
	Michael Ellerman

Thanks.

benh@kernel.crashing.org said:
> Right, we just use the value provided by Open Firmware. Any chance you can
> try with MacOS X ? 

Not easily.  I'm using boxes from eBay.  They didn't come with CDs and I've 
already installed other software.


> From the value in the properties you showed me (and the ones I have in some
> DT snapshots) it looks like the value isn't fixed but somewhat calibrated by
> Open Firmware during boot. 

I rebooted several times.  It always got the exact same clock speed numbers.

I don't know anything about the insides of the PowerPC chip.  Can you confirm 
that the kernel time keeping works off an always ticking register similar to 
the Intel TSC and uses the timebase-frequency as the scale factor?

If so, I should be able to "fix" it from Open Firmware.  I tried that but 
things got worse.  I could easily have fatfingered something but more likely 
my reasoning for computing the right value was buggy.  I guess I'll try again.

I see that powerpc/kernel/time.c reads both timebase-frequency and 
clock-frequency, but doesn't seem to use clock-frequency.  Was that just a 
handy place to read it that got called before anybody else needed it?



-- 
These are my opinions.  I hate spam.


* Re: Timekeeping oddities on MacMini G4s
  2017-02-01  6:59   ` Hal Murray
@ 2017-02-01  7:13     ` Benjamin Herrenschmidt
  2017-02-01  7:56       ` Hal Murray
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2017-02-01  7:13 UTC (permalink / raw)
  To: Hal Murray
  Cc: Hugh Blemings, linuxppc-dev, devel, Paul Mackerras, Michael Ellerman

On Tue, 2017-01-31 at 22:59 -0800, Hal Murray wrote:
> Thanks.
> 
> benh@kernel.crashing.org said:
> > Right, we just use the value provided by Open Firmware. Any chance you can
> > try with MacOS X ? 
> 
> Not easily.  I'm using boxes from eBay.  They didn't come with CDs and I've 
> already installed other software.

Ok, I do have one though somewhere with OS X on it. If you give me instructions
on how to test (I know near to nothing about ntpsec), I should be able to compile
and run it.
> 
> > From the value in the properties you showed me (and the ones I have in some
> > DT snapshots) it looks like the value isn't fixed but somewhat calibrated by
> > Open Firmware during boot. 
> 
> I rebooted several times.  It always got the exact same clock speed numbers.

Interesting. Though different units get different numbers...

> I don't know anything about the insides of the PowerPC chip.  Can you confirm 
> that the kernel time keeping works off an always ticking register similar to 
> the Intel TSC and uses the timebase-frequency as the scale factor?

It should be externally clocked on these CPUs. Either that or a divisor of
the bus frequency, I don't remember, but I *think* Apple uses an external
clock.

But yes, the timebase is supposed to be always running at a constant speed
which is the timebase-frequency (no scaling, the register is always running
at *that* speed).
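
To make the consequence concrete, here is a minimal Python sketch (not the 
kernel's code) of what happens when ticks are converted to seconds with an 
advertised rate that differs from the real one.  ADVERTISED_HZ is the 
timebase-frequency from the 1.5GHz dump; ACTUAL_HZ is a hypothetical value 
chosen only for illustration.

```python
def ticks_to_seconds(ticks, timebase_frequency):
    """Convert a timebase tick count to seconds using the advertised rate."""
    return ticks / timebase_frequency


def clock_error_ppm(actual_hz, advertised_hz):
    """Rate error of the resulting clock; negative means it runs slow."""
    return (actual_hz / advertised_hz - 1.0) * 1_000_000


ADVERTISED_HZ = 41_600_571   # timebase-frequency on the 1.5GHz machine
ACTUAL_HZ = 41_500_000       # hypothetical real tick rate, illustration only

real_seconds = 3600
ticks = ACTUAL_HZ * real_seconds            # ticks that really elapse
measured = ticks_to_seconds(ticks, ADVERTISED_HZ)

print(f"kernel sees {measured:.3f} s for {real_seconds} s of real time")
print(f"rate error: {clock_error_ppm(ACTUAL_HZ, ADVERTISED_HZ):.1f} ppm")
```

With these made-up numbers the clock loses several seconds per hour, which 
is the same order of magnitude as the drift reported in this thread.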

> If so, I should be able to "fix" it from Open Firmware.  I tried that but 
> things got worse.  I could easily have fatfingered something but more likely 
> my reasoning for computing the right value was buggy.  I guess I'll try again.
> 
> I see that powerpc/kernel/time.c reads both timebase-frequency and 
> clock-frequency, but doesn't seem to use clock-frequency.  Was that just a 
> handy place to read it that got called before anybody else needed it?

Right, it's for display in /proc/cpuinfo in absence of a specific frequency
control driver for the platform (there should be one for the mac mini though).

Cheers,
Ben.


* Re: Timekeeping oddities on MacMini G4s
  2017-02-01  7:13     ` Benjamin Herrenschmidt
@ 2017-02-01  7:56       ` Hal Murray
  2017-02-05  0:19         ` Fred Wright
  2017-02-07  2:21         ` Michael Ellerman
  0 siblings, 2 replies; 14+ messages in thread
From: Hal Murray @ 2017-02-01  7:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Hal Murray, Hugh Blemings, linuxppc-dev, devel, Paul Mackerras,
	Michael Ellerman


benh@kernel.crashing.org said:
> Ok, I do have one though somewhere with OS X on it. If you give me
> instructions on how to test (I know near to nothing about ntpsec), I should
> be able to compile and run it.

I'm assuming you are already running the normal ntpd from ntp classic, or 
Apple's version of it.

ntpq -c "rv 0 frequency" <host-name, defaults to localhost>
will get you the fudge-factor that ntpd passes to the kernel to get
the clock ticking accurately.  Units are parts-per-million.

There is a source-address filter in ntp.conf (restrict is the keyword), so 
try from localhost if it doesn't work from the net.

The problem that started this is that it's off by more than 500 ppm.  If all 
the arithmetic and documentation is correct, it should be the crystal error.  
A few or few 10s of ppm is reasonable at normal temperature.  Over 50 is a 
bit strange, but anything under 100 is within normal.  Over 100 is getting 
suspicious but could easily be due to some round off someplace.



ntpsec should be the same as ntp classic.  I tried ntp classic on FreeBSD 
(same trouble) but haven't tried it on Debian.

If you want to try ntpsec...

git clone git@gitlab.com:NTPsec/ntpsec.git xxx
cd xxx
./waf configure build check

I think it builds cleanly on OS-X, but I can't verify that.

ps ax | grep ntpd  # to get args
service ntpd stop
./build/main/ntpd/ntpd <args-from-above>

Unless you are doing something unusual, it should run with your existing 
ntp.conf and get the same frequency correction.



-- 
These are my opinions.  I hate spam.


* Re: Timekeeping oddities on MacMini G4s
  2017-02-01  7:56       ` Hal Murray
@ 2017-02-05  0:19         ` Fred Wright
  2017-02-05  3:32           ` Hal Murray
  2017-02-05 23:22           ` Benjamin Herrenschmidt
  2017-02-07  2:21         ` Michael Ellerman
  1 sibling, 2 replies; 14+ messages in thread
From: Fred Wright @ 2017-02-05  0:19 UTC (permalink / raw)
  To: linuxppc-dev, devel


On Tue, 31 Jan 2017, Hal Murray wrote:

> benh@kernel.crashing.org said:
> > Right, we just use the value provided by Open Firmware. Any chance you can

That seems inconsistent with the following comment in
arch/powerpc/kernel/time.c:

 * TODO (not necessarily in this file):
 * - improve precision and reproducibility of timebase frequency
 * measurement at boot time.

Unless it's an outdated comment that nobody bothered to remove.

> > From the value in the properties you showed me (and the ones I have in some
> > DT snapshots) it looks like the value isn't fixed but somewhat calibrated by
> > Open Firmware during boot.

Or by the OS, if the comment is to be believed.  It would be interesting
to check OF values guaranteed to come directly from OF.

Runtime calibration often has issues of its own.  For example, on x86, the
kernel likes to calibrate the TSC against the RTC at boot time.  But if an
SMI intervenes during the calibration loop (which is not prevented by
disabling interrupts), it throws the calibration so badly out of whack
that the system can't keep time properly until it's rebooted.  At Google,
we had to disable ECC-related SMIs on at least one server model for that
reason.

When you think about it, the manufacturer knows perfectly well the
nominal frequency of the crystal being stuffed, and is also programming
onboard nonvolatile memory (typically EEPROM) with various parameters, so
directly reporting the nominal frequency should be much more reliable than
trying to measure it in a short test at boot time.  And detecting that
it's reported incorrectly should be the job of a diagnostic, not an OS.

One would, of course, like to base timekeeping on the *actual* frequency
rather than the nominal frequency, but measuring that accurately enough to
be useful takes longer than one would like to spend in early startup,
especially if the only accurate time source is Internet-based NTP.  The
RTC is *not* good enough for this purpose, since *its* crystal has its own
errors.

> I rebooted several times.  It always got the exact same clock speed numbers.

Most likely not runtime calibration, then.

> I don't know anything about the insides of the PowerPC chip.  Can you confirm
> that the kernel time keeping works off an always ticking register similar to
> the Intel TSC and uses the timebase-frequency as the scale factor?

That's certainly the way it's normally done on PowerPC, and a cursory
examination of the sources looks consistent with that.  The PowerPC
timebase is a 64-bit free-running counter.  Unlike the TSC, it's not
per-core.  On the plus side, that means that the values are guaranteed not
to be core-specific.  On the minus side, it means that its count rate is
lower, and it's sufficiently "distant" that accessing it is somewhat more
expensive.

The PowerPC architecture permits the timebase frequency to be variable,
but I'm not aware of any implementations that take advantage of that.  The
Motorola 32-bit implementations in general run it on the "bus clock",
which is independent of processor-clock multipliers, and is also common
across processor chips in systems with more than one.  The IBM 970 (G5)
runs it on the "mesh clock".  That can change frequencies, but by factors
of two which are accounted for in the way that the timebase counts, making
it effectively constant rate.

> If so, I should be able to "fix" it from Open Firmware.  I tried that but
> things got worse.  I could easily have fatfingered something but more likely
> my reasoning for computing the right value was buggy.  I guess I'll try again.

You are aware, aren't you, that frequency errors reported by NTPd have the
wrong sign?  I.e., a negative value in the driftfile means that the
frequency of your local clock oscillator is too high.  I imagine it's too
late to fix that now, by decades.

> I see that powerpc/kernel/time.c reads both timebase-frequency and
> clock-frequency, but doesn't seem to use clock-frequency.  Was that just a
> handy place to read it that got called before anybody else needed it?

Perhaps there's some way that it's reported to humans.

On Tue, 31 Jan 2017, Hal Murray wrote:
> benh@kernel.crashing.org said:
> > Ok, I do have one though somewhere with OS X on it. If you give me
> > instructions on how to test (I know near to nothing about ntpsec), I should
> > be able to compile and run it.
>
> I'm assuming you are already running the normal ntpd from ntp classic, or
> Apple's version of it.

Or perhaps the one from MacPorts, which is close to the ntp.org version.

> ntpq -c "rv 0 frequency" <host-name, defaults to localhost>
> will get you the fudge-factor that ntpd passes to the kernel to get
> the clock ticking accurately.  Units are parts-per-million.

And three decimal places is at least two too few if you're using a
rubidium-based frequency reference. :-)

> The problem that started this is that it's off by more than 500 ppm.  If all
> the arithmetic and documentation is correct, it should be the crystal error.
> A few or few 10s of ppm is reasonable at normal temperature.  Over 50 is a
> bit strange, but anything under 100 is within normal.  Over 100 is getting
> suspicious but could easily be due to some round off someplace.

Generally, yes.  Tolerances on run-of-the-mill crystals are usually 100ppm
or better, with 50 being quite common.  I imagine that the 500ppm limit is
intended as a fairly loose sanity check, on the theory that if it's that
far off, it's unclear whether it's due to frequency confusion or general
brokenness.

> If you want to try ntpsec...
>
> git clone git@gitlab.com:NTPsec/ntpsec.git xxx
> cd xxx
> ./waf configure build check
>
> I think it builds cleanly on OS-X, but I can't verify that.

Only on the very latest version (10.12 "Sierra").  Otherwise, the build
fails because the clock_gettime/clock_settime fallback code is broken in
multiple ways.  Since the last PPC-compatible OSX was 10.5, this would be
a no go by seven major versions.

Fred Wright


* Re: Timekeeping oddities on MacMini G4s
  2017-02-05  0:19         ` Fred Wright
@ 2017-02-05  3:32           ` Hal Murray
  2017-02-05 15:36             ` Frank Nicholas
  2017-02-05 23:22           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 14+ messages in thread
From: Hal Murray @ 2017-02-05  3:32 UTC (permalink / raw)
  To: Fred Wright; +Cc: linuxppc-dev, devel, hmurray

Thanks.

fw@fwright.net said:
> That seems inconsistent with the following comment in arch/powerpc/kernel/
> time.c:

>  * TODO (not necessarily in this file):
>  * - improve precision and reproducibility of timebase frequency
>  * measurement at boot time. 

I didn't see any calibration code, but I could easily have missed it.  I'd 
expect it to print the result, and I haven't found that either.

> Or by the OS, if the comment is to be believed.  It would be interesting to
> check OF values guaranteed to come directly from OF. 

>> I rebooted several times.  It always got the exact same clock speed numbers.
> Most likely not runtime calibration, then.

I've checked the OF numbers quite a few times before boot.  They are also available after boot as /proc/device-tree/cpus/PowerPC,G4@0/
Unless I patch something, they have always been the same value.

----------

More data...

I have 3 Mac minis.  They came from eBay.  (Without trying, I ended up with 3 different CPU speeds: 1.25, 1.42, and 1.5.)

I think the system bus runs at 166 MHz and the time keeping register runs at 1/4 of that.
That would be 41.5.

I have a hack that reads that register and prints out the time and register value every minute.  I'm assuming that ntpd is running and stepping the clock to keep time close-enough.  If I average over a long enough time, I should be able to compute the actual frequency of that register.

I did that.  I got 41.501276.  I told Open Firmware to use that number.  Happiness.  I haven't gone back to verify that 41.5 without the low digits works correctly, or to get the actual drift.
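
The measurement described above (log the time and the register value once a minute, then average) could be sketched like this in Python.  This is not the actual hack: the samples are synthetic, the names are invented for illustration, and a least-squares fit stands in for whatever averaging the real program does.

```python
def estimate_frequency(samples):
    """Least-squares slope of ticks versus seconds, in ticks per second.

    samples is a list of (wall_clock_seconds, register_value) pairs.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    num = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den


# Synthetic data: one sample per minute for an hour, ticking at TRUE_HZ,
# with a few ticks of alternating jitter standing in for read latency.
TRUE_HZ = 41_501_276
samples = [
    (60.0 * i, TRUE_HZ * 60 * i + (7 if i % 2 else -7))
    for i in range(61)
]

print(f"estimated rate: {estimate_frequency(samples) / 1e6:.6f} MHz")
```

The estimate is only as good as the wall-clock timestamps, which is why ntpd has to be running and keeping the clock close while the samples are collected.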

On the second system, I just started with 41.5.  More happiness.  That was FreeBSD.

It didn't work on the 3rd system.  (I think that's the same one that caused the troubles reported in the first message in this thread.)  With a bit of trial and error, 41.6215 is within 5 ppm.  But my program that measures the frequency prints out 41.501338.  That's averaged over 20 hours.  The last few digits wobble around.

I have no idea how 41.625 turns into 41.5.  Does anybody else?
Is there anything I should look for and/or any experiments I should run to collect more data?

I think I'll go back and repeat things with more careful notes.
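
One comparison that may be worth keeping in the notes is how far the Open Firmware timebase-frequency sits from the measured rate.  The Python sketch here assumes the 3rd system is the 1.5GHz machine from the first message, which is only a guess, and the function name is made up.

```python
def ppm_difference(value, reference):
    """Relative difference of value from reference, in parts per million."""
    return (value - reference) / reference * 1_000_000


OF_TIMEBASE_HZ = 41_600_571   # timebase-frequency from the 1.5GHz dump
MEASURED_HZ = 41_501_338      # 20-hour average reported above (41.501338 MHz)

excess = ppm_difference(OF_TIMEBASE_HZ, MEASURED_HZ)
print(f"Open Firmware value is {excess:.0f} ppm above the measured rate")
```

Under that assumption the difference comes out near 2390 ppm, which is close to the 2386 ppm drift computed from loopstats in the first message.  It does not explain why 41.6215 appeared to work.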

-----------

I think the 1.416 is correct.  (rather than 1.42)

1.25 is 7.5 * 166.666
1.416 is 8.5 * 166.666
1.5 is 9 * 166.666

-------

There is a potential problem in this area.  166.666 is really 6 nanoseconds.  But you order crystals by frequency rather than cycle time.  So the crystal is probably 166 or 166.6 depending on how many digits the order form had.  Maybe 167 or 166.7 if somebody rounded up.  But maybe they wouldn't round up because that might push something over a timing spec.  Mumble.  If it were easier to take apart, I'd look inside to see if I could find the crystal and see what was printed on it.

0.666 out of 166 is 4000 ppm.  That's the right ballpark but I don't see anything that matches close enough to explain any of my observations.  166.6 is 400 ppm which ntpd should be able to handle.

Does anybody know the actual frequency of the crystal on a Mac mini?  (It could easily be some sub-multiple.)

166 / 4 is 41.5 which matches what I measured.
166.6 / 4 is 41.65.  That's about 3600 ppm from 41.5, but only 720 ppm from 41.62.
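
The ppm figures for the candidate frequencies can be tabulated with a few lines of Python.  This is just the arithmetic, with a made-up helper name; the pairs are the ones discussed in this message.

```python
def ppm(value, reference):
    """Relative difference of value from reference, in parts per million."""
    return (value - reference) / reference * 1_000_000


# (label, value in MHz, reference in MHz)
comparisons = [
    ("166.666 vs 166", 166.666, 166.0),
    ("166.666 vs 166.6", 166.666, 166.6),
    ("41.65 vs 41.5", 41.65, 41.5),
    ("41.65 vs 41.62", 41.65, 41.62),
    ("41.62 vs 41.5", 41.62, 41.5),
]

results = {label: ppm(value, ref) for label, value, ref in comparisons}
for label, value in results.items():
    print(f"{label:18s} {value:8.0f} ppm")
```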

There is also the possibility of one of the EMI clock spreading chips.  I think I saw one data sheet that said 1/2% and 30 KHz.  (I think the PCI specs were updated to allow that.  The parameters are important if you are using a PLL chip to make a zero-delay clock buffer.)  I haven't seen any numbers that would support that.

--------

Shortly after I started working at Xerox way back in 1976, Ed Taft put out a new version of the operating system for the Alto.  It tweaked the magic timekeeping constant.  The Alto was designed to run at 170 ns.  The crystal was 5.88 MHz.  The original software had derived the constant from 170 ns which is 5.882352 rather than 5.88.  If I did the math right, that's 400 ppm or 30 seconds per day.  That's not enough to notice if you only watch for a few minutes but easy to catch if you watch for several hours.
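
The Alto arithmetic can be redone in a few lines of Python.  The variable names are invented here; the inputs are the 170 ns cycle time and the 5.88 MHz crystal mentioned above.

```python
PERIOD_NS = 170.0     # design cycle time
CRYSTAL_MHZ = 5.88    # crystal actually fitted

# Frequency implied by the cycle time: 1000 / ns gives MHz.
derived_mhz = 1000.0 / PERIOD_NS
error_ppm = (derived_mhz - CRYSTAL_MHZ) / CRYSTAL_MHZ * 1_000_000
seconds_per_day = error_ppm * 1e-6 * 86_400

print(f"derived frequency: {derived_mhz:.6f} MHz")
print(f"error: {error_ppm:.0f} ppm, about {seconds_per_day:.0f} s per day")
```

That gives roughly 400 ppm, and a little over 30 seconds per day.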


-- 
These are my opinions.  I hate spam.


* Re: Timekeeping oddities on MacMini G4s
  2017-02-05  3:32           ` Hal Murray
@ 2017-02-05 15:36             ` Frank Nicholas
  2017-02-06  5:59               ` Hal Murray
  0 siblings, 1 reply; 14+ messages in thread
From: Frank Nicholas @ 2017-02-05 15:36 UTC (permalink / raw)
  To: Hal Murray; +Cc: Fred Wright, linuxppc-dev, devel

I’ve had mine apart many times (for memory upgrade, sensors for use in a vehicle, etc. - http://mt.nfshost.com )

Tell me what to look for, and I’ll take as many hi-res pictures as you want.

Thanks,
Frank

> On Feb 4, 2017, at 10:32 PM, Hal Murray <hmurray@megapathdsl.net> wrote:
>
> Mumble.  If it were easier to take apart, I'd look inside to see if I could find the crystal and see what was printed on it.
>


* Re: Timekeeping oddities on MacMini G4s
  2017-02-05  0:19         ` Fred Wright
  2017-02-05  3:32           ` Hal Murray
@ 2017-02-05 23:22           ` Benjamin Herrenschmidt
  2017-02-06  2:12             ` Segher Boessenkool
  1 sibling, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2017-02-05 23:22 UTC (permalink / raw)
  To: Fred Wright, linuxppc-dev, devel

On Sat, 2017-02-04 at 16:19 -0800, Fred Wright wrote:
> On Tue, 31 Jan 2017, Hal Murray wrote:
> 
> > > benh@kernel.crashing.org said:
> > > Right, we just use the value provided by Open Firmware. Any chance you can
> 
> That seems inconsistent with the following comment in
> arch/powerpc/kernel/time.c:
> 
>  * TODO (not necessarily in this file):
>  * - improve precision and reproducibility of timebase frequency
>  * measurement at boot time.

That comment is probably ancient ;-) Different platforms use different
methods of calculating or obtaining the TB freq within arch/powerpc.

The most common however for anything recent is to just pick the value
from the device-tree. However, I noticed that MacOS X does "calibrate"
it using the timers provided by the KeyLargo chip.

> Unless it's an outdated comment that nobody bothered to remove.
> 
> > > From the value in the properties you showed me (and the ones I have in some
> > > DT snapshots) it looks like the value isn't fixed but somewhat calibrated by
> > > Open Firmware during boot.
> 
> Or by the OS, if the comment is to be believed.  It would be interesting
> to check OF values guaranteed to come directly from OF.

We don't change the DT values. Looking at some old dumps of Apple's OF implementation
I have lying around, it appears that the timebase either comes from some specific
configuration area of the flash or from some very early boot asm calibration.

> Runtime calibration often has issues of its own.  For example, on x86, the
> kernel likes to calibrate the TSC against the RTC at boot time.  But if an
> SMI intervenes during the calibration loop (which is not prevented by
> disabling interrupts), it throws the calibration so badly out of whack
> that the system can't keep time properly until it's rebooted.  At Google,
> we had to disable ECC-related SMIs on at least one server model for that
> reason.

Right. We don't have SMIs on Power and we can probably make sure we disable
(or catch & retry) things like Machine Checks. So we can make it slightly
more accurate.

> When you think about it, the manufacturer knows perfectly well the
> nominal frequency of the crystal being stuffed, and is also programming
> onboard nonvolatile memory (typically EEPROM) with various parameters, so
> directly reporting the nominal frequency should be much more reliable than
> trying to measure it in a short test at boot time.  And detecting that
> it's reported incorrectly should be the job of a diagnostic, not an OS.

Right. On recent POWER servers it's architected. The core always sees 512 MHz,
though I don't know how precise that is (see below).

> One would, of course, like to base timekeeping on the *actual* frequency
> rather than the nominal frequency, but measuring that accurately enough to
> be useful takes longer than one would like to spend in early startup,
> especially if the only accurate time source is Internet-based NTP.  The
> RTC is *not* good enough for this purpose, since *its* crystal has its own
> errors.
> 
> > I rebooted several times.  It always got the exact same clock speed numbers.
> 
> Most likely not runtime calibration, then.

Yup

> > I don't know anything about the insides of the PowerPC chip.  Can you confirm
> > that the kernel time keeping works off an always ticking register similar to
> > the Intel TSC and uses the timebase-frequency as the scale factor?
> 
> That's certainly the way it's normally done on PowerPC, and a cursory
> examination of the sources looks consistent with that.  The PowerPC
> timebase is a 64-bit free-running counter.  Unlike the TSC, it's not
> per-core.

Actually it is, see below :-)

>   On the plus side, that means that the values are guaranteed not
> to be core-specific.  On the minus side, it means that its count rate is
> lower, and it's sufficiently "distant" that accessing it is somewhat more
> expensive.

Right so there are various configuration options and ways to feed the timebase
to PowerPC chips depending on the generation and manufacturer. On the old
32-bit chips, typically it was either a divisor of the bus frequency or
externally clocked. Apple typically used the latter.

However there was always an architectural requirement that it was perfectly
synchronized between cores.

On IBM POWER chips since P6 at least, there's a unit in the chip called the
ChipTOD that provides a reference clock to all the cores at a 16th of the
timebase frequency iirc.

There's a special protocol to slave the TODs of secondary chips to the primary
along with an automatic fallback to a backup network in case of failure.

The cores feed the top bits of the TB from that. The bottom bits are locally
generated by each core in such a way that guarantees that the TB can never
be observed going backward.

> The PowerPC architecture permits the timebase frequency to be variable,
> but I'm not aware of any implementations that take advantage of that.

I think it's pretty much accepted that this would be a very bad idea
and no implementation did it.

>   The
> Motorola 32-bit implementations in general run it on the "bus clock",
> which is independent of processor-clock multipliers, and is also common
> across processor chips in systems with more than one.

There's also a TBEN external pin iirc which can be used to feed it.

>   The IBM 970 (G5)
> runs it on the "mesh clock".  That can change frequencies, but by factors
> of two which are accounted for in the way that the timebase counts, making
> it effectively constant rate.
>
> > If so, I should be able to "fix" it from Open Firmware.  I tried that but
> > things got worse.  I could easily have fatfingered something but more likely
> > my reasoning for computing the right value was buggy.  I guess I'll try again.
> 
> You are aware, aren't you, that frequency errors reported by NTPd have the
> wrong sign?  I.e., a negative value in the driftfile means that the
> frequency of your local clock oscillator is too high.  I imagine it's too
> late to fix that now, by decades.
> 
> > I see that powerpc/kernel/time.c reads both timebase-frequency and
> > clock-frequency, but doesn't seem to use clock-frequency.  Was that just a
> > handy place to read it that got called before anybody else needed it?
> 
> Perhaps there's some way that it's reported to humans.

Yup, it's the default for /proc/cpuinfo in the absence of a dedicated cpufreq
driver for the platform.
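
For anyone who wants to compare the two properties on a running system, here
is a minimal Python sketch. It assumes the usual Linux export of device-tree
properties under /proc/device-tree, with each integer property stored
big-endian. The glob pattern for the CPU node and the helper names are
assumptions made for illustration, not something taken from this thread.

```python
import glob


def decode_cell(raw: bytes) -> int:
    """Decode a device-tree integer property (big-endian, any width)."""
    return int.from_bytes(raw, byteorder="big")


def read_cpu_property(name: str, pattern: str = "/proc/device-tree/cpus/*/{}"):
    """Return {path: value} for every CPU node that carries the property.

    The default pattern is an assumption about where the CPU nodes live.
    """
    values = {}
    for path in glob.glob(pattern.format(name)):
        with open(path, "rb") as f:
            values[path] = decode_cell(f.read())
    return values


if __name__ == "__main__":
    # Prints nothing on machines without a matching device-tree node.
    for prop in ("timebase-frequency", "clock-frequency"):
        for path, hz in read_cpu_property(prop).items():
            print(f"{path}: {hz} Hz")
```

On a machine without /proc/device-tree the lookup simply returns nothing, so
the sketch is harmless to run elsewhere.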

> On Tue, 31 Jan 2017, Hal Murray wrote:
> > benh@kernel.crashing.org said:
> > > Ok, I do have one though somewhere with OS X on it. If you give me
> > > instructions on how to test (I know near to nothing about ntpsec), I should
> > > be able to compile and run it.
> >
> > I'm assuming you are already running the normal ntpd from ntp classic, or
> > Apple's version of it.
> 
> Or perhaps the one from MacPorts, which is close to the ntp.org version.
> 
> > ntpq -c "rv 0 frequency" <host-name, defaults to localhost>
> > will get you the fudge-factor that ntpd passes to the kernel to get
> > the clock ticking accurately.  Units are parts-per-million.
> 
> And three decimal places is at least two too few if you're using a
> rubidium-based frequency reference. :-)
> 
> > The problem that started this is that it's off by more than 500 ppm.  If all
> > the arithmetic and documentation is correct, it should be the crystal error.
> > A few or few 10s of ppm is reasonable at normal temperature.  Over 50 is a
> > bit strange, but anything under 100 is within normal.  Over 100 is getting
> > suspicious but could easily be due to some round off someplace.
> 
> Generally, yes.  Tolerances on run-of-the-mill crystals are usually 100ppm
> or better, with 50 being quite common.  I imagine that the 500ppm limit is
> intended as a fairly loose sanity check, on the theory that if it's that
> far off, it's unclear whether it's due to frequency confusion or general
> brokenness.
> 
> > If you want to try ntpsec...
> >
> > git clone git@gitlab.com:NTPsec/ntpsec.git xxx
> > cd xxx
> > ./waf configure build check
> >
> > I think it builds cleanly on OS-X, but I can't verify that.
> 
> Only on the very latest version (10.12 "Sierra").  Otherwise, the build
> fails because the clock_gettime/clock_settime fallback code is broken in
> multiple ways.  Since the last PPC-compatible OSX was 10.5, this would be
> a no go by seven major versions.
> 
> Fred Wright


* Re: Timekeeping oddities on MacMini G4s
  2017-02-05 23:22           ` Benjamin Herrenschmidt
@ 2017-02-06  2:12             ` Segher Boessenkool
  0 siblings, 0 replies; 14+ messages in thread
From: Segher Boessenkool @ 2017-02-06  2:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Fred Wright, linuxppc-dev, devel

On Mon, Feb 06, 2017 at 10:22:01AM +1100, Benjamin Herrenschmidt wrote:
> >   On the plus side, that means that the values are guaranteed not
> > to be core-specific.  On the minus side, it means that its count rate is
> > lower, and it's sufficiently "distant" that accessing it is somewhat more
> > expensive.
> 
> Right so there are various configuration options and ways to feed the timebase
> to PowerPC chips depending on the generation and manufacturer. On the old
> 32-bit chips, typically it was either a divisor of the bus frequency or
> externally clocked. Apple typically used the latter.

On all 6xx and most 7xx/7xxx it is 1:4 of the bus clock.  And on the
newer machines the clock chip uses clock spreading, so you cannot
calibrate with a dumb fast routine.  (The time base ticks pretty slowly
anyhow, so you cannot calibrate quickly if you want decent results; but
with clock spreading you either have to measure for many seconds, or you
need to find the period of the spreading and work with that.)
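
The effect of the measurement window can be sketched numerically. The Python
below is only an illustration: it assumes a sinusoidal spreading profile and
made-up values for the nominal frequency, spreading depth and spreading
period, none of which come from this thread or from any real clock chip.

```python
import math


def measured_frequency(f0_hz, depth, period_s, window_s):
    """Average frequency observed over [0, window_s].

    Models the instantaneous rate as f0 * (1 + depth * sin(2*pi*t/period)),
    which is an illustrative assumption, and integrates it analytically.
    """
    phase = 2.0 * math.pi * window_s / period_s
    ticks = f0_hz * (
        window_s + depth * period_s / (2.0 * math.pi) * (1.0 - math.cos(phase))
    )
    return ticks / window_s


def ppm_error(measured_hz, nominal_hz):
    """Relative error of a measurement against the nominal rate, in ppm."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


# Hypothetical numbers: 41.5 MHz time base, 0.5% spreading, 30 kHz modulation.
F0 = 41_500_000.0
DEPTH = 0.005
PERIOD = 1.0 / 30_000.0

for label, window in [
    ("half a spreading period", PERIOD / 2),
    ("exactly ten periods", 10 * PERIOD),
    ("one second", 1.0),
]:
    err = ppm_error(measured_frequency(F0, DEPTH, PERIOD, window), F0)
    print(f"{label:>24}: {err:+.4f} ppm")
```

Under these assumptions a window that is a whole number of spreading periods
recovers the nominal rate, and a long window shrinks the residual error in
proportion to its length, which matches the two options described above.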

> > The PowerPC architecture permits the timebase frequency to be variable,
> > but I'm not aware of any implementations that take advantage of that.
> 
> I think it's pretty much accepted that this would be a very bad idea
> and no implementation did it.

See above.

> >   The
> > Motorola 32-bit implementations in general run it on the "bus clock",
> > which is independent of processor-clock multipliers, and is also common
> > across processor chips in systems with more than one.
> 
> There's also a TBEN external pin iirc which can be used to feed it.

Some implementations have an MSR bit to stop the TB as well (7450 for
example).


Segher


* Re: Timekeeping oddities on MacMini G4s
  2017-02-05 15:36             ` Frank Nicholas
@ 2017-02-06  5:59               ` Hal Murray
  0 siblings, 0 replies; 14+ messages in thread
From: Hal Murray @ 2017-02-06  5:59 UTC (permalink / raw)
  To: Frank Nicholas; +Cc: Hal Murray, Fred Wright, linuxppc-dev, devel

> Tell me what to look for, and I’ll take as many hi-res pictures as you want.

I'm looking for the frequency printed on the oscillator/crystal.

Here is a picture with several examples:
  https://en.wikipedia.org/wiki/File:Crystal_Packages.jpg

The row of Oscillators is most likely.  They also come in plastic packages.  
You will probably be able to see that they have 2 or 4 connections.  They 
will probably be quite a bit thicker than normal surface mount plastic 
packages.

Likely numbers are 166, 166.6, or any integer submultiple of those.  41.5 or 
41.65 is a good possibility.
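
For reference, 41.5 and 41.65 are the first two numbers divided by four. The
Python sketch below works that out and shows how far apart the two candidates
are in ppm. Treating the markings as MHz and using a divide-by-4 are
assumptions for illustration, and the helper name is hypothetical.

```python
def ppm_difference(actual, assumed):
    """Rate error, in ppm, of assuming `assumed` when the truth is `actual`."""
    return (actual - assumed) / assumed * 1e6


# Candidate oscillator markings from the message above, assumed to be in MHz.
candidates_mhz = [166.0, 166.6]

# Assumed divide-by-4 from oscillator to the smaller marking.
quarters_mhz = [f / 4 for f in candidates_mhz]

for osc, quarter in zip(candidates_mhz, quarters_mhz):
    print(f"{osc:.1f} / 4 = {quarter:.3f}")

gap = ppm_difference(candidates_mhz[1], candidates_mhz[0])
print(f"difference between the two candidates: {gap:.0f} ppm")
```

The gap between the two candidates is a few thousand ppm, so reading the
actual marking should make it clear which value applies.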

-- 
These are my opinions.  I hate spam.


* Re: Timekeeping oddities on MacMini G4s
  2017-02-01  7:56       ` Hal Murray
  2017-02-05  0:19         ` Fred Wright
@ 2017-02-07  2:21         ` Michael Ellerman
  2017-02-07  9:56           ` Hal Murray
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2017-02-07  2:21 UTC (permalink / raw)
  To: Hal Murray, Benjamin Herrenschmidt
  Cc: Hugh Blemings, linuxppc-dev, devel, Paul Mackerras

Hal Murray <hmurray@megapathdsl.net> writes:

> benh@kernel.crashing.org said:
>> Ok, I do have one though somewhere with OS X on it. If you give me
>> instructions on how to test (I know near to nothing about ntpsec), I should
>> be able to compile and run it.
>
> I'm assuming you are already running the normal ntpd from ntp classic, or 
> Apple's version of it.
>
> ntpq -c "rv 0 frequency" <host-name, defaults to localhost>
> will get you the fudge-factor that ntpd passes to the kernel to get
> the clock ticking accurately.  Units are parts-per-million.
>
> There is a source-address filter in ntp.conf (restrict is the keyword), so 
> try from localhost if it doesn't work from the net.
>
> The problem that started this is that it's off by more than 500 ppm.  If all 
> the arithmetic and documentation is correct, it should be the crystal error.  
> A few or few 10s of ppm is reasonable at normal temperature.  Over 50 is a 
> bit strange, but anything under 100 is within normal.  Over 100 is getting 
> suspicious but could easily be due to some round off someplace.
>
>
>
> ntpsec should be the same as ntp classic.  I tried ntp classic on FreeBSD 
> (same trouble) but haven't tried it on Debian.
>
> If you want to try ntpsec...
>
> git clone git@gitlab.com:NTPsec/ntpsec.git xxx
> cd xxx
> ./waf configure build check
>
> I think it builds cleanly on OS-X, but I can't verify that.
>
> ps ax | grep ntpd  # to get args
> service ntpd stop
> ./build/main/ntpd/ntpd <args-from-above>
>
> Unless you are doing something unusual, it should run with your existing 
> ntp.conf and get the same frequency correction.

What do I do if I don't have an existing ntp.conf?

cheers


* Re: Timekeeping oddities on MacMini G4s
  2017-02-07  2:21         ` Michael Ellerman
@ 2017-02-07  9:56           ` Hal Murray
  0 siblings, 0 replies; 14+ messages in thread
From: Hal Murray @ 2017-02-07  9:56 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Hal Murray, Benjamin Herrenschmidt, Hugh Blemings, linuxppc-dev,
	devel, Paul Mackerras

> What do I do if I don't have an existing ntp.conf?

Assuming ntpsec builds...

There are sample config files in config/
  contrib/ntp.conf.log.sample has logging, which you probably want.
It has lots of comments.

It is set up to use us.pool.ntp.org.
If you aren't in the US, you should probably change that to your country code.

If you know of any good servers near your location (by network quality, not 
miles/km), you can add them with:
  server <ip-address-or-name> iburst

It will write to several files:
  /var/lib/ntp/ntp.drift
  /var/log/ntp.log
  /var/log/ntpstats/<various>
So you need to make sure the directories exist.
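
One way to check that is a small Python sketch like the following. The paths
are the ones listed above; the helper names and the scratch-directory dry run
are assumptions made for illustration.

```python
import os
import tempfile

# Paths taken from the list above.
OUTPUT_FILES = ["/var/lib/ntp/ntp.drift", "/var/log/ntp.log"]
OUTPUT_DIRS = ["/var/log/ntpstats"]


def needed_dirs(files, dirs):
    """Directories that must exist before ntpd can write its output."""
    wanted = {os.path.dirname(f) for f in files}
    wanted.update(dirs)
    return sorted(wanted)


def ensure_dirs(dirs, root="/"):
    """Create each directory under `root` if it is missing; return the paths."""
    created = []
    for d in dirs:
        target = os.path.join(root, d.lstrip("/"))
        os.makedirs(target, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Dry run in a scratch directory; use root="/" with suitable privileges
    # to create the real directories.
    with tempfile.TemporaryDirectory() as scratch:
        for path in ensure_dirs(needed_dirs(OUTPUT_FILES, OUTPUT_DIRS), scratch):
            print("ok:", path)
```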


Turn off whatever is currently keeping time on your system.

./build/main/ntpd/ntpd -N -c ./contrib/ntp.conf.log.sample

(It may need a file name starting at /)


-- 
These are my opinions.  I hate spam.


* Timekeeping oddities on MacMini G4s
       [not found] <1316061184.389961.1486121485729@mail.yahoo.com>
@ 2017-02-03 11:41 ` Jochen Rollwagen
  0 siblings, 0 replies; 14+ messages in thread
From: Jochen Rollwagen @ 2017-02-03 11:41 UTC (permalink / raw)
  To: linuxppc-dev

Hello There,

here's the output on a Mac mini G4 1.5GHz running OS X 10.4, which hadn't 
booted that OS in ages :-)



jochen-rollwagens-mac-mini:~ jochenrollwagen$ ntpq -c "rv 0 frequency" localhost
status=c011 sync_alarm, sync_unspec, 1 event, event_restart,
frequency=0.000
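
For comparing results from several machines, the frequency field can be
pulled out of this output mechanically. The Python below is a rough sketch:
the regular expression assumes the "frequency=<number>" form shown here, and
the function names are hypothetical. Note that the sync_alarm flag suggests
ntpd had not yet synchronised, so 0.000 may simply be an initial value rather
than a measurement.

```python
import re


def parse_frequency_ppm(ntpq_output: str) -> float:
    """Extract the frequency=<value> field from `ntpq -c "rv 0 frequency"`."""
    match = re.search(r"\bfrequency=([-+]?\d+(?:\.\d+)?)", ntpq_output)
    if match is None:
        raise ValueError("no frequency field found")
    return float(match.group(1))


def seconds_per_day(ppm: float) -> float:
    """Drift an uncorrected clock accumulates in one day at this error."""
    return ppm * 86400 / 1e6


# The output quoted in this message.
sample = (
    "status=c011 sync_alarm, sync_unspec, 1 event, event_restart,\n"
    "frequency=0.000\n"
)

print("frequency:", parse_frequency_ppm(sample), "ppm")
# The roughly 2500 ppm figure comes from the first message in this thread.
print("2500 ppm is about", seconds_per_day(2500.0), "seconds per day")
```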


If there's anything else I can contribute, let me know.

Cheers

Jochen


Thread overview: 14+ messages
2017-02-01  3:10 Timekeeping oddities on MacMini G4s Hugh Blemings
2017-02-01  3:34 ` Benjamin Herrenschmidt
2017-02-01  6:59   ` Hal Murray
2017-02-01  7:13     ` Benjamin Herrenschmidt
2017-02-01  7:56       ` Hal Murray
2017-02-05  0:19         ` Fred Wright
2017-02-05  3:32           ` Hal Murray
2017-02-05 15:36             ` Frank Nicholas
2017-02-06  5:59               ` Hal Murray
2017-02-05 23:22           ` Benjamin Herrenschmidt
2017-02-06  2:12             ` Segher Boessenkool
2017-02-07  2:21         ` Michael Ellerman
2017-02-07  9:56           ` Hal Murray
     [not found] <1316061184.389961.1486121485729@mail.yahoo.com>
2017-02-03 11:41 ` Jochen Rollwagen
