linux-kernel.vger.kernel.org archive mirror
* Linux 2.6.29-rc6
@ 2009-02-23  4:31 Linus Torvalds
  2009-02-23 14:07 ` Linux 2.6.29-rc6 - Fix oops in i915_gem_retire_requests Karsten Wiese
                   ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Linus Torvalds @ 2009-02-23  4:31 UTC (permalink / raw)
  To: Linux Kernel Mailing List


This is mostly lots of small fixes, with the stats being dominated by some 
DocBook movement and an ia64 defconfig addition:

  20.4% Documentation/DocBook/
   3.9% Documentation/
   2.0% arch/arm/
  30.2% arch/ia64/configs/
   5.5% arch/x86/
   2.4% arch/
   3.8% drivers/gpu/drm/i915/
   2.3% drivers/scsi/
  12.6% drivers/
   2.2% fs/btrfs/
   5.5% fs/cifs/
   2.3% fs/

(the above is the "non-cumulative" dirstat, which doesn't add up 
subdirectories cumulatively, and thus highlights individual directories 
that contain changes, rather than the top-level directories).

But most of the changes are really pretty small, and the shortlog gives a 
feel for it. About 350 files changed, averaging roughly 20 lines of 
changes per file - but the average is somewhat misleading, because most 
changes are just a couple of lines, and then the "big" changes are about 
moving a few hundred lines of documentation or the 1601 lines of 
defconfig.

Regressions fixed, small cleanups, and some changes to help future 
merging.

		Linus

---
Adam Baker (1):
      V4L/DVB (10619): gspca - main: Destroy the URBs at disconnection time.

Adam Lackorzynski (1):
      jsm: additional device support

Al Viro (1):
      Fix incomplete __mntput locking

Alan Jenkins (1):
      PM/hibernate: fix "swap breaks after hibernation failures"

Alex Chiang (3):
      PCI: Documentation: fix minor PCIe HOWTO thinko
      [IA64] Revert "prevent ia64 from invoking irq handlers on offline CPUs"
      [IA64] Remove redundant cpu_clear() in __cpu_disable path

Alexey Dobriyan (3):
      kbuild: fix tags generation of config symbols
      mfd: fix sm501 section mismatches
      eeepc: should depend on INPUT

Alexey Starikovskiy (1):
      ACPI: EC: Add delay for slow MSI controller

Alok N Kataria (1):
      x86, vmi: TSC going backwards check in vmi clocksource

Andi Kleen (4):
      kbuild: create the source symlink earlier in the objdir
      x86, mce: reinitialize per cpu features on resume
      x86, mce: use force_sig_info to kill process in machine check
      x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown

Andrew Vasquez (3):
      [SCSI] qla2xxx: Properly acknowledge IDC notification messages.
      [SCSI] qla2xxx: Mask out 'reserved' bits while processing FLT regions.
      [SCSI] qla2xxx: Update version number to 8.03.00-k3.

Andrew Victor (2):
      [ARM] 5390/1: AT91: Watchdog fixes
      [ARM] 5391/1: AT91: Enable GPIO clocks earlier

Andrey Borzenkov (1):
      PM: Fix pm_notifiers during user mode hibernation

Aneesh Kumar K.V (3):
      ext4: Fix lockdep warning
      ext4: Initialize preallocation list_head's properly
      ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages

Anirban Chakraborty (2):
      [SCSI] qla2xxx: Remove interrupt request bit check in the response processing path in multiq mode.
      [SCSI] qla2xxx: Correct slab-error overwrite during vport creation and deletion.

Anssi Hannula (1):
      HID: move tmff and zpff devices from ignore_list to blacklist

Arjan van de Ven (4):
      scripts: add x86 register parser to markup_oops.pl
      scripts: add x86 64 bit support to the markup_oops.pl script
      Consolidate driver_probe_done() loops into one place
      PM/resume: wait for device probing to finish

Arve Hjønnevåg (2):
      PM: Wait for console in resume
      PM: Fix suspend_console and resume_console to use only one semaphore

Atsushi Nemoto (1):
      atmel_serial might lose modem status change

Avi Kivity (2):
      KVM: Avoid using CONFIG_ in userspace visible headers
      KVM: VMX: Flush volatile msrs before emulating rdmsr

Benjamin Herrenschmidt (1):
      vmalloc: add __get_vm_area_caller()

Bernhard Walle (1):
      Bernhard has moved

Bill Nottingham (1):
      vt: Declare PIO_CMAP/GIO_CMAP as compatbile ioctls.

Bjorn Helgaas (1):
      ACPI: remove CONFIG_ACPI_SYSTEM

Boaz Harrosh (1):
      bsg: Fix sense buffer bug in SG_IO

Brian King (3):
      [SCSI] ibmvfc: Fix command timeout errors
      [SCSI] ibmvfc: Fix rport relogin
      [SCSI] ibmvfc: Increase cancel timeout

Chip Coldwell (1):
      cciss: PCI power management reset for kexec

Chris Ball (1):
      x86, olpc: fix model detection without OFW

Chris Mason (5):
      Btrfs: process mount options on mount -o remount,
      Btrfs: use larger metadata clusters in ssd mode
      Btrfs: don't clean old snapshots on sync(1)
      Btrfs: make a lockdep class for the extent buffer locks
      Btrfs: check file pointer in btrfs_sync_file

Chris Wilson (16):
      drm: Potential use-after-free on error path.
      drm: Free the object ref on error.
      drm/i915: Cleanup trivial leak on execbuffer error path.
      drm/i915: hold mutex for unreference() in i915_gem_tiling.c
      drm/i915: refleak along pin() error path.
      drm: Do not leak a new reference for flink() on an existing name
      drm/i915: Set framebuffer alignment based upon the fence constraints.
      drm/i915: Release and unlock on mmap_gtt error path.
      drm/i915: unpin for an invalid memory domain.
      drm/i915: Unpin the ringbuffer if we fail to ioremap it.
      drm/i915: Unpin the hws if we fail to kmap.
      drm/i915: Unpin the fb on error during construction.
      drm/i915: Cleanup the hws on ringbuffer constrution failure.
      drm: Check for a NULL encoder when reverting on error path
      drm: Propagate failure from setting crtc base.
      drm/i915: Fix regression in 95ca9d

Christian Borntraeger (1):
      [S390] Fix timeval regression on s390

Clemens Ladisch (2):
      sound: usb-audio: fix uninitialized variable with M-Audio MIDI interfaces
      sound: virtuoso: revert "do not overwrite EEPROM on Xonar D2/D2X"

Dan Carpenter (3):
      ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling
      HID: unlock properly on error paths in hidraw_ioctl()
      sx.c: avoid referencing freed memory if copy_from_user() fails

Dan Williams (1):
      atmel-mci: fix initialization of dma slave data

Dave Hansen (1):
      powerpc/mm: Fix numa reserve bootmem page selection

David Brownell (2):
      omap_hsmmc: card detect irq bugfix
      omap_hsmmc: only MMC1 allows HCTL.SDVS != 1.8V

David Howells (1):
      mn10300: fix oprofile

David Vrabel (1):
      wusb: whci-hcd: always lock whc->lock with interrupts disabled

David Woodhouse (2):
      iommu: fix Intel IOMMU write-buffer flushing
      Fix Intel IOMMU write-buffer flushing

Davide Libenzi (1):
      timerfd: add flags check

Ed L. Cashin (1):
      aoe: ignore vendor extension AoE responses

Eric Anholt (3):
      drm/i915: Cut two args to set_to_gpu_domain that confused this tricky path.
      drm/i915: Don't let a device flush to prepare buffers clear new write_domains.
      drm/i915: Retire requests from i915_gem_busy_ioctl.

Eric Biederman (1):
      seq_file: properly cope with pread

Felix Blyakher (2):
      Revert "[XFS] use scalable vmap API"
      Revert "[XFS] remove old vmap cache"

Frank Seidel (1):
      MAINTAINERS: Switch hdaps to Frank Seidel

Frederic Weisbecker (1):
      tracing/function-graph-tracer: trace the idle tasks

Geert Uytterhoeven (1):
      m68k: atari - Rename "mfp" to "st_mfp"

Geoff Levand (1):
      powerpc/ps3: Move ps3_mm_add_memory to device_initcall

Giuseppe Bilotta (2):
      lis3lv02d: support both one- and two-byte sensors
      lis3lv02d: add axes knowledge of HP Pavilion dv5 models

Gregory CLEMENT (1):
      [ARM] 5400/1: Add support for inverted rdy_busy pin for Atmel nand device controller

H. Peter Anvin (1):
      x86, mce: remove incorrect __cpuinit for mce_cpu_features()

Hannes Reinecke (1):
      block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list

Hans Verkuil (2):
      V4L/DVB (10625): ivtv: fix decoder crash regression
      V4L/DVB (10626): ivtv: fix regression in get sliced vbi format

Hans de Goede (1):
      hwmon: Fix ACPI resource check error handling

Hartley Sweeten (1):
      [ARM] 5405/1: ep93xx: remove unused gesbc9312.h header

Heiko Carstens (1):
      [S390] fix "mem=" handling in case of standby memory

Helmut Schaa (1):
      sdhci: fix led naming

Herbert Xu (1):
      crypto: lrw - Fix big endian support

Igor Mammedov (1):
      [CIFS] Prevent OOPs when mounting with remote prefixpath.

Ilpo Järvinen (1):
      sx.c: fix dbl statement if - add missing braces

Ingo Molnar (4):
      sched: cpu hotplug fix
      inotify: fix GFP_KERNEL related deadlock
      x86: use the right protections for split-up pagetables
      PM: Split up sysdev_[suspend|resume] from device_power_[down|up], fix

Isaku Yamahata (1):
      [IA64] fixes configs and add default config for ia64 xen domU

James Smart (1):
      [SCSI] scsi_scan: add missing interim SDEV_DEL state if slave_alloc fails

Jan Kara (3):
      jbd2: Fix return value of jbd2_journal_start_commit()
      Revert "ext4: wait on all pending commits in ext4_sync_fs()"
      jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate()

Jean Delvare (2):
      mfd: terminate pcf50633 i2c_device_id list
      hwmon: (f71882fg) Hide misleading error message

Jean Pihet (2):
      omap_hsmmc: recover from transfer failures
      omap_hsmmc: Change while(); loops with finite version

Jeff Layton (3):
      cifs: refactor new_inode() calls and inode initialization
      cifs: properly handle case where CIFSGetSrvInodeNumber fails
      cifs: posix fill in inode needed by posix open

Jeff Mahoney (2):
      Btrfs: balance_level checks !child after access
      Btrfs: remove btrfs_init_path

Jens Axboe (2):
      block: fix bad definition of BIO_RW_SYNC
      block: revert part of 18ce3751ccd488c78d3827e9f6bf54e6322676fb

Jeremy Fitzhardinge (2):
      x86/cpa: make sure cpa is safe to call in lazy mmu mode
      x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption

Jesse Barnes (4):
      drm/i915: take struct mutex around fb unref
      drm/i915: Keep refs on the object over the lifetime of vmas for GTT mmap.
      drm/i915: suspend/resume GEM when KMS is active
      drm/i915: fix WC mapping in non-GEM i915 code.

Jiri Slaby (3):
      HID: fix bus endianity in file2alias
      x86_64: acpi/wakeup_64 cleanup
      x86_64: Fix S3 fail path

Johannes Weiner (3):
      slab: introduce kzfree()
      swsusp: dont fiddle with swappiness
      swsusp: clean up shrink_all_zones()

John Stultz (1):
      x86, hpet: fix for LS21 + HPET = boot hang

Joris van Rantwijk (1):
      ALSA: usb-audio - Workaround for misdetected sample rate with CM6207

Josef Bacik (1):
      Btrfs: make sure all pending extent operations are complete

Josh Hunt (1):
      kbuild: add vmlinux to kernel rpm

Julia Lawall (3):
      [SCSI] lpfc: introduce missing kfree
      Btrfs: fs/btrfs/volumes.c: remove useless kzalloc
      mfd: Fix egpio kzalloc return test

KAMEZAWA Hiroyuki (2):
      mm: clean up for early_pfn_to_nid()
      mm: fix memmap init for handling memory hole

Kristian Høgsberg (5):
      drm: Release user fbs in drm_release
      drm: Add locking around cursor gem operations.
      drm: Bring PLL limits in sync with DDX values.
      drm: Collapse identical i8xx_clock() and i9xx_clock().
      drm: Use spread spectrum when the bios tells us it's ok.

Krzysztof Helt (1):
      fbdev/drm: fix Kconfig submenu mess in "Graphics support"

Li Zefan (4):
      cgroups: update documentation about css_set hash table
      cgroups: fix possible use after free
      README: fix a wrong filename
      cpuset: various documentation fixes and updates

Linus Torvalds (2):
      x86: Add IRQF_TIMER to legacy x86 timer interrupt descriptors
      Linux 2.6.29-rc6

Luca Bigliardi (1):
      uml: fix vde network backend in user mode linux

Makito SHIOKAWA (1):
      [ARM] 5404/1: Fix condition in arm_elf_read_implies_exec() to set READ_IMPLIES_EXEC

Marcelo Tosatti (4):
      KVM: mmu_notifiers release method
      KVM: PIT: fix i8254 pending count read
      KVM: x86: disable kvmclock on non constant TSC hosts
      KVM: x86: fix LAPIC pending count calculation

Mark Brown (5):
      mfd: Initialise WM8350 interrupts earlier
      mfd: Improve diagnostics for WM8350 ID register probe
      mfd: Mark WM835x USB_SLV_500MA bit as accessible
      mfd: Fix TWL4030 build on some ARM variants
      mfd: Ensure all WM8350 IRQs are masked at startup

Mark McLoughlin (1):
      KVM: Fix assigned devices circular locking dependency

Markus Metzger (1):
      x86, ptrace, mm: fix double-free on race

Martin Peschke (1):
      [SCSI] sg: fix device number in blktrace data

Matthew Wilcox (1):
      PCI/MSI: fix msi_mask() shift fix

Mauro Carvalho Chehab (3):
      V4L/DVB (10527): tuner: fix TUV1236D analog/digital setup
      V4L/DVB (10572): Revert commit dda06a8e4610757def753ee3a541a0b1a1feb36b
      8250: fix boot hang with serial console when using with Serial Over Lan port

Michael Buesch (2):
      spi-gpio: sanitize MISO bitvalue
      spi_bitbang: add more lowlevel function documentation

Michael Neuling (2):
      powerpc/vsx: Fix VSX alignment handler for regs 32-63
      bootgraph: fix for use with dot symbols

Michael Tokarev (1):
      HID: blacklist Powercom USB UPS

Mike Christie (1):
      [SCSI] libiscsi: Fix scsi command timeout oops in iscsi_eh_timed_out

Mike Frysinger (1):
      kbuild,setlocalversion: shorten the make time when using svn

Mike Murphy (2):
      PATCH [1/2] Documentation/driver-model/device.txt: fix struct device_attribute
      PATCH [2/2] Documentation/filesystems/sysfs.txt: fix descriptions of device attributes

Neil Brown (1):
      block: fix booting from partitioned md array

Nick Piggin (1):
      mm: task dirty accounting fix

Nicolas Pitre (2):
      [ARM] 5401/1: Orion: fix edge triggered GPIO interrupt support
      [ARM] 5402/1: fix a case of wrap-around in sanity_check_meminfo()

Paul E. McKenney (1):
      x86, rcu: fix strange load average and ksoftirqd behavior

Paul Moore (2):
      cipso: Fix documentation comment
      selinux: Fix the NetLabel glue code for setsockopt()

Paul Turner (1):
      vfs: separate FMODE_PREAD/FMODE_PWRITE into separate flags

Pavel Machek (2):
      Pavel has moved
      hp accelerometer: add freefall detection

Pekka Paalanen (3):
      mmiotrace: count events lost due to not recording
      trace: mmiotrace to the tracer menu in Kconfig
      doc: mmiotrace.txt, buffer size control change

Peter Oberparleiter (1):
      [S390] sclp: handle empty event buffers

Peter Zijlstra (3):
      futex: fix reference leak
      timers: more consistently use clock vs timer
      fs/super.c: add lockdep annotation to s_umount

Philipp Zabel (1):
      mfd: fix htc-egpio iomem resource handling using resource_size

Philippe De Muyter (1):
      floppy: request and release only the ports we actually use

Philippe Gerum (1):
      powerpc/mm: Fix _PAGE_CHG_MASK to protect _PAGE_SPECIAL

Pierre Ossman (1):
      Revert "sdhci: force high speed capability on some controllers"

Pierre Willenbrock (1):
      drm/i915: Add missing mutex_lock(&dev->struct_mutex)

Qinghuang Feng (1):
      Btrfs: remove unused code in split_state()

Rabin Vincent (2):
      kbuild: add sys_* entries for syscalls in tags
      mmc_test: fix basic read test

Rafael J. Wysocki (4):
      USB/PCI: Fix resume breakage of controllers behind cardbus bridges
      pm: fix build for CONFIG_PM unset
      PM: fix build for CONFIG_PM unset
      PM: Split up sysdev_[suspend|resume] from device_power_[down|up]

Rakib Mullick (1):
      mfd: Fix sm501_register_gpio section mismatch

Randy Dunlap (7):
      PCI: fix rom.c kernel-doc warning
      PCI: fix struct pci_platform_pm_ops kernel-doc
      PCI: fix missing kernel-doc and typos
      x86: dell-laptop: depends on POWER_SUPPLY
      docsrc: use config instead of menuconfig
      docbook: split kernel-api for device-drivers
      acpi/doc: add missing param value

Richard Hughes (1):
      battery: don't assume we are fully charged when not charging or discharging

Robert Jennings (1):
      [SCSI] ibmvscsi: Correct DMA mapping leak

Robin Holt (1):
      [IA64] bte_copy of BTE_MAX_XFER trips BUG_ON.

Roel Kluin (4):
      mfd: wm8350 tries reaches -1
      FRV: __pte_to_swp_entry doesn't expand correctly
      paride/pg.c: xs(): &&/|| confusion
      [ARM] 5403/1: pxa25x_ep_fifo_flush() *ep->reg_udccs always set to 0

Roland Dreier (1):
      drm/i915: Fix potential AB-BA deadlock in i915_gem_execbuffer()

Russell King (3):
      [ARM] omap: fix omap2_divisor_to_clksel() error return value
      [ARM] omap: fix _omap2_clksel_get_src_field()
      [ARM] omap: fix clock reparenting in omap2_clk_set_parent()

Rusty Russell (2):
      cpumask: fix powernow-k8: partial revert of 2fdf66b491ac706657946442789ec644cc317e1a
      cpumask: Use cpu_*_mask accessors code: alpha

Sergei Shtylyov (1):
      libata-sff: fix 32-bit PIO ATAPI regression

Sheng Yang (4):
      KVM: Add kvm_arch_sync_events to sync with asynchronize events
      KVM: Fix racy in kvm_free_assigned_irq
      KVM: MMU: Map device MMIO as UC in EPT
      KVM: Fix INTx for device assignment

Shyam_Iyer@Dell.com (1):
      [SCSI] qla2xxx: fix Kernel Panic with Qlogic 2472 Card.

Steve Aarnio (1):
      drm/i915: Don't add panel_fixed_mode to the probed modes list at LVDS init.

Steve French (4):
      [CIFS] ipv6_addr_equal for address comparison
      [CIFS] Fix oops in cifs_strfromUCS_le mounting to servers which do not specify their OS
      [CIFS] improve posix semantics of file create
      [CIFS] Fix multiuser mounts so server does not invalidate earlier security contexts

Steven Rostedt (3):
      tracing: disable tracing while testing ring buffer
      tracing: have function trace select kallsyms
      tracing: limit the number of loops the ring buffer self test can make

Subhash Peddamallu (1):
      fs/bio: bio_alloc_bioset: pass right object ptr to mempool_free

Suresh Siddha (1):
      x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem

Takashi Iwai (3):
      Revert "Sound: hda - Restore PCI configuration space with interrupts off"
      ALSA: usb-audio - Fix non-continuous rate detection
      ALSA: jack - Use card->shortname for input name

Tejun Heo (2):
      sata_nv: give up hardreset on nf2
      vmalloc: call flush_cache_vunmap() from unmap_kernel_range()

Thomas Gleixner (3):
      x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
      x86: CPA avoid repeated lazy mmu flush
      x86, vm86: fix preemption bug

Tobias Klauser (1):
      drm/i915: Storage class should be before const qualifier

Tobias Lorenz (2):
      V4L/DVB (10532): Correction of Stereo detection/setting and signal strength indication
      V4L/DVB (10533): fix LED status output

Tony Luck (2):
      [IA64] Build fix for __early_pfn_to_nid() undefined link error
      [IA64] xen_domu build fix

Tony Vroon (1):
      fujitsu-laptop: Use RFKILL support bitmask from firmware

Trent Piepho (1):
      V4L/DVB (10516a): zoran: Update MAINTAINERS entry

Wei Yongjun (2):
      ext4: Fix to read empty directory blocks correctly in 64k
      mn10300: fix typo && -> || in arch/mn10300/unit-asb2305/pci.c

Wim Van Sebroeck (1):
      [WATCHDOG] iTCO_wdt: fix SMI_EN regression 2

Yan Zheng (2):
      Btrfs: Avoid using __GFP_HIGHMEM with slab allocator
      Btrfs: hold trans_mutex when using btrfs_record_root_in_trans

Yang Hongyang (1):
      atyfb: remove unused local variable `pwr_command'

Yang Zhang (1):
      KVM: ia64: fix fp fault/trap handler

Yauhen Kharuzhy (1):
      s3cmci: Fix hangup in do_pio_write()

Yi Li (1):
      MMC: fix bug - SDHC card capacity not correct

Zachary Amsden (1):
      MAINTAINERS: paravirt-ops maintainers update

Zlatko Calusic (1):
      Add support for VT6415 PCIE PATA IDE Host Controller

etienne (1):
      drm/radeon: update sarea copies of last_ variables on resume.

wanzongshun (1):
      [ARM] 5398/1: Add Wan ZongShun to MAINTAINERS for W90P910


* Re: Linux 2.6.29-rc6 - Fix oops in i915_gem_retire_requests
  2009-02-23  4:31 Linux 2.6.29-rc6 Linus Torvalds
@ 2009-02-23 14:07 ` Karsten Wiese
  2009-02-26 11:15 ` Linux 2.6.29-rc6 Jesper Krogh
  2009-02-26 19:55 ` Jesper Krogh
  2 siblings, 0 replies; 81+ messages in thread
From: Karsten Wiese @ 2009-02-23 14:07 UTC (permalink / raw)
  To: Linus Torvalds, Eric Anholt; +Cc: Linux Kernel Mailing List

Fix an oops in i915_gem_retire_requests()

dev_priv->hw_status_page can be NULL when i915_gem_retire_requests()
is called from i915_gem_busy_ioctl().

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
---
 drivers/gpu/drm/i915/i915_gem.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 25b3374..28b726d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1051,6 +1051,9 @@ i915_gem_retire_requests(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	uint32_t seqno;
 
+	if (!dev_priv->hw_status_page)
+		return;
+
 	seqno = i915_get_gem_seqno(dev);
 
 	while (!list_empty(&dev_priv->mm.request_list)) {
-- 
1.6.0.6
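
The guard added by the patch can be illustrated outside the kernel. What
follows is only a minimal userspace sketch, assuming made-up stand-ins
(struct mock_dev_priv, mock_get_seqno(), mock_retire_requests()) rather
than the real i915 structures: the retire path returns before it touches
the status page when that page has not been set up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's private state; not the real i915 struct. */
struct mock_dev_priv {
	uint32_t *hw_status_page;	/* NULL until the status page is set up */
	int pending_requests;		/* stand-in for mm.request_list */
};

/* Stand-in for i915_get_gem_seqno(): reads a sequence number from the status page. */
static uint32_t mock_get_seqno(const struct mock_dev_priv *priv)
{
	assert(priv->hw_status_page != NULL);	/* callers must guard against NULL */
	return priv->hw_status_page[0];
}

/* Returns the number of requests retired; bails out early without a status page. */
static int mock_retire_requests(struct mock_dev_priv *priv)
{
	uint32_t seqno;
	int retired = 0;

	if (!priv->hw_status_page)
		return 0;	/* the early return the patch adds */

	seqno = mock_get_seqno(priv);

	/* Toy rule: each completed sequence number retires one request. */
	while (priv->pending_requests > 0 && seqno > 0) {
		priv->pending_requests--;
		seqno--;
		retired++;
	}
	return retired;
}
```

Without the early return, the first call with a NULL status page would
dereference it inside mock_get_seqno(), which is the userspace analogue
of the oops described above.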


* Re: Linux 2.6.29-rc6
  2009-02-23  4:31 Linux 2.6.29-rc6 Linus Torvalds
  2009-02-23 14:07 ` Linux 2.6.29-rc6 - Fix oops in i915_gem_retire_requests Karsten Wiese
@ 2009-02-26 11:15 ` Jesper Krogh
  2009-02-26 17:17   ` MTD_CK804XROM warning (Was: Linux 2.6.29-rc6) Marcin Slusarz
  2009-02-26 17:53   ` Linux 2.6.29-rc6 Linus Torvalds
  2009-02-26 19:55 ` Jesper Krogh
  2 siblings, 2 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-02-26 11:15 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel


Booting up 2.6.29-rc6 gave me this one in dmesg...

[   21.136149] ck804xrom ck804xrom_init_one(): Unable to register resource 0x00000000ff000000-0x00000000ffffffff - kernel bug?
[   21.136258] resource map sanity check conflict: 0xff000000 0xffffffff 0xff700000 0xffffffff reserved
[   21.136267] ------------[ cut here ]------------
[   21.136269] WARNING: at arch/x86/mm/ioremap.c:208 __ioremap_caller+0x359/0x390()
[   21.136271] Hardware name: Sun Fire X2200 M2 with Quad Core Processor
[   21.136273] Info: mapping multiple BARs. Your kernel is fine.Modules linked in: ck804xrom(+) mtd chipreg pcspkr(+) shpchp button pci_hotplug i2c_nforce2 i2c_core map_funcs evdev ext3 jbd mbcache sg sd_mod usbhid hid amd74xx sata_nv tg3 ata_generic libphy ehci_hcd libata ohci_hcd forcedeth scsi_mod usbcore thermal processor fan thermal_sys fuse
[   21.136289] Pid: 3843, comm: modprobe Not tainted 2.6.29-rc6 #2
[   21.136291] Call Trace:
[   21.136298]  [<ffffffff8023d352>] warn_slowpath+0xf2/0x130
[   21.136301]  [<ffffffff8023d62a>] __call_console_drivers+0x6a/0x90
[   21.136304]  [<ffffffff8023e1fe>] printk+0x4e/0x60
[   21.136306]  [<ffffffff8023e1fe>] printk+0x4e/0x60
[   21.136309]  [<ffffffff8036b520>] match_pci_dev_by_id+0x0/0x60
[   21.136313]  [<ffffffff8024360e>] iomem_map_sanity_check+0xbe/0xd0
[   21.136316]  [<ffffffff80229799>] __ioremap_caller+0x359/0x390
[   21.136320]  [<ffffffffa01eb1f6>] init_ck804xrom+0x1f6/0x62c [ck804xrom]
[   21.136322]  [<ffffffffa01eb1f6>] init_ck804xrom+0x1f6/0x62c [ck804xrom]
[   21.136326]  [<ffffffff80275eac>] tracepoint_update_probe_range+0x1c/0xb0
[   21.136329]  [<ffffffffa01eb000>] init_ck804xrom+0x0/0x62c [ck804xrom]
[   21.136332]  [<ffffffff8020903b>] _stext+0x3b/0x160
[   21.136335]  [<ffffffff80359141>] __up_read+0x21/0xb0
[   21.136340]  [<ffffffff80256495>] __blocking_notifier_call_chain+0x65/0x90
[   21.136343]  [<ffffffff80265604>] sys_init_module+0xb4/0x200
[   21.136346]  [<ffffffff8020c35b>] system_call_fastpath+0x16/0x1b
[   21.136348] ---[ end trace f807e12658961c2d ]---


System is fully operational, but I didn't get it in 2.6.26.8 (most recent 
kernel tried on this hardware).


-- 
Jesper




* MTD_CK804XROM warning (Was: Linux 2.6.29-rc6)
  2009-02-26 11:15 ` Linux 2.6.29-rc6 Jesper Krogh
@ 2009-02-26 17:17   ` Marcin Slusarz
  2009-02-26 17:53   ` Linux 2.6.29-rc6 Linus Torvalds
  1 sibling, 0 replies; 81+ messages in thread
From: Marcin Slusarz @ 2009-02-26 17:17 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Linus Torvalds, linux-kernel, Dave Olsen, Ryan Jackson,
	David.Woodhouse, linux-mtd

On Thu, Feb 26, 2009 at 12:15:52PM +0100, Jesper Krogh wrote:
>
> Booting up 2.6.29-rc6 gave me this one in dmesg...
>
> [   21.136149] ck804xrom ck804xrom_init_one(): Unable to register resource 
> 0x00000000ff000000-0x00000000ffffffff - kernel bug?
> [   21.136258] resource map sanity check conflict: 0xff000000 0xffffffff 
> 0xff700000 0xffffffff reserved
> [   21.136267] ------------[ cut here ]------------
> [   21.136269] WARNING: at arch/x86/mm/ioremap.c:208 
> __ioremap_caller+0x359/0x390()
> [   21.136271] Hardware name: Sun Fire X2200 M2 with Quad Core Processor
> [   21.136273] Info: mapping multiple BARs. Your kernel is fine.Modules 
> linked in: ck804xrom(+) mtd chipreg pcspkr(+) shpchp button pci_hotplug 
> i2c_nforce2 i2c_core map_funcs evdev ext3 jbd mbcache sg sd_mod usbhid hid 
> amd74xx sata_nv tg3 ata_generic libphy ehci_hcd libata ohci_hcd forcedeth 
> scsi_mod usbcore thermal processor fan thermal_sys fuse
> [   21.136289] Pid: 3843, comm: modprobe Not tainted 2.6.29-rc6 #2
> [   21.136291] Call Trace:
> [   21.136298]  [<ffffffff8023d352>] warn_slowpath+0xf2/0x130
> [   21.136301]  [<ffffffff8023d62a>] __call_console_drivers+0x6a/0x90
> [   21.136304]  [<ffffffff8023e1fe>] printk+0x4e/0x60
> [   21.136306]  [<ffffffff8023e1fe>] printk+0x4e/0x60
> [   21.136309]  [<ffffffff8036b520>] match_pci_dev_by_id+0x0/0x60
> [   21.136313]  [<ffffffff8024360e>] iomem_map_sanity_check+0xbe/0xd0
> [   21.136316]  [<ffffffff80229799>] __ioremap_caller+0x359/0x390
> [   21.136320]  [<ffffffffa01eb1f6>] init_ck804xrom+0x1f6/0x62c [ck804xrom]
> [   21.136322]  [<ffffffffa01eb1f6>] init_ck804xrom+0x1f6/0x62c [ck804xrom]
> [   21.136326]  [<ffffffff80275eac>] 
> tracepoint_update_probe_range+0x1c/0xb0
> [   21.136329]  [<ffffffffa01eb000>] init_ck804xrom+0x0/0x62c [ck804xrom]
> [   21.136332]  [<ffffffff8020903b>] _stext+0x3b/0x160
> [   21.136335]  [<ffffffff80359141>] __up_read+0x21/0xb0
> [   21.136340]  [<ffffffff80256495>] 
> __blocking_notifier_call_chain+0x65/0x90
> [   21.136343]  [<ffffffff80265604>] sys_init_module+0xb4/0x200
> [   21.136346]  [<ffffffff8020c35b>] system_call_fastpath+0x16/0x1b
> [   21.136348] ---[ end trace f807e12658961c2d ]---
>
>
> System is fully operational, but I didn't get it in 2.6.26.8 (most recent 
> kernel tried on this hardware).

This message comes from this code in drivers/mtd/maps/ck804xrom.c:
        /*
         * Try to reserve the window mem region.  If this fails then
         * it is likely due to a fragment of the window being
         * "reserved" by the BIOS.  In the case that the
         * request_mem_region() fails then once the rom size is
         * discovered we will try to reserve the unreserved fragment.
         */
        window->rsrc.name = MOD_NAME;
        window->rsrc.start = window->phys;
        window->rsrc.end   = window->phys + window->size - 1;
        window->rsrc.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
        if (request_resource(&iomem_resource, &window->rsrc)) {
                window->rsrc.parent = NULL;
                printk(KERN_ERR MOD_NAME
                        " %s(): Unable to register resource"
                        " 0x%.016llx-0x%.016llx - kernel bug?\n",
                        __func__,
                        (unsigned long long)window->rsrc.start,
                        (unsigned long long)window->rsrc.end);
        }

So it's probably harmless.
Adding CC's.
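
To show why the registration above fails with the ranges from the dmesg
output, here is a rough userspace model of a sorted resource list. It is
a sketch only: struct mock_resource and mock_request_resource() are
invented for illustration, and the function returns the conflicting
entry instead of an error code as the kernel's request_resource() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified resource node; ranges are inclusive. */
struct mock_resource {
	uint64_t start, end;
	const char *name;
	struct mock_resource *sibling;	/* next entry, sorted by start */
};

/*
 * Try to insert `new` into the sorted list at *head.
 * Returns NULL on success, or the existing entry that overlaps `new`.
 */
static struct mock_resource *
mock_request_resource(struct mock_resource **head, struct mock_resource *new)
{
	struct mock_resource **p = head;

	assert(new->start <= new->end);
	for (;;) {
		struct mock_resource *cur = *p;

		/* End of list, or the next entry starts after us: insert here. */
		if (!cur || cur->start > new->end) {
			new->sibling = cur;
			*p = new;
			return NULL;
		}
		p = &cur->sibling;
		/* Entry lies entirely below us: keep walking. */
		if (cur->end < new->start)
			continue;
		/* Otherwise the ranges overlap. */
		return cur;
	}
}
```

With a "reserved" entry at 0xff700000-0xffffffff already in the list, a
request for 0xff000000-0xffffffff hits it and is refused, while a request
for 0xff000000-0xff6fffff is inserted in front of it.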

Marcin


* Re: Linux 2.6.29-rc6
  2009-02-26 11:15 ` Linux 2.6.29-rc6 Jesper Krogh
  2009-02-26 17:17   ` MTD_CK804XROM warning (Was: Linux 2.6.29-rc6) Marcin Slusarz
@ 2009-02-26 17:53   ` Linus Torvalds
  2009-02-26 19:22     ` David Woodhouse
  2009-02-26 19:31     ` Jesper Krogh
  1 sibling, 2 replies; 81+ messages in thread
From: Linus Torvalds @ 2009-02-26 17:53 UTC (permalink / raw)
  To: Jesper Krogh, David Woodhouse, Dave Olsen, Ryan Jackson, linux-mtd
  Cc: Linux Kernel Mailing List

Dave Olsen <dolsen@lnxi.com>,
    Ryan Jackson <rjackson@lnxi.com>, David.Woodhouse@intel.com, 
linux-mtd@lists.infradead.org


On Thu, 26 Feb 2009, Jesper Krogh wrote:
>
> 
> Booting up 2.6.29-rc6 gave me this one in dmesg...
> 
> [   21.136149] ck804xrom ck804xrom_init_one(): Unable to register resource 0x00000000ff000000-0x00000000ffffffff - kernel bug?

Well, it _is_ a kernel bug, but it's in that stupid driver. It does 
everything wrong, including printing out a scary message.

Piece of sh*t driver, in other words.

I mean, it even has a _comment_ about how the request_mem_region() is 
likely to fail, and then it prints out that scary message when it 
does fail.

Not to mention that the driver is likely _wrong_ to just unconditionally 
try to enable that resource without *first* checking whether the resource 
can actually be enabled or whether there are other resources in that same 
window.
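
As a sketch of what "checking first" could look like, the helper below
works out which part of a wanted window is still free given one known
busy range. It is not the driver's logic; struct mock_range and
free_fragment_below() are hypothetical names, and a real check would
have to walk every resource in the window, not just one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inclusive address range. */
struct mock_range {
	uint64_t start, end;
};

/*
 * Compute the part of `want` that lies below `busy`.
 * Returns 1 and fills *out when some of the window is usable,
 * 0 when the busy range already covers the start of the window.
 */
static int free_fragment_below(struct mock_range want, struct mock_range busy,
			       struct mock_range *out)
{
	assert(want.start <= want.end && busy.start <= busy.end);

	/* No overlap at all: the whole window is free. */
	if (busy.end < want.start || busy.start > want.end) {
		*out = want;
		return 1;
	}
	/* The busy range begins at or before the window: nothing free below it. */
	if (busy.start <= want.start)
		return 0;

	out->start = want.start;
	out->end = busy.start - 1;
	return 1;
}
```

For the window 0xff000000-0xffffffff and a busy range at
0xff700000-0xffffffff this yields 0xff000000-0xff6fffff, which a driver
could then try to claim instead of the full window.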

Quite frankly, I find that whole thing scary. The driver should be deleted 
or at least marked EXPERIMENTAL or BROKEN.

It has a "BE VERY CAREFUL" in the Kconfig _help_ text, but is not marked 
as being dangerous any other way.

That said, I really don't see why you would get this message _now_. The 
total braindamage of that driver in no way seems new. Did you perhaps not 
notice before, or did you just not enable it before?

> [   21.136269] WARNING: at arch/x86/mm/ioremap.c:208 __ioremap_caller+0x359/0x390()

This is a different, but related warning, since the driver is doing an 
ioremap across different resources. The warning is directly related to the 
fact that the resource wasn't actually valid to begin with.
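
The idea behind that check can be modelled in a few lines. This is a
simplified userspace sketch, not the kernel's iomem_map_sanity_check():
struct mock_region and mock_map_sanity_check() are invented names, and
the real function has further conditions that are left out here. A
mapping is flagged when it overlaps a region without fitting entirely
inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive region, e.g. one line of /proc/iomem. */
struct mock_region {
	uint64_t start, end;
	const char *name;
};

/*
 * Return the first region that the mapping [addr, addr + size - 1]
 * straddles (overlaps without being fully contained), or NULL if none.
 */
static const struct mock_region *
mock_map_sanity_check(const struct mock_region *regions, size_t n,
		      uint64_t addr, uint64_t size)
{
	uint64_t last;
	size_t i;

	assert(size > 0);
	last = addr + size - 1;

	for (i = 0; i < n; i++) {
		const struct mock_region *r = &regions[i];

		/* No overlap with this region. */
		if (r->start > last || r->end < addr)
			continue;
		/* Mapping sits entirely inside the region: fine. */
		if (r->start <= addr && r->end >= last)
			continue;
		/* Partial overlap: the mapping crosses a region boundary. */
		return r;
	}
	return NULL;
}
```

Mapping 16MB at 0xff000000 against a "reserved" region at
0xff700000-0xffffffff crosses into that region, which corresponds to the
four numbers printed in the "resource map sanity check conflict" line.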

What does "cat /proc/iomem" say?

> System is fully operational, but I didn't get it in 2.6.26.8 (most recent
> kernel tried on this hardware).

The ioremap() warning is newish, and may be what made you notice the 
previous (just one-line) crappy warning.

Quite frankly, having looked at that horrible driver, I would seriously 
consider disabling it. Stuff like that should not be allowed to exist.

		Linus


* Re: Linux 2.6.29-rc6
  2009-02-26 17:53   ` Linux 2.6.29-rc6 Linus Torvalds
@ 2009-02-26 19:22     ` David Woodhouse
  2009-02-26 19:31     ` Jesper Krogh
  1 sibling, 0 replies; 81+ messages in thread
From: David Woodhouse @ 2009-02-26 19:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jesper Krogh, Dave Olsen, Ryan Jackson, linux-mtd,
	Linux Kernel Mailing List

On Thu, 2009-02-26 at 17:53 +0000, Linus Torvalds wrote:
> Dave Olsen <dolsen@lnxi.com>,
>     Ryan Jackson <rjackson@lnxi.com>, David.Woodhouse@intel.com, 
> linux-mtd@lists.infradead.org
> 
> 
> On Thu, 26 Feb 2009, Jesper Krogh wrote:
> >
> > 
> > Booting up 2.6.29-rc6 gave me this one in dmesg...
> > 
> > [   21.136149] ck804xrom ck804xrom_init_one(): Unable to register resource 0x00000000ff000000-0x00000000ffffffff - kernel bug?
> 
> Well, it _is_ a kernel bug, but it's in that stupid driver. It does 
> everything wrong, including printing out a scary message.
> 
> Piece of sh*t driver, in other words.
> 
> I mean, it even has a _comment_ about how the request_region is likely to 
> not succeed, and then it prints out that scary message when it 
> then doesn't do so.
> 
> Not to mention that the driver is likely _wrong_ to just unconditionally 
> try to enable that resource without *first* checking whether the resource 
> can actually be enabled or whether there are other resources in that same 
> window.
>
> Quite frankly, I find that whole thing scary. The driver should be deleted 
> or at least marked EXPERIMENTAL or BROKEN.

It's giving you access to your BIOS flash so that you can overwrite it
from within Linux. It's _supposed_ to be scary :)

It's also always going to be a hack -- it's a PITA getting direct access
to that flash on most PeeCee chipsets. The driver operates on the
principle that it knows the hardware, and it can _make_ the flash appear
at the appropriate physical addresses. The theory, at least, is that it
knows better than the kernel does.

But yeah, it should probably at least look for other things which
already overlap with the region that it's trying to 'create'. Although
the comment leads me to believe that sometimes that's _expected_ and
shouldn't cause the driver to abort.

Dave, Ryan, are you still actively using this?

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 17:53   ` Linux 2.6.29-rc6 Linus Torvalds
  2009-02-26 19:22     ` David Woodhouse
@ 2009-02-26 19:31     ` Jesper Krogh
  2009-02-26 19:36       ` David Woodhouse
  2009-02-26 20:32       ` Linus Torvalds
  1 sibling, 2 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-02-26 19:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Woodhouse, Dave Olsen, Ryan Jackson, linux-mtd,
	Linux Kernel Mailing List

Linus Torvalds wrote:
> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>>
>> Booting up 2.6.29-rc6 gave me this one in dmesg...
>>
>> [   21.136149] ck804xrom ck804xrom_init_one(): Unable to register resource 0x00000000ff000000-0x00000000ffffffff - kernel bug?
> 
> Well, it _is_ a kernel bug, but it's in that stupid driver. It does 
> everything wrong, including printing out a scary message.

I've seen that before (and even reported it before). It just "slipped" 
into the cut'n'paste. It was the following stuff that I intended to report.

>> [   21.136269] WARNING: at arch/x86/mm/ioremap.c:208 __ioremap_caller+0x359/0x390()
> 
> This is a different, but related warning, since the driver is doing an 
> ioremap across different resources. The warning is directly related to the 
> fact that the resource wasn't actually valid to begin with.
> 
> What does "cat /proc/iomem" say?

http://krogh.cc/~jesper/iomem.txt

>> System is fully operational, but I didn't get it in 2.6.26.8 (most recent
>> kernel tried on this hardware).
> 
> The ioremap() warning is newish, and may be what made you notice the 
> previous (just one-line) crappy warning.
> 
> Quite frankly, having looked at that horrible driver, I would seriously 
> consider disabling it. Stuff like that should not be allowed to exist.

Being a "stupid" user, I pick the easy way to build a fresh kernel:
1) pick the distro .config
2) make oldconfig
3) Let the kernel load what it thinks it needs.
4) Report if I see any strange stuff (warnings / bugs / oops) or 
misbehaviour.

So I don't know if I need that driver for anything vital. Should I care? 
Or shouldn't it "just work"?

-- 
Jesper

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 19:31     ` Jesper Krogh
@ 2009-02-26 19:36       ` David Woodhouse
  2009-02-26 19:46         ` Jesper Krogh
  2009-02-26 20:53         ` Carl-Daniel Hailfinger
  2009-02-26 20:32       ` Linus Torvalds
  1 sibling, 2 replies; 81+ messages in thread
From: David Woodhouse @ 2009-02-26 19:36 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Linus Torvalds, Dave Olsen, Ryan Jackson, linux-mtd,
	Linux Kernel Mailing List

On Thu, 2009-02-26 at 19:31 +0000, Jesper Krogh wrote:
> 1) pick the distro .config
> 2) make oldconfig

So it should have been a module, not built-in?

> 3) Let the kernel load what it thinks it needs.

That part at least ought to be disabled -- we don't let this driver
autoload, because unless you _know_ you need it, you don't need it.

It's for overwriting your BIOS.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 19:36       ` David Woodhouse
@ 2009-02-26 19:46         ` Jesper Krogh
  2009-02-26 19:49           ` David Woodhouse
  2009-02-26 20:53         ` Carl-Daniel Hailfinger
  1 sibling, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-02-26 19:46 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Linus Torvalds, Dave Olsen, Ryan Jackson, linux-mtd,
	Linux Kernel Mailing List

David Woodhouse wrote:
> On Thu, 2009-02-26 at 19:31 +0000, Jesper Krogh wrote:
>> 1) pick the distro .config
>> 2) make oldconfig
> 
> So it should have been a module, not built-in?

It is a module.. and it somehow gets auto-loaded on my system. (not 
listed in /etc/modules).

$ grep -i ck804xrom /boot/config-2.6.29-rc6
CONFIG_MTD_CK804XROM=m

Same in the distro .config
$ grep -i ck804xrom /boot/config-2.6.24-23-server
CONFIG_MTD_CK804XROM=m


>> 3) Let the kernel load what it thinks it needs.
> 
> That part at least ought to be disabled -- we don't let this driver
> autoload, because unless you _know_ you need it, you don't need it.
> 
> It's for overwriting your BIOS.

Oh. Thanks for your time... I'll just make sure to disable it from now on.

-- 
Jesper


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 19:46         ` Jesper Krogh
@ 2009-02-26 19:49           ` David Woodhouse
  0 siblings, 0 replies; 81+ messages in thread
From: David Woodhouse @ 2009-02-26 19:49 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Linus Torvalds, Dave Olsen, Ryan Jackson, linux-mtd,
	Linux Kernel Mailing List

On Thu, 2009-02-26 at 20:46 +0100, Jesper Krogh wrote:
> It is a module.. and it somehow gets auto-loaded on my system. (not 
> listed in /etc/modules).

Oops, we should have disabled that, but it still has a
MODULE_DEVICE_TABLE(). I'll remove that, for a start...

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation
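
[Editorial sketch] The autoloading described above comes from the aliases
that a MODULE_DEVICE_TABLE() entry exports, which the module tools collect
into a modules.alias file. A minimal sketch of checking that kind of file
for a given module follows. The helper name `aliases_for`, the sample text
and the PCI IDs in it are placeholders for illustration, not the real
ck804xrom IDs.

```python
def aliases_for(module, alias_text):
    """Return the alias patterns in modules.alias-style text that map to `module`."""
    found = []
    for line in alias_text.splitlines():
        parts = line.split()
        # Expected shape: "alias <pattern> <module>"; comments and blanks are skipped.
        if len(parts) == 3 and parts[0] == "alias" and parts[2] == module:
            found.append(parts[1])
    return found


# Hypothetical sample; the vendor/device IDs are placeholders.
SAMPLE = """\
# Aliases extracted from modules themselves.
alias pci:v0000AAAAd0000BBBBsv*sd*bc*sc*i* ck804xrom
alias pci:v0000CCCCd0000DDDDsv*sd*bc*sc*i* some_other_driver
"""

print(aliases_for("ck804xrom", SAMPLE))
```

An empty result for a module would suggest it exports no device table, so
nothing should load it automatically on device discovery.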


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-23  4:31 Linux 2.6.29-rc6 Linus Torvalds
  2009-02-23 14:07 ` Linux 2.6.29-rc6 - Fix oops in i915_gem_retire_requests Karsten Wiese
  2009-02-26 11:15 ` Linux 2.6.29-rc6 Jesper Krogh
@ 2009-02-26 19:55 ` Jesper Krogh
  2009-02-26 20:33   ` Linus Torvalds
  2009-03-01 15:09   ` Jesper Krogh
  2 siblings, 2 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-02-26 19:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

2.6.29-rc6 seems to have trouble running ntpd reliably under load. My 
nagios system has just alerted me of drifting time on the upgraded machine.

Feb 26 19:09:25 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 26 19:10:31 quad12 ntpd[4901]: synchronized to 10.194.133.12, stratum 4
Feb 26 19:25:21 quad12 ntpd[4901]: time reset -0.915488 s
Feb 26 19:29:11 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 26 19:31:21 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 26 19:34:37 quad12 ntpd[4901]: synchronized to 10.194.133.12, stratum 4
Feb 26 19:37:53 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 26 19:46:27 quad12 ntpd[4901]: synchronized to 10.194.133.12, stratum 4
Feb 26 19:46:27 quad12 ntpd[4901]: time reset -0.961386 s
Feb 26 19:50:30 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 26 19:51:34 quad12 ntpd[4901]: synchronized to 10.194.133.12, stratum 4
Feb 26 20:01:55 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 26 20:06:18 quad12 ntpd[4901]: time reset -0.979177 s
Feb 26 20:10:15 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 26 20:11:21 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 26 20:14:52 quad12 ntpd[4901]: synchronized to 10.194.133.12, stratum 4
Feb 26 20:19:10 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 26 20:26:00 quad12 ntpd[4901]: time reset -0.923268 s
Feb 26 20:30:01 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 26 20:30:30 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 26 20:45:36 quad12 ntpd[4901]: time reset -0.919609 s
Feb 26 20:49:49 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13

2.6.26.8 doesn't have this problem.

The "current_clocksource" is the same on both systems.

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc


-- 
Jesper
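
[Editorial sketch] The "time reset" lines in the log above allow a rough
estimate of the apparent drift rate: divide each step by the time elapsed
since the previous step. This is only a back-of-the-envelope sketch. It
ignores whatever slewing ntpd did between steps, so the true drift may
well be larger. The function name is made up for illustration; the
timestamps and step sizes are copied from the log.

```python
from datetime import datetime

# (time of step, step size in seconds), copied from the ntpd log above.
RESETS = [
    ("19:25:21", -0.915488),
    ("19:46:27", -0.961386),
    ("20:06:18", -0.979177),
    ("20:26:00", -0.923268),
    ("20:45:36", -0.919609),
]


def apparent_drift_ppm(resets):
    """Step size divided by time since the previous step, in parts per million."""
    rates = []
    for (t_prev, _), (t_cur, step) in zip(resets, resets[1:]):
        elapsed = (datetime.strptime(t_cur, "%H:%M:%S")
                   - datetime.strptime(t_prev, "%H:%M:%S")).total_seconds()
        rates.append(abs(step) / elapsed * 1e6)
    return rates


for rate in apparent_drift_ppm(RESETS):
    print(f"{rate:.0f} ppm")
```

Under these assumptions every interval comes out well above the 500 ppm
that ntpd is normally able to correct by slewing, which would be
consistent with the repeated steps.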

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 19:31     ` Jesper Krogh
  2009-02-26 19:36       ` David Woodhouse
@ 2009-02-26 20:32       ` Linus Torvalds
  1 sibling, 0 replies; 81+ messages in thread
From: Linus Torvalds @ 2009-02-26 20:32 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: David Woodhouse, Dave Olsen, Ryan Jackson, linux-mtd,
	Linux Kernel Mailing List



On Thu, 26 Feb 2009, Jesper Krogh wrote:

> Linus Torvalds wrote:
> > On Thu, 26 Feb 2009, Jesper Krogh wrote:
> > > 
> > > Booting up 2.6.29-rc6 gave me this one in dmesg...
> > > 
> > > [   21.136149] ck804xrom ck804xrom_init_one(): Unable to register resource
> > > 0x00000000ff000000-0x00000000ffffffff - kernel bug?
> > 
> > Well, it _is_ a kernel bug, but it's in that stupid driver. It does
> > everything wrong, including printing out a scary message.
> 
> I've seen that before (and even reported it before). It just "slipped" into the
> cut'n'paste. It was the following stuff that I intended to report.

Ok. They very much are related. The new warning is just that - a new 
warning.

> > > [   21.136269] WARNING: at arch/x86/mm/ioremap.c:208
> > > __ioremap_caller+0x359/0x390()
> > 
> > This is a different, but related warning, since the driver is doing an
> > ioremap across different resources. The warning is directly related to the
> > fact that the resource wasn't actually valid to begin with.
> > 
> > What does "cat /proc/iomem" say?
> 
> http://krogh.cc/~jesper/iomem.txt

Ok, so the thing conflicts with

	ff700000-ffffffff : reserved
	  ff700000-ffffffff : pnp 00:0b

and that probably _is_ somehow related to the whole flash thing. 

I guess the driver could use "insert_resource()" and the problem would go 
away. Except I do think it should be marked very dangerous some way, so 
that you can't even enable it unless you really really know you want to 
(eg something like EXPERIMENTAL). Because I don't think this driver is 
appropriate in any other case..

> Being a "stupid" user, I pick the easy way to build a fresh kernel: 1) 
> pick the distro .config 2) make oldconfig 3) Let the kernel load what it 
> thinks it needs. 4) Report if I see any strange stuff (warnings / bugs / 
> oops) or misbehaviour.
> 
> So I don't know if I need that driver for anything vital. Should I care? 
> Or shouldn't it "just work"?

You definitely don't need it, and everything will work without it. 

			Linus
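
[Editorial sketch] The conflict described above is a plain range overlap:
the window the driver asks for (0xff000000-0xffffffff, from the dmesg
line) overlaps the reserved region from /proc/iomem, and in fact fully
contains it. The sketch below only checks those two conditions. It
assumes, from my reading of the kernel resource code, that
insert_resource() can succeed when the new range entirely covers the
conflicting ones and makes them its children. It does not model the
resource tree itself, and the helper names are made up.

```python
def overlaps(a, b):
    """True if the inclusive ranges a and b share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def contains(outer, inner):
    """True if the inclusive range `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


DRIVER_WINDOW = (0xff000000, 0xffffffff)  # from the ck804xrom message
RESERVED = (0xff700000, 0xffffffff)       # from /proc/iomem

print(overlaps(DRIVER_WINDOW, RESERVED))   # overlap: a plain request conflicts
print(contains(DRIVER_WINDOW, RESERVED))   # containment: inserting as parent is possible
```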

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 19:55 ` Jesper Krogh
@ 2009-02-26 20:33   ` Linus Torvalds
  2009-02-26 20:43     ` Jesper Krogh
  2009-03-01 15:09   ` Jesper Krogh
  1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-02-26 20:33 UTC (permalink / raw)
  To: Jesper Krogh; +Cc: Linux Kernel Mailing List



On Thu, 26 Feb 2009, Jesper Krogh wrote:
> 
> 2.6.26.8 doesn't have this problem.
> 
> The "current_clocksource" is the same on both systems.
> 
> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> tsc

What does the frequency calibrate to? It should be in the dmesg. Does it 
differ by a big amount?

		Linus

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 20:33   ` Linus Torvalds
@ 2009-02-26 20:43     ` Jesper Krogh
  2009-02-26 21:19       ` john stultz
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-02-26 20:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

Linus Torvalds wrote:
> 
> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>> 2.6.26.8 doesn't have this problem.
>>
>> The "current_clocksource" is the same on both systems.
>>
>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>> tsc
> 
> What does the frequency calibrate to? It should be in the dmesg. Does it 
> differ by a big amount?

Non-working:
$ dmesg | grep -i freq
[    0.004007] Calibrating delay loop (skipped), value calculated using 
timer frequency.. 4620.05 BogoMIPS (lpj=9240104)

2.6.26.8 doesn't have that information.

-- 
Jesper


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 19:36       ` David Woodhouse
  2009-02-26 19:46         ` Jesper Krogh
@ 2009-02-26 20:53         ` Carl-Daniel Hailfinger
  1 sibling, 0 replies; 81+ messages in thread
From: Carl-Daniel Hailfinger @ 2009-02-26 20:53 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Jesper Krogh, Ryan Jackson, linux-mtd, Dave Olsen,
	Linus Torvalds, Linux Kernel Mailing List

On 26.02.2009 20:36, David Woodhouse wrote:
> It's for overwriting your BIOS.
>   

There's a pure userspace replacement for it. That replacement is even
packaged in most distros. See http://www.coreboot.org/Flashrom .


Regards,
Carl-Daniel

-- 
http://www.hailfinger.org/


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 20:43     ` Jesper Krogh
@ 2009-02-26 21:19       ` john stultz
  2009-02-26 21:35         ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-02-26 21:19 UTC (permalink / raw)
  To: Jesper Krogh; +Cc: Linus Torvalds, Linux Kernel Mailing List

On Thu, Feb 26, 2009 at 12:43 PM, Jesper Krogh <jesper@krogh.cc> wrote:
> Linus Torvalds wrote:
>>
>> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>>>
>>> 2.6.26.8 doesn't have this problem.
>>>
>>> The "current_clocksource" is the same on both systems.
>>>
>>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>>> tsc
>>
>> What does the frequency calibrate to? It should be in the dmesg. Does it
>> differ by a big amount?
>
> Non-working:
> $ dmesg | grep -i freq
> [    0.004007] Calibrating delay loop (skipped), value calculated using
> timer frequency.. 4620.05 BogoMIPS (lpj=9240104)
>
> 2.6.26.8 doesn't have that information.

I'm surprised the clocksource watchdog isn't catching it.

What's the output from:
cat /sys/devices/system/clocksource/clocksource0/available_clocksource

Also mind sending the full dmesg for both kernels?

thanks
-john

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 21:19       ` john stultz
@ 2009-02-26 21:35         ` Jesper Krogh
  2009-02-26 21:46           ` john stultz
                             ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-02-26 21:35 UTC (permalink / raw)
  To: john stultz; +Cc: Linus Torvalds, Linux Kernel Mailing List

john stultz wrote:
> On Thu, Feb 26, 2009 at 12:43 PM, Jesper Krogh <jesper@krogh.cc> wrote:
>> Linus Torvalds wrote:
>>> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>>>> 2.6.26.8 doesn't have this problem.
>>>>
>>>> The "current_clocksource" is the same on both systems.
>>>>
>>>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>>>> tsc
>>> What does the frequency calibrate to? It should be in the dmesg. Does it
>>> differ by a big amount?
>> Non-working:
>> $ dmesg | grep -i freq
>> [    0.004007] Calibrating delay loop (skipped), value calculated using
>> timer frequency.. 4620.05 BogoMIPS (lpj=9240104)
>>
>> 2.6.26.8 doesn't have that information.
> 
> I'm surprised the clocksource watchdog isn't catching it.
> 
> What's the output from:
> cat /sys/devices/system/clocksource/clocksource0/available_clocksource

$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc acpi_pm jiffies

Same on both.

> Also mind sending the full dmesg for both kernels?

http://krogh.cc/~jesper/dmesg-2.6.29-rc6.txt
http://krogh.cc/~jesper/dmesg-2.6.26.8.txt

-- 
Jesper


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 21:35         ` Jesper Krogh
@ 2009-02-26 21:46           ` john stultz
  2009-02-26 21:54             ` Thomas Gleixner
                               ` (2 more replies)
  2009-02-26 21:49           ` Linus Torvalds
  2009-02-26 21:54           ` john stultz
  2 siblings, 3 replies; 81+ messages in thread
From: john stultz @ 2009-02-26 21:46 UTC (permalink / raw)
  To: Jesper Krogh; +Cc: Linus Torvalds, Linux Kernel Mailing List, Thomas Gleixner

On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
> john stultz wrote:
> > On Thu, Feb 26, 2009 at 12:43 PM, Jesper Krogh <jesper@krogh.cc> wrote:
> >> Linus Torvalds wrote:
> >>> On Thu, 26 Feb 2009, Jesper Krogh wrote:
> >>>> 2.6.26.8 doesn't have this problem.
> >>>>
> >>>> The "current_clocksource" is the same on both systems.
> >>>>
> >>>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> >>>> tsc
> >>> What does the frequency calibrate to? It should be in the dmesg. Does it
> >>> differ by a big amount?
> >> Non-working:
> >> $ dmesg | grep -i freq
> >> [    0.004007] Calibrating delay loop (skipped), value calculated using
> >> timer frequency.. 4620.05 BogoMIPS (lpj=9240104)
> >>
> >> 2.6.26.8 doesn't have that information.
> > 
> > I'm surprised the clocksource watchdog isn't catching it.
> > 
> > What's the output from:
> > cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> 
> $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> tsc acpi_pm jiffies

Hmm. Does booting w/ "clocksource=acpi_pm" also show the severe (~550ppm,
which NTP can't handle) drift?

From the dmesg, I don't see any major calibration difference right off. 

So I'd suspect something like TSC halting in idle could be causing
problems, but the watchdog should catch that as well. My only guess at
this point is that the ACPI PM is halting in idle along with the TSC. 

And you said this only happens under load? 

-john



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 21:35         ` Jesper Krogh
  2009-02-26 21:46           ` john stultz
@ 2009-02-26 21:49           ` Linus Torvalds
  2009-03-01 15:04             ` Jesper Krogh
  2009-02-26 21:54           ` john stultz
  2 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-02-26 21:49 UTC (permalink / raw)
  To: Jesper Krogh; +Cc: john stultz, Linux Kernel Mailing List



On Thu, 26 Feb 2009, Jesper Krogh wrote:
> 
> > Also mind sending the full dmesg for both kernels?
> 
> http://krogh.cc/~jesper/dmesg-2.6.29-rc6.txt
> http://krogh.cc/~jesper/dmesg-2.6.26.8.txt

Try changing

	#define QUICK_PIT_MS 15

in arch/x86/kernel/tsc.c into something bigger. Let's say just doubling 
it to 30. Does that change anything?

		Linus


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 21:35         ` Jesper Krogh
  2009-02-26 21:46           ` john stultz
  2009-02-26 21:49           ` Linus Torvalds
@ 2009-02-26 21:54           ` john stultz
  2009-02-26 22:06             ` Thomas Gleixner
  2 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-02-26 21:54 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Linus Torvalds, Linux Kernel Mailing List, Thomas Gleixner, Len Brown

On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
> > Also mind sending the full dmesg for both kernels?
> 
> http://krogh.cc/~jesper/dmesg-2.6.29-rc6.txt
> http://krogh.cc/~jesper/dmesg-2.6.26.8.txt

So one interesting difference:
2.6.26.8:	TSC calibrated against PM_TIMER
2.6.29-rc6:	Fast TSC calibration using PIT

Thomas, any thoughts as to why we might be calibrating off the PIT
instead of the PM_TIMER w/ 2.6.29?

Does this line maybe provide a hint?
FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4)


thanks
-john



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 21:46           ` john stultz
@ 2009-02-26 21:54             ` Thomas Gleixner
  2009-02-26 22:04               ` Jesper Krogh
  2009-02-27  6:30             ` Jesper Krogh
  2009-03-01 13:51             ` Jesper Krogh
  2 siblings, 1 reply; 81+ messages in thread
From: Thomas Gleixner @ 2009-02-26 21:54 UTC (permalink / raw)
  To: john stultz; +Cc: Jesper Krogh, Linus Torvalds, Linux Kernel Mailing List

On Thu, 26 Feb 2009, john stultz wrote:
> On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
> > john stultz wrote:
> > > On Thu, Feb 26, 2009 at 12:43 PM, Jesper Krogh <jesper@krogh.cc> wrote:
> > >> Linus Torvalds wrote:
> > >>> On Thu, 26 Feb 2009, Jesper Krogh wrote:
> > >>>> 2.6.26.8 doesn't have this problem.
> > >>>>
> > >>>> The "current_clocksource" is the same on both systems.
> > >>>>
> > >>>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > >>>> tsc
> > >>> What does the frequency calibrate to? It should be in the dmesg. Does it
> > >>> differ by a big amount?
> > >> Non-working:
> > >> $ dmesg | grep -i freq
> > >> [    0.004007] Calibrating delay loop (skipped), value calculated using
> > >> timer frequency.. 4620.05 BogoMIPS (lpj=9240104)
> > >>
> > >> 2.6.26.8 doesn't have that information.
> > > 
> > > I'm surprised the clocksource watchdog isn't catching it.
> > > 
> > > What's the output from:
> > > cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > 
> > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > tsc acpi_pm jiffies
> 
> Hmm. Does booting w/ "clocksource=acpi_pm" also show the severe (~550ppm,
> which NTP can't handle) drift?
> 
> From the dmesg, I don't see any major calibration difference right off. 
> 
> So I'd suspect something like TSC halting in idle could be causing
> problems, but the watchdog should catch that as well. My only guess at
> this point is that the ACPI PM is halting in idle along with the TSC. 

But why would it do that on 29-rc6 and not on 2.6.28.8 ? I'm not aware
of changes which might cause that.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 21:54             ` Thomas Gleixner
@ 2009-02-26 22:04               ` Jesper Krogh
  0 siblings, 0 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-02-26 22:04 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: john stultz, Linus Torvalds, Linux Kernel Mailing List

Thomas Gleixner wrote:
> On Thu, 26 Feb 2009, john stultz wrote:
>> On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
>>> john stultz wrote:
>>>> On Thu, Feb 26, 2009 at 12:43 PM, Jesper Krogh <jesper@krogh.cc> wrote:
>>>>> Linus Torvalds wrote:
>>>>>> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>>>>>>> 2.6.26.8 doesn't have this problem.
>>>>>>>
>>>>>>> The "current_clocksource" is the same on both systems.
>>>>>>>
>>>>>>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>>>>>>> tsc
>>>>>> What does the frequency calibrate to? It should be in the dmesg. Does it
>>>>>> differ by a big amount?
>>>>> Non-working:
>>>>> $ dmesg | grep -i freq
>>>>> [    0.004007] Calibrating delay loop (skipped), value calculated using
>>>>> timer frequency.. 4620.05 BogoMIPS (lpj=9240104)
>>>>>
>>>>> 2.6.26.8 doesn't have that information.
>>>> I'm surprised the clocksource watchdog isn't catching it.
>>>>
>>>> What's the output from:
>>>> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>>> $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>>> tsc acpi_pm jiffies
>> Hmm. Does booting w/ "clocksource=acpi_pm" also show the severe (~550ppm,
>> which NTP can't handle) drift?
>>
>> From the dmesg, I don't see any major calibration difference right off. 
>>
>> So I'd suspect something like TSC halting in idle could be causing
>> problems, but the watchdog should catch that as well. My only guess at
>> this point is that the ACPI PM is halting in idle along with the TSC. 
> 
> But why would it do that on 29-rc6 and not on 2.6.28.8 ? I'm not aware
> of changes which might cause that.

My comparison is 2.6.26.8 not 2.6.28.8 .. so fairly old.

It is a small cluster, so I'm slipping some test-kernels in when the
cluster is idle.

-- 
Jesper

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 21:54           ` john stultz
@ 2009-02-26 22:06             ` Thomas Gleixner
  2009-02-26 22:24               ` Linus Torvalds
  2009-02-26 22:31               ` john stultz
  0 siblings, 2 replies; 81+ messages in thread
From: Thomas Gleixner @ 2009-02-26 22:06 UTC (permalink / raw)
  To: john stultz
  Cc: Jesper Krogh, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Thu, 26 Feb 2009, john stultz wrote:
> On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
> > > Also mind sending the full dmesg for both kernels?
> > 
> > http://krogh.cc/~jesper/dmesg-2.6.29-rc6.txt
> > http://krogh.cc/~jesper/dmesg-2.6.26.8.txt
> 
> So one interesting difference:
> 2.6.26.8:	TSC calibrated against PM_TIMER
> 2.6.29-rc6:	Fast TSC calibration using PIT
> 
> Thomas, any thoughts as to why we might be calibrating off the PIT
> instead of the PM_TIMER w/ 2.6.29?

Yup, because we introduced the Fast PIT calibration in 2.6.28.

Is the delta anything NTP might get upset about:

2.6.26: time.c: Detected 2311.847 MHz processor.
2.6.29: Detected 2310.029 MHz processor.

If yes, then we need to fix NTP not the calibration code :)
 
> Maybe does this line provide a hint?
> FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4)

Red herring.
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 22:06             ` Thomas Gleixner
@ 2009-02-26 22:24               ` Linus Torvalds
  2009-02-26 22:31                 ` Linus Torvalds
  2009-02-26 22:31               ` john stultz
  1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-02-26 22:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: john stultz, Jesper Krogh, Linux Kernel Mailing List, Len Brown



On Thu, 26 Feb 2009, Thomas Gleixner wrote:
> 
> Is the delta anything NTP might get upset about:
> 
> 2.6.26: time.c: Detected 2311.847 MHz processor.
> 2.6.29: Detected 2310.029 MHz processor.
> 
> If yes, then we need to fix NTP not the calibration code :)

Well, that _is_ about 500ppm difference, and we claim that we _should_ 
have reached 150ppm with the 15ms delay. We clearly don't seem to have 
done that. I'm not quite sure why - we _should_ be finding the edge of the 
PIT events to within roughly a microsecond (assuming that's about as long 
as an "inb" takes), and that should give us a pretty good fast 
calibration, but maybe I'm overlooking something.

Or - and this may be more likely - there are chipsets that aren't very 
good at reading the PIT in a tight loop. That may explain why it's a 
problem on Jesper's hardware, but we haven't gotten tons of reports of 
this from others.

I see that it's a SunFire X2200, which I think uses an nVidia HT 
southbridge. I assume it's an nForce4 thing. There shouldn't be anything 
odd there, and the PIT read shouldn't be taking any longer than on 
anything else, but who knows? 

		Linus

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 22:06             ` Thomas Gleixner
  2009-02-26 22:24               ` Linus Torvalds
@ 2009-02-26 22:31               ` john stultz
  2009-02-26 22:40                 ` Linus Torvalds
  2009-02-27  6:47                 ` Jesper Krogh
  1 sibling, 2 replies; 81+ messages in thread
From: john stultz @ 2009-02-26 22:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jesper Krogh, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Thu, 2009-02-26 at 23:06 +0100, Thomas Gleixner wrote:
> On Thu, 26 Feb 2009, john stultz wrote:
> > On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
> > > > Also mind sending the full dmesg for both kernels?
> > > 
> > > http://krogh.cc/~jesper/dmesg-2.6.29-rc6.txt
> > > http://krogh.cc/~jesper/dmesg-2.6.26.8.txt
> > 
> > So one interesting difference:
> > 2.6.26.8:	TSC calibrated against PM_TIMER
> > 2.6.29-rc6:	Fast TSC calibration using PIT
> > 
> > Thomas, any thoughts as to why we might be calibrating off the PIT
> > instead of the PM_TIMER w/ 2.6.29?
> 
> Yup, because we introduced the Fast PIT calibration in 2.6.28.

Ah. Ok.

> Is the delta anything NTP might get upset about:
> 
> 2.6.26: time.c: Detected 2311.847 MHz processor.
> 2.6.29: Detected 2310.029 MHz processor.

I wouldn't think so.

Although, I recall that on some systems here, right after we deploy them,
we see something similar to the originally reported ntpd "time reset"
noise for a period of time while ntpd tries to find the right freq. For
some reason, I've noticed, having multiple servers in your ntp.conf
seems to increase NTP's difficulty at picking a time and converging. 

So this may be just the slight calibration change confusing ntp, or it
may be the NTP_INTERVAL_LENGTH change from a while back, which would
cause the drift value to change, doing the same thing (although I
thought that landed in the 2.6.24 timeframe, but I may be forgetting).

I'll kick up some of my own testing between these two releases to see if
I can't find something similar.

Jesper: How long was the box up for when you noticed the ntpd noise?

Also what's the output of the following under the different kernels:
ntpdc -c peers
ntpdc -c kerninfo

thanks
-john


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-02-26 22:24               ` Linus Torvalds
@ 2009-02-26 22:31                 ` Linus Torvalds
  0 siblings, 0 replies; 81+ messages in thread
From: Linus Torvalds @ 2009-02-26 22:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: john stultz, Jesper Krogh, Linux Kernel Mailing List, Len Brown



On Thu, 26 Feb 2009, Linus Torvalds wrote:

> 
> 
> On Thu, 26 Feb 2009, Thomas Gleixner wrote:
> > 
> > Is the delta anything NTP might get upset about:
> > 
> > 2.6.26: time.c: Detected 2311.847 MHz processor.
> > 2.6.29: Detected 2310.029 MHz processor.
> > 
> > If yes, then we need to fix NTP not the calibration code :)
> 
> Well, that _is_ about 500ppm difference

Doing the math rather than just eyeballing it, I think it's closer to 
800ppm than 500ppm. But maybe I did that wrong too.

Which is definitely pretty far out. The theory is that if we can catch the 
edge of the PIT timer to 1us, and even if we get it maximally wrong at 
beginning/end (ie the difference is off by 2us), a 2us error over 15ms 
should be on the order of just a 133ppm error.

So 800ppm looks too big. We're clearly not getting to within 1us of the 
PIT timer event edge. But it would be interesting to hear whether making 
the 15ms be 30ms will get us to a better place, and make ntp happier.

And maybe my math is just wrong, and it's not the "within 1us" assumption 
that was wrong.
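
For anyone who wants to check the numbers, the arithmetic above can be
sketched in a few lines of user-space C. The helper names (ppm_delta,
edge_error_ppm) are made up for illustration and are not kernel functions;
the inputs are the two "Detected ... MHz" values quoted above and the
assumed 1us accuracy at each end of the PIT measurement window.

```c
#include <assert.h>

/* Relative difference between two frequency estimates, in parts per
 * million. Hypothetical helper for illustration; not kernel code. */
double ppm_delta(double ref_mhz, double other_mhz)
{
	assert(ref_mhz > 0.0);
	double diff = ref_mhz - other_mhz;
	if (diff < 0.0)
		diff = -diff;
	return diff / ref_mhz * 1e6;
}

/* Worst-case calibration error in ppm if each end of the measurement
 * window is caught to within edge_us microseconds, so the total error
 * is at most 2 * edge_us over a window of window_ms milliseconds. */
double edge_error_ppm(double edge_us, double window_ms)
{
	assert(window_ms > 0.0);
	return (2.0 * edge_us) / (window_ms * 1000.0) * 1e6;
}
```

ppm_delta(2311.847, 2310.029) comes out at roughly 786ppm, while
edge_error_ppm(1.0, 15.0) is roughly 133ppm. Doubling the window to 30ms
halves that bound to about 67ppm.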

		Linus


* Re: Linux 2.6.29-rc6
  2009-02-26 22:31               ` john stultz
@ 2009-02-26 22:40                 ` Linus Torvalds
  2009-02-26 22:59                   ` john stultz
  2009-02-27  6:47                 ` Jesper Krogh
  1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-02-26 22:40 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Jesper Krogh, Linux Kernel Mailing List, Len Brown



On Thu, 26 Feb 2009, john stultz wrote:
> 
> I'll kick up some of my own testing between these two releases to see if
> I can't find something similar.

Since the PIT timer read is possibly hw-dependent, it might be that you 
can't necessarily reproduce it on some random hardware.

How sensitive is ntpd to (stable) drift? IOW, if we get the calibration 
wrong, the TSC should still hopefully be very _stable_, it's just that the 
initial guesstimate for the frequency is off and ntp would have to correct 
for that.

The easiest way to test might be to just force a 1000ppm estimation error 
with something like this total hack (indented just so that nobody would 
ever apply this by mistake):

	 arch/x86/kernel/tsc.c |    4 ++++
	 1 files changed, 4 insertions(+), 0 deletions(-)

	diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
	index 599e581..b80a0c4 100644
	--- a/arch/x86/kernel/tsc.c
	+++ b/arch/x86/kernel/tsc.c
	@@ -350,6 +350,10 @@ static unsigned long quick_pit_calibrate(void)
	 		delta = (t2 - t1)*PIT_TICK_RATE;
	 		do_div(delta, QUICK_PIT_ITERATIONS*256*1000);
	 		printk("Fast TSC calibration using PIT\n");
	+
	+		/* HACK! */
	+		delta -= delta >> 10;
	+
	 		return delta;
	 	}
	 failed:

which wouldn't be hardware-dependent.
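
As a rough check of how big that forced error is: delta >> 10 is
delta/1024, so the hack lowers the estimate by a little under 1000ppm
(about 977ppm). Below is a small user-space sketch of the same adjustment.
The function names are hypothetical, and the sample value assumes delta is
in kHz at that point and is loosely based on the 2310.029 MHz figure from
earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Same adjustment as the HACK line in the patch: subtract 1/1024 of
 * the value (integer shift, so the subtracted part is truncated). */
uint64_t skew_down_1024th(uint64_t delta)
{
	return delta - (delta >> 10);
}

/* Size of the applied skew, in ppm of the original value.
 * Illustrative helper only. */
double skew_ppm(uint64_t delta)
{
	assert(delta > 0);
	return (double)(delta >> 10) / (double)delta * 1e6;
}
```

With delta = 2310029 the shift subtracts 2255, giving 2307774, which is a
skew of roughly 976ppm.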

			Linus


* Re: Linux 2.6.29-rc6
  2009-02-26 22:40                 ` Linus Torvalds
@ 2009-02-26 22:59                   ` john stultz
  2009-02-27  7:33                     ` Ingo Molnar
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-02-26 22:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Jesper Krogh, Linux Kernel Mailing List, Len Brown

On Thu, 2009-02-26 at 14:40 -0800, Linus Torvalds wrote:
> 
> On Thu, 26 Feb 2009, john stultz wrote:
> > 
> > I'll kick up some of my own testing between these two releases to see if
> > I can't find something similar.
> 
> Since the PIT timer read is possibly hw-dependent, it might be that you 
> can't necessarily reproduce it on some random hardware.
> 
> How sensitive is ntpd to (stable) drift? IOW, if we get the calibration 
> wrong, the TSC should still hopefully be very _stable_, it's just that the 
> initial guesstimate for the frequency is off and ntp would have to correct 
> for that.

NTP can adjust the clock about +/-500ppm (so a 1000ppm range). Past that
it starts throwing errors.

Part of the issue is that if the drift value changes in between boots,
NTPd can take a while to settle down on the right freq. I suspect that's
what's happening here, and should the box be left alone for a few hours
(maybe overnight) NTPd will find the new drift correction and the issue
will go away.

Thomas tripped over this a little while back when the
NTP_INTERVAL_LENGTH change landed, but I think that was prior to 2.6.26,
so it's probably the calibration changes discussed, but I wanted to see
if there were any other slight changes that might be contributing to the
issue as well.

thanks
-john




* Re: Linux 2.6.29-rc6
  2009-02-26 21:46           ` john stultz
  2009-02-26 21:54             ` Thomas Gleixner
@ 2009-02-27  6:30             ` Jesper Krogh
  2009-03-01 13:51             ` Jesper Krogh
  2 siblings, 0 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-02-27  6:30 UTC (permalink / raw)
  To: john stultz; +Cc: Linus Torvalds, Linux Kernel Mailing List, Thomas Gleixner

john stultz wrote:
> On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
>> john stultz wrote:
>>> On Thu, Feb 26, 2009 at 12:43 PM, Jesper Krogh <jesper@krogh.cc> wrote:
>>>> Linus Torvalds wrote:
>>>>> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>>>>>> 2.6.26.8 doesnt have this problem.
>>>>>>
>>>>>> The "current_clocsource" is the same on both systems.
>>>>>>
>>>>>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>>>>>> tsc
>>>>> What does the frequency calibrate to? It should be in the dmesg. Does it
>>>>> differ by a big amount?
>>>> Non-working:
>>>> $ dmesg | grep -i freq
>>>> [    0.004007] Calibrating delay loop (skipped), value calculated using
>>>> timer frequency.. 4620.05 BogoMIPS (lpj=9240104)
>>>>
>>>> 2.6.26.8 doesn't have that information.
>>> I'm surprised the clocksource watchdog isn't catching it.
>>>
>>> What's the output from:
>>> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>> $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>> tsc acpi_pm jiffies
> 
> Hmm. Does booting w/ "clocksource=acpi_pm" also show the severe (~550ppm,
> which NTP can't handle) drift?

I booted another server (identical hardware) with the same kernel and
the above clocksource line, it has run over night (8 hours) with full
load and ntp has not complained about anything on that server.

> From the dmesg, I don't see any major calibration difference right off. 
> 
> So I'd suspect something like TSC halting in idle could be causing
> problems, but the watchdog should catch that as well. My only guess at
> this point is that the ACPI PM is halting in idle along with the TSC. 
> 
> And you said this only happens under load? 

I can't say that, but I've only observed it under load.

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-02-26 22:31               ` john stultz
  2009-02-26 22:40                 ` Linus Torvalds
@ 2009-02-27  6:47                 ` Jesper Krogh
  2009-02-27 20:35                   ` john stultz
  1 sibling, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-02-27  6:47 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> On Thu, 2009-02-26 at 23:06 +0100, Thomas Gleixner wrote:
>> On Thu, 26 Feb 2009, john stultz wrote:
>>> On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
>>>>> Also mind sending the full dmesg for both kernels?
>>>> http://krogh.cc/~jesper/dmesg-2.6.29-rc6.txt
>>>> http://krogh.cc/~jesper/dmesg-2.6.26.8.txt
>>> So one interesting difference:
>>> 2.6.26.8:	TSC calibrated against PM_TIMER
>>> 2.6.29-rc6:	Fast TSC calibration using PIT
>>>
>>> Thomas, any thoughts as to why we might be calibrating off the PIT
>>> instead of the PM_TIMER w/ 2.6.29?
>> Yup, because we introduced the Fast PIT calibration in 2.6.28.
> 
> Ah. Ok.
> 
>> Is the delta anything NTP might get upset about:
>>
>> 2.6.26: time.c: Detected 2311.847 MHz processor.
>> 2.6.29: Detected 2310.029 MHz processor.
> 
> I wouldn't think so.
> 
> Although, I recall that on some systems here, right after we deploy
> them, we see something similar to the originally reported ntpd "time
> reset" noise for a period of time while ntpd tries to find the right
> freq. For some reason, I've noticed, having multiple servers in your
> ntp.conf seems to make it harder for NTP to pick a time and converge.
> 
> So it may be that the slight calibration change is confusing ntp, or
> the NTP_INTERVAL_LENGTH change from a while back, which would cause the
> drift value to change, could be doing the same thing (although I
> thought that landed in the 2.6.24 timeframe, but I may be forgetting).
> 
> I'll kick up some of my own testing between these two releases to see if
> I can't find something similar.
> 
> Jesper: How long was the box up for when you noticed the ntpd noise?

It was booted Feb 25 21:58; the first noise from ntp starts here:
Feb 25 22:09:53 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 25 22:09:56 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 25 22:14:08 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 25 22:16:20 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
Feb 25 22:32:25 quad12 ntpd[4901]: time reset -1.601641 s
Feb 25 22:36:18 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
Feb 25 22:36:45 quad12 ntpd[4901]: synchronized to 10.194.133.12, stratum 4
Feb 25 22:51:41 quad12 ntpd[4901]: time reset -0.922993 s
Feb 25 22:55:05 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13



> Also what's the output of the following under the different kernels:
> ntpdc -c peers
> ntpdc -c kerninfo

Working (clocksource=acpi_pm) 2.6.29-rc6
jk@quad02:~$ ntpdc -c kerninfo
pll offset:           -0.001577 s
pll frequency:        -45.787 ppm
maximum error:        0.066739 s
estimated error:      0.000768 s
status:               0001  pll
pll time constant:    6
precision:            1e-06 s
frequency tolerance:  500 ppm
jk@quad02:~$ ntpdc -c peers
      remote           local      st poll reach  delay   offset    disp
=======================================================================
*hal.nzcorp.net  10.194.132.81    4   64  377 0.00008  0.003752 0.04816
=svn.nzcorp.net  10.194.132.81    4   64  377 0.00009 -0.008724 0.04979
=LOCAL(0)        127.0.0.1       13   64  377 0.00000  0.000000 0.03082


Working (clocksource=tsc) 2.6.26.8
jk@quad03:~$ ntpdc -c kerninfo
pll offset:           0.003208 s
pll frequency:        -25.070 ppm
maximum error:        0.833193 s
estimated error:      0.002787 s
status:               4001  pll
pll time constant:    10
precision:            1e-06 s
frequency tolerance:  500 ppm
jk@quad03:~$ ntpdc -c peers
      remote           local      st poll reach  delay   offset    disp
=======================================================================
*hal.nzcorp.net  10.194.132.82    4 1024  377 0.00781  0.006788 0.13666
=sal.nzcorp.net  10.194.132.82    4 1024  377 0.00018 -0.000541 0.12175
=LOCAL(0)        127.0.0.1       13   64  377 0.00000  0.000000 0.03041

Non-working (clocksource=tsc) 2.6.29-rc6
jk@quad12:~$ ntpdc -c kerninfo
pll offset:           0 s
pll frequency:        -34.754 ppm
maximum error:        0.023514 s
estimated error:      0 s
status:               0001  pll
pll time constant:    6
precision:            1e-06 s
frequency tolerance:  500 ppm
jk@quad12:~$ ntpdc -c peers
      remote           local      st poll reach  delay   offset    disp
=======================================================================
=hal.nzcorp.net  10.194.132.91    4   64   17 0.00011 -0.069377 0.96895
=trac.nzcorp.net 10.194.132.91    4   64   17 0.00011 -0.096107 0.96904
*LOCAL(0)        127.0.0.1       13   64   17 0.00000  0.000000 0.96857



* Re: Linux 2.6.29-rc6
  2009-02-26 22:59                   ` john stultz
@ 2009-02-27  7:33                     ` Ingo Molnar
  2009-02-27 20:50                       ` john stultz
  0 siblings, 1 reply; 81+ messages in thread
From: Ingo Molnar @ 2009-02-27  7:33 UTC (permalink / raw)
  To: john stultz
  Cc: Linus Torvalds, Thomas Gleixner, Jesper Krogh,
	Linux Kernel Mailing List, Len Brown


* john stultz <johnstul@us.ibm.com> wrote:

> On Thu, 2009-02-26 at 14:40 -0800, Linus Torvalds wrote:
> > 
> > On Thu, 26 Feb 2009, john stultz wrote:
> > > 
> > > I'll kick up some of my own testing between these two releases to see if
> > > I can't find something similar.
> > 
> > Since the PIT timer read is possibly hw-dependent, it might be that you 
> > can't necessarily reproduce it on some random hardware.
> > 
> > How sensitive is ntpd to (stable) drift? IOW, if we get the calibration 
> > wrong, the TSC should still hopefully be very _stable_, it's just that the 
> > initial guesstimate for the frequency is off and ntp would have to correct 
> > for that.
>
> NTP can adjust the clock about +/-500ppm (so a 1000ppm range). 
> Past that it starts throwing errors.

Well, it will start throwing errors but still it will correct 
the clock and find the frequency delta between the host clock 
and the reference clock just fine, and converge in a couple of 
hours, correct?

500ppm is 0.05% of a frequency drift which is awfully small - 
thermal effects alone can cause such differences so it should 
not be anything out of the ordinary for ntpd.

> Part of the issue is that if the drift value changes in 
> between boots, NTPd can take a while to settle down on the 
> right freq. I suspect that's what's happening here, and should 
> the box be left alone for a few hours (maybe overnight) NTPd 
> will find the new drift correction and the issue will go away.

If the default poll interval of 64 seconds is used then it can 
take that much time - so i'd suggest decreasing that to below 10 
seconds.

It's not like the frequency is changing rapidly here. The 
correction pattern to find is a very simple, static and 
reliable multiplier of ~1.000800 between the two frequencies.

Say the over-the-network reference clock ntpd follows has a 10 
msecs of intrinsic observation noise. For that 10 msecs noise to 
go down to the 10 ppm range [to the local but drifted time 
source which has ~10 ppm precision straight away], we need 
roughly 1000 samples. [simplified, fewer are enough in reality, 
especially if you have some known-to-have-converged-before 
cached value to start out with.]

1000 samples with 64 seconds intervals can take half a day to 
converge. 1000 samples with 1 second intervals takes just 15 
minutes to converge.
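
The two convergence figures above are just sample count times poll
interval. A minimal C sketch of that back-of-the-envelope estimate
follows. The function name is hypothetical, and ntpd's real loop filter
is adaptive, so this is only the simplified model used in this mail.

```c
#include <assert.h>

/* Wall-clock time, in seconds, needed to collect `samples` polls at a
 * fixed poll interval. Simplified model for illustration only. */
double collect_time_seconds(unsigned int samples, double poll_interval_s)
{
	assert(poll_interval_s > 0.0);
	return (double)samples * poll_interval_s;
}
```

1000 samples at 64 seconds is 64000 seconds (a bit under 18 hours), and
1000 samples at 1 second is 1000 seconds (a bit under 17 minutes), which
is in line with the rough figures above.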

We'll improve in-kernel calibration but calibration noise in the 
0.05% range should be expected in some cases.

	Ingo


* Re: Linux 2.6.29-rc6
  2009-02-27  6:47                 ` Jesper Krogh
@ 2009-02-27 20:35                   ` john stultz
  2009-03-01 20:13                     ` Jesper Krogh
  2009-03-02  9:53                     ` Jesper Krogh
  0 siblings, 2 replies; 81+ messages in thread
From: john stultz @ 2009-02-27 20:35 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Fri, 2009-02-27 at 07:47 +0100, Jesper Krogh wrote:
> john stultz wrote:
> > Jesper: How long was the box up for when you noticed the ntpd noise?
> 
> It was booted Feb 25 21:58; the first noise from ntp starts here:
> Feb 25 22:09:53 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
> Feb 25 22:09:56 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
> Feb 25 22:14:08 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
> Feb 25 22:16:20 quad12 ntpd[4901]: synchronized to 10.194.133.13, stratum 4
> Feb 25 22:32:25 quad12 ntpd[4901]: time reset -1.601641 s
> Feb 25 22:36:18 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13
> Feb 25 22:36:45 quad12 ntpd[4901]: synchronized to 10.194.133.12, stratum 4
> Feb 25 22:51:41 quad12 ntpd[4901]: time reset -0.922993 s
> Feb 25 22:55:05 quad12 ntpd[4901]: synchronized to LOCAL(0), stratum 13

Ok, so that's not very long. I'd expect by now, if the box is still up,
the messages have stopped. Is that true, or is it still resetting?


> > Also what's the output of the following under the different kernels:
> > ntpdc -c peers
> > ntpdc -c kerninfo
[snip]
> Working (clocksource=tsc) 2.6.26.8
> jk@quad03:~$ ntpdc -c kerninfo
> pll offset:           0.003208 s
> pll frequency:        -25.070 ppm
[snip]
> Non-working (clocksource=tsc) 2.6.29-rc6
> jk@quad12:~$ ntpdc -c kerninfo
> pll offset:           0 s
> pll frequency:        -34.754 ppm


Ok, so it seems ntp hasn't really had a chance to settle down, it's only
made a 10ppm adjustment so far. NTPd will stop corrections at ~
+/-500ppm, so you're not at that bound yet, where things would be really
broken.

If the affected kernel isn't resetting in the logs anymore, I'd be
interested in what the new ppm value is.

thanks
-john




* Re: Linux 2.6.29-rc6
  2009-02-27  7:33                     ` Ingo Molnar
@ 2009-02-27 20:50                       ` john stultz
  0 siblings, 0 replies; 81+ messages in thread
From: john stultz @ 2009-02-27 20:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Thomas Gleixner, Jesper Krogh,
	Linux Kernel Mailing List, Len Brown

On Fri, 2009-02-27 at 08:33 +0100, Ingo Molnar wrote:
> * john stultz <johnstul@us.ibm.com> wrote:
> 
> > On Thu, 2009-02-26 at 14:40 -0800, Linus Torvalds wrote:
> > > 
> > > On Thu, 26 Feb 2009, john stultz wrote:
> > > > 
> > > > I'll kick up some of my own testing between these two releases to see if
> > > > I can't find something similar.
> > > 
> > > Since the PIT timer read is possibly hw-dependent, it might be that you 
> > > can't necessarily reproduce it on some random hardware.
> > > 
> > > How sensitive is ntpd to (stable) drift? IOW, if we get the calibration 
> > > wrong, the TSC should still hopefully be very _stable_, it's just that the 
> > > initial guesstimate for the frequency is off and ntp would have to correct 
> > > for that.
> >
> > NTP can adjust the clock about +/-500ppm (so a 1000ppm range). 
> > Past that it starts throwing errors.
> 
> Well, it will start throwing errors but still it will correct 
> the clock and find the frequency delta between the host clock 
> and the reference clock just fine, and converge in a couple of 
> hours, correct?

No, the NTP spec limits the freq correction to ~+/-500ppm. Once NTPd hits
that 500ppm wall, it will throw an error and stop trying to sync the
clock.
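
To make the limit concrete, here is a small C sketch of the check. The
function and macro names are made up for illustration and this is not
ntpd code; the 500ppm figure is the "frequency tolerance" value shown in
the ntpdc -c kerninfo output earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* "frequency tolerance" as reported by ntpdc -c kerninfo in this thread. */
#define NTP_FREQ_TOLERANCE_PPM 500.0

/* True if a required frequency correction fits inside the
 * +/-tolerance_ppm window. Hypothetical helper, not ntpd code. */
bool within_freq_tolerance(double required_ppm, double tolerance_ppm)
{
	assert(tolerance_ppm >= 0.0);
	double mag = required_ppm < 0.0 ? -required_ppm : required_ppm;
	return mag <= tolerance_ppm;
}
```

The -34.754ppm value Jesper reported is well inside the window, whereas a
calibration error on the order of 800ppm, if that is what NTP ends up
having to correct, would be outside it.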

> 500ppm is 0.05% of a frequency drift which is awfully small - 
> thermal effects alone can cause such differences so it should 
> not be anything out of the ordinary for ntpd.

Practically I've not seen boxes that vary that much. I've seen very poor
systems whose crystals are off by ~280ppm, but those don't vary that
much over time.


> > Part of the issue is that if the drift value changes in 
> > between boots, NTPd can take a while to settle down on the 
> > right freq. I suspect that's what's happening here, and should 
> > the box be left alone for a few hours (maybe overnight) NTPd 
> > will find the new drift correction and the issue will go away.
> 
> If the default poll interval of 64 seconds is used then it can 
> take that much time - so i'd suggest decreasing that to below 10 
> seconds.

Indeed. Shortening the maxpoll value in the ntp.conf greatly improves
how fast and how close the client will sync to the server, but take
caution, as that can cause undue load on public time servers.  

thanks
-john




* Re: Linux 2.6.29-rc6
  2009-02-26 21:46           ` john stultz
  2009-02-26 21:54             ` Thomas Gleixner
  2009-02-27  6:30             ` Jesper Krogh
@ 2009-03-01 13:51             ` Jesper Krogh
  2 siblings, 0 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-03-01 13:51 UTC (permalink / raw)
  To: john stultz; +Cc: Linus Torvalds, Linux Kernel Mailing List, Thomas Gleixner

john stultz wrote:
> On Thu, 2009-02-26 at 22:35 +0100, Jesper Krogh wrote:
>> john stultz wrote:
>>> On Thu, Feb 26, 2009 at 12:43 PM, Jesper Krogh <jesper@krogh.cc> wrote:
>>>> Linus Torvalds wrote:
>>>>> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>>>>>> 2.6.26.8 doesnt have this problem.
>>>>>>
>>>>>> The "current_clocsource" is the same on both systems.
>>>>>>
>>>>>> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>>>>>> tsc
>>>>> What does the frequency calibrate to? It should be in the dmesg. Does it
>>>>> differ by a big amount?
>>>> Non-working:
>>>> $ dmesg | grep -i freq
>>>> [    0.004007] Calibrating delay loop (skipped), value calculated using
>>>> timer frequency.. 4620.05 BogoMIPS (lpj=9240104)
>>>>
>>>> 2.6.26.8 doesn't have that information.
>>> I'm surprised the clocksource watchdog isn't catching it.
>>>
>>> What's the output from:
>>> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>> $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>> tsc acpi_pm jiffies
> 
> Hmm. Does booting w/ "clocksource=acpi_pm" also show the severe (~550ppm,
> which NTP can't handle) drift?
> 
> From the dmesg, I don't see any major calibration difference right off. 
> 
> So I'd suspect something like TSC halting in idle could be causing
> problems, but the watchdog should catch that as well. My only guess at
> this point is that the ACPI PM is halting in idle along with the TSC. 
> 
> And you said this only happens under load? 

That wasn't true. I got some real Sunday testing done today. A fresh 
2.6.28.7 has the same problem with a load of 0.00 0.00 0.00.

2.6.27.19 doesn't have problems keeping time.

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-02-26 21:49           ` Linus Torvalds
@ 2009-03-01 15:04             ` Jesper Krogh
  0 siblings, 0 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-03-01 15:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: john stultz, Linux Kernel Mailing List

Linus Torvalds wrote:
> 
> On Thu, 26 Feb 2009, Jesper Krogh wrote:
>>> Also mind sending the full dmesg for both kernels?
>> http://krogh.cc/~jesper/dmesg-2.6.29-rc6.txt
>> http://krogh.cc/~jesper/dmesg-2.6.26.8.txt
> 
> Try changing
> 
> 	#define QUICK_PIT_MS 15
> 
> in arch/x86/kernel/tsc.c into something bigger. Let's say just doubling 
> it to 30. Does that change anything?

It seems to "slow down" the process (time from bootup to first clock 
reset).

Mar  1 15:38:41 quad01 ntpd[4603]: synchronized to LOCAL(0), stratum 13
Mar  1 15:38:41 quad01 ntpd[4603]: kernel time sync status change 0001
Mar  1 15:39:47 quad01 ntpd[4603]: synchronized to 10.194.133.13, stratum 4
Mar  1 15:43:02 quad01 ntpd[4603]: synchronized to 10.194.133.12, stratum 4
Mar  1 15:53:41 quad01 ntpd[4603]: time reset -0.352221 s
Mar  1 15:57:18 quad01 ntpd[4603]: synchronized to LOCAL(0), stratum 13
Mar  1 15:58:23 quad01 ntpd[4603]: synchronized to 10.194.133.13, stratum 4
jk@quad01:~$ w
  16:03:29 up 28 min,  2 users,  load average: 0.04, 0.01, 0.00

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-02-26 19:55 ` Jesper Krogh
  2009-02-26 20:33   ` Linus Torvalds
@ 2009-03-01 15:09   ` Jesper Krogh
  2009-03-01 15:44     ` Linux 2.6.29-rc6 (clocksource) Sitsofe Wheeler
  1 sibling, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-01 15:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

Jesper Krogh wrote:
> The "current_clocsource" is the same on both systems.
> 
> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> tsc

What selects the "current_clocksource"? I tried to boot one of the 
kernels that have the problem on another piece of hardware and on that 
system it ended up defaulting to "acpi_pm" instead of "tsc".

http://krogh.cc/~jesper/dmesg-2.6.28.7.txt

"acpi_pm" seems to be reliable all the time.

-- 
Jesper


* Re: Linux 2.6.29-rc6 (clocksource)
  2009-03-01 15:09   ` Jesper Krogh
@ 2009-03-01 15:44     ` Sitsofe Wheeler
  0 siblings, 0 replies; 81+ messages in thread
From: Sitsofe Wheeler @ 2009-03-01 15:44 UTC (permalink / raw)
  To: Jesper Krogh; +Cc: Linus Torvalds, Linux Kernel Mailing List

On Sun, Mar 01, 2009 at 04:09:03PM +0100, Jesper Krogh wrote:
> Jesper Krogh wrote:
> >The "current_clocsource" is the same on both systems.
> >
> >$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> >tsc
> 
> What selects the "current_clocksource"? I tried to boot one of the 
> kernels that have the problem on another piece of hardware and on that 
> system it ended up defaulting to "acpi_pm" instead of "tsc".

I believe different clock sources have different priorities based on
their resolution and behaviour. Clock sources that "go bad" because of
hardware interactions are hopefully detected, and the next "best" clock
sources are then tried.

There was a nice treatment of different clocksources in this
kernelnewbies thread:
http://www.mail-archive.com/kernelnewbies@nl.linux.org/msg05164.html .

-- 
Sitsofe | http://sucs.org/~sits/


* Re: Linux 2.6.29-rc6
  2009-02-27 20:35                   ` john stultz
@ 2009-03-01 20:13                     ` Jesper Krogh
  2009-03-02  9:53                     ` Jesper Krogh
  1 sibling, 0 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-03-01 20:13 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
>> Working (clocksource=tsc) 2.6.26.8
>> jk@quad03:~$ ntpdc -c kerninfo
>> pll offset:           0.003208 s
>> pll frequency:        -25.070 ppm
> [snip]
>> Non-working (clocksource=tsc) 2.6.29-rc6
>> jk@quad12:~$ ntpdc -c kerninfo
>> pll offset:           0 s
>> pll frequency:        -34.754 ppm
> 
> 
> Ok, so it seems ntp hasn't really had a chance to settle down, it's only
> made a 10ppm adjustment so far. NTPd will stop corrections at ~
> +/-500ppm, so you're not at that bound yet, where things would be really
> broken.

But it should settle within a "reasonable" period of time? (not hours?).

> If the affected kernel isn't resetting in the logs anymore, I'd be
> interested in what the new ppm value is.

It keeps resetting after 7 hours. Is there more information I can 
provide?

Jesper

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-02-27 20:35                   ` john stultz
  2009-03-01 20:13                     ` Jesper Krogh
@ 2009-03-02  9:53                     ` Jesper Krogh
  2009-03-02 21:27                       ` john stultz
  1 sibling, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-02  9:53 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> Ok, so it seems ntp hasn't really had a chance to settle down, it's only
> made a 10ppm adjustment so far. NTPd will stop corrections at ~
> +/-500ppm, so you're not at that bound yet, where things would be really
> broken.
> 
> If the affected kernel isn't resetting in the logs anymore, I'd be
> interested in what the new ppm value is.

After 20 hours, it's still resetting.
Mar  2 10:43:24 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
Mar  2 10:50:37 quad12 ntpd[4416]: time reset -1.103654 s
jk@quad12:~$ uptime
  10:51:36 up 20:46,  1 user,  load average: 0.00, 0.00, 0.00

And it hasn't shifted clocksource either.

jk@quad12:~$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-02  9:53                     ` Jesper Krogh
@ 2009-03-02 21:27                       ` john stultz
  2009-03-03  6:04                         ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-03-02 21:27 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Mon, 2009-03-02 at 10:53 +0100, Jesper Krogh wrote:
> john stultz wrote:
> > Ok, so it seems ntp hasn't really had a chance to settle down, it's only
> > made a 10ppm adjustment so far. NTPd will stop corrections at ~
> > +/-500ppm, so you're not at that bound yet, where things would be really
> > broken.
> > 
> > If the affected kernel isn't resetting in the logs anymore, I'd be
> > interested in what the new ppm value is.
> 
> After 20 hours, it's still resetting.
> Mar  2 10:43:24 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
> Mar  2 10:50:37 quad12 ntpd[4416]: time reset -1.103654 s

So what's the "ntpdc -c kerninfo" output now?

thanks
-john




* Re: Linux 2.6.29-rc6
  2009-03-02 21:27                       ` john stultz
@ 2009-03-03  6:04                         ` Jesper Krogh
  2009-03-03 19:53                           ` john stultz
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-03  6:04 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> On Mon, 2009-03-02 at 10:53 +0100, Jesper Krogh wrote:
>> john stultz wrote:
>>> Ok, so it seems ntp hasn't really had a chance to settle down, it's only
>>> made a 10ppm adjustment so far. NTPd will stop corrections at ~
>>> +/-500ppm, so you're not at that bound yet, where things would be really
>>> broken.
>>>
>>> If the affected kernel isn't resetting in the logs anymore, I'd be
>>> interested in what the new ppm value is.
>> After 20 hours, it's still resetting.
>> Mar  2 10:43:24 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
>> Mar  2 10:50:37 quad12 ntpd[4416]: time reset -1.103654 s
> 
> So what's the "ntpdc -c kerninfo" output now?

Mar  3 06:41:10 quad12 ntpd[4416]: time reset -0.813957 s
Mar  3 06:45:20 quad12 ntpd[4416]: synchronized to LOCAL(0), stratum 13
Mar  3 06:45:36 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
Mar  3 06:51:57 quad12 ntpd[4416]: synchronized to 10.194.133.13, stratum 4
Mar  3 07:00:29 quad12 ntpd[4416]: time reset -0.783390 s
jk@quad12:~$ ntpdc -c kerninfo
pll offset:           0 s
pll frequency:        -28.691 ppm
maximum error:        1.0433 s
estimated error:      0 s
status:               0001  pll
pll time constant:    4
precision:            1e-06 s
frequency tolerance:  500 ppm
jk@quad12:~$ w
  07:03:17 up 1 day, 16:59,  1 user,  load average: 0.00, 0.00, 0.00




* Re: Linux 2.6.29-rc6
  2009-03-03  6:04                         ` Jesper Krogh
@ 2009-03-03 19:53                           ` john stultz
  2009-03-03 20:19                             ` Jesper Krogh
  2009-03-03 20:39                             ` Jesper Krogh
  0 siblings, 2 replies; 81+ messages in thread
From: john stultz @ 2009-03-03 19:53 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Tue, 2009-03-03 at 07:04 +0100, Jesper Krogh wrote:
> john stultz wrote:
> > On Mon, 2009-03-02 at 10:53 +0100, Jesper Krogh wrote:
> >> john stultz wrote:
> >>> Ok, so it seems ntp hasn't really had a chance to settle down, it's only
> >>> made a 10ppm adjustment so far. NTPd will stop corrections at ~
> >>> +/-500ppm, so you're not at that bound yet, where things would be really
> >>> broken.
> >>>
> >>> If the affected kernel isn't resetting in the logs anymore, I'd be
> >>> interested in what the new ppm value is.
> >> After 20 hours, it's still resetting.
> >> Mar  2 10:43:24 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
> >> Mar  2 10:50:37 quad12 ntpd[4416]: time reset -1.103654 s
> > 
> > So what's the "ntpdc -c kerninfo" output now?
> 
> Mar  3 06:41:10 quad12 ntpd[4416]: time reset -0.813957 s
> Mar  3 06:45:20 quad12 ntpd[4416]: synchronized to LOCAL(0), stratum 13
> Mar  3 06:45:36 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
> Mar  3 06:51:57 quad12 ntpd[4416]: synchronized to 10.194.133.13, stratum 4
> Mar  3 07:00:29 quad12 ntpd[4416]: time reset -0.783390 s
> jk@quad12:~$ ntpdc -c kerninfo
> pll offset:           0 s
> pll frequency:        -28.691 ppm


This is baffling. You've only gone from -34.754ppm to -28.691ppm in over
a day? And you're still not syncing? If the calibration was so bad that
NTP couldn't sync, I'd expect the freq value to hit +/-500ppm before it
gave up. This just doesn't follow my expectations.

Could you provide:
/usr/sbin/ntpdc -c version

Do you see the same behavior if you drop all but one server (including
the local clock: 127.127.1.0)? 

You might also add "minpoll 4 maxpoll 4" to the server line to speed up
testing.

Actually, if you could, I'd be interested if you could send your
ntp.conf 

thanks
-john



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-03 19:53                           ` john stultz
@ 2009-03-03 20:19                             ` Jesper Krogh
  2009-03-03 22:22                               ` john stultz
  2009-03-03 20:39                             ` Jesper Krogh
  1 sibling, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-03 20:19 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> On Tue, 2009-03-03 at 07:04 +0100, Jesper Krogh wrote:
>> john stultz wrote:
>>> On Mon, 2009-03-02 at 10:53 +0100, Jesper Krogh wrote:
>>>> john stultz wrote:
>>>>> Ok, so it seems ntp hasn't really had a chance to settle down, its only
>>>>> made a 10ppm adjustment so far. NTPd will stop corrections at ~
>>>>> +/-500ppm, so you're not at that bound yet, where things would be really
>>>>> broken.
>>>>>
>>>>> If the affected kernel isn't resetting in the logs anymore, I'd be
>>>>> interested in what the new ppm value is.
>>>> After 20 hours.. its still resetting.
>>>> Mar  2 10:43:24 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
>>>> Mar  2 10:50:37 quad12 ntpd[4416]: time reset -1.103654 s
>>> So what's the "ntpdc -c kerninfo" output now?
>> Mar  3 06:41:10 quad12 ntpd[4416]: time reset -0.813957 s
>> Mar  3 06:45:20 quad12 ntpd[4416]: synchronized to LOCAL(0), stratum 13
>> Mar  3 06:45:36 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
>> Mar  3 06:51:57 quad12 ntpd[4416]: synchronized to 10.194.133.13, stratum 4
>> Mar  3 07:00:29 quad12 ntpd[4416]: time reset -0.783390 s
>> jk@quad12:~$ ntpdc -c kerninfo
>> pll offset:           0 s
>> pll frequency:        -28.691 ppm
> 
> 
> This is baffling. You've only gone from -34.754ppm to -28.691ppm in over
> a day? And you're still not syncing? If the calibration was so bad that
> NTP couldn't sync, I'd expect the freq value to hit +/-500ppm before it
> gave up. This just doesn't follow my expectations.

It's resetting.. without deep knowledge about ntp, doesn't that mean
"start over again"? I believe it hits +/-500ppm

> Could you provide:
> /usr/sbin/ntpdc -c version

$ ntpdc -c version
ntpdc 4.2.4p4@1.1520-o Tue Jan  6 15:51:00 UTC 2009 (1)

> Do you see the same behavior if you drop all but one server (including
> the local clock: 127.127.1.0)? 
> 
> You might also add "minpoll 4 maxpoll 4" to the server line to speed up
> testing.

Will try those options while debugging.

> Actually, if you could, I'd be interested if you could send your
> ntp.conf 

http://krogh.cc/~jesper/ntp.conf

But this seems to be a "regression", since 2.6.27.19 doesn't misbehave.
Same NTP, same configuration, same hardware. The only change is the kernel
version. Or am I missing some parameter here?

Would it make sense to try to bisect it?

Jesper

-- 
Jesper

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-03 19:53                           ` john stultz
  2009-03-03 20:19                             ` Jesper Krogh
@ 2009-03-03 20:39                             ` Jesper Krogh
  2009-03-03 22:16                               ` john stultz
  1 sibling, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-03 20:39 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> Do you see the same behavior if you drop all but one server (including
> the local clock: 127.127.1.0)? 

Yes.
Mar  3 21:20:59 quad12 ntpd[2435]: ntpd 4.2.4p4@1.1520-o Tue Jan  6 15:50:55 UTC 2009 (1)
Mar  3 21:20:59 quad12 ntpd[2436]: precision = 1.000 usec
Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #0 wildcard, 0.0.0.0#123 Disabled
Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #1 wildcard, ::#123 Disabled
Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #2 lo, ::1#123 Enabled
Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #3 bond0, fe80::21e:68ff:fe57:8169#123 Enabled
Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #4 lo, 127.0.0.1#123 Enabled
Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #5 bond0, 10.194.132.91#123 Enabled
Mar  3 21:20:59 quad12 ntpd[2436]: kernel time sync status 0040
Mar  3 21:20:59 quad12 ntpd[2436]: frequency initialized -29.286 PPM from /var/lib/ntp/ntp.drift
Mar  3 21:21:58 quad12 ntpd[2436]: synchronized to 10.194.133.12, stratum 4
Mar  3 21:21:58 quad12 ntpd[2436]: time reset -6.148275 s
Mar  3 21:21:58 quad12 ntpd[2436]: kernel time sync status change 0001
Mar  3 21:25:01 quad12 ntpd[2436]: synchronized to 10.194.133.12, stratum 4
Mar  3 21:37:03 quad12 ntpd[2436]: time reset -0.664351 s

That is with only one server, and with the minpoll 4 maxpoll 4 options
added to the server line.

-- 
Jesper

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-03 20:39                             ` Jesper Krogh
@ 2009-03-03 22:16                               ` john stultz
  2009-03-04  5:36                                 ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-03-03 22:16 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Tue, 2009-03-03 at 21:39 +0100, Jesper Krogh wrote:
> john stultz wrote:
> > Do you see the same behavior if you drop all but one server (including
> > the local clock: 127.127.1.0)? 
> 
> Yes.
> Mar  3 21:20:59 quad12 ntpd[2435]: ntpd 4.2.4p4@1.1520-o Tue Jan  6 
> 15:50:55 UTC 2009 (1)
> Mar  3 21:20:59 quad12 ntpd[2436]: precision = 1.000 usec
> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #0 wildcard, 
> 0.0.0.0#123 Disabled
> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #1 wildcard, 
> ::#123 Disabled
> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #2 lo, ::1#123 
> Enabled
> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #3 bond0, 
> fe80::21e:68ff:fe57:8169#123 Enabled
> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #4 lo, 
> 127.0.0.1#123 Enabled
> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #5 bond0, 
> 10.194.132.91#123 Enabled
> Mar  3 21:20:59 quad12 ntpd[2436]: kernel time sync status 0040
> Mar  3 21:20:59 quad12 ntpd[2436]: frequency initialized -29.286 PPM 
> from /var/lib/ntp/ntp.drift
> Mar  3 21:21:58 quad12 ntpd[2436]: synchronized to 10.194.133.12, stratum 4
> Mar  3 21:21:58 quad12 ntpd[2436]: time reset -6.148275 s
> Mar  3 21:21:58 quad12 ntpd[2436]: kernel time sync status change 0001
> Mar  3 21:25:01 quad12 ntpd[2436]: synchronized to 10.194.133.12, stratum 4
> Mar  3 21:37:03 quad12 ntpd[2436]: time reset -0.664351 s
> 
> Only one server and the minpoll 4 maxpoll 4 options to the server line.

Well, it may still need a few hours to settle. :)  Again, those time
resets are seen when NTPd doesn't have a good drift ppm at startup, and
it has to find it.

thanks
-john


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-03 20:19                             ` Jesper Krogh
@ 2009-03-03 22:22                               ` john stultz
  2009-03-04 15:30                                 ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-03-03 22:22 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

[-- Attachment #1: Type: text/plain, Size: 4908 bytes --]

On Tue, 2009-03-03 at 21:19 +0100, Jesper Krogh wrote:
> john stultz wrote:
> > On Tue, 2009-03-03 at 07:04 +0100, Jesper Krogh wrote:
> >> john stultz wrote:
> >>> On Mon, 2009-03-02 at 10:53 +0100, Jesper Krogh wrote:
> >>>> john stultz wrote:
> >>>>> Ok, so it seems ntp hasn't really had a chance to settle down, its only
> >>>>> made a 10ppm adjustment so far. NTPd will stop corrections at ~
> >>>>> +/-500ppm, so you're not at that bound yet, where things would be really
> >>>>> broken.
> >>>>>
> >>>>> If the affected kernel isn't resetting in the logs anymore, I'd be
> >>>>> interested in what the new ppm value is.
> >>>> After 20 hours.. its still resetting.
> >>>> Mar  2 10:43:24 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
> >>>> Mar  2 10:50:37 quad12 ntpd[4416]: time reset -1.103654 s
> >>> So what's the "ntpdc -c kerninfo" output now?
> >> Mar  3 06:41:10 quad12 ntpd[4416]: time reset -0.813957 s
> >> Mar  3 06:45:20 quad12 ntpd[4416]: synchronized to LOCAL(0), stratum 13
> >> Mar  3 06:45:36 quad12 ntpd[4416]: synchronized to 10.194.133.12, stratum 4
> >> Mar  3 06:51:57 quad12 ntpd[4416]: synchronized to 10.194.133.13, stratum 4
> >> Mar  3 07:00:29 quad12 ntpd[4416]: time reset -0.783390 s
> >> jk@quad12:~$ ntpdc -c kerninfo
> >> pll offset:           0 s
> >> pll frequency:        -28.691 ppm
> > 
> > 
> > This is baffling. You've only gone from -34.754ppm to -28.691ppm in over
> > a day? And you're still not syncing? If the calibration was so bad that
> > NTP couldn't sync, I'd expect the freq value to hit +/-500ppm before it
> > gave up. This just doesn't follow my expectations.
> 
> It's resetting.. without deep knowledge about ntp, doesnt that mean 
> "start over again"? I believe it hits +/-500ppm

No, the "time reset" message means that the offset was larger
than .125sec (the slew boundary), so NTPd corrected it by calling
settimeofday instead of slewing the clock.

Here's some background about how NTP and the kernel interact:
Every time NTPd calls adjtimex(), it provides the current offset from
the tracked ntp server. The kernel takes this offset and applies a
temporary correction factor to the clocksource frequency to converge
that offset. It also takes the provided offset, dampens it, and then
uses the result to adjust the frequency value. Once the freq value hits
the max adjustment value (+/- 500ppm), then NTP will start throwing
error messages and give up.
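
A rough sketch of the two limits described above may help. It is a toy illustration, not ntpd's or the kernel's actual code: the 0.125 s slew boundary and the 500 ppm cap are the values quoted in this message, and the function names are made up for the example.

```python
# Toy illustration only; names and structure are hypothetical.
SLEW_BOUNDARY_S = 0.125   # slew boundary as quoted in this message
MAX_FREQ_PPM = 500.0      # maximum frequency adjustment quoted in this message


def correction_mode(offset_s):
    """Return 'step' when the offset is too large to slew, else 'slew'."""
    return "step" if abs(offset_s) > SLEW_BOUNDARY_S else "slew"


def clamp_freq(requested_ppm):
    """Limit a requested frequency correction to +/- MAX_FREQ_PPM."""
    return max(-MAX_FREQ_PPM, min(MAX_FREQ_PPM, requested_ppm))


if __name__ == "__main__":
    # Offsets from the "time reset" log lines quoted above.
    for offset in (-1.103654, -0.813957, -0.783390):
        print(offset, correction_mode(offset))
    # A correction inside the limit passes through; one outside is capped.
    print(clamp_freq(-28.691), clamp_freq(-650.0))
```

Every offset in the quoted "time reset" lines is well past the boundary, which is why each one shows up as a step rather than a slew.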

The part that is so odd with your data is that the freq value isn't
changing very much. After a time reset, I'd expect to see adjustments in
the 100us range, then multiple ms, and only once we get above 100ms to see
another time reset. All the while, these adjustment values should be
tweaking the freq value, causing the clocks to converge.

The one case I can think of that could cause this is if the drift is
somehow jumping above the slew boundary before NTPd actually makes any
adjtimex calls, so we end up with minimal correction to the freq value,
but that still doesn't completely jibe with the data.


> > Could you provide:
> > /usr/sbin/ntpdc -c version
> 
> $ ntpdc -c version
> ntpdc 4.2.4p4@1.1520-o Tue Jan  6 15:51:00 UTC 2009 (1)
> 
> > Do you see the same behavior if you drop all but one server (including
> > the local clock: 127.127.1.0)? 
> > 
> > You might also add "minpoll 4 maxpoll 4" to the server line to speed up
> > testing.
> 
> Will try those option while debugging.
> 
> > Actually, if you could, I'd be interested if you could send your
> > ntp.conf 
> 
> http://krogh.cc/~jesper/ntp.conf

Cool, I see you're collecting stats already. Depending on the results of
the tests above I may want to check those out as well.

> But this seems to be a "regression". Since 2.6.27.19 doesn't misbehave. 
> Same NTP, same configuration, same hardware. only change is the kernel 
> version. Or am I missing some parameter here?
> 
> Would it make sense to try to bisect it?

Well, I suspect you'll just bisect it to the fast-pit TSC calibration
causing a different correction freq to be needed for synchronization.
The odd part is that the userland NTPd isn't behaving as I'd expect if
the TSC calibration was really so bad that NTP couldn't handle it.

Bisection may be something worth trying just to verify or disprove that
theory, so if you have the time, it would be interesting to see. But if
the theory is true then we're back to the same spot.

I guess something to test my idea above (that the drift is bad enough
that NTPd isn't making slew adjustments via adjtimex offset) is to
remove NTPd from the init.d startup.

Then after rebooting (into 2.6.29), run the attached python script for
10 minutes or so to get an idea of the ppm drift. Then repeat with
2.6.26.

To run: 
./drift-test.py <ntp server>

It will give some wild ppm numbers, but after a few minutes it should
settle down to the "natural drift" of the system.

thanks
-john


[-- Attachment #2: drift-test.py --]
[-- Type: text/x-python, Size: 1523 bytes --]

#!/usr/bin/python

# Time Drift Script
#		Periodically checks and displays time drift
#		by john stultz (jstultz@us.ibm.com)

import commands
import sys
import string
import time

server_default = "yourserverhere"
sleep_time_default  = 60

server = ""
sleep_time = 0
set_time = 0

#parse args
for arg in sys.argv[1:]:
	if arg == "-s":
		set_time = 1
	elif server == "":
		server = arg
	elif sleep_time == 0:
		sleep_time = string.atoi(arg)

if server == "":
	server = server_default
if sleep_time == 0:
	sleep_time = sleep_time_default

#set time
if (set_time == 1):
	cmd = commands.getoutput('/usr/sbin/ntpdate -ub ' + server)

cmd = commands.getoutput('/usr/sbin/ntpdate -uq ' + server)
line = string.split(cmd)

#parse original offset
start_offset = string.atof(line[-2]);
#parse original time
start_time = time.localtime(time.time())
datestr = time.strftime("%d %b %Y %H:%M:%S", start_time)

time.sleep(1)
while 1:
	cmd = commands.getoutput('/usr/sbin/ntpdate -uq ' + server)
	line = string.split(cmd)

	#parse offset
	now_offset = string.atof(line[-2]);

	#parse time
	now_time = time.localtime(time.time())
	datestr = time.strftime("%d %b %Y %H:%M:%S", now_time)

	# calculate drift
	delta_time = time.mktime(now_time) - time.mktime(start_time)
	delta_offset = now_offset - start_offset
	drift =  delta_offset / delta_time * 1000000

	#print output
	print time.strftime("%d %b %H:%M:%S",now_time), 
	print "	offset:", now_offset , 
	print "	drift:", drift ,"ppm"
	sys.stdout.flush()

	#sleep 
	time.sleep(sleep_time)

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-03 22:16                               ` john stultz
@ 2009-03-04  5:36                                 ` Jesper Krogh
  0 siblings, 0 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-03-04  5:36 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> On Tue, 2009-03-03 at 21:39 +0100, Jesper Krogh wrote:
>> john stultz wrote:
>>> Do you see the same behavior if you drop all but one server (including
>>> the local clock: 127.127.1.0)? 
>> Yes.
>> Mar  3 21:20:59 quad12 ntpd[2435]: ntpd 4.2.4p4@1.1520-o Tue Jan  6 
>> 15:50:55 UTC 2009 (1)
>> Mar  3 21:20:59 quad12 ntpd[2436]: precision = 1.000 usec
>> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #0 wildcard, 
>> 0.0.0.0#123 Disabled
>> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #1 wildcard, 
>> ::#123 Disabled
>> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #2 lo, ::1#123 
>> Enabled
>> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #3 bond0, 
>> fe80::21e:68ff:fe57:8169#123 Enabled
>> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #4 lo, 
>> 127.0.0.1#123 Enabled
>> Mar  3 21:20:59 quad12 ntpd[2436]: Listening on interface #5 bond0, 
>> 10.194.132.91#123 Enabled
>> Mar  3 21:20:59 quad12 ntpd[2436]: kernel time sync status 0040
>> Mar  3 21:20:59 quad12 ntpd[2436]: frequency initialized -29.286 PPM 
>> from /var/lib/ntp/ntp.drift
>> Mar  3 21:21:58 quad12 ntpd[2436]: synchronized to 10.194.133.12, stratum 4
>> Mar  3 21:21:58 quad12 ntpd[2436]: time reset -6.148275 s
>> Mar  3 21:21:58 quad12 ntpd[2436]: kernel time sync status change 0001
>> Mar  3 21:25:01 quad12 ntpd[2436]: synchronized to 10.194.133.12, stratum 4
>> Mar  3 21:37:03 quad12 ntpd[2436]: time reset -0.664351 s
>>
>> Only one server and the minpoll 4 maxpoll 4 options to the server line.
> 
> Well, it may still need a few hours to settle. :)  Again, those time
> resets are seen when NTPd doesn't have a good drift ppm at startup, and
> it has to find it.

With one server and the maxpoll minpoll stuff, this one "settled" after a
bit more than 3 hours:
Mar  4 01:14:05 quad12 ntpd[2436]: time reset -0.381826 s
Mar  4 01:15:39 quad12 ntpd[2436]: synchronized to 10.194.133.12, stratum 4
jk@quad12:~$ uptime
  06:35:40 up 15:55,  1 user,  load average: 0.00, 0.00, 0.00
jk@quad12:~$ ntpq -c peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*bioinf.nzcorp.n 10.192.96.19     4 u    8   16  377    0.098  -80.184   0.673
jk@quad12:~$ ntpdc -c kerinfo
***Command `kerinfo' unknown
jk@quad12:~$ ntpdc -c kerninfo
pll offset:           -0.06619 s
pll frequency:        -500.000 ppm
maximum error:        0.130081 s
estimated error:      0.001201 s
status:               0001  pll
pll time constant:    4
precision:            1e-06 s
frequency tolerance:  500 ppm
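
One way to read the output above is to compare the "pll frequency" field with "frequency tolerance". The sketch below does that on the pasted text; the parser is a hypothetical helper written for this example, not part of ntpdc.

```python
# Hypothetical helper: parse "name: value unit" lines from ntpdc -c kerninfo.
def parse_kerninfo(text):
    """Return a dict mapping field name to the first token of its value."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.split()[0]
    return fields


# Output copied from the message above.
KERNINFO = """\
pll offset:           -0.06619 s
pll frequency:        -500.000 ppm
maximum error:        0.130081 s
estimated error:      0.001201 s
status:               0001  pll
pll time constant:    4
precision:            1e-06 s
frequency tolerance:  500 ppm
"""

info = parse_kerninfo(KERNINFO)
freq = float(info["pll frequency"])
tolerance = float(info["frequency tolerance"])
saturated = abs(freq) >= tolerance
print("pll frequency %.3f ppm, tolerance %.0f ppm, saturated: %s"
      % (freq, tolerance, saturated))
```

Here the frequency correction sits exactly at the tolerance, so the loop has no headroom left to correct any further drift.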

-- 
Jesper

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-03 22:22                               ` john stultz
@ 2009-03-04 15:30                                 ` Jesper Krogh
  2009-03-04 18:36                                   ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-04 15:30 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> I guess something to test my idea above (that the drift is bad enough
> that NTPd isn't making slew adjustments via adjtimex offset) is to
> remove NTPd from the init.d startup.
> 
> Then after rebooting (into 2.6.29), run the attached python script for
> 10 minutes or so to get an idea of the ppm drift. Then repeat with
> 2.6.26.
> 
> To run: 
> ./drift-test.py <ntp server>
> 
> It will give some wild ppm numbers, but after a few minutes it should
> settle down to the "natural drift" of the system.

Ok. I removed ntpd from the system... here is the output from the
"non-working" 2.6.28.7 kernel.
04 Mar 14:59:16 	offset: -0.139829 	drift: -656.0 ppm
04 Mar 15:00:16 	offset: -0.175233 	drift: -591.147540984 ppm
04 Mar 15:01:16 	offset: -0.210637 	drift: -590.611570248 ppm
04 Mar 15:02:16 	offset: -0.246033 	drift: -590.386740331 ppm
04 Mar 15:03:17 	offset: -0.28144 	drift: -587.880165289 ppm
04 Mar 15:04:17 	offset: -0.31684 	drift: -588.301324503 ppm
04 Mar 15:05:17 	offset: -0.352247 	drift: -588.602209945 ppm
04 Mar 15:06:17 	offset: -0.387649 	drift: -588.805687204 ppm
04 Mar 15:07:17 	offset: -0.423046 	drift: -588.94813278 ppm
04 Mar 15:08:17 	offset: -0.458451 	drift: -589.073800738 ppm
04 Mar 15:09:18 	offset: -0.493856 	drift: -588.1973466 ppm
04 Mar 15:10:18 	offset: -0.529265 	drift: -588.374057315 ppm
04 Mar 15:11:18 	offset: -0.564661 	drift: -588.503457815 ppm
04 Mar 15:12:18 	offset: -0.600063 	drift: -588.620689655 ppm
04 Mar 15:13:18 	offset: -0.635458 	drift: -588.712930012 ppm
04 Mar 15:14:18 	offset: -0.040699 	drift: 109.052048726 ppm
04 Mar 15:15:18 	offset: -0.076098 	drift: 65.4984423676 ppm
04 Mar 15:16:18 	offset: -0.111495 	drift: 27.0557184751 ppm
04 Mar 15:17:18 	offset: -0.146885 	drift: -7.12096029548 ppm
04 Mar 15:18:19 	offset: -0.182285 	drift: -37.6853146853 ppm
04 Mar 15:19:19 	offset: -0.217688 	drift: -65.2117940199 ppm
04 Mar 15:20:19 	offset: -0.253085 	drift: -90.1202531646 ppm
04 Mar 15:21:19 	offset: -0.288479 	drift: -112.768882175 ppm
04 Mar 15:22:19 	offset: -0.323866 	drift: -133.448699422 ppm
04 Mar 15:23:19 	offset: -0.359259 	drift: -152.414127424 ppm
04 Mar 15:24:20 	offset: -0.394648 	drift: -169.750830565 ppm
04 Mar 15:25:20 	offset: -0.430047 	drift: -185.861980831 ppm
04 Mar 15:26:20 	offset: -0.46544 	drift: -200.779692308 ppm
04 Mar 15:27:20 	offset: -0.500835 	drift: -214.63620178 ppm
04 Mar 15:28:20 	offset: -0.536221 	drift: -227.534670487 ppm
04 Mar 15:29:20 	offset: -0.571605 	drift: -239.574515235 ppm
04 Mar 15:30:21 	offset: -0.606992 	drift: -250.706859593 ppm
04 Mar 15:31:21 	offset: -0.64241 	drift: -261.286085151 ppm
04 Mar 15:32:21 	offset: -0.677792 	drift: -271.20795569 ppm
04 Mar 15:33:21 	offset: -0.713187 	drift: -280.554252199 ppm
04 Mar 15:34:21 	offset: -0.040744 	drift: 46.7374169041 ppm
04 Mar 15:35:21 	offset: -0.076145 	drift: 29.0987996307 ppm
04 Mar 15:36:21 	offset: -0.111551 	drift: 12.4088050314 ppm
04 Mar 15:37:21 	offset: -0.146952 	drift: -3.40288713911 ppm

And from the working 2.6.27.19 kernel.

jk@quad12:~$ python drift-test.py 10.192.96.19
04 Mar 16:17:23         offset: -0.006929       drift: -62.0 ppm
04 Mar 16:18:24         offset: -0.010252       drift: -54.5967741935 ppm
04 Mar 16:19:24         offset: -0.013574       drift: -54.9754098361 ppm
04 Mar 16:20:24         offset: -0.016897       drift: -55.1098901099 ppm
04 Mar 16:21:24         offset: -0.020233       drift: -55.2314049587 ppm
04 Mar 16:22:24         offset: -0.023566       drift: -55.2947019868 ppm
04 Mar 16:23:24         offset: -0.026895       drift: -55.3259668508 ppm
04 Mar 16:24:24         offset: -0.030217       drift: -55.3317535545 ppm
04 Mar 16:25:24         offset: -0.033539       drift: -55.3360995851 ppm
04 Mar 16:26:24         offset: -0.036865       drift: -55.3468634686 ppm
04 Mar 16:27:25         offset: -0.038266       drift: -52.0713101161 ppm
04 Mar 16:28:25         offset: -0.039747       drift: -49.592760181 ppm
04 Mar 16:29:25         offset: -0.041331       drift: -47.6680497925 ppm


-- 
Jesper



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-04 15:30                                 ` Jesper Krogh
@ 2009-03-04 18:36                                   ` Jesper Krogh
  2009-03-04 18:57                                     ` John Stultz
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-04 18:36 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

Jesper Krogh wrote:
> john stultz wrote:
>> I guess something to test my idea above (that the drift is bad enough
>> that NTPd isn't making slew adjustments via adjtimex offset) is to
>> remove NTPd from the init.d startup.
>>
>> Then after rebooting (into 2.6.29), run the attached python script for
>> 10 minutes or so to get an idea of the ppm drift. Then repeat with
>> 2.6.26.
>>
>> To run: ./drift-test.py <ntp server>
>>
>> It will give some wild ppm numbers, but after a few minutes it should
>> settle down to the "natural drift" of the system.
> 
> Ok. I removed ntpd from the system... heres is from "non-working 

Updated. I think I had NTPd running in the former "non-working" test. I
just tried to reproduce the numbers, and they look like this
(reproducible on 2.6.29-rc6).

jk@quad12:~$ python drift-test.py 10.192.96.19
04 Mar 19:27:10         offset: -0.157696       drift: -693.0 ppm
04 Mar 19:28:10         offset: -0.195134       drift: -625.098360656 ppm
04 Mar 19:29:10         offset: -0.232579       drift: -624.595041322 ppm
04 Mar 19:30:10         offset: -0.270021       drift: -624.408839779 ppm
04 Mar 19:31:11         offset: -0.307461       drift: -621.727272727 ppm
04 Mar 19:32:11         offset: -0.344903       drift: -622.185430464 ppm
04 Mar 19:33:11         offset: -0.382345       drift: -622.491712707 ppm
04 Mar 19:34:11         offset: -0.419794       drift: -622.727488152 ppm
04 Mar 19:35:11         offset: -0.457239       drift: -622.89626556 ppm

Still the same.

> And from working 2.6.27.19 kernel.
> 
> jk@quad12:~$ python drift-test.py 10.192.96.19
> 04 Mar 16:17:23         offset: -0.006929       drift: -62.0 ppm
> 04 Mar 16:18:24         offset: -0.010252       drift: -54.5967741935 ppm
> 04 Mar 16:19:24         offset: -0.013574       drift: -54.9754098361 ppm
> 04 Mar 16:20:24         offset: -0.016897       drift: -55.1098901099 ppm
> 04 Mar 16:21:24         offset: -0.020233       drift: -55.2314049587 ppm
> 04 Mar 16:22:24         offset: -0.023566       drift: -55.2947019868 ppm
> 04 Mar 16:23:24         offset: -0.026895       drift: -55.3259668508 ppm
> 04 Mar 16:24:24         offset: -0.030217       drift: -55.3317535545 ppm
> 04 Mar 16:25:24         offset: -0.033539       drift: -55.3360995851 ppm
> 04 Mar 16:26:24         offset: -0.036865       drift: -55.3468634686 ppm
> 04 Mar 16:27:25         offset: -0.038266       drift: -52.0713101161 ppm
> 04 Mar 16:28:25         offset: -0.039747       drift: -49.592760181 ppm
> 04 Mar 16:29:25         offset: -0.041331       drift: -47.6680497925 ppm



-- 
Jesper

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-04 18:36                                   ` Jesper Krogh
@ 2009-03-04 18:57                                     ` John Stultz
  2009-03-05  2:39                                       ` john stultz
  0 siblings, 1 reply; 81+ messages in thread
From: John Stultz @ 2009-03-04 18:57 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Wed, 2009-03-04 at 19:36 +0100, Jesper Krogh wrote:
> Jesper Krogh wrote:
> > john stultz wrote:
> >> I guess something to test my idea above (that the drift is bad enough
> >> that NTPd isn't making slew adjustments via adjtimex offset) is to
> >> remove NTPd from the init.d startup.
> >>
> >> Then after rebooting (into 2.6.29), run the attached python script for
> >> 10 minutes or so to get an idea of the ppm drift. Then repeat with
> >> 2.6.26.
> >>
> >> To run: ./drift-test.py <ntp server>
> >>
> >> It will give some wild ppm numbers, but after a few minutes it should
> >> settle down to the "natural drift" of the system.
> > 
> > Ok. I removed ntpd from the system... heres is from "non-working 
> 
> Updated. I think I has NTPd running in the former "non-working" test. I 
> just tried to reproduce the numbers, and they look like this 
> (reproducible on 2.6.29-rc6).

Yea, the last numbers did look odd :)
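
A possible explanation, sketched under assumptions: drift-test.py reports drift relative to its very first sample, so a clock step in the middle of a run (the offset jumps from -0.635458 to -0.040699 at 15:14:18, presumably a step by the still-running ntpd) distorts every later figure. The slope between consecutive samples is not affected in the same way. The helper below is hypothetical and assumes a nominal 60 s between samples.

```python
# Hypothetical helper; assumes samples are a nominal 60 s apart.
def interval_drift_ppm(prev_offset_s, next_offset_s, interval_s=60.0):
    """Drift in ppm between two consecutive offset samples."""
    return (next_offset_s - prev_offset_s) / interval_s * 1e6


# Offsets copied from the first "non-working" run in this thread.
before_step = interval_drift_ppm(-0.139829, -0.175233)  # 14:59:16 -> 15:00:16
after_step = interval_drift_ppm(-0.040699, -0.076098)   # 15:14:18 -> 15:15:18

print("before step: %.1f ppm" % before_step)
print("after step:  %.1f ppm" % after_step)
```

Both intervals come out near -590 ppm, even though the script's cumulative figure right after the jump had swung to about +109 ppm.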

> jk@quad12:~$ python drift-test.py 10.192.96.19
> 04 Mar 19:27:10         offset: -0.157696       drift: -693.0 ppm
> 04 Mar 19:28:10         offset: -0.195134       drift: -625.098360656 ppm
> 04 Mar 19:29:10         offset: -0.232579       drift: -624.595041322 ppm
> 04 Mar 19:30:10         offset: -0.270021       drift: -624.408839779 ppm
> 04 Mar 19:31:11         offset: -0.307461       drift: -621.727272727 ppm
> 04 Mar 19:32:11         offset: -0.344903       drift: -622.185430464 ppm
> 04 Mar 19:33:11         offset: -0.382345       drift: -622.491712707 ppm
> 04 Mar 19:34:11         offset: -0.419794       drift: -622.727488152 ppm
> 04 Mar 19:35:11         offset: -0.457239       drift: -622.89626556 ppm


Yea, so from this and the settled ntpdc -c kerninfo data before, we can
see that the drift is further out than the 500ppm NTP can handle.
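
As a rough cross-check of that conclusion, the drift can be estimated from two endpoints of each run and compared with the 500 ppm cap. This is only a sketch: the helper names are made up, the timestamps are converted by hand-rolled code, and the "residual" ignores any temporary offset correction the kernel applies on top of the frequency adjustment.

```python
# Sketch only: endpoint estimate of drift from the drift-test.py output.
MAX_FREQ_PPM = 500.0  # frequency cap discussed in this thread


def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def endpoint_drift_ppm(first, last):
    """Drift in ppm between two (time, offset_s) samples."""
    dt = to_seconds(last[0]) - to_seconds(first[0])
    return (last[1] - first[1]) / dt * 1e6


# Samples copied from the thread: 2.6.29-rc6 run, then the 2.6.27.19 run
# (first sample and the 16:26:24 sample).
bad = endpoint_drift_ppm(("19:27:10", -0.157696), ("19:35:11", -0.457239))
good = endpoint_drift_ppm(("16:17:23", -0.006929), ("16:26:24", -0.036865))

for label, drift in (("2.6.29-rc6", bad), ("2.6.27.19", good)):
    residual = max(0.0, abs(drift) - MAX_FREQ_PPM)
    print("%s: %.1f ppm, uncorrectable residual %.1f ppm"
          % (label, drift, residual))
```

On these samples the 2.6.29-rc6 run lands beyond the cap while the 2.6.27.19 run stays comfortably inside it.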

So with that at least confirmed, we can focus back on to the fast-pit
tsc calibration code.

Ingo, Thomas: I'm missing a bit of the context to that patch. Other than
just speeding up boot times, was there any other rationale for moving away
from the ACPI PM timer based calibration?

Could we maybe add a quick test that the pit reads actually take the
assumed 2us max, perhaps via the HPET or ACPI PM timer?

thanks
-john



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-04 18:57                                     ` John Stultz
@ 2009-03-05  2:39                                       ` john stultz
  2009-03-05  2:52                                         ` john stultz
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-03-05  2:39 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Wed, 2009-03-04 at 10:57 -0800, John Stultz wrote:
> On Wed, 2009-03-04 at 19:36 +0100, Jesper Krogh wrote:
> > jk@quad12:~$ python drift-test.py 10.192.96.19
> > 04 Mar 19:27:10         offset: -0.157696       drift: -693.0 ppm
> > 04 Mar 19:28:10         offset: -0.195134       drift: -625.098360656 ppm
> > 04 Mar 19:29:10         offset: -0.232579       drift: -624.595041322 ppm
> > 04 Mar 19:30:10         offset: -0.270021       drift: -624.408839779 ppm
> > 04 Mar 19:31:11         offset: -0.307461       drift: -621.727272727 ppm
> > 04 Mar 19:32:11         offset: -0.344903       drift: -622.185430464 ppm
> > 04 Mar 19:33:11         offset: -0.382345       drift: -622.491712707 ppm
> > 04 Mar 19:34:11         offset: -0.419794       drift: -622.727488152 ppm
> > 04 Mar 19:35:11         offset: -0.457239       drift: -622.89626556 ppm
> 
> 
> Yea, so from this and the settled ntpdc -c kerninfo data before, we can
> see that the drift is further out then the 500ppm NTP can handle.
> 
> So with that at least confirmed, we can focus back on to the fast-pit
> tsc calibration code.
> 
> Ingo, Thomas: I'm missing a bit of the context to that patch, other then
> just speeding up boot times, was there other rational for moving away
> from the ACPI PM timer based calibration?
> 
> Could we maybe add a quick test that the pit reads actually take the
> assumed 2us max? Doing this maybe via the HPET/ACPI PM?

Hey Jesper,

	Here's a very-hackish patch to see if the approach I'm considering
might fix the issue you're hitting. Could you apply it, boot the kernel
a few times and send me the following segments of the dmesg for each of
those boots (the example below is from my test box)? 

tsc delta: 44418024
ref_freq: 3000100  pit_freq: 3000384
TSC: Fast PIT calibration matches PMTIMER.
TSC: PIT calibration matches PMTIMER. 1 loops
Detected 3000.045 MHz processor.

I'm trying to see how regular the mis-calculation is, as well as how
well the alternate calibration method handles this on your
hardware.
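
For reference, the relationship between the example numbers above can be sketched as follows. This is an illustration, not the patch itself: PIT_TICK_RATE is the nominal PIT input clock, the iteration count of 69 is an assumption chosen because it reproduces the pit_freq value shown, and both frequencies are assumed to be in kHz.

```python
# Illustrative arithmetic only; the iteration count is an assumption.
PIT_TICK_RATE = 1193182  # nominal PIT input clock in Hz
ITERATIONS = 69          # assumed number of 256-tick PIT MSB periods


def fast_pit_khz(tsc_delta, iterations=ITERATIONS):
    """TSC frequency in kHz from a cycle count spanning the PIT window."""
    return tsc_delta * PIT_TICK_RATE // (iterations * 256 * 1000)


def mismatch_ppm(measured_khz, reference_khz):
    """Relative difference between two frequency estimates, in ppm."""
    return (measured_khz - reference_khz) / reference_khz * 1e6


pit_khz = fast_pit_khz(44418024)  # "tsc delta" from the example dmesg
print("pit_freq:", pit_khz)
print("vs ref_freq: %.1f ppm" % mismatch_ppm(pit_khz, 3000100))
```

With these figures the two estimates differ by a bit under 100 ppm, which is inside the 500 ppm range discussed earlier in the thread.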

It's likely the fast-pit calibration can be better integrated with the
other calibration methods, so this probably isn't anything close to what
the actual fix will look like.

Ingo, Thomas: On the hardware I'm testing the fast-pit calibration only
triggers probably 80-90% of the time. About 10-20% of the time, the
initial check to pit_expect_msb(0xff) fails (count=0), so we may need to
look more at this approach.

thanks
-john





^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: Linux 2.6.29-rc6
  2009-03-05  2:39                                       ` john stultz
@ 2009-03-05  2:52                                         ` john stultz
  2009-03-05  8:43                                           ` Ingo Molnar
  2009-03-09 20:42                                           ` Jesper Krogh
  0 siblings, 2 replies; 81+ messages in thread
From: john stultz @ 2009-03-05  2:52 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

On Wed, 2009-03-04 at 18:39 -0800, john stultz wrote:
> On Wed, 2009-03-04 at 10:57 -0800, John Stultz wrote:
> > On Wed, 2009-03-04 at 19:36 +0100, Jesper Krogh wrote:
> > > jk@quad12:~$ python drift-test.py 10.192.96.19
> > > 04 Mar 19:27:10         offset: -0.157696       drift: -693.0 ppm
> > > 04 Mar 19:28:10         offset: -0.195134       drift: -625.098360656 ppm
> > > 04 Mar 19:29:10         offset: -0.232579       drift: -624.595041322 ppm
> > > 04 Mar 19:30:10         offset: -0.270021       drift: -624.408839779 ppm
> > > 04 Mar 19:31:11         offset: -0.307461       drift: -621.727272727 ppm
> > > 04 Mar 19:32:11         offset: -0.344903       drift: -622.185430464 ppm
> > > 04 Mar 19:33:11         offset: -0.382345       drift: -622.491712707 ppm
> > > 04 Mar 19:34:11         offset: -0.419794       drift: -622.727488152 ppm
> > > 04 Mar 19:35:11         offset: -0.457239       drift: -622.89626556 ppm
> > 
> > 
> > Yea, so from this and the settled ntpdc -c kerninfo data before, we can
> > see that the drift is further out than the 500ppm NTP can handle.
> > 
> > So with that at least confirmed, we can focus back on to the fast-pit
> > tsc calibration code.
> > 
> > Ingo, Thomas: I'm missing a bit of the context to that patch. Other than
> > just speeding up boot times, was there another rationale for moving away
> > from the ACPI PM timer based calibration?
> > 
> > Could we maybe add a quick test that the pit reads actually take the
> > assumed 2us max? Doing this maybe via the HPET/ACPI PM?
> 
> Hey Jesper,
> 
> 	Here's a very-hackish patch to see if the approach I'm considering
> might fix the issue you're hitting. Could you apply it, boot the kernel
> a few times and send me the following segments of the dmesg for each of
> those boots (the example below is from my test box)? 
> 
> tsc delta: 44418024
> ref_freq: 3000100  pit_freq: 3000384
> TSC: Fast PIT calibration matches PMTIMER.
> TSC: PIT calibration matches PMTIMER. 1 loops
> Detected 3000.045 MHz processor.
> 
> I'm trying to see how regular the mis-calculation is, as well as how
> well the alternate calibration method handles this on your
> hardware.
> 
> It's likely the fast-pit calibration can be better integrated with the
> other calibration methods, so this probably isn't anything close to what
> the actual fix will look like.
> 
> Ingo, Thomas: On the hardware I'm testing the fast-pit calibration only
> triggers probably 80-90% of the time. About 10-20% of the time, the
> initial check to pit_expect_msb(0xff) fails (count=0), so we may need to
> look more at this approach.

Err. Sorry, hit send before I included the patch.

-john



Not for inclusion.

Signed-off-by: John Stultz <johnstul@us.ibm.com>

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 599e581..2e16d30 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -317,15 +317,17 @@ static unsigned long quick_pit_calibrate(void)
 
 	if (pit_expect_msb(0xff)) {
 		int i;
-		u64 t1, t2, delta;
+		u64 t1, t2, delta, ref1, ref2;
+		u64 ref_freq = 0, pit_freq = 0;
+		int hpet = is_hpet_enabled();
 		unsigned char expect = 0xfe;
 
-		t1 = get_cycles();
+		t1 = tsc_read_refs(&ref1, hpet);
 		for (i = 0; i < QUICK_PIT_ITERATIONS; i++, expect--) {
 			if (!pit_expect_msb(expect))
 				goto failed;
 		}
-		t2 = get_cycles();
+		t2 = tsc_read_refs(&ref2, hpet);
 
 		/*
 		 * Make sure we can rely on the second TSC timestamp:
@@ -333,6 +335,13 @@ static unsigned long quick_pit_calibrate(void)
 		if (!pit_expect_msb(expect))
 			goto failed;
 
+
+		delta = (t2 - t1);
+		if (hpet)
+			ref_freq = calc_hpet_ref(delta*1000000LL, ref1, ref2);
+		else
+			ref_freq = calc_pmtimer_ref(delta*1000000LL, ref1, ref2);
+
 		/*
 		 * Ok, if we get here, then we've seen the
 		 * MSB of the PIT decrement QUICK_PIT_ITERATIONS
@@ -347,10 +356,32 @@ static unsigned long quick_pit_calibrate(void)
 		 * kHz = (t2 - t1) / (QPI * 256 / PIT_TICK_RATE) / 1000
 		 * kHz = ((t2 - t1) * PIT_TICK_RATE) / (QPI * 256 * 1000)
 		 */
-		delta = (t2 - t1)*PIT_TICK_RATE;
-		do_div(delta, QUICK_PIT_ITERATIONS*256*1000);
+		printk("tsc delta: %lld\n", t2-t1);
+
+		pit_freq = delta *  PIT_TICK_RATE;
+		do_div(pit_freq, QUICK_PIT_ITERATIONS*256*1000);
+
+		printk("ref_freq: %lld  pit_freq: %lld\n", ref_freq, pit_freq);
+
+		/* Check the reference deviation */
+		delta = ((u64) pit_freq) * 100;
+		do_div(delta, ref_freq);
+
+		/*
+		 * If both calibration results are inside a 10% window
+		 * then we can be sure, that the calibration
+		 * succeeded. We break out of the loop right away. We
+		 * use the reference value, as it is more precise.
+		 */
+		if (delta >= 90 && delta <= 110) {
+			printk(KERN_INFO
+			       "TSC: Fast PIT calibration matches %s.\n",
+			       hpet ? "HPET" : "PMTIMER");
+			return ref_freq;
+		}
+
 		printk("Fast TSC calibration using PIT\n");
-		return delta;
+		return pit_freq;
 	}
 failed:
 	return 0;
@@ -375,7 +406,7 @@ unsigned long native_calibrate_tsc(void)
 	local_irq_save(flags);
 	fast_calibrate = quick_pit_calibrate();
 	local_irq_restore(flags);
-	if (fast_calibrate)
+	if (0 && fast_calibrate)
 		return fast_calibrate;
 
 	/*

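As a rough illustration of the deviation check in the patch above, here
is a minimal user-space sketch in C. The function name
within_ten_percent() is made up for this example and is not part of the
kernel, and the zero guard on the reference value is an addition of the
sketch. Because the ratio is truncated to whole percent, a window like
this can only catch gross miscalibration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if pit_khz lies within 90..110 percent
 * of ref_khz. The ratio is truncated to whole percent, mirroring the
 * integer division in the debug patch.
 */
static int within_ten_percent(uint64_t pit_khz, uint64_t ref_khz)
{
	uint64_t pct;

	/* Guard added for the sketch: no reference value, no comparison. */
	if (ref_khz == 0)
		return 0;

	pct = pit_khz * 100 / ref_khz;
	return pct >= 90 && pct <= 110;
}
```

With the sample values quoted earlier (pit_freq 3000384, ref_freq
3000100) the ratio truncates to 100, so the check passes.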



* Re: Linux 2.6.29-rc6
  2009-03-05  2:52                                         ` john stultz
@ 2009-03-05  8:43                                           ` Ingo Molnar
  2009-03-06  3:13                                             ` john stultz
  2009-03-09 20:42                                           ` Jesper Krogh
  1 sibling, 1 reply; 81+ messages in thread
From: Ingo Molnar @ 2009-03-05  8:43 UTC (permalink / raw)
  To: john stultz
  Cc: Jesper Krogh, Thomas Gleixner, Linus Torvalds,
	Linux Kernel Mailing List, Len Brown


* john stultz <johnstul@us.ibm.com> wrote:

> > Ingo, Thomas: On the hardware I'm testing the fast-pit 
> > calibration only triggers probably 80-90% of the time. About 
> > 10-20% of the time, the initial check to 
> > pit_expect_msb(0xff) fails (count=0), so we may need to look 
> > more at this approach.

We definitely need to improve calibration quality.

The question is - why does fast-calibration fail 10-20% of the 
time on your test-system? Also, why exactly do we miscalibrate? 
Could you please have a look at that?

One theory would be that the PIT readout is unreliable. Windows 
does not make use of it, so it's not the most tested aspect of 
the PIT. Is that what happens on your box?

	Ingo


* Re: Linux 2.6.29-rc6
  2009-03-05  8:43                                           ` Ingo Molnar
@ 2009-03-06  3:13                                             ` john stultz
  2009-03-06  3:54                                               ` john stultz
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-03-06  3:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jesper Krogh, Thomas Gleixner, Linus Torvalds,
	Linux Kernel Mailing List, Len Brown

On Thu, 2009-03-05 at 09:43 +0100, Ingo Molnar wrote:
> * john stultz <johnstul@us.ibm.com> wrote:
> 
> > > Ingo, Thomas: On the hardware I'm testing the fast-pit 
> > > calibration only triggers probably 80-90% of the time. About 
> > > 10-20% of the time, the initial check to 
> > > pit_expect_msb(0xff) fails (count=0), so we may need to look 
> > > more at this approach.
> 
> We definitely need to improve calibration quality.
> 
> The question is - why does fast-calibration fail 10-20% of the 
> time on your test-system? Also, why exactly do we miscalibrate? 
> Could you please have a look at that?

Working on it. I just wanted to let you know I was seeing some different
odd behavior than Jesper.

> One theory would be that the PIT readout is unreliable. Windows 
> does not make use of it, so it's not the most tested aspect of 
> the PIT. Is that what happens on your box?

Still looking into it, but from my initial debugging it seems that by
reading the PIT very quickly after setting it, we may be getting junk
values. If I re-read the PIT, I see the expected 0xff value.

It's been somewhat of a heisenbug: if I add any printks or even just
an mb() after the outb, the problem seems to go away (or becomes rare
enough that I don't have the patience to reproduce it :)

So I don't know if a small delay is appropriate here (seems
counterproductive to the whole fast-pit calibration ;) or if we should
just try to catch these bad reads and try again before failing?

Thoughts?

thanks
-john




* Re: Linux 2.6.29-rc6
  2009-03-06  3:13                                             ` john stultz
@ 2009-03-06  3:54                                               ` john stultz
  2009-03-06 11:34                                                 ` Ingo Molnar
  0 siblings, 1 reply; 81+ messages in thread
From: john stultz @ 2009-03-06  3:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jesper Krogh, Thomas Gleixner, Linus Torvalds,
	Linux Kernel Mailing List, Len Brown

On Thu, 2009-03-05 at 19:13 -0800, john stultz wrote:
> On Thu, 2009-03-05 at 09:43 +0100, Ingo Molnar wrote:
> > * john stultz <johnstul@us.ibm.com> wrote:
> > 
> > > > Ingo, Thomas: On the hardware I'm testing the fast-pit 
> > > > calibration only triggers probably 80-90% of the time. About 
> > > > 10-20% of the time, the initial check to 
> > > > pit_expect_msb(0xff) fails (count=0), so we may need to look 
> > > > more at this approach.
> > 
> > We definitely need to improve calibration quality.
> > 
> > The question is - why does fast-calibration fail 10-20% of the 
> > time on your test-system? Also, why exactly do we miscalibrate? 
> > Could you please have a look at that?
> 
> Working on it, I just wanted to let you know I was seeing some different
> odd behavior than Jesper.
> 
> > One theory would be that the PIT readout is unreliable. Windows 
> > does not make use of it, so it's not the most tested aspect of 
> > the PIT. Is that what happens on your box?
> 
> Still looking into it, but from my initial debugging it seems that by
> reading the PIT very quickly after setting it, we may be getting junk
> values. If I re-read the PIT again, I see the expected 0xff value. 
> 
> It's been somewhat of a heisenbug, as if I add any printk's or even just
> a mb() after the outb it seems to make the problem go away (or just rare
> enough I don't have the patience to reproduce it :)
> 
> So I don't know if a small delay is appropriate here (seems counter
> productive to the whole fast-pit calibration ;) or if we should just try
> to catch these bad reads and try again before failing?

Maybe something like the following? (Not tested heavily yet!)

Again, just for clarity, as we've mixed a few issues here, this patch is
for a side issue and not related to the original regression reported by
Jesper. I'm still waiting on debug output from Jesper to further
diagnose what's going wrong with his TSC calibration.

thanks
-john


Apparently some hardware may occasionally return junk values if you try
to read the PIT immediately after setting it. This causes
pit_expect_msb() to occasionally fail (~10% of the time).

This patch tries to work around this issue by not failing if the first
read right after setting the PIT is not what we expect.

NOT FOR INCLUSION (yet!)

Signed-off-by: John Stultz <johnstul@us.ibm.com>

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 599e581..2ca5ba4 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -280,8 +280,17 @@ static inline int pit_expect_msb(unsigned char val)
 	for (count = 0; count < 50000; count++) {
 		/* Ignore LSB */
 		inb(0x42);
-		if (inb(0x42) != val)
+		if (inb(0x42) != val) {
+			/*
+			 * If we're too fast, we may read
+			 * junk values right after we set
+			 * the PIT. So if this is the first
+			 * read, try again
+			 */
+			if (val == 0xff && count == 0)
+				continue;
 			break;
+		}
 	}
 	return count > 50;
 }




* Re: Linux 2.6.29-rc6
  2009-03-06  3:54                                               ` john stultz
@ 2009-03-06 11:34                                                 ` Ingo Molnar
  0 siblings, 0 replies; 81+ messages in thread
From: Ingo Molnar @ 2009-03-06 11:34 UTC (permalink / raw)
  To: john stultz
  Cc: Jesper Krogh, Thomas Gleixner, Linus Torvalds,
	Linux Kernel Mailing List, Len Brown


* john stultz <johnstul@us.ibm.com> wrote:

> On Thu, 2009-03-05 at 19:13 -0800, john stultz wrote:
> > On Thu, 2009-03-05 at 09:43 +0100, Ingo Molnar wrote:
> > > * john stultz <johnstul@us.ibm.com> wrote:
> > > 
> > > > > Ingo, Thomas: On the hardware I'm testing the fast-pit 
> > > > > calibration only triggers probably 80-90% of the time. About 
> > > > > 10-20% of the time, the initial check to 
> > > > > pit_expect_msb(0xff) fails (count=0), so we may need to look 
> > > > > more at this approach.
> > > 
> > > We definitely need to improve calibration quality.
> > > 
> > > The question is - why does fast-calibration fail 10-20% of the 
> > > time on your test-system? Also, why exactly do we miscalibrate? 
> > > Could you please have a look at that?
> > 
> > Working on it, I just wanted to let you know I was seeing some different
> > odd behavior than Jesper.
> > 
> > > One theory would be that the PIT readout is unreliable. Windows 
> > > does not make use of it, so it's not the most tested aspect of 
> > > the PIT. Is that what happens on your box?
> > 
> > Still looking into it, but from my initial debugging it seems that by
> > reading the PIT very quickly after setting it, we may be getting junk
> > values. If I re-read the PIT again, I see the expected 0xff value. 
> > 
> > It's been somewhat of a heisenbug, as if I add any printk's or even just
> > a mb() after the outb it seems to make the problem go away (or just rare
> > enough I don't have the patience to reproduce it :)
> > 
> > So I don't know if a small delay is appropriate here (seems counter
> > productive to the whole fast-pit calibration ;) or if we should just try
> > to catch these bad reads and try again before failing?
> 
> Maybe something like the following? (Not tested heavily yet!)
> 
> Again, just for clarity, as we've mixed a few issues here, this patch is
> for a side issue and not related to the original regression reported by
> Jesper. I'm still waiting on debug output from Jesper to further
> diagnose what's going wrong with his TSC calibration.
> 
> thanks
> -john
> 
> 
> Apparently some hardware may occasionally return junk values if you try
> to read the pit immediately after setting it. This causes the
> pit_expect_msb() to occasionally fail (~10% of the time).
> 
> This patch tries to work around this issue by not failing if the first
> read right after setting the PIT is not what we expect.
> 
> NOT FOR INCLUSION (yet!)
> 
> Signed-off-by: John Stultz <johnstul@us.ibm.com>
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 599e581..2ca5ba4 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -280,8 +280,17 @@ static inline int pit_expect_msb(unsigned char val)
>  	for (count = 0; count < 50000; count++) {
>  		/* Ignore LSB */
>  		inb(0x42);
> -		if (inb(0x42) != val)
> +		if (inb(0x42) != val) {
> +			/*
> +			 * If we're too fast, we may read
> +			 * junk values right after we set
> +			 * the PIT. So if this is the first
> +			 * read, try again
> +			 */
> +			if (val == 0xff && count == 0)
> +				continue;
>  			break;

We could do something like that if it helps the end result. But 
this special thing inside the loop should just be an 
unconditional inb(0x42) outside the loop. It does not hurt 
performance there, and we'll get simpler code that way.
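The suggestion above could look roughly like the following user-space
sketch. struct fake_pit and fake_inb() are hypothetical stand-ins for
the real port read inb(0x42), and the sketch assumes the PIT is read in
LSB/MSB pairs, so it throws away one full pair before the loop rather
than a single byte. It is an illustration only, not a tested kernel
change.

```c
#include <stddef.h>

/* Hypothetical stand-in for the PIT data port: replays scripted bytes. */
struct fake_pit {
	const unsigned char *bytes;
	size_t len;
	size_t pos;
};

/* Stand-in for inb(0x42); repeats the last byte once the script ends. */
static unsigned char fake_inb(struct fake_pit *pit)
{
	size_t i = pit->pos < pit->len ? pit->pos : pit->len - 1;

	pit->pos++;
	return pit->bytes[i];
}

/* Same shape as pit_expect_msb(), with the dummy read moved out of the loop. */
static int pit_expect_msb_sim(struct fake_pit *pit, unsigned char val)
{
	int count;

	/* Unconditional throwaway read (one LSB/MSB pair) before polling. */
	fake_inb(pit);
	fake_inb(pit);

	for (count = 0; count < 50000; count++) {
		/* Ignore LSB */
		fake_inb(pit);
		if (fake_inb(pit) != val)
			break;
	}
	return count > 50;
}
```

The loop body stays identical to the unpatched version, which is the
point of moving the special case out of it.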

But ... I really don't like how we rely on PIT readouts and how
we work around PIT readout artifacts. Only Linux does PIT
readouts while Windows does not - so we rely on an under-tested
aspect of PC hardware.

I think we should think about a fundamentally different, IRQ
driven way of calibration. For example we could program a
27-millisecond periodic PIT interrupt with the maximum count and
measure its arrival timestamp in two subsequent interrupts.

We could do that with about 1-2 usecs precision realistically 
(this early during bootup we are really quiescent) - and over a 
27,000 usecs period that gives us an accuracy of 1:13500, or 
about 75 ppm. That's still only about 50 milliseconds spent 
calibrating, so very fast.
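The precision estimate above is simple arithmetic. A small sketch (the
helper name timing_error_ppm() is invented here) makes it easy to try
other timestamp uncertainties or interval lengths:

```c
#include <stdint.h>

/*
 * Hypothetical helper: worst-case relative error, in parts per million
 * (truncated), when an interval of interval_us microseconds is timed
 * with an uncertainty of uncertainty_us microseconds.
 */
static uint64_t timing_error_ppm(uint64_t uncertainty_us, uint64_t interval_us)
{
	return uncertainty_us * 1000000 / interval_us;
}
```

For a 2 usec uncertainty over 27,000 usecs this gives 74 ppm, in line
with the 1:13500 figure; 1 usec would halve that.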

We can re-write the IRQ#0 vector with a special temporary 
calibration interrupt handler to make this really single-purpose 
and precise.

Hm?

	Ingo


* Re: Linux 2.6.29-rc6
  2009-03-05  2:52                                         ` john stultz
  2009-03-05  8:43                                           ` Ingo Molnar
@ 2009-03-09 20:42                                           ` Jesper Krogh
  2009-03-10  4:26                                             ` Linus Torvalds
  2009-03-15  1:19                                             ` Linus Torvalds
  1 sibling, 2 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-03-09 20:42 UTC (permalink / raw)
  To: john stultz
  Cc: Thomas Gleixner, Linus Torvalds, Linux Kernel Mailing List, Len Brown

john stultz wrote:
> On Wed, 2009-03-04 at 18:39 -0800, john stultz wrote:
>> On Wed, 2009-03-04 at 10:57 -0800, John Stultz wrote:
>>> On Wed, 2009-03-04 at 19:36 +0100, Jesper Krogh wrote:
>>>> jk@quad12:~$ python drift-test.py 10.192.96.19
>>>> 04 Mar 19:27:10         offset: -0.157696       drift: -693.0 ppm
>>>> 04 Mar 19:28:10         offset: -0.195134       drift: -625.098360656 ppm
>>>> 04 Mar 19:29:10         offset: -0.232579       drift: -624.595041322 ppm
>>>> 04 Mar 19:30:10         offset: -0.270021       drift: -624.408839779 ppm
>>>> 04 Mar 19:31:11         offset: -0.307461       drift: -621.727272727 ppm
>>>> 04 Mar 19:32:11         offset: -0.344903       drift: -622.185430464 ppm
>>>> 04 Mar 19:33:11         offset: -0.382345       drift: -622.491712707 ppm
>>>> 04 Mar 19:34:11         offset: -0.419794       drift: -622.727488152 ppm
>>>> 04 Mar 19:35:11         offset: -0.457239       drift: -622.89626556 ppm
>>>
>>> Yea, so from this and the settled ntpdc -c kerninfo data before, we can
>>> see that the drift is further out than the 500ppm NTP can handle.
>>>
>>> So with that at least confirmed, we can focus back on to the fast-pit
>>> tsc calibration code.
>>>
>>> Ingo, Thomas: I'm missing a bit of the context to that patch. Other than
>>> just speeding up boot times, was there another rationale for moving away
>>> from the ACPI PM timer based calibration?
>>>
>>> Could we maybe add a quick test that the pit reads actually take the
>>> assumed 2us max? Doing this maybe via the HPET/ACPI PM?
>> Hey Jesper,
>>
>> 	Here's a very-hackish patch to see if the approach I'm considering
>> might fix the issue you're hitting. Could you apply it, boot the kernel
>> a few times and send me the following segments of the dmesg for each of
>> those boots (the example below is from my test box)? 
>>
>> tsc delta: 44418024
>> ref_freq: 3000100  pit_freq: 3000384
>> TSC: Fast PIT calibration matches PMTIMER.
>> TSC: PIT calibration matches PMTIMER. 1 loops
>> Detected 3000.045 MHz processor.

Hi John.

Patched into 2.6.28.7 ..

First boot.
[    0.000000] tsc delta: 34203220
[    0.000000] ref_freq: 2311825  pit_freq: 2310386
[    0.000000] TSC: Fast PIT calibration matches PMTIMER.
[    0.000000] TSC: PIT calibration matches PMTIMER. 2 loops
[    0.000000] Detected 2311.877 MHz processor.
Second boot:
[    0.000000] tsc delta: 34200313
[    0.000000] ref_freq: 2311803  pit_freq: 2310190
[    0.000000] TSC: Fast PIT calibration matches PMTIMER.
[    0.000000] TSC: PIT calibration matches PMTIMER. 2 loops
[    0.000000] Detected 2311.876 MHz processor.
Third boot:
[    0.000000] tsc delta: 34198686
[    0.000000] ref_freq: 2311824  pit_freq: 2310080
[    0.000000] TSC: Fast PIT calibration matches PMTIMER.
[    0.000000] TSC: PIT calibration matches PMTIMER. 1 loops
[    0.000000] Detected 2311.872 MHz processor.
Fourth boot:
[    0.000000] tsc delta: 34199433
[    0.000000] ref_freq: 2311831  pit_freq: 2310130
[    0.000000] TSC: Fast PIT calibration matches PMTIMER.
[    0.000000] TSC: PIT calibration matches PMTIMER. 2 loops
[    0.000000] Detected 2311.821 MHz processor.
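The pit_freq figures above can be reproduced from the tsc delta with the
formula from the comment in quick_pit_calibrate(),
kHz = ((t2 - t1) * PIT_TICK_RATE) / (QPI * 256 * 1000). The sketch below
is user-space C, not kernel code. QUICK_PIT_ITERATIONS = 69 is an
assumption, chosen because it is the value consistent with the numbers
posted in this thread.

```c
#include <stdint.h>

#define PIT_TICK_RATE        1193182ULL /* PIT input clock in Hz */
#define QUICK_PIT_ITERATIONS 69ULL      /* assumed; consistent with the logs */

/* kHz = (tsc_delta * PIT_TICK_RATE) / (QUICK_PIT_ITERATIONS * 256 * 1000) */
static uint64_t pit_freq_khz(uint64_t tsc_delta)
{
	return (tsc_delta * PIT_TICK_RATE) /
	       (QUICK_PIT_ITERATIONS * 256 * 1000);
}
```

Feeding in the first boot's tsc delta of 34203220 yields 2310386, the
pit_freq printed for that boot.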



-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-09 20:42                                           ` Jesper Krogh
@ 2009-03-10  4:26                                             ` Linus Torvalds
  2009-03-10 11:29                                               ` Thomas Gleixner
  2009-03-15  1:19                                             ` Linus Torvalds
  1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-03-10  4:26 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown



On Mon, 9 Mar 2009, Jesper Krogh wrote:
> 
> First boot.
> [    0.000000] ref_freq: 2311825  pit_freq: 2310386
> Second boot:
> [    0.000000] ref_freq: 2311803  pit_freq: 2310190
> Third boot:
> [    0.000000] ref_freq: 2311824  pit_freq: 2310080
> Fourth boot:
> [    0.000000] ref_freq: 2311831  pit_freq: 2310130

It's really quite impressively stable, but the fast-PIT calibration 
frequency is reliably about 3/4 of a promille low. Or, put another way, 
the TSC difference over the pit calibration is just a _tad_ too small 
compared to the value we'd expect if that loop of pit_expect_msb() would 
really run at the expected delay of a 1.193182MHz clock divided by 256.

And it's stable in that it really always seems to be off by a very similar 
amount. It's not moving around very much.
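To put a number on the shortfall, the gap between the two frequencies
can be expressed in parts per million. This is a minimal sketch with an
invented helper name, shortfall_ppm(), using the ref_freq and pit_freq
values quoted above:

```c
#include <stdint.h>

/*
 * Hypothetical helper: how far pit_khz falls below ref_khz, in parts
 * per million, truncated toward zero.
 */
static int64_t shortfall_ppm(int64_t ref_khz, int64_t pit_khz)
{
	return (ref_khz - pit_khz) * 1000000 / ref_khz;
}
```

The four boots come out between roughly 620 and 755 ppm, which is in
the same range as the drift reported earlier in the thread and beyond
the 500 ppm that NTP can correct.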

I also wonder why it seems to happen mainly just to _you_. There's 
absolutely nothing odd in your system, neither a slow CPU nor anything 
else that would stand out.

Grr. Very annoyingly non-obvious.

		Linus


* Re: Linux 2.6.29-rc6
  2009-03-10  4:26                                             ` Linus Torvalds
@ 2009-03-10 11:29                                               ` Thomas Gleixner
  2009-03-10 19:42                                                 ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: Thomas Gleixner @ 2009-03-10 11:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jesper Krogh, john stultz, Linux Kernel Mailing List, Len Brown

On Mon, 9 Mar 2009, Linus Torvalds wrote:
> On Mon, 9 Mar 2009, Jesper Krogh wrote:
> > 
> > First boot.
> > [    0.000000] ref_freq: 2311825  pit_freq: 2310386
> > Second boot:
> > [    0.000000] ref_freq: 2311803  pit_freq: 2310190
> > Third boot:
> > [    0.000000] ref_freq: 2311824  pit_freq: 2310080
> > Fourth boot:
> > [    0.000000] ref_freq: 2311831  pit_freq: 2310130
> 
> It's really quite impressively stable, but the fast-PIT calibration 
> frequency is reliably about 3/4 of a promille low. Or, put another way, 
> the TSC difference over the pit calibration is just a _tad_ too small 
> compared to the value we'd expect if that loop of pit_expect_msb() would 
> really run at the expected delay of a 1.193182MHz clock divided by 256.
> 
> And it's stable in that it really always seems to be off by a very similar 
> amount. It's not moving around very much.
> 
> I also wonder why it seems to happen mainly just to _you_. There's 
> absolutely nothing odd in your system, neither a slow CPU or anything 
> else that would stand out.
> 
> Grr. Very annoyingly non-obvious.

Indeed. One hint is in the slow calibration path. 3 of 4 boots have:

> > [    0.000000] TSC: PIT calibration matches PMTIMER. 2 loops

So the slow calibration path detects some disturbance.

Jesper, can you please apply the following patch instead of John's and
provide the output for a couple of boots? The output is:

Fast TSC calibration using PIT
tsc 43425305 tscmin 624008 tscmax 632610

Thanks,

	tglx

--- linux-2.6.orig/arch/x86/kernel/tsc.c
+++ linux-2.6/arch/x86/kernel/tsc.c
@@ -317,15 +317,22 @@ static unsigned long quick_pit_calibrate
 
 	if (pit_expect_msb(0xff)) {
 		int i;
-		u64 t1, t2, delta;
+		u64 t1, t2, t3, delta;
 		unsigned char expect = 0xfe;
+		unsigned long tscmin = ULONG_MAX, tscmax = 0;
 
-		t1 = get_cycles();
+		t1 = t2 = get_cycles();
 		for (i = 0; i < QUICK_PIT_ITERATIONS; i++, expect--) {
 			if (!pit_expect_msb(expect))
 				goto failed;
+			t3 = get_cycles();
+			delta = t3 - t2;
+			t2 = t3;
+			if ((unsigned long) delta < tscmin)
+				tscmin = (unsigned int) delta;
+			if ((unsigned long) delta > tscmax)
+				tscmax = (unsigned int) delta;
 		}
-		t2 = get_cycles();
 
 		/*
 		 * Make sure we can rely on the second TSC timestamp:
@@ -350,6 +357,8 @@ static unsigned long quick_pit_calibrate
 		delta = (t2 - t1)*PIT_TICK_RATE;
 		do_div(delta, QUICK_PIT_ITERATIONS*256*1000);
 		printk("Fast TSC calibration using PIT\n");
+		printk("tsc %ld tscmin %ld tscmax %ld\n",
+		       (unsigned long) (t2 - t1), tscmin, tscmax);
 		return delta;
 	}
 failed:


* Re: Linux 2.6.29-rc6
  2009-03-10 11:29                                               ` Thomas Gleixner
@ 2009-03-10 19:42                                                 ` Jesper Krogh
  2009-03-10 22:22                                                   ` Thomas Gleixner
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-10 19:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, john stultz, Linux Kernel Mailing List, Len Brown

Thomas Gleixner wrote:
> On Mon, 9 Mar 2009, Linus Torvalds wrote:
>> On Mon, 9 Mar 2009, Jesper Krogh wrote:
>>> First boot.
>>> [    0.000000] ref_freq: 2311825  pit_freq: 2310386
>>> Second boot:
>>> [    0.000000] ref_freq: 2311803  pit_freq: 2310190
>>> Third boot:
>>> [    0.000000] ref_freq: 2311824  pit_freq: 2310080
>>> Fourth boot:
>>> [    0.000000] ref_freq: 2311831  pit_freq: 2310130
>> It's really quite impressively stable, but the fast-PIT calibration 
>> frequency is reliably about 3/4 of a promille low. Or, put another way, 
>> the TSC difference over the pit calibration is just a _tad_ too small 
>> compared to the value we'd expect if that loop of pit_expect_msb() would 
>> really run at the expected delay of a 1.193182MHz clock divided by 256.
>>
>> And it's stable in that it really always seems to be off by a very similar 
>> amount. It's not moving around very much.
>>
>> I also wonder why it seems to happen mainly just to _you_. There's 
>> absolutely nothing odd in your system, neither a slow CPU or anything 
>> else that would stand out.
>>
>> Grr. Very annoyingly non-obvious.
> 
> Indeed. One hint is in the slow calibration path. 3 of 4 boots have:
> 
>>> [    0.000000] TSC: PIT calibration matches PMTIMER. 2 loops
> 
> So the slow calibration path detects some disturbance.
> 
> Jesper, can you please apply the following patch instead of John's and
> provide the output for a couple of boots? The output is:
> 
> Fast TSC calibration using PIT
> tsc 43425305 tscmin 624008 tscmax 632610

First boot:
[    0.000000] Fast TSC calibration using PIT
[    0.000000] tsc 34202223 tscmin 474069 tscmax 500664
Second boot:
Here I didn't get the above messages: http://krogh.cc/~jesper/dmesg-boot2.txt
Third boot:
[    0.000000] Fast TSC calibration using PIT
[    0.000000] tsc 34199856 tscmin 470321 tscmax 502182
Fourth boot:
[    0.000000] Fast TSC calibration using PIT
[    0.000000] tsc 34202008 tscmin 475510 tscmax 501501

The second one is really strange, isn't it?

While booting up I saw this one on the serial console..
root@quad12:~# hwclock --systohc
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access 
method.
root@quad12:~# hwclock --systohc --debug
hwclock from util-linux-ng 2.13.1
hwclock: Open of /dev/rtc failed, errno=2: No such file or directory.
No usable clock interface found.
Cannot access the Hardware Clock via any known method.

Jesper
-- 
Jesper



* Re: Linux 2.6.29-rc6
  2009-03-10 19:42                                                 ` Jesper Krogh
@ 2009-03-10 22:22                                                   ` Thomas Gleixner
  2009-03-15 19:53                                                     ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: Thomas Gleixner @ 2009-03-10 22:22 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Linus Torvalds, john stultz, Linux Kernel Mailing List,
	Len Brown, Ingo Molnar

Jesper,

On Tue, 10 Mar 2009, Jesper Krogh wrote:
> First boot:
> [    0.000000] Fast TSC calibration using PIT
> [    0.000000] tsc 34202223 tscmin 474069 tscmax 500664
> Second boot:
> Here I didn't get the above messages: http://krogh.cc/~jesper/dmesg-boot2.txt
> Third boot:
> [    0.000000] Fast TSC calibration using PIT
> [    0.000000] tsc 34199856 tscmin 470321 tscmax 502182
> Fourth boot:
> [    0.000000] Fast TSC calibration using PIT
> [    0.000000] tsc 34202008 tscmin 475510 tscmax 501501
> 
> The second one is really strange, isn't it?

No, there the fast PIT calibration simply failed and it dropped into
the slow path:
[    0.000000] TSC: PIT calibration matches PMTIMER. 1 loops
[    0.000000] Detected 2311.878 MHz processor.

But the variance of the third run is interesting:

    avg = tsc / loops = 495650
    avg - tscmin      =  25329 (~ 10.9 us)
    tscmax - avg      =   6532 (~  2.8 us)

While this is in the range which the PIT calibration code accepts, the
resulting CPU frequency of this run is 2310.159 MHz, which is way off
the result of the slow path in the 2nd run. The 1st and the 4th run
have significantly high variance as well.

I ran the same patch on a couple of test machines and all have
deviations from avg in the range of +/- 2 us and the calibration
result is stable and correct.

I have no idea what might cause the problem with your machine. PIT via
SMM emulation comes to mind :)

But we can use the tscmin/max method to figure out whether the fast
PIT result is reliable. See patch below. It should drop out into the
slow calibration path on every boot on your machine.

    (tscmax - tscmin) / avg = 0.064 (result from third run)

On my test machines I get values below 0.02

While it's statistically not really correct we still can use that info
to catch cases like we see on your machines.
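A minimal user-space sketch of the tscmin/tscmax check described above,
using plain integer arithmetic. The helper names are made up for
illustration, and QUICK_PIT_ITERATIONS = 69 is an assumption that
matches avg = tsc / loops = 495650 for the third run.

```c
#include <stdint.h>

#define QUICK_PIT_ITERATIONS 69ULL /* assumed; matches avg = tsc / loops */

/*
 * Spread between the slowest and fastest loop step, expressed in
 * 1/100 of a percent of the average step (truncated).
 */
static uint64_t spread_bp(uint64_t tsc_total, uint64_t tscmin, uint64_t tscmax)
{
	uint64_t avg = tsc_total / QUICK_PIT_ITERATIONS;

	return (tscmax - tscmin) * 10000 / avg;
}

/* Returns 1 if the fast calibration would be kept under a 2.5% limit. */
static int fast_calibration_ok(uint64_t tsc_total, uint64_t tscmin,
			       uint64_t tscmax)
{
	return spread_bp(tsc_total, tscmin, tscmax) <= 250;
}
```

The third run (tsc 34199856, tscmin 470321, tscmax 502182) gives a
spread of 642, i.e. about 6.4%, so it would be rejected. The sample
output from the debug patch (tsc 43425305, tscmin 624008, tscmax
632610) gives 136 and would be accepted.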

> While booting up I saw this one on the serial console..
> root@quad12:~# hwclock --systohc
> Cannot access the Hardware Clock via any known method.
> Use the --debug option to see the details of our search for an access method.
> root@quad12:~# hwclock --systohc --debug
> hwclock from util-linux-ng 2.13.1
> hwclock: Open of /dev/rtc failed, errno=2: No such file or directory.
> No usable clock interface found.
> Cannot access the Hardware Clock via any known method.

Can you provide your .config file please ?

Thanks,

	tglx

--------->

Subject: x86: make TSC fast calibration more robust
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 10 Mar 2009 11:12:03 +0100

Check the min/max duration of each PIT loop against the resulting
average value and dismiss the fast calibration if the spread between
min and max is larger than 2.5% of the average. 2.5% is in the range of
+/- 2us, which is a reasonable range when we assume that a PIT read can
easily take 1 us.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/tsc.c |   30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/kernel/tsc.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/tsc.c
+++ linux-2.6/arch/x86/kernel/tsc.c
@@ -317,15 +317,22 @@ static unsigned long quick_pit_calibrate
 
 	if (pit_expect_msb(0xff)) {
 		int i;
-		u64 t1, t2, delta;
+		u64 t1, t2, t3, delta;
 		unsigned char expect = 0xfe;
+		unsigned long tscmin = ULONG_MAX, tscmax = 0;
 
-		t1 = get_cycles();
+		t1 = t2 = get_cycles();
 		for (i = 0; i < QUICK_PIT_ITERATIONS; i++, expect--) {
 			if (!pit_expect_msb(expect))
 				goto failed;
+			t3 = get_cycles();
+			delta = t3 - t2;
+			t2 = t3;
+			if ((unsigned long) delta < tscmin)
+				tscmin = (unsigned long) delta;
+			if ((unsigned long) delta > tscmax)
+				tscmax = (unsigned long) delta;
 		}
-		t2 = get_cycles();
 
 		/*
 		 * Make sure we can rely on the second TSC timestamp:
@@ -334,6 +341,23 @@ static unsigned long quick_pit_calibrate
 			goto failed;
 
 		/*
+		 * Sanity check the min max values:
+		 *
+		 * We calculate the average tsc increment per loop
+		 * step. Now we take the tscmin and tscmax value and
+		 * check whether the deviation is inside an acceptable
+		 * range.
+		 */
+		delta = (t2 - t1);
+		do_div(delta, QUICK_PIT_ITERATIONS);
+		t3 = (unsigned long) delta;
+		delta = tscmax - tscmin;
+		delta *= 10000;
+		do_div(delta, t3);
+		/* Fail if the deviation is > 2.5 % */
+		if (delta > 250)
+			goto failed;
+		/*
 		 * Ok, if we get here, then we've seen the
 		 * MSB of the PIT decrement QUICK_PIT_ITERATIONS
 		 * times, and each MSB had many hits, so we never


* Re: Linux 2.6.29-rc6
  2009-03-09 20:42                                           ` Jesper Krogh
  2009-03-10  4:26                                             ` Linus Torvalds
@ 2009-03-15  1:19                                             ` Linus Torvalds
  2009-03-15 15:44                                               ` Jesper Krogh
  1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-03-15  1:19 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown



Jesper, here's a patch that actually tries to take the TSC error really 
into account, and which I suspect will result (on your machine) in failing 
the fast PIT calibration. 

It also has a few extra printk's for debugging, and to see just what the 
values are on your machine.

The idea behind the patch is to just keep track of how big the difference 
was in TSC values between two successive reads of the PIT timer. We only 
really care about the difference when the MSB turns around, and we only 
really care about the two end points. The maximum error in TSC estimation 
will simply be the sum of the differences at those points (d1 and d2).

We can then compare the maximum error with the actual TSC differences 
between those points, and see if the max error is within 500 ppm. That 
_should_ mean that it all works - assuming that the PIT itself is running 
at the correct frequency, of course!

Regardless of whether it succeeds or not, it will print out some debug 
messages, which will be interesting to see.

What's nice about this is that it really should make that whole "yes, it's 
really within 500ppm" assertion have some solid legs to stand on. Rather 
than depend on us being able to read the PIT a certain number of times, we 
can literally give an estimation of the max error.
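
The acceptance test described above can be sketched in plain userspace C. This is an illustration under assumptions, not the kernel code: the function name `tsc_error_within_limit` is invented here. The shift by 11 divides by 2048, which is a cheap stand-in for "2000x" and works out to a bound of roughly 488 ppm.

```c
#include <assert.h>
#include <stdint.h>

/*
 * d1, d2: widths (in TSC cycles) of the read windows around the first
 *         and the last MSB transition.
 * delta:  TSC cycles elapsed between those two transitions.
 *
 * Returns 1 if the worst-case error (d1 + d2) is no larger than
 * delta / 2048, 0 otherwise.
 */
int tsc_error_within_limit(uint64_t d1, uint64_t d2, uint64_t delta)
{
	assert(delta != 0);
	return d1 + d2 <= (delta >> 11);
}
```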

		Linus

---
 arch/x86/kernel/tsc.c |   41 +++++++++++++++++++++++++++++------------
 1 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 599e581..8e1db42 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -273,17 +273,26 @@ static unsigned long pit_calibrate_tsc(u32 latch, unsigned long ms, int loopmin)
  * use the TSC value at the transitions to calculate a pretty
  * good value for the TSC frequencty.
  */
-static inline int pit_expect_msb(unsigned char val)
+static unsigned long pit_expect_msb(unsigned char val, u64 *tscp, unsigned long *deltap)
 {
-	int count = 0;
+	int count;
+	u64 tsc = 0;
 
 	for (count = 0; count < 50000; count++) {
 		/* Ignore LSB */
 		inb(0x42);
 		if (inb(0x42) != val)
 			break;
+		tsc = get_cycles();
 	}
-	return count > 50;
+	*deltap = get_cycles() - tsc;
+	*tscp = tsc;
+
+	/*
+	 * We require _some_ success, but the quality control
+	 * will be based on the error terms on the TSC values.
+	 */
+	return count > 5;
 }
 
 /*
@@ -297,6 +306,10 @@ static inline int pit_expect_msb(unsigned char val)
 
 static unsigned long quick_pit_calibrate(void)
 {
+	u64 t1, t2;
+	unsigned long d1, d2;
+	unsigned char expect = 0xff;
+
 	/* Set the Gate high, disable speaker */
 	outb((inb(0x61) & ~0x02) | 0x01, 0x61);
 
@@ -315,22 +328,24 @@ static unsigned long quick_pit_calibrate(void)
 	outb(0xff, 0x42);
 	outb(0xff, 0x42);
 
-	if (pit_expect_msb(0xff)) {
+	if (pit_expect_msb(0xff, &t1, &d1)) {
 		int i;
-		u64 t1, t2, delta;
-		unsigned char expect = 0xfe;
+		u64 delta;
 
-		t1 = get_cycles();
+		expect--;
 		for (i = 0; i < QUICK_PIT_ITERATIONS; i++, expect--) {
-			if (!pit_expect_msb(expect))
+			if (!pit_expect_msb(expect, &t2, &d2))
 				goto failed;
 		}
-		t2 = get_cycles();
 
 		/*
-		 * Make sure we can rely on the second TSC timestamp:
+		 * We require the max error on the calibration to be
+		 * within 500 ppm, since that's the limit of ntpd
+		 * drift correction. So the TSC delta must be more
+		 * than 2000x the possible error term (d1+d2).
 		 */
-		if (!pit_expect_msb(expect))
+		delta = t2 - t1;
+		if (d1+d2 > delta >> 11)
 			goto failed;
 
 		/*
@@ -347,12 +362,14 @@ static unsigned long quick_pit_calibrate(void)
 		 * kHz = (t2 - t1) / (QPI * 256 / PIT_TICK_RATE) / 1000
 		 * kHz = ((t2 - t1) * PIT_TICK_RATE) / (QPI * 256 * 1000)
 		 */
-		delta = (t2 - t1)*PIT_TICK_RATE;
+		printk("Fast TSC delta=%lld, error=%lu+%lu=%lu\n", delta, d1, d2, d1+d2);
+		delta *= PIT_TICK_RATE;
 		do_div(delta, QUICK_PIT_ITERATIONS*256*1000);
 		printk("Fast TSC calibration using PIT\n");
 		return delta;
 	}
 failed:
+	printk("Fast TSC calibration failed at %u %llu(%lu) %llu(%lu)\n", expect, t1, d1, t2, d2);
 	return 0;
 }
 


* Re: Linux 2.6.29-rc6
  2009-03-15  1:19                                             ` Linus Torvalds
@ 2009-03-15 15:44                                               ` Jesper Krogh
  2009-03-15 18:09                                                 ` Linus Torvalds
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-15 15:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown

Linus Torvalds wrote:
> 
> Jesper, here's a patch that actually tries to take the TSC error really 
> into account, and which I suspect will result (on your machine) in failing 
> the fast PIT calibration. 
> 
> It also has a few extra printk's for debugging, and to see just what the 
> values are on your machine.
> 
> The idea behind the patch is to just keep track of how big the difference 
> was in TSC values between two successive reads of the PIT timer. We only 
> really care about the difference when the MSB turns around, and we only 
> really care about the two end points. The maximum error in TSC estimation 
> will simply be the sum of the differences at those points (d1 and d2).
> 
> We can then compare the maximum error with the actual TSC differences 
> between those points, and see if the max error is within 500 ppm. That 
> _should_ mean that it all works - assuming that the PIT itself is running 
> at the correct frequency, of course!
> 
> Regardless of whether it succeeds or not, it will print out some debug 
> messages, which will be interesting to see.


[    0.000000] Fast TSC delta=34227730, error=6223+6219=12442
[    0.000000] Fast TSC calibration using PIT
[    0.000000] Detected 2312.045 MHz processor.

Using "ntpq -c peers" .. the offset steadily grows as time goes.

Full dmesg: http://krogh.cc/~jesper/dmesg-linux-2.6.29-rc8-linus1.txt

jk@quad11:~$ ntpdc -c kerninfo
pll offset:           0.085167 s
pll frequency:        -18.722 ppm
maximum error:        0.137231 s
estimated error:      0.008823 s
status:               0001  pll
pll time constant:    6
precision:            1e-06 s
frequency tolerance:  500 ppm



-- 
Jesper




* Re: Linux 2.6.29-rc6
  2009-03-15 15:44                                               ` Jesper Krogh
@ 2009-03-15 18:09                                                 ` Linus Torvalds
  2009-03-15 18:38                                                   ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-03-15 18:09 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown



On Sun, 15 Mar 2009, Jesper Krogh wrote:
> Linus Torvalds wrote:
> > 
> > Regardless of whether it succeeds or not, it will print out some debug
> > messages, which will be interesting to see.
> 
> 
> [    0.000000] Fast TSC delta=34227730, error=6223+6219=12442
> [    0.000000] Fast TSC calibration using PIT
> [    0.000000] Detected 2312.045 MHz processor.

Ok. This claims that the error really is smaller than 500ppm (it's about 
360 ppm). Which is about what we're aiming for (in real life, the actual 
error is about half that - we're just adding up the error terms for 
maximum theoretical error).
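
For reference, the "about 360 ppm" figure follows directly from the debug line quoted above. A minimal sketch of the arithmetic (the helper name `error_ppm` is made up for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case error as parts per million of the measured TSC delta. */
uint64_t error_ppm(uint64_t err, uint64_t delta)
{
	assert(delta != 0);
	return err * 1000000 / delta;
}
```

Plugging in the reported values, `error_ppm(6223 + 6219, 34227730)` gives 363 (integer division), which is under the 500 ppm limit.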

> Using "ntpq -c peers" .. the offset steadily grows as time goes.
> 
> Full dmesg: http://krogh.cc/~jesper/dmesg-linux-2.6.29-rc8-linus1.txt
> 
> jk@quad11:~$ ntpdc -c kerninfo
> pll offset:           0.085167 s
> pll frequency:        -18.722 ppm
> maximum error:        0.137231 s
> estimated error:      0.008823 s
> status:               0001  pll
> pll time constant:    6
> precision:            1e-06 s
> frequency tolerance:  500 ppm

Hmm. But now it all seems to _work_, no? Or do you still get time resets? 
Now your "pll frequency" and "estimated error" are real values, not just 
"0s" like in your previous failure cases.

Of course, maybe that happens only after the time reset actually kicks in.

But one thing my patch did - apart from the error estimation - was to 
synchronize the TSC read with the actual PIT MSB wrap event. Maybe that 
mattered.

The other possibility (if the time reset actually happens) is that your 
PIT is simply not running at the expected frequency. That would be really 
quite odd, since that nominal 1193181.8181 Hz frequency is very standard, 
and has been around forever.

I do not know how to test that. We need a reference timer to sync to, and 
the PIT has traditionally been a _lot_ more reliable than the other timers 
in the system (the PM timer may be reliable on modern machines, but almost 
certainly not on anything a few years old).

			Linus


* Re: Linux 2.6.29-rc6
  2009-03-15 18:09                                                 ` Linus Torvalds
@ 2009-03-15 18:38                                                   ` Jesper Krogh
  2009-03-15 19:02                                                     ` Linus Torvalds
  2009-03-15 20:32                                                     ` Linus Torvalds
  0 siblings, 2 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-03-15 18:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown

Linus Torvalds wrote:
> 
> On Sun, 15 Mar 2009, Jesper Krogh wrote:
>> Linus Torvalds wrote:
>>> Regardless of whether it succeeds or not, it will print out some debug
>>> messages, which will be interesting to see.
>>
>> [    0.000000] Fast TSC delta=34227730, error=6223+6219=12442
>> [    0.000000] Fast TSC calibration using PIT
>> [    0.000000] Detected 2312.045 MHz processor.
> 
> Ok. This claims that the error really is smaller than 500ppm (it's about 
> 360 ppm). Which is about what we're aiming for (in real life, the actual 
> error is about half that - we're just adding up the error terms for 
> maximum theoretical error).
> 
>> Using "ntpq -c peers" .. the offset steadily grows as time goes.
>>
>> Full dmesg: http://krogh.cc/~jesper/dmesg-linux-2.6.29-rc8-linus1.txt
>>
>> jk@quad11:~$ ntpdc -c kerninfo
>> pll offset:           0.085167 s
>> pll frequency:        -18.722 ppm
>> maximum error:        0.137231 s
>> estimated error:      0.008823 s
>> status:               0001  pll
>> pll time constant:    6
>> precision:            1e-06 s
>> frequency tolerance:  500 ppm
> 
> Hmm. But now it all seems to _work_, no? Or do you still get time resets? 

My conclusion was that I would get a time reset after some time since 
the offset just increased as time went by (being reasonably small at the
beginning).

I had it up for around 30 minutes... Should I have tested longer?

I went on to trying Thomas Gleixner's patch (which seems to do exactly 
the same .. ), I'll write a reply to that message in a few minutes.

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-15 18:38                                                   ` Jesper Krogh
@ 2009-03-15 19:02                                                     ` Linus Torvalds
  2009-03-15 19:52                                                       ` Jesper Krogh
  2009-03-15 20:32                                                     ` Linus Torvalds
  1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-03-15 19:02 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown



On Sun, 15 Mar 2009, Jesper Krogh wrote:
> > > 
> > > [    0.000000] Fast TSC delta=34227730, error=6223+6219=12442
> > > [    0.000000] Fast TSC calibration using PIT
> > > [    0.000000] Detected 2312.045 MHz processor.
> 
> My conclusion was that I would get a time reset after some time since the
> offset just increased as time went by (being reasonably small at the
> beginning).
> 
> I had it up for around 30 minutes... Should I have tested longer?

It would be good to test longer. Your previous emails showed:

	2.6.26: time.c: Detected 2311.847 MHz processor.
	2.6.29: Detected 2310.029 MHz processor.

where that first one was a successful boot, and the second one was a 
failing one. So let's assume that 2311.847 is the "correct" frequency.

The difference between the correct one and your failing one is ~790 ppm, 
which is above the 500ppm ntpd threshold. And as we saw earlier, those 
differences were pretty consistent, ie in your list of four successive 
boots, the old code consistently gave a frequency error that was roughly 
.7 permille off (ie exactly that 700 ppm).

HOWEVER! With that patch you just tried, you got 

	Detected 2312.045 MHz processor.

and the difference between _that_ and the assumed-correct-one is actually 
just 85 ppm. Which should be perfectly fine.

[ With the "test against PM timer" patch, you had:

	[    0.000000] ref_freq: 2311825  pit_freq: 2310386
	[    0.000000] ref_freq: 2311803  pit_freq: 2310190
	[    0.000000] ref_freq: 2311824  pit_freq: 2310080
	[    0.000000] ref_freq: 2311831  pit_freq: 2310130

  on four boots, so averaging them gives 2311.82 MHz, and the 2312.045MHz 
  you got with the improved fast-PIT code is still _way_ below 500ppm from 
  that - it's ~95 ppm away.

  IOW, the new frequency really looks likely to work. ]

Quite frankly, we don't know how exact the PM-timer is either - we just 
know that the detection is "stable" (but so was the old PIT timer 
detection: it was stably at 700ppm lower from the PM timer). So there is 
nothing that says that 2311.82MHz is the "correct" frequency, but we 
obviously know from your ntpd saga that it is much closer to correct than 
the old 2310.029 was.
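
The ppm figures in this mail can be rechecked with a few lines of C. This is only a sketch of the arithmetic; the helper name `freq_diff_ppm` is invented for the example, and frequencies are passed in kHz as they appear in the boot logs.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute difference between two frequencies, in ppm of the reference. */
uint64_t freq_diff_ppm(uint64_t khz, uint64_t ref_khz)
{
	uint64_t diff = khz > ref_khz ? khz - ref_khz : ref_khz - khz;

	assert(ref_khz != 0);
	return diff * 1000000 / ref_khz;
}
```

With the values quoted above, `freq_diff_ppm(2310029, 2311847)` is 786 (the "~790 ppm" of the failing boot), `freq_diff_ppm(2312045, 2311847)` is 85, and against the averaged PM-timer reference of about 2311820 kHz the new value is 97 ppm away.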

End result of all this: I'd really like you to try the modified PIT 
frequency code for longer. Also, remember that getting one (or a couple) 
"time reset" messages from ntpd while it's trying to sync up is not a 
problem per se - it can validly take a while to synchronize. The problem 
is literally only if it doesn't synchronize over time at all.

			Linus


* Re: Linux 2.6.29-rc6
  2009-03-15 19:02                                                     ` Linus Torvalds
@ 2009-03-15 19:52                                                       ` Jesper Krogh
  2009-03-16 18:59                                                         ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-15 19:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown

Linus Torvalds wrote:
> End result of all this: I'd really like you to try the modified PIT 
> frequency code for longer. Also, remember that getting one (or a couple) 
> "time reset" messages from ntpd while it's trying to sync up is not a 
> problem per se - it can validly take a while to synchronize. The problem 
> is literally only if it doesn't synchronize over time at all.

Ok. I'll get it on and report back in 24 hours or so..


-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-10 22:22                                                   ` Thomas Gleixner
@ 2009-03-15 19:53                                                     ` Jesper Krogh
  2009-03-16 18:40                                                       ` Jesper Krogh
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-15 19:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, john stultz, Linux Kernel Mailing List,
	Len Brown, Ingo Molnar

Thomas Gleixner wrote:

> slow calibration path on every boot on your machine.
> 
>     (tscmax - tscmin) / avg = 0.064 (result from third run)
> 
> On my test machines I get values below 0.02
> 
> While it's statistically not really correct we still can use that info
> to catch cases like we see on your machines.
> 
>> While booting up I saw this one on the serial console..
>> root@quad12:~# hwclock --systohc
>> Cannot access the Hardware Clock via any known method.
>> Use the --debug option to see the details of our search for an access method.
>> root@quad12:~# hwclock --systohc --debug
>> hwclock from util-linux-ng 2.13.1
>> hwclock: Open of /dev/rtc failed, errno=2: No such file or directory.
>> No usable clock interface found.
>> Cannot access the Hardware Clock via any known method.
> 
> Can you provide your .config file please ?

http://krogh.cc/~jesper/config-2.6.29-rc8.txt

I tested the attached patch.. and after 1.5 hours it seems to work. I'll 
remain on this one at least a day to see how it works. I'll keep it on 
for now and report back in 24 hours or so.

It's still using tsc as clock-source.

Jesper

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-15 18:38                                                   ` Jesper Krogh
  2009-03-15 19:02                                                     ` Linus Torvalds
@ 2009-03-15 20:32                                                     ` Linus Torvalds
  1 sibling, 0 replies; 81+ messages in thread
From: Linus Torvalds @ 2009-03-15 20:32 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown



On Sun, 15 Mar 2009, Jesper Krogh wrote:
> 
> I went on to trying Thomas Gleixner's patch (which seems to do exactly the
> same .. ), I'll write a reply in to that message in a few minutes.

Side note: no, Thomas' patch doesn't do at all exactly the same. It does 
something similar, in that it looks at the time differences between calls 
to the whole "wait for the PIT MSB to change" function, but those 
differences _could_ in theory be very small, even if the error is very 
big.

That's especially true if the PIT read ends up serializing with the PIT, 
so that the "wait for MSB" essentially always takes exactly the same 
amount of cycles (giving a zero error estimation in Thomas' version), but 
the reads themselves can still be quite slow (giving a non-zero error term 
in the end result).

IOW, Thomas' patch is good at finding variability in the reads - which 
could be the result of SMM interaction, while my patch literally measures 
how long it takes to read the MSB change.

Now in practice I suspect the variability in the MSB reads _probably_ 
correlates reasonably well with how long a single PIT read will take (ie 
rather than finding variability due to SMM interaction, it will find 
variability due to the "quantization" effect of the reads taking a 
reasonably long time), so I suspect that in many cases Thomas' patch will 
error out for the same cases mine does.

But the two patches are rather fundamentally different.

		Linus


* Re: Linux 2.6.29-rc6
  2009-03-15 19:53                                                     ` Jesper Krogh
@ 2009-03-16 18:40                                                       ` Jesper Krogh
  0 siblings, 0 replies; 81+ messages in thread
From: Jesper Krogh @ 2009-03-16 18:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, john stultz, Linux Kernel Mailing List,
	Len Brown, Ingo Molnar

Jesper Krogh wrote:
> Thomas Gleixner wrote:
> 
>> slow calibration path on every boot on your machine.
>>
>>     (tscmax - tscmin) / avg = 0.064 (result from third run)
>>
>> On my test machines I get values below 0.02
>>
>> While it's statistically not really correct we still can use that info
>> to catch cases like we see on your machines.
>>
>>> While booting up I saw this one on the serial console..
>>> root@quad12:~# hwclock --systohc
>>> Cannot access the Hardware Clock via any known method.
>>> Use the --debug option to see the details of our search for an access 
>>> method.
>>> root@quad12:~# hwclock --systohc --debug
>>> hwclock from util-linux-ng 2.13.1
>>> hwclock: Open of /dev/rtc failed, errno=2: No such file or directory.
>>> No usable clock interface found.
>>> Cannot access the Hardware Clock via any known method.
>>
>> Can you provide your .config file please ?
> 
> http://krogh.cc/~jesper/config-2.6.29-rc8.txt
> 
> I tested the attached patch.. and after 1.5 hours it seems to work. I'll 
> remain on this one at least a day to see how it works. I'll keep it on 
> for now and report back in 24 hours or so.
> 
> It's still using tsc as clock-source.

No resets after 24 hours..  it works.

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-15 19:52                                                       ` Jesper Krogh
@ 2009-03-16 18:59                                                         ` Jesper Krogh
  2009-03-16 19:32                                                           ` Linus Torvalds
  0 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-16 18:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List, Len Brown

Jesper Krogh wrote:
> Linus Torvalds wrote:
>> End result of all this: I'd really like you to try the modified PIT 
>> frequency code for longer. Also, remember that getting one (or a 
>> couple) "time reset" messages from ntpd while it's trying to sync up 
>> is not a problem per se - it can validly take a while to synchronize. 
>> The problem is literally only if it doesn't synchronize over time at all.
> 
> Ok. I'll get it on and report back in 24 hours or so..

you were right. It works. No resets so far.

-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-16 18:59                                                         ` Jesper Krogh
@ 2009-03-16 19:32                                                           ` Linus Torvalds
  2009-03-17  1:43                                                             ` john stultz
                                                                               ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Linus Torvalds @ 2009-03-16 19:32 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List,
	Len Brown, Ingo Molnar



On Mon, 16 Mar 2009, Jesper Krogh wrote:
> 
> you were right. It works. No resets so far.

Goodie.

Here's a slightly cleaned-up patch that removes the debug messages, and 
also re-organizes the code a bit so that it actually uses the "better than 
500 ppm" as the way to decide when to stop calibrating.

Why?

I tested the 500 ppm check on some slower machines, and the old algorithm 
of just waiting for 15ms actually failed that 500 ppm test. It was _very_ 
close - 16ms was enough - but it convinced me that the logic was too damn 
fragile.
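
The "iterate until the error bound is good enough" idea can be sketched outside the kernel as follows. Everything here is an assumption made for illustration: `struct msb_sample` and `periods_until_converged` are invented names, and the samples stand in for what the real code would read from the PIT and the TSC at each MSB transition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one MSB transition. */
struct msb_sample {
	uint64_t tsc;    /* TSC value at the last read before the change */
	uint64_t width;  /* TSC cycles spanned by the read that saw it   */
};

/*
 * Walk the samples and return the number of MSB periods after which
 * the worst-case error (first width + current width) has dropped
 * below delta / 2048, i.e. under roughly 500 ppm.  Returns 0 if the
 * bound is never reached within n samples.
 */
int periods_until_converged(const struct msb_sample *s, int n)
{
	int i;

	assert(s != NULL);
	for (i = 1; i < n; i++) {
		uint64_t delta = s[i].tsc - s[0].tsc;

		if (s[0].width + s[i].width < (delta >> 11))
			return i;
	}
	return 0;
}
```

The point of the design is that the loop length adapts to the machine: a slower PIT read (larger width) simply needs more MSB periods before the bound is met, instead of failing a fixed-length measurement.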

I also think I know why John reported this:

> Ingo, Thomas: On the hardware I'm testing the fast-pit calibration only
> triggers probably 80-90% of the time. About 10-20% of the time, the
> initial check to pit_expect_msb(0xff) fails (count=0), so we may need to
> look more at this approach.

and the reason is that when we re-program the PIT, it will actually take 
until the next timer edge (the incoming 1.1MHz timer) for the new values 
to take effect. So before the first call to pit_expect_msb(), we should 
make sure to delay for at least one PIT cycle. The simplest way to do that 
is to simply read the PIT latch once, it will take about 2us.

So this patch fixes that too.

John, does that make the PIT calibration work reliably on your machine?

The patch looks bigger than it is: most of the noise is just 
re-indentation and some trivial re-organizing.

			Linus

---
 arch/x86/kernel/tsc.c |  110 +++++++++++++++++++++++++++++--------------------
 1 files changed, 65 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 599e581..d5cebb5 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -273,30 +273,43 @@ static unsigned long pit_calibrate_tsc(u32 latch, unsigned long ms, int loopmin)
  * use the TSC value at the transitions to calculate a pretty
  * good value for the TSC frequencty.
  */
-static inline int pit_expect_msb(unsigned char val)
+static inline int pit_expect_msb(unsigned char val, u64 *tscp, unsigned long *deltap)
 {
-	int count = 0;
+	int count;
+	u64 tsc = 0;
 
 	for (count = 0; count < 50000; count++) {
 		/* Ignore LSB */
 		inb(0x42);
 		if (inb(0x42) != val)
 			break;
+		tsc = get_cycles();
 	}
-	return count > 50;
+	*deltap = get_cycles() - tsc;
+	*tscp = tsc;
+
+	/*
+	 * We require _some_ success, but the quality control
+	 * will be based on the error terms on the TSC values.
+	 */
+	return count > 5;
 }
 
 /*
- * How many MSB values do we want to see? We aim for a
- * 15ms calibration, which assuming a 2us counter read
- * error should give us roughly 150 ppm precision for
- * the calibration.
+ * How many MSB values do we want to see? We aim for
+ * a maximum error rate of 500ppm (in practice the
+ * real error is much smaller), but refuse to spend
+ * more than 25ms on it.
  */
-#define QUICK_PIT_MS 15
-#define QUICK_PIT_ITERATIONS (QUICK_PIT_MS * PIT_TICK_RATE / 1000 / 256)
+#define MAX_QUICK_PIT_MS 25
+#define MAX_QUICK_PIT_ITERATIONS (MAX_QUICK_PIT_MS * PIT_TICK_RATE / 1000 / 256)
 
 static unsigned long quick_pit_calibrate(void)
 {
+	int i;
+	u64 tsc, delta;
+	unsigned long d1, d2;
+
 	/* Set the Gate high, disable speaker */
 	outb((inb(0x61) & ~0x02) | 0x01, 0x61);
 
@@ -315,45 +328,52 @@ static unsigned long quick_pit_calibrate(void)
 	outb(0xff, 0x42);
 	outb(0xff, 0x42);
 
-	if (pit_expect_msb(0xff)) {
-		int i;
-		u64 t1, t2, delta;
-		unsigned char expect = 0xfe;
-
-		t1 = get_cycles();
-		for (i = 0; i < QUICK_PIT_ITERATIONS; i++, expect--) {
-			if (!pit_expect_msb(expect))
-				goto failed;
+	/*
+	 * The PIT starts counting at the next edge, so we
+	 * need to delay for a microsecond. The easiest way
+	 * to do that is to just read back the 16-bit counter
+	 * once from the PIT.
+	 */
+	inb(0x42);
+	inb(0x42);
+
+	if (pit_expect_msb(0xff, &tsc, &d1)) {
+		for (i = 1; i <= MAX_QUICK_PIT_ITERATIONS; i++) {
+			if (!pit_expect_msb(0xff-i, &delta, &d2))
+				break;
+
+			/*
+			 * Iterate until the error is less than 500 ppm
+			 */
+			delta -= tsc;
+			if (d1+d2 < delta >> 11)
+				goto success;
 		}
-		t2 = get_cycles();
-
-		/*
-		 * Make sure we can rely on the second TSC timestamp:
-		 */
-		if (!pit_expect_msb(expect))
-			goto failed;
-
-		/*
-		 * Ok, if we get here, then we've seen the
-		 * MSB of the PIT decrement QUICK_PIT_ITERATIONS
-		 * times, and each MSB had many hits, so we never
-		 * had any sudden jumps.
-		 *
-		 * As a result, we can depend on there not being
-		 * any odd delays anywhere, and the TSC reads are
-		 * reliable.
-		 *
-		 * kHz = ticks / time-in-seconds / 1000;
-		 * kHz = (t2 - t1) / (QPI * 256 / PIT_TICK_RATE) / 1000
-		 * kHz = ((t2 - t1) * PIT_TICK_RATE) / (QPI * 256 * 1000)
-		 */
-		delta = (t2 - t1)*PIT_TICK_RATE;
-		do_div(delta, QUICK_PIT_ITERATIONS*256*1000);
-		printk("Fast TSC calibration using PIT\n");
-		return delta;
 	}
-failed:
+	printk("Fast TSC calibration failed\n");
 	return 0;
+
+success:
+	/*
+	 * Ok, if we get here, then we've seen the
+	 * MSB of the PIT decrement 'i' times, and the
+	 * error has shrunk to less than 500 ppm.
+	 *
+	 * As a result, we can depend on there not being
+	 * any odd delays anywhere, and the TSC reads are
+	 * reliable (within the error). We also adjust the
+	 * delta to the middle of the error bars, just
+	 * because it looks nicer.
+	 *
+	 * kHz = ticks / time-in-seconds / 1000;
+	 * kHz = (t2 - t1) / (I * 256 / PIT_TICK_RATE) / 1000
+	 * kHz = ((t2 - t1) * PIT_TICK_RATE) / (I * 256 * 1000)
+	 */
+	delta += (long)(d2 - d1)/2;
+	delta *= PIT_TICK_RATE;
+	do_div(delta, i*256*1000);
+	printk("Fast TSC calibration using PIT\n");
+	return delta;
 }
 
 /**


* Re: Linux 2.6.29-rc6
  2009-03-16 19:32                                                           ` Linus Torvalds
@ 2009-03-17  1:43                                                             ` john stultz
  2009-03-17  8:14                                                             ` Ingo Molnar
  2009-03-21  9:11                                                             ` Jesper Krogh
  2 siblings, 0 replies; 81+ messages in thread
From: john stultz @ 2009-03-17  1:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jesper Krogh, Thomas Gleixner, Linux Kernel Mailing List,
	Len Brown, Ingo Molnar

On Mon, 2009-03-16 at 12:32 -0700, Linus Torvalds wrote:
> I also think I know why John reported this:
> 
> > Ingo, Thomas: On the hardware I'm testing the fast-pit calibration only
> > triggers probably 80-90% of the time. About 10-20% of the time, the
> > initial check to pit_expect_msb(0xff) fails (count=0), so we may need to
> > look more at this approach.
> 
> and the reason is that when we re-program the PIT, it will actually take 
> until the next timer edge (the incoming 1.1MHz timer) for the new values 
> to take effect. So before the first call to pit_expect_msb(), we should 
> make sure to delay for at least one PIT cycle. The simplest way to do that 
> is to simply read the PIT latch once, it will take about 2us.
> 
> So this patch fixes that too.
> 
> John, does that make the PIT calibration work reliably on your machine?

Yep, I haven't seen a failure with it so far. And it's the same net
effect change my earlier patch was doing (one extra read cycle) just
without all the conditionals, so it should be fine.

thanks
-john




* Re: Linux 2.6.29-rc6
  2009-03-16 19:32                                                           ` Linus Torvalds
  2009-03-17  1:43                                                             ` john stultz
@ 2009-03-17  8:14                                                             ` Ingo Molnar
  2009-03-17 15:48                                                               ` Linus Torvalds
  2009-03-21  9:11                                                             ` Jesper Krogh
  2 siblings, 1 reply; 81+ messages in thread
From: Ingo Molnar @ 2009-03-17  8:14 UTC (permalink / raw)
  To: Linus Torvalds, Peter Zijlstra
  Cc: Jesper Krogh, john stultz, Thomas Gleixner,
	Linux Kernel Mailing List, Len Brown


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, 16 Mar 2009, Jesper Krogh wrote:
> > 
> > you were right. It works. No resets so far.
> 
> Goodie.
> 
> Here's a slightly cleaned-up patch that removes the debug 
> messages, and also re-organizes the code a bit so that it 
> actually uses the "better than 500 ppm" as the way to decide 
> when to stop calibrating.
> 
> Why?
> 
> I tested the 500 ppm check on some slower machines, and the 
> old algorithm of just waiting for 15ms actually failed that 
> 500 ppm test. It was _very_ close - 16ms was enough - but it 
> convinced me that the logic was too damn fragile.
> 
> I also think I know why John reported this:
> 
> > Ingo, Thomas: On the hardware I'm testing the fast-pit calibration only
> > triggers probably 80-90% of the time. About 10-20% of the time, the
> > initial check to pit_expect_msb(0xff) fails (count=0), so we may need to
> > look more at this approach.
> 
> and the reason is that when we re-program the PIT, it will 
> actually take until the next timer edge (the incoming 1.1MHz 
> timer) for the new values to take effect. So before the first 
> call to pit_expect_msb(), we should make sure to delay for at 
> least one PIT cycle. The simplest way to do that is to simply 
> read the PIT latch once, it will take about 2us.
> 
> So this patch fixes that too.
> 
> John, does that make the PIT calibration work reliably on your 
> machine?
> 
> The patch looks bigger than it is: most of the noise is just 
> re-indentation and some trivial re-organizing.

Cool. Will you apply it yourself (in the merge window) or should 
we pick it up?

Incidentally, yesterday i wrote a PIT auto-calibration routine 
(see WIP patch below).

The core idea is to use _all_ thousands of measurement points 
(not just two) to calculate the frequency ratio, with a built-in 
noise detector which drops out of the loop if the observed noise 
goes below ~10 ppm.

It is free-running: i.e. it observes noise and if the result 
stabilizes quickly it can exit quickly. (with an upper bound for 
unreliable PITs or virtualized systems, etc.)

It's WIP because it's not working yet (or at all?): i couldn't 
get the statistical model right - it's too noisy at 1000-2000 
ppm and the frequency result is off by 5000 ppm. Totally against 
expectations. I traced it on a box with a good PIT and in the 
trace the calculations look sane and the noise levels go down 
nicely - except that the result sucks.

I also like yours more because it's simpler.

	Ingo

Index: linux/arch/x86/kernel/tsc.c
===================================================================
--- linux.orig/arch/x86/kernel/tsc.c
+++ linux/arch/x86/kernel/tsc.c
@@ -240,63 +240,201 @@ static unsigned long pit_calibrate_tsc(u
 }
 
 /*
- * This reads the current MSB of the PIT counter, and
- * checks if we are running on sufficiently fast and
- * non-virtualized hardware.
+ * Rolling statistical analysis of (PIT,TSC) measurement deltas.
  *
- * Our expectations are:
- *
- *  - the PIT is running at roughly 1.19MHz
- *
- *  - each IO is going to take about 1us on real hardware,
- *    but we allow it to be much faster (by a factor of 10) or
- *    _slightly_ slower (ie we allow up to a 2us read+counter
- *    update - anything else implies a unacceptably slow CPU
- *    or PIT for the fast calibration to work.
- *
- *  - with 256 PIT ticks to read the value, we have 214us to
- *    see the same MSB (and overhead like doing a single TSC
- *    read per MSB value etc).
- *
- *  - We're doing 2 reads per loop (LSB, MSB), and we expect
- *    them each to take about a microsecond on real hardware.
- *    So we expect a count value of around 100. But we'll be
- *    generous, and accept anything over 50.
- *
- *  - if the PIT is stuck, and we see *many* more reads, we
- *    return early (and the next caller of pit_expect_msb()
- *    then consider it a failure when they don't see the
- *    next expected value).
- *
- * These expectations mean that we know that we have seen the
- * transition from one expected value to another with a fairly
- * high accuracy, and we didn't miss any events. We can thus
- * use the TSC value at the transitions to calculate a pretty
- * good value for the TSC frequencty.
+ * We use a decaying average to estimate current noise levels.
+ * If noise falls below the expected threshold we exit the loop
+ * with the result.
+ *
+ * If this never happens - for example because the PIT is unreliable,
+ * then we break out after a limit and fail this type of calibration.
+ *
+ * Note that this method observes the statistical noise as-is without
+ * making any assumptions, so it is fundamentally robust against
+ * occasional PIT blips or SMI related system activities that can
+ * disturb calibration. An SMI at the wrong moment pushes up the
+ * noise level and causes the calibration loop to exit a tiny bit
+ * later - but still with a precise and reliable result.
  */
-static inline int pit_expect_msb(unsigned char val)
+static s64 sum_slope;
+static s64 sum_slope_noise;
+static s64 prev_slope;
+
+static int nr_measurements;
+
+#define MAX_MEASUREMENTS	10000
+
+#define MIN_MEASUREMENTS	100
+
+struct entry {
+	u64			tsc;
+	unsigned int		pit;
+};
+
+/*
+ * A single measurement is as simple as possible:
+ */
+static inline void do_one_measurement(struct entry *entry)
 {
-	int count = 0;
+	unsigned char pit_lsb, pit_msb;
+	u64 tsc;
 
-	for (count = 0; count < 50000; count++) {
-		/* Ignore LSB */
-		inb(0x42);
-		if (inb(0x42) != val)
-			break;
-	}
-	return count > 50;
+	/*
+	 * We use the PIO accesses as natural TSC serialization barriers:
+	 */
+	pit_lsb			= inb(0x42);
+	tsc			= get_cycles();
+	pit_msb			= inb(0x42);
+
+	entry->tsc		= tsc;
+	entry->pit		= pit_msb*256 + pit_lsb;
+
+	trace_printk("tsc: %Ld, count: %d, nr: %d\n",
+		     entry->tsc, entry->pit, nr_measurements);
 }
 
 /*
- * How many MSB values do we want to see? We aim for a
- * 15ms calibration, which assuming a 2us counter read
- * error should give us roughly 150 ppm precision for
- * the calibration.
+ * We scale numbers up by 1024 to reduce quantization effects:
  */
-#define QUICK_PIT_MS 15
-#define QUICK_PIT_ITERATIONS (QUICK_PIT_MS * PIT_TICK_RATE / 1000 / 256)
+static unsigned long do_delta_analysis(struct entry *e0, struct entry *e1)
+{
+	s64 slope, dslope;
+	s64 noise;
+	int decay;
+	int dc;
+	s64 dt;
+
+	dt = e1->tsc - e0->tsc; /* TSC is going up */
+	dc = e0->pit - e1->pit; /* PIT counter is going down */
+
+	/*
+	 * Delta-PIT-count can be positive (or negative in case of
+	 * an anomaly), but we made sure in do_measurements() that
+	 * it can never be zero:
+	 */
+	slope = 1024 * dt / dc;
+
+	dslope = slope - prev_slope;
+	noise = dslope;
+
+	trace_printk("                   dt:  %20Ld\n", dt);
+	trace_printk("                   dc:  %20d\n", dc);
+	trace_printk("                slope:  %20Ld\n", slope);
+	trace_printk("               dslope:  %20Ld\n", dslope);
+
+	/*
+	 * Add a gentle decaying average to the slope and noise averages:
+	 */
+	trace_printk("       prev sum_slope:  %20Ld\n", sum_slope);
 
-static unsigned long quick_pit_calibrate(void)
+	/*
+	 * Dynamic decay - starts with low values then later on
+	 * the system cools down:
+	 */
+	decay = 1;
+	if (sum_slope_noise)
+		decay = sum_slope / 64 / sum_slope_noise;
+	decay = min(2000, decay);
+	decay = max(nr_measurements/4, decay);
+
+	sum_slope = ((decay - 1)*sum_slope + slope)/decay;
+	trace_printk("        new sum_slope:  %20Ld [decay: %d]\n",
+		     sum_slope, decay);
+
+	trace_printk(" prev sum_slope_noise:  %20Ld\n", sum_slope_noise);
+	sum_slope_noise = (1023*sum_slope_noise + noise)/1024;
+	trace_printk("  new sum_slope_noise:  %20Ld\n", sum_slope_noise);
+
+	prev_slope = slope;
+
+	if (nr_measurements >= 64*MIN_MEASUREMENTS && sum_slope_noise < 10) {
+		trace_printk(" => low noise early exit!\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+static int do_measurements(void)
+{
+	unsigned int pit_stuck;
+	unsigned long flags;
+	struct entry e0, e1;
+	int err = 0;
+
+	sum_slope_noise = 0;
+	sum_slope = 0;
+	prev_slope = 0;
+
+	nr_measurements = 0;
+
+	local_irq_save(flags);
+
+	trace_printk("PIT begin\n");
+	do_one_measurement(&e0);
+
+	do_one_measurement(&e0);
+
+	for (;;) {
+		pit_stuck = 0;
+repeat_e1:
+		do_one_measurement(&e1);
+		/*
+		 * The typical case is that the PIT advanced a bit
+		 * since we last read it (the PIOs take time, etc.).
+		 * In case it did not advance (some really fast
+		 * PIO implementation or virtualization) we will allow
+		 * the count to stay 'stuck' up to 100 times:
+		 *
+		 * (Note that making sure that the count progresses also
+		 * simplifies data processing later on.)
+		 */
+		if (e0.pit != e1.pit) {
+			nr_measurements++;
+			if (nr_measurements >= MAX_MEASUREMENTS) {
+				printk("PIT: final count: %d\n", e1.pit);
+				break;
+			}
+			if (do_delta_analysis(&e0, &e1)) {
+				printk("PIT: low-noise count: %d\n", e1.pit);
+				break;
+			}
+			/*
+			 * Reuse the second measurement point for the
+			 * next delta measurement:
+			 */
+			e0 = e1;
+			trace_printk("\n");
+			continue;
+		}
+		if (pit_stuck++ < 100)
+			goto repeat_e1;
+
+		printk(KERN_INFO "PIT auto-calibration: counter stuck at %d!\n", e1.pit);
+		err = -EINVAL;
+		break;
+	}
+
+	trace_printk("PIT end\n");
+	local_irq_restore(flags);
+
+	return err;
+}
+
+static unsigned long auto_pit_calibrate(void)
+{
+	if (do_measurements() < 0)
+		return 0;
+
+	printk("PIT: sum_slope:        %Ld\n", sum_slope);
+	printk("PIT: Hz:               %Ld\n", sum_slope * PIT_TICK_RATE);
+	printk("PIT: sum_slope_noise:  %Ld\n", sum_slope_noise);
+	printk("PIT: nr_measurements:  %d\n", nr_measurements);
+
+	return sum_slope * PIT_TICK_RATE / 1024 / 1000;
+}
+
+unsigned long quick_pit_calibrate(void)
 {
 	/* Set the Gate high, disable speaker */
 	outb((inb(0x61) & ~0x02) | 0x01, 0x61);
@@ -316,45 +454,7 @@ static unsigned long quick_pit_calibrate
 	outb(0xff, 0x42);
 	outb(0xff, 0x42);
 
-	if (pit_expect_msb(0xff)) {
-		int i;
-		u64 t1, t2, delta;
-		unsigned char expect = 0xfe;
-
-		t1 = get_cycles();
-		for (i = 0; i < QUICK_PIT_ITERATIONS; i++, expect--) {
-			if (!pit_expect_msb(expect))
-				goto failed;
-		}
-		t2 = get_cycles();
-
-		/*
-		 * Make sure we can rely on the second TSC timestamp:
-		 */
-		if (!pit_expect_msb(expect))
-			goto failed;
-
-		/*
-		 * Ok, if we get here, then we've seen the
-		 * MSB of the PIT decrement QUICK_PIT_ITERATIONS
-		 * times, and each MSB had many hits, so we never
-		 * had any sudden jumps.
-		 *
-		 * As a result, we can depend on there not being
-		 * any odd delays anywhere, and the TSC reads are
-		 * reliable.
-		 *
-		 * kHz = ticks / time-in-seconds / 1000;
-		 * kHz = (t2 - t1) / (QPI * 256 / PIT_TICK_RATE) / 1000
-		 * kHz = ((t2 - t1) * PIT_TICK_RATE) / (QPI * 256 * 1000)
-		 */
-		delta = (t2 - t1)*PIT_TICK_RATE;
-		do_div(delta, QUICK_PIT_ITERATIONS*256*1000);
-		printk("Fast TSC calibration using PIT\n");
-		return delta;
-	}
-failed:
-	return 0;
+	return auto_pit_calibrate();
 }
 
 /**


* Re: Linux 2.6.29-rc6
  2009-03-17  8:14                                                             ` Ingo Molnar
@ 2009-03-17 15:48                                                               ` Linus Torvalds
  2009-03-17 16:13                                                                 ` Ingo Molnar
  0 siblings, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-03-17 15:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Jesper Krogh, john stultz, Thomas Gleixner,
	Linux Kernel Mailing List, Len Brown



On Tue, 17 Mar 2009, Ingo Molnar wrote:
> 
> Cool. Will you apply it yourself (in the merge window) or should 
> we pick it up?

I'll commit it.  I already split it into two commits - one for the trivial 
startup problem that John had, one for the "estimate error and exit when 
smaller than 500ppm" part.

> Incidentally, yesterday i wrote a PIT auto-calibration routine 
> (see WIP patch below).
> 
> The core idea is to use _all_ thousands of measurement points 
> (not just two) to calculate the frequency ratio, with a built-in 
> noise detector which drops out of the loop if the observed noise 
> goes below ~10 ppm.

I suspect that reaching 10 ppm is going to take too long in general. 
Considering that I found a machine where reaching 500ppm took 16ms, 
getting to 10ppm would take almost a second. That's a long time at bootup, 
considering that people want the whole boot to take about that time ;)
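
As a back-of-the-envelope check of that estimate, assume the absolute
timing uncertainty at the end points stays constant, so the relative error
shrinks in proportion to the length of the window. The helper name below
is made up for illustration; this is a userspace sketch, not kernel code:

```c
/*
 * Assumption: (relative error in ppm) * (window length) is constant,
 * i.e. the absolute timing uncertainty does not grow with the window.
 */
static unsigned long window_ms_for_ppm(unsigned long known_ms,
				       unsigned long known_ppm,
				       unsigned long target_ppm)
{
	return known_ms * known_ppm / target_ppm;
}
```

With the numbers from this thread, window_ms_for_ppm(16, 500, 10) gives
800, i.e. close to a second.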

I also do think it's a bit unnecessarily complicated. We really only care 
about the end points - obviously we can end up being unlucky and get a 
very noisy end-point due to something like SMI or virtualization, but if 
that happens, we're really just better off failing quickly instead, and 
we'll go on to the slower calibration routines.

On real hardware without SMI or virtualization overhead, the delays 
_should_ be very stable. On my main machine, for example, the PIT read 
really seems very stable at about 2.5us (which matches the expectation 
that one 'inb' should take roughly one microsecond pretty closely). So 
that should be the default case, and the case that the fast calibration is 
designed for.

For the other cases, we really can just exit and do something else.

> It's WIP because it's not working yet (or at all?): i couldn't 
> get the statistical model right - it's too noisy at 1000-2000 
> ppm and the frequency result is off by 5000 ppm.

I suspect your measurement overhead is getting noticeable. You do all 
those divides, but even more so, you do all those traces. Also, it looks 
like you do purely local pairwise analysis at subsequent PIT modelling 
points, which can't work - you need to average over a long time to 
stabilize it.

So you _can_ do something like what you do, but you'd need to find a 
low-noise start and end point, and do analysis over that longer range 
instead of trying to do it over individual cases. 
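
Something like the following is what the end-point approach boils down to.
It is only a userspace sketch under stated assumptions: the function name
is hypothetical, PIT_TICK_RATE is taken to be the usual 1193182 Hz, and
the two samples are assumed to be clean (no SMI in between, no counter
wrap):

```c
#include <stdint.h>

#define PIT_TICK_RATE 1193182ULL	/* PIT input clock in Hz (assumed) */

/*
 * TSC frequency in kHz from two (TSC, PIT) samples taken far apart.
 * The PIT counts down, so elapsed ticks = pit_start - pit_end.
 */
static uint64_t tsc_khz_from_endpoints(uint64_t tsc_start, uint64_t tsc_end,
				       uint32_t pit_start, uint32_t pit_end)
{
	uint64_t ticks = pit_start - pit_end;

	if (!ticks)
		return 0;	/* no PIT progress: nothing to compute */

	/* kHz = cycles / seconds / 1000, with seconds = ticks / PIT_TICK_RATE */
	return (tsc_end - tsc_start) * PIT_TICK_RATE / (ticks * 1000);
}
```

The longer the distance between the two samples, the less a fixed read
uncertainty at either end matters for the result.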

> I also like yours more because it's simpler.

In fact, it's much simpler than what we used to do. No real assumptions 
about how quickly we can read the PIT, no need for magic values ("we can 
distinguish a slow virtual environment from real hardware by the fact that 
we can do at least 50 PIT reads in one cycle"), no nothing. Just a simple 
"is it below 500ppm yet?".

(Well, technically, it compares to 1 in 2048 rather than 500 in a million, 
since that is much cheaper, so it's really looking for "better than 
488ppm")
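
A minimal sketch of that comparison, with a hypothetical helper name and
the assumption that both values are in TSC cycles (this is not the
committed code):

```c
#include <stdint.h>

/*
 * Nonzero when 'error' is below 1/2048 of 'total' (about 488 ppm).
 * The shift by 11 replaces a 64-bit division by 2048.
 */
static int error_below_488ppm(uint64_t error, uint64_t total)
{
	return error < (total >> 11);
}
```

For a total of 2048000 cycles the threshold is 1000 cycles of error.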

		Linus


* Re: Linux 2.6.29-rc6
  2009-03-17 15:48                                                               ` Linus Torvalds
@ 2009-03-17 16:13                                                                 ` Ingo Molnar
  2009-03-17 16:28                                                                   ` Linus Torvalds
  2009-03-17 17:28                                                                   ` Olivier Galibert
  0 siblings, 2 replies; 81+ messages in thread
From: Ingo Molnar @ 2009-03-17 16:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Jesper Krogh, john stultz, Thomas Gleixner,
	Linux Kernel Mailing List, Len Brown


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, 17 Mar 2009, Ingo Molnar wrote:
> > 
> > Cool. Will you apply it yourself (in the merge window) or should 
> > we pick it up?
> 
> I'll commit it.  I already split it into two commits - one for the 
> trivial startup problem that John had, one for the "estimate error 
> and exit when smaller than 500ppm" part.

ok.

> > Incidentally, yesterday i wrote a PIT auto-calibration routine 
> > (see WIP patch below).
> > 
> > The core idea is to use _all_ thousands of measurement points 
> > (not just two) to calculate the frequency ratio, with a built-in 
> > noise detector which drops out of the loop if the observed noise 
> > goes below ~10 ppm.
> 
> I suspect that reaching 10 ppm is going to take too long in 
> general. Considering that I found a machine where reaching 500ppm 
> took 16ms, getting to 10ppm would take almost a second. That's a 
> long time at bootup, considering that people want the whole boot 
> to take about that time ;)
> 
> I also do think it's a bit unnecessarily complicated. We really 
> only care about the end points - obviously we can end up being 
> unlucky and get a very noisy end-point due to something like SMI 
> or virtualization, but if that happens, we're really just better 
> off failing quickly instead, and we'll go on to the slower 
> calibration routines.

That's the idea of my patch: to use not two endpoints but thousands 
of measurement points. That way we don't have to worry about the 
precision of the endpoints - any 'bad' measurement will be 
counter-acted by thousands of 'good' measurements.

That's the theory at least - practice got in my way ;-)

By measuring more we can get a more precise result, and we also do 
not assume anything about how much time passes between two 
measurement points. A single measurement is:

+	/*
+	 * We use the PIO accesses as natural TSC serialization barriers:
+	 */
+	pit_lsb			= inb(0x42);
+	tsc			= get_cycles();
+	pit_msb			= inb(0x42);

Just like we can prove that there's an exoplanet around a star, just 
by doing a _ton_ of measurements of a very noisy data source. As 
long as there's an underlying physical value to be measured (and we 
are not measuring pure noise) that value is recoverable, with enough 
measurements.

> On real hardware without SMI or virtualization overhead, the 
> delays _should_ be very stable. On my main machine, for example, 
> the PIT read really seems very stable at about 2.5us (which 
> matches the expectation that one 'inb' should take roughly one 
> microsecond pretty closely). So that should be the default case, 
> and the case that the fast calibration is designed for.
> 
> For the other cases, we really can just exit and do something 
> else.
> 
> > It's WIP because it's not working yet (or at all?): i couldn't 
> > get the statistical model right - it's too noisy at 1000-2000 
> > ppm and the frequency result is off by 5000 ppm.
> 
> I suspect your measurement overhead is getting noticeable. You do 
> all those divides, but even more so, you do all those traces. 
> Also, it looks like you do purely local pairwise analysis at 
> subsequent PIT modelling points, which can't work - you need to 
> average over a long time to stabilize it.

Actually, it's key to my trick that what happens _between_ the 
measurement points does not matter _at all_.

My 'delta' algorithm does not assume anything about how much time 
passes between two measurement points - it calculates the slope and 
keeps a rolling average of that slope.

That's why i could put the delta analysis there. We are capturing 
thousands of measurement points, and what matters is the precision 
of the 'pair' of (PIT,TSC) timestamp measurements.

I got roughly the same end result noise and the same anomalies with 
tracing enabled and disabled. (and the number of data points was cut 
in half with tracing enabled)

	Ingo


* Re: Linux 2.6.29-rc6
  2009-03-17 16:13                                                                 ` Ingo Molnar
@ 2009-03-17 16:28                                                                   ` Linus Torvalds
  2009-03-17 16:40                                                                     ` Ingo Molnar
  2009-03-17 17:28                                                                   ` Olivier Galibert
  1 sibling, 1 reply; 81+ messages in thread
From: Linus Torvalds @ 2009-03-17 16:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Jesper Krogh, john stultz, Thomas Gleixner,
	Linux Kernel Mailing List, Len Brown



On Tue, 17 Mar 2009, Ingo Molnar wrote:
> 
> That's the idea of my patch: to use not two endpoints but thousands 
> of measurement points.

Umm. Except you don't.

> By measuring more we can get a more precise result, and we also do 
> not assume anything about how much time passes between two 
> measurement points.

That's fine, but your actual code doesn't _do_ that.

> My 'delta' algorithm does not assume anything about how much time 
> passes between two measurement points - it calculates the slope and 
> keeps a rolling average of that slope.

No, you keep a very bad measure of "some kind of random average of the 
last few points", which - if I read things right:

 - lacks precision (you really need to use 'double' floating point to do 
   it well, otherwise the rounding errors will kill you). You seem to be 
   aiming for a 10-bit fixed point thing, which may or may not work if 
   done cleverly, but:

 - seems to be based on a rather weak averaging function which certainly 
   will lose data over time.
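
To illustrate the rounding concern, here is a small userspace model of an
integer decaying average with the same update shape as in the patch. The
function names and numbers are made up, and it ignores the 1024x scaling
and the dynamic decay:

```c
#include <stdint.h>

/* Same update shape as the patch: avg = ((decay - 1) * avg + sample) / decay */
static int64_t decay_avg(int64_t avg, int64_t sample, int decay)
{
	return ((decay - 1) * avg + sample) / decay;
}

/* Feed the same sample 'n' times and return where the average ends up. */
static int64_t settle(int64_t avg, int64_t sample, int decay, int n)
{
	while (n--)
		avg = decay_avg(avg, sample, decay);
	return avg;
}
```

Starting at 2000 and feeding a constant 2500 with decay 1000, the average
never moves: (999*2000 + 2500)/1000 truncates back to 2000. Starting at
3000 it does walk down to 2500. So an upward step smaller than 'decay' is
simply lost, which biases the result low.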

The thing is, the only _accurate_ average is the one done over long time 
distances. It's very true that your slope thing works very well over such 
long times, and you'd get accurate measurement if you did it that way, BUT 
THAT IS NOT WHAT YOU DO. You have a very tight loop, so you get very bad 
slopes, and then you use a weak averaging function to try to make them 
better, but it never does.

Also, there seems to be a fundamental bug in your PIT reading routine. My 
fast-TSC calibration only looks at the MSB of the PIT read for a very good 
reason: if you don't use the explicit LATCH command, you may be getting 
the MSB of one counter value, and then the LSB of another. So your PIT 
read can easily be off by ~256 PIT cycles. Only by caring only for the MSB 
can you do an unlatched read!

That is why pit_expect_msb() looks for the "edge" where the MSB changes, 
and never actually looks at the LSB. 
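
The failure mode can be modelled in a few lines. This is a simulation with
a made-up function name, not real port I/O; it assumes the LSB is read
first and the counter advances before the MSB is read:

```c
#include <stdint.h>

/*
 * Model an unlatched 16-bit read: the LSB is sampled first, the
 * down-counter then advances by 'ticks_between', and only then is
 * the MSB sampled.
 */
static uint16_t unlatched_read(uint16_t counter, uint16_t ticks_between)
{
	uint8_t lsb = counter & 0xff;
	uint8_t msb;

	counter -= ticks_between;	/* the PIT keeps counting down */
	msb = counter >> 8;

	return (uint16_t)(msb << 8 | lsb);
}
```

For a counter at 0x1200 that ticks once between the two reads, the result
is 0x1100: the LSB 0x00 comes from 0x1200 and the MSB 0x11 from 0x11ff, so
the combined value is off by about 256.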

This issue may be an additional reason for your problems, although maybe 
your noise correction will be able to avoid those cases.

			Linus


* Re: Linux 2.6.29-rc6
  2009-03-17 16:28                                                                   ` Linus Torvalds
@ 2009-03-17 16:40                                                                     ` Ingo Molnar
  0 siblings, 0 replies; 81+ messages in thread
From: Ingo Molnar @ 2009-03-17 16:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Jesper Krogh, john stultz, Thomas Gleixner,
	Linux Kernel Mailing List, Len Brown


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, 17 Mar 2009, Ingo Molnar wrote:
> > 
> > That's the idea of my patch: to use not two endpoints but thousands 
> > of measurement points.
> 
> Umm. Except you don't.
> 
> > By measuring more we can get a more precise result, and we also do 
> > not assume anything about how much time passes between two 
> > measurement points.
> 
> That's fine, but your actual code doesn't _do_ that.
> 
> > My 'delta' algorithm does not assume anything about how much time 
> > passes between two measurement points - it calculates the slope and 
> > keeps a rolling average of that slope.
> 
> No, you keep a very bad measure of "some kind of random average of the 
> last few points", which - if I read things right:
> 
>  - lacks precision (you really need to use 'double' floating point to do 
>    it well, otherwise the rounding errors will kill you). You seem to be 
>    aiming for a 10-bit fixed point thing, which may or may not work if 
>    done cleverly, but:
> 
>  - seems to be based on a rather weak averaging function which certainly 
>    will lose data over time.
> 
> The thing is, the only _accurate_ average is the one done over 
> long time distances. It's very true that your slope thing works 
> very well over such long times, and you'd get accurate measurement 
> if you did it that way, BUT THAT IS NOT WHAT YOU DO. You have a 
> very tight loop, so you get very bad slopes, and then you use a 
> weak averaging function to try to make them better, but it never 
> does.

Hm, the intention there was to have a memory of ~1000 entries via a 
decaying average of 1:1000.

In parallel to that there's also a noise estimator (which too decays 
over time). So basically when observed noise is very low we 
essentially use the data from the last ~1000 measurements. (well, 
not exactly - as the 'memory' of more recent data will be stronger 
than that of older ones.)

Again ... it's a clearly non-working patch so it's not really a 
defendable concept :-)

> Also, there seems to be a fundamental bug in your PIT reading 
> routine. My fast-TSC calibration only looks at the MSB of the PIT 
> read for a very good reason: if you don't use the explicit LATCH 
> command, you may be getting the MSB of one counter value, and then 
> the LSB of another. So your PIT read can easily be off by ~256 PIT 
> cycles. Only by caring only for the MSB can you do an unlatched 
> read!
> 
> That is why pit_expect_msb() looks for the "edge" where the MSB 
> changes, and never actually looks at the LSB.
> 
> This issue may be an additional reason for your problems, although 
> maybe your noise correction will be able to avoid those cases.

indeed. I did check the trace results though via gnuplot yesterday 
(suspecting PIT readout outliers) and there were no outliers.

For any final patch it's still a showstopper issue.

But the source of error and miscalibration is elsewhere.

	Ingo


* Re: Linux 2.6.29-rc6
  2009-03-17 16:13                                                                 ` Ingo Molnar
  2009-03-17 16:28                                                                   ` Linus Torvalds
@ 2009-03-17 17:28                                                                   ` Olivier Galibert
  1 sibling, 0 replies; 81+ messages in thread
From: Olivier Galibert @ 2009-03-17 17:28 UTC (permalink / raw)
  To: Linux Kernel Mailing List

On Tue, Mar 17, 2009 at 05:13:22PM +0100, Ingo Molnar wrote:
> That's why i could put the delta analysis there. We are capturing 
> thousands of measurement points, and what matters is the precision 
> of the 'pair' of (PIT,TSC) timestamp measurements.
> 
> I got roughly the same end result noise and the same anomalies with 
> tracing enabled and disabled. (and the number of data points was cut 
> in half with tracing enabled)

Any reason for not doing a bog-standard linear regression?
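
For reference, a plain least-squares fit of TSC against elapsed PIT ticks
would look roughly like this. It is a userspace sketch using doubles
(kernel code would need a fixed-point variant), and the function name is
hypothetical:

```c
#include <stddef.h>

/*
 * Ordinary least-squares slope of y over x:
 *   slope = sum((x - mx) * (y - my)) / sum((x - mx)^2)
 * Returns 0.0 for fewer than two points or when all x are identical.
 */
static double ols_slope(const double *x, const double *y, size_t n)
{
	double mx = 0.0, my = 0.0, sxy = 0.0, sxx = 0.0;
	size_t i;

	if (n < 2)
		return 0.0;

	for (i = 0; i < n; i++) {
		mx += x[i];
		my += y[i];
	}
	mx /= n;
	my /= n;

	for (i = 0; i < n; i++) {
		sxy += (x[i] - mx) * (y[i] - my);
		sxx += (x[i] - mx) * (x[i] - mx);
	}

	return sxx != 0.0 ? sxy / sxx : 0.0;
}
```

Here x would hold the elapsed PIT ticks and y the TSC readings; the slope
is then TSC cycles per PIT tick.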

  OG.


* Re: Linux 2.6.29-rc6
  2009-03-16 19:32                                                           ` Linus Torvalds
  2009-03-17  1:43                                                             ` john stultz
  2009-03-17  8:14                                                             ` Ingo Molnar
@ 2009-03-21  9:11                                                             ` Jesper Krogh
  2009-03-21 10:06                                                               ` Ingo Molnar
  2 siblings, 1 reply; 81+ messages in thread
From: Jesper Krogh @ 2009-03-21  9:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: john stultz, Thomas Gleixner, Linux Kernel Mailing List,
	Len Brown, Ingo Molnar

Linus Torvalds wrote:
> 
> On Mon, 16 Mar 2009, Jesper Krogh wrote:
>> you were right. It works. No resets so far.
> 
> Goodie.
> 
> Here's a slightly cleaned-up patch that removes the debug messages, and 
> also re-organizes the code a bit so that it actually uses the "better than 
> 500 ppm" as the way to decide when to stop calibrating.

Can we ship:
commit a6a80e1d8cf82b46a69f88e659da02749231eb36
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Mar 17 07:58:26 2009 -0700

     Fix potential fast PIT TSC calibration startup glitch

and
commit 9e8912e04e612b43897b4b722205408b92f423e5
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Mar 17 08:13:17 2009 -0700

     Fast TSC calibration: calculate proper frequency error bounds


to the 2.6.28-stable series? The first one is needed to apply the second.

Jesper
-- 
Jesper


* Re: Linux 2.6.29-rc6
  2009-03-21  9:11                                                             ` Jesper Krogh
@ 2009-03-21 10:06                                                               ` Ingo Molnar
  0 siblings, 0 replies; 81+ messages in thread
From: Ingo Molnar @ 2009-03-21 10:06 UTC (permalink / raw)
  To: Jesper Krogh
  Cc: Linus Torvalds, john stultz, Thomas Gleixner,
	Linux Kernel Mailing List, Len Brown


* Jesper Krogh <jesper@krogh.cc> wrote:

> Linus Torvalds wrote:
>>
>> On Mon, 16 Mar 2009, Jesper Krogh wrote:
>>> you were right. It works. No resets so far.
>>
>> Goodie.
>>
>> Here's a slightly cleaned-up patch that removes the debug messages, and 
>> also re-organizes the code a bit so that it actually uses the "better 
>> than 500 ppm" as the way to decide when to stop calibrating.
>
> Can we ship:
> commit a6a80e1d8cf82b46a69f88e659da02749231eb36
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Tue Mar 17 07:58:26 2009 -0700
>
>     Fix potential fast PIT TSC calibration startup glitch
>
> and
> commit 9e8912e04e612b43897b4b722205408b92f423e5
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Tue Mar 17 08:13:17 2009 -0700
>
>     Fast TSC calibration: calculate proper frequency error bounds
>
>
> to the 2.6.28-stable series? The first one is needed to apply the second.

Yes, would be nice to have these fixes in .28.9.

	Ingo


Thread overview: 81+ messages
2009-02-23  4:31 Linux 2.6.29-rc6 Linus Torvalds
2009-02-23 14:07 ` Linux 2.6.29-rc6 - Fix oops in i915_gem_retire_requests Karsten Wiese
2009-02-26 11:15 ` Linux 2.6.29-rc6 Jesper Krogh
2009-02-26 17:17   ` MTD_CK804XROM warning (Was: Linux 2.6.29-rc6) Marcin Slusarz
2009-02-26 17:53   ` Linux 2.6.29-rc6 Linus Torvalds
2009-02-26 19:22     ` David Woodhouse
2009-02-26 19:31     ` Jesper Krogh
2009-02-26 19:36       ` David Woodhouse
2009-02-26 19:46         ` Jesper Krogh
2009-02-26 19:49           ` David Woodhouse
2009-02-26 20:53         ` Carl-Daniel Hailfinger
2009-02-26 20:32       ` Linus Torvalds
2009-02-26 19:55 ` Jesper Krogh
2009-02-26 20:33   ` Linus Torvalds
2009-02-26 20:43     ` Jesper Krogh
2009-02-26 21:19       ` john stultz
2009-02-26 21:35         ` Jesper Krogh
2009-02-26 21:46           ` john stultz
2009-02-26 21:54             ` Thomas Gleixner
2009-02-26 22:04               ` Jesper Krogh
2009-02-27  6:30             ` Jesper Krogh
2009-03-01 13:51             ` Jesper Krogh
2009-02-26 21:49           ` Linus Torvalds
2009-03-01 15:04             ` Jesper Krogh
2009-02-26 21:54           ` john stultz
2009-02-26 22:06             ` Thomas Gleixner
2009-02-26 22:24               ` Linus Torvalds
2009-02-26 22:31                 ` Linus Torvalds
2009-02-26 22:31               ` john stultz
2009-02-26 22:40                 ` Linus Torvalds
2009-02-26 22:59                   ` john stultz
2009-02-27  7:33                     ` Ingo Molnar
2009-02-27 20:50                       ` john stultz
2009-02-27  6:47                 ` Jesper Krogh
2009-02-27 20:35                   ` john stultz
2009-03-01 20:13                     ` Jesper Krogh
2009-03-02  9:53                     ` Jesper Krogh
2009-03-02 21:27                       ` john stultz
2009-03-03  6:04                         ` Jesper Krogh
2009-03-03 19:53                           ` john stultz
2009-03-03 20:19                             ` Jesper Krogh
2009-03-03 22:22                               ` john stultz
2009-03-04 15:30                                 ` Jesper Krogh
2009-03-04 18:36                                   ` Jesper Krogh
2009-03-04 18:57                                     ` John Stultz
2009-03-05  2:39                                       ` john stultz
2009-03-05  2:52                                         ` john stultz
2009-03-05  8:43                                           ` Ingo Molnar
2009-03-06  3:13                                             ` john stultz
2009-03-06  3:54                                               ` john stultz
2009-03-06 11:34                                                 ` Ingo Molnar
2009-03-09 20:42                                           ` Jesper Krogh
2009-03-10  4:26                                             ` Linus Torvalds
2009-03-10 11:29                                               ` Thomas Gleixner
2009-03-10 19:42                                                 ` Jesper Krogh
2009-03-10 22:22                                                   ` Thomas Gleixner
2009-03-15 19:53                                                     ` Jesper Krogh
2009-03-16 18:40                                                       ` Jesper Krogh
2009-03-15  1:19                                             ` Linus Torvalds
2009-03-15 15:44                                               ` Jesper Krogh
2009-03-15 18:09                                                 ` Linus Torvalds
2009-03-15 18:38                                                   ` Jesper Krogh
2009-03-15 19:02                                                     ` Linus Torvalds
2009-03-15 19:52                                                       ` Jesper Krogh
2009-03-16 18:59                                                         ` Jesper Krogh
2009-03-16 19:32                                                           ` Linus Torvalds
2009-03-17  1:43                                                             ` john stultz
2009-03-17  8:14                                                             ` Ingo Molnar
2009-03-17 15:48                                                               ` Linus Torvalds
2009-03-17 16:13                                                                 ` Ingo Molnar
2009-03-17 16:28                                                                   ` Linus Torvalds
2009-03-17 16:40                                                                     ` Ingo Molnar
2009-03-17 17:28                                                                   ` Olivier Galibert
2009-03-21  9:11                                                             ` Jesper Krogh
2009-03-21 10:06                                                               ` Ingo Molnar
2009-03-15 20:32                                                     ` Linus Torvalds
2009-03-03 20:39                             ` Jesper Krogh
2009-03-03 22:16                               ` john stultz
2009-03-04  5:36                                 ` Jesper Krogh
2009-03-01 15:09   ` Jesper Krogh
2009-03-01 15:44     ` Linux 2.6.29-rc6 (clocksource) Sitsofe Wheeler