* REGRESSION: the new i386 timer code fails to sync CPUs
@ 2006-07-22 23:36 Matthias Urlichs
  2006-07-23 0:36 ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-22 23:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: ohnstul, akpm, torvalds, bunk, lethal, hirofumi

Hi,

the change 5d0cf410e94b1f1ff852c3f210d22cc6c5a27ffa
[PATCH] Time: i386 Clocksource Drivers

Implement the time sources for i386 (acpi_pm, cyclone, hpet, pit, and tsc).
With this patch, the conversion of the i386 arch to the generic timekeeping
code should be complete.

The patch should be fairly straight forward, only adding the new clocksources.

causes the clocks of the two CPUs in my dual-Xeon server to lose
(or, maybe, never gain) sync.

Before this change, they're in sync; afterwards, they're not.

This is a problem because, as soon as the system decides to switch CPUs
while a program is sleeping (which happens quite early during boot-up),
that sleep takes a *long* time. :-/

Checked by simply running "date" repeatedly. Thanks to Linux' superb
scheduler, this command reliably runs on alternate CPUs, thereby
demonstrating the problem, and I didn't have to resort to "taskset".

Boot log: Linux version 2.6.17-test-1.29 (root@kiste) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #1 SMP PREEMPT Sun Jul 23 01:05:35 CEST 2006 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009b000 (usable) BIOS-e820: 000000000009b000 - 00000000000a0000 (reserved) BIOS-e820: 00000000000ca000 - 00000000000cc000 (reserved) BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 00000000d7f70000 (usable) BIOS-e820: 00000000d7f70000 - 00000000d7f7b000 (ACPI data) BIOS-e820: 00000000d7f7b000 - 00000000d7f80000 (ACPI NVS) BIOS-e820: 00000000d7f80000 - 00000000d8000000 (reserved) BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved) BIOS-e820: 00000000fffffc00 - 0000000100000000 (reserved) BIOS-e820: 0000000100000000 - 0000000128000000 (usable) Warning only 4GB will be used. Use a PAE enabled kernel. 3200MB HIGHMEM available. 896MB LOWMEM available. found SMP MP-table at 000f6700 On node 0 totalpages: 1048576 DMA zone: 4096 pages, LIFO batch:0 Normal zone: 225280 pages, LIFO batch:31 HighMem zone: 819200 pages, LIFO batch:31 DMI present. ACPI: RSDP (v000 PTLTD ) @ 0x000f6750 ACPI: RSDT (v001 PTLTD RSDT 0x06040000 LTP 0x00000000) @ 0xd7f75ea1 ACPI: FADT (v001 INTEL TUMWATER 0x06040000 PTL 0x00000003) @ 0xd7f7ae35 ACPI: MADT (v001 PTLTD APIC 0x06040000 LTP 0x00000000) @ 0xd7f7aea9 ACPI: BOOT (v001 PTLTD $SBFTBL$ 0x06040000 LTP 0x00000001) @ 0xd7f7af39 ACPI: MCFG (v001 PTLTD Mcfg 0x06040000 LTP 0x00000000) @ 0xd7f7af61 ACPI: ASF! 
(v016 CETP CETP 0x06040000 PTL 0x00000001) @ 0xd7f7af9d ACPI: SSDT (v001 PmRef CpuPm 0x00003000 INTL 0x20030224) @ 0xd7f75edd ACPI: DSDT (v001 Intel Lindhrst 0x06040000 MSFT 0x0100000e) @ 0x00000000 ACPI: PM-Timer IO Port: 0x1008 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 15:4 APIC version 20 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled) Processor #6 15:4 APIC version 20 ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) Processor #1 15:4 APIC version 20 ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled) Processor #7 15:4 APIC version 20 ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1]) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 ACPI: IOAPIC (id[0x03] address[0xfec10000] gsi_base[24]) IOAPIC[1]: apic_id 3, version 32, address 0xfec10000, GSI 24-47 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 2 I/O APICs Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at f1000000 (gap: f0000000:0ec00000) Detected 3000.352 MHz processor. Built 1 zonelists. Total pages: 1048576 Kernel command line: root=/dev/md1 ro break mapped APIC to ffffd000 (fee00000) mapped IOAPIC to ffffc000 (fec00000) mapped IOAPIC to ffffb000 (fec10000) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. 
Initializing CPU#0 PID hash table entries: 4096 (order: 12, 16384 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 3483476k/4194304k available (1724k kernel code, 53692k reserved, 982k data, 204k init, 2620864k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 6004.95 BogoMIPS (lpj=12009905) Mount-cache hash table entries: 512 CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. using mwait in idle threads. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 0 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU0: Intel P4/Xeon Extended MCE MSRs (24) available Checking 'hlt' instruction... OK. Freeing SMP alternatives: 12k freed ACPI: Core revision 20060608 CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Booting processor 1/1 eip 2000 Initializing CPU#1 Calibrating delay using timer specific routine.. 6000.73 BogoMIPS (lpj=12001474) CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 0 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. 
CPU1: Intel P4/Xeon Extended MCE MSRs (24) available CPU1: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Booting processor 2/6 eip 2000 Initializing CPU#2 Calibrating delay using timer specific routine.. 5600.72 BogoMIPS (lpj=11201446) CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 3 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#2. CPU2: Intel P4/Xeon Extended MCE MSRs (24) available CPU2: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Booting processor 3/7 eip 2000 Initializing CPU#3 Calibrating delay using timer specific routine.. 5600.70 BogoMIPS (lpj=11201416) CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 3 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#3. CPU3: Intel P4/Xeon Extended MCE MSRs (24) available CPU3: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Total of 4 processors activated (23207.12 BogoMIPS). ENABLING IO-APIC IRQs ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 checking TSC synchronization across 4 CPUs: Brought up 4 CPUs migration_cost=93,1739 checking if image is initramfs... 
it is Freeing initrd memory: 4192k freed NET: Registered protocol family 16 ACPI: bus type pci registered PCI: Using MMCONFIG Setting up standard PCI resources ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (0000:00) PCI: Probing PCI hardware (bus 00) PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO PCI quirk: region 1180-11bf claimed by ICH4 GPIO PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1 Boot video device is 0000:04:03.0 PCI: Transparent bridge - 0000:00:1e.0 ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIX._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 10 11 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *10 11 14 15) ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 *11 14 15) ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 *11 14 15) ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKF] (IRQs 4 5 6 7 10 11 14 15) *3 ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 *10 11 14 15) ACPI: PCI Interrupt Link [LNKH] (IRQs 4 5 6 7 *10 11 14 15) Linux Plug and Play Support v0.97 (c) Adam Belay PCI: Using ACPI for IRQ routing PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report PCI: Bridge: 0000:00:02.0 IO window: disabled. MEM window: da100000-da1fffff PREFETCH window: disabled. PCI: Bridge: 0000:00:03.0 IO window: disabled. MEM window: da200000-da2fffff PREFETCH window: disabled. PCI: Bridge: 0000:00:1c.0 IO window: 2000-2fff MEM window: da300000-da3fffff PREFETCH window: de000000-dfffffff PCI: Bridge: 0000:05:04.0 IO window: 4000-4fff MEM window: dc000000-dc0fffff PREFETCH window: disabled. PCI: Bridge: 0000:04:02.0 IO window: 4000-4fff MEM window: dc000000-dc0fffff PREFETCH window: disabled. 
PCI: Bridge: 0000:00:1e.0 IO window: 3000-4fff MEM window: da400000-dc0fffff PREFETCH window: f1000000-f10fffff ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 169 PCI: Setting latency timer of device 0000:00:02.0 to 64 ACPI: PCI Interrupt 0000:00:03.0[A] -> GSI 16 (level, low) -> IRQ 169 PCI: Setting latency timer of device 0000:00:03.0 to 64 PCI: Setting latency timer of device 0000:00:1e.0 to 64 NET: Registered protocol family 2 IP route cache hash table entries: 131072 (order: 7, 524288 bytes) TCP established hash table entries: 131072 (order: 9, 3145728 bytes) TCP bind hash table entries: 65536 (order: 8, 1572864 bytes) TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered Simple Boot Flag at 0x36 set to 0x1 Machine check exception polling timer started. highmem bounce pool size: 64 pages Initializing Cryptographic API io scheduler noop registered io scheduler cfq registered (default) isapnp: Scanning for PnP cards... isapnp: No Plug & Play device found Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize PNP: No PS/2 controller found. Probing ports directly. serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 mice: PS/2 mouse device common for all mice TCP bic registered NET: Registered protocol family 1 Starting balanced_irq Using IPI Shortcut mode Freeing unused kernel memory: 204k freed Time: tsc clocksource has been installed. 
.config file (trimmed: anything not mentioned is turned off): CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" CONFIG_SYSVIPC=y CONFIG_SYSCTL=y CONFIG_INITRAMFS_SOURCE="" CONFIG_UID16=y CONFIG_VM86=y CONFIG_EMBEDDED=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y CONFIG_KALLSYMS_EXTRA_PASS=y CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SLAB=y CONFIG_TINY_SHMEM=y CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODVERSIONS=y CONFIG_KMOD=y CONFIG_IOSCHED_NOOP=y CONFIG_DEFAULT_NOOP=y CONFIG_DEFAULT_IOSCHED="noop" CONFIG_SMP=y CONFIG_X86_PC=y CONFIG_MPENTIUMIII=y CONFIG_X86_GENERIC=y CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_L1_CACHE_SHIFT=7 CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_CMPXCHG64=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_HPET_TIMER=y CONFIG_NR_CPUS=4 CONFIG_SCHED_SMT=y CONFIG_SCHED_MC=y CONFIG_PREEMPT=y CONFIG_PREEMPT_BKL=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_MCE=y CONFIG_X86_MCE_NONFATAL=y CONFIG_X86_REBOOTFIXUPS=y CONFIG_EDD=m CONFIG_HIGHMEM4G=y CONFIG_PAGE_OFFSET=0xC0000000 CONFIG_HIGHMEM=y CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y CONFIG_SPLIT_PTLOCK_CPUS=4 CONFIG_MTRR=y CONFIG_IRQBALANCE=y CONFIG_REGPARM=y CONFIG_SECCOMP=y CONFIG_HZ_250=y CONFIG_HZ=250 CONFIG_PHYSICAL_START=0x100000 CONFIG_PM=y CONFIG_ACPI=y CONFIG_ACPI_AC=m CONFIG_ACPI_BATTERY=m CONFIG_ACPI_BUTTON=m CONFIG_ACPI_VIDEO=m CONFIG_ACPI_FAN=m CONFIG_ACPI_PROCESSOR=m CONFIG_ACPI_THERMAL=m CONFIG_ACPI_BLACKLIST_YEAR=0 CONFIG_ACPI_EC=y CONFIG_ACPI_POWER=y CONFIG_ACPI_SYSTEM=y CONFIG_X86_PM_TIMER=y 
CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_MMCONFIG=y CONFIG_PCI_MSI=y CONFIG_ISA_DMA_API=y CONFIG_BINFMT_ELF=y CONFIG_NET=y CONFIG_UNIX=y CONFIG_NETWORK_SECMARK=y CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=m CONFIG_BLK_DEV_RAM=m CONFIG_BLK_DEV_RAM_COUNT=16 CONFIG_BLK_DEV_RAM_SIZE=4096 CONFIG_BLK_DEV_INITRD=y CONFIG_INPUT=y CONFIG_INPUT_MOUSEDEV=y CONFIG_INPUT_MOUSEDEV_PSAUX=y CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 CONFIG_INPUT_EVDEV=m CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_SERIO_LIBPS2=y CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_PCI=m CONFIG_SERIAL_8250_NR_UARTS=4 CONFIG_SERIAL_8250_RUNTIME_UARTS=4 CONFIG_SERIAL_8250_EXTENDED=y CONFIG_SERIAL_8250_MANY_PORTS=y CONFIG_SERIAL_8250_SHARE_IRQ=y CONFIG_SERIAL_8250_RSA=y CONFIG_SERIAL_CORE=y CONFIG_UNIX98_PTYS=y CONFIG_NVRAM=m CONFIG_RTC=m CONFIG_GEN_RTC=m CONFIG_GEN_RTC_X=y CONFIG_HPET=y CONFIG_HPET_MMAP=y CONFIG_VIDEO_V4L2=y CONFIG_FB=y CONFIG_FB_CFB_FILLRECT=y CONFIG_FB_CFB_COPYAREA=y CONFIG_FB_CFB_IMAGEBLIT=y CONFIG_FB_FIRMWARE_EDID=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y CONFIG_FB_VESA=y CONFIG_VIDEO_SELECT=y CONFIG_VGA_CONSOLE=y CONFIG_VGACON_SOFT_SCROLLBACK=y CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64 CONFIG_DUMMY_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y CONFIG_FONT_8x8=y CONFIG_FONT_8x16=y CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB_ARCH_HAS_EHCI=y CONFIG_DMA_ENGINE=y CONFIG_NET_DMA=y CONFIG_INTEL_IOATDMA=m CONFIG_ROMFS_FS=y CONFIG_INOTIFY=y CONFIG_INOTIFY_USER=y CONFIG_DNOTIFY=y CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y CONFIG_TMPFS=y CONFIG_RAMFS=y CONFIG_CRAMFS=y CONFIG_PARTITION_ADVANCED=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_KERNEL=y CONFIG_LOG_BUF_SHIFT=17 CONFIG_DETECT_SOFTLOCKUP=y CONFIG_DEBUG_SLAB=y CONFIG_DEBUG_SLAB_LEAK=y 
CONFIG_DEBUG_PREEMPT=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_SPINLOCK_SLEEP=y CONFIG_DEBUG_BUGVERBOSE=y CONFIG_DEBUG_INFO=y CONFIG_DEBUG_FS=y CONFIG_DEBUG_VM=y CONFIG_FRAME_POINTER=y CONFIG_UNWIND_INFO=y CONFIG_FORCED_INLINING=y CONFIG_EARLY_PRINTK=y CONFIG_DEBUG_STACKOVERFLOW=y CONFIG_DEBUG_STACK_USAGE=y CONFIG_STACK_BACKTRACE_COLS=2 CONFIG_DEBUG_PAGEALLOC=y CONFIG_DEBUG_RODATA=y CONFIG_4KSTACKS=y CONFIG_X86_FIND_SMP_CONFIG=y CONFIG_X86_MPPARSE=y CONFIG_DOUBLEFAULT=y CONFIG_CRYPTO=y CONFIG_CRYPTO_HMAC=y CONFIG_CRYPTO_ARC4=m CONFIG_CRC_CCITT=m CONFIG_CRC32=y CONFIG_ZLIB_INFLATE=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_X86_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y CONFIG_KTIME_SCALAR=y -- Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de - - You are tricky, but never to the point of dishonesty. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-22 23:36 REGRESSION: the new i386 timer code fails to sync CPUs Matthias Urlichs
@ 2006-07-23 0:36 ` Andrew Morton
  2006-07-23 8:16 ` Matthias Urlichs
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-07-23 0:36 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: linux-kernel, johnstul, torvalds, bunk, lethal, hirofumi

On Sun, 23 Jul 2006 01:36:38 +0200
Matthias Urlichs <smurf@smurf.noris.de> wrote:

> Hi,
>
> the change 5d0cf410e94b1f1ff852c3f210d22cc6c5a27ffa
> [PATCH] Time: i386 Clocksource Drivers
>
> Implement the time sources for i386 (acpi_pm, cyclone, hpet, pit, and tsc).
> With this patch, the conversion of the i386 arch to the generic timekeeping
> code should be complete.
>
> The patch should be fairly straight forward, only adding the new clocksources.
>
> causes the clocks of the two CPUs in my dual-Xeon server to lose
> (or, maybe, never gain) sync.
>
> Before this change, they're in sync; afterwards, they're not.
>
> This is a problem because, as soon as the system decides to switch CPUs
> while a program is sleeping (which happens quite early during boot-up),
> that sleep takes a *long* time. :-/
>
> Checked by simply running "date" repeatedly. Thanks to Linux' superb
> scheduler, this command reliably runs on alternate CPUs, thereby
> demonstrating the problem, and I didn't have to resort to "taskset".
>
>
> Boot log:
>
> Linux version 2.6.17-test-1.29 (root@kiste) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #1 SMP PREEMPT Sun Jul 23 01:05:35 CEST 2006

What is 2.6.17-test-1.29?

How do you know that 5d0cf410e94b1f1ff852c3f210d22cc6c5a27ffa caused this?

> checking TSC synchronization across 4 CPUs:

That code's a bit sick. But nothing much has changed in there.

> Time: tsc clocksource has been installed.

OK. Are you able to test the below? It should fix up the reporting.

Are you able to compare the present bootlog with the 2.6.17 bootlog?

AFAICT the "fixed it up" claim is simply untrue. Odd.

--- a/arch/i386/kernel/smpboot.c~synchronize_tsc-fixes
+++ a/arch/i386/kernel/smpboot.c
@@ -215,7 +215,7 @@ valid_k7:
 static atomic_t tsc_start_flag = ATOMIC_INIT(0);
 static atomic_t tsc_count_start = ATOMIC_INIT(0);
 static atomic_t tsc_count_stop = ATOMIC_INIT(0);
-static unsigned long long tsc_values[NR_CPUS];
+static unsigned long long __initdata tsc_values[NR_CPUS];
 
 #define NR_LOOPS 5
 
@@ -286,7 +286,6 @@ static void __init synchronize_tsc_bp (v
 	avg = sum;
 	do_div(avg, num_booting_cpus());
 
-	sum = 0;
 	for (i = 0; i < NR_CPUS; i++) {
 		if (!cpu_isset(i, cpu_callout_map))
 			continue;
@@ -297,7 +296,8 @@ static void __init synchronize_tsc_bp (v
 		 * We report bigger than 2 microseconds clock differences.
 		 */
 		if (delta > 2*one_usec) {
-			long realdelta;
+			long long realdelta;
+
 			if (!buggy) {
 				buggy = 1;
 				printk("\n");
@@ -307,12 +307,10 @@ static void __init synchronize_tsc_bp (v
 			if (tsc_values[i] < avg)
 				realdelta = -realdelta;
 
-			if (realdelta > 0)
-				printk(KERN_INFO "CPU#%d had %ld usecs TSC "
+			if (realdelta)
+				printk(KERN_INFO "CPU#%d had %Ld usecs TSC "
 					"skew, fixed it up.\n", i, realdelta);
 		}
-
-		sum += delta;
 	}
 	if (!buggy)
 		printk("passed.\n");
_

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-23 0:36 ` Andrew Morton
@ 2006-07-23 8:16 ` Matthias Urlichs
  2006-07-23 11:46 ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-23 8:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, johnstul, torvalds, bunk, lethal, hirofumi

Hi,

Andrew Morton:
> What is 2.6.17-test-1.29?

My test build; standard kernel during bisection.

> How do you know that 5d0cf410e94b1f1ff852c3f210d22cc6c5a27ffa caused this?
>
git bisect.

> Are you able to test the below? It should fix up the reporting.
>
Applied.

> Are you able to compare the present bootlog with the 2.6.17 bootlog?
>
Sure. The diff says:

 checking TSC synchronization across 4 CPUs:
+CPU#0 had 748437 usecs TSC skew, fixed it up.
+CPU#1 had 748437 usecs TSC skew, fixed it up.
+CPU#2 had -748437 usecs TSC skew, fixed it up.
+CPU#3 had -748437 usecs TSC skew, fixed it up.
 Brought up 4 CPUs
-migration_cost=4000,8000
+migration_cost=85,1724

... but apparently, that skew is not corrected.

These numbers do match the difference in observed "date" outputs.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
The first place you look for something is the last place you'd expect to find it.

^ permalink raw reply [flat|nested] 25+ messages in thread
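Editorial note: the reported figures are each CPU's deviation from the mean of the sampled TSC values, converted to microseconds. The following is a minimal user-space sketch of that arithmetic, modelled loosely on the patch context in this thread. It is not the kernel's synchronize_tsc_bp(); the name tsc_skew_usecs() and its parameters are hypothetical, and it assumes the sum of all samples fits in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): signed deviation of one CPU's
 * TSC sample from the mean of all samples, in microseconds.
 * Deviations of 2 usecs or less are reported as 0, mirroring the
 * "bigger than 2 microseconds" threshold in the patch context.
 * Assumption: the sum of all samples does not overflow 64 bits.
 */
static int64_t tsc_skew_usecs(const uint64_t *tsc, size_t ncpus,
			      size_t cpu, uint64_t cycles_per_usec)
{
	uint64_t sum = 0, avg, delta;
	int64_t usecs;
	size_t i;

	assert(ncpus > 0 && cpu < ncpus && cycles_per_usec > 0);

	for (i = 0; i < ncpus; i++)
		sum += tsc[i];
	avg = sum / ncpus;

	/* Take the absolute difference in unsigned arithmetic. */
	delta = tsc[cpu] >= avg ? tsc[cpu] - avg : avg - tsc[cpu];
	if (delta <= 2 * cycles_per_usec)
		return 0;

	/* Convert cycles to usecs, then restore the sign. */
	usecs = (int64_t)(delta / cycles_per_usec);
	return tsc[cpu] < avg ? -usecs : usecs;
}
```

With two pairs of CPUs whose counters differ by a constant amount, the mean falls halfway between the pairs, so each pair is reported with the same magnitude and opposite sign, which is the pattern in the diff above.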
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-23 8:16 ` Matthias Urlichs
@ 2006-07-23 11:46 ` Andrew Morton
  2006-07-23 12:08 ` Matthias Urlichs
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-07-23 11:46 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: linux-kernel, johnstul, torvalds, bunk, lethal, hirofumi

On Sun, 23 Jul 2006 10:16:04 +0200
Matthias Urlichs <smurf@smurf.noris.de> wrote:

> > Are you able to compare the present bootlog with the 2.6.17 bootlog?
> >
> Sure. The diff says:
>
>  checking TSC synchronization across 4 CPUs:
> +CPU#0 had 748437 usecs TSC skew, fixed it up.
> +CPU#1 had 748437 usecs TSC skew, fixed it up.
> +CPU#2 had -748437 usecs TSC skew, fixed it up.
> +CPU#3 had -748437 usecs TSC skew, fixed it up.
>  Brought up 4 CPUs
> -migration_cost=4000,8000
> +migration_cost=85,1724
>
> ... but apparently, that skew is not corrected.
>
> These numbers do match the difference in observed "date" outputs.

From this I'll assume that

- CPU0 and CPU1 share a TSC and CPU2 and CPU3 share another TSC.

- write_tsc() simply doesn't work on this machine.

- Earlier kernels weren't able to modify the TSC either.

- Earlier kernels didn't use the TSC as a time source whereas this one
  does, hence the problems which you're observing.

Some or all of the below might be wrong, but I don't think so:

I assume that booting with clock=pit or clock=pmtmr fixes it?

It would be useful to check your 2.6.17 boot logs, see if we can work out
what 2.6.17 was using for a clock source.

We need to fix that "fixed it up" message because it just ain't so.

The new clocksource code needs to detect this and to mark the TSC source
as unstable, or otherwise unusable.

We _could_ fix the TSC skew up, by adjusting the rdtsc output by the
tsc_values[] entry wherever we read the TSC.

It would of course be better to make write_tsc() work. I wonder why it
doesn't?

^ permalink raw reply [flat|nested] 25+ messages in thread
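Editorial note: the "adjust the rdtsc output by the tsc_values[] entry" idea can be sketched in user space as a per-CPU offset applied on every read. This is only an illustration under stated assumptions, not a proposed kernel change: compute_tsc_offsets(), corrected_tsc() and SKETCH_MAX_CPUS are hypothetical names, CPU 0 is arbitrarily chosen as the reference, and the sketch assumes the counters tick at the same rate and differ only by a constant offset (something this thread does not establish).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 8	/* hypothetical limit for this sketch */

/*
 * Per-CPU correction, stored as a modular (mod 2^64) difference so that
 * negative offsets need no signed arithmetic.
 */
static uint64_t tsc_offset[SKETCH_MAX_CPUS];

/*
 * boot_tsc[i] is the value CPU i read at the common sampling point
 * (the role tsc_values[] plays in the thread). CPU 0 is the reference.
 */
static void compute_tsc_offsets(const uint64_t *boot_tsc, size_t ncpus)
{
	size_t i;

	assert(ncpus > 0 && ncpus <= SKETCH_MAX_CPUS);
	for (i = 0; i < ncpus; i++)
		tsc_offset[i] = boot_tsc[0] - boot_tsc[i];
}

/* Apply the stored offset to a raw counter value read on the given CPU. */
static uint64_t corrected_tsc(uint64_t raw, size_t cpu)
{
	assert(cpu < SKETCH_MAX_CPUS);
	return raw + tsc_offset[cpu];
}
```

A constant offset like this cannot compensate for counters that drift relative to each other; in that case the corrected values would diverge again over time.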
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-23 11:46 ` Andrew Morton
@ 2006-07-23 12:08 ` Matthias Urlichs
  2006-07-23 12:37 ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-23 12:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, johnstul, torvalds, bunk, lethal, hirofumi

Hi,

Andrew Morton:
> - CPU0 and CPU1 share a TSC and CPU2 and CPU3 share another TSC.
>
That makes sense, since they're one dual-core Xeon each.

> - Earlier kernels didn't use the TSC as a time source whereas this one
>   does, hence the problems which you're observing.
>
Correct; see below.

> I assume that booting with clock=pit or clock=pmtmr fixes it?
>
Testing... yes, both.

> It would be useful to check your 2.6.17 boot logs, see if we can work out
> what 2.6.17 was using for a clock source.
>
That's easy:

2.6.17 -Using pmtmr for high-res timesource
2.6.18git +Time: tsc clocksource has been installed.

I missed those two lines, as in the boot logs they're not really
adjacent, so they got lost in the jumble of other differences.

Interestingly, CPU0/1 gets 6000 bogomips while CPU2/3 only reaches 5600 ..?
(That happens with both kernels.) I do wonder why, and whether this has any
bearing on the current problem.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
BEANS ARE NEITHER FRUIT NOR MUSICAL
		-- Bart Simpson on chalkboard in episode 1F22

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-23 12:08 ` Matthias Urlichs
@ 2006-07-23 12:37 ` Andrew Morton
  2006-07-23 12:58 ` Matthias Urlichs
  ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Andrew Morton @ 2006-07-23 12:37 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: linux-kernel, johnstul, torvalds, bunk, lethal, hirofumi

On Sun, 23 Jul 2006 14:08:29 +0200
Matthias Urlichs <smurf@smurf.noris.de> wrote:

> Hi,
>
> Andrew Morton:
> > - CPU0 and CPU1 share a TSC and CPU2 and CPU3 share another TSC.
> >
> That makes sense, since they're one dual-core Xeon each.

OK.

> > - Earlier kernels didn't use the TSC as a time source whereas this one
> >   does, hence the problems which you're observing.
> >
> Correct; see below.
>
> > I assume that booting with clock=pit or clock=pmtmr fixes it?
> >
> Testing... yes, both.
>
> > It would be useful to check your 2.6.17 boot logs, see if we can work out
> > what 2.6.17 was using for a clock source.
> >
> That's easy:
>
> 2.6.17 -Using pmtmr for high-res timesource
> 2.6.18git +Time: tsc clocksource has been installed.
>
> I missed those two lines, as in the boot logs they're not really
> adjacent, so they got lost in the jumble of other differences.

OK, thanks. Marking the TSC as bad in this case is simple to do - let us
let John work out the best way.

We must have lost a TSC sanity check somewhere along the way. I wonder
what it was?

> Interestingly, CPU0/1 gets 6000 bogomips while CPU2/3 only reaches 5600 ..?
> (That happens with both kernels.) I do wonder why, and whether this has any
> bearing on the current problem.

I wouldn't expect it to matter, unless the TSCs are running at different
speeds or something.

Also the sched-domain migration costs are grossly different between the two
kernels. Maybe we changed the migration-cost-estimation code; I forget.

I'll see if we can get an expert opinion on the write_tsc() failure.

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-23 12:37 ` Andrew Morton
@ 2006-07-23 12:58 ` Matthias Urlichs
  2006-07-24 15:52 ` Siddha, Suresh B
  2006-07-24 15:58 ` john stultz
  2 siblings, 0 replies; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-23 12:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, johnstul, torvalds, bunk, lethal, hirofumi

Hi,

Andrew Morton:
> Also the sched-domain migration costs are grossly different between the two
> kernels. Maybe we changed the migration-cost-estimation code; I forget.
>
The old values look suspicious. 4000 and 8000 ?? Maybe there was some
excess delay or wait time in the estimator.

The only code in the kernel I'd accept to take exactly 4000 of
*anything* without further investigation is a call to "mdelay(4)". ;-)

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
Silence is the element in which great things fashion themselves.
		-- Thomas Carlyle

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-23 12:37 ` Andrew Morton
  2006-07-23 12:58 ` Matthias Urlichs
@ 2006-07-24 15:52 ` Siddha, Suresh B
  2006-07-24 15:58 ` john stultz
  2 siblings, 0 replies; 25+ messages in thread
From: Siddha, Suresh B @ 2006-07-24 15:52 UTC (permalink / raw)
  To: smurf
  Cc: Matthias Urlichs, linux-kernel, johnstul, torvalds, bunk, lethal, hirofumi, akpm

On Sun, Jul 23, 2006 at 05:37:55AM -0700, Andrew Morton wrote:
> On Sun, 23 Jul 2006 14:08:29 +0200
> Matthias Urlichs <smurf@smurf.noris.de> wrote:
> > Interestingly, CPU0/1 gets 6000 bogomips while CPU2/3 only reaches 5600 ..?
> > (That happens with both kernels.) I do wonder why, and whether this has any
> > bearing on the current problem.
>
> I wouldn't expect it to matter, unless the TSCs are running at different
> speeds or something.

Matthias, can you send us the /proc/cpuinfo output of your system?

From your config it looks like CPU_FREQ is disabled. Perhaps you can try
with CPU_FREQ enabled in your config and see if you see the same issue?

thanks,
suresh

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-23 12:37 ` Andrew Morton
  2006-07-23 12:58 ` Matthias Urlichs
  2006-07-24 15:52 ` Siddha, Suresh B
@ 2006-07-24 15:58 ` john stultz
  2006-07-24 17:17 ` Matthias Urlichs
  2006-07-24 17:39 ` Andi Kleen
  2 siblings, 2 replies; 25+ messages in thread
From: john stultz @ 2006-07-24 15:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthias Urlichs, linux-kernel, torvalds, bunk, lethal, hirofumi, Andi Kleen

On Sun, 2006-07-23 at 05:37 -0700, Andrew Morton wrote:
> On Sun, 23 Jul 2006 14:08:29 +0200
> Matthias Urlichs <smurf@smurf.noris.de> wrote:
>
> > Hi,
> >
> > Andrew Morton:
> > > - CPU0 and CPU1 share a TSC and CPU2 and CPU3 share another TSC.
> > >
> > That makes sense, since they're one dual-core Xeon each.
>
> OK.
>
> > > - Earlier kernels didn't use the TSC as a time source whereas this one
> > >   does, hence the problems which you're observing.
> > >
> > Correct; see below.
> >
> > > I assume that booting with clock=pit or clock=pmtmr fixes it?
> > >
> > Testing... yes, both.
> >
> > > It would be useful to check your 2.6.17 boot logs, see if we can work out
> > > what 2.6.17 was using for a clock source.
> > >
> > That's easy:
> >
> > 2.6.17 -Using pmtmr for high-res timesource
> > 2.6.18git +Time: tsc clocksource has been installed.
> >
> > I missed those two lines, as in the boot logs they're not really
> > adjacent, so they got lost in the jumble of other differences.
>
> OK, thanks. Marking the TSC as bad in this case is simple to do - let us
> let John work out the best way.
>
> We must have lost a TSC sanity check somewhere along the way. I wonder
> what it was?

Well, I changed the TSC vs ACPI PM timer priority ordering to be more
like x86-64 (Andi had a similar patch he was proposing as well).
For a while suse/redhat kernels have been swapping them, as the TSC gives
such a performance boost, however the ACPI PM timer is usually the safer
option (distro customers are often told to use clock=pmtmr on some
boxes).

I'll see what we can do to narrow it down, but it's been assumed by both
x86-64 and the new i386 code that the TSCs on Intel SMP boxes are
synched, unless we're explicitly told they aren't (Summit, etc).

With the current code it is trivial to mark the TSC as unstable and the
system will automatically fall back to the next best clocksource. The
difficulty is just making sure we've got all the cases covered without
needlessly disqualifying synced systems.

Andi: If this is a generic issue, and not specific to Matthias' box, we
may need to re-think the assumption that Intel SMP is synced. You're
thoughts?

> > Interestingly, CPU0/1 gets 6000 bogomips while CPU2/3 only reaches 5600 ..?
> > (That happens with both kernels.) I do wonder why, and whether this has any
> > bearing on the current problem.
>
> I wouldn't expect it to matter, unless the TSCs are running at different
> speeds or something.

Matthias: "clock=pmtmr" is probably the best workaround in the short
term. Could you send me your dmesg and dmidecode output? We'll try to
find something to key off of so it will mark the tsc as unstable by
default on your system.

thanks
-john

^ permalink raw reply [flat|nested] 25+ messages in thread
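Editorial note: the "mark the TSC as unstable and fall back to the next best clocksource" behaviour can be pictured as a rating-based selection that skips disqualified entries. The sketch below is a simplified user-space model, not the kernel's clocksource API: struct sketch_clocksource, its unstable flag, and pick_clocksource() are hypothetical, and any rating numbers passed in are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a registered time source. */
struct sketch_clocksource {
	const char *name;	/* e.g. "tsc", "acpi_pm", "pit" */
	int rating;		/* higher is preferred; values are made up */
	int unstable;		/* non-zero once the source is disqualified */
};

/*
 * Return the highest-rated source that has not been marked unstable,
 * or NULL if none is usable. On equal ratings the earlier entry wins.
 */
static const struct sketch_clocksource *
pick_clocksource(const struct sketch_clocksource *cs, size_t n)
{
	const struct sketch_clocksource *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (cs[i].unstable)
			continue;
		if (best == NULL || cs[i].rating > best->rating)
			best = &cs[i];
	}
	return best;
}
```

In this model, disqualifying the TSC after a failed sync check amounts to setting its unstable flag and calling pick_clocksource() again; the hard part discussed in the thread, deciding when to set that flag, is outside the sketch.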
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-24 15:58 ` john stultz
@ 2006-07-24 17:17 ` Matthias Urlichs
  2006-07-24 17:51 ` Andi Kleen
  2006-07-24 17:39 ` Andi Kleen
  1 sibling, 1 reply; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-24 17:17 UTC (permalink / raw)
  To: john stultz
  Cc: Andrew Morton, linux-kernel, torvalds, bunk, lethal, hirofumi, Andi Kleen

Hi,

john stultz:
> I'll see what we can do to narrow it down, but it's been assumed by both
> x86-64 and the new i386 code that the TSCs on Intel SMP boxes are
> synched, unless we're explicitly told they aren't (Summit, etc).
>
Apparently not. :-/

> Andi: If this is a generic issue, and not specific to Matthias' box, we
> may need to re-think the assumption that Intel SMP is synced. You're
> thoughts?
>
"Your". ;-)

You can probably assume that they're synced on systems with no more
than one dual-core / hyperthreaded CPU.

My system obviously has two of those.

> Matthias: "clock=pmtmr" is probably the best workaround in the short
> term. Could you send me your dmesg and dmidecode output? We'll try to
> find something to key off of so it will mark the tsc as unstable by
> default on your system.
>
I'd assume that finding (and, possibly, being unable to correct) TSC skew
is sufficient. Whether it's also necessary (in the mathematical sense ;-)
for the problem to exist is a question somebody else might want to answer
(or not).

Linux version 2.6.17-test-1.29 (root@kiste) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #2 SMP PREEMPT Sun Jul 23 09:00:44 CEST 2006 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009b000 (usable) BIOS-e820: 000000000009b000 - 00000000000a0000 (reserved) BIOS-e820: 00000000000ca000 - 00000000000cc000 (reserved) BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 00000000d7f70000 (usable) BIOS-e820: 00000000d7f70000 - 00000000d7f7b000 (ACPI data) BIOS-e820: 00000000d7f7b000 - 00000000d7f80000 (ACPI NVS) BIOS-e820: 00000000d7f80000 - 00000000d8000000 (reserved) BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved) BIOS-e820: 00000000fffffc00 - 0000000100000000 (reserved) BIOS-e820: 0000000100000000 - 0000000128000000 (usable) Warning only 4GB will be used. Use a PAE enabled kernel. 3200MB HIGHMEM available. 896MB LOWMEM available. found SMP MP-table at 000f6700 On node 0 totalpages: 1048576 DMA zone: 4096 pages, LIFO batch:0 Normal zone: 225280 pages, LIFO batch:31 HighMem zone: 819200 pages, LIFO batch:31 DMI present. ACPI: RSDP (v000 PTLTD ) @ 0x000f6750 ACPI: RSDT (v001 PTLTD RSDT 0x06040000 LTP 0x00000000) @ 0xd7f75ea1 ACPI: FADT (v001 INTEL TUMWATER 0x06040000 PTL 0x00000003) @ 0xd7f7ae35 ACPI: MADT (v001 PTLTD APIC 0x06040000 LTP 0x00000000) @ 0xd7f7aea9ACPI: BOOT (v001 PTLTD $SBFTBL$ 0x06040000 LTP 0x00000001) @ 0xd7f7af39 ACPI: MCFG (v001 PTLTD Mcfg 0x06040000 LTP 0x00000000) @ 0xd7f7af61ACPI: ASF! 
(v016 CETP CETP 0x06040000 PTL 0x00000001) @ 0xd7f7af9d ACPI: SSDT (v001 PmRef CpuPm 0x00003000 INTL 0x20030224) @ 0xd7f75edd ACPI: DSDT (v001 Intel Lindhrst 0x06040000 MSFT 0x0100000e) @ 0x00000000 ACPI: PM-Timer IO Port: 0x1008 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 15:4 APIC version 20 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled) Processor #6 15:4 APIC version 20 ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) Processor #1 15:4 APIC version 20 ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled) Processor #7 15:4 APIC version 20 ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1]) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 ACPI: IOAPIC (id[0x03] address[0xfec10000] gsi_base[24]) IOAPIC[1]: apic_id 3, version 32, address 0xfec10000, GSI 24-47 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 2 I/O APICs Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at f1000000 (gap: f0000000:0ec00000) Detected 3000.267 MHz processor. Built 1 zonelists. Total pages: 1048576 Kernel command line: root=/dev/md1 ro break mapped APIC to ffffd000 (fee00000) mapped IOAPIC to ffffc000 (fec00000) mapped IOAPIC to ffffb000 (fec10000) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. 
Initializing CPU#0 PID hash table entries: 4096 (order: 12, 16384 bytes) Console: colour VGA+ 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 3483476k/4194304k available (1724k kernel code, 53692k reserved, 982k data, 204k init, 2620864k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 6004.95 BogoMIPS (lpj=12009910)Mount-cache hash table entries: 512 CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. using mwait in idle threads. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 0 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU0: Intel P4/Xeon Extended MCE MSRs (24) available Checking 'hlt' instruction... OK. Freeing SMP alternatives: 12k freed ACPI: Core revision 20060608 CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Booting processor 1/1 eip 2000 Initializing CPU#1 Calibrating delay using timer specific routine.. 6000.69 BogoMIPS (lpj=12001383)CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 0 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. 
CPU1: Intel P4/Xeon Extended MCE MSRs (24) available CPU1: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Booting processor 2/6 eip 2000 Initializing CPU#2 Calibrating delay using timer specific routine.. 5600.72 BogoMIPS (lpj=11201451)CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 3 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#2. CPU2: Intel P4/Xeon Extended MCE MSRs (24) available CPU2: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Booting processor 3/7 eip 2000 Initializing CPU#3 Calibrating delay using timer specific routine.. 5600.72 BogoMIPS (lpj=11201442)CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 CPU: After vendor identify, caps: bfebfbff 20100000 00000000 00000000 0000641d 00000000 00000000 monitor/mwait feature present. CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 2048K CPU: Physical Processor ID: 3 CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000641d 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#3. CPU3: Intel P4/Xeon Extended MCE MSRs (24) available CPU3: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03 Total of 4 processors activated (23207.09 BogoMIPS). ENABLING IO-APIC IRQs ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 checking TSC synchronization across 4 CPUs: CPU#0 had 748437 usecs TSC skew, fixed it up. CPU#1 had 748437 usecs TSC skew, fixed it up. CPU#2 had -748437 usecs TSC skew, fixed it up. CPU#3 had -748437 usecs TSC skew, fixed it up. 
Brought up 4 CPUs migration_cost=85,1724 checking if image is initramfs... it is Freeing initrd memory: 4192k freed NET: Registered protocol family 16 ACPI: bus type pci registered PCI: Using MMCONFIG Setting up standard PCI resources ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (0000:00) PCI: Probing PCI hardware (bus 00) PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO PCI quirk: region 1180-11bf claimed by ICH4 GPIO PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1 Boot video device is 0000:04:03.0 PCI: Transparent bridge - 0000:00:1e.0 ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIX._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 10 11 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *10 11 14 15) ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 *11 14 15) ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 *11 14 15) ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKF] (IRQs 4 5 6 7 10 11 14 15) *3 ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 *10 11 14 15) ACPI: PCI Interrupt Link [LNKH] (IRQs 4 5 6 7 *10 11 14 15) Linux Plug and Play Support v0.97 (c) Adam Belay PCI: Using ACPI for IRQ routing PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report PCI: Bridge: 0000:00:02.0 IO window: disabled. MEM window: da100000-da1fffff PREFETCH window: disabled. PCI: Bridge: 0000:00:03.0 IO window: disabled. MEM window: da200000-da2fffff PREFETCH window: disabled. PCI: Bridge: 0000:00:1c.0 IO window: 2000-2fff MEM window: da300000-da3fffff PREFETCH window: de000000-dfffffff PCI: Bridge: 0000:05:04.0 IO window: 4000-4fff MEM window: dc000000-dc0fffff PREFETCH window: disabled. PCI: Bridge: 0000:04:02.0 IO window: 4000-4fff MEM window: dc000000-dc0fffff PREFETCH window: disabled. 
PCI: Bridge: 0000:00:1e.0 IO window: 3000-4fff MEM window: da400000-dc0fffff PREFETCH window: f1000000-f10fffff ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 169 PCI: Setting latency timer of device 0000:00:02.0 to 64 ACPI: PCI Interrupt 0000:00:03.0[A] -> GSI 16 (level, low) -> IRQ 169 PCI: Setting latency timer of device 0000:00:03.0 to 64 PCI: Setting latency timer of device 0000:00:1e.0 to 64 NET: Registered protocol family 2 IP route cache hash table entries: 131072 (order: 7, 524288 bytes) TCP established hash table entries: 131072 (order: 9, 3145728 bytes) TCP bind hash table entries: 65536 (order: 8, 1572864 bytes) TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered Simple Boot Flag at 0x36 set to 0x1 Machine check exception polling timer started. highmem bounce pool size: 64 pages Initializing Cryptographic API io scheduler noop registered io scheduler cfq registered (default) isapnp: Scanning for PnP cards... isapnp: No Plug & Play device found Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize PNP: No PS/2 controller found. Probing ports directly. serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 mice: PS/2 mouse device common for all mice TCP bic registered NET: Registered protocol family 1 Starting balanced_irq Using IPI Shortcut mode Freeing unused kernel memory: 204k freed Time: tsc clocksource has been installed. [ snip -- /sbin/init gets called here ] # dmidecode 2.7 SMBIOS 2.33 present. 42 structures occupying 1394 bytes. Table at 0x000EFC60. Handle 0x0000, DMI type 0, 20 bytes. 
BIOS Information Vendor: Phoenix Technologies LTD Version: 6.00 Release Date: 09/29/2005 Address: 0xE1C80 Runtime Size: 123776 bytes ROM Size: 1024 kB Characteristics: ISA is supported PCI is supported PC Card (PCMCIA) is supported PNP is supported APM is supported BIOS is upgradeable BIOS shadowing is allowed ESCD support is available USB legacy is supported Smart battery is supported BIOS boot specification is supported Handle 0x0001, DMI type 1, 25 bytes. System Information Manufacturer: Intel Corporation Product Name: Nocona/Tumwater Customer Reference Board Version: Revision A0 Serial Number: 0123456789 UUID: 0A0A0A0A-0A0A-0A0A-0A0A-0A0A0A0A0A0A Wake-up Type: Power Switch Handle 0x0002, DMI type 2, 8 bytes. Base Board Information Manufacturer: Intel Corporation Product Name: TYAN Tiger-i7320-S5350 Version: Revision A0 Serial Number: 9876543210 Handle 0x0003, DMI type 3, 17 bytes. Chassis Information Manufacturer: No Enclosure Type: Other Lock: Not Present Version: N/A Serial Number: None Asset Tag: No Asset Tag Boot-up State: Safe Power Supply State: Safe Thermal State: Safe Security Status: None OEM Information: 0x00001234 Handle 0x0004, DMI type 4, 35 bytes. 
Processor Information Socket Designation: SOCKET603/604 Type: Central Processor Family: Pentium 4 Manufacturer: Intel ID: 43 0F 00 00 FF FB EB BF Signature: Type 0, Family 15, Model 4, Stepping 3 Flags: FPU (Floating-point unit on-chip) VME (Virtual mode extension) DE (Debugging extension) PSE (Page size extension) TSC (Time stamp counter) MSR (Model specific registers) PAE (Physical address extension) MCE (Machine check exception) CX8 (CMPXCHG8 instruction supported) APIC (On-chip APIC hardware supported) SEP (Fast system call) MTRR (Memory type range registers) PGE (Page global enable) MCA (Machine check architecture) CMOV (Conditional move instruction supported) PAT (Page attribute table) PSE-36 (36-bit page size extension) CLFSH (CLFLUSH instruction supported) DS (Debug store) ACPI (ACPI supported) MMX (MMX technology supported) FXSR (Fast floating-point save and restore) SSE (Streaming SIMD extensions) SSE2 (Streaming SIMD extensions 2) SS (Self-snoop) HTT (Hyper-threading technology) TM (Thermal monitor supported) PBE (Pending break enabled) Version: A0 Voltage: 1.4 V External Clock: Unknown Max Speed: 3600 MHz Current Speed: 3000 MHz Status: Populated, Enabled Upgrade: Slot 1 L1 Cache Handle: 0x0005 L2 Cache Handle: 0x0006 L3 Cache Handle: Not Provided Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0005, DMI type 7, 19 bytes. Cache Information Socket Designation: L1 Cache Configuration: Enabled, Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 16 KB Maximum Size: 16 KB Supported SRAM Types: Burst Pipeline Burst Asynchronous Installed SRAM Type: Asynchronous Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: Unknown Handle 0x0006, DMI type 7, 19 bytes. 
Cache Information Socket Designation: L2 Cache Configuration: Enabled, Socketed, Level 2 Operational Mode: Write Back Location: Internal Installed Size: 2048 KB Maximum Size: 512 KB Supported SRAM Types: Burst Pipeline Burst Asynchronous Installed SRAM Type: Burst Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: Unknown Handle 0x0007, DMI type 7, 19 bytes. Cache Information Socket Designation: L3 Cache Configuration: Enabled, Socketed, Level 3 Operational Mode: Write Back Location: Internal Installed Size: 2048 KB Maximum Size: 512 KB Supported SRAM Types: Burst Pipeline Burst Asynchronous Installed SRAM Type: Burst Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: Unknown Handle 0x0008, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: J2A1 Internal Connector Type: 9 Pin Dual Inline (pin 10 cut) External Reference Designator: COM 1 External Connector Type: DB-9 male Port Type: Serial Port 16550A Compatible Handle 0x0009, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: J3A1 Internal Connector Type: 25 Pin Dual Inline (pin 26 cut) External Reference Designator: Parallel External Connector Type: DB-25 female Port Type: Parallel Port ECP/EPP Handle 0x000A, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: J1A1 Internal Connector Type: None External Reference Designator: Keyboard External Connector Type: Circular DIN-8 male Port Type: Keyboard Port Handle 0x000B, DMI type 8, 9 bytes. Port Connector Information Internal Reference Designator: J1A1 Internal Connector Type: None External Reference Designator: PS/2 Mouse External Connector Type: Circular DIN-8 male Port Type: Keyboard Port Handle 0x000C, DMI type 9, 13 bytes. 
System Slot Information Designation: PCI/33 Slot #3 - J8B2 Type: 32-bit PCI Current Usage: Unknown Length: Long ID: 0 Characteristics: 5.0 V is provided 3.3 V is provided Handle 0x000D, DMI type 10, 6 bytes. On Board Device Information Type: Sound Status: Disabled Description: ADI1886 Handle 0x000E, DMI type 11, 5 bytes. OEM Strings String 1: Intel Nocona/Lindenhurst String 2: CRB - ROADRUNNER Handle 0x000F, DMI type 12, 5 bytes. System Configuration Options Option 1: Jumper settings can be described here. Handle 0x0010, DMI type 15, 29 bytes. System Event Log Area Length: 32 bytes Header Start Offset: 0x0000 Header Length: 16 bytes Data Start Offset: 0x0010 Access Method: General-purpose non-volatile data functions Access Address: 0x0000 Status: Invalid, Not Full Change Token: 0x00000001 Header Format: Type 1 Supported Log Type Descriptors: 3 Descriptor 1: POST error Data Format 1: POST results bitmap Descriptor 2: Single-bit ECC memory error Data Format 2: Multiple-event Descriptor 3: Multi-bit ECC memory error Data Format 3: Multiple-event Handle 0x0011, DMI type 16, 15 bytes. Physical Memory Array Location: System Board Or Motherboard Use: System Memory Error Correction Type: None Maximum Capacity: 4 GB Error Information Handle: Not Provided Number Of Devices: 2 Handle 0x0012, DMI type 17, 27 bytes. Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: DIMM Set: 1 Locator: J3B1 Bank Locator: DIMM A1 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0013, DMI type 17, 27 bytes. 
Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: DIMM Set: 1 Locator: J3B3 Bank Locator: DIMM A2 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0014, DMI type 17, 27 bytes. Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: 72 bits Data Width: 64 bits Size: 1024 MB Form Factor: DIMM Set: 1 Locator: J2B2 Bank Locator: DIMM A3 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0015, DMI type 17, 27 bytes. Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: 72 bits Data Width: 64 bits Size: 1024 MB Form Factor: DIMM Set: 1 Locator: J2B4 Bank Locator: DIMM A4 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0016, DMI type 17, 27 bytes. Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: DIMM Set: 1 Locator: J3B2 Bank Locator: DIMM B1 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0017, DMI type 17, 27 bytes. Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: DIMM Set: 1 Locator: J2B1 Bank Locator: DIMM B2 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0018, DMI type 17, 27 bytes. 
Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: 72 bits Data Width: 64 bits Size: 1024 MB Form Factor: DIMM Set: 1 Locator: J2B3 Bank Locator: DIMM B3 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x0019, DMI type 17, 27 bytes. Memory Device Array Handle: 0x0011 Error Information Handle: No Error Total Width: 72 bits Data Width: 64 bits Size: 1024 MB Form Factor: DIMM Set: 1 Locator: J1B1 Bank Locator: DIMM B4 Type: DDR Type Detail: Synchronous Speed: 166 MHz (6.0 ns) Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Handle 0x001A, DMI type 19, 15 bytes. Memory Array Mapped Address Starting Address: 0x00000000000 Ending Address: 0x000FFFFFFFF Range Size: 4 GB Physical Array Handle: 0x0011 Partition Width: 0 Handle 0x001B, DMI type 20, 19 bytes. Memory Device Mapped Address Starting Address: 0x00000000000 Ending Address: 0x000000003FF Range Size: 1 kB Physical Device Handle: 0x0012 Memory Array Mapped Address Handle: 0x001A Partition Row Position: Unknown Interleave Position: Unknown Interleaved Data Depth: Unknown Handle 0x001C, DMI type 20, 19 bytes. Memory Device Mapped Address Starting Address: 0x00000000000 Ending Address: 0x000000003FF Range Size: 1 kB Physical Device Handle: 0x0013 Memory Array Mapped Address Handle: 0x001A Partition Row Position: Unknown Interleave Position: Unknown Interleaved Data Depth: Unknown Handle 0x001D, DMI type 23, 13 bytes. System Reset Status: Enabled Watchdog Timer: Present Boot Option: Do Not Reboot Boot Option On Limit: Do Not Reboot Reset Count: Unknown Reset Limit: Unknown Timer Interval: Unknown Timeout: Unknown Handle 0x001E, DMI type 24, 5 bytes. 
Hardware Security Power-On Password Status: Disabled Keyboard Password Status: Unknown Administrator Password Status: Enabled Front Panel Reset Status: Unknown Handle 0x001F, DMI type 25, 9 bytes. System Power Controls Next Scheduled Power-on: 12-31 23:59:59 Handle 0x0020, DMI type 26, 20 bytes. Voltage Probe Description: Voltage Probe Location: Processor Status: OK Maximum Value: Unknown Minimum Value: Unknown Resolution: Unknown Tolerance: Unknown Accuracy: Unknown OEM-specific Information: 0x00000000 Handle 0x0021, DMI type 27, 12 bytes. Cooling Device Temperature Probe Handle: 0x0022 Type: Fan Status: OK OEM-specific Information: 0x00000000 Handle 0x0022, DMI type 28, 20 bytes. Temperature Probe Description: Temperature Probe Location: Processor Status: OK Maximum Value: Unknown Minimum Value Unknown Resolution: Unknown Tolerance: Unknown Accuracy: Unknown OEM-specific Information: 0x00000000 Handle 0x0023, DMI type 29, 20 bytes. Electrical Current Probe Description: Electrical Current Probe Location: Processor Status: OK Maximum Value: Unknown Minimum Value: Unknown Resolution: Unknown Tolerance: Unknown Accuracy: Unknown OEM-specific Information: 0x00000000 Handle 0x0024, DMI type 30, 6 bytes. Out-of-band Remote Access Manufacturer Name: Intel Inbound Connection: Enabled Outbound Connection: Disabled Handle 0x0025, DMI type 32, 20 bytes. System Boot Information Status: <OUT OF SPEC> Handle 0x0026, DMI type 126, 4 bytes. Inactive Handle 0x0027, DMI type 127, 4 bytes. End Of Table Handle 0x0028, DMI type 129, 28 bytes. OEM-specific Type Header and Data: 81 1C 28 00 01 01 02 01 00 00 00 01 00 00 10 01 00 01 00 01 00 00 18 01 00 02 00 01 Strings: Intel_ASF_001 Intel_ASF_001 Handle 0x0029, DMI type 127, 4 bytes. End Of Table > thanks > -john > > -- Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de - - :recursion: n. See {recursion}. See also {tail recursion}. 
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-24 17:17 ` Matthias Urlichs
@ 2006-07-24 17:51 ` Andi Kleen
  2006-07-24 20:54 ` john stultz
  0 siblings, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2006-07-24 17:51 UTC (permalink / raw)
  To: Matthias Urlichs
  Cc: john stultz, Andrew Morton, linux-kernel, torvalds, bunk, lethal, hirofumi, asit.k.mallick

On Mon, Jul 24, 2006 at 07:17:11PM +0200, Matthias Urlichs wrote:
> > Andi: If this is a generic issue, and not specific to Matthias' box, we
> > may need to re-think the assumption that Intel SMP is synced. You're
> > thoughts?
> >
> "Your". ;-)
>
> You can probably assume that they're synced on systems with no more
> than one dual-core / hyperthreaded CPU.
>
> My system obviously has two of those.

According to Intel, on all of their chipset/motherboard reference
designs all the sockets run from a single clock crystal.

I've confirmed this for a long time on 64bit and even to some
extent on 32bit on distro kernels.

Maybe you got a broken BIOS or similar though.

> > Matthias: "clock=pmtmr" is probably the best workaround in the short
> > term. Could you send me your dmesg and dmidecode output? We'll try to
> > find something to key off of so it will mark the tsc as unstable by
> > default on your system.
> >
> I'd assume that finding (and, possibly, being unable to correct) TSC skew

The BIOS normally guarantees it at boot. However, maybe you got a broken one.

We used to do TSC sync correction at boot on Intel, but stopped doing
that when we found out that the TSC sync code adds an error
to an already perfectly synchronized system.

Actually I think i386 still does it, just x86-64 stopped.

My first assumption would be that you hit a bug somewhere in the new
clock code. What happens when you boot an older kernel (like 2.6.17)
with clock=tsc ?

> BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved)
> BIOS-e820: 00000000fffffc00 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 0000000128000000 (usable)
> Warning only 4GB will be used.

You should at least set CONFIG_HIGHMEM_64G or use a 64bit kernel
if the system does long mode.

> ENABLING IO-APIC IRQs
> ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> checking TSC synchronization across 4 CPUs:
> CPU#0 had 748437 usecs TSC skew, fixed it up.
> CPU#1 had 748437 usecs TSC skew, fixed it up.
> CPU#2 had -748437 usecs TSC skew, fixed it up.
> CPU#3 had -748437 usecs TSC skew, fixed it up.

Hmm, that looks unusual. Maybe the BIOS is really broken. On most
Intel systems I saw

Normally Linux should fix it up here and then the TSC should
tick synchronously though (but with a small offset that the sync
code cannot entirely avoid).

>
> Handle 0x0000, DMI type 0, 20 bytes.
> BIOS Information
> Vendor: Phoenix Technologies LTD
> Version: 6.00
> Release Date: 09/29/2005
>
> Handle 0x0001, DMI type 1, 25 bytes.
> System Information
> Manufacturer: Intel Corporation
> Product Name: Nocona/Tumwater Customer Reference Board
> Version: Revision A0

Hmm, those should definitely have a synced TSC. However, A0 sounds
suspiciously like an engineering sample; production systems normally
have higher revision numbers. If it's just a beta hardware bug we
probably won't care.

Asit, do you know of any issues with TSC sync between CPUs in that
board/BIOS version?

-Andi

> Serial Number: 0123456789
> UUID: 0A0A0A0A-0A0A-0A0A-0A0A-0A0A0A0A0A0A
> Wake-up Type: Power Switch
>
> Handle 0x0002, DMI type 2, 8 bytes.
> Base Board Information
> Manufacturer: Intel Corporation
> Product Name: TYAN Tiger-i7320-S5350
> Version: Revision A0
> Serial Number: 9876543210
>

^ permalink raw reply [flat|nested] 25+ messages in thread
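[Editorial illustration, not part of the original message.] The symmetric +748437 / -748437 usec figures in the quoted log are what one would expect if each CPU's counter is compared against the mean of all four: two CPUs at +d and two at -d put the pairs 2d apart, here roughly 1.5 seconds. The sketch below shows that arithmetic only. It assumes all counters are sampled at about the same instant, and `tsc_mean()`, `tsc_skew_usecs()` and the sign convention (positive means ahead of the mean) are hypothetical choices for illustration, not the kernel's synchronize_tsc_bp() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mean of the sampled per-CPU counter values (n must be nonzero). */
static int64_t tsc_mean(const int64_t *tsc, size_t n)
{
	int64_t sum = 0;

	assert(tsc != NULL && n > 0);
	for (size_t i = 0; i < n; i++)
		sum += tsc[i];
	return sum / (int64_t)n;
}

/*
 * Signed skew of one CPU relative to the mean, in microseconds.
 * cycles_per_usec is the counter frequency in MHz, e.g. about 3000
 * for the 3 GHz Xeons in this thread. Positive: ahead of the mean.
 */
static int64_t tsc_skew_usecs(int64_t tsc, int64_t mean, int64_t cycles_per_usec)
{
	assert(cycles_per_usec > 0);
	return (tsc - mean) / cycles_per_usec;
}
```

For example, with two CPUs sampled at 6000000 cycles and two at 0, at 3000 cycles per usec, the mean is 3000000 and the skews come out as +1000 and -1000 usecs, the same symmetric pattern as in the log.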
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-24 17:51 ` Andi Kleen
@ 2006-07-24 20:54 ` john stultz
  2006-07-30  9:03 ` Andrew Morton
  0 siblings, 1 reply; 25+ messages in thread
From: john stultz @ 2006-07-24 20:54 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Matthias Urlichs, Andrew Morton, linux-kernel, torvalds, bunk, lethal, hirofumi, asit.k.mallick

On Mon, 2006-07-24 at 19:51 +0200, Andi Kleen wrote:
> On Mon, Jul 24, 2006 at 07:17:11PM +0200, Matthias Urlichs wrote:
> > You can probably assume that they're synced on systems with no more
> > than one dual-core / hyperthreaded CPU.
> >
> > My system obviously has two of those.
>
> According to Intel, on all of their chipset/motherboard reference
> designs all the sockets run from a single clock crystal.
>
> I've confirmed this for a long time on 64bit and even to some
> extent on 32bit on distro kernels.

Andi: Which 32bit distro patch are you referring to here?

> Maybe you got a broken BIOS or similar though.
>
> > > Matthias: "clock=pmtmr" is probably the best workaround in the short
> > > term. Could you send me your dmesg and dmidecode output? We'll try to
> > > find something to key off of so it will mark the tsc as unstable by
> > > default on your system.
> > >
> > I'd assume that finding (and, possibly, being unable to correct) TSC skew
>
> The BIOS normally guarantees it at boot. However, maybe you got a broken one.
>
> We used to do TSC sync correction at boot on Intel, but stopped doing
> that when we found out that the TSC sync code adds an error
> to an already perfectly synchronized system.
>
> Actually I think i386 still does it, just x86-64 stopped.

Indeed, i386 still does it. I knew x86-64 had a new ia64-inspired
algorithm, but I didn't realize that they didn't even try to call it in
most cases.

The (untested) patch below will disable it on i386. Matthias, mind
trying it out to see if the TSC sync code is causing the problem?

> My first assumption would be that you hit a bug somewhere in the new
> clock code. What happens when you boot an older kernel (like 2.6.17)
> with clock=tsc ?

Yes, that would be good to confirm the issue. :)

thanks
-john


Hack out the i386 TSC sync code.


diff --git a/arch/i386/kernel/smpboot.c b/arch/i386/kernel/smpboot.c
index 6f5fea0..cd28914 100644
--- a/arch/i386/kernel/smpboot.c
+++ b/arch/i386/kernel/smpboot.c
@@ -435,7 +435,7 @@ static void __devinit smp_callin(void)
 	/*
 	 * Synchronize the TSC with the BP
 	 */
-	if (cpu_has_tsc && cpu_khz && !tsc_sync_disabled)
+	if (0 && cpu_has_tsc && cpu_khz && !tsc_sync_disabled)
 		synchronize_tsc_ap();
 }
 
@@ -1305,7 +1305,7 @@ static void __init smp_boot_cpus(unsigne
 	/*
 	 * Synchronize the TSC with the AP
 	 */
-	if (cpu_has_tsc && cpucount && cpu_khz)
+	if (0 && cpu_has_tsc && cpucount && cpu_khz)
 		synchronize_tsc_bp();
 }

^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-24 20:54 ` john stultz
@ 2006-07-30  9:03 ` Andrew Morton
  2006-07-30  9:49 ` Matthias Urlichs
  ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Andrew Morton @ 2006-07-30 9:03 UTC (permalink / raw)
  To: john stultz
  Cc: ak, smurf, linux-kernel, torvalds, bunk, lethal, hirofumi, asit.k.mallick

On Mon, 24 Jul 2006 13:54:03 -0700
john stultz <johnstul@us.ibm.com> wrote:

> On Mon, 2006-07-24 at 19:51 +0200, Andi Kleen wrote:
> > On Mon, Jul 24, 2006 at 07:17:11PM +0200, Matthias Urlichs wrote:
> > > You can probably assume that they're synced on systems with no more
> > > than one dual-core / hyperthreaded CPU.
> > >
> > > My system obviously has two of those.
> >
> > According to Intel, on all of their chipset/motherboard reference
> > designs all the sockets run from a single clock crystal.
> >
> > I've confirmed this for a long time on 64bit and even to some
> > extent on 32bit on distro kernels.
>
> Andi: Which 32bit distro patch are you referring to here?
>
> > Maybe you got a broken BIOS or similar though.
> >
> > > > Matthias: "clock=pmtmr" is probably the best workaround in the short
> > > > term. Could you send me your dmesg and dmidecode output? We'll try to
> > > > find something to key off of so it will mark the tsc as unstable by
> > > > default on your system.
> > > >
> > > I'd assume that finding (and, possibly, being unable to correct) TSC skew
> >
> > The BIOS normally guarantees it at boot. However, maybe you got a broken one.
> >
> > We used to do TSC sync correction at boot on Intel, but stopped doing
> > that when we found out that the TSC sync code adds an error
> > to an already perfectly synchronized system.
> >
> > Actually I think i386 still does it, just x86-64 stopped.
>
> Indeed, i386 still does it. I knew x86-64 had a new ia64-inspired
> algorithm, but I didn't realize that they didn't even try to call it in
> most cases.
>
> The (untested) patch below will disable it on i386. Matthias, mind
> trying it out to see if the TSC sync code is causing the problem?
>
>
> > My first assumption would be that you hit a bug somewhere in the new
> > clock code. What happens when you boot an older kernel (like 2.6.17)
> > with clock=tsc ?
>
> Yes, that would be good to confirm the issue. :)
>
> thanks
> -john
>
>
> Hack out the i386 TSC sync code.
>
>
> diff --git a/arch/i386/kernel/smpboot.c b/arch/i386/kernel/smpboot.c
> index 6f5fea0..cd28914 100644
> --- a/arch/i386/kernel/smpboot.c
> +++ b/arch/i386/kernel/smpboot.c
> @@ -435,7 +435,7 @@ static void __devinit smp_callin(void)
>  	/*
>  	 * Synchronize the TSC with the BP
>  	 */
> -	if (cpu_has_tsc && cpu_khz && !tsc_sync_disabled)
> +	if (0 && cpu_has_tsc && cpu_khz && !tsc_sync_disabled)
>  		synchronize_tsc_ap();
>  }
>
> @@ -1305,7 +1305,7 @@ static void __init smp_boot_cpus(unsigne
>  	/*
>  	 * Synchronize the TSC with the AP
>  	 */
> -	if (cpu_has_tsc && cpucount && cpu_khz)
> +	if (0 && cpu_has_tsc && cpucount && cpu_khz)
>  		synchronize_tsc_bp();
>  }

I guess Matthias didn't test this patch.

Can we get some obviously-correct fix in place for 2.6.18?

Also, I was rather hoping we'd be able to work out why write_tsc() isn't
working on this CPU. If that's fixable, that would be the best fix for
this bug, no? It is a "CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03".

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 9:03 ` Andrew Morton
@ 2006-07-30 9:49 ` Matthias Urlichs
  2006-07-30 20:10 ` Andi Kleen
  2006-07-31 14:24 ` Matthias Urlichs
  2 siblings, 0 replies; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-30 9:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: john stultz, ak, linux-kernel, torvalds, bunk, lethal, hirofumi,
      asit.k.mallick

Hi,

Andrew Morton:
> I guess Matthias didn't test this patch.

Not yet, sorry -- the thing is my main server, and customers tend to
dislike downtime. I've already got it scheduled for tonight.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
No matter where you go, there you are.
		-- Buckaroo Banzai

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 9:03 ` Andrew Morton
  2006-07-30 9:49 ` Matthias Urlichs
@ 2006-07-30 20:10 ` Andi Kleen
  2006-07-30 20:55 ` Andrew Morton
  2006-07-30 21:13 ` Matthias Urlichs
  2006-07-31 14:24 ` Matthias Urlichs
  2 siblings, 2 replies; 25+ messages in thread
From: Andi Kleen @ 2006-07-30 20:10 UTC (permalink / raw)
  To: Andrew Morton
  Cc: john stultz, smurf, linux-kernel, torvalds, bunk, lethal, hirofumi,
      asit.k.mallick

> I guess Matthias didn't test this patch. Can we get some obviously-correct
> fix in place for 2.6.18?

So far we don't have any idea what the problem is on that system.

> It is a "CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03".

Was that on that system? I guess it could be checked for and TSC
be forced off. It sounds like a real CPU bug however.

-Andi

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 20:10 ` Andi Kleen
@ 2006-07-30 20:55 ` Andrew Morton
  2006-07-30 21:13 ` Matthias Urlichs
  1 sibling, 0 replies; 25+ messages in thread
From: Andrew Morton @ 2006-07-30 20:55 UTC (permalink / raw)
  To: Andi Kleen
  Cc: johnstul, smurf, linux-kernel, torvalds, bunk, lethal, hirofumi,
      asit.k.mallick

On 30 Jul 2006 22:10:05 +0200
Andi Kleen <ak@muc.de> wrote:

> > I guess Matthias didn't test this patch. Can we get some obviously-correct
> > fix in place for 2.6.18?
>
> So far we don't have any idea what the problem is on that system.

I believe we do know what the problem is:

a) write_tsc() doesn't work,

b) the TSC's are unsynced (or have an offset),

c) we removed a check which would have caused pmtmr/rtc fallback.

> > It is a "CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03".
>
> Was that on that system?

yes.

> I guess it could be checked for and TSC
> be forced off.

There's no need for that, I think. synchronize_tsc_bp() knows for-sure
that the synchronization failed, in a way which works on all CPUs. So all
we need to do is to set some flag in synchronize_tsc_bp() if `buggy' is
set, telling the clocksource code to give up on the TSC.

> It sounds like a real CPU bug however.

I was hoping the Intel guys could help out with that.

^ permalink raw reply [flat|nested] 25+ messages in thread
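The flag Andrew describes can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel's code: `check_tsc_sync`, `tsc_marked_unstable` and the `tolerance` parameter are hypothetical names, and the check uses a plain max-minus-min spread rather than whatever comparison synchronize_tsc_bp() actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: set once a boot-time sync check has failed. */
static bool tsc_marked_unstable = false;

/*
 * Toy stand-in for the end of a boot-time TSC sync pass (not kernel code).
 * tsc[i] is the counter value sampled on CPU i at (nominally) the same
 * instant. If the spread between the lowest and highest sample exceeds
 * `tolerance` cycles, treat the result as "buggy" and raise the flag so
 * that a clocksource layer could stop trusting the TSC.
 * Returns true when the samples look synchronized.
 */
static bool check_tsc_sync(const uint64_t *tsc, size_t ncpus,
			   uint64_t tolerance)
{
	assert(tsc != NULL && ncpus > 0);

	uint64_t lo = tsc[0], hi = tsc[0];
	for (size_t i = 1; i < ncpus; i++) {
		if (tsc[i] < lo)
			lo = tsc[i];
		if (tsc[i] > hi)
			hi = tsc[i];
	}

	bool buggy = (hi - lo) > tolerance;
	if (buggy)
		tsc_marked_unstable = true;	/* tell the rest of the system */
	return !buggy;
}
```

The point of the design is that the decision is made from an observed failure rather than from a CPU model or DMI match, so it would not need a per-machine blacklist.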
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 20:10 ` Andi Kleen
  2006-07-30 20:55 ` Andrew Morton
@ 2006-07-30 21:13 ` Matthias Urlichs
  2006-07-30 21:20 ` Arjan van de Ven
  2006-07-30 21:57 ` Andi Kleen
  1 sibling, 2 replies; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-30 21:13 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, john stultz, linux-kernel, torvalds, bunk, lethal,
      hirofumi, asit.k.mallick

Hi,

Andi Kleen:
> > It is a "CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03".
>
> Was that on that system? I guess it could be checked for and TSC
> be forced off. It sounds like a real CPU bug however.
>
Board problem? After all, it has some very noxious DMI entries:

System Information
	Manufacturer: Intel Corporation
	Product Name: Nocona/Tumwater Customer Reference Board
	Version: Revision A0
	Serial Number: 0123456789
	UUID: 0A0A0A0A-0A0A-0A0A-0A0A-0A0A0A0A0A0A

... all of which are patently *wrong*.

You'd have to ask the people from Tyan what the hell they were smoking
when they blindly copied the Intel data.

At least the different CPU speed issue is a known bug, fixed by a
BIOS update. I'll postpone that until we have a working kernel fix,
for obvious reasons.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
You might be a Redneck if ...
You consider a six-pack and a bug-zapper high-quality entertainment.

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 21:13 ` Matthias Urlichs
@ 2006-07-30 21:20 ` Arjan van de Ven
  2006-07-30 21:55 ` Matthias Urlichs
  2006-07-30 21:57 ` Andi Kleen
  1 sibling, 1 reply; 25+ messages in thread
From: Arjan van de Ven @ 2006-07-30 21:20 UTC (permalink / raw)
  To: Matthias Urlichs
  Cc: Andi Kleen, Andrew Morton, john stultz, linux-kernel, torvalds,
      bunk, lethal, hirofumi, asit.k.mallick

On Sun, 2006-07-30 at 23:13 +0200, Matthias Urlichs wrote:
> Hi,
>
> Andi Kleen:
> > > It is a "CPU0: Intel(R) Xeon(TM) CPU 3.00GHz stepping 03".
> >
> > Was that on that system? I guess it could be checked for and TSC
> > be forced off. It sounds like a real CPU bug however.
> >
> Board problem? After all, it has some very noxious DMI entries:
>
> System Information
> 	Manufacturer: Intel Corporation
> 	Product Name: Nocona/Tumwater Customer Reference Board
> 	Version: Revision A0
> 	Serial Number: 0123456789
> 	UUID: 0A0A0A0A-0A0A-0A0A-0A0A-0A0A0A0A0A0A
>
> ... all of which are patently *wrong*.
>
> You'd have to ask the people from Tyan what the hell they were smoking
> when they blindly copied the Intel data.
>
> At least the different CPU speed issue is a known bug, fixed by a
> BIOS update. I'll postpone that until we have a working kernel fix,
> for obvious reasons.

if the hardware side is different *speed*.. then a tsc sync ain't going
to work... sure we write to it but it's immediately out of sync again

> -- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 21:20 ` Arjan van de Ven
@ 2006-07-30 21:55 ` Matthias Urlichs
  2006-08-01 1:47 ` Siddha, Suresh B
  0 siblings, 1 reply; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-30 21:55 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Andi Kleen, Andrew Morton, john stultz, linux-kernel, torvalds,
      bunk, lethal, hirofumi, asit.k.mallick

Hi,

Arjan van de Ven:
> if the hardware side is different *speed*.. then a tsc sync ain't going
> to work... sure we write to it but it's immediately out of sync again
>
No, it's in fact the same speed -- the BIOS just reads it wrongly.

I checked: the two date values do advance at the same rate.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
"We need not invite the Devil to our table; he is too ready to come
without being asked. The air all about us is filled with demons...."
		[Martin Luther]

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 21:55 ` Matthias Urlichs
@ 2006-08-01 1:47 ` Siddha, Suresh B
  2006-08-01 3:14 ` Matthias Urlichs
  0 siblings, 1 reply; 25+ messages in thread
From: Siddha, Suresh B @ 2006-08-01 1:47 UTC (permalink / raw)
  To: Matthias Urlichs
  Cc: Arjan van de Ven, Andi Kleen, Andrew Morton, john stultz,
      linux-kernel, torvalds, bunk, lethal, hirofumi, asit.k.mallick

On Sun, Jul 30, 2006 at 11:55:09PM +0200, Matthias Urlichs wrote:
> Hi,
>
> Arjan van de Ven:
> > if the hardware side is different *speed*.. then a tsc sync ain't going
> > to work... sure we write to it but it's immediately out of sync again
> >
> No, it's in fact the same speed -- the BIOS just reads it wrongly.

It sounds to me as a BIOS issue. From the boot log, it is quite clear that
TSCs are running at different speeds(different bogomips show this).

This CPU stepping has constant TSC behavior. So this is most probably
happening because of bios setting different core to bus clock ratios for
each package. Different CPU speed BIOS issue that you mention also points
to this.

Can you check if your BIOS settings are set to max ratio(15?) available? or
try an updated BIOS?

>
> I checked: the two date values do advance at the same rate.

Perhaps data overflow(because of unsync TSC's) in timer code calculations
may be causing this?

thanks,
suresh

^ permalink raw reply [flat|nested] 25+ messages in thread
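To make the ratio argument concrete, here is a rough arithmetic sketch of how far two TSCs would drift apart if the packages really did run at different core-to-bus ratios, which is Suresh's suspicion and not something the thread confirms. The helper names are hypothetical, and the 200 MHz bus clock used in the usage note is an assumption (3.00 GHz divided by the ratio of 15 he mentions).

```c
#include <assert.h>
#include <stdint.h>

/* TSC frequency in kHz for a given bus clock (kHz) and core-to-bus ratio. */
static uint64_t tsc_khz_for_ratio(uint64_t bus_khz, unsigned int ratio)
{
	assert(ratio > 0);
	return bus_khz * ratio;
}

/*
 * Number of cycles by which two free-running counters have diverged
 * `ms` milliseconds after being set to the same value.
 * kHz * ms = (1000 cycles/s) * (1/1000 s) = cycles.
 */
static uint64_t tsc_divergence_cycles(uint64_t khz_a, uint64_t khz_b,
				      uint64_t ms)
{
	uint64_t fast = khz_a > khz_b ? khz_a : khz_b;
	uint64_t slow = khz_a > khz_b ? khz_b : khz_a;
	return (fast - slow) * ms;
}
```

With an assumed 200000 kHz bus, ratios of 15 and 14 give 3000000 kHz and 2800000 kHz, so the counters would be 200 million cycles apart one second after a perfect sync. That is why a one-time write to the TSC, as Arjan notes, cannot help in that situation; with equal ratios the divergence is zero and only a fixed offset would remain.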
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-08-01 1:47 ` Siddha, Suresh B
@ 2006-08-01 3:14 ` Matthias Urlichs
  0 siblings, 0 replies; 25+ messages in thread
From: Matthias Urlichs @ 2006-08-01 3:14 UTC (permalink / raw)
  To: Siddha, Suresh B
  Cc: Arjan van de Ven, Andi Kleen, Andrew Morton, john stultz,
      linux-kernel, torvalds, bunk, lethal, hirofumi, asit.k.mallick

Hi,

Siddha, Suresh B:
> > No, it's in fact the same speed -- the BIOS just reads it wrongly.
>
> It sounds to me as a BIOS issue. From the boot log, it is quite clear that
> TSCs are running at different speeds(different bogomips show this).

Ah. OK, that convinces me -- I'll do a BIOS update as soon as possible.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
Anyone nit-picking enough to write a letter of correction to an editor
doubtless deserves the error that provoked it.
		-- Alvin Toffler

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 21:13 ` Matthias Urlichs
  2006-07-30 21:20 ` Arjan van de Ven
@ 2006-07-30 21:57 ` Andi Kleen
  2006-07-30 22:28 ` Matthias Urlichs
  1 sibling, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2006-07-30 21:57 UTC (permalink / raw)
  To: Matthias Urlichs
  Cc: Andrew Morton, john stultz, linux-kernel, torvalds, bunk, lethal,
      hirofumi, asit.k.mallick

> At least the different CPU speed issue is a known bug, fixed by a
> BIOS update.

That will likely fix your problem without changing the kernel.
Please try it.

> I'll postpone that until we have a working kernel fix,
> for obvious reasons.

What are the obvious reasons?

-Andi

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 21:57 ` Andi Kleen
@ 2006-07-30 22:28 ` Matthias Urlichs
  0 siblings, 0 replies; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-30 22:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andrew Morton, john stultz, linux-kernel, torvalds, bunk, lethal,
      hirofumi, asit.k.mallick

Hi,

Andi Kleen:
> > I'll postpone that until we have a working kernel fix,
> > for obvious reasons.
>
> What are the obvious reasons?
>
- No endangering of customers' production machines without a compelling
  reason.

- No stepping back, as I don't know which BIOS version is on the board
  (DMI says "6.0", but Tyan has 1.0x on their website; the release date
  doesn't match either).

- I'd rather test a working workaround in the kernel before updating;
  if I have the problem, others have it too, and getting Linux to boot
  in that situation isn't exactly trivial -- the fact that you need
  a clock= parameter when udevsend hangs is kind of non-obvious.

- ... and the not-quite-obvious reason: Tyan specifies that I *need* a
  Win95 or Win98 boot floppy to do that. While I don't really believe
  them, I still don't have one of those handy ... which brings me back
  to the first item in this list.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
"Seems I can't get me 'ead down these days without rescuin' people or
 foilin' robbers or sunnink."
		-- It's a wonder dog's life (Terry Pratchett, Moving Pictures)

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-30 9:03 ` Andrew Morton
  2006-07-30 9:49 ` Matthias Urlichs
  2006-07-30 20:10 ` Andi Kleen
@ 2006-07-31 14:24 ` Matthias Urlichs
  2 siblings, 0 replies; 25+ messages in thread
From: Matthias Urlichs @ 2006-07-31 14:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: john stultz, ak, linux-kernel, torvalds, bunk, lethal, hirofumi,
      asit.k.mallick

Hi,

Andrew Morton:
> > Hack out the i386 TSC sync code.
> >
> > diff --git a/arch/i386/kernel/smpboot.c b/arch/i386/kernel/smpboot.c
> > index 6f5fea0..cd28914 100644
> > --- a/arch/i386/kernel/smpboot.c
> > +++ b/arch/i386/kernel/smpboot.c
> > @@ -435,7 +435,7 @@ static void __devinit smp_callin(void)
> >  	/*
> >  	 * Synchronize the TSC with the BP
> >  	 */
> > -	if (cpu_has_tsc && cpu_khz && !tsc_sync_disabled)
> > +	if (0 && cpu_has_tsc && cpu_khz && !tsc_sync_disabled)
> >  		synchronize_tsc_ap();
> >  }
> >
> > @@ -1305,7 +1305,7 @@ static void __init smp_boot_cpus(unsigne
> >  	/*
> >  	 * Synchronize the TSC with the AP
> >  	 */
> > -	if (cpu_has_tsc && cpucount && cpu_khz)
> > +	if (0 && cpu_has_tsc && cpucount && cpu_khz)
> >  		synchronize_tsc_bp();
> >  }
>
> I guess Matthias didn't test this patch. Can we get some obviously-correct
> fix in place for 2.6.18?
>
This patch doesn't change the problem.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
Success is always being able to wear clothing that you actually like.
		-- SJM

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: REGRESSION: the new i386 timer code fails to sync CPUs
  2006-07-24 15:58 ` john stultz
  2006-07-24 17:17 ` Matthias Urlichs
@ 2006-07-24 17:39 ` Andi Kleen
  1 sibling, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2006-07-24 17:39 UTC (permalink / raw)
  To: john stultz
  Cc: Andrew Morton, Matthias Urlichs, linux-kernel, torvalds, bunk,
      lethal, hirofumi

On Mon, Jul 24, 2006 at 08:58:58AM -0700, john stultz wrote:
> On Sun, 2006-07-23 at 05:37 -0700, Andrew Morton wrote:
> > On Sun, 23 Jul 2006 14:08:29 +0200
> > Matthias Urlichs <smurf@smurf.noris.de> wrote:
> >
> > > Hi,
> > >
> > > Andrew Morton:
> > > > - CPU0 and CPU1 share a TSC and CPU2 and CPU3 share another TSC.
> > > >
> > > That makes sense, since they're one dual-core Xeon each.
> >
> > OK.
> >
> > > > - Earlier kernels didn't use the TSC as a time source whereas this one
> > > >   does, hence the problems which you're observing.
> > > >
> > > Correct; see below.
> > >
> > > > I assume that booting with clock=pit or clock=pmtmr fixes it?
> > > >
> > > Testing... yes, both.
> > >
> > > > It would be useful to check your 2.6.17 boot logs, see if we can work out
> > > > what 2.6.17 was using for a clock source.
> > > >
> > > That's easy:
> > >
> > > 2.6.17 -Using pmtmr for high-res timesource
> > > 2.6.18git +Time: tsc clocksource has been installed.
> > >
> > > I missed those two lines, as in the boot logs they're not really
> > > adjacent, so they got lost in the jumble of other differences.
> >
> > OK, thanks. Marking the TSC as bad in this case is simple to do - let us
> > let John work out the best way.
> >
> > We must have lost a TSC sanity check somewhere along the way. I wonder
> > what it was?
>
> Well, I changed the TSC vs ACPI PM timer priority ordering to be more
> like x86-64 (Andi had a similar patch he was proposing as well). For
> awhile suse/redhat kernels have been swapping them, as the TSC gives
> such a performance boost, however the ACPI PM timer is usually the safer
> option (distro customers are often told to use clock=pmtmr on some
> boxes).
>
> I'll see what we can do to narrow it down, but it's been assumed by both
> x86-64 and the new i386 code that the TSCs on Intel SMP boxes are
> synched, unless we're explicitly told they aren't (Summit, etc).

Or it supports C3. I just had to add that check on 64bit too for Merom.

> With the current code it is trivial to mark the TSC as unstable and the
> system will automatically fall back to the next best clocksource. The
> difficulty is just making sure we've got all the cases covered without
> needlessly disqualifying synced systems.
>
> Andi: If this is a generic issue, and not specific to Matthias' box, we
> may need to re-think the assumption that Intel SMP is synced. Your
> thoughts?

I'm missing context. Full log files/full system description?

At least on x86-64 I'm doing it like this for a long time and didn't
have any complaints so I would assume that the 64bit capable boxes
are near completely ok.

-Andi

^ permalink raw reply [flat|nested] 25+ messages in thread
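The fallback John describes ("mark the TSC as unstable and the system will automatically fall back to the next best clocksource") can be illustrated with a toy rating table. This is a sketch of the general idea only, not the kernel's clocksource API: `struct clocksource_model`, `select_best` and `mark_unstable` are hypothetical names, the ratings in the usage note are illustrative numbers, and treating a rating of 0 as "never pick this" is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a time source: a name plus a preference rating. */
struct clocksource_model {
	const char *name;
	int rating;	/* higher is preferred; <= 0 means "do not use" here */
};

/*
 * Return the index of the usable source with the highest rating,
 * or -1 if none is usable.
 */
static int select_best(const struct clocksource_model *cs, size_t n)
{
	assert(cs != NULL);

	int best = -1;
	for (size_t i = 0; i < n; i++) {
		if (cs[i].rating <= 0)
			continue;
		if (best < 0 || cs[i].rating > cs[best].rating)
			best = (int)i;
	}
	return best;
}

/* Disqualify a source; the next select_best() call falls back. */
static void mark_unstable(struct clocksource_model *cs)
{
	assert(cs != NULL);
	cs->rating = 0;
}
```

Given a table of `{"tsc", 300}`, `{"acpi_pm", 200}`, `{"pit", 110}`, `select_best` picks index 0; after `mark_unstable(&table[0])` it picks index 1. The hard part John points at is not this mechanism but deciding when to call the disqualifying step without penalizing machines whose TSCs are in fact synced.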
end of thread, other threads:[~2006-08-01 3:15 UTC | newest]

Thread overview: 25+ messages
2006-07-22 23:36 REGRESSION: the new i386 timer code fails to sync CPUs Matthias Urlichs
2006-07-23 0:36 ` Andrew Morton
2006-07-23 8:16 ` Matthias Urlichs
2006-07-23 11:46 ` Andrew Morton
2006-07-23 12:08 ` Matthias Urlichs
2006-07-23 12:37 ` Andrew Morton
2006-07-23 12:58 ` Matthias Urlichs
2006-07-24 15:52 ` Siddha, Suresh B
2006-07-24 15:58 ` john stultz
2006-07-24 17:17 ` Matthias Urlichs
2006-07-24 17:51 ` Andi Kleen
2006-07-24 20:54 ` john stultz
2006-07-30 9:03 ` Andrew Morton
2006-07-30 9:49 ` Matthias Urlichs
2006-07-30 20:10 ` Andi Kleen
2006-07-30 20:55 ` Andrew Morton
2006-07-30 21:13 ` Matthias Urlichs
2006-07-30 21:20 ` Arjan van de Ven
2006-07-30 21:55 ` Matthias Urlichs
2006-08-01 1:47 ` Siddha, Suresh B
2006-08-01 3:14 ` Matthias Urlichs
2006-07-30 21:57 ` Andi Kleen
2006-07-30 22:28 ` Matthias Urlichs
2006-07-31 14:24 ` Matthias Urlichs
2006-07-24 17:39 ` Andi Kleen