* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: William Park @ 2003-12-06 17:38 UTC
To: linux-kernel

On Sat, Dec 06, 2003 at 10:59:01AM +0000, William Lee Irwin III wrote:
> This leads to a similar conclusion to Stian Jordet's case. It's not
> mistaking you for HT, it's the lack of an internal distinction between
> the cases that need and don't need irq balancing.

I have VP6 (Apollo Pro 133A, 82c694X/686B) dual-P3 (800MHz/133MHz). I'm
currently using MPS 1.1, because USB doesn't work with MPS 1.4 (or, it
does but only with 'noapic'). And, I can report the same finding as
others.

Before:
-------
           CPU0       CPU1
  0:      79365         63    IO-APIC-edge  timer
  1:        215          1    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  4:       8654          0    IO-APIC-edge  serial
  8:          1          1    IO-APIC-edge  rtc
 12:         52          1    IO-APIC-edge  i8042
 14:       2155          0    IO-APIC-edge  ide0
 15:          2          0    IO-APIC-edge  ide1
 17:          0          0   IO-APIC-level  eth0
NMI:          0          0
LOC:      79231      79266
ERR:          0
MIS:          0

After 'noirqbalance':
---------------------
           CPU0       CPU1
  0:      15039      16025    IO-APIC-edge  timer
  1:         47         75    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  4:         21         43    IO-APIC-edge  serial
  8:          2          0    IO-APIC-edge  rtc
 12:         21         32    IO-APIC-edge  i8042
 14:        828        410    IO-APIC-edge  ide0
 15:          2          0    IO-APIC-edge  ide1
 17:          0          0   IO-APIC-level  eth0
NMI:          0          0
LOC:      30865      30900
ERR:          0
MIS:          0

--
William Park, Open Geometry Consulting, <opengeometry@yahoo.ca>
Linux solution for data management and processing.
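A minimal sketch, in Python, of how the before/after tables in this message could be compared numerically. It parses /proc/interrupts-style text and reports CPU0's share of the timer interrupt (IRQ 0), using counts copied from the message. The names parse_interrupts and cpu_share are illustrative and not part of any kernel tool, and the parser assumes the layout seen in this thread: a header row of CPU names, then one "label: count count ..." row per interrupt source.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: [count per CPU]}.

    Assumes the first non-blank line is the CPU header row. Rows with
    fewer counts than CPUs (such as ERR or MIS) yield a shorter list.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for ln in lines[1:]:
        label, sep, rest = ln.partition(":")
        if not sep:
            continue  # not a "label: ..." row
        per_cpu = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller type or device name
            per_cpu.append(int(field))
        counts[label.strip()] = per_cpu
    return counts


def cpu_share(per_cpu, cpu=0):
    """Fraction of an interrupt's total count handled by one CPU."""
    total = sum(per_cpu)
    return per_cpu[cpu] / total if total else 0.0


# Rows copied from the message (abridged to three sources).
BEFORE = """\
           CPU0       CPU1
  0:      79365         63    IO-APIC-edge  timer
 14:       2155          0    IO-APIC-edge  ide0
LOC:      79231      79266
"""

AFTER = """\
           CPU0       CPU1
  0:      15039      16025    IO-APIC-edge  timer
 14:        828        410    IO-APIC-edge  ide0
LOC:      30865      30900
"""

if __name__ == "__main__":
    for name, text in (("before", BEFORE), ("after", AFTER)):
        timer = parse_interrupts(text)["0"]
        print(f"{name}: CPU0 handled {cpu_share(timer):.1%} of IRQ 0")
```

With the numbers above, CPU0's share of IRQ 0 works out to roughly 99.9% before and roughly 48.4% after booting with 'noirqbalance', while the LOC rows are nearly even in both cases.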
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Matthew Kanar @ 2003-12-08 19:24 UTC
To: linux-kernel

Zwane Mwaikambo wrote:
> Are you sure you're not running the userspace irq balancer (ps ax | grep
> irqbalance)?

<blush> Well, userspace irqbalancer wasn't running, but it was started
and then stopped during the system startup. My apologies.

A few changes in the rcX.d directories and a few reboots later, I have
found that this system exhibits similar behavior to that of others in
this thread.

/proc/interrupts without noirqbalance:
           CPU0       CPU1
  0:     331404         14    IO-APIC-edge  timer
  1:         10          1    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  8:          0          1    IO-APIC-edge  rtc
 12:         49          1    IO-APIC-edge  i8042
 14:       5503          0    IO-APIC-edge  ide0
 15:          1          0    IO-APIC-edge  ide1
 16:       1004          1   IO-APIC-level  eth0
NMI:          0          0
LOC:     331292     331291
ERR:          0
MIS:          0

/proc/interrupts with noirqbalance:
           CPU0       CPU1
  0:      70723     137438    IO-APIC-edge  timer
  1:          2          9    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  8:          1          0    IO-APIC-edge  rtc
 12:          2         48    IO-APIC-edge  i8042
 14:       3222       1850    IO-APIC-edge  ide0
 15:          1          0    IO-APIC-edge  ide1
 16:        471        459   IO-APIC-level  eth0
NMI:          0          0
LOC:     208023     208022
ERR:          0
MIS:          0

--Matt Kanar
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? [not found] ` <ZCIb-2Pv-11@gated-at.bofh.it> @ 2003-12-08 16:42 ` Matthew Kanar 2003-12-08 17:21 ` William Lee Irwin III 2003-12-08 17:38 ` Zwane Mwaikambo 0 siblings, 2 replies; 27+ messages in thread From: Matthew Kanar @ 2003-12-08 16:42 UTC (permalink / raw) To: linux-kernel Here is one of my SMP systems (Dell-Dual P3), although noirqbalance doesn't seem to change things -- uname -nrvm: k12.kanar.net 2.6.0-test11 #2 SMP Wed Dec 3 18:50:36 EST 2003 i686 _Without_ noirqbalance - uptime: 10:31:51 up 4 days, 15:07, 10 users, load average: 0.00, 0.02, 0.00 /proc/interrupts: CPU0 CPU1 0: 27505 400084866 IO-APIC-edge timer 1: 1438 1 IO-APIC-edge i8042 2: 0 0 XT-PIC cascade 8: 1 0 IO-APIC-edge rtc 12: 1837 161 IO-APIC-edge i8042 14: 661467 822411 IO-APIC-edge ide0 15: 1 0 IO-APIC-edge ide1 16: 104949011 10 IO-APIC-level eth0 NMI: 0 0 LOC: 400153184 400153183 ERR: 0 MIS: 10 _With_ noirqbalance - uptime: 11:36:12 up 1:01, 4 users, load average: 0.00, 0.09, 0.28 /proc/interrupts: CPU0 CPU1 0: 16726 3707690 IO-APIC-edge timer 1: 3 8 IO-APIC-edge i8042 2: 0 0 XT-PIC cascade 8: 0 1 IO-APIC-edge rtc 12: 14 36 IO-APIC-edge i8042 14: 28140 659 IO-APIC-edge ide0 15: 1 0 IO-APIC-edge ide1 16: 12775 12 IO-APIC-level eth0 NMI: 0 0 LOC: 3724639 3724638 ERR: 0 MIS: 0 CPU0 timer: value 16726 hasn't changed since boot an hour ago. dmesg: Linux version 2.6.0-test11 (root@k12.kanar.net) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #2 SMP Wed Dec 3 18:50:36 EST 2003 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000001ff9e000 (usable) BIOS-e820: 000000001ff9e000 - 0000000020000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved) BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved) 0MB HIGHMEM available. 
511MB LOWMEM available. found SMP MP-table at 000fe710 hm, page 000fe000 reserved twice. hm, page 000ff000 reserved twice. hm, page 000f0000 reserved twice. On node 0 totalpages: 130974 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 126878 pages, LIFO batch:16 HighMem zone: 0 pages, LIFO batch:1 DMI 2.3 present. ACPI: RSDP (v000 DELL ) @ 0x000fd720 ACPI: RSDT (v001 DELL WS 420 0x00000008 ASL 0x00000061) @ 0x000fd734 ACPI: FADT (v001 DELL WS 420 0x00000008 ASL 0x00000061) @ 0x000fd760 ACPI: MADT (v001 DELL WS 420 0x00000008 ASL 0x00000061) @ 0x000fd7d4 ACPI: DSDT (v001 DELL dt_ex 0x00001000 MSFT 0x0100000b) @ 0x00000000 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 6:8 APIC version 17 ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) Processor #1 6:8 APIC version 17 ACPI: LAPIC_NMI (acpi_id[0x01] polarity[0x1] trigger[0x1] lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x02] polarity[0x1] trigger[0x1] lint[0x1]) Using ACPI for processor (LAPIC) configuration information Intel MultiProcessor Specification v1.4 Virtual Wire compatibility mode. OEM ID: DELL Product ID: WS 420 APIC at: 0xFEE00000 I/O APIC #2 Version 32 at 0xFEC00000. Enabling APIC mode: Flat. Using 1 I/O APICs Processors: 2 Building zonelist for node : 0 Kernel command line: ro root=/dev/hda2 noirqbalance Initializing CPU#0 PID hash table entries: 2048 (order 11: 16384 bytes) Detected 930.947 MHz processor. Console: colour VGA+ 80x25 Memory: 514112k/523896k available (2409k kernel code, 9044k reserved, 579k data, 352k init, 0k highmem) Calibrating delay loop... 
1839.10 BogoMIPS Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 256K CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. POSIX conformance testing by UNIFIX CPU0: Intel Pentium III (Coppermine) stepping 06 per-CPU timeslice cutoff: 732.06 usecs. task migration cache decay timeout: 1 msecs. enabled ExtINT on CPU#0 ESR value before enabling vector: 00000040 ESR value after enabling vector: 00000000 Booting processor 1/1 eip 2000 Initializing CPU#1 masked ExtINT on CPU#1 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Calibrating delay loop... 1859.58 BogoMIPS CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 256K CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. CPU1: Intel Pentium III (Coppermine) stepping 06 Total of 2 processors activated (3698.68 BogoMIPS). ENABLING IO-APIC IRQs Setting 2 in the phys_id_present_map ...changing IO-APIC physical APIC ID to 2 ... ok. init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-13, 2-20, 2-21, 2-22, 2-23 not connected. ..TIMER: vector=0x31 pin1=2 pin2=0 number of MP IRQ sources: 44. number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... 
register #00: 02000000 ....... : physical APIC id: 02 ....... : Delivery Type: 0 ....... : LTS : 0 .... register #01: 00170020 ....... : max redirection entries: 0017 ....... : PRQ implemented: 0 ....... : IO APIC version: 0020 .... register #02: 00000000 ....... : arbitration: 00 .... IRQ redirection table: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect: 00 000 00 1 0 0 0 0 0 0 00 01 001 01 0 0 0 0 0 1 1 39 02 001 01 0 0 0 0 0 1 1 31 03 001 01 0 0 0 0 0 1 1 41 04 001 01 0 0 0 0 0 1 1 49 05 001 01 0 0 0 0 0 1 1 51 06 001 01 0 0 0 0 0 1 1 59 07 001 01 0 0 0 0 0 1 1 61 08 001 01 0 0 0 0 0 1 1 69 09 001 01 0 0 0 0 0 1 1 71 0a 001 01 0 0 0 0 0 1 1 79 0b 001 01 0 0 0 0 0 1 1 81 0c 001 01 0 0 0 0 0 1 1 89 0d 000 00 1 0 0 0 0 0 0 00 0e 001 01 0 0 0 0 0 1 1 91 0f 001 01 0 0 0 0 0 1 1 99 10 001 01 1 1 0 1 0 1 1 A1 11 001 01 1 1 0 1 0 1 1 A9 12 001 01 1 1 0 1 0 1 1 B1 13 001 01 1 1 0 1 0 1 1 B9 14 000 00 1 0 0 0 0 0 0 00 15 000 00 1 0 0 0 0 0 0 00 16 000 00 1 0 0 0 0 0 0 00 17 000 00 1 0 0 0 0 0 0 00 IRQ to pin mappings: IRQ0 -> 0:2 IRQ1 -> 0:1 IRQ3 -> 0:3 IRQ4 -> 0:4 IRQ5 -> 0:5 IRQ6 -> 0:6 IRQ7 -> 0:7 IRQ8 -> 0:8 IRQ9 -> 0:9 IRQ10 -> 0:10 IRQ11 -> 0:11 IRQ12 -> 0:12 IRQ14 -> 0:14 IRQ15 -> 0:15 IRQ16 -> 0:16 IRQ17 -> 0:17 IRQ18 -> 0:18 IRQ19 -> 0:19 .................................... done. Using local APIC timer interrupts. calibrating APIC timer ... ..... CPU clock speed is 930.0810 MHz. ..... host bus clock speed is 132.0972 MHz. checking TSC synchronization across 2 CPUs: passed. Starting migration thread for cpu 0 Bringing up 1 CPU 1 IS NOW UP! 
Starting migration thread for cpu 1 CPUS done 8 NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfc03e, last bus=3 PCI: Using configuration type 1 mtrr: v2.0 (20020519) Linux Plug and Play Support v0.97 (c) Adam Belay PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) Transparent bridge - 0000:00:1e.0 PCI: Using IRQ router PIIX/ICH [8086/2410] at 0000:00:1f.0 PCI->APIC IRQ transform: (B0,I31,P3) -> 19 PCI->APIC IRQ transform: (B0,I31,P1) -> 17 PCI->APIC IRQ transform: (B1,I0,P0) -> 16 PCI->APIC IRQ transform: (B2,I4,P0) -> 16 PCI->APIC IRQ transform: (B2,I6,P0) -> 18 PCI->APIC IRQ transform: (B2,I11,P0) -> 19 apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac) apm: disabled - APM is not SMP safe. VFS: Disk quotas dquot_6.5.1 SGI XFS for Linux with no debug enabled isapnp: Scanning for PnP cards... isapnp: No Plug & Play device found pty: 2048 Unix98 ptys configured Real Time Clock Driver v1.12 Using anticipatory io scheduler Floppy drive(s): fd0 is 1.44M FDC 0 is a National Semiconductor PC87306 RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize divert: not allocating divert_blk for non-ethernet device lo Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx ICH: IDE controller at PCI slot 0000:00:1f.1 ICH: chipset revision 2 ICH: not 100% native mode: will probe irqs later ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio hda: WDC WD1200JB-75CRA0, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 hdc: LITEON DVD-ROM LTD163, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: Host Protected Area detected. 
current capacity is 234375000 sectors (120000 MB) native capacity is 234375120 sectors (120000 MB) hda: 234375000 sectors (120000 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(66) hda: hda1 hda2 hda3 mice: PS/2 mouse device common for all mice input: PS/2 Generic Mouse on isa0060/serio1 serio: i8042 AUX port at 0x60,0x64 irq 12 input: AT Translated Set 2 keyboard on isa0060/serio0 serio: i8042 KBD port at 0x60,0x64 irq 1 md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27 NET: Registered protocol family 2 IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 32768 bind 32768) NET: Registered protocol family 1 NET: Registered protocol family 17 NET: Registered protocol family 8 NET: Registered protocol family 20 md: Autodetecting RAID arrays. md: autorun ... md: ... autorun DONE. kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Freeing unused kernel memory: 352k freed drivers/usb/core/usb.c: registered new driver usbfs drivers/usb/core/usb.c: registered new driver hub drivers/usb/core/usb.c: registered new driver hid drivers/usb/input/hid-core.c: v2.0:USB HID core driver EXT3 FS on hda2, internal journal Adding 1044216k swap on /dev/hda3. Priority:-1 extents:1 kjournald starting. Commit interval 5 seconds EXT3 FS on hda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. kudzu: numerical sysctl 1 23 is obsolete. parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP] parport0: irq 7 detected parport0: cpp_daisy: aa5500ff(08) parport0: assign_addrs: aa5500ff(08) parport0: cpp_daisy: aa5500ff(08) parport0: assign_addrs: aa5500ff(08) 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html 0000:02:04.0: 3Com PCI 3c905C Tornado at 0xec80. 
Vers LK1.1.19 divert: allocating divert_blk for eth0 /proc/cpuinfo: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 8 model name : Pentium III (Coppermine) stepping : 6 cpu MHz : 930.947 cache size : 256 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse bogomips : 1839.10 processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 8 model name : Pentium III (Coppermine) stepping : 6 cpu MHz : 930.947 cache size : 256 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse bogomips : 1859.58 ^ permalink raw reply [flat|nested] 27+ messages in thread
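As a rough cross-check of the /proc/cpuinfo dump in this message against the dmesg line "Total of 2 processors activated (3698.68 BogoMIPS)", the following Python sketch splits cpuinfo-style text into one record per logical processor and sums the bogomips fields. The function name parse_cpuinfo is illustrative, the sample text is abridged from the message, and the "key : value" layout with a blank line between processors is assumed.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical processor.

    Assumes "key : value" lines, with a blank line between processors.
    """
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            if current:  # a blank line ends the current processor block
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


# Abridged from the /proc/cpuinfo output in the message.
SAMPLE = """\
processor : 0
model name : Pentium III (Coppermine)
cpu MHz : 930.947
bogomips : 1839.10

processor : 1
model name : Pentium III (Coppermine)
cpu MHz : 930.947
bogomips : 1859.58
"""

if __name__ == "__main__":
    cpus = parse_cpuinfo(SAMPLE)
    total = sum(float(c["bogomips"]) for c in cpus)
    print(f"{len(cpus)} processors, {total:.2f} BogoMIPS in total")
```

Both processors appearing here, with a combined figure matching the boot log, is consistent with the second CPU being online; the imbalance discussed in this thread concerns interrupt routing rather than CPU bring-up.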
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: William Lee Irwin III @ 2003-12-08 17:21 UTC
To: Matthew Kanar
Cc: linux-kernel

On Mon, Dec 08, 2003 at 11:42:57AM -0500, Matthew Kanar wrote:
> Here is one of my SMP systems (Dell-Dual P3), although noirqbalance
> doesn't seem to change things --
> uname -nrvm:
> k12.kanar.net 2.6.0-test11 #2 SMP Wed Dec 3 18:50:36 EST 2003 i686
> _Without_ noirqbalance -
> uptime:
> 10:31:51 up 4 days, 15:07, 10 users, load average: 0.00, 0.02, 0.00

ACPI may be doing something strange (as usual). It looks like you
somehow got slammed into fixed delivery mode.

-- wli
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Zwane Mwaikambo @ 2003-12-08 17:38 UTC
To: Matthew Kanar
Cc: linux-kernel

On Mon, 8 Dec 2003, Matthew Kanar wrote:

> _Without_ noirqbalance -
>
> uptime:
> 10:31:51 up 4 days, 15:07, 10 users, load average: 0.00, 0.02, 0.00
>
> /proc/interrupts:
>            CPU0       CPU1
>   0:      27505  400084866    IO-APIC-edge  timer
>   1:       1438          1    IO-APIC-edge  i8042
>   2:          0          0          XT-PIC  cascade
>   8:          1          0    IO-APIC-edge  rtc
>  12:       1837        161    IO-APIC-edge  i8042
>  14:     661467     822411    IO-APIC-edge  ide0
>  15:          1          0    IO-APIC-edge  ide1
>  16:  104949011         10   IO-APIC-level  eth0
> NMI:          0          0
> LOC:  400153184  400153183
> ERR:          0
> MIS:         10
>
>
> _With_ noirqbalance -
>
> uptime:
> 11:36:12 up 1:01, 4 users, load average: 0.00, 0.09, 0.28
>
> /proc/interrupts:
>            CPU0       CPU1
>   0:      16726    3707690    IO-APIC-edge  timer
>   1:          3          8    IO-APIC-edge  i8042
>   2:          0          0          XT-PIC  cascade
>   8:          0          1    IO-APIC-edge  rtc
>  12:         14         36    IO-APIC-edge  i8042
>  14:      28140        659    IO-APIC-edge  ide0
>  15:          1          0    IO-APIC-edge  ide1
>  16:      12775         12   IO-APIC-level  eth0
> NMI:          0          0
> LOC:    3724639    3724638
> ERR:          0
> MIS:          0

Are you sure you're not running the userspace irq balancer (ps ax | grep
irqbalance)?
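The check suggested here (ps ax | grep irqbalance) could also be sketched in Python by scanning /proc for a process whose command name is irqbalance. This is only an illustration: the function name find_processes is hypothetical, and it relies on /proc/<pid>/comm, which is assumed to be available (it exists on kernels considerably newer than the 2.6.0-test series discussed in this thread).

```python
import os


def find_processes(name, proc_root="/proc"):
    """Return sorted PIDs whose command name equals `name`.

    Reads <proc_root>/<pid>/comm for every numeric directory entry.
    proc_root is a parameter so the function can be pointed at a
    directory other than the live /proc.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited, or comm is not readable
    return sorted(pids)


if __name__ == "__main__":
    running = find_processes("irqbalance")
    if running:
        print("irqbalance is running, PIDs:", running)
    else:
        print("no irqbalance process found")
```

A check like this only reports what is running at the moment it is called. A balancer that was started during boot and has since exited, but already rewrote the IRQ affinities, would not show up.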
* SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Colin Coe @ 2003-12-06 2:32 UTC
To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3857 bytes --]

Hi all

I believe there is an SMP problem with the Dell PowerEdge 4100/200 under
2.6.0-testxx kernels. I've looked through the archives but haven't found
anyone mention any problems similar to this.

The system has 640MB RAM (4x32MB and 4x128MB), dual Pentium Pro 200s and
is pure SCSI (AIC-7860 and AMI MegaRAID 428 controllers).

Under v2.4.23, 'cat /proc/interrupts' shows the following (with an uptime
of 10 minutes):
CPU0 CPU1
0: 28961 33597 IO-APIC-edge timer
1: 2 3 IO-APIC-edge keyboard
2: XT-PIC cascade
4: 38 11 IO-APIC-edge serial
5: 191 209 IO-APIC-level eth1
8: 1 IO-APIC-edge rtc
10: 1241 1238 IO-APIC-level aic7xxx
11: 299 315 IO-APIC-level eth0
12: 25 3 IO-APIC-edge PS/2 Mouse
14: IO-APIC-level cs46xx
15: 2756 2679 IO-APIC-level megaraid
NMI: 0
LOC: 62438 62440
ERR: 0
MIS: 0

This indicates to me that the processing load is being evenly distributed
across the two processors. Under v2.6.0-testxx however, 'cat
/proc/interrupts' shows this:
[root@host root]# cat /proc/interrupts
CPU0 CPU1
0: 633122 30 IO-APIC-edge timer
1: 207 IO-APIC-edge i8042
2: XT-PIC cascade
4: 48 1 IO-APIC-edge serial
5: 449 1 IO-APIC-level eth1
10: 135 1 IO-APIC-level aic7xxx
11: 1447 1 IO-APIC-level eth0
12: 61 IO-APIC-edge i8042
14: IO-APIC-level CS46XX
15: 14982 1 IO-APIC-level megaraid
NMI: 0
LOC: 632444 632443
ERR: 0
MIS: 0

which indicates to me that only CPU0 is being used.

It is a RH9 based system with several packages updated so as to satisfy
the kernel's minimum requirements.

[root@host root]# lspci
00:00.0 Host bridge: Intel Corp. 440FX - 82441FX PMC [Natoma] (rev 02)
00:0d.0 PCI bridge: Digital Equipment Corporation DECchip 21052 (rev 02)
00:0f.0 Non-VGA unclassified device: Intel Corp. 82375EB (rev 15)
00:10.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
00:11.0 VGA compatible controller: Matrox Graphics, Inc. MGA 2164W [Millennium II]
00:13.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
01:08.0 Multimedia audio controller: Cirrus Logic CS 4614/22/24 [CrystalClear SoundFusion Audio Accelerator] (rev 01)
01:0b.0 SCSI storage controller: Adaptec AIC-7860 (rev 01)
01:0d.0 Unknown mass storage controller: American Megatrends Inc. MegaRAID 428 Ultra RAID Controller (rev 03)

[root@host linux]# ./scripts/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux host.domain.com 2.6.0-test11 #3 SMP Sat Dec 6 09:49:19 WST 2003 i686 i686 i386 GNU/Linux

Gnu C 3.2.2
Gnu make 3.79.1
util-linux 2.11y
mount 2.11y
module-init-tools implemented (0.9.14)
e2fsprogs 1.32
PPP 2.4.1
Linux C Library 2.3.2
Dynamic linker (ldd) 2.3.2
Procps 3.1.13
Net-tools 1.60
Kbd 1.08
Sh-utils 4.5.3
Modules Loaded

[root@host linux]# ld -v
GNU ld version 2.13.90.0.18 20030206

I think I've provided the necessary info, but if I've missed anything
please advise and I'll provide what is required. The .config for
2.6.0-test11 is attached.

Has anyone seen this or have any idea what may be going on?

TIA

CC
--
"Obnoxious frog..."
Spike, 2071AD [-- Attachment #2: config.txt --] [-- Type: text/plain, Size: 4901 bytes --] CONFIG_X86=y CONFIG_MMU=y CONFIG_UID16=y CONFIG_GENERIC_ISA_DMA=y CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_STANDALONE=y CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSCTL=y CONFIG_LOG_BUF_SHIFT=15 CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_KALLSYMS=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y CONFIG_OBSOLETE_MODPARM=y CONFIG_MODVERSIONS=y CONFIG_KMOD=y CONFIG_X86_PC=y CONFIG_M686=y CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_L1_CACHE_SHIFT=5 CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_X86_PPRO_FENCE=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_SMP=y CONFIG_NR_CPUS=4 CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_TSC=y CONFIG_MICROCODE=m CONFIG_X86_MSR=m CONFIG_X86_CPUID=m CONFIG_NOHIGHMEM=y CONFIG_MTRR=y CONFIG_HAVE_DEC_LOCK=y CONFIG_ACPI_BOOT=y CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_LEGACY_PROC=y CONFIG_PCI_NAMES=y CONFIG_ISA=y CONFIG_EISA=y CONFIG_EISA_PCI_EISA=y CONFIG_EISA_VIRTUAL_ROOT=y CONFIG_EISA_NAMES=y CONFIG_BINFMT_ELF=y CONFIG_BINFMT_AOUT=y CONFIG_BINFMT_MISC=y CONFIG_PNP=y CONFIG_ISAPNP=y CONFIG_PNPBIOS=y CONFIG_BLK_DEV_FD=y CONFIG_BLK_DEV_INITRD=y CONFIG_LBD=y CONFIG_SCSI=y CONFIG_SCSI_PROC_FS=y CONFIG_BLK_DEV_SD=y CONFIG_CHR_DEV_ST=y CONFIG_BLK_DEV_SR=y CONFIG_CHR_DEV_SG=y CONFIG_SCSI_REPORT_LUNS=y CONFIG_SCSI_AIC7XXX=y CONFIG_AIC7XXX_CMDS_PER_DEVICE=32 CONFIG_AIC7XXX_RESET_DELAY_MS=15000 CONFIG_AIC7XXX_DEBUG_ENABLE=y CONFIG_AIC7XXX_DEBUG_MASK=0 CONFIG_AIC7XXX_REG_PRETTY_PRINT=y CONFIG_SCSI_MEGARAID=y CONFIG_MD=y CONFIG_BLK_DEV_DM=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_INET_ECN=y CONFIG_SYN_COOKIES=y CONFIG_NETFILTER=y 
CONFIG_IP_NF_CONNTRACK=y CONFIG_IP_NF_FTP=y CONFIG_IP_NF_IRC=y CONFIG_IP_NF_TFTP=y CONFIG_IP_NF_AMANDA=y CONFIG_IP_NF_QUEUE=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_MATCH_LIMIT=y CONFIG_IP_NF_MATCH_IPRANGE=y CONFIG_IP_NF_MATCH_MAC=y CONFIG_IP_NF_MATCH_PKTTYPE=y CONFIG_IP_NF_MATCH_MARK=y CONFIG_IP_NF_MATCH_MULTIPORT=y CONFIG_IP_NF_MATCH_TOS=y CONFIG_IP_NF_MATCH_RECENT=y CONFIG_IP_NF_MATCH_ECN=y CONFIG_IP_NF_MATCH_DSCP=y CONFIG_IP_NF_MATCH_AH_ESP=y CONFIG_IP_NF_MATCH_LENGTH=y CONFIG_IP_NF_MATCH_TTL=y CONFIG_IP_NF_MATCH_TCPMSS=y CONFIG_IP_NF_MATCH_HELPER=y CONFIG_IP_NF_MATCH_STATE=y CONFIG_IP_NF_MATCH_CONNTRACK=y CONFIG_IP_NF_MATCH_OWNER=y CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_TARGET_REJECT=y CONFIG_IP_NF_NAT=y CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=y CONFIG_IP_NF_TARGET_REDIRECT=y CONFIG_IP_NF_TARGET_NETMAP=y CONFIG_IP_NF_TARGET_SAME=y CONFIG_IP_NF_NAT_LOCAL=y CONFIG_IP_NF_NAT_SNMP_BASIC=y CONFIG_IP_NF_NAT_IRC=y CONFIG_IP_NF_NAT_FTP=y CONFIG_IP_NF_NAT_TFTP=y CONFIG_IP_NF_NAT_AMANDA=y CONFIG_IP_NF_MANGLE=y CONFIG_IP_NF_TARGET_TOS=y CONFIG_IP_NF_TARGET_ECN=y CONFIG_IP_NF_TARGET_DSCP=y CONFIG_IP_NF_TARGET_MARK=y CONFIG_IP_NF_TARGET_CLASSIFY=y CONFIG_IP_NF_TARGET_LOG=y CONFIG_IP_NF_TARGET_ULOG=y CONFIG_IP_NF_TARGET_TCPMSS=y CONFIG_IP_NF_ARPTABLES=y CONFIG_IP_NF_ARPFILTER=y CONFIG_IP_NF_ARP_MANGLE=y CONFIG_IPV6_SCTP__=y CONFIG_NETDEVICES=y CONFIG_DUMMY=y CONFIG_NET_ETHERNET=y CONFIG_NET_VENDOR_3COM=y CONFIG_VORTEX=y CONFIG_PPP=y CONFIG_PPP_ASYNC=y CONFIG_PPP_DEFLATE=y CONFIG_PPP_BSDCOMP=y CONFIG_INPUT=y CONFIG_INPUT_MOUSEDEV=y CONFIG_INPUT_MOUSEDEV_PSAUX=y CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 CONFIG_GAMEPORT=y CONFIG_SOUND_GAMEPORT=y CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y CONFIG_INPUT_MOUSE=y CONFIG_MOUSE_PS2=y CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_SERIAL_NONSTANDARD=y CONFIG_STALDRV=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_NR_UARTS=4 CONFIG_SERIAL_CORE=y 
CONFIG_UNIX98_PTYS=y CONFIG_UNIX98_PTY_COUNT=256 CONFIG_FB=y CONFIG_VIDEO_SELECT=y CONFIG_FB_MATROX=y CONFIG_FB_MATROX_MILLENIUM=y CONFIG_VGA_CONSOLE=y CONFIG_DUMMY_CONSOLE=y CONFIG_SOUND=y CONFIG_SND=y CONFIG_SND_SEQUENCER=y CONFIG_SND_OSSEMUL=y CONFIG_SND_MIXER_OSS=y CONFIG_SND_PCM_OSS=y CONFIG_SND_SEQUENCER_OSS=y CONFIG_SND_CS46XX=y CONFIG_SND_CS46XX_NEW_DSP=y CONFIG_EXT2_FS=y CONFIG_EXT3_FS=y CONFIG_EXT3_FS_XATTR=y CONFIG_JBD=y CONFIG_FS_MBCACHE=y CONFIG_ISO9660_FS=y CONFIG_JOLIET=y CONFIG_UDF_FS=y CONFIG_FAT_FS=y CONFIG_MSDOS_FS=y CONFIG_VFAT_FS=y CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_DEVPTS_FS=y CONFIG_TMPFS=y CONFIG_RAMFS=y CONFIG_SMB_FS=y CONFIG_MSDOS_PARTITION=y CONFIG_SMB_NLS=y CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" CONFIG_NLS_CODEPAGE_437=y CONFIG_NLS_ISO8859_1=y CONFIG_X86_EXTRA_IRQS=y CONFIG_X86_FIND_SMP_CONFIG=y CONFIG_X86_MPPARSE=y CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=y CONFIG_X86_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y CONFIG_PC=y ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: William Lee Irwin III @ 2003-12-06 2:42 UTC
To: Colin Coe
Cc: linux-kernel

On Sat, Dec 06, 2003 at 10:32:45AM +0800, Colin Coe wrote:
> This indicates to me that the processing load is being evenly distributed
> across the two processors. Under v2.6.0-testxx however, 'cat
> /proc/interrupts' shows this:
> [root@host root]# cat /proc/interrupts
> CPU0 CPU1
> 0: 633122 30 IO-APIC-edge timer
> 1: 207 IO-APIC-edge i8042
> 2: XT-PIC cascade
> 4: 48 1 IO-APIC-edge serial
> 5: 449 1 IO-APIC-level eth1
> 10: 135 1 IO-APIC-level aic7xxx
> 11: 1447 1 IO-APIC-level eth0
> 12: 61 IO-APIC-edge i8042
> 14: IO-APIC-level CS46XX
> 15: 14982 1 IO-APIC-level megaraid

2.6 does balancing across packages, not logical cpus, so this will
happen and it will be largely harmless, except for what appears to
be some kind of bug where it's stealing the timer from logical cpu 1.

-- wli
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: William Lee Irwin III @ 2003-12-06 2:48 UTC
To: Colin Coe, linux-kernel

On Sat, Dec 06, 2003 at 10:32:45AM +0800, Colin Coe wrote:
>> CPU0 CPU1
>> 0: 633122 30 IO-APIC-edge timer
>> 1: 207 IO-APIC-edge i8042
>> 2: XT-PIC cascade
>> 4: 48 1 IO-APIC-edge serial
>> 5: 449 1 IO-APIC-level eth1
>> 10: 135 1 IO-APIC-level aic7xxx
>> 11: 1447 1 IO-APIC-level eth0
>> 12: 61 IO-APIC-edge i8042
>> 14: IO-APIC-level CS46XX
>> 15: 14982 1 IO-APIC-level megaraid

On Fri, Dec 05, 2003 at 06:42:51PM -0800, William Lee Irwin III wrote:
> 2.6 does balancing across packages, not logical cpus, so this will
> happen and it will be largely harmless, except for what appears to
> be some kind of bug where it's stealing the timer from logical cpu 1.

Replied-to text trimming bogon, apparently. They're "LOC:" for local
APIC timer interrupts. What's reported for irq 0 above is odd though.

-- wli
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Nick Piggin @ 2003-12-06 2:48 UTC
To: William Lee Irwin III
Cc: Colin Coe, linux-kernel

William Lee Irwin III wrote:

>On Sat, Dec 06, 2003 at 10:32:45AM +0800, Colin Coe wrote:
>
>>This indicates to me that the processing load is being evenly distributed
>>across the two processors. Under v2.6.0-testxx however, 'cat
>>/proc/interrupts' shows this:
>>[root@host root]# cat /proc/interrupts
>> CPU0 CPU1
>> 0: 633122 30 IO-APIC-edge timer
>> 1: 207 IO-APIC-edge i8042
>> 2: XT-PIC cascade
>> 4: 48 1 IO-APIC-edge serial
>> 5: 449 1 IO-APIC-level eth1
>> 10: 135 1 IO-APIC-level aic7xxx
>> 11: 1447 1 IO-APIC-level eth0
>> 12: 61 IO-APIC-edge i8042
>> 14: IO-APIC-level CS46XX
>> 15: 14982 1 IO-APIC-level megaraid
>>
>
>2.6 does balancing across packages, not logical cpus, so this will
>happen and it will be largely harmless, except for what appears to
>be some kind of bug where it's stealing the timer from logical cpu 1.
>
>

Although in this case Colin has 2 PPro 200s.

Colin - process load should be evenly distributed between CPUs, and this
is generally the most important thing. Big networking loads (most commonly)
can put a lot of time into processing interrupts though.
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: William Lee Irwin III @ 2003-12-06 3:07 UTC
To: Nick Piggin
Cc: Colin Coe, linux-kernel

On Sat, Dec 06, 2003 at 01:48:54PM +1100, Nick Piggin wrote:
> Although in this case Colin has 2 PPro 200s.
> Colin - process load should be evenly distributed between CPUs, and this
> is generally the most important thing. Big networking loads (most commonly)
> can put a lot of time into processing interrupts though.

That is rather busted, then.

Colin, could you try booting with noirqbalance on the kernel command
line?

-- wli
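After rebooting, whether the option actually reached the kernel can be read back from /proc/cmdline (the dmesg elsewhere in this thread reports it as "Kernel command line: ro root=/dev/hda2 noirqbalance"). A small Python sketch of that check follows; the helper names has_boot_option and read_cmdline are illustrative, and the sample string is the command line quoted from that dmesg.

```python
def has_boot_option(cmdline, option):
    """True if `option` appears as a whole whitespace-separated token.

    Matching whole tokens avoids false positives from substrings,
    for example an option name embedded in a root= path.
    """
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Return the running kernel's command line as a single string."""
    with open(path) as f:
        return f.read().strip()


# Command line as reported in the dmesg output in this thread.
SAMPLE = "ro root=/dev/hda2 noirqbalance"

if __name__ == "__main__":
    print("sample has noirqbalance:", has_boot_option(SAMPLE, "noirqbalance"))
    try:
        live = read_cmdline()
        print("running kernel has noirqbalance:",
              has_boot_option(live, "noirqbalance"))
    except OSError:
        print("/proc/cmdline is not available on this system")
```

Confirming the flag this way separates "the option was never passed" (for example, a bootloader entry that was not updated) from "the option was passed but the interrupt distribution did not change".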
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Stian Jordet @ 2003-12-06 4:28 UTC
To: William Lee Irwin III
Cc: Nick Piggin, Colin Coe, linux-kernel

Sat, 06.12.2003 at 04.07, William Lee Irwin III wrote:
> On Sat, Dec 06, 2003 at 01:48:54PM +1100, Nick Piggin wrote:
> > Although in this case Colin has 2 PPro 200s.
> > Colin - process load should be evenly distributed between CPUs, and this
> > is generally the most important thing. Big networking loads (most commonly)
> > can put a lot of time into processing interrupts though.
>
> That is rather busted, then.

Uhm.. I was under the impression that this was expected behaviour? If
not, I guess I'm having problems too?

           CPU0       CPU1
  0:   91068534         45    IO-APIC-edge  timer
  1:      65293          1    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  3:         71          1    IO-APIC-edge  serial
  8:     325118          1    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 14:     245619          1    IO-APIC-edge  ide0
 15:         21          2    IO-APIC-edge  ide1
 17:     444526          0   IO-APIC-level  aic7xxx, EMU10K1
 18:       1112          0   IO-APIC-level  aic7xxx, yenta
 19:    6427306          1   IO-APIC-level  saa7134[0], yenta, eth0, ide2
 21:      34725    9049384   IO-APIC-level  uhci_hcd, uhci_hcd, uhci_hcd
NMI:          0          0
LOC:   91065099   91064973
ERR:          0
MIS:          2

This is with an uptime of almost 26 hours. Dual P3. USB uses lots of
interrupts from both CPUs, but I'm running both the aic7xxx and eth0
quite hard at times...

Best regards,
Stian
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 4:28 ` Stian Jordet @ 2003-12-06 4:37 ` William Lee Irwin III 2003-12-06 4:48 ` Stian Jordet 2003-12-06 14:07 ` Adam Kropelin 0 siblings, 2 replies; 27+ messages in thread
From: William Lee Irwin III @ 2003-12-06 4:37 UTC (permalink / raw)
To: Stian Jordet; +Cc: Nick Piggin, Colin Coe, linux-kernel

On Sat, Dec 06, 2003 at 05:28:38AM +0100, Stian Jordet wrote:
> Uhm.. I was under the impression that this was expected behaviour? If
> not, I guess I'm having problems too?
>            CPU0       CPU1
>   0:   91068534         45    IO-APIC-edge  timer
>   1:      65293          1    IO-APIC-edge  i8042
>   2:          0          0          XT-PIC  cascade
>   3:         71          1    IO-APIC-edge  serial
>   8:     325118          1    IO-APIC-edge  rtc
>   9:          0          0   IO-APIC-level  acpi
>  14:     245619          1    IO-APIC-edge  ide0

Yeah, it looks like it hit you too.

Could you boot with noirqbalance on the kernel commandline and see if
the problem goes away?

-- wli

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 4:37 ` William Lee Irwin III @ 2003-12-06 4:48 ` Stian Jordet 2003-12-06 4:54 ` William Lee Irwin III 2003-12-06 14:07 ` Adam Kropelin 1 sibling, 1 reply; 27+ messages in thread
From: Stian Jordet @ 2003-12-06 4:48 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Nick Piggin, Colin Coe, linux-kernel

On Sat, 06.12.2003 at 05:37, William Lee Irwin III wrote:
> Yeah, it looks like it hit you too.
>
> Could you boot with noirqbalance on the kernel commandline and see if
> the problem goes away?

Wow, that actually fixed it :)

           CPU0       CPU1
  0:      65636      63667    IO-APIC-edge  timer
  1:        150        136    IO-APIC-edge  i8042
  2:          0          0          XT-PIC  cascade
  3:          2          1    IO-APIC-edge  serial
  8:          3          1    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 14:         18         37    IO-APIC-edge  ide0
 15:         16          7    IO-APIC-edge  ide1
 17:       4830       4846   IO-APIC-level  aic7xxx, EMU10K1
 18:        218        210   IO-APIC-level  aic7xxx, yenta
 19:       3307       4259   IO-APIC-level  saa7134[0], yenta, eth0, ide2
 21:      41562      40666   IO-APIC-level  uhci_hcd, uhci_hcd, uhci_hcd
NMI:          0          0
LOC:     129121     129273
ERR:          0
MIS:          0

This is after about one minute of uptime.

Thanks :)

Best regards,
Stian

^ permalink raw reply [flat|nested] 27+ messages in thread
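Note: the before/after comparison in this subthread can be checked mechanically. The following is a minimal sketch, not anything posted in the thread: it parses /proc/interrupts-style text and reports what fraction of each IRQ was handled on CPU0. The helper names parse_interrupts and cpu0_share are hypothetical, and the sample rows are copied from the two listings Stian posted (timer and ide0 only).

```python
# Sketch only: helper names are made up for illustration.
# Sample rows are taken from the /proc/interrupts listings in this thread.

BEFORE = """
           CPU0       CPU1
  0:   91068534         45    IO-APIC-edge  timer
 14:     245619          1    IO-APIC-edge  ide0
NMI:          0          0
ERR:          0
"""

AFTER = """
           CPU0       CPU1
  0:      65636      63667    IO-APIC-edge  timer
 14:         18         37    IO-APIC-edge  ide0
NMI:          0          0
ERR:          0
"""


def parse_interrupts(text):
    """Return {label: [count per CPU]} for rows that have one count per CPU."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        per_cpu = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            per_cpu.append(int(field))
        # Rows such as ERR/MIS carry a single global count; skip them.
        if len(per_cpu) == ncpus:
            counts[label.strip()] = per_cpu
    return counts


def cpu0_share(counts):
    """Fraction of each IRQ's interrupts that landed on CPU0 (idle IRQs skipped)."""
    return {irq: c[0] / sum(c) for irq, c in counts.items() if sum(c) > 0}


if __name__ == "__main__":
    for name, text in (("before", BEFORE), ("after noirqbalance", AFTER)):
        shares = cpu0_share(parse_interrupts(text))
        print(name, {irq: round(s, 3) for irq, s in shares.items()})
```

On a live system the same functions could be fed the contents of /proc/interrupts read from disk; a share near 1.0 for every busy IRQ is the symptom reported here.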
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 4:48 ` Stian Jordet @ 2003-12-06 4:54 ` William Lee Irwin III 2003-12-06 4:57 ` Stian Jordet ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread
From: William Lee Irwin III @ 2003-12-06 4:54 UTC (permalink / raw)
To: Stian Jordet; +Cc: Nick Piggin, Colin Coe, linux-kernel

On Sat, 06.12.2003 at 05:37, William Lee Irwin III wrote:
>> Yeah, it looks like it hit you too.
>> Could you boot with noirqbalance on the kernel commandline and see if
>> the problem goes away?

On Sat, Dec 06, 2003 at 05:48:46AM +0100, Stian Jordet wrote:
> Wow, that actually fixed it :)
>            CPU0       CPU1
>   0:      65636      63667    IO-APIC-edge  timer
>   1:        150        136    IO-APIC-edge  i8042
>   2:          0          0          XT-PIC  cascade
>   3:          2          1    IO-APIC-edge  serial
>   8:          3          1    IO-APIC-edge  rtc
>   9:          0          0   IO-APIC-level  acpi
>  14:         18         37    IO-APIC-edge  ide0

Okay, irqbalance has gaffed (as predicted). Could you send in
/proc/cpuinfo and /var/log/dmesg?

-- wli

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 4:54 ` William Lee Irwin III @ 2003-12-06 4:57 ` Stian Jordet 2003-12-06 5:09 ` William Lee Irwin III 2003-12-06 7:11 ` Colin Coe 2003-12-08 15:45 ` bill davidsen 2 siblings, 1 reply; 27+ messages in thread
From: Stian Jordet @ 2003-12-06 4:57 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Nick Piggin, Colin Coe, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 224 bytes --]

On Sat, 06.12.2003 at 05:54, William Lee Irwin III wrote:
> Okay, irqbalance has gaffed (as predicted). Could you send in
> /proc/cpuinfo and /var/log/dmesg?

Here they are. Thanks for looking into this :)

Best regards,
Stian

[-- Attachment #2: cpuinfo.txt --]
[-- Type: text/plain, Size: 768 bytes --]

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10
cpu MHz : 1000.416
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1957.88

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10
cpu MHz : 1000.416
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1994.75

[-- Attachment #3: dmesg.txt --]
[-- Type: text/plain, Size: 20791 bytes --]

Linux version 2.6.0-test11 (root@chevrolet) (gcc version 3.3.2 (Debian)) #3 SMP Thu Dec 4 16:51:17 CET 2003
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001fffc000 (usable)
BIOS-e820: 000000001fffc000 -
000000001ffff000 (ACPI data) BIOS-e820: 000000001ffff000 - 0000000020000000 (ACPI NVS) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved) 511MB LOWMEM available. found SMP MP-table at 000f5500 hm, page 000f5000 reserved twice. hm, page 000f6000 reserved twice. hm, page 000f5000 reserved twice. hm, page 000f6000 reserved twice. On node 0 totalpages: 131068 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 126972 pages, LIFO batch:16 HighMem zone: 0 pages, LIFO batch:1 DMI 2.3 present. ACPI: RSDP (v000 ASUS ) @ 0x000f6930 ACPI: RSDT (v001 ASUS CV266DLS 0x30303031 MSFT 0x31313031) @ 0x1fffc000 ACPI: FADT (v001 ASUS CV266DLS 0x30303031 MSFT 0x31313031) @ 0x1fffc100 ACPI: BOOT (v001 ASUS CV266DLS 0x30303031 MSFT 0x31313031) @ 0x1fffc040 ACPI: MADT (v001 ASUS CV266DLS 0x30303031 MSFT 0x31313031) @ 0x1fffc080 ACPI: DSDT (v001 ASUS CV266DLS 0x00001000 MSFT 0x0100000b) @ 0x00000000 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x03] enabled) Processor #3 6:8 APIC version 17 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 6:8 APIC version 17 ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0]) IOAPIC[0]: Assigned apic_id 2 IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, IRQ 0-23 ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0] trigger[0x1]) ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x3] trigger[0x3]) Enabling APIC mode: Flat. Using 1 I/O APICs Using ACPI (MADT) for SMP configuration information Building zonelist for node : 0 Kernel command line: root=/dev/sda2 ro idebus=33 vga=0 noirqbalance ide_setup: idebus=33 Initializing CPU#0 PID hash table entries: 2048 (order 11: 16384 bytes) Detected 1000.416 MHz processor. 
Console: colour VGA+ 80x25 Memory: 512816k/524272k available (3543k kernel code, 10660k reserved, 1185k data, 208k init, 0k highmem) Calibrating delay loop... 1957.88 BogoMIPS Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 256K CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. POSIX conformance testing by UNIFIX CPU0: Intel Pentium III (Coppermine) stepping 0a per-CPU timeslice cutoff: 731.00 usecs. task migration cache decay timeout: 1 msecs. enabled ExtINT on CPU#0 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Booting processor 1/0 eip 3000 Initializing CPU#1 masked ExtINT on CPU#1 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Calibrating delay loop... 1994.75 BogoMIPS CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 CPU: After vendor identify, caps: 0383fbff 00000000 00000000 00000000 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 256K CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. CPU1: Intel Pentium III (Coppermine) stepping 0a Total of 2 processors activated (3952.64 BogoMIPS). ENABLING IO-APIC IRQs init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected. ..TIMER: vector=0x31 pin1=2 pin2=-1 number of MP IRQ sources: 15. 
number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... register #00: 02000000 ....... : physical APIC id: 02 ....... : Delivery Type: 0 ....... : LTS : 0 .... register #01: 00178011 ....... : max redirection entries: 0017 ....... : PRQ implemented: 1 ....... : IO APIC version: 0011 .... register #02: 00000000 ....... : arbitration: 00 .... IRQ redirection table: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect: 00 000 00 1 0 0 0 0 0 0 00 01 001 01 0 0 0 0 0 1 1 39 02 001 01 0 0 0 0 0 1 1 31 03 001 01 0 0 0 0 0 1 1 41 04 001 01 0 0 0 0 0 1 1 49 05 001 01 0 0 0 0 0 1 1 51 06 001 01 0 0 0 0 0 1 1 59 07 001 01 0 0 0 0 0 1 1 61 08 001 01 0 0 0 0 0 1 1 69 09 001 01 1 1 0 1 0 1 1 71 0a 001 01 0 0 0 0 0 1 1 79 0b 001 01 0 0 0 0 0 1 1 81 0c 001 01 0 0 0 0 0 1 1 89 0d 001 01 0 0 0 0 0 1 1 91 0e 001 01 0 0 0 0 0 1 1 99 0f 001 01 0 0 0 0 0 1 1 A1 10 000 00 1 0 0 0 0 0 0 00 11 000 00 1 0 0 0 0 0 0 00 12 000 00 1 0 0 0 0 0 0 00 13 000 00 1 0 0 0 0 0 0 00 14 000 00 1 0 0 0 0 0 0 00 15 000 00 1 0 0 0 0 0 0 00 16 000 00 1 0 0 0 0 0 0 00 17 000 00 1 0 0 0 0 0 0 00 IRQ to pin mappings: IRQ0 -> 0:2 IRQ1 -> 0:1 IRQ3 -> 0:3 IRQ4 -> 0:4 IRQ5 -> 0:5 IRQ6 -> 0:6 IRQ7 -> 0:7 IRQ8 -> 0:8 IRQ9 -> 0:9 IRQ10 -> 0:10 IRQ11 -> 0:11 IRQ12 -> 0:12 IRQ13 -> 0:13 IRQ14 -> 0:14 IRQ15 -> 0:15 .................................... done. Using local APIC timer interrupts. calibrating APIC timer ... ..... CPU clock speed is 999.0518 MHz. ..... host bus clock speed is 133.0269 MHz. checking TSC synchronization across 2 CPUs: passed. Starting migration thread for cpu 0 Bringing up 1 CPU 1 IS NOW UP! 
Starting migration thread for cpu 1 CPUS done 2 NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xf0d40, last bus=1 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20031002 IOAPIC[0]: Set PCI routing entry (2-9 -> 0x71 -> IRQ 9 Mode:1 Active:1) ACPI-0109: *** Error: No object was returned from [\_SB_.PCI0.PX40.IRDA._STA] (Node dff3f860), AE_NOT_EXIST ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14 15) ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 *12 14 15) ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 9 10 11 12 14 15) ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT] Linux Plug and Play Support v0.97 (c) Adam Belay PnPBIOS: Scanning system for PnP BIOS support... 
PnPBIOS: Found PnP BIOS installation structure at 0xc00fbd80 PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0xbdb0, dseg 0xf0000 pnp: 00:11: ioport range 0xe400-0xe47f has been reserved pnp: 00:11: ioport range 0xe800-0xe83f has been reserved PnPBIOS: 14 nodes reported by PnP BIOS; 14 recorded by driver SCSI subsystem initialized Linux Kernel Card Services options: [pci] [cardbus] [pm] drivers/usb/core/usb.c: registered new driver usbfs drivers/usb/core/usb.c: registered new driver hub IOAPIC[0]: Set PCI routing entry (2-17 -> 0xa9 -> IRQ 17 Mode:1 Active:1) 00:00:07[A] -> 2-17 -> IRQ 17 IOAPIC[0]: Set PCI routing entry (2-18 -> 0xb1 -> IRQ 18 Mode:1 Active:1) 00:00:07[B] -> 2-18 -> IRQ 18 IOAPIC[0]: Set PCI routing entry (2-19 -> 0xb9 -> IRQ 19 Mode:1 Active:1) 00:00:05[A] -> 2-19 -> IRQ 19 Pin 2-19 already programmed IOAPIC[0]: Set PCI routing entry (2-16 -> 0xc1 -> IRQ 16 Mode:1 Active:1) 00:00:0c[B] -> 2-16 -> IRQ 16 Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-17 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-18 already programmed Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-19 already programmed Pin 2-16 already programmed Pin 2-17 already programmed Pin 2-18 already programmed ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11 IOAPIC[0]: Set PCI routing entry (2-11 -> 0xc9 -> IRQ 27 Mode:1 Active:1) 00:00:11[A] -> 2-11 -> IRQ 27 ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10 IOAPIC[0]: Set PCI routing entry (2-10 -> 0xd1 -> IRQ 26 Mode:1 Active:1) 00:00:11[B] -> 2-10 -> IRQ 26 ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 12 IOAPIC[0]: Set PCI routing entry (2-12 -> 0xd9 -> IRQ 28 Mode:1 Active:1) 00:00:11[C] -> 2-12 -> IRQ 28 ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5 IOAPIC[0]: Set PCI routing entry (2-5 -> 0xe1 -> 
IRQ 21 Mode:1 Active:1) 00:00:11[D] -> 2-5 -> IRQ 21 Pin 2-16 already programmed Pin 2-17 already programmed PCI: Using ACPI for IRQ routing PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off' irda_init() NET: Registered protocol family 23 Bluetooth: Core ver 2.3 NET: Registered protocol family 31 Bluetooth: HCI device and connection manager initialized Bluetooth: HCI socket layer initialized radeonfb_pci_register BEGIN radeonfb: ref_clk=2700, ref_div=12, xclk=27000 defaults radeonfb: probed SDR SGRAM 131072k videoram radeon_get_moninfo: bios 4 scratch = 22000202 radeonfb: ATI Radeon 9700 ND SDR SGRAM 128 MB radeonfb: DVI port CRT monitor connected radeonfb: CRT port CRT monitor connected radeonfb_pci_register END SBF: Simple Boot Flag extension found and enabled. SBF: Setting boot flags 0x1 IA-32 Microcode Update Driver: v1.13 <tigran@veritas.com> ikconfig 0.7 with /proc/config* VFS: Disk quotas dquot_6.5.1 NTFS driver 2.1.5 [Flags: R/O]. udf: registering filesystem SGI XFS for Linux with ACLs, no debug enabled SGI XFS Quota Management subsystem ACPI: Power Button (FF) [PWRF] ACPI: Processor [CPU] (supports C1) ACPI: Processor [CPU1] (supports C1) pty: 256 Unix98 ptys configured Real Time Clock Driver v1.12 Non-volatile memory driver v1.2 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE] parport0: irq 7 detected parport0: cpp_daisy: aa5500ff(38) parport0: assign_addrs: aa5500ff(38) parport0: cpp_daisy: aa5500ff(38) parport0: assign_addrs: aa5500ff(38) Using anticipatory io scheduler Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 loop: loaded (max 8 devices) Intel(R) PRO/100 Network Driver - version 2.3.30-k1 Copyright (c) 2003 Intel Corporation e100: selftest OK. 
e100: eth0: Intel(R) PRO/100 Network Connection Hardware receive checksums enabled cpu cycle saver enabled PPP generic driver version 2.4.2 PPP Deflate Compression module registered PPP BSD Compression module registered Linux video capture interface: v1.00 saa7130/34: v4l2 driver version 0.2.9 loaded saa7134[0]: found at 0000:00:0c.0, rev: 1, irq: 19, latency: 32, mmio: 0xdc000000 saa7134[0]: subsystem: 153b:1143, board: Terratec Cinergy 600 TV [card=11,autodetected] saa7134[0]: board init: gpio is 50000 saa7134[0]: registered input device for IR saa7134[0]: i2c eeprom 00: 3b 15 43 11 ff ff ff ff ff ff ff ff ff ff ff ff saa7134[0]: i2c eeprom 10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff saa7134[0]: i2c eeprom 20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff saa7134[0]: i2c eeprom 30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff request_module: failed /sbin/modprobe -- tuner. error = -16 saa7134[0]: registered device video0 [v4l2] saa7134[0]: registered device vbi0 saa7134[0]: registered device radio0 tuner: chip found @ 0xc0 tuner: type set to 5 (Philips PAL_BG (FI1216 and compatibles)) registering 0-0060 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes VP_IDE: IDE controller at PCI slot 0000:00:11.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt8233 (rev 00) IDE UDMA100 controller on pci0000:00:11.1 ide0: BM-DMA at 0xa000-0xa007, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xa008-0xa00f, BIOS settings: hdc:pio, hdd:DMA hda: WDC WD1200JB-00CRA1, ATA DISK drive hdb: WDC WD1200JB-75CRA0, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 saa7134[0]/audio: audio carrier scan failed, using 5.500 MHz [default] hdd: IOMEGA ZIP 100 ATAPI Floppy, ATAPI FLOPPY drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(100) hda: hda1 hdb: max request size: 128KiB hdb: 
Host Protected Area detected. current capacity is 234375000 sectors (120000 MB) native capacity is 234441648 sectors (120034 MB) hdb: 234375000 sectors (120000 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(100) hdb: hdb1 ide-floppy driver 0.99.newide hdd: 98288kB, 196576 blocks, 512 sector size hdd: 98304kB, 96/64/32 CHS, 4096 kBps, 512 sector size, 2941 rpm hdd: hdd1 hdd2 hdd3 hdd4 scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.35 <Adaptec aic7899 Ultra160 SCSI adapter> aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit) Vendor: SEAGATE Model: ST318452LW Rev: 0004 Type: Direct-Access ANSI SCSI revision: 03 scsi0:A:0:0: Tagged Queuing enabled. Depth 253 scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.35 <Adaptec aic7899 Ultra160 SCSI adapter> aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs (scsi1:A:4): 20.000MB/s transfers (20.000MHz, offset 16) (scsi1:A:5): 20.000MB/s transfers (20.000MHz, offset 15) (scsi1:A:6): 7.812MB/s transfers (7.812MHz, offset 15) Vendor: PIONEER Model: DVD-ROM DVD-304F Rev: 1.03 Type: CD-ROM ANSI SCSI revision: 02 Vendor: YAMAHA Model: CRW-F1S Rev: 1.0g Type: CD-ROM ANSI SCSI revision: 02 Vendor: SEAGATE Model: DAT 04687-XXX Rev: 6610 Type: Sequential-Access ANSI SCSI revision: 02 st: Version 20030811, fixed bufsize 32768, s/g segs 256 Attached scsi tape st0 at scsi1, channel 0, id 6, lun 0 st0: try direct i/o: yes, max page reachable by HBA 1048575 SCSI device sda: 35843670 512-byte hdwr sectors (18352 MB) SCSI device sda: drive cache: write back sda: sda1 sda2 sda3 sda4 Attached scsi disk sda at scsi0, channel 0, id 0, lun 0 sr0: scsi3-mmc drive: 0x/0x cd/rw xa/form2 cdda tray Uniform CD-ROM driver Revision: 3.12 Attached scsi CD-ROM sr0 at scsi1, channel 0, id 4, lun 0 sr1: scsi3-mmc drive: 44x/44x writer cd/rw xa/form2 cdda tray Attached scsi CD-ROM sr1 at scsi1, channel 0, id 5, lun 0 Attached scsi generic sg0 at scsi0, channel 
0, id 0, lun 0, type 0 Attached scsi generic sg1 at scsi1, channel 0, id 4, lun 0, type 5 Attached scsi generic sg2 at scsi1, channel 0, id 5, lun 0, type 5 Attached scsi generic sg3 at scsi1, channel 0, id 6, lun 0, type 1 PCI: Enabling device 0000:00:0f.0 (0000 -> 0002) Yenta: CardBus bridge found at 0000:00:0f.0 [14ef:0220] Yenta: ISA IRQ list 0000, PCI irq18 Socket status: 30000006 PCI: Enabling device 0000:00:0f.1 (0000 -> 0002) Yenta: CardBus bridge found at 0000:00:0f.1 [14ef:0220] Yenta: ISA IRQ list 0000, PCI irq19 Socket status: 30000810 drivers/usb/host/uhci-hcd.c: USB Universal Host Controller Interface driver v2.1 uhci_hcd 0000:00:11.2: UHCI Host Controller uhci_hcd 0000:00:11.2: irq 21, io base 00009800 uhci_hcd 0000:00:11.2: new USB bus registered, assigned bus number 1 hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected uhci_hcd 0000:00:11.3: UHCI Host Controller uhci_hcd 0000:00:11.3: irq 21, io base 00009400 uhci_hcd 0000:00:11.3: new USB bus registered, assigned bus number 2 hub 2-0:1.0: USB hub found hub 2-0:1.0: 2 ports detected uhci_hcd 0000:00:11.4: UHCI Host Controller uhci_hcd 0000:00:11.4: irq 21, io base 00009000 uhci_hcd 0000:00:11.4: new USB bus registered, assigned bus number 3 hub 3-0:1.0: USB hub found hub 3-0:1.0: 2 ports detected drivers/usb/core/usb.c: registered new driver usblp drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver Initializing USB Mass Storage driver... drivers/usb/core/usb.c: registered new driver usb-storage USB Mass Storage support registered. 
drivers/usb/core/usb.c: registered new driver hiddev drivers/usb/core/usb.c: registered new driver hid drivers/usb/input/hid-core.c: v2.0:USB HID core driver drivers/usb/core/usb.c: registered new driver usbscanner drivers/usb/image/scanner.c: 0.4.15:USB Scanner Driver mice: PS/2 mouse device common for all mice input: PC Speaker gameport: pci0000:00:0e.1 speed 1704 kHz serio: i8042 AUX port at 0x60,0x64 irq 12 input: AT Translated Set 2 keyboard on isa0060/serio0 serio: i8042 KBD port at 0x60,0x64 irq 1 registering 1-002d registering 1-0049 registering 1-0048 NET: Registered protocol family 2 IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 32768 bind 32768) ip_conntrack version 2.1 (4095 buckets, 32760 max) - 304 bytes per conntrack ip_tables: (C) 2000-2002 Netfilter core team hub 3-0:1.0: new USB device on port 2, assigned address 2 drivers/usb/class/usblp.c: usblp0: USB Bidirectional printer dev 2 if 0 alt 0 proto 2 vid 0x03F0 pid 0x3304 ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>. http://snowman.net/projects/ipt_recent/ arp_tables: (C) 2002 David S. Miller NET: Registered protocol family 1 NET: Registered protocol family 17 IrCOMM protocol (Dag Brattli) Bluetooth: L2CAP ver 2.1 Bluetooth: L2CAP socket layer initialized Bluetooth: SCO (Voice Link) ver 0.3 Bluetooth: SCO socket layer initialized Bluetooth: RFCOMM ver 1.0 Bluetooth: RFCOMM socket layer initialized Bluetooth: RFCOMM TTY layer initialized Bluetooth: BNEP (Ethernet Emulation) ver 1.0 Bluetooth: BNEP filters: protocol multicast Software Suspend has malfunctioning SMP support. Disabled :( ACPI: (supports S0 S1 S4 S5) kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. 
Freeing unused kernel memory: 208k freed hub 2-0:1.0: new USB device on port 1, assigned address 2 drivers/usb/image/scanner.c: USB scanner device (0x04b8/0x011d) now attached to usb/scanner0 hub 2-0:1.0: new USB device on port 2, assigned address 3 Adding 1052248k swap on /dev/sda4. Priority:-1 extents:1 EXT3 FS on sda2, internal journal hub 1-0:1.0: new USB device on port 1, assigned address 2 input: USB HID v1.10 Keyboard [Logitech USB Receiver] on usb-0000:00:11.2-1 input: USB HID v1.10 Mouse [Logitech USB Receiver] on usb-0000:00:11.2-1 hub 1-0:1.0: new USB device on port 2, assigned address 3 usb 1-2: USB disconnect, address 3 Bluetooth: HCI USB driver ver 2.4 drivers/usb/core/usb.c: registered new driver hci_usb Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected VIA Pro 266 chipset agpgart: Maximum main memory to use for agp memory: 439M agpgart: AGP aperture is 64M @ 0xf8000000 hub 1-0:1.0: new USB device on port 2, assigned address 4 XFS mounting filesystem sda3 Ending clean XFS mount for filesystem: sda3 kjournald starting. Commit interval 5 seconds EXT3-fs warning: maximal mount count reached, running e2fsck is recommended EXT3 FS on hda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. kjournald starting. Commit interval 5 seconds EXT3-fs warning: maximal mount count reached, running e2fsck is recommended EXT3 FS on hdb1, internal journal EXT3-fs: mounted filesystem with ordered data mode. NTFS volume version 3.0. usb 1-2: device not accepting address 4, error -110 hub 1-0:1.0: new USB device on port 2, assigned address 5 hci_usb: probe of 1-2:1.1 failed with error -5 hci_usb: probe of 1-2:1.2 failed with error -5 e100: eth0 NIC Link is Up 100 Mbps Full duplex nfs warning: mount version older than kernel nfs warning: mount version older than kernel ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 4:57 ` Stian Jordet @ 2003-12-06 5:09 ` William Lee Irwin III 2003-12-06 5:14 ` Stian Jordet 0 siblings, 1 reply; 27+ messages in thread
From: William Lee Irwin III @ 2003-12-06 5:09 UTC (permalink / raw)
To: Stian Jordet; +Cc: Nick Piggin, Colin Coe, linux-kernel

On Sat, 06.12.2003 at 05:54, William Lee Irwin III wrote:
>> Okay, irqbalance has gaffed (as predicted). Could you send in
>> /proc/cpuinfo and /var/log/dmesg?

On Sat, Dec 06, 2003 at 05:57:14AM +0100, Stian Jordet wrote:
> Here they are. Thanks for looking into this :)

This tells me you're not being mistaken for HT.

It also suggests it's the policy not doing what you want it to. If your
interrupt rate isn't high (according to what metric I have no idea;
presumably it should depend on the expense of handling it, which is
driver-dependent but about which the code has no knowledge), it won't
rebalance the irq's.

If you actually manage to get interrupt rates exceeding its thresholds,
you should see interrupts migrated, but only dynamically and on-demand,
not under light usage.

There might still be something wrong with it, but we'd have to dig deeper.

-- wli

^ permalink raw reply [flat|nested] 27+ messages in thread
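Note: the policy described in the message above (leave IRQs alone under light usage, migrate them only once their rate exceeds a threshold) can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel's irqbalance code: the function name rebalance, the per-interval rate inputs and the threshold value are all hypothetical, and the real metric is, as noted above, not something this thread pins down.

```python
# Toy model of threshold-gated IRQ migration. NOT the kernel algorithm;
# names, inputs and the threshold are assumptions made for illustration.

def rebalance(irq_rates, affinity, ncpus, threshold):
    """Return a new {irq: cpu} map.

    irq_rates: {irq: interrupts seen in the last sampling interval}
    affinity:  {irq: cpu currently servicing that irq}
    Only IRQs whose rate exceeds `threshold` are candidates for migration.
    """
    # Current per-CPU interrupt load for this interval.
    load = [0] * ncpus
    for irq, rate in irq_rates.items():
        load[affinity[irq]] += rate

    new_affinity = dict(affinity)
    # Look at the busiest IRQs first.
    for irq, rate in sorted(irq_rates.items(), key=lambda kv: -kv[1]):
        if rate <= threshold:
            continue  # "light usage": never migrated
        src = new_affinity[irq]
        dst = min(range(ncpus), key=lambda cpu: load[cpu])
        # Migrate only if the target ends up less loaded than the source was.
        if dst != src and load[dst] + rate < load[src]:
            load[src] -= rate
            load[dst] += rate
            new_affinity[irq] = dst
    return new_affinity


if __name__ == "__main__":
    pinned = {0: 0, 17: 0, 19: 0}  # everything starts on CPU0
    print(rebalance({0: 100, 17: 5, 19: 20}, pinned, 2, threshold=1000))
    print(rebalance({0: 100, 17: 5000, 19: 4000}, pinned, 2, threshold=1000))
```

In this model a lightly loaded machine keeps every IRQ on CPU0 indefinitely, which matches the listing Stian reported; whether his aic7xxx and eth0 load should have crossed the real thresholds is exactly the open question in the thread.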
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 5:09 ` William Lee Irwin III @ 2003-12-06 5:14 ` Stian Jordet 2003-12-06 5:40 ` William Lee Irwin III 0 siblings, 1 reply; 27+ messages in thread
From: Stian Jordet @ 2003-12-06 5:14 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Nick Piggin, Colin Coe, linux-kernel

On Sat, 06.12.2003 at 06:09, William Lee Irwin III wrote:
> On Sat, 06.12.2003 at 05:54, William Lee Irwin III wrote:
> >> Okay, irqbalance has gaffed (as predicted). Could you send in
> >> /proc/cpuinfo and /var/log/dmesg?
>
> On Sat, Dec 06, 2003 at 05:57:14AM +0100, Stian Jordet wrote:
> > Here they are. Thanks for looking into this :)
>
> This tells me you're not being mistaken for HT.
>
> It also suggests it's the policy not doing what you want it to. If your
> interrupt rate isn't high (according to what metric I have no idea;
> presumably it should depend on the expense of handling it, which is
> driver-dependent but about which the code has no knowledge), it won't
> rebalance the irq's.
>
> If you actually manage to get interrupt rates exceeding its thresholds,
> you should see interrupts migrated, but only dynamically and on-demand,
> not under light usage.

I really don't know the definition of "light usage", but I'm beating the
aic7xxx and eth0 quite hard at times, without any interrupts being
migrated. Anyway, thanks :) This hasn't been a problem for me so far,
and I doubt it ever will :)

Stian

^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 5:14 ` Stian Jordet @ 2003-12-06 5:40 ` William Lee Irwin III 2003-12-08 15:57 ` bill davidsen 0 siblings, 1 reply; 27+ messages in thread
From: William Lee Irwin III @ 2003-12-06 5:40 UTC (permalink / raw)
To: Stian Jordet; +Cc: Nick Piggin, Colin Coe, linux-kernel

On Sat, 06.12.2003 at 06:09, William Lee Irwin III wrote:
>> If you actually manage to get interrupt rates exceeding its thresholds,
>> you should see interrupts migrated, but only dynamically and on-demand,
>> not under light usage.

On Sat, Dec 06, 2003 at 06:14:15AM +0100, Stian Jordet wrote:
> I really don't know the definition of "light usage", but I'm beating the
> aic7xxx and eth0 quite hard at times, without any interrupts being
> migrated. Anyway, thanks :) This hasn't been a problem for me so far,
> and I doubt it ever will :)

Okay, this should be fixed. The entire subarch organization is wrong for
this anyway. It needs several axes to vary upon for the APIC-based
subarches:

(a) xAPIC (P-IV) vs. serial APIC (before P-IV)
(b) logical vs. physical IPI's
(c) logical vs. physical IO interrupts
(d) flat logical vs. clustered hierarchical DFR
(e) NMI wakeup vs. INIT wakeup
(f) software vs. hardware interrupt load balancing
(g) locality-dependent vs. locality-independent APIC destinations

The real problem with all this is that it was arranged around minimal
impact code changes instead of adequately describing hardware, and so
it gives rise to numerous corner cases and is generally brittle. Of
course, 2.6 is too frozen to do anything with it now, and ia32 will
likely be largely legacy during the course of 2.7, so the damage will
probably be permanent.

What you've run into is essentially that there is no distinction for
(a) or (f) in mach-default, which is what normal Pee Cees use. There
are several disturbing differences between the two cases which are for
the moment carefully avoided but at the very least raise my eyebrows.
For instance, both the physical broadcast destination and the size of
the physical APIC ID space differ between the two cases.

The difference you've been burned by is that current revisions of
xAPIC's have broken hardware interrupt load balancing. So singleton
fixed destinations are used with software interrupt balancing, instead
of the lowest priority destinations with many cpus in them that are
perfectly suitable for P-III's. Under your light usage, that pinned all
interrupts on cpu 0.

-- wli

^ permalink raw reply [flat|nested] 27+ messages in thread
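Note: the point about axes (a) and (f) can be made concrete with a small data model. This is only an illustrative sketch of "describe the hardware, then derive the policy": the type and function names (ApicKind, Balancing, SubarchDescription, describe) are hypothetical and are not kernel interfaces. The only facts encoded are the ones stated in the message above, namely that xAPIC systems fall back to fixed single-CPU destinations with software balancing, while serial-APIC systems such as P-IIIs are suited to lowest-priority delivery to a multi-CPU destination.

```python
# Illustrative model of axes (a) and (f) from the message above.
# All names here are made up; this does not mirror arch/i386 code.
from dataclasses import dataclass
from enum import Enum


class ApicKind(Enum):
    SERIAL_APIC = "serial APIC (before P-IV)"
    XAPIC = "xAPIC (P-IV)"


class Balancing(Enum):
    HARDWARE = "lowest-priority delivery to a multi-CPU destination"
    SOFTWARE = "fixed delivery to one CPU, migrated by software"


@dataclass(frozen=True)
class SubarchDescription:
    apic_kind: ApicKind
    balancing: Balancing


def describe(apic_kind):
    """Pick the balancing scheme from the APIC kind, as the message argues it should be."""
    if apic_kind is ApicKind.XAPIC:
        # Hardware balancing is reported broken on current xAPIC revisions.
        return SubarchDescription(apic_kind, Balancing.SOFTWARE)
    return SubarchDescription(apic_kind, Balancing.HARDWARE)


if __name__ == "__main__":
    for kind in ApicKind:
        d = describe(kind)
        print(f"{d.apic_kind.value}: {d.balancing.value}")
```

The complaint in the thread is that mach-default makes no such distinction, so a dual P-III gets the xAPIC-style software scheme; the remaining axes (b) through (e) and (g) would be further fields in the same description.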
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-06 5:40 ` William Lee Irwin III @ 2003-12-08 15:57 ` bill davidsen 2003-12-08 16:47 ` William Lee Irwin III 0 siblings, 1 reply; 27+ messages in thread From: bill davidsen @ 2003-12-08 15:57 UTC (permalink / raw) To: linux-kernel In article <20031206054031.GM8039@holomorphy.com>, William Lee Irwin III <wli@holomorphy.com> wrote: | The real problem with all this is that it was arranged around minimal | impact code changes instead of adequately describing hardware, and so | it gives rise to numerous corner cases and is generally brittle. Of | course, 2.6 is too frozen to do anything with it now, and ia32 will | likely be largely legacy during the course of 2.7, so the damage will | probably be permanent. I don't follow your thinking here, 2.6.0 is certainly frozen, but I see no reason this can't be fixed in 2.6 if someone cares to do so. The amount of code is small, and as long as the interrupt gets serviced by exactly one CPU I doubt the performance could get worse. I don't see ia32 going away, either, unless you see 2.7 in a more distant timeframe than I do. Looking at the power issue I predict significant ia32 in laptops, and due to cost issues in desktops and servers. Also, I suspect that Linux hackers have a much higher percentage of SMP ia32 machines than the general public, which encourages enhancements in that area. -- bill davidsen <davidsen@tmr.com> CTO, TMR Associates, Inc Doing interesting things with little computers since 1979. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx? 2003-12-08 15:57 ` bill davidsen @ 2003-12-08 16:47 ` William Lee Irwin III 0 siblings, 0 replies; 27+ messages in thread From: William Lee Irwin III @ 2003-12-08 16:47 UTC (permalink / raw) To: bill davidsen; +Cc: linux-kernel On Mon, Dec 08, 2003 at 03:57:54PM +0000, bill davidsen wrote: > I don't follow your thinking here, 2.6.0 is certainly frozen, but I > see no reason this can't be fixed in 2.6 if someone cares to do so. The > amount of code is small, and as long as the interrupt gets serviced by > exactly one CPU I doubt the performance could get worse. > I don't see ia32 going away, either, unless you see 2.7 in a more > distant timeframe than I do. Looking at the power issue I predict > significant ia32 in laptops, and due to cost issues in desktops and > servers. Also, I suspect that Linux hackers have a much higher > percentage of SMP ia32 machines than the general public, which > encourages enhancements in that area. What I'm on about is that some interfaces internal to arch/i386/ for APIC management are ad hoc and the configuration boundaries and case analysis in the code don't match the configuration boundaries or cases of the hardware. The API has to change to accurately describe machines in order to accurately drive the machines. The worst offense for end users is probably the mismeasured physical APIC ID space on smaller (mach-default) xAPIC systems which should lose cpus with sufficiently sparse physical APIC ID's. The only current use of physical broadcast is for clustered hierarchical serial APIC RTE's. -- wli ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Colin Coe @ 2003-12-06 7:11 UTC
To: William Lee Irwin III; +Cc: Stian Jordet, Nick Piggin, Colin Coe, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 935 bytes --]

/proc/cpuinfo and dmesg output are attached.

Thanks!

CC

--
"Obnoxious frog..." Spike, 2071AD

William Lee Irwin III said:
> Sat, 06.12.2003 at 05.37, William Lee Irwin III wrote:
>>> Yeah, it looks like it hit you too.
>>> Could you boot with noirqbalance on the kernel commandline and see if
>>> the problem goes away?
>
> On Sat, Dec 06, 2003 at 05:48:46AM +0100, Stian Jordet wrote:
>> Wow, that actually fixed it :)
>>            CPU0       CPU1
>>   0:      65636      63667    IO-APIC-edge   timer
>>   1:        150        136    IO-APIC-edge   i8042
>>   2:          0          0    XT-PIC         cascade
>>   3:          2          1    IO-APIC-edge   serial
>>   8:          3          1    IO-APIC-edge   rtc
>>   9:          0          0    IO-APIC-level  acpi
>>  14:         18         37    IO-APIC-edge   ide0
>
> Okay, irqbalance has gaffed (as predicted). Could you send in
> /proc/cpuinfo and /var/log/dmesg?
> > > -- wli > [-- Attachment #2: cpuinfo.txt --] [-- Type: text/plain, Size: 926 bytes --] processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 1 model name : Pentium Pro stepping : 9 cpu MHz : 199.489 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov bogomips : 390.14 processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 1 model name : Pentium Pro stepping : 9 cpu MHz : 199.489 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov bogomips : 397.31 [-- Attachment #3: dmesg.txt --] [-- Type: text/plain, Size: 11992 bytes --] Linux version 2.6.0-test11 (root@host.domain.com) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #3 SMP Sat Dec 6 09:49:19 WST 2003 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 0000000028000000 (usable) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved) 640MB LOWMEM available. found SMP MP-table at 000fdba0 hm, page 000fd000 reserved twice. hm, page 000fe000 reserved twice. hm, page 0009f000 reserved twice. hm, page 000a0000 reserved twice. On node 0 totalpages: 163840 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 159744 pages, LIFO batch:16 HighMem zone: 0 pages, LIFO batch:1 DMI not present. ACPI: Unable to locate RSDP Intel MultiProcessor Specification v1.4 Virtual Wire compatibility mode. 
OEM ID: DELL Product ID: POWEREDGE APIC at: 0xFEE00000 Processor #3 6:1 APIC version 17 Processor #0 6:1 APIC version 17 I/O APIC #1 Version 17 at 0xFEC00000. Enabling APIC mode: Flat. Using 1 I/O APICs Processors: 2 Building zonelist for node : 0 Kernel command line: ro root=/dev/sdc1 noirqbalance Initializing CPU#0 PID hash table entries: 4096 (order 12: 32768 bytes) Detected 199.489 MHz processor. Console: colour VGA+ 80x25 Memory: 643948k/655360k available (2308k kernel code, 10636k reserved, 709k data, 456k init, 0k highmem) Calibrating delay loop... 390.14 BogoMIPS Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) checking if image is initramfs...it isn't (no cpio magic); looks like an initrd Freeing initrd memory: 82k freed CPU: After generic identify, caps: 0000fbff 00000000 00000000 00000000 CPU: After vendor identify, caps: 0000fbff 00000000 00000000 00000000 CPU: L1 I cache: 8K, L1 D cache: 8K CPU: L2 cache: 512K CPU: After all inits, caps: 0000f3ff 00000000 00000000 00000040 Checking 'hlt' instruction... OK. POSIX conformance testing by UNIFIX CPU0: Intel Pentium Pro stepping 09 per-CPU timeslice cutoff: 1460.32 usecs. task migration cache decay timeout: 2 msecs. enabled ExtINT on CPU#0 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Booting processor 1/0 eip 2000 Initializing CPU#1 masked ExtINT on CPU#1 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Calibrating delay loop... 397.31 BogoMIPS CPU: After generic identify, caps: 0000fbff 00000000 00000000 00000000 CPU: After vendor identify, caps: 0000fbff 00000000 00000000 00000000 CPU: L1 I cache: 8K, L1 D cache: 8K CPU: L2 cache: 512K CPU: After all inits, caps: 0000f3ff 00000000 00000000 00000040 CPU1: Intel Pentium Pro stepping 09 Total of 2 processors activated (787.45 BogoMIPS). 
ENABLING IO-APIC IRQs Setting 1 in the phys_id_present_map ...changing IO-APIC physical APIC ID to 1 ... ok. init IO_APIC IRQs IO-APIC (apicid-pin) 1-0 not connected. ..TIMER: vector=0x31 pin1=2 pin2=0 number of MP IRQ sources: 16. number of IO-APIC #1 registers: 16. testing the IO APIC....................... IO APIC #1...... .... register #00: 01000000 ....... : physical APIC id: 01 ....... : Delivery Type: 0 ....... : LTS : 0 .... register #01: 000F0011 ....... : max redirection entries: 000F ....... : PRQ implemented: 0 ....... : IO APIC version: 0011 .... register #02: 00000000 ....... : arbitration: 00 .... IRQ redirection table: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect: 00 000 00 1 0 0 0 0 0 0 00 01 001 01 0 0 0 0 0 1 1 39 02 001 01 0 0 0 0 0 1 1 31 03 001 01 0 0 0 0 0 1 1 41 04 001 01 0 0 0 0 0 1 1 49 05 001 01 1 1 0 0 0 1 1 51 06 001 01 0 0 0 0 0 1 1 59 07 001 01 1 1 0 0 0 1 1 61 08 001 01 0 0 0 0 0 1 1 69 09 001 01 1 1 0 0 0 1 1 71 0a 001 01 1 1 0 0 0 1 1 79 0b 001 01 1 1 0 0 0 1 1 81 0c 001 01 0 0 0 0 0 1 1 89 0d 001 01 0 0 0 0 0 1 1 91 0e 001 01 1 1 0 0 0 1 1 99 0f 001 01 1 1 0 0 0 1 1 A1 IRQ to pin mappings: IRQ0 -> 0:2 IRQ1 -> 0:1 IRQ3 -> 0:3 IRQ4 -> 0:4 IRQ5 -> 0:5 IRQ6 -> 0:6 IRQ7 -> 0:7 IRQ8 -> 0:8 IRQ9 -> 0:9 IRQ10 -> 0:10 IRQ11 -> 0:11 IRQ12 -> 0:12 IRQ13 -> 0:13 IRQ14 -> 0:14 IRQ15 -> 0:15 .................................... done. Using local APIC timer interrupts. calibrating APIC timer ... ..... CPU clock speed is 199.0394 MHz. ..... host bus clock speed is 66.0464 MHz. checking TSC synchronization across 2 CPUs: passed. Starting migration thread for cpu 0 Bringing up 1 CPU 1 IS NOW UP! Starting migration thread for cpu 1 CPUS done 4 NET: Registered protocol family 16 EISA bus registered PCI: PCI BIOS revision 2.10 entry at 0xf814d, last bus=1 PCI: Using configuration type 1 mtrr: v2.0 (20020519) mtrr: your CPUs had inconsistent fixed MTRR settings mtrr: probably your BIOS does not setup all CPUs. mtrr: corrected configuration. 
Linux Plug and Play Support v0.97 (c) Adam Belay PnPBIOS: Scanning system for PnP BIOS support... PnPBIOS: Found PnP BIOS installation structure at 0xc00fdb70 PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0x4ce5, dseg 0xf0000 pnp: 00:0f: ioport range 0x400-0x407 has been reserved pnp: 00:0f: ioport range 0x40a-0x40c has been reserved pnp: 00:0f: ioport range 0x410-0x43f has been reserved pnp: 00:0f: ioport range 0x461-0x462 has been reserved pnp: 00:0f: ioport range 0x464-0x465 has been reserved pnp: 00:0f: ioport range 0x481-0x48b has been reserved pnp: 00:0f: ioport range 0x4c6-0x4c6 has been reserved PnPBIOS: 17 nodes reported by PnP BIOS; 17 recorded by driver SCSI subsystem initialized PCI: Probing PCI hardware PCI: Probing PCI hardware (bus 00) matroxfb: Matrox Millennium II (PCI) detected matroxfb: MTRR's turned on matroxfb: 640x480x8bpp (virtual: 640x65536) matroxfb: framebuffer at 0xFB000000, mapped to 0xe8805000, size 4194304 fb0: MATROX frame buffer device fb0: initializing hardware ikconfig 0.7 with /proc/config* udf: registering filesystem Limiting direct PCI/PCI transfers. isapnp: Scanning for PnP cards... isapnp: No Plug & Play device found pty: 256 Unix98 ptys configured Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A Using anticipatory io scheduler Floppy drive(s): fd0 is 1.44M FDC 0 is a National Semiconductor PC87306 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html 0000:00:10.0: 3Com PCI 3c905C Tornado at 0xf400. Vers LK1.1.19 0000:00:13.0: 3Com PCI 3c905B Cyclone 100baseTx at 0xf480. 
Vers LK1.1.19 PPP generic driver version 2.4.2 PPP Deflate Compression module registered PPP BSD Compression module registered scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.35 <Adaptec aic7860 Ultra SCSI adapter> aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs (scsi0:A:5): 10.000MB/s transfers (10.000MHz, offset 15) (scsi0:A:6): 5.000MB/s transfers (5.000MHz, offset 15) Vendor: NEC Model: CD-ROM DRIVE:462 Rev: 1.14 Type: CD-ROM ANSI SCSI revision: 02 Vendor: DEC Model: DLT2000 Rev: 971E Type: Sequential-Access ANSI SCSI revision: 02 megaraid: v2.00.3 (Release Date: Wed Feb 19 08:51:30 EST 2003) megaraid: found 0x101e:0x9010:bus 1:slot 13:func 0 scsi1:Found MegaRAID controller at 0xd090, IRQ:15 megaraid: [U.75:1.44] detected 4 logical drives. megaraid: channel[0] is raid. megaraid: channel[1] is raid. scsi1 : LSI Logic MegaRAID U.75 254 commands 16 targs 5 chans 7 luns scsi1: scanning scsi channel 0 for logical drives. Vendor: MegaRAID Model: LD0 RAID1 8568R Rev: U.75 Type: Direct-Access ANSI SCSI revision: 02 Vendor: MegaRAID Model: LD1 RAID1 4088R Rev: U.75 Type: Direct-Access ANSI SCSI revision: 02 Vendor: MegaRAID Model: LD2 RAID0 4088R Rev: U.75 Type: Direct-Access ANSI SCSI revision: 02 Vendor: MegaRAID Model: LD3 RAID0 4088R Rev: U.75 Type: Direct-Access ANSI SCSI revision: 02 scsi1: scanning scsi channel 4 [P0] for physical devices. Vendor: DELL Model: 6UW BACKPLANE Rev: 7 Type: Processor ANSI SCSI revision: 02 scsi1: scanning scsi channel 5 [P1] for physical devices. 
st: Version 20030811, fixed bufsize 32768, s/g segs 256 Attached scsi tape st0 at scsi0, channel 0, id 6, lun 0 st0: try direct i/o: yes, max page reachable by HBA 1048575 SCSI device sda: 17547264 512-byte hdwr sectors (8984 MB) sda: asking for cache data failed sda: assuming drive cache: write through sda: sda1 Attached scsi disk sda at scsi1, channel 0, id 0, lun 0 SCSI device sdb: 8372224 512-byte hdwr sectors (4287 MB) sdb: asking for cache data failed sdb: assuming drive cache: write through sdb: sdb1 Attached scsi disk sdb at scsi1, channel 0, id 1, lun 0 SCSI device sdc: 8372224 512-byte hdwr sectors (4287 MB) sdc: asking for cache data failed sdc: assuming drive cache: write through sdc: sdc1 Attached scsi disk sdc at scsi1, channel 0, id 2, lun 0 SCSI device sdd: 8372224 512-byte hdwr sectors (4287 MB) sdd: asking for cache data failed sdd: assuming drive cache: write through sdd: unknown partition table Attached scsi disk sdd at scsi1, channel 0, id 3, lun 0 sr0: scsi-1 drive Uniform CD-ROM driver Revision: 3.12 Attached scsi CD-ROM sr0 at scsi0, channel 0, id 5, lun 0 Attached scsi generic sg0 at scsi0, channel 0, id 5, lun 0, type 5 Attached scsi generic sg1 at scsi0, channel 0, id 6, lun 0, type 1 Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0, type 0 Attached scsi generic sg3 at scsi1, channel 0, id 1, lun 0, type 0 Attached scsi generic sg4 at scsi1, channel 0, id 2, lun 0, type 0 Attached scsi generic sg5 at scsi1, channel 0, id 3, lun 0, type 0 Attached scsi generic sg6 at scsi1, channel 4, id 6, lun 0, type 3 mice: PS/2 mouse device common for all mice input: ImExPS/2 Generic Explorer Mouse on isa0060/serio1 serio: i8042 AUX port at 0x60,0x64 irq 12 input: AT Translated Set 2 keyboard on isa0060/serio0 serio: i8042 KBD port at 0x60,0x64 irq 1 device-mapper: 1.0.6-ioctl (2002-10-15) initialised: dm@uk.sistina.com EISA: Probing bus 0 at 0000:00:0f.0 EISA: Mainboard DEL0058 detected. EISA: Detected 0 cards. 
Advanced Linux Sound Architecture Driver Version 0.9.7 (Thu Sep 25 19:16:36 2003 UTC). request_module: failed /sbin/modprobe -- snd-card-0. error = -16 ALSA device list: #0: Sound Fusion CS46xx at 0xfcffc000/0xfce00000, irq 14 NET: Registered protocol family 2 IP: routing cache hash table of 8192 buckets, 64Kbytes TCP: Hash tables configured (established 262144 bind 65536) ip_conntrack version 2.1 (5120 buckets, 40960 max) - 304 bytes per conntrack ip_tables: (C) 2000-2002 Netfilter core team ipt_recent v0.3.1: Stephen Frost <sfrost@snowman.net>. http://snowman.net/projects/ipt_recent/ arp_tables: (C) 2002 David S. Miller NET: Registered protocol family 1 NET: Registered protocol family 17 kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Freeing unused kernel memory: 456k freed EXT3 FS on sdc1, internal journal kjournald starting. Commit interval 5 seconds EXT3 FS on sdb1, internal journal EXT3-fs: mounted filesystem with ordered data mode. kudzu: numerical sysctl 1 23 is obsolete. process `named' is using obsolete setsockopt SO_BSDCOMPAT ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: bill davidsen @ 2003-12-08 15:45 UTC
To: linux-kernel

In article <20031206045409.GK8039@holomorphy.com>,
William Lee Irwin III <wli@holomorphy.com> wrote:

| Sat, 06.12.2003 at 05.37, William Lee Irwin III wrote:
| >> Yeah, it looks like it hit you too.
| >> Could you boot with noirqbalance on the kernel commandline and see if
| >> the problem goes away?
|
| On Sat, Dec 06, 2003 at 05:48:46AM +0100, Stian Jordet wrote:
| > Wow, that actually fixed it :)
| >            CPU0       CPU1
| >   0:      65636      63667    IO-APIC-edge   timer
| >   1:        150        136    IO-APIC-edge   i8042
| >   2:          0          0    XT-PIC         cascade
| >   3:          2          1    IO-APIC-edge   serial
| >   8:          3          1    IO-APIC-edge   rtc
| >   9:          0          0    IO-APIC-level  acpi
| >  14:         18         37    IO-APIC-edge   ide0
|
| Okay, irqbalance has gaffed (as predicted). Could you send in
| /proc/cpuinfo and /var/log/dmesg?

I think the most confusing thing about this was the choice of
"noirqbalance" as an option to mean "do balance irqs." I'm not sure that
the default to put all irqs on a single CPU is optimal in any case, but
the naming is particularly bad.

On light irq load the cache probably gets reloaded before the next
interrupt on modern CPUs, and under really heavy irq pressure I see
posts showing some overflow to other CPUs, so it's the in-between cases
which benefit. At least I hope people did look at cache and
context-switch effects before putting irqs on a single CPU; given the
people involved, I assume they are right.

--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Zwane Mwaikambo @ 2003-12-08 17:36 UTC
To: bill davidsen; +Cc: linux-kernel

On Mon, 8 Dec 2003, bill davidsen wrote:

> I think the most confusing thing about this was the choice of
> "noirqbalance" as an option to mean "do balance irqs." I'm not sure that
> the default to put all irqs on a single CPU is optimal in any case, but
> the naming is particularly bad.

Actually, noirqbalance means no in-kernel irq balancer. ia32 SMP systems
before P4 tend to round-robin interrupt handling in hardware by using an
APIC bus arbitration scheme. P4 doesn't, one reason being the missing
Arbitration ID register and the use of a bus cycle to determine which
processor should handle the interrupt depending on the status of the
Task Priority Register on each local APIC (processor). So in essence we
should be using the TPR to make interrupt balancing decisions with
P4/Xeon.

So all noirqbalance will do is disable the in-kernel balancer.
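For anyone comparing results across machines, it helps to confirm that the option was really on the command line of the running kernel. The following is only a minimal sketch, not anything from the kernel tree: the helper names has_kernel_param and read_cmdline are made up for illustration, and it assumes the usual /proc/cmdline file is available. The sample string is the command line shown in the dmesg attachment earlier in this thread.

```python
def has_kernel_param(cmdline, name):
    """Return True if a boot parameter called `name` appears in `cmdline`.

    Matches both bare flags ("noirqbalance") and key=value pairs
    ("root=/dev/sdc1") by comparing the text before the first '='.
    """
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        if key == name:
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    # Command line as reported in the dmesg attachment in this thread.
    sample = "ro root=/dev/sdc1 noirqbalance"
    print(has_kernel_param(sample, "noirqbalance"))
```

On a live system, has_kernel_param(read_cmdline(), "noirqbalance") would give the same answer for the booted kernel.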
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Adam Kropelin @ 2003-12-06 14:07 UTC
To: William Lee Irwin III, Stian Jordet, Nick Piggin, Colin Coe, linux-kernel

On Fri, Dec 05, 2003 at 08:37:57PM -0800, William Lee Irwin III wrote:
> On Sat, Dec 06, 2003 at 05:28:38AM +0100, Stian Jordet wrote:
> > Uhm.. I was under the impression that this was expected behaviour? If
> > not, I guess I'm having problems too?
> >            CPU0       CPU1
> >   0:   91068534         45    IO-APIC-edge   timer
> >   1:      65293          1    IO-APIC-edge   i8042
> >   2:          0          0    XT-PIC         cascade
> >   3:         71          1    IO-APIC-edge   serial
> >   8:     325118          1    IO-APIC-edge   rtc
> >   9:          0          0    IO-APIC-level  acpi
> >  14:     245619          1    IO-APIC-edge   ide0
>
> Yeah, it looks like it hit you too.
>
> Could you boot with noirqbalance on the kernel commandline and see if
> the problem goes away?

Sounds like you have things under control, but if you need another data
point I can provide my info. Dual ppro200, same behaviors described in
this thread. Booting with noirqbalance fixes it.

--Adam
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Colin Coe @ 2003-12-06 7:02 UTC
To: William Lee Irwin III; +Cc: linux-kernel

Sorry about the delay.

Booted with noirqbalance.

[root@host root]# cat /proc/interrupts
           CPU0       CPU1
  0:    7411777    5971987    IO-APIC-edge   timer
  1:          7          4    IO-APIC-edge   i8042
  2:          0          0    XT-PIC         cascade
  4:         16         42    IO-APIC-edge   serial
  5:       4915       4820    IO-APIC-level  eth1
 10:         67         69    IO-APIC-level  aic7xxx
 11:        325        266    IO-APIC-level  eth0
 12:         47        109    IO-APIC-edge   i8042
 14:          0          0    IO-APIC-level  CS46XX
 15:       6398       6401    IO-APIC-level  megaraid
NMI:          0          0
LOC:   13383659   13383658
ERR:          0
MIS:          0

That looks a lot better...

Thanks!

--
"Obnoxious frog..." Spike, 2071AD

William Lee Irwin III said:
> On Sat, Dec 06, 2003 at 01:48:54PM +1100, Nick Piggin wrote:
>> Although in this case Colin has 2 PPro 200s.
>> Colin - process load should be evenly distributed between CPUs, and this
>> is generally the most important thing. Big networking loads (most
>> commonly)
>> can put a lot of time into processing interrupts though.
>
> That is rather busted, then.
>
> Colin, could you try booting with noirqbalance on the kernel command
> line?
>
>
> -- wli
>
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: William Lee Irwin III @ 2003-12-06 10:58 UTC
To: Colin Coe; +Cc: linux-kernel

On Sat, Dec 06, 2003 at 03:02:22PM +0800, Colin Coe wrote:
> Sorry about the delay.
> Booted with noirqbalance.

This leads to a similar conclusion to Stian Jordet's case. It's not
mistaking you for HT, it's the lack of an internal distinction between
the cases that need and don't need irq balancing.

-- wli
* Re: SMP broken on Dell PowerEdge 4100/200 under 2.6.0-testxx?
From: Ethan Weinstein @ 2003-12-06 20:08 UTC
To: linux-kernel; +Cc: colin

Colin Coe wrote:
> Sorry about the delay.
>
> Booted with noirqbalance.
>
> [root@host root]# cat /proc/interrupts
>            CPU0       CPU1
>   0:    7411777    5971987    IO-APIC-edge   timer
>   1:          7          4    IO-APIC-edge   i8042
>   2:          0          0    XT-PIC         cascade
>   4:         16         42    IO-APIC-edge   serial
>   5:       4915       4820    IO-APIC-level  eth1
>  10:         67         69    IO-APIC-level  aic7xxx
>  11:        325        266    IO-APIC-level  eth0
>  12:         47        109    IO-APIC-edge   i8042
>  14:          0          0    IO-APIC-level  CS46XX
>  15:       6398       6401    IO-APIC-level  megaraid
> NMI:          0          0
> LOC:   13383659   13383658
> ERR:          0
> MIS:          0
>
> That looks a lot better...
>
> Thanks!
>
> --
> "Obnoxious frog..." Spike, 2071AD
>
> William Lee Irwin III said:
>
>>On Sat, Dec 06, 2003 at 01:48:54PM +1100, Nick Piggin wrote:
>>
>>>Although in this case Colin has 2 PPro 200s.
>>>Colin - process load should be evenly distributed between CPUs, and this
>>>is generally the most important thing. Big networking loads (most
>>>commonly)
>>>can put a lot of time into processing interrupts though.
>>
>>That is rather busted, then.
>>
>>Colin, could you try booting with noirqbalance on the kernel command
>>line?
>>
>>
>>-- wli
>>
>

I'll throw my hat in here as well. This is an old Compaq ProLiant I have
at the office, a dual 400 PII, booted with "noirqbalance" on 2.6.0-test11:

           CPU0       CPU1
  0:    2580383    1920931    IO-APIC-edge   timer
  1:          6          3    IO-APIC-edge   i8042
  2:          0          0    XT-PIC         cascade
  5:        467        423    IO-APIC-level  TLAN
  8:          1          0    IO-APIC-edge   rtc
  9:         15         15    IO-APIC-level  sym53c8xx
 10:         17         17    IO-APIC-level  sym53c8xx
 11:       1602       1593    IO-APIC-level  ida0
 14:          8          2    IO-APIC-edge   ide0
NMI:          0          0
LOC:    4501366    4501354
ERR:          0
MIS:          0

Without "noirqbalance" we interrupt on CPU0 solely.

-E
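The before/after comparisons posted in this thread were all made by eye. As a rough aid, here is a small sketch that parses /proc/interrupts-style text and lists the IRQs whose counts sit almost entirely on one CPU. The function names, the 95% threshold and the minimum-count cutoff are arbitrary choices for illustration, not anything the kernel defines; the sample numbers are Stian Jordet's two dumps quoted above.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: (counts, description)}.

    Rows that do not carry one counter per CPU (such as ERR: and MIS:)
    are skipped.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if len(counts) != ncpus:
            continue
        table[label.strip()] = (counts, " ".join(fields[ncpus:]))
    return table


def lopsided(table, threshold=0.95, min_total=50):
    """List IRQs where one CPU took at least `threshold` of all interrupts.

    IRQs with fewer than `min_total` interrupts are ignored, since tiny
    counts say little about the distribution.
    """
    flagged = []
    for irq, (counts, _desc) in table.items():
        total = sum(counts)
        if total >= min_total and max(counts) / total >= threshold:
            flagged.append(irq)
    return flagged


# Stian Jordet's counters with the in-kernel balancer active (default boot).
BEFORE = """\
           CPU0       CPU1
  0:   91068534         45    IO-APIC-edge   timer
  1:      65293          1    IO-APIC-edge   i8042
  2:          0          0    XT-PIC         cascade
  3:         71          1    IO-APIC-edge   serial
  8:     325118          1    IO-APIC-edge   rtc
  9:          0          0    IO-APIC-level  acpi
 14:     245619          1    IO-APIC-edge   ide0
"""

# The same machine after booting with noirqbalance.
AFTER = """\
           CPU0       CPU1
  0:      65636      63667    IO-APIC-edge   timer
  1:        150        136    IO-APIC-edge   i8042
  2:          0          0    XT-PIC         cascade
  3:          2          1    IO-APIC-edge   serial
  8:          3          1    IO-APIC-edge   rtc
  9:          0          0    IO-APIC-level  acpi
 14:         18         37    IO-APIC-edge   ide0
"""

if __name__ == "__main__":
    print("before:", lopsided(parse_interrupts(BEFORE)))
    print("after: ", lopsided(parse_interrupts(AFTER)))
```

With these cutoffs the first dump flags every busy IRQ and the second flags none, which matches what the posters reported.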