* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
[not found] <F7DC2337C7631D4386A2DF6E8FB22B30020B7225@hdsmsx401.amr.corp.intel.com>
@ 2004-11-19 15:57 ` Adrian Bunk
2004-11-19 17:36 ` Linus Torvalds
0 siblings, 1 reply; 36+ messages in thread
From: Adrian Bunk @ 2004-11-19 15:57 UTC (permalink / raw)
To: Brown, Len
Cc: Chris Wright, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Fri, Nov 19, 2004 at 10:27:59AM -0500, Brown, Len wrote:
>
> >It doesn't compile since I don't have APIC support enabled.
>
> can you add it APIC support for the purpose of the test,
> or should I send you a patch with print_PIC() moved
> out of io_apic.c?

It's not a problem (I just wasn't sure whether enabling APIC might
change something relevant).

Full dmesg output is below (this is with Linus' patch applied).

> thanks,
> -Len

cu
Adrian

BTW:
The only reason why I have ACPI enabled is that my computer turns off
after "halt"...

Linux version 2.6.10-rc2 (bunk@r063144.stusta.swh.mhn.de) (gcc version 3.4.2 (Debian 3.4.2-3)) #14 Fri Nov 19 16:41:20 CET 2004
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000000fff0000 (usable)
 BIOS-e820: 000000000fff0000 - 000000000fff8000 (ACPI data)
 BIOS-e820: 000000000fff8000 - 0000000010000000 (ACPI NVS)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
255MB LOWMEM available.
found SMP MP-table at 000fbc70
On node 0 totalpages: 65520
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 61424 pages, LIFO batch:14
  HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 AMI ) @ 0x000fab70 ACPI: RSDT (v001 AMIINT SiS740XX 0x00001000 MSFT 0x0100000b) @ 0x0fff0000 ACPI: FADT (v001 AMIINT SiS740XX 0x00000011 MSFT 0x0100000b) @ 0x0fff0030 ACPI: MADT (v001 AMIINT SiS740XX 0x00001000 MSFT 0x0100000b) @ 0x0fff00c0 ACPI: DSDT (v001 SiS 746 0x00000100 MSFT 0x0100000d) @ 0x00000000 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 6:8 APIC version 16 ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 1 I/O APICs Using ACPI (MADT) for SMP configuration information Built 1 zonelists Kernel command line: BOOT_IMAGE=test ro root=301 mode=1280x1024@760 apic=debug acpi_dbg_level=1 mapped APIC to ffffd000 (fee00000) mapped IOAPIC to ffffc000 (fec00000) Initializing CPU#0 CPU 0 irqstacks, hard=c0476000 soft=c0475000 PID hash table entries: 1024 (order: 10, 16384 bytes) Detected 1800.017 MHz processor. Using tsc for high-res timesource Console: colour VGA+ 80x25 Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Memory: 255264k/262080k available (2376k kernel code, 6248k reserved, 972k data, 164k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 
3563.52 BogoMIPS (lpj=1781760) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 256K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020 CPU: AMD Athlon(tm) XP 2200+ stepping 01 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. Getting VERSION: 40010 Getting VERSION: 40010 Getting ID: 0 Getting LVT0: 700 Getting LVT1: 10400 enabled ExtINT on CPU#0 ENABLING IO-APIC IRQs Synchronizing Arb IDs. init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected. ..TIMER: vector=0x31 pin1=2 pin2=-1 Using local APIC timer interrupts. calibrating APIC timer ... ..... CPU clock speed is 1799.0641 MHz. ..... host bus clock speed is 266.0613 MHz. NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfdb31, last bus=2 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20041105 ACPI: Interpreter enabled ACPI: Using IOAPIC for interrupt routing printing PIC contents ... PIC IMR: fffe ... PIC IRR: 0000 ... PIC ISR: 0000 ... PIC ELCR: 0c68 ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) Uncovering SIS963 that hid as a SIS503 (compatible=0) Enabling SiS 96x SMBus. ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: Power Resource [URP1] (off) ACPI: Power Resource [URP2] (off) ACPI: Power Resource [FDDP] (off) ACPI: Power Resource [LPTP] (off) ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. 
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 *6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKE] (IRQs *3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *10 11 12 14 15) SCSI subsystem initialized usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Using ACPI for IRQ routing printing PIC contents ... PIC IMR: fffe ... PIC IRR: 0c00 ... PIC ISR: 0000 ... PIC ELCR: 0c68 ** PCI interrupts are no longer routed automatically. If this ** causes a device to stop working, it is probably because the ** driver failed to call pci_enable_device(). As a temporary ** workaround, the "pci=routeirq" argument restores the old ** behavior. If this argument makes the device work again, ** please email the output of "lspci" to bjorn.helgaas@hp.com ** so I can fix the driver. number of MP IRQ sources: 15. number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... register #00: 02000000 ....... : physical APIC id: 02 ....... : Delivery Type: 0 ....... : LTS : 0 .... register #01: 00178011 ....... : max redirection entries: 0017 ....... : PRQ implemented: 1 ....... : IO APIC version: 0011 .... register #02: 00000000 ....... : arbitration: 00 .... 
IRQ redirection table: NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect: 00 000 00 1 0 0 0 0 0 0 00 01 001 01 0 0 0 0 0 1 1 39 02 001 01 0 0 0 0 0 1 1 31 03 001 01 0 0 0 0 0 1 1 41 04 001 01 0 0 0 0 0 1 1 49 05 001 01 0 0 0 0 0 1 1 51 06 001 01 0 0 0 0 0 1 1 59 07 001 01 0 0 0 0 0 1 1 61 08 001 01 0 0 0 0 0 1 1 69 09 001 01 0 1 0 1 0 1 1 71 0a 001 01 0 0 0 0 0 1 1 79 0b 001 01 0 0 0 0 0 1 1 81 0c 001 01 0 0 0 0 0 1 1 89 0d 001 01 0 0 0 0 0 1 1 91 0e 001 01 0 0 0 0 0 1 1 99 0f 001 01 0 0 0 0 0 1 1 A1 10 000 00 1 0 0 0 0 0 0 00 11 000 00 1 0 0 0 0 0 0 00 12 000 00 1 0 0 0 0 0 0 00 13 000 00 1 0 0 0 0 0 0 00 14 000 00 1 0 0 0 0 0 0 00 15 000 00 1 0 0 0 0 0 0 00 16 000 00 1 0 0 0 0 0 0 00 17 000 00 1 0 0 0 0 0 0 00 IRQ to pin mappings: IRQ0 -> 0:2 IRQ1 -> 0:1 IRQ3 -> 0:3 IRQ4 -> 0:4 IRQ5 -> 0:5 IRQ6 -> 0:6 IRQ7 -> 0:7 IRQ8 -> 0:8 IRQ9 -> 0:9 IRQ10 -> 0:10 IRQ11 -> 0:11 IRQ12 -> 0:12 IRQ13 -> 0:13 IRQ14 -> 0:14 IRQ15 -> 0:15 .................................... done. printing PIC contents ... PIC IMR: fffe ... PIC IRR: 0c68 ... PIC ISR: 0000 ... PIC ELCR: 0c68 NTFS driver 2.1.22 [Flags: R/O]. 
IOAPIC[0]: Set PCI routing entry (2-16 -> 0xa9 -> IRQ 16 Mode:1 Active:1) ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16 radeonfb: Found Intel x86 BIOS ROM Image radeonfb: Retreived PLL infos from BIOS radeonfb: Reference=27.00 MHz (RefDiv=60) Memory=155.00 Mhz, System=155.00 MHz radeonfb: PLL min 12000 max 35000 radeonfb: Monitor 1 type DFP found radeonfb: EDID probed radeonfb: Monitor 2 type no found Console: switching to colour frame buffer device 160x64 radeonfb: ATI Radeon QY DDR SGRAM 64 MB lp: driver loaded but no devices found Real Time Clock Driver v1.12 Non-volatile memory driver v1.2 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected SiS 746 chipset agpgart: Maximum main memory to use for agp memory: 203M agpgart: AGP aperture is 64M @ 0xd0000000 ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16 [drm] Initialized radeon 1.11.0 20020828 on minor 0: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP] parport0: irq 7 detected lp0: using parport0 (polling). io scheduler noop registered io scheduler cfq registered elevator: using cfq as default io scheduler floppy0: no floppy controllers found loop: loaded (max 8 devices) sis900.c: v1.08.07 11/02/2003 IOAPIC[0]: Set PCI routing entry (2-19 -> 0xb1 -> IRQ 19 Mode:1 Active:1) ACPI: PCI interrupt 0000:00:04.0[A] -> GSI 19 (level, low) -> IRQ 19 eth0: Realtek RTL8201 PHY transceiver found at address 1. eth0: Using transceiver found at address 1 as default eth0: SiS 900 PCI Fast Ethernet at 0xdc00, IRQ 19, 00:00:00:00:00:00. 
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx SIS5513: IDE controller at PCI slot 0000:00:02.5 SIS5513: chipset revision 0 SIS5513: not 100% native mode: will probe irqs later SIS5513: SiS 962/963 MuTIOL IDE UDMA133 controller ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xff08-0xff0f, BIOS settings: hdc:DMA, hdd:DMA Probing IDE interface ide0... hda: SAMSUNG SV1604N, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... hdc: LITE-ON LTR-12101B, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 1024KiB hda: 312581808 sectors (160041 MB) w/2048KiB Cache, CHS=19457/255/63, UDMA(100) hda: cache flushes supported hda: hda1 hda2 hda3 hda4 hdc: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache, DMA Uniform CD-ROM driver Revision: 3.20 IOAPIC[0]: Set PCI routing entry (2-23 -> 0xb9 -> IRQ 23 Mode:1 Active:1) ACPI: PCI interrupt 0000:00:03.2[D] -> GSI 23 (level, low) -> IRQ 23 ehci_hcd 0000:00:03.2: Silicon Integrated Systems [SiS] USB 2.0 Controller ehci_hcd 0000:00:03.2: irq 23, pci mem 0xcffff000 ehci_hcd 0000:00:03.2: new USB bus registered, assigned bus number 1 PCI: cache line size of 64 is not supported by device 0000:00:03.2 ehci_hcd 0000:00:03.2: USB 2.0 initialized, EHCI 1.00, driver 26 Oct 2004 hub 1-0:1.0: USB hub found hub 1-0:1.0: 6 ports detected Initializing USB Mass Storage driver... usbcore: registered new driver usb-storage USB Mass Storage support registered. mice: PS/2 mouse device common for all mice input: AT Translated Set 2 keyboard on isa0060/serio0 input: ImPS/2 Generic Wheel Mouse on isa0060/serio1 i2c /dev entries driver Advanced Linux Sound Architecture Driver Version 1.0.6 (Sun Aug 15 07:17:53 2004 UTC). 
ACPI: PCI interrupt 0000:00:0c.0[A] -> GSI 16 (level, low) -> IRQ 16 ALSA device list: #0: Ensoniq AudioPCI ENS1371 at 0xd800, irq 16 NET: Registered protocol family 2 IP: routing cache hash table of 2048 buckets, 16Kbytes TCP: Hash tables configured (established 16384 bind 32768) ip_conntrack version 2.1 (2047 buckets, 16376 max) - 176 bytes per conntrack ip_tables: (C) 2000-2002 Netfilter core team NET: Registered protocol family 1 NET: Registered protocol family 17 BIOS EDD facility v0.16 2004-Jun-25, 1 devices found VFS: Mounted root (ext2 filesystem) readonly. Freeing unused kernel memory: 164k freed Adding 987988k swap on /dev/hda2. Priority:-1 extents:1 eth0: Media Link On 10mbps half-duplex agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0. agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode agpgart: SiS delay workaround: giving bridge time to recover. agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode ^ permalink raw reply [flat|nested] 36+ messages in thread
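As a reading aid for the "PIC ELCR: 0c68" lines in the dmesg above: the ELCR (ports 0x4d0/0x4d1, per the comment in the i8259.c debug patch later in this thread) can be read as a 16-bit mask with one bit per legacy IRQ, where a set bit means level-triggered. The following is a minimal user-space sketch under that assumption. The names `elcr_level_irqs` and `print_elcr` are hypothetical helpers for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: list the IRQs whose bit is set in a 16-bit
 * ELCR value (assumed layout: bit n set = IRQ n is level-triggered).
 * Writes the IRQ numbers into out[] and returns how many were found.
 */
int elcr_level_irqs(unsigned int elcr, int out[16])
{
	int irq, n = 0;

	assert(elcr <= 0xffff);
	for (irq = 0; irq < 16; irq++)
		if (elcr & (1u << irq))
			out[n++] = irq;
	return n;
}

/* Print the decoded list on one line, in a dmesg-like style. */
void print_elcr(unsigned int elcr)
{
	int irqs[16];
	int i, n = elcr_level_irqs(elcr, irqs);

	printf("ELCR %04x: level-triggered IRQs:", elcr);
	for (i = 0; i < n; i++)
		printf(" %d", irqs[i]);
	printf("\n");
}
```

For the value in the log, `print_elcr(0x0c68)` lists IRQs 3, 5, 6, 10 and 11, which lines up with the starred IRQs of LNKE, LNKF, LNKD, LNKH and LNKA in the same dmesg.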
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
2004-11-19 15:57 ` 2.6.10-rc2 doesn't boot (if no floppy device) Adrian Bunk
@ 2004-11-19 17:36 ` Linus Torvalds
2004-11-19 18:51 ` Len Brown
2004-11-19 19:11 ` Adrian Bunk
0 siblings, 2 replies; 36+ messages in thread
From: Linus Torvalds @ 2004-11-19 17:36 UTC (permalink / raw)
To: Adrian Bunk
Cc: Brown, Len, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Fri, 19 Nov 2004, Adrian Bunk wrote:
>
> It's not a problem (I just wasn't sure, whether enabling APIC might
> change something relevant.

It did. You no longer show the problem. No irq storm.

So can you disable APIC again, and just remove the non-relevant APIC print
calls to get it to compile?

Linus

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
2004-11-19 17:36 ` Linus Torvalds
@ 2004-11-19 18:51 ` Len Brown
2004-11-19 19:11 ` Adrian Bunk
1 sibling, 0 replies; 36+ messages in thread
From: Len Brown @ 2004-11-19 18:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Fri, 2004-11-19 at 12:36, Linus Torvalds wrote:
>
> On Fri, 19 Nov 2004, Adrian Bunk wrote:
> >
> > It's not a problem (I just wasn't sure, whether enabling APIC might
> > change something relevant.
>
> It did. You no longer show the problem. No irq storm.
>
> So can you disable APIC again, and just remove the non-relevant APIC
> print calls to get it to compile?

I think if you boot the kernel you have with "nolapic" that will be a
valid test for us to examine PIC mode.

thanks,
-Len

ps. I'm curious why you were running with !IOAPIC kernel support on a
system which has an IOAPIC.

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
2004-11-19 17:36 ` Linus Torvalds
2004-11-19 18:51 ` Len Brown
@ 2004-11-19 19:11 ` Adrian Bunk
2004-11-19 21:05 ` Len Brown
1 sibling, 1 reply; 36+ messages in thread
From: Adrian Bunk @ 2004-11-19 19:11 UTC (permalink / raw)
To: Linus Torvalds
Cc: Brown, Len, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Fri, Nov 19, 2004 at 09:36:12AM -0800, Linus Torvalds wrote:
>
>
> On Fri, 19 Nov 2004, Adrian Bunk wrote:
> >
> > It's not a problem (I just wasn't sure, whether enabling APIC might
> > change something relevant.
>
> It did. You no longer show the problem. No irq storm.
>
> So can you disable APIC again, and just remove the non-relevant APIC print
> calls to get it to compile?

dmesg is below.

> Linus

cu
Adrian

Linux version 2.6.10-rc2 (bunk@r063144.stusta.swh.mhn.de) (gcc version 3.4.2 (Debian 3.4.2-3)) #16 Fri Nov 19 19:40:36 CET 2004
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000000fff0000 (usable)
 BIOS-e820: 000000000fff0000 - 000000000fff8000 (ACPI data)
 BIOS-e820: 000000000fff8000 - 0000000010000000 (ACPI NVS)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
255MB LOWMEM available.
On node 0 totalpages: 65520
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 61424 pages, LIFO batch:14
  HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 AMI ) @ 0x000fab70 ACPI: RSDT (v001 AMIINT SiS740XX 0x00001000 MSFT 0x0100000b) @ 0x0fff0000 ACPI: FADT (v001 AMIINT SiS740XX 0x00000011 MSFT 0x0100000b) @ 0x0fff0030 ACPI: MADT (v001 AMIINT SiS740XX 0x00001000 MSFT 0x0100000b) @ 0x0fff00c0 ACPI: DSDT (v001 SiS 746 0x00000100 MSFT 0x0100000d) @ 0x00000000 Built 1 zonelists Kernel command line: BOOT_IMAGE=test ro root=301 mode=1280x1024@760 apic=debug acpi_dbg_level=1 Initializing CPU#0 CPU 0 irqstacks, hard=c0466000 soft=c0465000 PID hash table entries: 1024 (order: 10, 16384 bytes) Detected 1800.276 MHz processor. Using tsc for high-res timesource Console: colour VGA+ 80x25 Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Memory: 255340k/262080k available (2364k kernel code, 6172k reserved, 943k data, 144k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 3563.52 BogoMIPS (lpj=1781760) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 256K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020 CPU: AMD Athlon(tm) XP 2200+ stepping 01 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. ACPI: IRQ9 SCI: Edge set to Level Trigger. NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfdb31, last bus=2 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20041105 ACPI: Interpreter enabled ACPI: Using PIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) Uncovering SIS963 that hid as a SIS503 (compatible=0) Enabling SiS 96x SMBus. 
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: Power Resource [URP1] (off) ACPI: Power Resource [URP2] (off) ACPI: Power Resource [FDDP] (off) ACPI: Power Resource [LPTP] (off) ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 *6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKE] (IRQs *3 4 5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *10 11 12 14 15) SCSI subsystem initialized usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Using ACPI for IRQ routing ** PCI interrupts are no longer routed automatically. If this ** causes a device to stop working, it is probably because the ** driver failed to call pci_enable_device(). As a temporary ** workaround, the "pci=routeirq" argument restores the old ** behavior. If this argument makes the device work again, ** please email the output of "lspci" to bjorn.helgaas@hp.com ** so I can fix the driver. NTFS driver 2.1.22 [Flags: R/O]. 
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11 PCI: setting IRQ 11 as level-triggered ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 11 (level, low) -> IRQ 11 radeonfb: Found Intel x86 BIOS ROM Image radeonfb: Retreived PLL infos from BIOS radeonfb: Reference=27.00 MHz (RefDiv=60) Memory=155.00 Mhz, System=155.00 MHz radeonfb: PLL min 12000 max 35000 radeonfb: Monitor 1 type DFP found radeonfb: EDID probed radeonfb: Monitor 2 type no found Console: switching to colour frame buffer device 160x64 radeonfb: ATI Radeon QY DDR SGRAM 64 MB lp: driver loaded but no devices found Real Time Clock Driver v1.12 Non-volatile memory driver v1.2 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected SiS 746 chipset agpgart: Maximum main memory to use for agp memory: 203M agpgart: AGP aperture is 64M @ 0xd0000000 ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 11 (level, low) -> IRQ 11 [drm] Initialized radeon 1.11.0 20020828 on minor 0: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP] parport0: irq 7 detected lp0: using parport0 (polling). io scheduler noop registered io scheduler cfq registered elevator: using cfq as default io scheduler irq 6: nobody cared! 
[<c012c0d4>] __report_bad_irq+0x24/0x80 [<c012c1e1>] note_interrupt+0x81/0xa0 [<c012bc13>] __do_IRQ+0x143/0x170 [<c01040ce>] do_IRQ+0x3e/0x60 ======================= [<c010271a>] common_interrupt+0x1a/0x20 [<c0116720>] __do_softirq+0x30/0x90 [<c01041d1>] do_softirq+0x41/0x50 ======================= [<c012ba55>] irq_exit+0x35/0x40 [<c01040d5>] do_IRQ+0x45/0x60 [<c010271a>] common_interrupt+0x1a/0x20 [<c012be74>] setup_irq+0x94/0x110 [<c0258300>] floppy_hardint+0x0/0x120 [<c012c07c>] request_irq+0x7c/0xb0 [<c02584df>] fd_request_irq+0x2f/0x60 [<c025edab>] floppy_grab_irq_and_dma+0x4b/0x3b0 [<c044ec5c>] floppy_init+0x1ec/0x5b0 [<c025ece0>] floppy_find+0x0/0x80 [<c043c813>] do_initcalls+0x53/0xb0 [<c0100400>] init+0x0/0x110 [<c0100400>] init+0x0/0x110 [<c010042a>] init+0x2a/0x110 [<c0100818>] kernel_thread_helper+0x0/0x18 [<c010081d>] kernel_thread_helper+0x5/0x18 handlers: [<c0258300>] (floppy_hardint+0x0/0x120) Disabling IRQ #6 floppy0: no floppy controllers found loop: loaded (max 8 devices) sis900.c: v1.08.07 11/02/2003 ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 6 PCI: setting IRQ 6 as level-triggered ACPI: PCI interrupt 0000:00:04.0[A] -> GSI 6 (level, low) -> IRQ 6 eth0: Realtek RTL8201 PHY transceiver found at address 1. eth0: Using transceiver found at address 1 as default eth0: SiS 900 PCI Fast Ethernet at 0xdc00, IRQ 6, 00:00:00:00:00:00. Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx SIS5513: IDE controller at PCI slot 0000:00:02.5 SIS5513: chipset revision 0 SIS5513: not 100% native mode: will probe irqs later SIS5513: SiS 962/963 MuTIOL IDE UDMA133 controller ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xff08-0xff0f, BIOS settings: hdc:DMA, hdd:DMA Probing IDE interface ide0... hda: SAMSUNG SV1604N, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... 
hdc: LITE-ON LTR-12101B, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 1024KiB hda: 312581808 sectors (160041 MB) w/2048KiB Cache, CHS=19457/255/63, UDMA(100) hda: cache flushes supported hda: hda1 hda2 hda3 hda4 hdc: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache, DMA Uniform CD-ROM driver Revision: 3.20 ACPI: PCI Interrupt Link [LNKH] enabled at IRQ 10 PCI: setting IRQ 10 as level-triggered ACPI: PCI interrupt 0000:00:03.2[D] -> GSI 10 (level, low) -> IRQ 10 ehci_hcd 0000:00:03.2: Silicon Integrated Systems [SiS] USB 2.0 Controller ehci_hcd 0000:00:03.2: irq 10, pci mem 0xcffff000 ehci_hcd 0000:00:03.2: new USB bus registered, assigned bus number 1 PCI: cache line size of 64 is not supported by device 0000:00:03.2 ehci_hcd 0000:00:03.2: USB 2.0 initialized, EHCI 1.00, driver 26 Oct 2004 hub 1-0:1.0: USB hub found hub 1-0:1.0: 6 ports detected Initializing USB Mass Storage driver... usbcore: registered new driver usb-storage USB Mass Storage support registered. mice: PS/2 mouse device common for all mice input: AT Translated Set 2 keyboard on isa0060/serio0 input: ImPS/2 Generic Wheel Mouse on isa0060/serio1 i2c /dev entries driver Advanced Linux Sound Architecture Driver Version 1.0.6 (Sun Aug 15 07:17:53 2004 UTC). ACPI: PCI interrupt 0000:00:0c.0[A] -> GSI 11 (level, low) -> IRQ 11 ALSA device list: #0: Ensoniq AudioPCI ENS1371 at 0xd800, irq 11 NET: Registered protocol family 2 IP: routing cache hash table of 2048 buckets, 16Kbytes TCP: Hash tables configured (established 16384 bind 32768) ip_conntrack version 2.1 (2047 buckets, 16376 max) - 176 bytes per conntrack ip_tables: (C) 2000-2002 Netfilter core team NET: Registered protocol family 1 NET: Registered protocol family 17 BIOS EDD facility v0.16 2004-Jun-25, 1 devices found VFS: Mounted root (ext2 filesystem) readonly. Freeing unused kernel memory: 144k freed Adding 987988k swap on /dev/hda2. 
Priority:-1 extents:1 eth0: Media Link On 10mbps half-duplex ^ permalink raw reply [flat|nested] 36+ messages in thread
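For readers wondering where "irq 6: nobody cared!" and "Disabling IRQ #6" in the dmesg above come from: the trace passes through note_interrupt(), which keeps a per-IRQ count of interrupts that no handler claimed. What follows is a rough user-space model only. The thresholds (a window of 100000 interrupts, more than 99900 of them unhandled) are an assumption from memory of kernel/irq/spurious.c of that era and should be checked against the source. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed thresholds; check kernel/irq/spurious.c before relying on them. */
#define IRQ_WINDOW       100000
#define UNHANDLED_LIMIT   99900

/* Hypothetical per-IRQ bookkeeping, loosely modelled on irq_desc. */
struct irq_model {
	unsigned int irq_count;      /* interrupts seen in this window */
	unsigned int irqs_unhandled; /* of those, claimed by no handler */
	bool disabled;               /* set once the line is shut off */
};

/*
 * Record one interrupt. 'handled' is false when every registered
 * handler reported that the interrupt was not for it. Returns true
 * on the call that disables the line, i.e. the point where the
 * kernel would print "nobody cared" and "Disabling IRQ".
 */
bool model_note_interrupt(struct irq_model *d, bool handled)
{
	bool stuck;

	assert(d != NULL);

	if (!handled)
		d->irqs_unhandled++;

	d->irq_count++;
	if (d->irq_count < IRQ_WINDOW)
		return false;

	/* End of window: decide, then start a fresh window. */
	stuck = d->irqs_unhandled > UNHANDLED_LIMIT;
	d->irq_count = 0;
	d->irqs_unhandled = 0;
	if (stuck)
		d->disabled = true;
	return stuck;
}
```

In this model a line that keeps firing without being claimed trips the limit at the end of the first full window, which would fit the trace above, where floppy_hardint is the only handler listed for IRQ 6.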
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
2004-11-19 19:11 ` Adrian Bunk
@ 2004-11-19 21:05 ` Len Brown
0 siblings, 0 replies; 36+ messages in thread
From: Len Brown @ 2004-11-19 21:05 UTC (permalink / raw)
To: Adrian Bunk
Cc: Linus Torvalds, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Fri, 2004-11-19 at 14:11, Adrian Bunk wrote:
> Kernel command line: BOOT_IMAGE=test ro root=301 mode=1280x1024@760
> apic=debug acpi_dbg_level=1
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 *6 7 10 11 12 14 15)
> elevator: using cfq as default io scheduler
> irq 6: nobody cared!
> ...
> =======================
> [<c012ba55>] irq_exit+0x35/0x40
> [<c01040d5>] do_IRQ+0x45/0x60
> [<c010271a>] common_interrupt+0x1a/0x20
> [<c012be74>] setup_irq+0x94/0x110
> [<c0258300>] floppy_hardint+0x0/0x120
> [<c012c07c>] request_irq+0x7c/0xb0
> [<c02584df>] fd_request_irq+0x2f/0x60
> [<c025edab>] floppy_grab_irq_and_dma+0x4b/0x3b0
> [<c044ec5c>] floppy_init+0x1ec/0x5b0
> [<c025ece0>] floppy_find+0x0/0x80
> [<c043c813>] do_initcalls+0x53/0xb0
> [<c0100400>] init+0x0/0x110
> [<c0100400>] init+0x0/0x110
> [<c010042a>] init+0x2a/0x110
> [<c0100818>] kernel_thread_helper+0x0/0x18
> [<c010081d>] kernel_thread_helper+0x5/0x18
> handlers:
> [<c0258300>] (floppy_hardint+0x0/0x120)
> Disabling IRQ #6
> floppy0: no floppy controllers found
> loop: loaded (max 8 devices)
> sis900.c: v1.08.07 11/02/2003
> ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 6
> PCI: setting IRQ 6 as level-triggered
> ACPI: PCI interrupt 0000:00:04.0[A] -> GSI 6 (level, low) -> IRQ 6
> eth0: Realtek RTL8201 PHY transceiver found at address 1.
> eth0: Using transceiver found at address 1 as default
> eth0: SiS 900 PCI Fast Ethernet at 0xdc00, IRQ 6, 00:00:00:00:00:00.
...

Please verify that the patch is applied and working by excluding
acpi_dbg_level=1 from the cmdline and verifying that you get some
new "NOT disabled" lines in the dmesg.

Please restore the IOAPIC support so we get the print_PIC lines back
per the original debug patch, and boot with
"nolapic" "apic=debug" "pci=routeirq"
and send along the dmesg.

This will tell us about the state of IRQ6 when we get it from the BIOS
and how Linux changes it.

Also, if you can send or point me to the output from acpidmp,
available in /usr/sbin or in pmtools:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
I'd like to examine the ASL associated with the interrupt link
that is camped on IRQ6.

thanks,
-Len

ps. In addition to soft-power-off, I expect that ACPI is required
to enable IOAPIC support on this system.

^ permalink raw reply [flat|nested] 36+ messages in thread
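To make "the state of IRQ6" concrete, here is a small sketch of how the four values printed by the debug patch (PIC IMR, IRR, ISR, ELCR) could be read for a single IRQ. It assumes the usual one-bit-per-IRQ layout of these 16-bit values. The struct and helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* One set of values as printed in the "printing PIC contents" blocks. */
struct pic_snapshot {
	unsigned int imr;  /* bit set = IRQ masked */
	unsigned int irr;  /* bit set = interrupt requested (pending) */
	unsigned int isr;  /* bit set = interrupt in service */
	unsigned int elcr; /* bit set = level-triggered */
};

/* Hypothetical helper: extract the bit for one IRQ from a register. */
static bool irq_bit(unsigned int reg, int irq)
{
	assert(irq >= 0 && irq < 16);
	return (reg >> irq) & 1u;
}

bool pic_masked(const struct pic_snapshot *s, int irq)
{
	return irq_bit(s->imr, irq);
}

bool pic_pending(const struct pic_snapshot *s, int irq)
{
	return irq_bit(s->irr, irq);
}

bool pic_level(const struct pic_snapshot *s, int irq)
{
	return irq_bit(s->elcr, irq);
}

/* Print a one-line summary for a single IRQ. */
void pic_describe(const struct pic_snapshot *s, int irq)
{
	printf("IRQ%d: %s, %s, %s\n", irq,
	       pic_masked(s, irq) ? "masked" : "unmasked",
	       pic_pending(s, irq) ? "pending" : "not pending",
	       pic_level(s, irq) ? "level" : "edge");
}
```

Fed with the three PIC dumps from the 15:57 dmesg (IRR 0000, then 0c00, then 0c68, with IMR fffe and ELCR 0c68 throughout), IRQ 6 comes out masked and level-triggered in all three and pending only in the last one. That boot used the IOAPIC, so this only illustrates how to read the registers. It is not the PIC-mode data requested here.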
* Re: 2.6.10-rc2 doesn't boot
@ 2004-11-15 23:27 Chris Wright
2004-11-18 23:14 ` 2.6.10-rc2 doesn't boot (if no floppy device) Len Brown
0 siblings, 1 reply; 36+ messages in thread
From: Chris Wright @ 2004-11-15 23:27 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Linus Torvalds, Bjorn Helgaas, Kernel Mailing List
* Adrian Bunk (bunk@stusta.de) wrote:
> It seems Bjorns "PCI: remove unconditional PCI ACPI IRQ routing" was
> merged now into your tree, but his patch to fix floppy.c wasn't
> merged...
What's the likelihood of getting some derivative of Bjorn's patch
merged? W/out the patch (and w/ floppy built) I've the same issue.
thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
2004-11-15 23:27 2.6.10-rc2 doesn't boot Chris Wright
@ 2004-11-18 23:14 ` Len Brown
2004-11-19 7:09 ` Chris Wright
` (2 more replies)
0 siblings, 3 replies; 36+ messages in thread
From: Len Brown @ 2004-11-18 23:14 UTC (permalink / raw)
To: Chris Wright
Cc: Adrian Bunk, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 796 bytes --]

Chris,

Please apply this debug patch and boot with
apic=debug acpi_dbg_level=1

If the disabled floppy hardware doesn't cause the floppy.c to hang the
system, (or if running Linus' recent floppy.c update, if the floppy.c
doesn't nuke IRQ6) then please send me the dmesg.

Then try with
apic=debug pci=routeirq
and capture that dmesg.

If the patch makes no functional difference, then please add
pci=routeirq to the cmdline above and send me that dmesg.

apic=debug enables print_PIC() so we can see what the BIOS gave us,
and what we do to the PIC.

acpi_dbg_level=1 will prevent ACPI from blindly disabling the PCI
Interrupt Link Devices, which I'm guessing may be confusing the BIOS
on this box.

if you can send me the acpidmp and lspci -vv for this box, that would
help too.

thanks,
-Len

[-- Attachment #2: debug.patch --]
[-- Type: text/plain, Size: 1079 bytes --]

===== arch/i386/pci/acpi.c 1.18 vs edited =====
--- 1.18/arch/i386/pci/acpi.c	2004-10-19 00:44:01 -04:00
+++ edited/arch/i386/pci/acpi.c	2004-11-18 17:57:20 -05:00
@@ -56,6 +56,10 @@
 		if (acpi_ioapic)
 			print_IO_APIC();
 #endif
+	{
+		extern void print_PIC(void);
+		print_PIC();
+	}

 	return 0;
 }
===== drivers/acpi/pci_link.c 1.34 vs edited =====
--- 1.34/drivers/acpi/pci_link.c	2004-11-02 02:40:09 -05:00
+++ edited/drivers/acpi/pci_link.c	2004-11-18 18:11:15 -05:00
@@ -475,6 +475,9 @@
 	struct acpi_pci_link *link = NULL;
 	int i = 0;

+extern void print_PIC(void);
+print_PIC();
+
 	ACPI_FUNCTION_TRACE("acpi_irq_penalty_init");

 	/*
@@ -685,8 +688,13 @@
 	acpi_link.count++;

 end:
+
 	/* disable all links -- to be activated on use */
+if (acpi_dbg_level != 1)
 	acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL);
+else
+	printk("NOT disabled\n");
+

 	if (result)
 		kfree(link);
@@ -865,6 +873,9 @@

 	if (acpi_noirq)
 		return_VALUE(0);
+
+extern void print_PIC(void);
+print_PIC();

 	acpi_link.count = 0;
 	INIT_LIST_HEAD(&acpi_link.entries);

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
2004-11-18 23:14 ` 2.6.10-rc2 doesn't boot (if no floppy device) Len Brown
@ 2004-11-19 7:09 ` Chris Wright
2004-11-20 9:02 ` Len Brown
2004-11-19 13:47 ` Adrian Bunk
2004-11-23 1:57 ` Chris Wright
2 siblings, 1 reply; 36+ messages in thread
From: Chris Wright @ 2004-11-19 7:09 UTC (permalink / raw)
To: Len Brown
Cc: Chris Wright, Adrian Bunk, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

* Len Brown (len.brown@intel.com) wrote:
> Chris,
>
> Please apply this debug patch and boot with
> apic=debug acpi_dbg_level=1

Len, unfortunately the disk drive went south on my laptop. So testing
will take a bit longer as I piece things back together.

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
2004-11-19 7:09 ` Chris Wright
@ 2004-11-20 9:02 ` Len Brown
2004-11-20 12:40 ` Adrian Bunk
` (2 more replies)
0 siblings, 3 replies; 36+ messages in thread
From: Len Brown @ 2004-11-20 9:02 UTC (permalink / raw)
To: Chris Wright
Cc: Adrian Bunk, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 2463 bytes --]

Please try this updated debug patch.

It clears the ELCR on Linux boot.
Also, it prints out the ICH PIRQ registers which are the hardware
underlying the ACPI PCI Interrupt Links.

It no longer depends on IOAPIC support in the kernel,
nor the apic=debug flag.

Please boot it with no kernel parameters and report if it makes the
floppy probe failure (or 8042 probe failure) go away, and send dmesg.

If it still fails, please boot it with pci=routeirq and capture the
dmesg; also boot it with acpi_dbg_level=1 and capture the dmesg.
The former will program all the links before device probe, the
latter will cause the kernel to skip disabling the links and skip
clearing the ELCR.

thanks,
-Len

ps. what I think is happening...

To its credit, the BIOS correctly recognizes that there is no floppy,
and it routes a PIRQ to IRQ6. It correctly sets the ELCR bit for
this IRQ.

Linux boots and disables all the PCI Interrupt Links, which un-programs
the PIRQ directed to IRQ6. However, Linux doesn't clear the ELCR first,
and for some reason that causes an interrupt to latch in IRQ6 -- though
it is masked.

Along comes the broken floppy driver before the PCI devices probe.
floppy doesn't realize there is no hardware and unwittingly does a
request_irq(6). Since nobody is camped on IRQ6, this exclusive request
succeeds, IRQ6 is unmasked and the floppy driver takes the outstanding
interrupt. For some reason it is unable to clear the IRQ. I don't
think it is because the line is being driven by the PIRQ, that is
disabled.
It may have something to do with the ELCR still being set to level sensitive for this IRQ. So this hangs the system, or with Linus' update to floppy.c, it nukes IRQ6. I'm hopeful that with a cleared ELCR, the misguided legacy floppy probe will work as it always has, and floppy will then unload so the PCI device can later claim IRQ6 and program the link and the ELCR accordingly. I do believe that Linux should continue to disable the PCI Interrupt Links at boot and enable them only on demand. We used to leave them all enabled and on some systems that caused spurious interrupts, duplicated interrupts, and all kinds of crazy things. Note if the floppy driver had not requested IRQ6, then a PCI device would have requested it, the link would be programmed, and ELCR would be set (redundant), and the interrupt unmasked. The device would take the initial interrupt, it would clear it successfully, and run normally from then on. [-- Attachment #2: debug.patch --] [-- Type: text/plain, Size: 4628 bytes --] ===== arch/i386/kernel/i8259.c 1.38 vs edited ===== --- 1.38/arch/i386/kernel/i8259.c 2004-10-20 04:37:14 -04:00 +++ edited/arch/i386/kernel/i8259.c 2004-11-20 03:18:08 -05:00 @@ -243,10 +243,21 @@ /** * ELCR registers (0x4d0, 0x4d1) control edge/level of IRQ */ -static void restore_ELCR(char *trigger) + +void restore_ELCR(char *trigger) { outb(trigger[0], 0x4d0); outb(trigger[1], 0x4d1); +} + +void +clear_ELCR(void) +{ + char trigger[2] = {0, 0}; + + restore_ELCR(trigger); + printk("clear ELCR\n"); + print_ICH_PIC(); /* XXX debug */ } static void save_ELCR(char *trigger) ===== arch/i386/kernel/acpi/boot.c 1.75 vs edited ===== --- 1.75/arch/i386/kernel/acpi/boot.c 2004-11-11 19:08:40 -05:00 +++ edited/arch/i386/kernel/acpi/boot.c 2004-11-20 01:23:46 -05:00 @@ -801,7 +801,9 @@ acpi_boot_init (void) { int error; - + extern int print_ICH_PIC(void); + + print_ICH_PIC(); /* * If acpi_disabled, bail out * One exception: acpi=ht continues far enough to enumerate LAPICs ===== 
arch/i386/pci/acpi.c 1.18 vs edited ===== --- 1.18/arch/i386/pci/acpi.c 2004-10-19 00:44:01 -04:00 +++ edited/arch/i386/pci/acpi.c 2004-11-20 01:23:46 -05:00 @@ -56,6 +56,10 @@ if (acpi_ioapic) print_IO_APIC(); #endif + { + extern int print_ICH_PIC(void); + print_ICH_PIC(); + } return 0; } ===== drivers/acpi/bus.c 1.47 vs edited ===== --- 1.47/drivers/acpi/bus.c 2004-11-02 02:40:09 -05:00 +++ edited/drivers/acpi/bus.c 2004-11-20 03:47:22 -05:00 @@ -31,6 +31,7 @@ #include <linux/proc_fs.h> #ifdef CONFIG_X86 #include <asm/mpspec.h> +#include <asm/i8259.h> #endif #include <acpi/acpi_bus.h> #include <acpi/acpi_drivers.h> @@ -629,6 +630,11 @@ #ifdef CONFIG_X86 if (!acpi_ioapic) { extern acpi_interrupt_flags acpi_sci_flags; + + if (acpi_dbg_level == 1) /* XXX hack use of dbg flag */ + printk("NOT clearing ELCR\n"); + else + clear_ELCR(); /* compatible (0) means level (3) */ if (acpi_sci_flags.trigger == 0) ===== drivers/acpi/pci_link.c 1.34 vs edited ===== --- 1.34/drivers/acpi/pci_link.c 2004-11-02 02:40:09 -05:00 +++ edited/drivers/acpi/pci_link.c 2004-11-20 03:47:34 -05:00 @@ -54,6 +54,56 @@ #define ACPI_PCI_LINK_FILE_STATUS "state" #define ACPI_PCI_LINK_MAX_POSSIBLE 16 +int print_ICH_PIC(void) +{ + extern spinlock_t i8259A_lock; + unsigned int v; + u8 b; + unsigned long flags; + int i; + + spin_lock_irqsave(&i8259A_lock, flags); + + v = inb(0xa1) << 8 | inb(0x21); + printk(KERN_DEBUG "... PIC IMR: %04x\n", v); + + v = inb(0xa0) << 8 | inb(0x20); + printk(KERN_DEBUG "... PIC IRR: %04x\n", v); + + outb(0x0b,0xa0); + outb(0x0b,0x20); + v = inb(0xa0) << 8 | inb(0x20); + outb(0x0a,0xa0); + outb(0x0a,0x20); + + spin_unlock_irqrestore(&i8259A_lock, flags); + + printk(KERN_DEBUG "... PIC ISR: %04x\n", v); + + v = inb(0x4d1) << 8 | inb(0x4d0); + printk(KERN_DEBUG "... 
PIC ELCR: %04x\n", v); +#define PCI_CONF1_ADDRESS(bus, devfn, reg) \ + (0x80000000 | (bus << 16) | (devfn << 8) | (reg & ~3)) + + for (i = 0; i < 8; ++i) { + u32 bus = 0; + u32 devfn = 0x1F << 3; /* ICH is dev 31, func 0 */ + u32 reg; + + if (i < 4) + reg = 0x60 + i; + else + reg = 0x64 + i; + + outl(PCI_CONF1_ADDRESS(bus, devfn, reg), 0xCF8); + b = inb(0xCFC + (reg & 3)); + printk("PIRQ%c -> IRQ%d %s\n", 'A' + i, b & 0xF, + b & 0x80 ? "disabled" : "ACTIVE"); + } + return 0; +} + +late_initcall(print_ICH_PIC); static int acpi_pci_link_add (struct acpi_device *device); static int acpi_pci_link_remove (struct acpi_device *device, int type); @@ -475,6 +525,8 @@ struct acpi_pci_link *link = NULL; int i = 0; +print_ICH_PIC(); + ACPI_FUNCTION_TRACE("acpi_irq_penalty_init"); /* @@ -685,8 +737,15 @@ acpi_link.count++; end: + /* disable all links -- to be activated on use */ +if (acpi_dbg_level == 1) { + printk("NOT _DISabled\n"); +} else { acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL); + printk("_DISabled\n"); +} + if (result) kfree(link); @@ -865,6 +924,9 @@ if (acpi_noirq) return_VALUE(0); + +printk("LENB: ACPI has not touched Links yet\n"); +print_ICH_PIC(); acpi_link.count = 0; INIT_LIST_HEAD(&acpi_link.entries); ===== include/asm-i386/i8259.h 1.2 vs edited ===== --- 1.2/include/asm-i386/i8259.h 2003-03-06 13:14:05 -05:00 +++ edited/include/asm-i386/i8259.h 2004-11-20 03:17:06 -05:00 @@ -13,5 +13,7 @@ extern void enable_8259A_irq(unsigned int irq); extern void disable_8259A_irq(unsigned int irq); extern unsigned int startup_8259A_irq(unsigned int irq); +extern void clear_ELCR(void); +extern void print_ICH_PIC(void); /* debug hack only */ #endif /* __ASM_I8259_H__ */ ^ permalink raw reply [flat|nested] 36+ messages in thread
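The PIRQ loop in print_ICH_PIC() in the patch above packs three small steps together: pick the config-space offset of the routing register, build a type 1 configuration address for port 0xCF8, and decode the byte read back. A minimal user-space sketch of just that arithmetic, assuming the same layout the patch uses (the function and struct names here are illustrative, not kernel symbols, and no port I/O is performed):

```c
#include <stdint.h>
#include <stdbool.h>

/* Type 1 PCI configuration address, same bit layout as the
 * PCI_CONF1_ADDRESS() macro in the patch. */
uint32_t pci_conf1_address(uint32_t bus, uint32_t devfn, uint32_t reg)
{
    return 0x80000000u | (bus << 16) | (devfn << 8) | (reg & ~3u);
}

/* Config-space offset the patch computes for PIRQ index 0..7 (A..H):
 * 0x60..0x63 for A..D, then 0x68..0x6B for E..H. */
uint32_t pirq_reg_offset(int i)
{
    return (i < 4) ? 0x60u + (uint32_t)i : 0x64u + (uint32_t)i;
}

/* Hypothetical holder for a decoded routing byte. */
struct pirq_route {
    unsigned irq;    /* low nibble: target IRQ */
    bool disabled;   /* bit 7 set: routing disabled */
};

/* Decode a routing byte the way the patch's printk does. */
struct pirq_route decode_pirq(uint8_t b)
{
    struct pirq_route r = { b & 0x0Fu, (b & 0x80u) != 0 };
    return r;
}
```

In the patch the byte itself is then read from port 0xCFC + (reg & 3). Note that a byte of 0xff decodes as "IRQ15 disabled" under this scheme.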
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 9:02 ` Len Brown @ 2004-11-20 12:40 ` Adrian Bunk 2004-11-20 18:28 ` Linus Torvalds 2004-11-22 18:28 ` Len Brown 2004-11-20 16:41 ` Linus Torvalds 2004-11-23 1:58 ` Chris Wright 2 siblings, 2 replies; 36+ messages in thread From: Adrian Bunk @ 2004-11-20 12:40 UTC (permalink / raw) To: Len Brown Cc: Chris Wright, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Sat, Nov 20, 2004 at 04:02:04AM -0500, Len Brown wrote: > Please try this updated debug patch. > > It clears the ELCR on Linux boot. > > Also, it prints out the ICH PIRQ registers which > are the hardware underlying the ACPI PCI Interrupt Links. > It no longer depends on IOAPIC support in the kernel, > nor the apic=debug flag. > > Please boot it with no kernel parameters > and report if it makes the floppy probe failure > (or 8042 probe failure) go away, and send dmesg. >... With your patch, the boot failure goes away. This was with a kernel without Linus' patch applied. dmesg is below. > thanks, > -Len >... cu Adrian BTW: Is everything ACPI does really required, if all I need ACPI for is to turn the power off after halting my computer? Linux version 2.6.10-rc2 (bunk@r063144.stusta.swh.mhn.de) (gcc version 3.4.2 (Debian 3.4.2-3)) #2 Sat Nov 20 13:20:42 CET 2004 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000000fff0000 (usable) BIOS-e820: 000000000fff0000 - 000000000fff8000 (ACPI data) BIOS-e820: 000000000fff8000 - 0000000010000000 (ACPI NVS) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved) BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved) 255MB LOWMEM available. 
On node 0 totalpages: 65520 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 61424 pages, LIFO batch:14 HighMem zone: 0 pages, LIFO batch:1 DMI 2.3 present. ... PIC IMR: fffb ... PIC IRR: 0001 ... PIC ISR: 0000 ... PIC ELCR: 0c68 PIRQA -> IRQ15 disabled PIRQB -> IRQ15 disabled PIRQC -> IRQ15 disabled PIRQD -> IRQ15 disabled PIRQE -> IRQ15 disabled PIRQF -> IRQ15 disabled PIRQG -> IRQ15 disabled PIRQH -> IRQ15 disabled ACPI: RSDP (v000 AMI ) @ 0x000fab70 ACPI: RSDT (v001 AMIINT SiS740XX 0x00001000 MSFT 0x0100000b) @ 0x0fff0000 ACPI: FADT (v001 AMIINT SiS740XX 0x00000011 MSFT 0x0100000b) @ 0x0fff0030 ACPI: MADT (v001 AMIINT SiS740XX 0x00001000 MSFT 0x0100000b) @ 0x0fff00c0 ACPI: DSDT (v001 SiS 746 0x00000100 MSFT 0x0100000d) @ 0x00000000 Built 1 zonelists Kernel command line: BOOT_IMAGE=test ro root=301 mode=1280x1024@760 Initializing CPU#0 CPU 0 irqstacks, hard=c0466000 soft=c0465000 PID hash table entries: 1024 (order: 10, 16384 bytes) Detected 1800.276 MHz processor. Using tsc for high-res timesource Console: colour VGA+ 80x25 Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Memory: 255340k/262080k available (2365k kernel code, 6172k reserved, 942k data, 144k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 3563.52 BogoMIPS (lpj=1781760) Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 256K (64 bytes/line) CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020 CPU: AMD Athlon(tm) XP 2200+ stepping 01 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. clear ELCR ... PIC IMR: fffa ... PIC IRR: 0000 ... 
PIC ISR: 0000 ... PIC ELCR: 0000 PIRQA -> IRQ15 disabled PIRQB -> IRQ15 disabled PIRQC -> IRQ15 disabled PIRQD -> IRQ15 disabled PIRQE -> IRQ15 disabled PIRQF -> IRQ15 disabled PIRQG -> IRQ15 disabled PIRQH -> IRQ15 disabled ACPI: IRQ9 SCI: Edge set to Level Trigger. NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfdb31, last bus=2 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20041105 ACPI: Interpreter enabled ACPI: Using PIC for interrupt routing LENB: ACPI has not touched Links yet ... PIC IMR: fdfa ... PIC IRR: 0000 ... PIC ISR: 0000 ... PIC ELCR: 0200 PIRQA -> IRQ15 disabled PIRQB -> IRQ15 disabled PIRQC -> IRQ15 disabled PIRQD -> IRQ15 disabled PIRQE -> IRQ15 disabled PIRQF -> IRQ15 disabled PIRQG -> IRQ15 disabled PIRQH -> IRQ15 disabled ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) Uncovering SIS963 that hid as a SIS503 (compatible=0) Enabling SiS 96x SMBus. ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: Power Resource [URP1] (off) ACPI: Power Resource [URP2] (off) ACPI: Power Resource [FDDP] (off) ACPI: Power Resource [LPTP] (off) ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15) _DISabled ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. _DISabled ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. _DISabled ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 *6 7 10 11 12 14 15) _DISabled ACPI: PCI Interrupt Link [LNKE] (IRQs *3 4 5 6 7 10 11 12 14 15) _DISabled ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 *5 6 7 10 11 12 14 15) _DISabled ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled. _DISabled ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *10 11 12 14 15) _DISabled SCSI subsystem initialized usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Using ACPI for IRQ routing ... PIC IMR: fdfa ... PIC IRR: 0c68 ... PIC ISR: 0000 ... 
PIC ELCR: 0200 PIRQA -> IRQ15 disabled PIRQB -> IRQ15 disabled PIRQC -> IRQ15 disabled PIRQD -> IRQ15 disabled PIRQE -> IRQ15 disabled PIRQF -> IRQ15 disabled PIRQG -> IRQ15 disabled PIRQH -> IRQ15 disabled ** PCI interrupts are no longer routed automatically. If this ** causes a device to stop working, it is probably because the ** driver failed to call pci_enable_device(). As a temporary ** workaround, the "pci=routeirq" argument restores the old ** behavior. If this argument makes the device work again, ** please email the output of "lspci" to bjorn.helgaas@hp.com ** so I can fix the driver. ... PIC IMR: fdfa ... PIC IRR: 0c68 ... PIC ISR: 0000 ... PIC ELCR: 0200 PIRQA -> IRQ15 disabled PIRQB -> IRQ15 disabled PIRQC -> IRQ15 disabled PIRQD -> IRQ15 disabled PIRQE -> IRQ15 disabled PIRQF -> IRQ15 disabled PIRQG -> IRQ15 disabled PIRQH -> IRQ15 disabled NTFS driver 2.1.22 [Flags: R/O]. ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11 PCI: setting IRQ 11 as level-triggered ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 11 (level, low) -> IRQ 11 radeonfb: Found Intel x86 BIOS ROM Image radeonfb: Retreived PLL infos from BIOS radeonfb: Reference=27.00 MHz (RefDiv=60) Memory=155.00 Mhz, System=155.00 MHz radeonfb: PLL min 12000 max 35000 radeonfb: Monitor 1 type DFP found radeonfb: EDID probed radeonfb: Monitor 2 type no found Console: switching to colour frame buffer device 160x64 radeonfb: ATI Radeon QY DDR SGRAM 64 MB lp: driver loaded but no devices found Real Time Clock Driver v1.12 Non-volatile memory driver v1.2 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected SiS 746 chipset agpgart: Maximum main memory to use for agp memory: 203M agpgart: AGP aperture is 64M @ 0xd0000000 ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 11 (level, low) -> IRQ 11 [drm] Initialized radeon 1.11.0 20020828 on minor 0: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE] serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 Serial: 8250/16550 
driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP] parport0: irq 7 detected lp0: using parport0 (polling). io scheduler noop registered io scheduler cfq registered elevator: using cfq as default io scheduler floppy0: no floppy controllers found loop: loaded (max 8 devices) sis900.c: v1.08.07 11/02/2003 ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 6 PCI: setting IRQ 6 as level-triggered ACPI: PCI interrupt 0000:00:04.0[A] -> GSI 6 (level, low) -> IRQ 6 eth0: Realtek RTL8201 PHY transceiver found at address 1. eth0: Using transceiver found at address 1 as default eth0: SiS 900 PCI Fast Ethernet at 0xdc00, IRQ 6, 00:00:00:00:00:00. Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx SIS5513: IDE controller at PCI slot 0000:00:02.5 SIS5513: chipset revision 0 SIS5513: not 100% native mode: will probe irqs later SIS5513: SiS 962/963 MuTIOL IDE UDMA133 controller ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xff08-0xff0f, BIOS settings: hdc:DMA, hdd:DMA Probing IDE interface ide0... hda: SAMSUNG SV1604N, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... 
hdc: LITE-ON LTR-12101B, ATAPI CD/DVD-ROM drive ide1 at 0x170-0x177,0x376 on irq 15 hda: max request size: 1024KiB hda: 312581808 sectors (160041 MB) w/2048KiB Cache, CHS=19457/255/63, UDMA(100) hda: cache flushes supported hda: hda1 hda2 hda3 hda4 hdc: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache, DMA Uniform CD-ROM driver Revision: 3.20 ACPI: PCI Interrupt Link [LNKH] enabled at IRQ 10 PCI: setting IRQ 10 as level-triggered ACPI: PCI interrupt 0000:00:03.2[D] -> GSI 10 (level, low) -> IRQ 10 ehci_hcd 0000:00:03.2: Silicon Integrated Systems [SiS] USB 2.0 Controller ehci_hcd 0000:00:03.2: irq 10, pci mem 0xcffff000 ehci_hcd 0000:00:03.2: new USB bus registered, assigned bus number 1 PCI: cache line size of 64 is not supported by device 0000:00:03.2 ehci_hcd 0000:00:03.2: USB 2.0 initialized, EHCI 1.00, driver 26 Oct 2004 hub 1-0:1.0: USB hub found hub 1-0:1.0: 6 ports detected Initializing USB Mass Storage driver... usbcore: registered new driver usb-storage USB Mass Storage support registered. mice: PS/2 mouse device common for all mice input: AT Translated Set 2 keyboard on isa0060/serio0 input: ImPS/2 Generic Wheel Mouse on isa0060/serio1 i2c /dev entries driver Advanced Linux Sound Architecture Driver Version 1.0.6 (Sun Aug 15 07:17:53 2004 UTC). ACPI: PCI interrupt 0000:00:0c.0[A] -> GSI 11 (level, low) -> IRQ 11 ALSA device list: #0: Ensoniq AudioPCI ENS1371 at 0xd800, irq 11 NET: Registered protocol family 2 IP: routing cache hash table of 2048 buckets, 16Kbytes TCP: Hash tables configured (established 16384 bind 32768) ip_conntrack version 2.1 (2047 buckets, 16376 max) - 176 bytes per conntrack ip_tables: (C) 2000-2002 Netfilter core team NET: Registered protocol family 1 NET: Registered protocol family 17 ... PIC IMR: 20f8 ... PIC IRR: 00a8 ... PIC ISR: 0000 ... 
PIC ELCR: 0e40 PIRQA -> IRQ15 disabled PIRQB -> IRQ15 disabled PIRQC -> IRQ15 disabled PIRQD -> IRQ15 disabled PIRQE -> IRQ15 disabled PIRQF -> IRQ15 disabled PIRQG -> IRQ15 disabled PIRQH -> IRQ15 disabled BIOS EDD facility v0.16 2004-Jun-25, 1 devices found VFS: Mounted root (ext2 filesystem) readonly. Freeing unused kernel memory: 144k freed Adding 987988k swap on /dev/hda2. Priority:-1 extents:1 eth0: Media Link On 10mbps half-duplex agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0. agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode agpgart: SiS delay workaround: giving bridge time to recover. agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode ^ permalink raw reply [flat|nested] 36+ messages in thread
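The "PIC ELCR" values in the log above are 16-bit masks with one bit per legacy IRQ (port 0x4d0 supplies bits 0-7 and 0x4d1 bits 8-15, as in the debug patch's read). A small sketch for turning such a mask into a list of level-triggered IRQs; the helper name is made up for illustration:

```c
#include <stdint.h>
#include <stddef.h>

/* Collect the IRQ numbers whose ELCR bit is set (level-triggered).
 * irqs[] must have room for 16 entries; returns how many were stored. */
size_t elcr_level_irqs(uint16_t elcr, unsigned irqs[16])
{
    size_t n = 0;

    for (unsigned irq = 0; irq < 16; irq++) {
        if (elcr & (1u << irq))
            irqs[n++] = irq;
    }
    return n;
}
```

Applied to the log: the BIOS-provided 0c68 decodes to IRQs 3, 5, 6, 10 and 11, which are the starred defaults of LNKE, LNKF, LNKD, LNKH and LNKA; 0200 is IRQ 9 alone (the SCI); and the final 0e40 is IRQs 6, 9, 10 and 11, matching the links that were enabled on demand plus the SCI.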
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 12:40 ` Adrian Bunk @ 2004-11-20 18:28 ` Linus Torvalds 2004-11-20 19:10 ` Linus Torvalds ` (3 more replies) 2004-11-22 18:28 ` Len Brown 1 sibling, 4 replies; 36+ messages in thread From: Linus Torvalds @ 2004-11-20 18:28 UTC (permalink / raw) To: Adrian Bunk Cc: Len Brown, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Sat, 20 Nov 2004, Adrian Bunk wrote: > > With your patch, the boot failure goes away. > This was with a kernel without Linus' patch applied. It boots for me too, but it all seems pretty accidental. In particular, the code will disable irq12 (mouse interrupt), so the mouse has no chance of working. Having tested it, it so happens that if I boot with a mouse actually connected, the BIOS will not use irq12 for PCI devices, so that will hide the problem since ACPI won't try to disable it when it doesn't see it. But if I had more PCI devices, or another legacy device that doesn't have the same kind of "test if something is connected" logic, it really looks like it would break again. (It's entirely possible that Windows has the exact same issue, of course, at which point it's fairly safe to say that manufacturers will have tested this and just not done it, but I don't feel all that safe making that assumption). So I think the simpler fix is just this one-liner: we should not disable preexisting links, because non-PCI devices may depend on the same routing information, and thus the comments about "being activated on use" is not actually true. Linus ===== drivers/acpi/pci_link.c 1.34 vs edited ===== --- 1.34/drivers/acpi/pci_link.c 2004-11-01 23:40:09 -08:00 +++ edited/drivers/acpi/pci_link.c 2004-11-20 09:43:56 -08:00 @@ -685,9 +685,6 @@ acpi_link.count++; end: - /* disable all links -- to be activated on use */ - acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL); - if (result) kfree(link); ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 18:28 ` Linus Torvalds @ 2004-11-20 19:10 ` Linus Torvalds 2004-11-22 19:55 ` Len Brown 2004-11-24 16:26 ` Alan Cox 2004-11-21 16:29 ` Adrian Bunk ` (2 subsequent siblings) 3 siblings, 2 replies; 36+ messages in thread From: Linus Torvalds @ 2004-11-20 19:10 UTC (permalink / raw) To: Adrian Bunk Cc: Len Brown, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Sat, 20 Nov 2004, Linus Torvalds wrote: > > In particular, the code will disable irq12 (mouse interrupt), so the mouse > has no chance of working. Btw, looking closer still, this all will most likely vary wildly according to southbridge (and BIOS setups). At least some SB's seem to put the legacy interrupts totally separately from the PIRQ stuff, in which case the PIRQ disable will not matter one whit - the legacy interrupt is inserted "after" the PIRQ gating/translation anyway. This seems to be especially common for controllers for keyboard/mouse/i2c etc that are actually on the southbridge itself. But the basic notion remains: disabling a PIRQ line is valid only if you know it's only used by PCI devices. There might be other special devices on the board that don't show up as PCI devices, eg things like the Sony programmable I/O thing that doesn't show up as a PCI device at all, it's just "invisibly" connected to the bus (it just hijacks port 0x66 or something - the range 0-0x3ff is generally reserved for "motherboard devices"). These kinds of things hopefully aren't all that common (there can't be a lot of extra hw required to follow the PCI spec _properly_), but if I were a hw designer, I'd connect such a chip to the PIRQ input, and just make the BIOS enable it automatically. Linus ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 19:10 ` Linus Torvalds @ 2004-11-22 19:55 ` Len Brown 2004-11-24 16:26 ` Alan Cox 1 sibling, 0 replies; 36+ messages in thread From: Len Brown @ 2004-11-22 19:55 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Sat, 2004-11-20 at 14:10, Linus Torvalds wrote: > > On Sat, 20 Nov 2004, Linus Torvalds wrote: > > > > In particular, the code will disable irq12 (mouse interrupt), so the > mouse > > has no chance of working. > > Btw, looking closer still, this all will most likely vary wildly > according to southbridge (and BIOS setups). At least some SB's seem to > put the legacy interrupts totally separately from the PIRQ stuff, in > which case the PIRQ disable will not matter one whit - the legacy > interrupt is inserted "after" the PIRQ gating/translation anyway. This > seems to be especially common for controllers for keyboard/mouse/i2c > etc that are actually on the southbridge itself. Right, programming a PIRQ router to an IRQ doesn't mean that a legacy device isn't still attached to that IRQ. > But the basic notion remains: disabling a PIRQ line is valid only if > you know it's only used by PCI devices. There might be other special > devices on the board that don't show up as PCI devices, eg things like > the Sony programmable I/O thing that doesn't show up as a PCI device > at all, it's just "invisibly" connected to the bus (it just hijacks > port 0x66 or something - the range 0-0x3ff is generally reserved for > "motherboard devices"). > These kinds of things hopefully aren't all that common (there can't be > a lot of extra hw required to follow the PCI spec _properly_), but if > I were a hw designer, I'd connect such a chip to the PIRQ input, and > just make the BIOS enable it automatically. 
While there may be non-standard non-PCI legacy devices that (erroneously) use PIRQ routers on legacy systems, that isn't the issue at hand. The issue at hand is what to do in ACPI mode. ACPI PCI Interrupt Link Devices, by definition, are used only by PCI devices, or devices that look like them. You look up the device in the _PRT by its devid. Although links are often implemented underneath by PIRQ routers, they are much more general. ACPI PCI Interrupt Links can specify any trigger/polarity, as well as connect to IOAPIC inputs. If there is "special" hardware using an ACPI PCI Interrupt Link without being listed in the DSDT _PRT that describes the link, then the BIOS is simply broken. -Len ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 19:10 ` Linus Torvalds 2004-11-22 19:55 ` Len Brown @ 2004-11-24 16:26 ` Alan Cox 1 sibling, 0 replies; 36+ messages in thread From: Alan Cox @ 2004-11-24 16:26 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Len Brown, Chris Wright, Bjorn Helgaas, Linux Kernel Mailing List, Andrew Morton On Sat, 2004-11-20 at 19:10, Linus Torvalds wrote: > These kinds of things hopefully aren't all that common (there can't be a > lot of extra hw required to follow the PCI spec _properly_), but if I were > a hw designer, I'd connect such a chip to the PIRQ input, and just make > the BIOS enable it automatically. The PCI spec includes several such bits of magic itself remember - notably on a PC the use of IRQ14 and IRQ15 by the IDE controller is not via the PCI PIRQ routing. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 18:28 ` Linus Torvalds 2004-11-20 19:10 ` Linus Torvalds @ 2004-11-21 16:29 ` Adrian Bunk 2004-11-22 19:29 ` Len Brown 2004-11-23 2:00 ` Chris Wright 3 siblings, 0 replies; 36+ messages in thread From: Adrian Bunk @ 2004-11-21 16:29 UTC (permalink / raw) To: Linus Torvalds Cc: Len Brown, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Sat, Nov 20, 2004 at 10:28:27AM -0800, Linus Torvalds wrote: >... > So I think the simpler fix is just this one-liner: we should not disable > preexisting links, because non-PCI devices may depend on the same routing > information, and thus the comments about "being activated on use" is not > actually true. I can confirm this patch fixes the problem for me in an otherwise unmodified 2.6.10-rc2. > Linus > > ===== drivers/acpi/pci_link.c 1.34 vs edited ===== > --- 1.34/drivers/acpi/pci_link.c 2004-11-01 23:40:09 -08:00 > +++ edited/drivers/acpi/pci_link.c 2004-11-20 09:43:56 -08:00 > @@ -685,9 +685,6 @@ > acpi_link.count++; > > end: > - /* disable all links -- to be activated on use */ > - acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL); > - > if (result) > kfree(link); cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 18:28 ` Linus Torvalds 2004-11-20 19:10 ` Linus Torvalds 2004-11-21 16:29 ` Adrian Bunk @ 2004-11-22 19:29 ` Len Brown 2004-11-22 20:02 ` Linus Torvalds 2004-11-23 2:00 ` Chris Wright 3 siblings, 1 reply; 36+ messages in thread From: Len Brown @ 2004-11-22 19:29 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Sat, 2004-11-20 at 13:28, Linus Torvalds wrote: > > On Sat, 20 Nov 2004, Adrian Bunk wrote: > > > > With your patch, the boot failure goes away. > > This was with a kernel without Linus' patch applied. > > It boots for me too, but it all seems pretty accidental. > > In particular, the code will disable irq12 (mouse interrupt), so the > mouse has no chance of working. Having tested it, it so happens that > if I boot with a mouse actually connected, the BIOS will not use > irq12 for PCI devices, so that will hide the problem since ACPI won't > try to disable it when it doesn't see it. This is not an accident that is being hidden by mistake. The BIOS probes for the legacy mouse. If it finds one, it gives it IRQ12. If it doesn't find one it gives IRQ12 to PCI. Other BIOS's are not as good as these and they don't give the free IRQs to PCI. Sub-optimal, but perfectly legal. Exact same situation Chris and Adrian see with the floppy and IRQ6. Exact same situation could potentially be seen with any motherboard device and any IRQ -- except, of course, those that can't be probed or are otherwise hard-coded for well known legacy devices, such as the RTC on IRQ8. > But if I had more PCI devices, or another legacy device that doesn't > have the same kind of "test if something is connected" logic, it > really looks like it would break again. 
(It's entirely possible that > Windows has the exact same issue, of course, at which point it's > fairly safe to say that manufacturers will have tested this and just > not done it, but I don't feel all that safe making that assumption). Windows uses ACPI to probe the legacy motherboard devices, and ACPI uses what the BIOS finds. If the BIOS and ACPI don't know about the motherboard device, then it isn't an ACPI system and among other failures, it would never have got a nifty Made for Windows sticker, and thus would have market penetration of approximately 0. Linux is just now learning to use ACPI for probing legacy motherboard devices. I believe that this transition can be made safely and that legacy probes should continue to work both during and after the transition. > So I think the simpler fix is just this one-liner: we should not > disable preexisting links, because non-PCI devices may depend on the > same routing information, and thus the comments about "being activated > on use" is not actually true. > > Linus > > ===== drivers/acpi/pci_link.c 1.34 vs edited ===== > --- 1.34/drivers/acpi/pci_link.c 2004-11-01 23:40:09 -08:00 > +++ edited/drivers/acpi/pci_link.c 2004-11-20 09:43:56 -08:00 > @@ -685,9 +685,6 @@ > acpi_link.count++; > > end: > - /* disable all links -- to be activated on use */ > - acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL); > - > if (result) > kfree(link); This is not a fix, it will break systems in the field. When we enable a link, we must set the ELCR. When we disable a link, we must clear the ELCR. We need to be able to enable and disable all links in the system. The bug was that while we were setting the ELCR when we enabled a link, we were not clearing it when we disabled one. We also have an issue when we move the destination of a link, though we don't do that very often. My debug patch cleared every bit in the ELCR even if no PCI interrupt link device pointed to that entry. 
Yes, this assumes that ACPI knows about every level triggered interrupt in the system. I think that is a valid assumption on an ACPI-compliant system. But if you're more comfortable with disabling the associated ELCR bit only when we disable links directed at that entry, we can do that too. The complication with that approach is that links are many to one, so clearing the bit without disabling all links directed to that entry would result in a failure. Also, the SCI uses the ELCR too, and it isn't described by links at all. -Len ^ permalink raw reply [flat|nested] 36+ messages in thread
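The complication Len raises above (links are many-to-one, and the SCI also needs its ELCR bit) can be pictured with a per-IRQ use count: set the bit when the first user arrives, clear it only when the last one leaves. This is a user-space sketch under that assumption; the structure and function names are hypothetical, the ELCR is a shadow variable and not real port I/O, and nothing here is taken from the kernel:

```c
#include <stdint.h>

#define NR_LEGACY_IRQS 16

/* Hypothetical bookkeeping, not a kernel structure. */
struct elcr_state {
    uint16_t elcr;                   /* shadow of ports 0x4d0/0x4d1 */
    unsigned users[NR_LEGACY_IRQS];  /* links (or the SCI) routed to each IRQ */
};

/* A link (or the SCI) starts using irq: the line must be level-triggered. */
void elcr_get(struct elcr_state *s, unsigned irq)
{
    if (irq >= NR_LEGACY_IRQS)
        return;
    s->users[irq]++;
    s->elcr |= (uint16_t)(1u << irq);
}

/* A link stops using irq: return to edge only when no user remains. */
void elcr_put(struct elcr_state *s, unsigned irq)
{
    if (irq >= NR_LEGACY_IRQS || s->users[irq] == 0)
        return;
    if (--s->users[irq] == 0)
        s->elcr &= (uint16_t)~(1u << irq);
}
```

Under this scheme the SCI would simply take a reference at boot and never drop it, so disabling links could not flip its line back to edge.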
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-22 19:29 ` Len Brown @ 2004-11-22 20:02 ` Linus Torvalds 2004-11-22 20:10 ` Linus Torvalds 2004-11-22 20:38 ` Len Brown 0 siblings, 2 replies; 36+ messages in thread From: Linus Torvalds @ 2004-11-22 20:02 UTC (permalink / raw) To: Len Brown Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Mon, 22 Nov 2004, Len Brown wrote: > > Windows uses ACPI to probe the legacy motherboard devices, and ACPI uses > what the BIOS finds. If the BIOS and ACPI don't know about the > motherboard device, then it isn't an ACPI system and among other > failures, it would never have got a nifty Made for Windows sticker, and > thus would have market penetration of approximately 0. One thing that matters is definitely "how does Windows do things", because that's what has been tested. However, "Windows" clearly does not use ACPI for all enumeration, since there are different versions of Windows, and some of them don't (and some of them very much don't boot on all hardware either). In other words, Linux actually needs to be _more_ careful than Windows, because it's supposed to just magically work on pretty much anything out there. Nasty. So "what windows does" can never be anything but a hint on what kind of functionality has ever been tested. > When we enable a link, we must set the ELCR. > When we disable a link, we must clear the ELCR. > We need to be able to enable and disable all links in the system. > > The bug was that while we were setting the ELCR > when we enabled a link, we were not clearing it when we disabled one. Fair enough. That may be a good fix too, but so far I can see the bug on a system I actually have access to, and my one-liner fixes it in a fundamentally more acceptable manner than any other patch I've seen. Why? 
Because _not_ touching things is the other thing that _has_ been tested on machines, ie mb manufacturers actually tend to test things that have no ACPI knowledge at all, still. DOS comes to mind, but so do pretty much all other operating systems. Because of that, there's this dichtonomy: you either do everything exactly like Windows does things, or you try very carefully to do the minimal amount of untrusted accesses possible. Linux ends up mixing the two approaches as best it can. But feel free to send me a patch that doesn't just clear ELCR totally, but clears the bits we are disabling. I just don't believe in the "let's just clear everything" approach. In particular, I don't have the same kind of "ACPI will provide" belief in higher powers that you seem to have. > But if you're more comfortable with disabling the associated ELCR bit > only when we disable links directed at that entry, we can do that too. > The complication with that approach is that links are many to one, so > clearing the bit without disabling all links directed to that entry > would result in a failure. Also, the SCI uses the ELCR too, and it > isn't described by links at all. Wouldn't it be nicer to take the _reverse_ approach: let's assume that any PCI interrupts that we have already enabled are fine and should not be disabled? Mark them in the ELCR, and _report_ when the ELCR seems to be incorrect (let's make a wild guess here, and realize that the screaming VIA interrupts you talk about are exactly because the ELCR was wrong). This is exactly what you are already doing with SCI, thanks to "acpi_pic_sci_set_trigger()", no? So I'm really suggesting that instead of disabling the PCI irq routing, it should do exactly the same thing that SCI already does. Namely make sure that ELCR is set correctly for it. So in "acpi_pci_link_add()", when you find a link that is enabled, add a call to make sure that it is set to level triggered in the ELCR. 
That's not even ACPI-specific, now we're talking fundamental PCI behaviour, so the likelihood of that being wrong is pretty low, no? That seems like a _safe_ thing to do. Linus ^ permalink raw reply [flat|nested] 36+ messages in thread
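The "reverse approach" proposed above can be sketched in user space as follows: leave every link the firmware enabled alone, make sure its IRQ is marked level triggered in the ELCR, and report each case where the ELCR disagreed. This is only an illustration under stated assumptions: struct boot_link and elcr_fixup_enabled_links() are hypothetical names, the ELCR is modelled as a plain 16-bit value, and nothing here is the code that was eventually merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a PCI interrupt link as found at boot. */
struct boot_link {
	unsigned int irq;  /* legacy IRQ (0..15) the link is routed to */
	int enabled;       /* nonzero if firmware left the link enabled */
};

/*
 * Return the ELCR value with every firmware-enabled link marked level
 * triggered. *fixed counts the IRQs whose bit had to be corrected.
 */
static uint16_t elcr_fixup_enabled_links(uint16_t elcr,
					 const struct boot_link *links,
					 size_t n, unsigned int *fixed)
{
	size_t i;

	*fixed = 0;
	for (i = 0; i < n; i++) {
		uint16_t mask;

		if (!links[i].enabled)
			continue;
		assert(links[i].irq < 16);
		mask = (uint16_t)(1u << links[i].irq);
		if (!(elcr & mask)) {
			/* Report the mismatch instead of fixing it silently. */
			fprintf(stderr, "IRQ%u: link enabled but ELCR says edge\n",
				links[i].irq);
			elcr |= mask;
			(*fixed)++;
		}
	}
	return elcr;
}
```

Disabled links are skipped entirely, which is the point of the proposal: nothing that firmware set up is un-programmed.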
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-22 20:02 ` Linus Torvalds
@ 2004-11-22 20:10 ` Linus Torvalds
  2004-11-22 20:38 ` Len Brown
  1 sibling, 0 replies; 36+ messages in thread
From: Linus Torvalds @ 2004-11-22 20:10 UTC (permalink / raw)
  To: Len Brown
  Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Mon, 22 Nov 2004, Linus Torvalds wrote:
>
> This is exactly what you are already doing with SCI, thanks to
> "acpi_pic_sci_set_trigger()", no?
>
> So I'm really suggesting that instead of disabling the PCI irq routing, it
> should do exactly the same thing that SCI already does. Namely make sure
> that ELCR is set correctly for it.

In fact, we would even use the same function for it (the only thing that
makes it SCI-specific right now is the "printk()" that says "SCI IRQ", the
rest really is totally generic).

So how about renaming "acpi_pic_sci_set_trigger()" to not have the "sci"
part in there, and remove its dependence on CONFIG_ACPI_BUS, and just
using it in "acpi_pci_link_add()" to make sure that any PCI links we find
to be enabled have the right ELCR.

That's _logical_, since if we were to actually enable them, we'd set ELCR
right. So literally the only difference between disabling them at boot
(and then re-enabling them when a driver finds them) _is_ that ELCR
setting..

And that would make me much happier, because it's a "minimally intrusive"
thing to do.

		Linus

^ permalink raw reply [flat|nested] 36+ messages in thread
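As a rough illustration of the renaming idea above, a trigger setter with the "sci" part dropped could look like the sketch below. The port and mask arithmetic (port 0x4d0 + (irq >> 3), bit irq & 7) and the trigger values 1 for edge and 3 for level follow the existing function as it appears in the patch later in this thread. Everything else is an assumption made so the sketch runs in user space: pic_set_trigger(), fake_inb() and fake_outb() are hypothetical names, and a two-byte array stands in for the real I/O ports.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated ELCR: [0] is port 0x4d0 (IRQ0-7), [1] is port 0x4d1 (IRQ8-15). */
static uint8_t fake_elcr[2];

static uint8_t fake_inb(unsigned int port)
{
	return fake_elcr[port - 0x4d0];
}

static void fake_outb(uint8_t val, unsigned int port)
{
	fake_elcr[port - 0x4d0] = val;
}

enum { TRIGGER_EDGE = 1, TRIGGER_LEVEL = 3 };

/* Returns 1 if the register had to be changed, 0 if it was already right. */
static int pic_set_trigger(unsigned int irq, unsigned int trigger)
{
	uint8_t mask, val, want;
	unsigned int port;

	assert(irq < 16);
	mask = (uint8_t)(1u << (irq & 7));
	port = 0x4d0 + (irq >> 3);
	val = fake_inb(port);

	if (trigger == TRIGGER_LEVEL)
		want = (uint8_t)(val | mask);
	else if (trigger == TRIGGER_EDGE)
		want = (uint8_t)(val & (uint8_t)~mask);
	else
		return 0;	/* any other value: leave the bit untouched */

	if (want == val)
		return 0;
	fake_outb(want, port);
	return 1;
}
```

Called for each link found enabled at boot with TRIGGER_LEVEL, this would touch only the bits of IRQs that are actually in use.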
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-22 20:02 ` Linus Torvalds
  2004-11-22 20:10 ` Linus Torvalds
@ 2004-11-22 20:38 ` Len Brown
  2004-11-23 2:45 ` Linus Torvalds
  1 sibling, 1 reply; 36+ messages in thread
From: Len Brown @ 2004-11-22 20:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Mon, 2004-11-22 at 15:02, Linus Torvalds wrote:
>
> On Mon, 22 Nov 2004, Len Brown wrote:
> >
> > When we enable a link, we must set the ELCR.
> > When we disable a link, we must clear the ELCR.
> > We need to be able to enable and disable all links in the system.
> >
> > The bug was that while we were setting the ELCR
> > when we enabled a link, we were not clearing it when we disabled one.
>
> Fair enough.
...
> But feel free to send me a patch that doesn't just clear ELCR totally,
> but clears the bits we are disabling. I just don't believe in the
> "let's just clear everything" approach.

Will do.

> > But if you're more comfortable with disabling the associated ELCR bit
> > only when we disable links directed at that entry, we can do that too.
> > The complication with that approach is that links are many to one, so
> > clearing the bit without disabling all links directed to that entry
> > would result in a failure. Also, the SCI uses the ELCR too, and it
> > isn't described by links at all.
>
> Wouldn't it be nicer to take the _reverse_ approach: let's assume that
> any PCI interrupts that we have already enabled are fine and should
> not be disabled? Mark them in the ELCR, and _report_ when the ELCR
> seems to be incorrect (let's make a wild guess here, and realize that
> the screaming VIA interrupts you talk about are exactly because the
> ELCR was wrong).

I think the VIA case is more complicated than that, but I'll take
another look at it.

thanks,
-Len

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-22 20:38 ` Len Brown @ 2004-11-23 2:45 ` Linus Torvalds 2004-11-23 4:57 ` Linus Torvalds 0 siblings, 1 reply; 36+ messages in thread From: Linus Torvalds @ 2004-11-23 2:45 UTC (permalink / raw) To: Len Brown Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton, Stian Jordet On Mon, 22 Nov 2004, Len Brown wrote: > > I think the VIA case is more complicated than that, but I'll take > another look at it. Ok, having looked at the bugzilla entry, I have to concur. Looks like we do need to disable the PCI interrupts and re-enable them. Mea culpa. So what's the right way to get ELCR into a useful state? I'm starting to lean towards your "just clear it all" after all, but that does the wrong thing for SCI (which is _usually_ level-triggered), and I worry that there are other cases too. Any reasonably simple patch that likely gets it right? Linus ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-23 2:45 ` Linus Torvalds @ 2004-11-23 4:57 ` Linus Torvalds 2004-11-23 7:06 ` Len Brown 0 siblings, 1 reply; 36+ messages in thread From: Linus Torvalds @ 2004-11-23 4:57 UTC (permalink / raw) To: Len Brown Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton, Stian Jordet On Mon, 22 Nov 2004, Linus Torvalds wrote: > > So what's the right way to get ELCR into a useful state? I'm starting to > lean towards your "just clear it all" after all, but that does the wrong > thing for SCI (which is _usually_ level-triggered), and I worry that there > are other cases too. > > Any reasonably simple patch that likely gets it right? Len, how about this patch - it re-enables the link disable and then re-codes the ELCR setting to match. Basically it just computes the new ELCR: if acpi_noirq is set, it leaves it at the old value, otherwise it zeroes it - and in both cases it fixes the SCI entry. Your argument for doing this ended up being convincing, so the only difference between this and your debug patch is really just the obvious organizational ones, and the test for "acpi_noirq", which I think is needed (since if acpi_noirq is set, we're not going to disable and re-enable the PCI interrupts, so we'll just have to trust ELCR). 
		Linus

----
===== arch/i386/kernel/acpi/boot.c 1.75 vs edited =====
--- 1.75/arch/i386/kernel/acpi/boot.c	2004-11-11 16:08:40 -08:00
+++ edited/arch/i386/kernel/acpi/boot.c	2004-11-22 20:55:57 -08:00
@@ -409,28 +409,38 @@
 
 void __init acpi_pic_sci_set_trigger(unsigned int irq, u16 trigger)
 {
-	unsigned char mask = 1 << (irq & 7);
-	unsigned int port = 0x4d0 + (irq >> 3);
-	unsigned char val = inb(port);
+	unsigned int mask = 1 << irq;
+	unsigned int old, new;
 
-
-	printk(PREFIX "IRQ%d SCI:", irq);
-	if (!(val & mask)) {
-		printk(" Edge");
+	/* Real old ELCR mask */
+	old = inb(0x4d0) | (inb(0x4d1) << 8);
 
-		if (trigger == 3) {
-			printk(" set to Level");
-			outb(val | mask, port);
-		}
-	} else {
-		printk(" Level");
+	/*
+	 * If we use ACPI to set PCI irq's, then we should clear ELCR
+	 * since we will set it correctly as we enable the PCI irq
+	 * routing.
+	 */
+	new = acpi_noirq ? old : 0;
 
-		if (trigger == 1) {
-			printk(" set to Edge");
-			outb(val & ~mask, port);
-		}
+	/*
+	 * Update SCI information in the ELCR, it isn't in the PCI
+	 * routing tables..
+	 */
+	switch (trigger) {
+	case 1:	/* Edge - clear */
+		new &= ~mask;
+		break;
+	case 3: /* Level - set */
+		new |= mask;
+		break;
 	}
-	printk(" Trigger.\n");
+
+	if (old == new)
+		return;
+
+	printk(PREFIX "setting ELCR to %04x (from %04x)\n", new, old);
+	outb(new, 0x4d0);
+	outb(new >> 8, 0x4d1);
 }
 
===== drivers/acpi/pci_link.c 1.35 vs edited =====
--- 1.35/drivers/acpi/pci_link.c	2004-11-22 10:41:11 -08:00
+++ edited/drivers/acpi/pci_link.c	2004-11-22 20:02:53 -08:00
@@ -685,6 +685,9 @@
 	acpi_link.count++;
 
 end:
+	/* disable all links -- to be activated on use */
+	acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL);
+
 	if (result)
 		kfree(link);
 

^ permalink raw reply [flat|nested] 36+ messages in thread
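The ELCR computation in the patch above can be pulled out into a pure function to check a few cases by hand. The logic mirrors the patch (keep the old mask when acpi_noirq is set, otherwise start from zero, then set or clear the SCI bit for trigger 3 or 1). The function name elcr_compute(), the explicit noirq parameter and the range assert are additions for illustration only and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * old:     current 16-bit ELCR (port 0x4d0 in the low byte, 0x4d1 in the high byte)
 * noirq:   stands in for the kernel's acpi_noirq flag
 * sci_irq: IRQ used by the ACPI SCI
 * trigger: 1 = edge, 3 = level, anything else leaves the SCI bit alone
 */
static uint16_t elcr_compute(uint16_t old, int noirq,
			     unsigned int sci_irq, unsigned int trigger)
{
	uint16_t mask, new_elcr;

	assert(sci_irq < 16);
	mask = (uint16_t)(1u << sci_irq);

	/* With ACPI irq routing in use, start clean; links set bits later. */
	new_elcr = noirq ? old : 0;

	switch (trigger) {
	case 1:	/* Edge - clear */
		new_elcr &= (uint16_t)~mask;
		break;
	case 3:	/* Level - set */
		new_elcr |= mask;
		break;
	default:
		break;
	}
	return new_elcr;
}
```

For example, with an old value of 0x0e20 (IRQ5, 9, 10 and 11 level) and a level-triggered SCI on IRQ9, the result is 0x0200 when ACPI routes interrupts and the untouched 0x0e20 when acpi_noirq is set.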
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-23 4:57 ` Linus Torvalds @ 2004-11-23 7:06 ` Len Brown 2004-11-23 20:13 ` Stian Jordet 0 siblings, 1 reply; 36+ messages in thread From: Len Brown @ 2004-11-23 7:06 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton, Stian Jordet On Mon, 2004-11-22 at 23:57, Linus Torvalds wrote: > > On Mon, 22 Nov 2004, Linus Torvalds wrote: > > > Len, how about this patch - it re-enables the link disable and then > re-codes the ELCR setting to match. > > Basically it just computes the new ELCR: if acpi_noirq is set, it > leaves it at the old value, otherwise it zeroes it - and in both cases > it fixes the SCI entry. > Your argument for doing this ended up being convincing, so the only > difference between this and your debug patch is really just the > obvious organizational ones, and the test for "acpi_noirq", which I > think is needed (since if acpi_noirq is set, we're not going to > disable and re-enable the PCI interrupts, so we'll just have to trust > ELCR). I think your use of acpi_noirq in acpi_pic_sci_set_trigger() was clever -- maybe more clever than the name of the routine suggests -- but it looks correct. thanks for restoring pci_link.c -Len ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-23 7:06 ` Len Brown
@ 2004-11-23 20:13 ` Stian Jordet
  0 siblings, 0 replies; 36+ messages in thread
From: Stian Jordet @ 2004-11-23 20:13 UTC (permalink / raw)
  To: Len Brown; +Cc: Linus Torvalds, Kernel Mailing List, Andrew Morton

Hi,

If the reason for cc'ing me was for me to verify it still works on my VIA
board, I can confirm it does with Linus' patch :) Thanks.

Best regards,
Stian

On Tue, 23.11.2004 at 02:06 -0500, Len Brown wrote:
> On Mon, 2004-11-22 at 23:57, Linus Torvalds wrote:
> >
> > On Mon, 22 Nov 2004, Linus Torvalds wrote:
> > >
> >
> > Len, how about this patch - it re-enables the link disable and then
> > re-codes the ELCR setting to match.
> >
> > Basically it just computes the new ELCR: if acpi_noirq is set, it
> > leaves it at the old value, otherwise it zeroes it - and in both cases
> > it fixes the SCI entry.
>
> > Your argument for doing this ended up being convincing, so the only
> > difference between this and your debug patch is really just the
> > obvious organizational ones, and the test for "acpi_noirq", which I
> > think is needed (since if acpi_noirq is set, we're not going to
> > disable and re-enable the PCI interrupts, so we'll just have to trust
> > ELCR).
>
> I think your use of acpi_noirq in acpi_pic_sci_set_trigger() was clever
> -- maybe more clever than the name of the routine suggests -- but it
> looks correct.
>
> thanks for restoring pci_link.c
>
> -Len
>

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 18:28 ` Linus Torvalds ` (2 preceding siblings ...) 2004-11-22 19:29 ` Len Brown @ 2004-11-23 2:00 ` Chris Wright 3 siblings, 0 replies; 36+ messages in thread From: Chris Wright @ 2004-11-23 2:00 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Len Brown, Chris Wright, Bjorn Helgaas, Kernel Mailing List, Andrew Morton Also for the record... * Linus Torvalds (torvalds@osdl.org) wrote: > So I think the simpler fix is just this one-liner: we should not disable > preexisting links, because non-PCI devices may depend on the same routing > information, and thus the comments about "being activated on use" is not > actually true. This boots as expected (no irq6 storm). thanks, -chris -- Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-20 12:40 ` Adrian Bunk 2004-11-20 18:28 ` Linus Torvalds @ 2004-11-22 18:28 ` Len Brown 2004-11-23 0:46 ` Adrian Bunk 1 sibling, 1 reply; 36+ messages in thread From: Len Brown @ 2004-11-22 18:28 UTC (permalink / raw) To: Adrian Bunk Cc: Chris Wright, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Sat, 2004-11-20 at 07:40, Adrian Bunk wrote: > On Sat, Nov 20, 2004 at 04:02:04AM -0500, Len Brown wrote: > > > Please try this updated debug patch. > > > > It clears the ELCR on Linux boot. > With your patch, the boot failure goes away. > This was with a kernel without Linus' patch applied. Thanks for running this test Adrian. > > BTW: Is all what ACPI does really required, if all I need ACPI for is > to turn the power off after halting my computer? On this system ACPI is required to configure the IOAPIC. It may be possible to save power in idle with c-states and at run-time with p-states (cpufreq) on this box, but I couldn't tell that from the dmesg if CONFIG_ACPI_PROCESSOR was included or not. If you don't care about interrupt performance and you don't mind pressing the power button when you halt the system, go ahead and run with CONFIG_ACPI=n. cheers, -Len ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-22 18:28 ` Len Brown @ 2004-11-23 0:46 ` Adrian Bunk 0 siblings, 0 replies; 36+ messages in thread From: Adrian Bunk @ 2004-11-23 0:46 UTC (permalink / raw) To: Len Brown Cc: Chris Wright, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Mon, Nov 22, 2004 at 01:28:59PM -0500, Len Brown wrote: >... > > BTW: Is all what ACPI does really required, if all I need ACPI for is > > to turn the power off after halting my computer? > > On this system ACPI is required to configure the IOAPIC. > > It may be possible to save power in idle with c-states > and at run-time with p-states (cpufreq) on this box, > but I couldn't tell that from the dmesg if CONFIG_ACPI_PROCESSOR > was included or not. It's not. > If you don't care about interrupt performance and you don't > mind pressing the power button when you halt the system, > go ahead and run with CONFIG_ACPI=n. Not needed "pressing the power button when you halt the system" is the "killer application" for using ACPI for me... > cheers, > -Len cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-20 9:02 ` Len Brown
  2004-11-20 12:40 ` Adrian Bunk
@ 2004-11-20 16:41 ` Linus Torvalds
  2004-11-22 19:07 ` Len Brown
  2004-11-23 1:58 ` Chris Wright
  2 siblings, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2004-11-20 16:41 UTC (permalink / raw)
  To: Len Brown
  Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Sat, 20 Nov 2004, Len Brown wrote:
>
> It clears the ELCR on Linux boot.

I think this is _really_ wrong.

Basically, you're screwing up more and more PIC state.

Len, the PIC was _correct_ before ACPI touched it. We don't want to touch
it MORE, we want to touch it LESS.

I'll try this for debugging, but what I want to figure out is where ACPI
is doing something it shouldn't be doing, and _removing_ that.

We already had one patch where people tried to hide this problem by adding
more code. Clearly, that patch was bogus. Yes, it hid the problem for
floppies, but as shown by my other case (and as I was trying to say from
the beginning), it's not about floppies. It's about _any_ non-PCI
interrupt that apparently ACPI has done something _wrong_ for.

So ACPI seems to assume that all interrupts are PCI interrupts, and that's
just totally wrong. Clearing ELCR is more of this total wrongness. ELCR
exists for a reason, namely that not all interrupts are PCI.

Also, you seem to still totally concentrate on PIRQ routing etc. Totally
ignoring the fact that the problematic cases are about interrupts that
have _nothing_ to do with PCI. Not the floppy, not the PS/2 mouse. NOT
PCI! They're both on the southbridge behind a very special interface that
may or may not look like a PCI bus internally, but might quite as well be
something totally special-case (ie a perfectly normal case is that
somebody literally just bolted an old 8042 controller core into the system
and set up special case magic irq routing).

> ps. what I think is happening...
>
> To its credit, the BIOS correctly recognizes that there is
> no floppy, and it routes a PIRQ to IRQ6. It correctly sets the
> ELCR bit for this IRQ.
>
> Linux boots and disables all the PCI Interrupt Links,
> which un-programs the PIRQ directed to IRQ6.

And this is what I think is the bug. There is no reason to disable the PCI
interrupt link unless you have a damn good reason to do so.

> However, Linux doesn't clear the ELCR first,
> and for some reason that causes an interrupt
> to latch in IRQ6 -- though it is masked.
>
> Along comes the broken floppy driver before
> the PCI devices probe. floppy
> doesn't realize there is no hardware and
> unwittingly does a request_irq(6).

You are totally ignoring my other bug report which was for a (existing)
PSAUX mouse driver on irq12.

If I had had a mouse on that port, it would not have worked.

So the fact is, ACPI does something WRONG.

		Linus

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-20 16:41 ` Linus Torvalds
@ 2004-11-22 19:07 ` Len Brown
  2004-11-22 19:23 ` Linus Torvalds
  0 siblings, 1 reply; 36+ messages in thread
From: Len Brown @ 2004-11-22 19:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Sat, 2004-11-20 at 11:41, Linus Torvalds wrote:
>
> On Sat, 20 Nov 2004, Len Brown wrote:
> >
> > It clears the ELCR on Linux boot.
>
> I think this is _really_ wrong.
>
> Basically, you're screwing up more and more PIC state.
>
> Len, the PIC was _correct_ before ACPI touched it. We don't want to
> touch it MORE, we want to touch it LESS.
>
> I'll try this for debugging, but what I want to figure out is where
> ACPI is doing something it shouldn't be doing, and _removing_ that.
>
> We already had one patch where people tried to hide this problem by
> adding more code. Clearly, that patch was bogus. Yes, it hid the
> problem for floppies, but as shown by my other case (and as I was
> trying to say from the beginning), it's not about floppies. It's about
> _any_ non-PCI interrupt that apparently ACPI has done something
> _wrong_ for.

I agree that the system should work properly even if the legacy device
drivers are broken. Please understand, however, that the legacy device
drivers _are_ broken. The BIOS via ACPI clearly tells them if the
devices are present or not, and Linux isn't yet listening.

> So ACPI seems to assume that all interrupts are PCI interrupts, and
> that's just totally wrong. Clearing ELCR is more of this total
> wrongness. ELCR exists for a reason, namely that not all interrupts
> are PCI.

ACPI-compliant systems have three types of interrupts:

1. legacy
2. PCI
3. the ACPI SCI

The first two are described in the DSDT legacy devices and _PRT,
respectively. The third is described in the FADT. The MADT overrides
are available to handle any special cases, though that applies only to
IOAPIC mode.

If there are other interrupts, then it isn't an ACPI-compliant system
and the BIOS should not enable ACPI. If the BIOS erroneously enables
ACPI on such a system, the workaround is to boot with acpi=off. I'd be
extremely interested to know of such a system, as I've not yet
encountered one.

> Also, you seem to still totally concentrate on PIRQ routing etc.
> Totally ignoring the fact that the problematic cases are about
> interrupts that have _nothing_ to do with PCI. Not the floppy, not the
> PS/2 mouse. NOT PCI! They're both on the southbridge behind a very
> special interface that may or may not look like a PCI bus internally,
> but might quite as well be something totally special-case (ie a
> perfectly normal case is that somebody literally just bolted an old
> 8042 controller core into the system and set up special case magic irq
> routing).

If somebody bolts motherboard hardware on and doesn't tell ACPI about
it, then they need to disable ACPI, which _owns_ configuration of
motherboard devices when it is enabled.

The problem at hand has everything to do with PCI interrupts, and how
they can conflict with legacy interrupts.

PIC hardware is level-HIGH sensitive; it cannot be programmed like APIC
INTINs can. The only way to effectively use it as level-LOW sensitive,
such as that supplied by PCI devices, is to attach those interrupt
sources through inverters. This is what the PIRQ routers do.

I printed out the underlying PIRQ routers for the ICH in the debug patch
because all of the failures at hand seemed to be in ICH systems and
these registers tell us the state not of the abstract PCI Interrupt
link, but of the actual hardware that can be driving that (legacy)
interrupt input.

> > ps. what I think is happening...
> >
> > To its credit, the BIOS correctly recognizes that there is
> > no floppy, and it routes a PIRQ to IRQ6. It correctly sets the
> > ELCR bit for this IRQ.
> >
> > Linux boots and disables all the PCI Interrupt Links,
> > which un-programs the PIRQ directed to IRQ6.
>
> And this is what I think is the bug. There is no reason to disable the
> PCI interrupt link unless you have a damn good reason to do so.

The damn good reason is that doing otherwise breaks systems. This is the
cset comment for the line of code disabling the links:

ChangeSet 1.1608.11.11 2004/06/17 23:21:03 len.brown@intel.com
  [ACPI] avoid spurious interrupts on VIA
  http://bugzilla.kernel.org/show_bug.cgi?id=2243

drivers/acpi/pci_link.c 1.28.1.1 2004/06/11 10:38:46 len.brown@intel.com
  disable all PCI Interrupt Links to be enabled by _SRS

It would sure make my life easier if we didn't support these VIA/Phoenix
systems, but I don't think that breaking them is what the community
wants.

> > However, Linux doesn't clear the ELCR first,
> > and for some reason that causes an interrupt
> > to latch in IRQ6 -- though it is masked.
> >
> > Along comes the broken floppy driver before
> > the PCI devices probe. floppy
> > doesn't realize there is no hardware and
> > unwittingly does a request_irq(6).
>
> You are totally ignoring my other bug report which was for a
> (existing) PSAUX mouse driver on irq12.
>
> If I had had a mouse on that port, it would not have worked.
>
> So the fact is, ACPI does something WRONG.

The PS2 IRQ12 situation is exactly the same as the IRQ6 floppy
situation. If the mouse or floppy were present, the BIOS would not have
given that interrupt to PCI.

-Len

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-22 19:07 ` Len Brown
@ 2004-11-22 19:23 ` Linus Torvalds
  2004-11-22 20:24 ` Len Brown
  0 siblings, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2004-11-22 19:23 UTC (permalink / raw)
  To: Len Brown
  Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Mon, 22 Nov 2004, Len Brown wrote:
>
> I agree that the system should work properly even if the legacy device
> drivers are broken. Please understand, however, that the legacy device
> drivers _are_ broken. The BIOS via ACPI clearly tells them if the
> devices are present or not, and Linux isn't yet listening.

I really disagree.

I realize that you like ACPI, or you would have shot yourself long long
ago. But I have a totally different view on things. To me, firmware is not
something cool to be used. It's a necessary evil, and it should be avoided
and mistrusted as far as humanly possible, because it is always buggy, and
we can't fix the bugs in it.

Yes, the current ACPI layer in the kernel is a lot better at working
around the bugs, and it's getting to the point where I suspect Linux
vendors actually decide that enabling ACPI by default causes fewer
problems than it solves. That clearly didn't use to be true.

> ACPI-compliant systems have three types of interrupts:

Stop right there.

"ACPI-compliant systems". The fact is, there is no such thing. There are
systems that users buy, and they are not "ACPI compliant", they are "one
implementation of ACPI that was tested with a single vendor usage test".

Call me cynical, but I believe in standards papers just about as much as I
believe in the voices in my attic that tell me to kill the Queen of
England. Papers is so much dead trees. The only thing that matters is real
life. And like it or not, real life does NOT implement standards properly,
even if the standards are well written and unambiguous (which also doesn't
actually happen in real life).

> If somebody bolts motherboard hardware on and doesn't tell ACPI about
> it, then they need to disable ACPI, which _owns_ configuration of
> motherboard devices when it is enabled.

No. The only thing that owns the motherboard is the user. ACPI shouldn't
get uppity.

> The damn good reason is that doing otherwise breaks systems.

And not doing it breaks systems.

See a pattern? This is why I don't trust firmware. It's always buggy.

		Linus

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-22 19:23 ` Linus Torvalds @ 2004-11-22 20:24 ` Len Brown 2004-11-22 20:31 ` Linus Torvalds 0 siblings, 1 reply; 36+ messages in thread From: Len Brown @ 2004-11-22 20:24 UTC (permalink / raw) To: Linus Torvalds Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Mon, 2004-11-22 at 14:23, Linus Torvalds wrote: > To me, firmware is not > something cool to be used. It's a necessary evil, and it should be > avoided and mistrusted as far as humanly possible, because it is > always buggy, and we can't fix the bugs in it. Mistrusting firmware is why I disabled all the links, some system firmware didn't leave them in a self-consistent state. Re: liking ACPI Consider it a love/hate thing;-) > > The damn good reason is that doing otherwise breaks systems. > > And not doing it breaks systems. I'm not aware (yet) of any systems where disabling all the links (which we've been doing since June, BTW) and clearing the entire ELCR, and then re-enabling them both only as we use them causes a failure. > This is why I don't trust firmware. It's always buggy. I'm with you on that one. -Len ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-22 20:24 ` Len Brown @ 2004-11-22 20:31 ` Linus Torvalds 2004-11-22 20:36 ` Linus Torvalds 2004-11-22 20:51 ` Len Brown 0 siblings, 2 replies; 36+ messages in thread From: Linus Torvalds @ 2004-11-22 20:31 UTC (permalink / raw) To: Len Brown Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Mon, 22 Nov 2004, Len Brown wrote: > > > > And not doing it breaks systems. > > I'm not aware (yet) of any systems where disabling all the links (which > we've been doing since June, BTW) We have been doing it since June, but we also immediately _re-enabled_ them. And the moment we _didn't_ re-enable them, people started sending in bug-reports. So the "since June" is clearly true only in a very limited sense. A more correct way to say it would be that within hours of releasing a test-kernel (not even a real release) that _really_ disabled the links, we got people reporting boot failures. > and clearing the entire ELCR, and then re-enabling them both only as we > use them causes a failure. Now, the clearing the entire ELCR thing has been tested by all of three people, all of whom saw problems with the non-clearing thing. So not only is the base for that claim very thin indeed, the small base was totally self-selected, ie statistically completely meaningless even if it had been much much larger. Nobody who didn't actually see the problem in the first place would have tested it. IOW, I'll claim that the only thing that has really gotten testing since June is the thing that disables and immediately re-enables the links. And that's exactly why I think the "minimally disruptive" fix is to not disable them at all, but just fix up ELCR for anything that was already enabled. Since that _is_ what "disable + re-enable" ends up actually doing. See my argument? Linus ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-22 20:31 ` Linus Torvalds @ 2004-11-22 20:36 ` Linus Torvalds 2004-11-22 20:54 ` Len Brown 2004-11-22 20:51 ` Len Brown 1 sibling, 1 reply; 36+ messages in thread From: Linus Torvalds @ 2004-11-22 20:36 UTC (permalink / raw) To: Len Brown Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Mon, 22 Nov 2004, Linus Torvalds wrote: > > And that's exactly why I think the "minimally disruptive" fix is to not > disable them at all, but just fix up ELCR for anything that was already > enabled. Since that _is_ what "disable + re-enable" ends up actually > doing. Oh, and I think one alternative at this point is obviously to just go back to the "re-enable all interrupts early in the boot" code. Clearly we need to do _something_ for 2.6.10, and I want it to be something that is pretty much equivalent to what we _do_ have testing coverage of. Just to keep safe. I actually _like_ the "enable links only when needed" thing, which is why I'd prefer to look for alternatives. But I like even more not having to worry about strange hw setups, so "minimal fixing" really is pretty important. Linus ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device) 2004-11-22 20:36 ` Linus Torvalds @ 2004-11-22 20:54 ` Len Brown 0 siblings, 0 replies; 36+ messages in thread From: Len Brown @ 2004-11-22 20:54 UTC (permalink / raw) To: Linus Torvalds Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton On Mon, 2004-11-22 at 15:36, Linus Torvalds wrote: > > Oh, and I think one alternative at this point is obviously to just go > back to the "re-enable all interrupts early in the boot" code. Clearly > we need to do _something_ for 2.6.10, and I want it to be something > that is pretty much equivalent to what we _do_ have testing coverage > of. Just to keep safe. I'm okay with this, and testing the interrupt fixes in -mm in the mean-time -- particularly if you're planning a relatively short 2.6.10 rc cycle. -Len ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-22 20:31 ` Linus Torvalds
  2004-11-22 20:36 ` Linus Torvalds
@ 2004-11-22 20:51 ` Len Brown
  1 sibling, 0 replies; 36+ messages in thread
From: Len Brown @ 2004-11-22 20:51 UTC
To: Linus Torvalds
Cc: Chris Wright, Adrian Bunk, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Mon, 2004-11-22 at 15:31, Linus Torvalds wrote:
>
> On Mon, 22 Nov 2004, Len Brown wrote:
> > >
> > > And not doing it breaks systems.
> >
> > I'm not aware (yet) of any systems where disabling all the links (which
> > we've been doing since June, BTW)
>
> We have been doing it since June, but we also immediately _re-enabled_
> them.

Mostly true. We re-enabled all the links for which we found PCI devices.
This is a super-set of all the links with device-drivers. But it is also a
sub-set of the total population of links -- some BIOSs enabled links for
which there were no devices attached. This caused two problems. First,
there were spurious interrupts on some boxes, and second, in the case where
we enable IRQ balancing (default in IOAPIC mode, requires
"acpi_irq_balance" in PIC mode), it ate up IRQs and forced more sharing.

> IOW, I'll claim that the only thing that has really gotten testing
> since June is the thing that disables and immediately re-enables the
> links.
>
> And that's exactly why I think the "minimally disruptive" fix is to
> not disable them at all, but just fix up ELCR for anything that was
> already enabled. Since that _is_ what "disable + re-enable" ends up
> actually doing.
>
> See my argument?

"Minimally disruptive" understood, yes.
"Minimal risk", OTOH, is to return to what we did in 2.6.9.

Note, for the record, that Bjorn's patch to remove the paranoia loop and
add the pci=routeirq override came to you through the -mm tree, not through
the ACPI tree. I think that Bjorn was as surprised as I was that it
appeared in 2.6.10-rc2.

-Len
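The "fix up ELCR for anything that was already enabled" idea quoted above can be illustrated with a small userspace sketch. It is not the patch discussed in this thread and it touches no hardware: `struct irq_link` and `elcr_fixup()` are hypothetical names invented here. The only fact it relies on is the usual layout of the ELCR, two 8-bit registers at I/O ports 0x4d0 (IRQ 0-7) and 0x4d1 (IRQ 8-15), where a set bit marks that IRQ as level-triggered. The sketch treats the pair as one 16-bit mask and marks the IRQ of every link the BIOS left enabled as level-triggered, leaving disabled links alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one PCI interrupt link (illustration only). */
struct irq_link {
    int enabled;      /* non-zero if the BIOS left this link enabled */
    unsigned irq;     /* legacy IRQ (0-15) the link is routed to */
};

/*
 * Return the ELCR mask with the level-trigger bit set for every IRQ that an
 * already-enabled link is routed to. Bit n of the mask stands for IRQ n
 * (low byte: port 0x4d0, high byte: port 0x4d1). Disabled links and IRQ
 * numbers outside 0-15 are ignored; bits that were already set are kept.
 */
uint16_t elcr_fixup(uint16_t elcr, const struct irq_link *links, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!links[i].enabled || links[i].irq > 15)
            continue;
        elcr |= (uint16_t)(1u << links[i].irq);
    }
    return elcr;
}
```

A caller would read the two ELCR bytes, pass the combined value and its table of links to `elcr_fixup()`, and write the result back. Because nothing is disabled first, links without an attached device keep whatever state the BIOS gave them, which is the "minimally disruptive" property argued for above.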
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-20 9:02 ` Len Brown
  2004-11-20 12:40 ` Adrian Bunk
  2004-11-20 16:41 ` Linus Torvalds
@ 2004-11-23 1:58 ` Chris Wright
  2 siblings, 0 replies; 36+ messages in thread
From: Chris Wright @ 2004-11-23 1:58 UTC
To: Len Brown
Cc: Chris Wright, Adrian Bunk, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

For the record...

* Len Brown (len.brown@intel.com) wrote:
> Please try this updated debug patch.
>
> It clears the ELCR on Linux boot.

This boots as expected (no irq6 storm). I have dmesg if you're still
interested.

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-18 23:14 ` 2.6.10-rc2 doesn't boot (if no floppy device) Len Brown
  2004-11-19 7:09 ` Chris Wright
@ 2004-11-19 13:47 ` Adrian Bunk
  2004-11-23 1:57 ` Chris Wright
  2 siblings, 0 replies; 36+ messages in thread
From: Adrian Bunk @ 2004-11-19 13:47 UTC
To: Len Brown
Cc: Chris Wright, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

On Thu, Nov 18, 2004 at 06:14:46PM -0500, Len Brown wrote:
> Chris,

Hi Len,

I have the same problem Chris has.

> Please apply this debug patch and boot with
> apic=debug acpi_dbg_level=1
>...

It doesn't compile since I don't have APIC support enabled.

> thanks,
> -Len
>...

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
* Re: 2.6.10-rc2 doesn't boot (if no floppy device)
  2004-11-18 23:14 ` 2.6.10-rc2 doesn't boot (if no floppy device) Len Brown
  2004-11-19 7:09 ` Chris Wright
  2004-11-19 13:47 ` Adrian Bunk
@ 2004-11-23 1:57 ` Chris Wright
  2 siblings, 0 replies; 36+ messages in thread
From: Chris Wright @ 2004-11-23 1:57 UTC
To: Len Brown
Cc: Chris Wright, Adrian Bunk, Linus Torvalds, Bjorn Helgaas, Kernel Mailing List, Andrew Morton

For the record...

* Len Brown (len.brown@intel.com) wrote:
> Please apply this debug patch and boot with
> apic=debug acpi_dbg_level=1

acpi_dbg_level=1 boots as expected. w/out it, irq6 interrupt storm.
I have dmesg if you're still interested.

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
end of thread, other threads:[~2004-11-27 1:59 UTC | newest]

Thread overview: 36+ messages
[not found] <F7DC2337C7631D4386A2DF6E8FB22B30020B7225@hdsmsx401.amr.corp.intel.com>
2004-11-19 15:57 ` 2.6.10-rc2 doesn't boot (if no floppy device) Adrian Bunk
2004-11-19 17:36 ` Linus Torvalds
2004-11-19 18:51 ` Len Brown
2004-11-19 19:11 ` Adrian Bunk
2004-11-19 21:05 ` Len Brown
2004-11-15 23:27 2.6.10-rc2 doesn't boot Chris Wright
2004-11-18 23:14 ` 2.6.10-rc2 doesn't boot (if no floppy device) Len Brown
2004-11-19 7:09 ` Chris Wright
2004-11-20 9:02 ` Len Brown
2004-11-20 12:40 ` Adrian Bunk
2004-11-20 18:28 ` Linus Torvalds
2004-11-20 19:10 ` Linus Torvalds
2004-11-22 19:55 ` Len Brown
2004-11-24 16:26 ` Alan Cox
2004-11-21 16:29 ` Adrian Bunk
2004-11-22 19:29 ` Len Brown
2004-11-22 20:02 ` Linus Torvalds
2004-11-22 20:10 ` Linus Torvalds
2004-11-22 20:38 ` Len Brown
2004-11-23 2:45 ` Linus Torvalds
2004-11-23 4:57 ` Linus Torvalds
2004-11-23 7:06 ` Len Brown
2004-11-23 20:13 ` Stian Jordet
2004-11-23 2:00 ` Chris Wright
2004-11-22 18:28 ` Len Brown
2004-11-23 0:46 ` Adrian Bunk
2004-11-20 16:41 ` Linus Torvalds
2004-11-22 19:07 ` Len Brown
2004-11-22 19:23 ` Linus Torvalds
2004-11-22 20:24 ` Len Brown
2004-11-22 20:31 ` Linus Torvalds
2004-11-22 20:36 ` Linus Torvalds
2004-11-22 20:54 ` Len Brown
2004-11-22 20:51 ` Len Brown
2004-11-23 1:58 ` Chris Wright
2004-11-19 13:47 ` Adrian Bunk
2004-11-23 1:57 ` Chris Wright