* linux-next: boot failure for next-20120227 and later (pci tree related)
@ 2012-03-02  6:06 ` Stephen Rothwell
  0 siblings, 0 replies; 11+ messages in thread
From: Stephen Rothwell @ 2012-03-02  6:06 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: linux-next, linux-kernel, ppc-dev, Benjamin Herrenschmidt, Bjorn Helgaas

Hi Jesse,

Starting with next-20120227, one of my boot tests is failing like this:

Freeing unused kernel memory: 488k freed
modprobe used greatest stack depth: 10624 bytes left
dracut: dracut-004-32.el6
udev: starting version 147
udevd (1161): /proc/1161/oom_adj is deprecated, please use /proc/1161/oom_score_adj instead.
setfont used greatest stack depth: 10528 bytes left
dracut: Starting plymouth daemon
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2689
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 9 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2701
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 20 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2713
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2725
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2737
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2749
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2761
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 7 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2773
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 7 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2785
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2797
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2809

and eventually our test system decides the machine is dead.  This is a
PowerPC 970 based blade system (several other PowerPC based systems do
not fail).  A "normal" boot only shows .wait_scan_init being called
once.  (I have "initcall_debug=y debug" on the command line.)
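
The repetition can be confirmed mechanically from a saved console log.
What follows is only a minimal sketch, assuming the log uses the
"calling  <func>+0x... [module] @ <pid>" line format shown above; the
helper name count_initcalls and the sample text are illustrative and not
part of the original report.

```python
import re
from collections import Counter

# Matches initcall_debug "calling" lines, e.g.
#   calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2689
# Lines starting with "initcall" (the return records) do not match.
CALLING_RE = re.compile(r"^\s*calling\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")


def count_initcalls(log_text):
    """Return a Counter mapping initcall name -> number of 'calling' lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CALLING_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical excerpt in the same format as the log above.
sample = """\
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2689
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 9 usecs
calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2701
initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 20 usecs
calling  .tg3_init+0x0/0x3c @ 1
"""

# Keep only the initcalls that were entered more than once.
repeated = {name: n for name, n in count_initcalls(sample).items() if n > 1}
print(repeated)
```

Any initcall that shows up in the result was entered more than once,
which is the symptom described here for .wait_scan_init.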

I bisected this down to:

commit 6c5705fec63d83eeb165fe61e34adc92ecc2ce75
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Feb 23 20:19:03 2012 -0700

    powerpc/PCI: get rid of device resource fixups
    
    Tell the PCI core about host bridge address translation so it can take
    care of bus-to-resource conversion for us.
    
    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
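
For readers unfamiliar with bisection, the idea can be sketched as a
binary search for the first bad commit.  This is a simplified
illustration that assumes a linear history, a known-good oldest commit
and a known-bad newest commit; it is not how git itself walks a commit
graph, and the names first_bad and is_bad are hypothetical.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit.

    commits is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad.  is_bad(commit) stands in for "build,
    boot and check whether the failure reproduces".
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # failure already present at mid
        else:
            lo = mid  # mid still boots, so the culprit is later
    return commits[hi]


# Hypothetical history of ten commits where everything from "c6" on fails.
commits = [f"c{i}" for i in range(10)]
bad = set(commits[6:])
print(first_bad(commits, lambda c: c in bad))
```

Each step halves the remaining range, so only a handful of test boots
are needed even for a long range of commits.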

The only seemingly relevant differences in the boot logs (good to bad) are:

 pci 0000:03:02.0: supports D1 D2
+PCI: Cannot allocate resource region 0 of PCI bridge 1, will remap
+PCI: Cannot allocate resource region 1 of PCI bridge 1, will remap
+PCI: Cannot allocate resource region 0 of PCI bridge 6, will remap
+PCI: Cannot allocate resource region 1 of PCI bridge 6, will remap
+PCI: Cannot allocate resource region 0 of PCI bridge 3, will remap
+PCI: Cannot allocate resource region 1 of PCI bridge 3, will remap
+PCI: Cannot allocate resource region 0 of device 0000:01:01.0, will remap
+PCI: Cannot allocate resource region 2 of device 0000:01:01.0, will remap
+PCI: Cannot allocate resource region 6 of device 0000:01:01.0, will remap
+PCI: Cannot allocate resource region 0 of device 0000:03:00.0, will remap
+PCI: Cannot allocate resource region 0 of device 0000:03:00.1, will remap
+PCI: Cannot allocate resource region 0 of device 0000:03:02.0, will remap
+PCI: Cannot allocate resource region 1 of device 0000:03:02.0, will remap
+PCI: Cannot allocate resource region 2 of device 0000:03:02.0, will remap
+PCI: Cannot allocate resource region 6 of device 0000:03:02.0, will remap
+PCI: Cannot allocate resource region 0 of device 0000:06:04.0, will remap
+PCI: Cannot allocate resource region 2 of device 0000:06:04.0, will remap
+PCI: Cannot allocate resource region 0 of device 0000:06:04.1, will remap
+PCI: Cannot allocate resource region 2 of device 0000:06:04.1, will remap
 PCI: Probing PCI hardware done
	.
	.
	.
 calling  .radeonfb_init+0x0/0x248 @ 1
-radeonfb 0000:03:02.0: Invalid ROM contents
-radeonfb (0000:03:02.0): Invalid ROM signature 7272 should be 0xaa55
-radeonfb: No ATY,RefCLK property !
-xtal calculation failed: 26550
-radeonfb: Used default PLL infos
-radeonfb: Reference=27.00 MHz (RefDiv=60) Memory=166.00 Mhz, System=166.00 MHz
-radeonfb: PLL min 12000 max 35000
-i2c i2c-1: unable to read EDID block.
-i2c i2c-1: unable to read EDID block.
-i2c i2c-1: unable to read EDID block.
-i2c i2c-3: unable to read EDID block.
-i2c i2c-3: unable to read EDID block.
-i2c i2c-3: unable to read EDID block.
-i2c i2c-2: unable to read EDID block.
-i2c i2c-2: unable to read EDID block.
-i2c i2c-2: unable to read EDID block.
-i2c i2c-3: unable to read EDID block.
-i2c i2c-3: unable to read EDID block.
-i2c i2c-3: unable to read EDID block.
-radeonfb: Monitor 1 type CRT found
-radeonfb: Monitor 2 type no found
-Console: switching to colour frame buffer device 80x30
-radeonfb (0000:03:02.0): ATI Radeon 515e "Q^"
+radeonfb 0000:03:02.0: device not available (can't reserve [mem 0x00000000-0x07ffffff])
+radeonfb (0000:03:02.0): Cannot enable PCI device
+radeonfb: probe of 0000:03:02.0 failed with error -22
 initcall .radeonfb_init+0x0/0x248 returned 0 after x usecs
	.
	.
	.
 ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011)
-ipr 0000:01:01.0: Found IOA with IRQ: 26
-ipr 0000:01:01.0: Starting IOA initialization sequence.
-ipr 0000:01:01.0: Adapter firmware version: 06160039
-ipr 0000:01:01.0: IOA initialized.
-scsi0 : IBM 572E Storage Adapter
-scsi 0:0:1:0: Direct-Access     IBM-ESXS MAY2036RC        T106 PQ: 0 ANSI: 5
-scsi: unknown device type 31
-scsi 0:255:255:255: No Device         IBM      572E001          0150 PQ: 0 ANSI: 0
+ipr 0000:01:01.0: device not available (can't reserve [mem 0x00000000-0x0003ffff])
+ipr 0000:01:01.0: Cannot enable adapter
+ipr: probe of 0000:01:01.0 failed with error -22
 initcall .ipr_init+0x0/0x68 returned 0 after x usecs
	.
	.
	.
 calling  .tg3_init+0x0/0x3c @ 1
 tg3.c:v3.122 (December 7, 2011)
-tg3 0000:06:04.0: enabling device (0140 -> 0142)
-sd 0:0:1:0: [sda] 71096640 512-byte logical blocks: (36.4 GB/33.9 GiB)
-sd 0:0:1:0: [sda] Write Protect is off
-sd 0:0:1:0: [sda] Mode Sense: d7 00 00 08
-sd 0:0:1:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
- sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
-sd 0:0:1:0: [sda] Attached SCSI disk
-initcall 1_.sd_probe_async+0x0/0x1d0 returned 0 after x usecs
-tg3 0000:06:04.0: eth0: Tigon3 [partno(none) rev 8100] (PCIX:133MHz:64-bit) MAC address 00:14:5e:9c:21:e2
-tg3 0000:06:04.0: eth0: attached PHY is 5780 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
-tg3 0000:06:04.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
-tg3 0000:06:04.0: eth0: dma_rwctrl[76144000] dma_mask[40-bit]
-tg3 0000:06:04.1: enabling device (0140 -> 0142)
-tg3 0000:06:04.1: eth1: Tigon3 [partno(none) rev 8100] (PCIX:133MHz:64-bit) MAC address 00:14:5e:9c:21:e3
-tg3 0000:06:04.1: eth1: attached PHY is 5780 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
-tg3 0000:06:04.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
-tg3 0000:06:04.1: eth1: dma_rwctrl[76144000] dma_mask[40-bit]
+tg3 0000:06:04.0: device not available (can't reserve [mem 0x00000000-0x0000ffff])
+tg3 0000:06:04.0: Cannot enable PCI device, aborting
+tg3: probe of 0000:06:04.0 failed with error -22
+tg3 0000:06:04.1: device not available (can't reserve [mem 0x00000000-0x0000ffff])
+tg3 0000:06:04.1: Cannot enable PCI device, aborting
+tg3: probe of 0000:06:04.1 failed with error -22
 initcall .tg3_init+0x0/0x3c returned 0 after x usecs
	.
	.
	.
 calling  .ohci_hcd_mod_init+0x0/0xfc @ 1
 ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
-ohci_hcd 0000:03:00.0: OHCI Host Controller
-ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
-ohci_hcd 0000:03:00.0: irq 19, io mem 0x100a1001000
-hub 1-0:1.0: USB hub found
-hub 1-0:1.0: 3 ports detected
-ohci_hcd 0000:03:00.1: OHCI Host Controller
-ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
-ohci_hcd 0000:03:00.1: irq 19, io mem 0x100a1000000
-hub 2-0:1.0: USB hub found
-hub 2-0:1.0: 3 ports detected
+ohci_hcd 0000:03:00.0: device not available (can't reserve [mem 0x00000000-0x00000fff])
+ohci_hcd 0000:03:00.1: device not available (can't reserve [mem 0x00000000-0x00000fff])
 initcall .ohci_hcd_mod_init+0x0/0xfc returned 0 after x usecs
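
Every failing probe above reports a memory range that starts at
0x00000000.  That suggests the resources were left unassigned after the
"will remap" messages, though this is only an inference from the log and
is not confirmed in this report.  A small sketch for pulling those
ranges and their sizes out of a log, assuming the "can't reserve [mem
0x...-0x...]" message format shown above; unreserved_ranges is a
hypothetical helper name.

```python
import re

# Matches lines such as
#   +tg3 0000:06:04.0: device not available (can't reserve [mem 0x00000000-0x0000ffff])
# The leading "+" from the log diff is optional.
RESERVE_RE = re.compile(
    r"^\+?(\S+) (\S+): device not available "
    r"\(can't reserve \[mem 0x([0-9a-f]+)-0x([0-9a-f]+)\]\)"
)


def unreserved_ranges(log_text):
    """Return a list of (driver, device, start, size_in_bytes) tuples."""
    found = []
    for line in log_text.splitlines():
        match = RESERVE_RE.match(line.strip())
        if match:
            driver, device, start, end = match.groups()
            start, end = int(start, 16), int(end, 16)
            found.append((driver, device, start, end - start + 1))
    return found


# Two of the failing lines quoted above.
sample = """\
+radeonfb 0000:03:02.0: device not available (can't reserve [mem 0x00000000-0x07ffffff])
+tg3 0000:06:04.0: device not available (can't reserve [mem 0x00000000-0x0000ffff])
"""

for driver, device, start, size in unreserved_ranges(sample):
    print(f"{driver} {device}: start={start:#x} size={size // 1024} KiB")
```

A start address of zero for every device in the output is a quick way
to tell this failure apart from an ordinary conflict over one range.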

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

* Re: linux-next: boot failure for next-20120227 and later (pci tree related)
  2012-03-02  6:06 ` Stephen Rothwell
@ 2012-03-02 17:10   ` Bjorn Helgaas
  -1 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2012-03-02 17:10 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Jesse Barnes, linux-next, linux-kernel, ppc-dev, Benjamin Herrenschmidt

On Thu, Mar 1, 2012 at 11:06 PM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> Hi Jesse,
>
> Starting with next-20120227, one of my boot tests is failing like this:

Hi Stephen,

Thanks a lot for the test report and the bisection.  I wish I had a
machine to test on so I wouldn't have to bother you about it.

Any chance you could point me at the complete before/after dmesg logs?
There should be information about the PCI host bridge apertures and
offsets in the "after" log.  If that's not enough, we might need to
collect new before/after logs with something like this
(whitespace-mangled):

--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -71,7 +71,7 @@ CFLAGS-$(CONFIG_PPC64)        := -mminimal-toc -mtraceback=no -mcall-aixdesc
 CFLAGS-$(CONFIG_PPC32) := -ffixed-r2 -mmultiple
 KBUILD_CPPFLAGS        += -Iarch/$(ARCH)
 KBUILD_AFLAGS  += -Iarch/$(ARCH)
-KBUILD_CFLAGS  += -msoft-float -pipe -Iarch/$(ARCH) $(CFLAGS-y)
+KBUILD_CFLAGS  += -msoft-float -pipe -Iarch/$(ARCH) $(CFLAGS-y) -DDEBUG
 CPP            = $(CC) -E $(KBUILD_CFLAGS)

 CHECKFLAGS     += -m$(CONFIG_WORD_SIZE) -D__powerpc__ -D__powerpc$(CONFIG_WORD_SIZE)__


> Freeing unused kernel memory: 488k freed
> modprobe used greatest stack depth: 10624 bytes left
> dracut: dracut-004-32.el6
> udev: starting version 147
> udevd (1161): /proc/1161/oom_adj is deprecated, please use /proc/1161/oom_score_adj instead.
> setfont used greatest stack depth: 10528 bytes left
> dracut: Starting plymouth daemon
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2689
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 9 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2701
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 20 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2713
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2725
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2737
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2749
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2761
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 7 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2773
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 7 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2785
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2797
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2809
>
> and eventually our test system decides the machine is dead.  This is a
> PowerPC 970 based blade system (several other PowerPC based systems do
> not fail).  A "normal" boot only shows .wait_scan_init being called
> once.  (I have "initcall_debug=y debug" on the command line.)
>
> I bisected this down to:
>
> commit 6c5705fec63d83eeb165fe61e34adc92ecc2ce75
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Thu Feb 23 20:19:03 2012 -0700
>
>    powerpc/PCI: get rid of device resource fixups
>
>    Tell the PCI core about host bridge address translation so it can take
>    care of bus-to-resource conversion for us.
>
>    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> The only seemingly relevant differences in the boot logs (good to bad) are:
>
>  pci 0000:03:02.0: supports D1 D2
> +PCI: Cannot allocate resource region 0 of PCI bridge 1, will remap
> +PCI: Cannot allocate resource region 1 of PCI bridge 1, will remap
> +PCI: Cannot allocate resource region 0 of PCI bridge 6, will remap
> +PCI: Cannot allocate resource region 1 of PCI bridge 6, will remap
> +PCI: Cannot allocate resource region 0 of PCI bridge 3, will remap
> +PCI: Cannot allocate resource region 1 of PCI bridge 3, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:01:01.0, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:01:01.0, will remap
> +PCI: Cannot allocate resource region 6 of device 0000:01:01.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:03:00.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:03:00.1, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 1 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 6 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:06:04.0, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:06:04.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:06:04.1, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:06:04.1, will remap
>  PCI: Probing PCI hardware done
>        .
>        .
>        .
>  calling  .radeonfb_init+0x0/0x248 @ 1
> -radeonfb 0000:03:02.0: Invalid ROM contents
> -radeonfb (0000:03:02.0): Invalid ROM signature 7272 should be 0xaa55
> -radeonfb: No ATY,RefCLK property !
> -xtal calculation failed: 26550
> -radeonfb: Used default PLL infos
> -radeonfb: Reference=27.00 MHz (RefDiv=60) Memory=166.00 Mhz, System=166.00 MHz
> -radeonfb: PLL min 12000 max 35000
> -i2c i2c-1: unable to read EDID block.
> -i2c i2c-1: unable to read EDID block.
> -i2c i2c-1: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-2: unable to read EDID block.
> -i2c i2c-2: unable to read EDID block.
> -i2c i2c-2: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -radeonfb: Monitor 1 type CRT found
> -radeonfb: Monitor 2 type no found
> -Console: switching to colour frame buffer device 80x30
> -radeonfb (0000:03:02.0): ATI Radeon 515e "Q^"
> +radeonfb 0000:03:02.0: device not available (can't reserve [mem 0x00000000-0x07ffffff])
> +radeonfb (0000:03:02.0): Cannot enable PCI device
> +radeonfb: probe of 0000:03:02.0 failed with error -22
>  initcall .radeonfb_init+0x0/0x248 returned 0 after x usecs
>        .
>        .
>        .
>  ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011)
> -ipr 0000:01:01.0: Found IOA with IRQ: 26
> -ipr 0000:01:01.0: Starting IOA initialization sequence.
> -ipr 0000:01:01.0: Adapter firmware version: 06160039
> -ipr 0000:01:01.0: IOA initialized.
> -scsi0 : IBM 572E Storage Adapter
> -scsi 0:0:1:0: Direct-Access     IBM-ESXS MAY2036RC        T106 PQ: 0 ANSI: 5
> -scsi: unknown device type 31
> -scsi 0:255:255:255: No Device         IBM      572E001          0150 PQ: 0 ANSI: 0
> +ipr 0000:01:01.0: device not available (can't reserve [mem 0x00000000-0x0003ffff])
> +ipr 0000:01:01.0: Cannot enable adapter
> +ipr: probe of 0000:01:01.0 failed with error -22
>  initcall .ipr_init+0x0/0x68 returned 0 after x usecs
>        .
>        .
>        .
>  calling  .tg3_init+0x0/0x3c @ 1
>  tg3.c:v3.122 (December 7, 2011)
> -tg3 0000:06:04.0: enabling device (0140 -> 0142)
> -sd 0:0:1:0: [sda] 71096640 512-byte logical blocks: (36.4 GB/33.9 GiB)
> -sd 0:0:1:0: [sda] Write Protect is off
> -sd 0:0:1:0: [sda] Mode Sense: d7 00 00 08
> -sd 0:0:1:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
> - sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> -sd 0:0:1:0: [sda] Attached SCSI disk
> -initcall 1_.sd_probe_async+0x0/0x1d0 returned 0 after x usecs
> -tg3 0000:06:04.0: eth0: Tigon3 [partno(none) rev 8100] (PCIX:133MHz:64-bit) MAC address 00:14:5e:9c:21:e2
> -tg3 0000:06:04.0: eth0: attached PHY is 5780 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
> -tg3 0000:06:04.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
> -tg3 0000:06:04.0: eth0: dma_rwctrl[76144000] dma_mask[40-bit]
> -tg3 0000:06:04.1: enabling device (0140 -> 0142)
> -tg3 0000:06:04.1: eth1: Tigon3 [partno(none) rev 8100] (PCIX:133MHz:64-bit) MAC address 00:14:5e:9c:21:e3
> -tg3 0000:06:04.1: eth1: attached PHY is 5780 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
> -tg3 0000:06:04.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> -tg3 0000:06:04.1: eth1: dma_rwctrl[76144000] dma_mask[40-bit]
> +tg3 0000:06:04.0: device not available (can't reserve [mem 0x00000000-0x0000ffff])
> +tg3 0000:06:04.0: Cannot enable PCI device, aborting
> +tg3: probe of 0000:06:04.0 failed with error -22
> +tg3 0000:06:04.1: device not available (can't reserve [mem 0x00000000-0x0000ffff])
> +tg3 0000:06:04.1: Cannot enable PCI device, aborting
> +tg3: probe of 0000:06:04.1 failed with error -22
>  initcall .tg3_init+0x0/0x3c returned 0 after x usecs
>        .
>        .
>        .
>  calling  .ohci_hcd_mod_init+0x0/0xfc @ 1
>  ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> -ohci_hcd 0000:03:00.0: OHCI Host Controller
> -ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
> -ohci_hcd 0000:03:00.0: irq 19, io mem 0x100a1001000
> -hub 1-0:1.0: USB hub found
> -hub 1-0:1.0: 3 ports detected
> -ohci_hcd 0000:03:00.1: OHCI Host Controller
> -ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
> -ohci_hcd 0000:03:00.1: irq 19, io mem 0x100a1000000
> -hub 2-0:1.0: USB hub found
> -hub 2-0:1.0: 3 ports detected
> +ohci_hcd 0000:03:00.0: device not available (can't reserve [mem 0x00000000-0x00000fff])
> +ohci_hcd 0000:03:00.1: device not available (can't reserve [mem 0x00000000-0x00000fff])
>  initcall .ohci_hcd_mod_init+0x0/0xfc returned 0 after x usecs
>
> --
> Cheers,
> Stephen Rothwell                    sfr@canb.auug.org.au

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: linux-next: boot failure for next-20120227 and later (pci tree related)
@ 2012-03-02 17:10   ` Bjorn Helgaas
  0 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2012-03-02 17:10 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linux-next, ppc-dev, linux-kernel, Jesse Barnes

On Thu, Mar 1, 2012 at 11:06 PM, Stephen Rothwell <sfr@canb.auug.org.au> wr=
ote:
> Hi Jesse,
>
> Staring with next-20120227, one of my boot tests is failing like this:

Hi Stephen,

Thanks a lot for the test report and the bisection.  I wish I had a
machine to test on so I wouldn't have to bother you about it.

Any chance you could point me at the complete before/after dmesg logs?
 There should be information about the PCI host bridge apertures and
offsets in the "after" log.  If that's not enough, we might need to
collect new before/after logs with something like this
(whitespace-mangled):

--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -71,7 +71,7 @@ CFLAGS-$(CONFIG_PPC64)        :=3D -mminimal-toc
-mtraceback=3Dno -mcall-aixdesc
 CFLAGS-$(CONFIG_PPC32) :=3D -ffixed-r2 -mmultiple
 KBUILD_CPPFLAGS        +=3D -Iarch/$(ARCH)
 KBUILD_AFLAGS  +=3D -Iarch/$(ARCH)
-KBUILD_CFLAGS  +=3D -msoft-float -pipe -Iarch/$(ARCH) $(CFLAGS-y)
+KBUILD_CFLAGS  +=3D -msoft-float -pipe -Iarch/$(ARCH) $(CFLAGS-y) -DDEBUG
 CPP            =3D $(CC) -E $(KBUILD_CFLAGS)

 CHECKFLAGS     +=3D -m$(CONFIG_WORD_SIZE) -D__powerpc__
-D__powerpc$(CONFIG_WORD_SIZE)__


> Freeing unused kernel memory: 488k freed
> modprobe used greatest stack depth: 10624 bytes left
> dracut: dracut-004-32.el6
> udev: starting version 147
> udevd (1161): /proc/1161/oom_adj is deprecated, please use /proc/1161/oom_score_adj instead.
> setfont used greatest stack depth: 10528 bytes left
> dracut: Starting plymouth daemon
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2689
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 9 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2701
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 20 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2713
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2725
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2737
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2749
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2761
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 7 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2773
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 7 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2785
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2797
> initcall .wait_scan_init+0x0/0xc4 [scsi_wait_scan] returned 0 after 8 usecs
> calling  .wait_scan_init+0x0/0xc4 [scsi_wait_scan] @ 2809
>
> and eventually our test system decides the machine is dead.  This is a
> PowerPC 970 based blade system (several other PowerPC based systems do
> not fail).  A "normal" boot only shows .wait_scan_init being called
> once.  (I have "initcall_debug=y debug" on the command line).
>
> I bisected this down to:
>
> commit 6c5705fec63d83eeb165fe61e34adc92ecc2ce75
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Thu Feb 23 20:19:03 2012 -0700
>
>    powerpc/PCI: get rid of device resource fixups
>
>    Tell the PCI core about host bridge address translation so it can take
>    care of bus-to-resource conversion for us.
>
>    CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> The only seemingly relevant differences in the boot logs (good to bad) are:
>
>  pci 0000:03:02.0: supports D1 D2
> +PCI: Cannot allocate resource region 0 of PCI bridge 1, will remap
> +PCI: Cannot allocate resource region 1 of PCI bridge 1, will remap
> +PCI: Cannot allocate resource region 0 of PCI bridge 6, will remap
> +PCI: Cannot allocate resource region 1 of PCI bridge 6, will remap
> +PCI: Cannot allocate resource region 0 of PCI bridge 3, will remap
> +PCI: Cannot allocate resource region 1 of PCI bridge 3, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:01:01.0, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:01:01.0, will remap
> +PCI: Cannot allocate resource region 6 of device 0000:01:01.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:03:00.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:03:00.1, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 1 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 6 of device 0000:03:02.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:06:04.0, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:06:04.0, will remap
> +PCI: Cannot allocate resource region 0 of device 0000:06:04.1, will remap
> +PCI: Cannot allocate resource region 2 of device 0000:06:04.1, will remap
>  PCI: Probing PCI hardware done
>        .
>        .
>        .
>  calling  .radeonfb_init+0x0/0x248 @ 1
> -radeonfb 0000:03:02.0: Invalid ROM contents
> -radeonfb (0000:03:02.0): Invalid ROM signature 7272 should be 0xaa55
> -radeonfb: No ATY,RefCLK property !
> -xtal calculation failed: 26550
> -radeonfb: Used default PLL infos
> -radeonfb: Reference=27.00 MHz (RefDiv=60) Memory=166.00 Mhz, System=166.00 MHz
> -radeonfb: PLL min 12000 max 35000
> -i2c i2c-1: unable to read EDID block.
> -i2c i2c-1: unable to read EDID block.
> -i2c i2c-1: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-2: unable to read EDID block.
> -i2c i2c-2: unable to read EDID block.
> -i2c i2c-2: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -i2c i2c-3: unable to read EDID block.
> -radeonfb: Monitor 1 type CRT found
> -radeonfb: Monitor 2 type no found
> -Console: switching to colour frame buffer device 80x30
> -radeonfb (0000:03:02.0): ATI Radeon 515e "Q^"
> +radeonfb 0000:03:02.0: device not available (can't reserve [mem 0x00000000-0x07ffffff])
> +radeonfb (0000:03:02.0): Cannot enable PCI device
> +radeonfb: probe of 0000:03:02.0 failed with error -22
>  initcall .radeonfb_init+0x0/0x248 returned 0 after x usecs
>        .
>        .
>        .
>  ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011)
> -ipr 0000:01:01.0: Found IOA with IRQ: 26
> -ipr 0000:01:01.0: Starting IOA initialization sequence.
> -ipr 0000:01:01.0: Adapter firmware version: 06160039
> -ipr 0000:01:01.0: IOA initialized.
> -scsi0 : IBM 572E Storage Adapter
> -scsi 0:0:1:0: Direct-Access     IBM-ESXS MAY2036RC        T106 PQ: 0 ANSI: 5
> -scsi: unknown device type 31
> -scsi 0:255:255:255: No Device         IBM      572E001          0150 PQ: 0 ANSI: 0
> +ipr 0000:01:01.0: device not available (can't reserve [mem 0x00000000-0x0003ffff])
> +ipr 0000:01:01.0: Cannot enable adapter
> +ipr: probe of 0000:01:01.0 failed with error -22
>  initcall .ipr_init+0x0/0x68 returned 0 after x usecs
>        .
>        .
>        .
>  calling  .tg3_init+0x0/0x3c @ 1
>  tg3.c:v3.122 (December 7, 2011)
> -tg3 0000:06:04.0: enabling device (0140 -> 0142)
> -sd 0:0:1:0: [sda] 71096640 512-byte logical blocks: (36.4 GB/33.9 GiB)
> -sd 0:0:1:0: [sda] Write Protect is off
> -sd 0:0:1:0: [sda] Mode Sense: d7 00 00 08
> -sd 0:0:1:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
> - sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 >
> -sd 0:0:1:0: [sda] Attached SCSI disk
> -initcall 1_.sd_probe_async+0x0/0x1d0 returned 0 after x usecs
> -tg3 0000:06:04.0: eth0: Tigon3 [partno(none) rev 8100] (PCIX:133MHz:64-bit) MAC address 00:14:5e:9c:21:e2
> -tg3 0000:06:04.0: eth0: attached PHY is 5780 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
> -tg3 0000:06:04.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
> -tg3 0000:06:04.0: eth0: dma_rwctrl[76144000] dma_mask[40-bit]
> -tg3 0000:06:04.1: enabling device (0140 -> 0142)
> -tg3 0000:06:04.1: eth1: Tigon3 [partno(none) rev 8100] (PCIX:133MHz:64-bit) MAC address 00:14:5e:9c:21:e3
> -tg3 0000:06:04.1: eth1: attached PHY is 5780 (1000Base-SX Ethernet) (WireSpeed[0], EEE[0])
> -tg3 0000:06:04.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
> -tg3 0000:06:04.1: eth1: dma_rwctrl[76144000] dma_mask[40-bit]
> +tg3 0000:06:04.0: device not available (can't reserve [mem 0x00000000-0x0000ffff])
> +tg3 0000:06:04.0: Cannot enable PCI device, aborting
> +tg3: probe of 0000:06:04.0 failed with error -22
> +tg3 0000:06:04.1: device not available (can't reserve [mem 0x00000000-0x0000ffff])
> +tg3 0000:06:04.1: Cannot enable PCI device, aborting
> +tg3: probe of 0000:06:04.1 failed with error -22
>  initcall .tg3_init+0x0/0x3c returned 0 after x usecs
>        .
>        .
>        .
>  calling  .ohci_hcd_mod_init+0x0/0xfc @ 1
>  ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> -ohci_hcd 0000:03:00.0: OHCI Host Controller
> -ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
> -ohci_hcd 0000:03:00.0: irq 19, io mem 0x100a1001000
> -hub 1-0:1.0: USB hub found
> -hub 1-0:1.0: 3 ports detected
> -ohci_hcd 0000:03:00.1: OHCI Host Controller
> -ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
> -ohci_hcd 0000:03:00.1: irq 19, io mem 0x100a1000000
> -hub 2-0:1.0: USB hub found
> -hub 2-0:1.0: 3 ports detected
> +ohci_hcd 0000:03:00.0: device not available (can't reserve [mem 0x00000000-0x00000fff])
> +ohci_hcd 0000:03:00.1: device not available (can't reserve [mem 0x00000000-0x00000fff])
>  initcall .ohci_hcd_mod_init+0x0/0xfc returned 0 after x usecs
>
> --
> Cheers,
> Stephen Rothwell                    sfr@canb.auug.org.au

* Re: linux-next: boot failure for next-20120227 and later (pci tree related)
  2012-03-02 17:10   ` Bjorn Helgaas
@ 2012-03-02 21:52     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-02 21:52 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Stephen Rothwell, Jesse Barnes, linux-next, linux-kernel, ppc-dev

On Fri, 2012-03-02 at 10:10 -0700, Bjorn Helgaas wrote:
> On Thu, Mar 1, 2012 at 11:06 PM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> > Hi Jesse,
> >
> > Starting with next-20120227, one of my boot tests is failing like this:
> 
> Hi Stephen,
> 
> Thanks a lot for the test report and the bisection.  I wish I had a
> machine to test on so I wouldn't have to bother you about it.
> 
> Any chance you could point me at the complete before/after dmesg logs?
>  There should be information about the PCI host bridge apertures and
> offsets in the "after" log.  If that's not enough, we might need to
> collect new before/after logs with something like this
> (whitespace-mangled):

Or give me a chance to dig :-) I'll have a look next week.

Cheers,
Ben.



* Re: linux-next: boot failure for next-20120227 and later (pci tree related)
  2012-03-02 17:10   ` Bjorn Helgaas
@ 2012-03-02 22:26     ` Stephen Rothwell
  -1 siblings, 0 replies; 11+ messages in thread
From: Stephen Rothwell @ 2012-03-02 22:26 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Jesse Barnes, linux-next, linux-kernel, ppc-dev, Benjamin Herrenschmidt

Hi Bjorn,

On Fri, 2 Mar 2012 10:10:02 -0700 Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> Thanks a lot for the test report and the bisection.  I wish I had a
> machine to test on so I wouldn't have to bother you about it.

That's OK.

> Any chance you could point me at the complete before/after dmesg logs?
>  There should be information about the PCI host bridge apertures and
> offsets in the "after" log.

I have put the logs up at http://ozlabs.org/~sfr/console-bad.log and
console-good.log.  They have been edited to set the initcall return
timings to "x" usecs just for ease of diffing, otherwise verbatim.
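
The normalization described above can be sketched in C. This is only an illustration: the thread does not say which tool was actually used, and the function name `normalize_initcall_line` is hypothetical. It rewrites the number in an `after N usecs` suffix to `x` and copies any other line through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the thread): copy `line` into `out`,
 * replacing the number in "... after <N> usecs" with "x" so that two
 * boot logs can be diffed without timing noise.
 * Returns 1 if a timing was replaced, 0 if the line was copied as-is.
 */
int normalize_initcall_line(const char *line, char *out, size_t outsz)
{
	static const char before[] = " after ";
	static const char after[] = " usecs";
	const char *p = line;

	assert(outsz > 0);

	while ((p = strstr(p, before)) != NULL) {
		const char *digits = p + strlen(before);
		const char *q = digits;

		/* Skip over the decimal timing value, if any. */
		while (isdigit((unsigned char)*q))
			q++;

		if (q > digits && strncmp(q, after, strlen(after)) == 0) {
			size_t head = (size_t)(digits - line);
			size_t tail = strlen(q);

			if (head + 1 + tail + 1 > outsz)
				break;	/* no room; fall back to a plain copy */

			memcpy(out, line, head);
			out[head] = 'x';
			memcpy(out + head + 1, q, tail + 1);	/* copies the NUL too */
			return 1;
		}
		p = digits;
	}

	/* No timing found (or no room): copy through, truncating if needed. */
	strncpy(out, line, outsz - 1);
	out[outsz - 1] = '\0';
	return 0;
}
```

Running each line of both logs through such a filter before diffing leaves only the differences that matter, such as the "Cannot allocate resource region" messages.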

>  If that's not enough, we might need to
> collect new before/after logs with something like this
> (whitespace-mangled):

I'll defer to Ben as to whether that would help as I don't have easy
access to the machine until Monday anyway.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

* Re: linux-next: boot failure for next-20120227 and later (pci tree related)
  2012-03-02 21:52     ` Benjamin Herrenschmidt
  (?)
@ 2012-03-05  3:34     ` Benjamin Herrenschmidt
  2012-03-05 16:14         ` Bjorn Helgaas
  -1 siblings, 1 reply; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-05  3:34 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Stephen Rothwell, linux-next, ppc-dev, linux-kernel, Jesse Barnes

On Sat, 2012-03-03 at 08:52 +1100, Benjamin Herrenschmidt wrote:

> Or give me a chance to dig :-) I'll have a look next week.

This is indeed what Bjorn suspected on IRC; this patch fixes it:

(Bjorn, please fold it into the original offending patch.)

Cheers,
Ben.

diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
index b37d0b5..5dd63f1 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -75,6 +75,7 @@ static void of_pci_parse_addrs(struct device_node *node, struct pci_dev *dev)
 {
 	u64 base, size;
 	unsigned int flags;
+	struct pci_bus_region region;
 	struct resource *res;
 	const u32 *addrs;
 	u32 i;
@@ -106,10 +107,12 @@ static void of_pci_parse_addrs(struct device_node *node, struct pci_dev *dev)
 			printk(KERN_ERR "PCI: bad cfg reg num 0x%x\n", i);
 			continue;
 		}
-		res->start = base;
-		res->end = base + size - 1;
+
 		res->flags = flags;
 		res->name = pci_name(dev);
+		region.start = base;
+		region.end = base + size - 1;
+		pcibios_bus_to_resource(dev, res, &region);
 	}
 }
 
@@ -209,6 +212,7 @@ void __devinit of_scan_pci_bridge(struct pci_dev *dev)
 	struct pci_bus *bus;
 	const u32 *busrange, *ranges;
 	int len, i, mode;
+	struct pci_bus_region region;
 	struct resource *res;
 	unsigned int flags;
 	u64 size;
@@ -270,9 +274,10 @@ void __devinit of_scan_pci_bridge(struct pci_dev *dev)
 			res = bus->resource[i];
 			++i;
 		}
-		res->start = of_read_number(&ranges[1], 2);
-		res->end = res->start + size - 1;
 		res->flags = flags;
+		region.start = of_read_number(&ranges[1], 2);
+		region.end = region.start + size - 1;
+		pcibios_bus_to_resource(dev, res, &region);
 	}
 	sprintf(bus->name, "PCI Bus %04x:%02x", pci_domain_nr(bus),
 		bus->number);
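
For readers unfamiliar with the conversion this patch adds, here is a small userspace model of what a bus-to-resource translation does. It is a sketch under stated assumptions, not the kernel's implementation: the struct and function names are hypothetical stand-ins, resource types are ignored, a miss is simply reported to the caller, and the window offset used in the note below is an illustrative value that does not come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct bus_region { uint64_t start, end; };	/* address as seen on the PCI bus */
struct cpu_range  { uint64_t start, end; };	/* address as seen by the CPU */

/* One host bridge aperture: a CPU range plus (CPU address - bus address). */
struct bridge_window {
	uint64_t cpu_start, cpu_end;
	uint64_t offset;
};

/*
 * Translate a bus region into CPU address space by finding the window
 * whose bus-side range contains it and adding that window's offset.
 * Returns 0 on success, -1 if no window matches (out is left untouched).
 */
int bus_to_cpu(const struct bridge_window *win, size_t nwin,
	       const struct bus_region *region, struct cpu_range *out)
{
	assert(region->start <= region->end);

	for (size_t i = 0; i < nwin; i++) {
		uint64_t bus_start = win[i].cpu_start - win[i].offset;
		uint64_t bus_end = win[i].cpu_end - win[i].offset;

		if (region->start >= bus_start && region->end <= bus_end) {
			out->start = region->start + win[i].offset;
			out->end = region->end + win[i].offset;
			return 0;
		}
	}
	return -1;
}
```

With one window covering CPU addresses 0x100a0000000-0x100afffffff and an assumed offset of 0x10000000000, a BAR at bus address 0xa1001000 translates to 0x100a1001000. Storing the untranslated bus address in the resource instead would presumably leave it outside every host bridge window, which is consistent with the "Cannot allocate resource region" messages in the bad log.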



* Re: linux-next: boot failure for next-20120227 and later (pci tree related)
  2012-03-05  3:34     ` Benjamin Herrenschmidt
@ 2012-03-05 16:14         ` Bjorn Helgaas
  0 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2012-03-05 16:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Stephen Rothwell, linux-next, ppc-dev, linux-kernel, Jesse Barnes

On Sun, Mar 4, 2012 at 8:34 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Sat, 2012-03-03 at 08:52 +1100, Benjamin Herrenschmidt wrote:
>
>> Or give me a chance to dig :-) I'll have a look next week.
>
> This is indeed what Bjorn suspected on IRC; this patch fixes it:
>
> (Bjorn, please fold it into the original offending patch.)

Thanks for checking this out.  Sparc should have the same problem, so
I'll post both updates in a bit.

Bjorn

> diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
> index b37d0b5..5dd63f1 100644
> --- a/arch/powerpc/kernel/pci_of_scan.c
> +++ b/arch/powerpc/kernel/pci_of_scan.c
> @@ -75,6 +75,7 @@ static void of_pci_parse_addrs(struct device_node *node, struct pci_dev *dev)
>  {
>        u64 base, size;
>        unsigned int flags;
> +       struct pci_bus_region region;
>        struct resource *res;
>        const u32 *addrs;
>        u32 i;
> @@ -106,10 +107,12 @@ static void of_pci_parse_addrs(struct device_node *node, struct pci_dev *dev)
>                        printk(KERN_ERR "PCI: bad cfg reg num 0x%x\n", i);
>                        continue;
>                }
> -               res->start = base;
> -               res->end = base + size - 1;
> +
>                res->flags = flags;
>                res->name = pci_name(dev);
> +               region.start = base;
> +               region.end = base + size - 1;
> +               pcibios_bus_to_resource(dev, res, &region);
>        }
>  }
>
> @@ -209,6 +212,7 @@ void __devinit of_scan_pci_bridge(struct pci_dev *dev)
>        struct pci_bus *bus;
>        const u32 *busrange, *ranges;
>        int len, i, mode;
> +       struct pci_bus_region region;
>        struct resource *res;
>        unsigned int flags;
>        u64 size;
> @@ -270,9 +274,10 @@ void __devinit of_scan_pci_bridge(struct pci_dev *dev)
>                        res = bus->resource[i];
>                        ++i;
>                }
> -               res->start = of_read_number(&ranges[1], 2);
> -               res->end = res->start + size - 1;
>                res->flags = flags;
> +               region.start = of_read_number(&ranges[1], 2);
> +               region.end = region.start + size - 1;
> +               pcibios_bus_to_resource(dev, res, &region);
>        }
>        sprintf(bus->name, "PCI Bus %04x:%02x", pci_domain_nr(bus),
>                bus->number);
>
>

Thread overview: 11+ messages
2012-03-02  6:06 linux-next: boot failure for next-20120227 and later (pci tree related) Stephen Rothwell
2012-03-02  6:06 ` Stephen Rothwell
2012-03-02 17:10 ` Bjorn Helgaas
2012-03-02 17:10   ` Bjorn Helgaas
2012-03-02 21:52   ` Benjamin Herrenschmidt
2012-03-02 21:52     ` Benjamin Herrenschmidt
2012-03-05  3:34     ` Benjamin Herrenschmidt
2012-03-05 16:14       ` Bjorn Helgaas
2012-03-05 16:14         ` Bjorn Helgaas
2012-03-02 22:26   ` Stephen Rothwell
2012-03-02 22:26     ` Stephen Rothwell
