* Resource assignment oddities
@ 2013-05-05  0:10 Benjamin Herrenschmidt
  2013-05-05  0:15 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-05  0:10 UTC
  To: linux-pci; +Cc: Bjorn Helgaas, Gavin Shan, Yinghai Lu

Once upon a time, our PCI resource assignment code used to be reasonably
straightforward... right now I'm having a hard time making any sense
of it.

Anyway, here's a machine where some oddities happen (see the log below).

Some memory resources fail to be assigned and I'm not entirely sure why.
It *could* just be resource exhaustion, but I'm not 100% sure. I'll
spend time over the next few days trying to figure out what exactly
is going on in the kernel, but I thought maybe somebody here more familiar
with the code could spot it faster.

The "special" things about this machine are:

 - We have a custom pcibios_window_alignment which for the most part will
return a minimum bridge window alignment of 0x800000 (8MB) to deal
with our address space segmentation (for our EEH/error isolation).

 - We have no IO space on that controller

 - Bus numbers have been assigned by the FW but BARs have been left
alone (and bridge windows are all closed)

 - There is a single 2GB MMIO region coming off the PHB (no "prefetchable"
space at this point), which is at high addresses in CPU space but corresponds
to 2G..4G in PCI space. It's located at offset 2G within an aligned 4G window
in the CPU space, which means that the conversion is trivial: just chop the top
32 bits off and you get the PCI address.
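
The two pieces of arithmetic described in this list can be sketched in plain
C (user-space, not kernel code): rounding an address up to the 8MB window
alignment, and turning a CPU address into a PCI address by keeping the low
32 bits. The names WINDOW_ALIGN, align_up() and cpu_to_pci_addr() are made up
for illustration; this is not what pcibios_window_alignment or the platform
code actually contain.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption taken from the description above: 8MB bridge window alignment. */
#define WINDOW_ALIGN 0x800000ULL

/* Round addr up to the next multiple of align (align must be a power of two). */
static inline uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * CPU -> PCI address. The MMIO region lives inside an aligned 4G window in
 * CPU space, so the PCI address is simply the low 32 bits of the CPU address.
 */
static inline uint32_t cpu_to_pci_addr(uint64_t cpu_addr)
{
	return (uint32_t)(cpu_addr & 0xffffffffULL);
}
```

With the root bus resource from the log below, cpu_to_pci_addr(0x3d00080000000)
gives 0x80000000, which matches the bus address the kernel prints.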

Any help is welcome; this is based on 3.9-rc8. I'll try to dig a bit more
myself during the next few days.

Log below...

Ben.

PCI: Probing PCI hardware
PCI: I/O resource not set for host bridge /pciex@3fffe40000000 (domain 0)
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [mem 0x3d00080000000-0x3d000fffeffff] (bus address [0x80000000-0xfffeffff])
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to ff
pci 0000:00:00.0: [1014:03dc] type 01 class 0x060400
pci 0000:00:00.0: PME# supported from D0 D3hot D3cold
pci 0000:01:00.0: [1014:0339] type 00 class 0x010400
pci 0000:01:00.0: reg 10: [mem 0x00000000-0x0003ffff 64bit]
pci 0000:01:00.0: reg 18: [mem 0x00000000-0x00ffffff 64bit pref]
pci 0000:01:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
pci 0000:00:00.0: PCI bridge to [bus 01]
pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to 01
PCI: I/O resource not set for host bridge /pciex@3fffe40100000 (domain 1)
PCI host bridge to bus 0001:00
pci_bus 0001:00: root bus resource [mem 0x3d01080000000-0x3d010fffeffff] (bus address [0x80000000-0xfffeffff])
pci_bus 0001:00: root bus resource [bus 00-ff]
pci_bus 0001:00: busn_res: [bus 00-ff] end is updated to ff
pci 0001:00:00.0: [1014:03dc] type 01 class 0x060400
pci 0001:00:00.0: PME# supported from D0 D3hot D3cold
pci 0001:01:00.0: [10b5:8732] type 01 class 0x060400
pci 0001:01:00.0: reg 10: [mem 0x00000000-0x0003ffff]
pci 0001:01:00.0: PME# supported from D0 D3hot D3cold
pci 0001:00:00.0: PCI bridge to [bus 01-0d]
pci 0001:02:01.0: [10b5:8732] type 01 class 0x060400
pci 0001:02:01.0: PME# supported from D0 D3hot D3cold
pci 0001:02:08.0: [10b5:8732] type 01 class 0x060400
pci 0001:02:08.0: PME# supported from D0 D3hot D3cold
pci 0001:02:09.0: [10b5:8732] type 01 class 0x060400
pci 0001:02:09.0: PME# supported from D0 D3hot D3cold
pci 0001:01:00.0: PCI bridge to [bus 02-0d]
pci 0001:02:01.0: PCI bridge to [bus 03-07]
pci 0001:08:00.0: [1014:034a] type 00 class 0x010400
pci 0001:08:00.0: reg 10: [mem 0x00000000-0x0000ffff 64bit]
pci 0001:08:00.0: reg 18: [mem 0x00000000-0x0000ffff 64bit]
pci 0001:08:00.0: reg 30: [mem 0x00000000-0x0001ffff pref]
pci 0001:08:00.0: PME# supported from D0 D3hot D3cold
pci 0001:02:08.0: PCI bridge to [bus 08]
pci 0001:02:09.0: PCI bridge to [bus 09-0d]
pci_bus 0001:00: busn_res: [bus 00-ff] end is updated to 0d
PCI: I/O resource not set for host bridge /pciex@3fffe40400000 (domain 2)
PCI host bridge to bus 0002:00
pci_bus 0002:00: root bus resource [mem 0x3d04080000000-0x3d040fffeffff] (bus address [0x80000000-0xfffeffff])
pci_bus 0002:00: root bus resource [bus 00-ff]
pci_bus 0002:00: busn_res: [bus 00-ff] end is updated to ff
pci 0002:00:00.0: [1014:03dc] type 01 class 0x060400
pci 0002:00:00.0: PME# supported from D0 D3hot D3cold
pci 0002:00:00.0: PCI bridge to [bus 01-05]
pci_bus 0002:00: busn_res: [bus 00-ff] end is updated to 05
PCI: I/O resource not set for host bridge /pciex@3fffe40500000 (domain 3)
PCI host bridge to bus 0003:00
pci_bus 0003:00: root bus resource [mem 0x3d05080000000-0x3d050fffeffff] (bus address [0x80000000-0xfffeffff])
pci_bus 0003:00: root bus resource [bus 00-ff]
pci_bus 0003:00: busn_res: [bus 00-ff] end is updated to ff
pci 0003:00:00.0: [1014:03dc] type 01 class 0x060400
pci 0003:00:00.0: PME# supported from D0 D3hot D3cold
pci 0003:01:00.0: [10b5:8748] type 01 class 0x060400
pci 0003:01:00.0: reg 10: [mem 0x00000000-0x0003ffff]
pci 0003:01:00.0: PME# supported from D0 D3hot D3cold
pci 0003:00:00.0: PCI bridge to [bus 01-13]
pci 0003:02:01.0: [10b5:8748] type 01 class 0x060400
pci 0003:02:01.0: PME# supported from D0 D3hot D3cold
pci 0003:02:08.0: [10b5:8748] type 01 class 0x060400
pci 0003:02:08.0: PME# supported from D0 D3hot D3cold
pci 0003:02:09.0: [10b5:8748] type 01 class 0x060400
pci 0003:02:09.0: PME# supported from D0 D3hot D3cold
pci 0003:02:10.0: [10b5:8748] type 01 class 0x060400
pci 0003:02:10.0: PME# supported from D0 D3hot D3cold
pci 0003:02:11.0: [10b5:8748] type 01 class 0x060400
pci 0003:02:11.0: PME# supported from D0 D3hot D3cold
pci 0003:01:00.0: PCI bridge to [bus 02-13]
pci 0003:03:00.0: [104c:8241] type 00 class 0x0c0330
pci 0003:03:00.0: reg 10: [mem 0x00000000-0x0000ffff 64bit]
pci 0003:03:00.0: reg 18: [mem 0x00000000-0x00001fff 64bit]
pci 0003:03:00.0: supports D1 D2
pci 0003:03:00.0: PME# supported from D0 D1 D2 D3hot
pci 0003:02:01.0: PCI bridge to [bus 03]
pci 0003:02:08.0: PCI bridge to [bus 04-08]
pci 0003:09:00.0: [14e4:1657] type 00 class 0x020000
pci 0003:09:00.0: reg 10: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.0: reg 18: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.0: reg 20: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.0: reg 30: [mem 0x00000000-0x0007ffff pref]
pci 0003:09:00.0: PME# supported from D0 D3hot D3cold
pci 0003:09:00.1: [14e4:1657] type 00 class 0x020000
pci 0003:09:00.1: reg 10: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.1: reg 18: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.1: reg 20: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.1: reg 30: [mem 0x00000000-0x0007ffff pref]
pci 0003:09:00.1: PME# supported from D0 D3hot D3cold
pci 0003:09:00.2: [14e4:1657] type 00 class 0x020000
pci 0003:09:00.2: reg 10: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.2: reg 18: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.2: reg 20: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.2: reg 30: [mem 0x00000000-0x0007ffff pref]
pci 0003:09:00.2: PME# supported from D0 D3hot D3cold
pci 0003:09:00.3: [14e4:1657] type 00 class 0x020000
pci 0003:09:00.3: reg 10: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.3: reg 18: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.3: reg 20: [mem 0x00000000-0x0000ffff 64bit pref]
pci 0003:09:00.3: reg 30: [mem 0x00000000-0x0007ffff pref]
pci 0003:09:00.3: PME# supported from D0 D3hot D3cold
pci 0003:02:09.0: PCI bridge to [bus 09]
pci 0003:02:10.0: PCI bridge to [bus 0a-0e]
pci 0003:02:11.0: PCI bridge to [bus 0f-13]
pci_bus 0003:00: busn_res: [bus 00-ff] end is updated to 13
pci 0001:02:01.0: bridge window [io  0x1000-0x0fff] to [bus 03-07] add_size 1000
pci 0001:02:01.0: bridge window [mem 0x00800000-0x007fffff 64bit pref] to [bus 03-07] add_size 800000
pci 0001:02:01.0: bridge window [mem 0x00800000-0x007fffff] to [bus 03-07] add_size 800000
pci 0001:02:08.0: bridge window [io  0x1000-0x0fff] to [bus 08] add_size 1000
pci 0001:02:09.0: bridge window [io  0x1000-0x0fff] to [bus 09-0d] add_size 1000
pci 0001:02:09.0: bridge window [mem 0x00800000-0x007fffff 64bit pref] to [bus 09-0d] add_size 800000
pci 0001:02:09.0: bridge window [mem 0x00800000-0x007fffff] to [bus 09-0d] add_size 800000
pci 0001:02:01.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0001:02:08.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0001:02:09.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0001:01:00.0: bridge window [io  0x1000-0x0fff] to [bus 02-0d] add_size 3000
pci 0001:02:01.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0001:02:09.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0001:01:00.0: bridge window [mem 0x00800000-0x00ffffff pref] to [bus 02-0d] add_size 1000000
pci 0001:02:01.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0001:02:09.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0001:01:00.0: bridge window [mem 0x00800000-0x00ffffff] to [bus 02-0d] add_size 1000000
pci 0001:01:00.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 3000
pci 0001:00:00.0: bridge window [io  0x1000-0x0fff] to [bus 01-0d] add_size 3000
pci 0001:01:00.0: res[9]=[mem 0x00800000-0x00ffffff pref] get_res_add_size add_size 1000000
pci 0001:00:00.0: bridge window [mem 0x00800000-0x00ffffff pref] to [bus 01-0d] add_size 1000000
pci 0001:01:00.0: res[8]=[mem 0x00800000-0x00ffffff] get_res_add_size add_size 1000000
pci 0001:00:00.0: bridge window [mem 0x00800000-0x017fffff] to [bus 01-0d] add_size 1000000
pci 0003:02:08.0: bridge window [io  0x1000-0x0fff] to [bus 04-08] add_size 1000
pci 0003:02:08.0: bridge window [mem 0x00800000-0x007fffff 64bit pref] to [bus 04-08] add_size 800000
pci 0003:02:08.0: bridge window [mem 0x00800000-0x007fffff] to [bus 04-08] add_size 800000
pci 0003:02:09.0: bridge window [io  0x1000-0x0fff] to [bus 09] add_size 1000
pci 0003:02:09.0: bridge window [mem 0x00800000-0x007fffff] to [bus 09] add_size 800000
pci 0003:02:10.0: bridge window [io  0x1000-0x0fff] to [bus 0a-0e] add_size 1000
pci 0003:02:10.0: bridge window [mem 0x00800000-0x007fffff 64bit pref] to [bus 0a-0e] add_size 800000
pci 0003:02:10.0: bridge window [mem 0x00800000-0x007fffff] to [bus 0a-0e] add_size 800000
pci 0003:02:11.0: bridge window [io  0x1000-0x0fff] to [bus 0f-13] add_size 1000
pci 0003:02:11.0: bridge window [mem 0x00800000-0x007fffff 64bit pref] to [bus 0f-13] add_size 800000
pci 0003:02:11.0: bridge window [mem 0x00800000-0x007fffff] to [bus 0f-13] add_size 800000
pci 0003:02:08.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:02:09.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:02:10.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:02:11.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:01:00.0: bridge window [io  0x1000-0x0fff] to [bus 02-13] add_size 4000
pci 0003:02:08.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0003:02:10.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0003:02:11.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0003:01:00.0: bridge window [mem 0x00800000-0x00ffffff pref] to [bus 02-13] add_size 1800000
pci 0003:02:08.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:02:09.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:02:10.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:02:11.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:01:00.0: bridge window [mem 0x00800000-0x00ffffff] to [bus 02-13] add_size 2000000
pci 0003:01:00.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 4000
pci 0003:00:00.0: bridge window [io  0x1000-0x0fff] to [bus 01-13] add_size 4000
pci 0003:01:00.0: res[9]=[mem 0x00800000-0x00ffffff pref] get_res_add_size add_size 1800000
pci 0003:00:00.0: bridge window [mem 0x00800000-0x00ffffff pref] to [bus 01-13] add_size 1800000
pci 0003:01:00.0: res[8]=[mem 0x00800000-0x00ffffff] get_res_add_size add_size 2000000
pci 0003:00:00.0: bridge window [mem 0x00800000-0x017fffff] to [bus 01-13] add_size 2000000
pci 0000:00:00.0: BAR 8: assigned [mem 0x3d00080000000-0x3d000807fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x3d00080800000-0x3d00081ffffff pref]
pci 0000:01:00.0: BAR 2: assigned [mem 0x3d00081000000-0x3d00081ffffff 64bit pref]
pci 0000:01:00.0: BAR 0: assigned [mem 0x3d00080000000-0x3d0008003ffff 64bit]
pci 0000:01:00.0: BAR 6: assigned [mem 0x3d00080800000-0x3d0008081ffff pref]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0:   bridge window [mem 0x3d00080000000-0x3d000807fffff]
pci 0000:00:00.0:   bridge window [mem 0x3d00080800000-0x3d00081ffffff pref]
pci 0001:00:00.0: res[8]=[mem 0x00800000-0x017fffff] get_res_add_size add_size 1000000
pci 0001:00:00.0: res[9]=[mem 0x00800000-0x00ffffff pref] get_res_add_size add_size 1000000
pci 0001:00:00.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 3000
pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d01081ffffff]
pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01082000000-0x3d010837fffff pref]
pci 0001:00:00.0: BAR 7: can't assign io (size 0x3000)
pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d01080ffffff]
pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01081000000-0x3d010817fffff pref]
pci 0001:00:00.0: BAR 8: reassigned [mem 0x3d01081800000-0x3d010837fffff]
pci 0001:00:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref]
pci 0001:00:00.0: BAR 7: can't assign io (size 0x3000)
pci 0001:01:00.0: res[8]=[mem 0x00800000-0x00ffffff] get_res_add_size add_size 1000000
pci 0001:01:00.0: res[9]=[mem 0x00800000-0x00ffffff pref] get_res_add_size add_size 1000000
pci 0001:01:00.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 3000
pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01081800000-0x3d01082ffffff]
pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010817fffff pref]
pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01083000000-0x3d0108303ffff]
pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000)
pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01081800000-0x3d01081ffffff]
pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010807fffff pref]
pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01082000000-0x3d0108203ffff]
pci 0001:01:00.0: BAR 8: can't assign mem (size 0x800000)
pci 0001:01:00.0: failed to add 1000000 res[8]=[mem 0x3d01081800000-0x3d01081ffffff]
pci 0001:01:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref]
pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000)
pci 0001:02:01.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0001:02:01.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0001:02:09.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0001:02:09.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0001:02:01.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0001:02:08.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0001:02:09.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0001:02:01.0: BAR 8: assigned [mem 0x3d01081800000-0x3d01081ffffff]
pci 0001:02:01.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010807fffff 64bit pref]
pci 0001:02:08.0: BAR 8: can't assign mem (size 0x800000)
pci 0001:02:08.0: BAR 9: assigned [mem 0x3d01080800000-0x3d01080ffffff pref]
pci 0001:02:09.0: BAR 8: can't assign mem (size 0x800000)
pci 0001:02:09.0: BAR 9: assigned [mem 0x3d01081000000-0x3d010817fffff 64bit pref]
pci 0001:02:01.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:08.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:09.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:08.0: BAR 8: assigned [mem 0x3d01081800000-0x3d01081ffffff]
pci 0001:02:08.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010807fffff pref]
pci 0001:02:09.0: BAR 8: can't assign mem (size 0x800000)
pci 0001:02:09.0: BAR 9: assigned [mem 0x3d01080800000-0x3d01080ffffff 64bit pref]
pci 0001:02:09.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:08.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:01.0: BAR 8: can't assign mem (size 0x800000)
pci 0001:02:01.0: BAR 9: assigned [mem 0x3d01081000000-0x3d010817fffff 64bit pref]
pci 0001:02:01.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:01.0: PCI bridge to [bus 03-07]
pci 0001:02:01.0:   bridge window [mem 0x3d01081000000-0x3d010817fffff 64bit pref]
pci 0001:08:00.0: BAR 6: assigned [mem 0x3d01080000000-0x3d0108001ffff pref]
pci 0001:08:00.0: BAR 0: assigned [mem 0x3d01081800000-0x3d0108180ffff 64bit]
pci 0001:08:00.0: BAR 2: assigned [mem 0x3d01081810000-0x3d0108181ffff 64bit]
pci 0001:02:08.0: PCI bridge to [bus 08]
pci 0001:02:08.0:   bridge window [mem 0x3d01081800000-0x3d01081ffffff]
pci 0001:02:08.0:   bridge window [mem 0x3d01080000000-0x3d010807fffff pref]
pci 0001:02:09.0: PCI bridge to [bus 09-0d]
pci 0001:02:09.0:   bridge window [mem 0x3d01080800000-0x3d01080ffffff 64bit pref]
pci 0001:01:00.0: PCI bridge to [bus 02-0d]
pci 0001:01:00.0:   bridge window [mem 0x3d01081800000-0x3d01081ffffff]
pci 0001:01:00.0:   bridge window [mem 0x3d01080000000-0x3d010817fffff pref]
pci 0001:00:00.0: PCI bridge to [bus 01-0d]
pci 0001:00:00.0:   bridge window [mem 0x3d01081800000-0x3d010837fffff]
pci 0001:00:00.0:   bridge window [mem 0x3d01080000000-0x3d010817fffff pref]
pci 0002:00:00.0: PCI bridge to [bus 01-05]
pci 0003:00:00.0: res[8]=[mem 0x00800000-0x017fffff] get_res_add_size add_size 2000000
pci 0003:00:00.0: res[9]=[mem 0x00800000-0x00ffffff pref] get_res_add_size add_size 1800000
pci 0003:00:00.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 4000
pci 0003:00:00.0: BAR 8: assigned [mem 0x3d05080000000-0x3d05082ffffff]
pci 0003:00:00.0: BAR 9: assigned [mem 0x3d05083000000-0x3d05084ffffff pref]
pci 0003:00:00.0: BAR 7: can't assign io (size 0x4000)
pci 0003:00:00.0: BAR 8: assigned [mem 0x3d05080000000-0x3d05080ffffff]
pci 0003:00:00.0: BAR 9: assigned [mem 0x3d05081000000-0x3d050817fffff pref]
pci 0003:00:00.0: BAR 8: reassigned [mem 0x3d05081800000-0x3d050847fffff]
pci 0003:00:00.0: BAR 9: reassigned [mem 0x3d05084800000-0x3d050867fffff pref]
pci 0003:00:00.0: BAR 7: can't assign io (size 0x4000)
pci 0003:01:00.0: res[8]=[mem 0x00800000-0x00ffffff] get_res_add_size add_size 2000000
pci 0003:01:00.0: res[9]=[mem 0x00800000-0x00ffffff pref] get_res_add_size add_size 1800000
pci 0003:01:00.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 4000
pci 0003:01:00.0: BAR 8: assigned [mem 0x3d05081800000-0x3d05083ffffff]
pci 0003:01:00.0: BAR 9: assigned [mem 0x3d05084800000-0x3d050867fffff pref]
pci 0003:01:00.0: BAR 0: assigned [mem 0x3d05084000000-0x3d0508403ffff]
pci 0003:01:00.0: BAR 7: can't assign io (size 0x4000)
pci 0003:01:00.0: BAR 8: assigned [mem 0x3d05081800000-0x3d05081ffffff]
pci 0003:01:00.0: BAR 9: assigned [mem 0x3d05084800000-0x3d05084ffffff pref]
pci 0003:01:00.0: BAR 0: assigned [mem 0x3d05082000000-0x3d0508203ffff]
pci 0003:01:00.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:01:00.0: failed to add 2000000 res[8]=[mem 0x3d05081800000-0x3d05081ffffff]
pci 0003:01:00.0: BAR 9: reassigned [mem 0x3d05084800000-0x3d050867fffff pref]
pci 0003:01:00.0: BAR 7: can't assign io (size 0x4000)
pci 0003:02:08.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:02:08.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0003:02:09.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:02:10.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:02:10.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0003:02:11.0: res[8]=[mem 0x00800000-0x007fffff] get_res_add_size add_size 800000
pci 0003:02:11.0: res[9]=[mem 0x00800000-0x007fffff 64bit pref] get_res_add_size add_size 800000
pci 0003:02:08.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:02:09.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:02:10.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:02:11.0: res[7]=[io  0x1000-0x0fff] get_res_add_size add_size 1000
pci 0003:02:01.0: BAR 8: assigned [mem 0x3d05081800000-0x3d05081ffffff]
pci 0003:02:08.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:08.0: BAR 9: assigned [mem 0x3d05084800000-0x3d05084ffffff 64bit pref]
pci 0003:02:09.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:09.0: BAR 9: assigned [mem 0x3d05085000000-0x3d050857fffff pref]
pci 0003:02:10.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:10.0: BAR 9: assigned [mem 0x3d05085800000-0x3d05085ffffff 64bit pref]
pci 0003:02:11.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:11.0: BAR 9: assigned [mem 0x3d05086000000-0x3d050867fffff 64bit pref]
pci 0003:02:08.0: BAR 7: can't assign io (size 0x1000)
pci 0003:02:09.0: BAR 7: can't assign io (size 0x1000)
pci 0003:02:10.0: BAR 7: can't assign io (size 0x1000)
pci 0003:02:11.0: BAR 7: can't assign io (size 0x1000)
pci 0003:02:01.0: BAR 8: assigned [mem 0x3d05081800000-0x3d05081ffffff]
pci 0003:02:09.0: BAR 9: assigned [mem 0x3d05084800000-0x3d05084ffffff pref]
pci 0003:02:11.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:11.0: BAR 9: assigned [mem 0x3d05085000000-0x3d050857fffff 64bit pref]
pci 0003:02:11.0: BAR 7: can't assign io (size 0x1000)
pci 0003:02:10.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:10.0: BAR 9: assigned [mem 0x3d05085800000-0x3d05085ffffff 64bit pref]
pci 0003:02:10.0: BAR 7: can't assign io (size 0x1000)
pci 0003:02:09.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:09.0: BAR 7: can't assign io (size 0x1000)
pci 0003:02:08.0: BAR 8: can't assign mem (size 0x800000)
pci 0003:02:08.0: BAR 9: assigned [mem 0x3d05086000000-0x3d050867fffff 64bit pref]
pci 0003:02:08.0: BAR 7: can't assign io (size 0x1000)
pci 0003:03:00.0: BAR 0: assigned [mem 0x3d05081800000-0x3d0508180ffff 64bit]
pci 0003:03:00.0: BAR 2: assigned [mem 0x3d05081810000-0x3d05081811fff 64bit]
pci 0003:02:01.0: PCI bridge to [bus 03]
pci 0003:02:01.0:   bridge window [mem 0x3d05081800000-0x3d05081ffffff]
pci 0003:02:08.0: PCI bridge to [bus 04-08]
pci 0003:02:08.0:   bridge window [mem 0x3d05086000000-0x3d050867fffff 64bit pref]
pci 0003:09:00.0: BAR 6: assigned [mem 0x3d05084800000-0x3d0508487ffff pref]
pci 0003:09:00.1: BAR 6: assigned [mem 0x3d05084880000-0x3d050848fffff pref]
pci 0003:09:00.2: BAR 6: assigned [mem 0x3d05084900000-0x3d0508497ffff pref]
pci 0003:09:00.3: BAR 6: assigned [mem 0x3d05084980000-0x3d050849fffff pref]
pci 0003:09:00.0: BAR 0: assigned [mem 0x3d05084a00000-0x3d05084a0ffff 64bit pref]
pci 0003:09:00.0: BAR 2: assigned [mem 0x3d05084a10000-0x3d05084a1ffff 64bit pref]
pci 0003:09:00.0: BAR 4: assigned [mem 0x3d05084a20000-0x3d05084a2ffff 64bit pref]
pci 0003:09:00.1: BAR 0: assigned [mem 0x3d05084a30000-0x3d05084a3ffff 64bit pref]
pci 0003:09:00.1: BAR 2: assigned [mem 0x3d05084a40000-0x3d05084a4ffff 64bit pref]
pci 0003:09:00.1: BAR 4: assigned [mem 0x3d05084a50000-0x3d05084a5ffff 64bit pref]
pci 0003:09:00.2: BAR 0: assigned [mem 0x3d05084a60000-0x3d05084a6ffff 64bit pref]
pci 0003:09:00.2: BAR 2: assigned [mem 0x3d05084a70000-0x3d05084a7ffff 64bit pref]
pci 0003:09:00.2: BAR 4: assigned [mem 0x3d05084a80000-0x3d05084a8ffff 64bit pref]
pci 0003:09:00.3: BAR 0: assigned [mem 0x3d05084a90000-0x3d05084a9ffff 64bit pref]
pci 0003:09:00.3: BAR 2: assigned [mem 0x3d05084aa0000-0x3d05084aaffff 64bit pref]
pci 0003:09:00.3: BAR 4: assigned [mem 0x3d05084ab0000-0x3d05084abffff 64bit pref]
pci 0003:02:09.0: PCI bridge to [bus 09]
pci 0003:02:09.0:   bridge window [mem 0x3d05084800000-0x3d05084ffffff pref]
pci 0003:02:10.0: PCI bridge to [bus 0a-0e]
pci 0003:02:10.0:   bridge window [mem 0x3d05085800000-0x3d05085ffffff 64bit pref]
pci 0003:02:11.0: PCI bridge to [bus 0f-13]
pci 0003:02:11.0:   bridge window [mem 0x3d05085000000-0x3d050857fffff 64bit pref]
pci 0003:01:00.0: PCI bridge to [bus 02-13]
pci 0003:01:00.0:   bridge window [mem 0x3d05081800000-0x3d05081ffffff]
pci 0003:01:00.0:   bridge window [mem 0x3d05084800000-0x3d050867fffff pref]
pci 0003:00:00.0: PCI bridge to [bus 01-13]
pci 0003:00:00.0:   bridge window [mem 0x3d05081800000-0x3d050847fffff]
pci 0003:00:00.0:   bridge window [mem 0x3d05084800000-0x3d050867fffff pref]
pci_bus 0000:00: resource 4 [mem 0x3d00080000000-0x3d000fffeffff]
pci_bus 0000:01: resource 1 [mem 0x3d00080000000-0x3d000807fffff]
pci_bus 0000:01: resource 2 [mem 0x3d00080800000-0x3d00081ffffff pref]
pci_bus 0001:00: resource 4 [mem 0x3d01080000000-0x3d010fffeffff]
pci_bus 0001:01: resource 1 [mem 0x3d01081800000-0x3d010837fffff]
pci_bus 0001:01: resource 2 [mem 0x3d01080000000-0x3d010817fffff pref]
pci_bus 0001:02: resource 1 [mem 0x3d01081800000-0x3d01081ffffff]
pci_bus 0001:02: resource 2 [mem 0x3d01080000000-0x3d010817fffff pref]
pci_bus 0001:03: resource 2 [mem 0x3d01081000000-0x3d010817fffff 64bit pref]
pci_bus 0001:08: resource 1 [mem 0x3d01081800000-0x3d01081ffffff]
pci_bus 0001:08: resource 2 [mem 0x3d01080000000-0x3d010807fffff pref]
pci_bus 0001:09: resource 2 [mem 0x3d01080800000-0x3d01080ffffff 64bit pref]
pci_bus 0002:00: resource 4 [mem 0x3d04080000000-0x3d040fffeffff]
pci_bus 0003:00: resource 4 [mem 0x3d05080000000-0x3d050fffeffff]
pci_bus 0003:01: resource 1 [mem 0x3d05081800000-0x3d050847fffff]
pci_bus 0003:01: resource 2 [mem 0x3d05084800000-0x3d050867fffff pref]
pci_bus 0003:02: resource 1 [mem 0x3d05081800000-0x3d05081ffffff]
pci_bus 0003:02: resource 2 [mem 0x3d05084800000-0x3d050867fffff pref]
pci_bus 0003:03: resource 1 [mem 0x3d05081800000-0x3d05081ffffff]
pci_bus 0003:04: resource 2 [mem 0x3d05086000000-0x3d050867fffff 64bit pref]
pci_bus 0003:09: resource 2 [mem 0x3d05084800000-0x3d05084ffffff pref]
pci_bus 0003:0a: resource 2 [mem 0x3d05085800000-0x3d05085ffffff 64bit pref]
pci_bus 0003:0f: resource 2 [mem 0x3d05085000000-0x3d050857fffff 64bit pref]




* Re: Resource assignment oddities
  2013-05-05  0:10 Resource assignment oddities Benjamin Herrenschmidt
@ 2013-05-05  0:15 ` Benjamin Herrenschmidt
  2013-05-05  5:18   ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-05  0:15 UTC
  To: linux-pci; +Cc: Bjorn Helgaas, Gavin Shan, Yinghai Lu

On Sun, 2013-05-05 at 10:10 +1000, Benjamin Herrenschmidt wrote:
> Once upon a time, our PCI resource assignment code used to be reasonably
> straightforward... right now I'm having a hard time making any sense
> of it.

Note that the devices so far seem to be working and here's the resulting
layout:

/ # cat /proc/iomem 
00000000-7ffffffff : System RAM
3d00080000000-3d000fffeffff : /pciex@3fffe40000000
  3d00080000000-3d000807fffff : PCI Bus 0000:01
    3d00080000000-3d0008003ffff : 0000:01:00.0
      3d00080000000-3d0008003ffff : ipr
  3d00080800000-3d00081ffffff : PCI Bus 0000:01
    3d00080800000-3d0008081ffff : 0000:01:00.0
    3d00081000000-3d00081ffffff : 0000:01:00.0
      3d00081000000-3d00081ffffff : ipr
3d01080000000-3d010fffeffff : /pciex@3fffe40100000
  3d01080000000-3d010817fffff : PCI Bus 0001:01
    3d01080000000-3d010817fffff : PCI Bus 0001:02
      3d01080000000-3d010807fffff : PCI Bus 0001:08
        3d01080000000-3d0108001ffff : 0001:08:00.0
      3d01080800000-3d01080ffffff : PCI Bus 0001:09
      3d01081000000-3d010817fffff : PCI Bus 0001:03
  3d01081800000-3d010837fffff : PCI Bus 0001:01
    3d01081800000-3d01081ffffff : PCI Bus 0001:02
      3d01081800000-3d01081ffffff : PCI Bus 0001:08
        3d01081800000-3d0108180ffff : 0001:08:00.0
          3d01081800000-3d0108180ffff : ipr
        3d01081810000-3d0108181ffff : 0001:08:00.0
          3d01081810000-3d0108181ffff : ipr
    3d01082000000-3d0108203ffff : 0001:01:00.0
3d04080000000-3d040fffeffff : /pciex@3fffe40400000
3d05080000000-3d050fffeffff : /pciex@3fffe40500000
  3d05081800000-3d050847fffff : PCI Bus 0003:01
    3d05081800000-3d05081ffffff : PCI Bus 0003:02
      3d05081800000-3d05081ffffff : PCI Bus 0003:03
        3d05081800000-3d0508180ffff : 0003:03:00.0
        3d05081810000-3d05081811fff : 0003:03:00.0
    3d05082000000-3d0508203ffff : 0003:01:00.0
  3d05084800000-3d050867fffff : PCI Bus 0003:01
    3d05084800000-3d050867fffff : PCI Bus 0003:02
      3d05084800000-3d05084ffffff : PCI Bus 0003:09
        3d05084800000-3d0508487ffff : 0003:09:00.0
        3d05084880000-3d050848fffff : 0003:09:00.1
        3d05084900000-3d0508497ffff : 0003:09:00.2
        3d05084980000-3d050849fffff : 0003:09:00.3
        3d05084a00000-3d05084a0ffff : 0003:09:00.0
          3d05084a00000-3d05084a0ffff : tg3
        3d05084a10000-3d05084a1ffff : 0003:09:00.0
          3d05084a10000-3d05084a1ffff : tg3
        3d05084a20000-3d05084a2ffff : 0003:09:00.0
          3d05084a20000-3d05084a2ffff : tg3
        3d05084a30000-3d05084a3ffff : 0003:09:00.1
          3d05084a30000-3d05084a3ffff : tg3
        3d05084a40000-3d05084a4ffff : 0003:09:00.1
          3d05084a40000-3d05084a4ffff : tg3
        3d05084a50000-3d05084a5ffff : 0003:09:00.1
          3d05084a50000-3d05084a5ffff : tg3
        3d05084a60000-3d05084a6ffff : 0003:09:00.2
          3d05084a60000-3d05084a6ffff : tg3
        3d05084a70000-3d05084a7ffff : 0003:09:00.2
          3d05084a70000-3d05084a7ffff : tg3
        3d05084a80000-3d05084a8ffff : 0003:09:00.2
          3d05084a80000-3d05084a8ffff : tg3
        3d05084a90000-3d05084a9ffff : 0003:09:00.3
          3d05084a90000-3d05084a9ffff : tg3
        3d05084aa0000-3d05084aaffff : 0003:09:00.3
          3d05084aa0000-3d05084aaffff : tg3
        3d05084ab0000-3d05084abffff : 0003:09:00.3
          3d05084ab0000-3d05084abffff : tg3
      3d05085000000-3d050857fffff : PCI Bus 0003:0f
      3d05085800000-3d05085ffffff : PCI Bus 0003:0a
      3d05086000000-3d050867fffff : PCI Bus 0003:04

Cheers,
Ben.




* Re: Resource assignment oddities
  2013-05-05  0:15 ` Benjamin Herrenschmidt
@ 2013-05-05  5:18   ` Yinghai Lu
  2013-05-05  5:34     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-05  5:18 UTC
  To: Benjamin Herrenschmidt; +Cc: linux-pci, Bjorn Helgaas, Gavin Shan

On Sat, May 4, 2013 at 5:15 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Sun, 2013-05-05 at 10:10 +1000, Benjamin Herrenschmidt wrote:
>> Once upon a time, our PCI resource assignment code used to be reasonably
>> straightforward... right now I'm having a hard time making any sense
>> of it.
>
> Note that the devices so far seem to be working and here's the resulting
> layout:

We intend to assign unassigned resources in two tries. The first one
tries to allocate the must_have and optional resources at the same time.

If that fails, we retry and allocate only the must_have resources.

Currently the optional resources only include the SR-IOV related ones.
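
The two tries can be sketched roughly as below. This is a user-space toy
under stated assumptions, not the kernel's assignment code: struct req,
try_assign() and assign_two_pass() are hypothetical names, and the "pool" is
just a byte count with no alignment or placement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request, not a kernel structure. */
struct req {
	uint64_t must_have;	/* size that is required */
	uint64_t optional;	/* extra size that is nice to have */
	uint64_t granted;	/* filled in when a pass succeeds */
};

/*
 * Try to fit every request into 'avail' bytes. with_optional selects
 * whether the optional part is included. Nothing is committed unless
 * all requests fit.
 */
static bool try_assign(struct req *reqs, size_t n, uint64_t avail,
		       bool with_optional)
{
	uint64_t used = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t want = reqs[i].must_have +
				(with_optional ? reqs[i].optional : 0);
		if (want > avail - used)
			return false;
		used += want;
	}
	for (size_t i = 0; i < n; i++)
		reqs[i].granted = reqs[i].must_have +
				  (with_optional ? reqs[i].optional : 0);
	return true;
}

/* First try: must_have + optional together. Second try: must_have only. */
static bool assign_two_pass(struct req *reqs, size_t n, uint64_t avail)
{
	assert(reqs != NULL || n == 0);
	if (try_assign(reqs, n, avail, true))
		return true;
	return try_assign(reqs, n, avail, false);
}
```

The point of the sketch is only the fallback order: the optional sizes are
dropped for everybody as soon as the combined request does not fit.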

Yinghai


* Re: Resource assignment oddities
  2013-05-05  5:18   ` Yinghai Lu
@ 2013-05-05  5:34     ` Benjamin Herrenschmidt
  2013-05-05  7:09       ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-05  5:34 UTC
  To: Yinghai Lu; +Cc: linux-pci, Bjorn Helgaas, Gavin Shan

On Sat, 2013-05-04 at 22:18 -0700, Yinghai Lu wrote:
> We intend to assign unassigned resources in two tries. The first one
> tries to allocate the must_have and optional resources at the same
> time.
> 
> If that fails, we retry and allocate only the must_have resources.
> 
> Currently the optional resources only include the SR-IOV related ones.

Hrm... ok. We'll eventually need some kind of hook here for power as
the way we deal with SR-IOV will have to be somewhat "special", but
I'll explain that some other time.

Does anything in my log hint at what might be causing
those assignment failures ?

Thanks !

Cheers,
Ben.




* Re: Resource assignment oddities
  2013-05-05  5:34     ` Benjamin Herrenschmidt
@ 2013-05-05  7:09       ` Yinghai Lu
  2013-05-05  7:52         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-05  7:09 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linux-pci, Bjorn Helgaas, Gavin Shan

On Sat, May 4, 2013 at 10:34 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Sat, 2013-05-04 at 22:18 -0700, Yinghai Lu wrote:
>> We intend to assign unassigned resources in two tries.
>> The first try allocates the must_have and optional resources at the
>> same time.
>>
>> If that fails, the second try allocates only the must_have resources.
>>
>> For now, the optional resources only include the SR-IOV related ones.
>
> Hrm... ok. We'll eventually need some kind of hook here for power as
> the way we deal with SR-IOV will have to be somewhat "special", but
> I'll explain that some other time.
>
> Does anything in my log hint at what might be causing
> those assignment failures ?

Yes, there is something wrong:

pci 0001:01:00.0: BAR 8: can't assign mem (size 0x800000)

as the bridge can only support 32-bit non-prefetchable MMIO.

There is one bug on architectures other than x86, but it should not be
related.

In pci_bus_alloc_resource():

|        /* don't allocate too high if the pref mem doesn't support 64bit*/
|        if (!(res->flags & IORESOURCE_MEM_64))
|                max = PCIBIOS_MAX_MEM_32;

We should call pcibios_resource_to_bus ... to make
sure that the actual bus address is still 32-bit.

But I'm confused. Did you happen to define your own
    PCIBIOS_MAX_MEM_32 ?
The default should be -1 on architectures other than x86.
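
To illustrate why the 32-bit limit has to be checked against the bus address rather than the CPU address, here is a small standalone sketch. struct window, cpu_to_bus() and fits_32bit_bus() are hypothetical helpers, not kernel APIs. The offset models a host bridge whose CPU addresses sit far above 4G while the bus addresses stay in 2G..4G, as in the log at the top of the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host bridge window: CPU address = bus address + offset. */
struct window {
	uint64_t cpu_start;
	uint64_t cpu_end;
	uint64_t offset;
};

/* Translate a CPU address inside the window to a PCI bus address. */
static uint64_t cpu_to_bus(const struct window *w, uint64_t cpu_addr)
{
	return cpu_addr - w->offset;
}

/* True if [cpu_start, cpu_end] lies inside the window and its last
 * byte maps to a bus address that still fits in 32 bits. */
static bool fits_32bit_bus(const struct window *w,
			   uint64_t cpu_start, uint64_t cpu_end)
{
	if (cpu_start > cpu_end)
		return false;
	if (cpu_start < w->cpu_start || cpu_end > w->cpu_end)
		return false;
	return cpu_to_bus(w, cpu_end) <= UINT32_MAX;
}
```

Comparing the CPU-side end address against a 32-bit maximum would reject every range in such a window, even though all of them are reachable with 32-bit bus addresses.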

Thanks

Yinghai


* Re: Resource assignment oddities
  2013-05-05  7:09       ` Yinghai Lu
@ 2013-05-05  7:52         ` Benjamin Herrenschmidt
       [not found]           ` <51871088.4594420a.0ccc.7300SMTPIN_ADDED_BROKEN@mx.google.com>
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-05  7:52 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: linux-pci, Bjorn Helgaas, Gavin Shan

On Sun, 2013-05-05 at 00:09 -0700, Yinghai Lu wrote:
> Yes, there is something wrong:
>
> pci 0001:01:00.0: BAR 8: can't assign mem (size 0x800000)
>
> as the bridge can only support 32-bit non-prefetchable MMIO.

Right, that looks wrong, why can't it assign it ? That's what I haven't
figured out yet. There should be plenty of space still available.

> There is one bug on architectures other than x86, but it should not be
> related.
>
> In pci_bus_alloc_resource():
> 
> |        /* don't allocate too high if the pref mem doesn't support 64bit*/
> |        if (!(res->flags & IORESOURCE_MEM_64))
> |                max = PCIBIOS_MAX_MEM_32;
> 
> We should call pcibios_resource_to_bus ... to make
> sure that the actual bus address is still 32-bit.

Or the other way around but yes, I see your point however ...

> But I'm confused. Did you happen to define your own
>     PCIBIOS_MAX_MEM_32 ?
> The default should be -1 on architectures other than x86.

Right, it is -1. Oh well, I'll sprinkle some printk's around tomorrow (or
ask Gavin to do it :-)

Thanks !

Cheers,
Ben.




* Re: Resource assignment oddities
       [not found]           ` <51871088.4594420a.0ccc.7300SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2013-05-06  3:04             ` Yinghai Lu
       [not found]               ` <20130506103159.GA16927@shangw.(null)>
       [not found]               ` <518786a7.64bbec0a.58a0.1f6bSMTPIN_ADDED_BROKEN@mx.google.com>
  0 siblings, 2 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-06  3:04 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Benjamin Herrenschmidt, linux-pci, Bjorn Helgaas

On Sun, May 5, 2013 at 7:08 PM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> On Sun, May 05, 2013 at 05:52:16PM +1000, Benjamin Herrenschmidt wrote:
>>On Sun, 2013-05-05 at 00:09 -0700, Yinghai Lu wrote:
>>> Yes, there is something wrong:
>>>
>>> pci 0001:01:00.0: BAR 8: can't assign mem (size 0x800000)
>>>
>>> as the bridge can only support 32-bit non-prefetchable MMIO.
>>
>>Right, that looks wrong, why can't it assign it ? That's what I haven't
>>figured out yet. There should be plenty of space still available.
>>
>>> There is one bug on architectures other than x86, but it should not be
>>> related.
>>>
>>> In pci_bus_alloc_resource():
>>>
>>> |        /* don't allocate too high if the pref mem doesn't support 64bit*/
>>> |        if (!(res->flags & IORESOURCE_MEM_64))
>>> |                max = PCIBIOS_MAX_MEM_32;
>>>
>>> We should call pcibios_resource_to_bus ... to make
>>> sure that the actual bus address is still 32-bit.
>>
>>Or the other way around but yes, I see your point however ...
>>
>>> But I'm confused. Did you happen to define your own
>>>     PCIBIOS_MAX_MEM_32 ?
>>> The default should be -1 on architectures other than x86.
>>
>>Right, it is -1. Oh well, I'll sprinkle some printk's around tomorrow (or
>>ask Gavin to do it :-)
>>
>
> Ben, I'll trace it down since we can see the same problem on the
> simulator as well. I'll update with any findings :-)

The root cause could be that the ioport retry causes it to fail.

Please try reverting this commit to see if it works:

commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Thu Feb 23 19:23:29 2012 -0800

    PCI: Retry on IORESOURCE_IO type allocations

Thanks

Yinghai


* Re: Resource assignment oddities
       [not found]               ` <20130506103159.GA16927@shangw.(null)>
@ 2013-05-06 10:48                 ` Benjamin Herrenschmidt
  2013-05-06 19:56                   ` Yinghai Lu
                                     ` (4 more replies)
  0 siblings, 5 replies; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-06 10:48 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Yinghai Lu, linux-pci, Bjorn Helgaas

On Mon, 2013-05-06 at 18:31 +0800, Gavin Shan wrote:
> A possible (temporary) fix would be to check "local_fail_head" in
> __assign_resources_sorted() and skip the reassignment if all the elements
> in the list are IO related. In the future, we might introduce a specific
> flag to indicate that the PCI host can't support IO and skip all IO
> assignment accordingly.

Yinghai, what's the best we can do for 3.10 ? This kernel will go to enterprise
distros I'm told and that's a pretty nasty problem for some of our machines
that don't do IO.
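
For reference, the check Gavin describes could be sketched as follows. This is a standalone illustration only: struct failed_res, RES_IO, RES_MEM and should_retry() are invented for the example and are not the kernel's types or flag values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bits for this sketch, not the kernel's IORESOURCE_* values. */
#define RES_IO	0x1UL
#define RES_MEM	0x2UL

/* Hypothetical entry of a "failed to assign" list. */
struct failed_res {
	unsigned long flags;
};

/* Retry only if at least one failed resource is not an I/O resource.
 * An empty list, or a list holding only I/O failures, means no retry. */
static bool should_retry(const struct failed_res *fails, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!(fails[i].flags & RES_IO))
			return true;
	return false;
}
```

On a host bridge without any IO space, every IO BAR ends up on the failed list, so a check of this kind would keep those failures from dragging the memory resources through another reassignment round.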

Cheers,
Ben.




* Re: Resource assignment oddities
  2013-05-06 10:48                 ` Benjamin Herrenschmidt
@ 2013-05-06 19:56                   ` Yinghai Lu
       [not found]                     ` <5188b791.a110420a.0bea.077eSMTPIN_ADDED_BROKEN@mx.google.com>
  2013-05-06 23:15                   ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
                                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-06 19:56 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Gavin Shan, linux-pci, Bjorn Helgaas

On Mon, May 6, 2013 at 3:48 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Mon, 2013-05-06 at 18:31 +0800, Gavin Shan wrote:
>> A possible (temporary) fix would be to check "local_fail_head" in
>> __assign_resources_sorted() and skip the reassignment if all the elements
>> in the list are IO related. In the future, we might introduce a specific
>> flag to indicate that the PCI host can't support IO and skip all IO
>> assignment accordingly.
>
> Yinghai, what's the best we can do for 3.10 ? This kernel will go to enterprise
> distros I'm told and that's a pretty nasty problem for some of our machines
> that don't do IO.

OK, I will look at that today.

I will fix that on the PCI core side,

as upcoming x86 8-socket and even 32-socket systems will be more likely
to have some root bus that does not get an I/O port allocation ...

Thanks

Yinghai


* [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus
  2013-05-06 10:48                 ` Benjamin Herrenschmidt
  2013-05-06 19:56                   ` Yinghai Lu
@ 2013-05-06 23:15                   ` Yinghai Lu
  2013-05-06 23:15                     ` [PATCH 2/2] PCI: Skip IORESOURCE_IO size and allocation for root bus without ioport range Yinghai Lu
                                       ` (2 more replies)
  2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                     ` (2 subsequent siblings)
  4 siblings, 3 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-06 23:15 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus does not have an I/O port range, we keep
retrying the reallocation of MMIO.

The current retry logic is: try with must+optional first, and if
that fails, try with must only and then try to extend must with optional.
That will fail because the mmio-non-pref and mmio-pref windows of a
bridge are next to each other, so there is no room to extend
mmio-non-pref.

We should not fall into the retry in this case, as the root bus does
not have an I/O port range.

Before we do that, we need to split pci_assign_unassigned_resources()
per root bus, so we can stop early for a root bus without an I/O port
range and still continue to retry on buses that do have one.

This will become more common on x86 8-socket or 32-socket systems,
which will have one root bus per socket, and some of those root buses
will not have an I/O port range.

For the failing retry, we could allocate mmio-non-pref bottom-up
and mmio-pref top-down, but that is not material for v3.10.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |  101 +++++++++++++++++++++++-------------------------
 1 file changed, 49 insertions(+), 52 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1315,21 +1315,6 @@ static int __init pci_bus_get_depth(stru
 
 	return depth;
 }
-static int __init pci_get_max_depth(void)
-{
-	int depth = 0;
-	struct pci_bus *bus;
-
-	list_for_each_entry(bus, &pci_root_buses, node) {
-		int ret;
-
-		ret = pci_bus_get_depth(bus);
-		if (ret > depth)
-			depth = ret;
-	}
-
-	return depth;
-}
 
 /*
  * -1: undefined, will auto detect later
@@ -1354,34 +1339,41 @@ void __init pci_realloc_get_opt(char *st
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
-static bool __init pci_realloc_enabled(void)
+static bool __init pci_realloc_enabled(enum enable_type enable)
 {
-	return pci_realloc_enable >= user_enabled;
+	return enable >= user_enabled;
 }
 
-static void __init pci_realloc_detect(void)
+static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+			 enum enable_type enable_local)
 {
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
-	struct pci_dev *dev = NULL;
+	struct pci_dev *dev;
 
-	if (pci_realloc_enable != undefined)
-		return;
+	if (enable_local != undefined)
+		return enable_local;
 
-	for_each_pci_dev(dev) {
+	list_for_each_entry(dev, &bus->devices, bus_list) {
 		int i;
 
 		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
 			struct resource *r = &dev->resource[i];
 
 			/* Not assigned, or rejected by kernel ? */
-			if (r->flags && !r->start) {
-				pci_realloc_enable = auto_enabled;
-
-				return;
-			}
+			if (r->flags && !r->start)
+				return auto_enabled;
 		}
 	}
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (child &&
+		    pci_realloc_detect(child, enable_local) == auto_enabled)
+			return auto_enabled;
+	}
 #endif
+	return enable_local;
 }
 
 /*
@@ -1389,10 +1381,9 @@ static void __init pci_realloc_detect(vo
  * second  and later try will clear small leaf bridge res
  * will stop till to the max  deepth if can not find good one
  */
-void __init
-pci_assign_unassigned_resources(void)
+static void __init
+pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
-	struct pci_bus *bus;
 	LIST_HEAD(realloc_head); /* list of resources that
 					want additional resources */
 	struct list_head *add_list = NULL;
@@ -1403,15 +1394,17 @@ pci_assign_unassigned_resources(void)
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
+	enum enable_type enable_local;
 
 	/* don't realloc if asked to do so */
-	pci_realloc_detect();
-	if (pci_realloc_enabled()) {
-		int max_depth = pci_get_max_depth();
+	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
+	if (pci_realloc_enabled(enable_local)) {
+		int max_depth = pci_bus_get_depth(bus);
 
 		pci_try_num = max_depth + 1;
-		printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n",
-			 max_depth, pci_try_num);
+		dev_printk(KERN_DEBUG, &bus->dev,
+			   "max bus depth: %d pci_try_num: %d\n",
+			   max_depth, pci_try_num);
 	}
 
 again:
@@ -1423,12 +1416,10 @@ again:
 		add_list = &realloc_head;
 	/* Depth first, calculate sizes and alignments of all
 	   subordinate buses. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_size_bridges(bus, add_list);
+	__pci_bus_size_bridges(bus, add_list);
 
 	/* Depth last, allocate resources and update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_assign_resources(bus, add_list, &fail_head);
+	__pci_bus_assign_resources(bus, add_list, &fail_head);
 	if (add_list)
 		BUG_ON(!list_empty(add_list));
 	tried_times++;
@@ -1438,17 +1429,17 @@ again:
 		goto enable_and_dump;
 
 	if (tried_times >= pci_try_num) {
-		if (pci_realloc_enable == undefined)
-			printk(KERN_INFO "Some PCI device resources are unassigned, try booting with pci=realloc\n");
-		else if (pci_realloc_enable == auto_enabled)
-			printk(KERN_INFO "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
+		if (enable_local == undefined)
+			dev_info(&bus->dev, "Some PCI device resources are unassigned, try booting with pci=realloc\n");
+		else if (enable_local == auto_enabled)
+			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);
 		goto enable_and_dump;
 	}
 
-	printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n",
-			 tried_times + 1);
+	dev_printk(KERN_DEBUG, &bus->dev,
+		   "No. %d try to assign unassigned res\n", tried_times + 1);
 
 	/* third times and later will not check if it is leaf */
 	if ((tried_times + 1) > 2)
@@ -1458,12 +1449,11 @@ again:
 	 * Try to release leaf bridge's resources that doesn't fit resource of
 	 * child device under that bridge
 	 */
-	list_for_each_entry(fail_res, &fail_head, list) {
-		bus = fail_res->dev->bus;
-		pci_bus_release_bridge_resources(bus,
+	list_for_each_entry(fail_res, &fail_head, list)
+		pci_bus_release_bridge_resources(fail_res->dev->bus,
 						 fail_res->flags & type_mask,
 						 rel_type);
-	}
+
 	/* restore size and flags */
 	list_for_each_entry(fail_res, &fail_head, list) {
 		struct resource *res = fail_res->res;
@@ -1480,12 +1470,19 @@ again:
 
 enable_and_dump:
 	/* Depth last, update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_enable_bridges(bus);
+	pci_enable_bridges(bus);
 
 	/* dump the resource on buses */
+	pci_bus_dump_resources(bus);
+}
+
+void __init
+pci_assign_unassigned_resources(void)
+{
+	struct pci_bus *bus;
+
 	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_bus_dump_resources(bus);
+		pci_assign_unassigned_root_bus_resources(bus);
 }
 
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)


* [PATCH 2/2] PCI: Skip IORESOURCE_IO size and allocation for root bus without ioport range
  2013-05-06 23:15                   ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
@ 2013-05-06 23:15                     ` Yinghai Lu
  2013-05-07  0:50                     ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Benjamin Herrenschmidt
  2013-05-21 20:41                     ` Bjorn Helgaas
  2 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-06 23:15 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus does not have an I/O port range, we keep
retrying the reallocation of MMIO.

The current retry logic is: try with must+optional first, and if
that fails, try with must only and then try to extend must with optional.
That will fail because the mmio-non-pref and mmio-pref windows of a
bridge are next to each other, so there is no room to extend
mmio-non-pref.

We should not fall into the retry in this case, as the root bus does
not have an I/O port range.

We check whether the root bus has an I/O port range, set
bus_res_type_mask accordingly, pass it to __pci_bus_size_bridges(),
and skip I/O port resources when there is no such range.

For the failing retry, we could allocate mmio-non-pref bottom-up
and mmio-pref top-down, but that is not material for v3.10.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1045,7 +1045,8 @@ handle_done:
 }
 
 static void __ref __pci_bus_size_bridges(struct pci_bus *bus,
-			struct list_head *realloc_head)
+			struct list_head *realloc_head,
+			unsigned long res_type_mask)
 {
 	struct pci_dev *dev;
 	unsigned long mask, prefmask;
@@ -1063,7 +1064,7 @@ static void __ref __pci_bus_size_bridges
 
 		case PCI_CLASS_BRIDGE_PCI:
 		default:
-			__pci_bus_size_bridges(b, realloc_head);
+			__pci_bus_size_bridges(b, realloc_head, res_type_mask);
 			break;
 		}
 	}
@@ -1087,8 +1088,9 @@ static void __ref __pci_bus_size_bridges
 		 * Follow thru
 		 */
 	default:
-		pbus_size_io(bus, realloc_head ? 0 : additional_io_size,
-			     additional_io_size, realloc_head);
+		if (res_type_mask & IORESOURCE_IO)
+			pbus_size_io(bus, realloc_head ? 0 : additional_io_size,
+				     additional_io_size, realloc_head);
 		/* If the bridge supports prefetchable range, size it
 		   separately. If it doesn't, or its prefetchable window
 		   has already been allocated by arch code, try
@@ -1111,7 +1113,10 @@ static void __ref __pci_bus_size_bridges
 
 void __ref pci_bus_size_bridges(struct pci_bus *bus)
 {
-	__pci_bus_size_bridges(bus, NULL);
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	__pci_bus_size_bridges(bus, NULL, type_mask);
 }
 EXPORT_SYMBOL(pci_bus_size_bridges);
 
@@ -1376,6 +1381,21 @@ static enum enable_type __init pci_reall
 	return enable_local;
 }
 
+static unsigned int __init pci_bus_res_type_mask(struct pci_bus *bus)
+{
+	int i;
+	struct resource *r;
+	unsigned long mask = 0;
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	pci_bus_for_each_resource(bus, r, i)
+		if (r)
+			mask |= r->flags & type_mask;
+
+	return mask;
+}
+
 /*
  * first try will not touch pci bridge res
  * second  and later try will clear small leaf bridge res
@@ -1395,6 +1415,7 @@ pci_assign_unassigned_root_bus_resources
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
 	enum enable_type enable_local;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	/* don't realloc if asked to do so */
 	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
@@ -1416,7 +1437,7 @@ again:
 		add_list = &realloc_head;
 	/* Depth first, calculate sizes and alignments of all
 	   subordinate buses. */
-	__pci_bus_size_bridges(bus, add_list);
+	__pci_bus_size_bridges(bus, add_list, bus_res_type_mask);
 
 	/* Depth last, allocate resources and update the hardware. */
 	__pci_bus_assign_resources(bus, add_list, &fail_head);
@@ -1496,9 +1517,10 @@ void pci_assign_unassigned_bridge_resour
 	int retval;
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bridge->bus);
 
 again:
-	__pci_bus_size_bridges(parent, &add_list);
+	__pci_bus_size_bridges(parent, &add_list, bus_res_type_mask);
 	__pci_bridge_assign_resources(bridge, &add_list, &fail_head);
 	BUG_ON(!list_empty(&add_list));
 	tried_times++;
@@ -1554,6 +1576,7 @@ void pci_assign_unassigned_bus_resources
 	struct pci_dev *dev;
 	LIST_HEAD(add_list); /* list of resources that
 					want additional resources */
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &bus->devices, bus_list)
@@ -1561,7 +1584,7 @@ void pci_assign_unassigned_bus_resources
 		    dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
 			if (dev->subordinate)
 				__pci_bus_size_bridges(dev->subordinate,
-							 &add_list);
+						 &add_list, bus_res_type_mask);
 	up_read(&pci_bus_sem);
 	__pci_bus_assign_resources(bus, &add_list, NULL);
 	BUG_ON(!list_empty(&add_list));
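
Outside of the diff, the idea behind pci_bus_res_type_mask() can be shown in a few lines of standalone C. The struct, the flag values, bus_type_mask() and should_size_io() below are stand-ins invented for this sketch, not the kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bits for this sketch, not the kernel's IORESOURCE_* values. */
#define RES_IO		0x1UL
#define RES_MEM		0x2UL
#define RES_PREFETCH	0x4UL

/* Hypothetical root bus window. */
struct res {
	unsigned long flags;
};

/* OR together the resource types of all windows on a root bus.
 * NULL slots are skipped, as in the patch's loop. */
static unsigned long bus_type_mask(const struct res *const *windows, size_t n)
{
	unsigned long mask = 0;

	for (size_t i = 0; i < n; i++)
		if (windows[i])
			mask |= windows[i]->flags &
				(RES_IO | RES_MEM | RES_PREFETCH);
	return mask;
}

/* I/O sizing only makes sense when the root bus has an I/O window. */
static bool should_size_io(unsigned long mask)
{
	return (mask & RES_IO) != 0;
}
```

A root bus that exposes only a memory window yields a mask without the I/O bit, so the I/O sizing step can be skipped for the whole hierarchy below it.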


* Re: [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus
  2013-05-06 23:15                   ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
  2013-05-06 23:15                     ` [PATCH 2/2] PCI: Skip IORESOURCE_IO size and allocation for root bus without ioport range Yinghai Lu
@ 2013-05-07  0:50                     ` Benjamin Herrenschmidt
       [not found]                       ` <51885a04.c181440a.37c9.ffffa88eSMTPIN_ADDED_BROKEN@mx.google.com>
  2013-05-21 20:41                     ` Bjorn Helgaas
  2 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-07  0:50 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci, linux-kernel

On Mon, 2013-05-06 at 16:15 -0700, Yinghai Lu wrote:
> BenH reported a problem with assigning unassigned resources
> on powerpc.
>
> It turns out that after
> | commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
> | Date:   Thu Feb 23 19:23:29 2012 -0800
> |
> |    PCI: Retry on IORESOURCE_IO type allocations
> 
> even when the root bus does not have an I/O port range, we keep
> retrying the reallocation of MMIO.
>
> The current retry logic is: try with must+optional first, and if
> that fails, try with must only and then try to extend must with optional.
> That will fail because the mmio-non-pref and mmio-pref windows of a
> bridge are next to each other, so there is no room to extend
> mmio-non-pref.

Shouldn't you do completely separate passes for IO and MMIO through the
whole lot instead since they are completely separate address spaces ?

I.e., even on a legit setup with IO, a failure on the IO side
(resource exhaustion, FW bugs, ...) shouldn't make us retry the memory
side, and vice versa.
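
A minimal sketch of what separate passes could mean, assuming a hypothetical per-address-space callback. assign_fn, assign_per_space() and no_io_host() are illustrative names and none of this exists in the PCI core. Each space gets its own retry loop, so a failure on the IO side never triggers a retry on the memory side.

```c
#include <stdbool.h>
#include <stddef.h>

enum space { SPACE_IO, SPACE_MEM, SPACE_COUNT };

/* Hypothetical callback: attempt the assignment for one address space
 * on try number `attempt` and return true on success. */
typedef bool (*assign_fn)(enum space s, int attempt, void *ctx);

/* Run each address space through its own retry loop and record, per
 * space, whether it succeeded and how many tries it used. */
static void assign_per_space(assign_fn assign, void *ctx, int max_tries,
			     bool ok[SPACE_COUNT], int tries[SPACE_COUNT])
{
	for (int s = 0; s < SPACE_COUNT; s++) {
		ok[s] = false;
		tries[s] = 0;
		while (!ok[s] && tries[s] < max_tries) {
			tries[s]++;
			ok[s] = assign((enum space)s, tries[s], ctx);
		}
	}
}

/* Example callback modelling a host bridge with no I/O window:
 * I/O always fails, memory succeeds on the first try. */
static bool no_io_host(enum space s, int attempt, void *ctx)
{
	(void)attempt;
	(void)ctx;
	return s == SPACE_MEM;
}
```

With no_io_host() the IO pass exhausts its tries while the memory pass finishes on the first one and is never disturbed.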

Note that I'm a bit worried by how invasive your proposed patch is. I
will let Gavin test it and if it works I'm happy but keep in mind that I
really need that fixed in 3.10 ... (especially since there are noises
around about a certain enterprise distro using that version...)

Cheers,
Ben.




* Re: [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus
       [not found]                       ` <51885a04.c181440a.37c9.ffffa88eSMTPIN_ADDED_BROKEN@mx.google.com>
@ 2013-05-07  7:34                         ` Yinghai Lu
  0 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07  7:34 UTC (permalink / raw)
  To: Gavin Shan
  Cc: Benjamin Herrenschmidt, Bjorn Helgaas, linux-pci,
	Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 299 bytes --]

On Mon, May 6, 2013 at 6:33 PM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
>
> I ran the kernel with Yinghai's code on the simulator and it failed to
> enable PCI devices (including p2p bridges), as the attached kernel log
> indicates.

Please check the attached v2 of the second patch.

Thanks

Yinghai

[-- Attachment #2: root_bus_ioport_skip_2.patch --]
[-- Type: application/octet-stream, Size: 7451 bytes --]

Subject: [PATCH 2/2] PCI: Skip IORESOURCE_IO size and allocation for root bus without ioport range

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus does not have an I/O port range, we keep
retrying the reallocation of MMIO.

The current retry logic is: try with must+optional first, and if
that fails, try with must only and then try to extend must with optional.
That will fail because the mmio-non-pref and mmio-pref windows of a
bridge are next to each other, so there is no room to extend
mmio-non-pref.

We should not fall into the retry in this case, as the root bus does
not have an I/O port range.

We check whether the root bus has an I/O port range, set
bus_res_type_mask accordingly, pass it to __pci_bus_size_bridges(),
and skip I/O port resources when there is no such range.

For the failing retry, we could allocate mmio-non-pref bottom-up
and mmio-pref top-down, but that is not material for v3.10.

-v2: remove the wrong __init from pci_bus_res_type_mask(),
     don't check bridge size/flags, and clear bus io resources.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   81 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 64 insertions(+), 17 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -586,7 +586,8 @@ void pci_setup_bridge(struct pci_bus *bu
 /* Check whether the bridge supports optional I/O and
    prefetchable memory ranges. If not, the respective
    base/limit registers must be read-only and read as 0. */
-static void pci_bridge_check_ranges(struct pci_bus *bus)
+static void pci_bridge_check_ranges(struct pci_bus *bus,
+				    unsigned long res_type_mask)
 {
 	u16 io;
 	u32 pmem;
@@ -596,14 +597,17 @@ static void pci_bridge_check_ranges(stru
 	b_res = &bridge->resource[PCI_BRIDGE_RESOURCES];
 	b_res[1].flags |= IORESOURCE_MEM;
 
-	pci_read_config_word(bridge, PCI_IO_BASE, &io);
-	if (!io) {
-		pci_write_config_word(bridge, PCI_IO_BASE, 0xf0f0);
+	if (res_type_mask & IORESOURCE_IO) {
 		pci_read_config_word(bridge, PCI_IO_BASE, &io);
- 		pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
- 	}
- 	if (io)
-		b_res[0].flags |= IORESOURCE_IO;
+		if (!io) {
+			pci_write_config_word(bridge, PCI_IO_BASE, 0xf0f0);
+			pci_read_config_word(bridge, PCI_IO_BASE, &io);
+			pci_write_config_word(bridge, PCI_IO_BASE, 0x0);
+		}
+		if (io)
+			b_res[0].flags |= IORESOURCE_IO;
+	}
+
 	/*  DECchip 21050 pass 2 errata: the bridge may miss an address
 	    disconnect boundary by one PCI data phase.
 	    Workaround: do not use prefetching on this device. */
@@ -812,6 +816,24 @@ static void pbus_size_io(struct pci_bus
 	}
 }
 
+static void pbus_clear_io(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		int i;
+
+		for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+			struct resource *r = &dev->resource[i];
+
+			if (r->parent || !(r->flags & IORESOURCE_IO))
+				continue;
+
+			reset_resource(r);
+		}
+	}
+}
+
 static inline resource_size_t calculate_mem_align(resource_size_t *aligns,
 						  int max_order)
 {
@@ -1045,7 +1067,8 @@ handle_done:
 }
 
 static void __ref __pci_bus_size_bridges(struct pci_bus *bus,
-			struct list_head *realloc_head)
+			struct list_head *realloc_head,
+			unsigned long res_type_mask)
 {
 	struct pci_dev *dev;
 	unsigned long mask, prefmask;
@@ -1063,7 +1086,7 @@ static void __ref __pci_bus_size_bridges
 
 		case PCI_CLASS_BRIDGE_PCI:
 		default:
-			__pci_bus_size_bridges(b, realloc_head);
+			__pci_bus_size_bridges(b, realloc_head, res_type_mask);
 			break;
 		}
 	}
@@ -1078,7 +1101,7 @@ static void __ref __pci_bus_size_bridges
 		break;
 
 	case PCI_CLASS_BRIDGE_PCI:
-		pci_bridge_check_ranges(bus);
+		pci_bridge_check_ranges(bus, res_type_mask);
 		if (bus->self->is_hotplug_bridge) {
 			additional_io_size  = pci_hotplug_io_size;
 			additional_mem_size = pci_hotplug_mem_size;
@@ -1087,8 +1110,11 @@ static void __ref __pci_bus_size_bridges
 		 * Follow thru
 		 */
 	default:
-		pbus_size_io(bus, realloc_head ? 0 : additional_io_size,
-			     additional_io_size, realloc_head);
+		if (res_type_mask & IORESOURCE_IO)
+			pbus_size_io(bus, realloc_head ? 0 : additional_io_size,
+				     additional_io_size, realloc_head);
+		else
+			pbus_clear_io(bus);
 		/* If the bridge supports prefetchable range, size it
 		   separately. If it doesn't, or its prefetchable window
 		   has already been allocated by arch code, try
@@ -1111,7 +1137,10 @@ static void __ref __pci_bus_size_bridges
 
 void __ref pci_bus_size_bridges(struct pci_bus *bus)
 {
-	__pci_bus_size_bridges(bus, NULL);
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	__pci_bus_size_bridges(bus, NULL, type_mask);
 }
 EXPORT_SYMBOL(pci_bus_size_bridges);
 
@@ -1376,6 +1405,21 @@ static enum enable_type __init pci_reall
 	return enable_local;
 }
 
+static unsigned long pci_bus_res_type_mask(struct pci_bus *bus)
+{
+	int i;
+	struct resource *r;
+	unsigned long mask = 0;
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	pci_bus_for_each_resource(bus, r, i)
+		if (r)
+			mask |= r->flags & type_mask;
+
+	return mask;
+}
+
 /*
  * first try will not touch pci bridge res
  * second  and later try will clear small leaf bridge res
@@ -1395,6 +1439,7 @@ pci_assign_unassigned_root_bus_resources
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
 	enum enable_type enable_local;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	/* don't realloc if asked to do so */
 	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
@@ -1416,7 +1461,7 @@ again:
 		add_list = &realloc_head;
 	/* Depth first, calculate sizes and alignments of all
 	   subordinate buses. */
-	__pci_bus_size_bridges(bus, add_list);
+	__pci_bus_size_bridges(bus, add_list, bus_res_type_mask);
 
 	/* Depth last, allocate resources and update the hardware. */
 	__pci_bus_assign_resources(bus, add_list, &fail_head);
@@ -1496,9 +1541,10 @@ void pci_assign_unassigned_bridge_resour
 	int retval;
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bridge->bus);
 
 again:
-	__pci_bus_size_bridges(parent, &add_list);
+	__pci_bus_size_bridges(parent, &add_list, bus_res_type_mask);
 	__pci_bridge_assign_resources(bridge, &add_list, &fail_head);
 	BUG_ON(!list_empty(&add_list));
 	tried_times++;
@@ -1554,6 +1600,7 @@ void pci_assign_unassigned_bus_resources
 	struct pci_dev *dev;
 	LIST_HEAD(add_list); /* list of resources that
 					want additional resources */
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &bus->devices, bus_list)
@@ -1561,7 +1608,7 @@ void pci_assign_unassigned_bus_resources
 		    dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
 			if (dev->subordinate)
 				__pci_bus_size_bridges(dev->subordinate,
-							 &add_list);
+						 &add_list, bus_res_type_mask);
 	up_read(&pci_bus_sem);
 	__pci_bus_assign_resources(bus, &add_list, NULL);
 	BUG_ON(!list_empty(&add_list));


* [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource
  2013-05-06 10:48                 ` Benjamin Herrenschmidt
  2013-05-06 19:56                   ` Yinghai Lu
  2013-05-06 23:15                   ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
@ 2013-05-07 22:17                   ` Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 1/5] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
                                       ` (4 more replies)
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
  4 siblings, 5 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:17 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources on
powerpc.

It turns out after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

the code keeps retrying reallocation, mmio included, even when the
root bus does not have an io port range.

The current retry logic is: try must+optional first and, if that
fails, try must alone and then try to extend must with optional.
The extension fails because the bridge's mmio-non-pref and mmio-pref
windows are next to each other, so there is no room to extend
mmio-non-pref.

This will become more common on x86 8-socket or 32-socket systems,
which have one root bus per socket; some of those root buses will
not have an ioport range.

We should not fall into the retry path in this case, as the root bus
does not have an io port range.

We check whether the root bus has an ioport range, set
bus_res_type_mask accordingly, pass it to the assign_resources
functions, and don't add ioport resources to the failed list for a
root bus that does not have an ioport range.
So even if the BIOS set wrong values in pci devices and bridges, they
will still get cleared.
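
To illustrate the check described above, here is a minimal standalone
sketch. It is not the kernel code: the RES_* values are stand-in bits,
and struct toy_bus and the toy_* helpers are hypothetical names made up
for this example. It only shows the idea of collecting the types a root
bus offers and refusing to queue a failed io port resource for retry
when the bus has no io port window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bits for illustration; not taken from kernel headers. */
#define RES_IO       0x0100UL
#define RES_MEM      0x0200UL
#define RES_PREFETCH 0x2000UL

/* Hypothetical, simplified view of a root bus: just its window flags. */
struct toy_bus {
	const unsigned long *window_flags;
	size_t nr_windows;
};

/* OR together the resource types that the bus windows provide. */
static unsigned long toy_bus_res_type_mask(const struct toy_bus *bus)
{
	unsigned long mask = 0;
	size_t i;

	assert(bus != NULL);
	for (i = 0; i < bus->nr_windows; i++)
		mask |= bus->window_flags[i] &
			(RES_IO | RES_MEM | RES_PREFETCH);
	return mask;
}

/*
 * A failed io port resource is only worth retrying if the root bus
 * has an io port window at all; everything else is retried as before.
 */
static bool toy_should_retry(unsigned long bus_mask, unsigned long res_flags)
{
	if ((res_flags & RES_IO) && !(bus_mask & RES_IO))
		return false;
	return true;
}
```

On a root bus with a single mmio window, like the one in BenH's log,
toy_should_retry() returns false for an io port BAR, so that BAR alone
would no longer trigger another reallocation pass.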

Then do the same check for the mmio-nonpref resource of root buses.

The first two patches are for 3.10; the others are targeted at 3.11.

 PCI: Split pci_assign_unassigned_resources to per root bus
 PCI: Skip IORESOURCE_IO allocation for root bus without ioport range
 PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
 PCI: Enable pci bridge when it is needed
 PCI: Retry assign unassigned resources for hotadd root bus

Thanks

Yinghai


* [PATCH v3 1/5] PCI: Split pci_assign_unassigned_resources to per root bus
  2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
@ 2013-05-07 22:17                     ` Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 2/5] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range Yinghai Lu
                                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:17 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources on
powerpc.

It turns out after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

the code keeps retrying reallocation, mmio included, even when the
root bus does not have an io port range.

The current retry logic is: try must+optional first and, if that
fails, try must alone and then try to extend must with optional.
The extension fails because the bridge's mmio-non-pref and mmio-pref
windows are next to each other, so there is no room to extend
mmio-non-pref.

This will become more common on x86 8-socket or 32-socket systems,
which have one root bus per socket; some of those root buses will
not have an ioport range.

We should not fall into the retry path in this case, as the root bus
does not have an io port range.

Before that, we need to split pci_assign_unassigned_resources so that
it runs per root bus; then we can stop early for a root bus without an
ioport range and still keep retrying on buses that do have one.

Later this also lets the root bus hot-add path and the boot path use
the same code.
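
As a rough illustration of what the per-root-bus split buys, the sketch
below gives every root bus its own pass budget instead of one global
value derived from the deepest bus in the system. It is a standalone
toy, not the patch: struct toy_root_bus, its fields, and the toy_*
functions are hypothetical names. The budget rule (depth + 1 when
realloc is enabled, otherwise 1) mirrors pci_try_num in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-root-bus state; the field names are invented here. */
struct toy_root_bus {
	int depth;             /* deepest bridge nesting below this root */
	bool realloc_enabled;  /* result of the per-bus realloc detection */
};

/* Upper bound on assignment passes for one root bus. */
static int toy_try_num(const struct toy_root_bus *bus)
{
	assert(bus != NULL);
	return bus->realloc_enabled ? bus->depth + 1 : 1;
}

/* Walk every root bus and add up the independent per-bus budgets. */
static int toy_total_passes(const struct toy_root_bus *roots, size_t n)
{
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += toy_try_num(&roots[i]);
	return total;
}
```

With one budget per root bus, a bus that needs no retry stops after its
first pass regardless of how deep or how troubled its neighbours are.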

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |  101 +++++++++++++++++++++++-------------------------
 1 file changed, 49 insertions(+), 52 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1315,21 +1315,6 @@ static int __init pci_bus_get_depth(stru
 
 	return depth;
 }
-static int __init pci_get_max_depth(void)
-{
-	int depth = 0;
-	struct pci_bus *bus;
-
-	list_for_each_entry(bus, &pci_root_buses, node) {
-		int ret;
-
-		ret = pci_bus_get_depth(bus);
-		if (ret > depth)
-			depth = ret;
-	}
-
-	return depth;
-}
 
 /*
  * -1: undefined, will auto detect later
@@ -1354,34 +1339,41 @@ void __init pci_realloc_get_opt(char *st
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
-static bool __init pci_realloc_enabled(void)
+static bool __init pci_realloc_enabled(enum enable_type enable)
 {
-	return pci_realloc_enable >= user_enabled;
+	return enable >= user_enabled;
 }
 
-static void __init pci_realloc_detect(void)
+static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+			 enum enable_type enable_local)
 {
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
-	struct pci_dev *dev = NULL;
+	struct pci_dev *dev;
 
-	if (pci_realloc_enable != undefined)
-		return;
+	if (enable_local != undefined)
+		return enable_local;
 
-	for_each_pci_dev(dev) {
+	list_for_each_entry(dev, &bus->devices, bus_list) {
 		int i;
 
 		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
 			struct resource *r = &dev->resource[i];
 
 			/* Not assigned, or rejected by kernel ? */
-			if (r->flags && !r->start) {
-				pci_realloc_enable = auto_enabled;
-
-				return;
-			}
+			if (r->flags && !r->start)
+				return auto_enabled;
 		}
 	}
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		struct pci_bus *child = dev->subordinate;
+
+		if (child &&
+		    pci_realloc_detect(child, enable_local) == auto_enabled)
+			return auto_enabled;
+	}
 #endif
+	return enable_local;
 }
 
 /*
@@ -1389,10 +1381,9 @@ static void __init pci_realloc_detect(vo
  * second  and later try will clear small leaf bridge res
  * will stop till to the max  deepth if can not find good one
  */
-void __init
-pci_assign_unassigned_resources(void)
+static void __init
+pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
-	struct pci_bus *bus;
 	LIST_HEAD(realloc_head); /* list of resources that
 					want additional resources */
 	struct list_head *add_list = NULL;
@@ -1403,15 +1394,17 @@ pci_assign_unassigned_resources(void)
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
+	enum enable_type enable_local;
 
 	/* don't realloc if asked to do so */
-	pci_realloc_detect();
-	if (pci_realloc_enabled()) {
-		int max_depth = pci_get_max_depth();
+	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
+	if (pci_realloc_enabled(enable_local)) {
+		int max_depth = pci_bus_get_depth(bus);
 
 		pci_try_num = max_depth + 1;
-		printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n",
-			 max_depth, pci_try_num);
+		dev_printk(KERN_DEBUG, &bus->dev,
+			   "max bus depth: %d pci_try_num: %d\n",
+			   max_depth, pci_try_num);
 	}
 
 again:
@@ -1423,12 +1416,10 @@ again:
 		add_list = &realloc_head;
 	/* Depth first, calculate sizes and alignments of all
 	   subordinate buses. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_size_bridges(bus, add_list);
+	__pci_bus_size_bridges(bus, add_list);
 
 	/* Depth last, allocate resources and update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_assign_resources(bus, add_list, &fail_head);
+	__pci_bus_assign_resources(bus, add_list, &fail_head);
 	if (add_list)
 		BUG_ON(!list_empty(add_list));
 	tried_times++;
@@ -1438,17 +1429,17 @@ again:
 		goto enable_and_dump;
 
 	if (tried_times >= pci_try_num) {
-		if (pci_realloc_enable == undefined)
-			printk(KERN_INFO "Some PCI device resources are unassigned, try booting with pci=realloc\n");
-		else if (pci_realloc_enable == auto_enabled)
-			printk(KERN_INFO "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
+		if (enable_local == undefined)
+			dev_info(&bus->dev, "Some PCI device resources are unassigned, try booting with pci=realloc\n");
+		else if (enable_local == auto_enabled)
+			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);
 		goto enable_and_dump;
 	}
 
-	printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n",
-			 tried_times + 1);
+	dev_printk(KERN_DEBUG, &bus->dev,
+		   "No. %d try to assign unassigned res\n", tried_times + 1);
 
 	/* third times and later will not check if it is leaf */
 	if ((tried_times + 1) > 2)
@@ -1458,12 +1449,11 @@ again:
 	 * Try to release leaf bridge's resources that doesn't fit resource of
 	 * child device under that bridge
 	 */
-	list_for_each_entry(fail_res, &fail_head, list) {
-		bus = fail_res->dev->bus;
-		pci_bus_release_bridge_resources(bus,
+	list_for_each_entry(fail_res, &fail_head, list)
+		pci_bus_release_bridge_resources(fail_res->dev->bus,
 						 fail_res->flags & type_mask,
 						 rel_type);
-	}
+
 	/* restore size and flags */
 	list_for_each_entry(fail_res, &fail_head, list) {
 		struct resource *res = fail_res->res;
@@ -1480,12 +1470,19 @@ again:
 
 enable_and_dump:
 	/* Depth last, update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_enable_bridges(bus);
+	pci_enable_bridges(bus);
 
 	/* dump the resource on buses */
+	pci_bus_dump_resources(bus);
+}
+
+void __init
+pci_assign_unassigned_resources(void)
+{
+	struct pci_bus *bus;
+
 	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_bus_dump_resources(bus);
+		pci_assign_unassigned_root_bus_resources(bus);
 }
 
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)


* [PATCH v3 2/5] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range
  2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 1/5] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
@ 2013-05-07 22:17                     ` Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range Yinghai Lu
                                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:17 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources on
powerpc.

It turns out after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

the code keeps retrying reallocation, mmio included, even when the
root bus does not have an io port range.

The current retry logic is: try must+optional first and, if that
fails, try must alone and then try to extend must with optional.
The extension fails because the bridge's mmio-non-pref and mmio-pref
windows are next to each other, so there is no room to extend
mmio-non-pref.

This will become more common on x86 8-socket or 32-socket systems,
which have one root bus per socket; some of those root buses will
not have an ioport range.

We should not fall into the retry path in this case, as the root bus
does not have an io port range.

We check whether the root bus has an ioport range, set
bus_res_type_mask accordingly, pass it to the assign_resources
functions, and don't add ioport resources to the failed list for a
root bus that does not have an ioport range.
So even if the BIOS set wrong values in pci devices and bridges, they
will still get cleared.

As for the failing retry itself, we could allocate mmio-non-pref
bottom-up and mmio-pref top-down, but that is not easy and is not
material for v3.10.

-v2: remove wrong __init with pci_bus_res_type_mask()
     don't check bridge size/flags, and clear bus io resources.
-v3: change to skip adding to failed_list instead.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   86 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 66 insertions(+), 20 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -272,7 +272,8 @@ out:
  * requests that could not satisfied to the failed_list.
  */
 static void assign_requested_resources_sorted(struct list_head *head,
-				 struct list_head *fail_head)
+				 struct list_head *fail_head,
+				 unsigned long bus_res_type_mask)
 {
 	struct resource *res;
 	struct pci_dev_resource *dev_res;
@@ -288,8 +289,19 @@ static void assign_requested_resources_s
 				 * if the failed res is for ROM BAR, and it will
 				 * be enabled later, don't add it to the list
 				 */
-				if (!((idx == PCI_ROM_RESOURCE) &&
-				      (!(res->flags & IORESOURCE_ROM_ENABLE))))
+				bool is_rom_res_not_enabled =
+					 (idx == PCI_ROM_RESOURCE) &&
+					 (!(res->flags & IORESOURCE_ROM_ENABLE));
+				/*
+				 * if the failed res is io port, but bus does
+				 * not have io port support, don't add it
+				 */
+				bool is_ioport_res_without_bus_support =
+					 (!(bus_res_type_mask & IORESOURCE_IO)) &&
+					 (res->flags & IORESOURCE_IO);
+
+				if (!is_rom_res_not_enabled &&
+				    !is_ioport_res_without_bus_support)
 					add_to_list(fail_head,
 						    dev_res->dev, res,
 						    0 /* dont care */,
@@ -302,7 +314,8 @@ static void assign_requested_resources_s
 
 static void __assign_resources_sorted(struct list_head *head,
 				 struct list_head *realloc_head,
-				 struct list_head *fail_head)
+				 struct list_head *fail_head,
+				 unsigned long bus_res_type_mask)
 {
 	/*
 	 * Should not assign requested resources at first.
@@ -336,7 +349,8 @@ static void __assign_resources_sorted(st
 							dev_res->res);
 
 	/* Try updated head list with add_size added */
-	assign_requested_resources_sorted(head, &local_fail_head);
+	assign_requested_resources_sorted(head, &local_fail_head,
+					  bus_res_type_mask);
 
 	/* all assigned with add_size ? */
 	if (list_empty(&local_fail_head)) {
@@ -365,7 +379,8 @@ static void __assign_resources_sorted(st
 
 requested_and_reassign:
 	/* Satisfy the must-have resource requests */
-	assign_requested_resources_sorted(head, fail_head);
+	assign_requested_resources_sorted(head, fail_head,
+					  bus_res_type_mask);
 
 	/* Try to satisfy any additional optional resource
 		requests */
@@ -376,18 +391,21 @@ requested_and_reassign:
 
 static void pdev_assign_resources_sorted(struct pci_dev *dev,
 				 struct list_head *add_head,
-				 struct list_head *fail_head)
+				 struct list_head *fail_head,
+				 unsigned long bus_res_type_mask)
 {
 	LIST_HEAD(head);
 
 	__dev_sort_resources(dev, &head);
-	__assign_resources_sorted(&head, add_head, fail_head);
+	__assign_resources_sorted(&head, add_head, fail_head,
+					bus_res_type_mask);
 
 }
 
 static void pbus_assign_resources_sorted(const struct pci_bus *bus,
 					 struct list_head *realloc_head,
-					 struct list_head *fail_head)
+					 struct list_head *fail_head,
+					 unsigned long bus_res_type_mask)
 {
 	struct pci_dev *dev;
 	LIST_HEAD(head);
@@ -395,7 +413,8 @@ static void pbus_assign_resources_sorted
 	list_for_each_entry(dev, &bus->devices, bus_list)
 		__dev_sort_resources(dev, &head);
 
-	__assign_resources_sorted(&head, realloc_head, fail_head);
+	__assign_resources_sorted(&head, realloc_head, fail_head,
+					bus_res_type_mask);
 }
 
 void pci_setup_cardbus(struct pci_bus *bus)
@@ -1117,19 +1136,22 @@ EXPORT_SYMBOL(pci_bus_size_bridges);
 
 static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
 					 struct list_head *realloc_head,
-					 struct list_head *fail_head)
+					 struct list_head *fail_head,
+					 unsigned long bus_res_type_mask)
 {
 	struct pci_bus *b;
 	struct pci_dev *dev;
 
-	pbus_assign_resources_sorted(bus, realloc_head, fail_head);
+	pbus_assign_resources_sorted(bus, realloc_head, fail_head,
+					bus_res_type_mask);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		b = dev->subordinate;
 		if (!b)
 			continue;
 
-		__pci_bus_assign_resources(b, realloc_head, fail_head);
+		__pci_bus_assign_resources(b, realloc_head, fail_head,
+						 bus_res_type_mask);
 
 		switch (dev->class >> 8) {
 		case PCI_CLASS_BRIDGE_PCI:
@@ -1151,24 +1173,28 @@ static void __ref __pci_bus_assign_resou
 
 void __ref pci_bus_assign_resources(const struct pci_bus *bus)
 {
-	__pci_bus_assign_resources(bus, NULL, NULL);
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	__pci_bus_assign_resources(bus, NULL, NULL, type_mask);
 }
 EXPORT_SYMBOL(pci_bus_assign_resources);
 
 static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
 					 struct list_head *add_head,
-					 struct list_head *fail_head)
+					 struct list_head *fail_head,
+					 unsigned long bus_res_type_mask)
 {
 	struct pci_bus *b;
 
 	pdev_assign_resources_sorted((struct pci_dev *)bridge,
-					 add_head, fail_head);
+				     add_head, fail_head, bus_res_type_mask);
 
 	b = bridge->subordinate;
 	if (!b)
 		return;
 
-	__pci_bus_assign_resources(b, add_head, fail_head);
+	__pci_bus_assign_resources(b, add_head, fail_head, bus_res_type_mask);
 
 	switch (bridge->class >> 8) {
 	case PCI_CLASS_BRIDGE_PCI:
@@ -1376,6 +1402,21 @@ static enum enable_type __init pci_reall
 	return enable_local;
 }
 
+static unsigned long pci_bus_res_type_mask(struct pci_bus *bus)
+{
+	int i;
+	struct resource *r;
+	unsigned long mask = 0;
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	pci_bus_for_each_resource(bus, r, i)
+		if (r)
+			mask |= r->flags & type_mask;
+
+	return mask;
+}
+
 /*
  * first try will not touch pci bridge res
  * second  and later try will clear small leaf bridge res
@@ -1395,6 +1436,7 @@ pci_assign_unassigned_root_bus_resources
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
 	enum enable_type enable_local;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	/* don't realloc if asked to do so */
 	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
@@ -1419,7 +1461,8 @@ again:
 	__pci_bus_size_bridges(bus, add_list);
 
 	/* Depth last, allocate resources and update the hardware. */
-	__pci_bus_assign_resources(bus, add_list, &fail_head);
+	__pci_bus_assign_resources(bus, add_list, &fail_head,
+				   bus_res_type_mask);
 	if (add_list)
 		BUG_ON(!list_empty(add_list));
 	tried_times++;
@@ -1496,10 +1539,12 @@ void pci_assign_unassigned_bridge_resour
 	int retval;
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bridge->bus);
 
 again:
 	__pci_bus_size_bridges(parent, &add_list);
-	__pci_bridge_assign_resources(bridge, &add_list, &fail_head);
+	__pci_bridge_assign_resources(bridge, &add_list, &fail_head,
+					bus_res_type_mask);
 	BUG_ON(!list_empty(&add_list));
 	tried_times++;
 
@@ -1554,6 +1599,7 @@ void pci_assign_unassigned_bus_resources
 	struct pci_dev *dev;
 	LIST_HEAD(add_list); /* list of resources that
 					want additional resources */
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &bus->devices, bus_list)
@@ -1563,6 +1609,6 @@ void pci_assign_unassigned_bus_resources
 				__pci_bus_size_bridges(dev->subordinate,
 							 &add_list);
 	up_read(&pci_bus_sem);
-	__pci_bus_assign_resources(bus, &add_list, NULL);
+	__pci_bus_assign_resources(bus, &add_list, NULL, bus_res_type_mask);
 	BUG_ON(!list_empty(&add_list));
 }


* [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
  2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 1/5] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 2/5] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range Yinghai Lu
@ 2013-05-07 22:17                     ` Yinghai Lu
  2013-05-07 22:28                       ` Benjamin Herrenschmidt
  2013-05-07 22:17                     ` [PATCH v3 4/5] PCI: Enable pci bridge when it is needed Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 5/5] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
  4 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:17 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

x86 8-socket or 32-socket systems have one root bus per socket, and
some of those root buses may not have an mmio non-pref range.

We should not fall into the retry path in this case, as the root bus
does not have an mmio non-pref range.

We check whether the root bus has an mmio-nonpref range, set
bus_res_type_mask accordingly, pass it to the assign_resources
functions, and don't add mmio-nonpref resources to the failed list for
a root bus that does not have an mmio-nonpref range.
So even if the BIOS set wrong values in pci devices and bridges, they
will still get cleared.
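
The distinction between mmio-nonpref and mmio-pref that this relies on
can be sketched in isolation as follows. This is a standalone toy, not
the patch itself: the RES_* bits are stand-ins and the toy_* functions
are hypothetical names. The point it shows is that a window counts as
non-prefetchable mmio only when the MEM bit is set and the PREFETCH bit
is clear, so a root bus with only a prefetchable window does not report
MEM in its mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bits for illustration; not taken from kernel headers. */
#define RES_IO       0x0100UL
#define RES_MEM      0x0200UL
#define RES_PREFETCH 0x2000UL

/* True only for mmio that is not marked prefetchable. */
static bool toy_is_mmio_nonpref(unsigned long flags)
{
	return (flags & (RES_MEM | RES_PREFETCH)) == RES_MEM;
}

/*
 * Build a mask in which RES_MEM means "non-prefetchable mmio only" and
 * prefetchable windows contribute RES_PREFETCH alone.
 */
static unsigned long toy_root_bus_mask(const unsigned long *window_flags,
				       size_t n)
{
	unsigned long mask = 0;
	size_t i;

	assert(window_flags != NULL || n == 0);
	for (i = 0; i < n; i++) {
		unsigned long f = window_flags[i];

		if (f & RES_IO)
			mask |= RES_IO;
		else if (f & RES_PREFETCH)
			mask |= RES_PREFETCH;
		else if (toy_is_mmio_nonpref(f))
			mask |= RES_MEM;
	}
	return mask;
}
```

A failed non-prefetchable BAR would then be skipped whenever RES_MEM is
absent from the mask, in the same way the io port case is handled.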

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -299,9 +299,17 @@ static void assign_requested_resources_s
 				bool is_ioport_res_without_bus_support =
 					 (!(bus_res_type_mask & IORESOURCE_IO)) &&
 					 (res->flags & IORESOURCE_IO);
+				/*
+				 * if the failed res is mmio non-pref, but bus
+				 * does not have mmio non-pref support, don't add it
+				 */
+				bool is_mmio_nonpref_res_without_bus_support =
+					 (!(bus_res_type_mask & IORESOURCE_MEM)) &&
+					 ((res->flags & (IORESOURCE_MEM | IORESOURCE_PREFETCH)) == IORESOURCE_MEM);
 
 				if (!is_rom_res_not_enabled &&
-				    !is_ioport_res_without_bus_support)
+				    !is_ioport_res_without_bus_support &&
+				    !is_mmio_nonpref_res_without_bus_support)
 					add_to_list(fail_head,
 						    dev_res->dev, res,
 						    0 /* dont care */,
@@ -1407,12 +1415,24 @@ static unsigned long pci_bus_res_type_ma
 	int i;
 	struct resource *r;
 	unsigned long mask = 0;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH;
 
-	pci_bus_for_each_resource(bus, r, i)
-		if (r)
-			mask |= r->flags & type_mask;
+	pci_bus_for_each_resource(bus, r, i) {
+		if (!r)
+			continue;
+
+		if (r->flags & IORESOURCE_IO) {
+			mask |= IORESOURCE_IO;
+			continue;
+		}
+		if (r->flags & IORESOURCE_PREFETCH) {
+			mask |= IORESOURCE_PREFETCH;
+			continue;
+		}
+		if ((r->flags & (IORESOURCE_MEM | IORESOURCE_PREFETCH)) == IORESOURCE_MEM) {
+			mask |= IORESOURCE_MEM; /* nonpref only */
+			continue;
+		}
+	}
 
 	return mask;
 }


* [PATCH v3 4/5] PCI: Enable pci bridge when it is needed
  2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                       ` (2 preceding siblings ...)
  2013-05-07 22:17                     ` [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range Yinghai Lu
@ 2013-05-07 22:17                     ` Yinghai Lu
  2013-05-07 22:17                     ` [PATCH v3 5/5] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
  4 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:17 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Currently we enable bridges after bus scan and resource assignment,
and the calls are spread over a lot of places.

We can move this to where a pci device is enabled; that requires
walking up to the root bus and enabling the bridges one by one back
down to the pci dev.

That delays enabling a bridge until it is needed, and also removes one
inconsistency between the boot path and the hotplug path in
acpi_pci_root_add().
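
The walk described above can be pictured with the following standalone
sketch. It is a toy model under stated assumptions, not the kernel
implementation: struct toy_dev, its fields, and the toy_* functions are
hypothetical, and "enabling" is reduced to setting a flag and recording
the order. It shows the recursion going up to the root first and then
enabling bridges on the way back down, so only the path to the device
is touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pci device. */
struct toy_dev {
	struct toy_dev *upstream; /* bridge above; NULL on the root bus */
	bool enabled;
	int enable_order;         /* 1-based order of enabling, 0 if never */
};

static int toy_enable_counter;

/* Enable a bridge after everything above it has been enabled. */
static void toy_enable_bridge(struct toy_dev *bridge)
{
	if (!bridge)
		return;

	/* recurse to the root first, so the path opens top-down */
	toy_enable_bridge(bridge->upstream);

	if (bridge->enabled)
		return;
	bridge->enabled = true;
	bridge->enable_order = ++toy_enable_counter;
}

/* Enabling a device pulls in the bridges on its path, and only those. */
static void toy_enable_device(struct toy_dev *dev)
{
	assert(dev != NULL);
	toy_enable_bridge(dev->upstream);
	if (!dev->enabled) {
		dev->enabled = true;
		dev->enable_order = ++toy_enable_counter;
	}
}
```

A bridge that has no enabled device below it is never touched in this
model, which is the "as needed" behaviour the description is after.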

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/arm/kernel/bios32.c           |    5 -----
 arch/m68k/platform/coldfire/pci.c  |    1 -
 arch/mips/pci/pci.c                |    1 -
 arch/sh/drivers/pci/pci.c          |    1 -
 drivers/acpi/pci_root.c            |    4 ----
 drivers/parisc/lba_pci.c           |    1 -
 drivers/pci/bus.c                  |   19 -------------------
 drivers/pci/hotplug/acpiphp_glue.c |    1 -
 drivers/pci/pci.c                  |   20 ++++++++++++++++++++
 drivers/pci/probe.c                |    1 -
 drivers/pci/setup-bus.c            |   10 +++-------
 drivers/pcmcia/cardbus.c           |    1 -
 include/linux/pci.h                |    1 -
 13 files changed, 23 insertions(+), 43 deletions(-)

Index: linux-2.6/arch/arm/kernel/bios32.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/bios32.c
+++ linux-2.6/arch/arm/kernel/bios32.c
@@ -524,11 +524,6 @@ void pci_common_init(struct hw_pci *hw)
 			 * Assign resources.
 			 */
 			pci_bus_assign_resources(bus);
-
-			/*
-			 * Enable bridges
-			 */
-			pci_enable_bridges(bus);
 		}
 
 		/*
Index: linux-2.6/arch/m68k/platform/coldfire/pci.c
===================================================================
--- linux-2.6.orig/arch/m68k/platform/coldfire/pci.c
+++ linux-2.6/arch/m68k/platform/coldfire/pci.c
@@ -319,7 +319,6 @@ static int __init mcf_pci_init(void)
 	pci_fixup_irqs(pci_common_swizzle, mcf_pci_map_irq);
 	pci_bus_size_bridges(rootbus);
 	pci_bus_assign_resources(rootbus);
-	pci_enable_bridges(rootbus);
 	pci_bus_add_devices(rootbus);
 	return 0;
 }
Index: linux-2.6/arch/mips/pci/pci.c
===================================================================
--- linux-2.6.orig/arch/mips/pci/pci.c
+++ linux-2.6/arch/mips/pci/pci.c
@@ -113,7 +113,6 @@ static void pcibios_scanbus(struct pci_c
 		if (!pci_has_flag(PCI_PROBE_ONLY)) {
 			pci_bus_size_bridges(bus);
 			pci_bus_assign_resources(bus);
-			pci_enable_bridges(bus);
 		}
 	}
 }
Index: linux-2.6/arch/sh/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/arch/sh/drivers/pci/pci.c
+++ linux-2.6/arch/sh/drivers/pci/pci.c
@@ -69,7 +69,6 @@ static void pcibios_scanbus(struct pci_c
 
 		pci_bus_size_bridges(bus);
 		pci_bus_assign_resources(bus);
-		pci_enable_bridges(bus);
 	} else {
 		pci_free_resource_list(&resources);
 	}
Index: linux-2.6/drivers/acpi/pci_root.c
===================================================================
--- linux-2.6.orig/drivers/acpi/pci_root.c
+++ linux-2.6/drivers/acpi/pci_root.c
@@ -538,10 +538,6 @@ static int acpi_pci_root_add(struct acpi
 		pci_assign_unassigned_bus_resources(root->bus);
 	}
 
-	/* need to after hot-added ioapic is registered */
-	if (system_state != SYSTEM_BOOTING)
-		pci_enable_bridges(root->bus);
-
 	pci_bus_add_devices(root->bus);
 	return 1;
 
Index: linux-2.6/drivers/parisc/lba_pci.c
===================================================================
--- linux-2.6.orig/drivers/parisc/lba_pci.c
+++ linux-2.6/drivers/parisc/lba_pci.c
@@ -1533,7 +1533,6 @@ lba_driver_probe(struct parisc_device *d
 		lba_dump_res(&lba_dev->hba.lmmio_space, 2);
 #endif
 	}
-	pci_enable_bridges(lba_bus);
 
 	/*
 	** Once PCI register ops has walked the bus, access to config
Index: linux-2.6/drivers/pci/bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/bus.c
+++ linux-2.6/drivers/pci/bus.c
@@ -215,24 +215,6 @@ void pci_bus_add_devices(const struct pc
 	}
 }
 
-void pci_enable_bridges(struct pci_bus *bus)
-{
-	struct pci_dev *dev;
-	int retval;
-
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		if (dev->subordinate) {
-			if (!pci_is_enabled(dev)) {
-				retval = pci_enable_device(dev);
-				if (retval)
-					dev_err(&dev->dev, "Error enabling bridge (%d), continuing\n", retval);
-				pci_set_master(dev);
-			}
-			pci_enable_bridges(dev->subordinate);
-		}
-	}
-}
-
 /** pci_walk_bus - walk devices on/under bus, calling callback.
  *  @top      bus whose devices should be walked
  *  @cb       callback to be called for each device found
@@ -285,4 +267,3 @@ EXPORT_SYMBOL_GPL(pci_walk_bus);
 EXPORT_SYMBOL(pci_bus_alloc_resource);
 EXPORT_SYMBOL_GPL(pci_bus_add_device);
 EXPORT_SYMBOL(pci_bus_add_devices);
-EXPORT_SYMBOL(pci_enable_bridges);
Index: linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
@@ -704,7 +704,6 @@ static int __ref enable_device(struct ac
 	acpiphp_sanitize_bus(bus);
 	acpiphp_set_hpp_values(bus);
 	acpiphp_set_acpi_region(slot);
-	pci_enable_bridges(bus);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		/* Assume that newly added devices are powered on already. */
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -1145,6 +1145,24 @@ int pci_reenable_device(struct pci_dev *
 	return 0;
 }
 
+static void pci_enable_bridge(struct pci_dev *dev)
+{
+	int retval;
+
+	if (!dev)
+		return;
+
+	pci_enable_bridge(dev->bus->self);
+
+	if (pci_is_enabled(dev))
+		return;
+	retval = pci_enable_device(dev);
+	if (retval)
+		dev_err(&dev->dev, "Error enabling bridge (%d), continuing\n",
+			retval);
+	pci_set_master(dev);
+}
+
 static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 {
 	int err;
@@ -1165,6 +1183,8 @@ static int pci_enable_device_flags(struc
 	if (atomic_inc_return(&dev->enable_cnt) > 1)
 		return 0;		/* already enabled */
 
+	pci_enable_bridge(dev->bus->self);
+
 	/* only skip sriov related */
 	for (i = 0; i <= PCI_ROM_RESOURCE; i++)
 		if (dev->resource[i].flags & flags)
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1949,7 +1949,6 @@ unsigned int __ref pci_rescan_bus(struct
 
 	max = pci_scan_child_bus(bus);
 	pci_assign_unassigned_bus_resources(bus);
-	pci_enable_bridges(bus);
 	pci_bus_add_devices(bus);
 
 	return max;
Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1489,7 +1489,7 @@ again:
 
 	/* any device complain? */
 	if (list_empty(&fail_head))
-		goto enable_and_dump;
+		goto dump;
 
 	if (tried_times >= pci_try_num) {
 		if (enable_local == undefined)
@@ -1498,7 +1498,7 @@ again:
 			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);
-		goto enable_and_dump;
+		goto dump;
 	}
 
 	dev_printk(KERN_DEBUG, &bus->dev,
@@ -1531,10 +1531,7 @@ again:
 
 	goto again;
 
-enable_and_dump:
-	/* Depth last, update the hardware. */
-	pci_enable_bridges(bus);
-
+dump:
 	/* dump the resource on buses */
 	pci_bus_dump_resources(bus);
 }
@@ -1610,7 +1607,6 @@ enable_all:
 	if (retval)
 		dev_err(&bridge->dev, "Error reenabling bridge (%d)\n", retval);
 	pci_set_master(bridge);
-	pci_enable_bridges(parent);
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
Index: linux-2.6/drivers/pcmcia/cardbus.c
===================================================================
--- linux-2.6.orig/drivers/pcmcia/cardbus.c
+++ linux-2.6/drivers/pcmcia/cardbus.c
@@ -91,7 +91,6 @@ int __ref cb_alloc(struct pcmcia_socket
 	if (s->tune_bridge)
 		s->tune_bridge(s, bus);
 
-	pci_enable_bridges(bus);
 	pci_bus_add_devices(bus);
 
 	return 0;
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -1040,7 +1040,6 @@ int __must_check pci_bus_alloc_resource(
 						  resource_size_t,
 						  resource_size_t),
 			void *alignf_data);
-void pci_enable_bridges(struct pci_bus *bus);
 
 /* Proper probing supporting hot-pluggable devices */
 int __must_check __pci_register_driver(struct pci_driver *, struct module *,

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v3 5/5] PCI: Retry assign unassigned resources for hotadd root bus
  2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                       ` (3 preceding siblings ...)
  2013-05-07 22:17                     ` [PATCH v3 4/5] PCI: Enable pci bridge when it is needed Yinghai Lu
@ 2013-05-07 22:17                     ` Yinghai Lu
  4 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:17 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Let the root bus hot-add path use the same code as the boot path.
As no driver is loaded yet, we can retry to make sure
all PCI devices get resources allocated.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/acpi/pci_root.c |    2 +-
 drivers/pci/setup-bus.c |   11 +++++------
 include/linux/pci.h     |    1 +
 3 files changed, 7 insertions(+), 7 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1331,7 +1331,7 @@ static void pci_bus_dump_resources(struc
 	}
 }
 
-static int __init pci_bus_get_depth(struct pci_bus *bus)
+static int pci_bus_get_depth(struct pci_bus *bus)
 {
 	int depth = 0;
 	struct pci_dev *dev;
@@ -1365,7 +1365,7 @@ enum enable_type {
 	auto_enabled,
 };
 
-static enum enable_type pci_realloc_enable __initdata = undefined;
+static enum enable_type pci_realloc_enable = undefined;
 void __init pci_realloc_get_opt(char *str)
 {
 	if (!strncmp(str, "off", 3))
@@ -1373,12 +1373,12 @@ void __init pci_realloc_get_opt(char *st
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
-static bool __init pci_realloc_enabled(enum enable_type enable)
+static bool pci_realloc_enabled(enum enable_type enable)
 {
 	return enable >= user_enabled;
 }
 
-static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+static enum enable_type pci_realloc_detect(struct pci_bus *bus,
 			 enum enable_type enable_local)
 {
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
@@ -1442,8 +1442,7 @@ static unsigned long pci_bus_res_type_ma
  * second  and later try will clear small leaf bridge res
  * will stop till to the max  deepth if can not find good one
  */
-static void __init
-pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
+void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
 	LIST_HEAD(realloc_head); /* list of resources that
 					want additional resources */
Index: linux-2.6/drivers/acpi/pci_root.c
===================================================================
--- linux-2.6.orig/drivers/acpi/pci_root.c
+++ linux-2.6/drivers/acpi/pci_root.c
@@ -535,7 +535,7 @@ static int acpi_pci_root_add(struct acpi
 
 	if (system_state != SYSTEM_BOOTING) {
 		pcibios_resource_survey_bus(root->bus);
-		pci_assign_unassigned_bus_resources(root->bus);
+		pci_assign_unassigned_root_bus_resources(root->bus);
 	}
 
 	pci_bus_add_devices(root->bus);
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -1002,6 +1002,7 @@ int pci_claim_resource(struct pci_dev *,
 void pci_assign_unassigned_resources(void);
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
+void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
 void pdev_enable_device(struct pci_dev *);
 int pci_enable_resources(struct pci_dev *, int mask);
 void pci_fixup_irqs(u8 (*)(struct pci_dev *, u8 *),

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
       [not found]                     ` <5188b791.a110420a.0bea.077eSMTPIN_ADDED_BROKEN@mx.google.com>
@ 2013-05-07 22:21                       ` Yinghai Lu
       [not found]                         ` <5189c22b.45f7440a.0a88.6b75SMTPIN_ADDED_BROKEN@mx.google.com>
  2013-05-17  5:36                         ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:21 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Benjamin Herrenschmidt, linux-pci, Bjorn Helgaas

On Tue, May 7, 2013 at 1:11 AM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> Yinghai, thanks for the updated patch. It's working well with (v1 #1 and v2 #2), which
> was applied on top of Ben's patch sent for review. The kernel log is attached in case
> you want double-check.
>
Yes, that works.

I just sent v3, please test that.

That should be less invasive for v3.10, as it skips adding a failed
resource to the failed list when the resource is IORESOURCE_IO and the
bus does not have IORESOURCE_IO.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
  2013-05-07 22:17                     ` [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range Yinghai Lu
@ 2013-05-07 22:28                       ` Benjamin Herrenschmidt
  2013-05-07 22:44                         ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-07 22:28 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci, linux-kernel

On Tue, 2013-05-07 at 15:17 -0700, Yinghai Lu wrote:
> For x86 8 sockets or 32 sockets system that will have one root bus per socket,
> They may have some root buses do not have mmio non-pref range.

That seems very odd. Most device registers are non-prefetchable. I know
of no adapter today that would work in a prefetchable-only environment.

Are you sure that isn't the other way around ?

Regarding your 3.10 patches, Gavin and I will test your v3 later today
(ASAP) and will give you an Ack if they work, in which case they should
hit Linus as soon as Bjorn is comfortable with them :-)

Cheers,
Ben.

> We should not fall into retry in this case, as root bus does
> not mmio non-pref range.
> 
> We check if the root bus has mmio-nonpref range, and set bus_res_type_mask,
> and pass it to assign_resources and don't add mmio-nonpref res to failed list
> for root bus that does not have mmio-nonpref range.
> So even BIOS set wrong value to pci devices and bridges will still
> get cleared.
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> ---
>  drivers/pci/setup-bus.c |   32 ++++++++++++++++++++++++++------
>  1 file changed, 26 insertions(+), 6 deletions(-)
> 
> Index: linux-2.6/drivers/pci/setup-bus.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/setup-bus.c
> +++ linux-2.6/drivers/pci/setup-bus.c
> @@ -299,9 +299,17 @@ static void assign_requested_resources_s
>  				bool is_ioport_res_without_bus_support =
>  					 (!(bus_res_type_mask & IORESOURCE_IO)) &&
>  					 (res->flags & IORESOURCE_IO);
> +				/*
> +				 * if the failed res is mmio non-pref, but bus
> +				 * does not have a mmio non-pref range, don't add it
> +				 */
> +				bool is_mmio_nonpref_res_without_bus_support =
> +					 (!(bus_res_type_mask & IORESOURCE_MEM)) &&
> +					 ((res->flags & (IORESOURCE_MEM | IORESOURCE_PREFETCH)) == IORESOURCE_MEM);
>  
>  				if (!is_rom_res_not_enabled &&
> -				    !is_ioport_res_without_bus_support)
> +				    !is_ioport_res_without_bus_support &&
> +				    !is_mmio_nonpref_res_without_bus_support)
>  					add_to_list(fail_head,
>  						    dev_res->dev, res,
>  						    0 /* dont care */,
> @@ -1407,12 +1415,24 @@ static unsigned long pci_bus_res_type_ma
>  	int i;
>  	struct resource *r;
>  	unsigned long mask = 0;
> -	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
> -				  IORESOURCE_PREFETCH;
>  
> -	pci_bus_for_each_resource(bus, r, i)
> -		if (r)
> -			mask |= r->flags & type_mask;
> +	pci_bus_for_each_resource(bus, r, i) {
> +		if (!r)
> +			continue;
> +
> +		if (r->flags & IORESOURCE_IO) {
> +			mask |= IORESOURCE_IO;
> +			continue;
> +		}
> +		if (r->flags & IORESOURCE_PREFETCH) {
> +			mask |= IORESOURCE_PREFETCH;
> +			continue;
> +		}
> +		if ((r->flags & (IORESOURCE_MEM | IORESOURCE_PREFETCH)) == IORESOURCE_MEM) {
> +			mask |= IORESOURCE_MEM; /* nonpref only */
> +			continue;
> +		}
> +	}
>  
>  	return mask;
>  }
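
For readers following the hunk above, here is a minimal standalone sketch of
the classification it performs, outside the kernel. The struct, the helper
name and the flag values are stand-ins chosen for illustration (the real
flags are IORESOURCE_IO, IORESOURCE_MEM and IORESOURCE_PREFETCH); the point
is only that a prefetchable window no longer sets the plain "mem" bit, so
"mem" in the mask means non-prefetchable only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for IORESOURCE_IO / _MEM / _PREFETCH. */
#define RES_IO       0x0100UL
#define RES_MEM      0x0200UL
#define RES_PREFETCH 0x2000UL

/* Hypothetical, simplified root bus window: only the type flags matter. */
struct toy_window {
	unsigned long flags;	/* 0 means the slot is unused */
};

/*
 * Build the type mask of a root bus from its windows.
 * RES_MEM in the result means "has a non-prefetchable mmio window".
 */
static unsigned long toy_bus_res_type_mask(const struct toy_window *w, size_t n)
{
	unsigned long mask = 0;
	size_t i;

	assert(w != NULL || n == 0);

	for (i = 0; i < n; i++) {
		unsigned long flags = w[i].flags;

		if (!flags)
			continue;

		if (flags & RES_IO) {
			mask |= RES_IO;
			continue;
		}
		if (flags & RES_PREFETCH) {
			mask |= RES_PREFETCH;
			continue;
		}
		if ((flags & (RES_MEM | RES_PREFETCH)) == RES_MEM)
			mask |= RES_MEM;	/* nonpref only */
	}

	return mask;
}
```

With a single non-prefetchable window and no IO window (the POWER case in
this thread) the mask is just RES_MEM; with only a prefetchable window, the
RES_MEM bit stays clear.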



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
  2013-05-07 22:28                       ` Benjamin Herrenschmidt
@ 2013-05-07 22:44                         ` Yinghai Lu
  2013-05-08  1:16                           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-07 22:44 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Bjorn Helgaas, Gavin Shan, linux-pci, Linux Kernel Mailing List

On Tue, May 7, 2013 at 3:28 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2013-05-07 at 15:17 -0700, Yinghai Lu wrote:
>> For x86 8 sockets or 32 sockets system that will have one root bus per socket,
>> They may have some root buses do not have mmio non-pref range.
>
> That seems very odd. Most device registers are non-prefetchable. I know
> of no adapter today that would work in a prefetchable-only environment.

FC/10G/FCoE cards from Qlogic and Emulex can be used with 64-bit mmio-pref only.

A PCI bridge only supports 32-bit non-pref mmio, so we have to keep that under 4G,
and it can be only 2G at most.

On an x86 32-socket system, we may need to leave more mmiol to only a few
sockets, so that they work with cards that do not support 64-bit mmio pref.

We can always have enough 64-bit mmio pref above 4G.

>
> Are you sure that isn't the other way around ?
>
> Regarding your 3.10 patches, me and Gavin will test your v3 later today
> (ASAP) and will give you an Ack if they work, in which case they should
> hit Linus as soon as Bjorn is comfortable with :-)

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
  2013-05-07 22:44                         ` Yinghai Lu
@ 2013-05-08  1:16                           ` Benjamin Herrenschmidt
  2013-05-08  3:57                             ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-08  1:16 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Bjorn Helgaas, Gavin Shan, linux-pci, Linux Kernel Mailing List

On Tue, 2013-05-07 at 15:44 -0700, Yinghai Lu wrote:
> x86 32 socket system, we may need to leave more mmiol for only several
> sockets to make them work with cards that does not support mmio 64 bit
> pref.

Ok, while on POWER each root bridge has its own distinct 32-bit space
(mapped elsewhere in CPU space). So at least we don't have *that*
specific problem :-)
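
As a rough illustration of that mapping, using the root bus window from the
log in the first message of this thread (CPU 0x3d00080000000-0x3d000fffeffff,
bus 0x80000000-0xfffeffff): since the MMIO region sits inside a 4G-aligned
CPU window, the translation is just a mask. The helper names below are made
up for this sketch and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: drop the top 32 bits of a CPU address. */
static uint32_t toy_cpu_to_pci(uint64_t cpu_addr)
{
	return (uint32_t)(cpu_addr & 0xffffffffULL);
}

/* Hypothetical inverse: place a 32-bit PCI address into a CPU window. */
static uint64_t toy_pci_to_cpu(uint64_t window_base, uint32_t pci_addr)
{
	/* Only valid when the CPU window is 4G aligned. */
	assert((window_base & 0xffffffffULL) == 0);

	return window_base | pci_addr;
}
```

For example, toy_cpu_to_pci(0x3d00080000000ULL) gives 0x80000000, the start
of the bus address range printed in that log.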

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
       [not found]                         ` <5189c22b.45f7440a.0a88.6b75SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2013-05-08  3:42                           ` Yinghai Lu
  0 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-08  3:42 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Benjamin Herrenschmidt, linux-pci, Bjorn Helgaas

On Tue, May 7, 2013 at 8:10 PM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> On Wed, May 08, 2013 at 10:38:00AM +0800, Gavin Shan wrote:
>>On Tue, May 07, 2013 at 03:21:42PM -0700, Yinghai Lu wrote:
>>>On Tue, May 7, 2013 at 1:11 AM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
>>>> Yinghai, thanks for the updated patch. It's working well with (v1 #1 and v2 #2), which
>>>> was applied on top of Ben's patch sent for review. The kernel log is attached in case
>>>> you want double-check.
>>>>
>>>Yes, that works.
>>>
>>>I just sent v3, please test that.
>>>
>>>That should be less invasive for v3.10. As it will skip adding failed
>>>resource to failed list when the resource is IORESOURCE_IO, and bus
>>>does not have IORESOURCE_IO.
>>>
>>
>>Yinghai, with all patches (5) applied on top of Ben's patchset for PowerPC.
>>I still see failure to assign IO ports, which shouldn't be seen. Please
>>check the attached kernel log.
>>
>
> After digging for more, here's what happened: pci_bridge_check_ranges() checks
> the specific bridge supports IO ports and set IORESOURCE_IO explicitly if that's
> the case. Then in pbus_size_io(), we always have valid IO window from that specific
> PCI bus and the corresponding resource of the P2P bridge will be added to "realloc_head".
> That eventually cause assignment of IO ports and it's to failure because the root
> PCI bus doesn't have IO port range.

That is intentional.

pci 0001:02:01.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:08.0: BAR 7: can't assign io (size 0x1000)
pci 0001:02:09.0: BAR 7: can't assign io (size 0x1000)

Those are just printouts; the resources do not get inserted into the failed list,
so they will not affect mmio-nonpref/mmio-pref allocation.

That makes sure all related bridge io port BARs get disabled.
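
To spell out the rule in isolation: a resource that fails to assign is still
reported, but it only goes onto the failed list (and so only triggers a
retry) when the root bus could have satisfied it at all. The following is a
simplified sketch with made-up names and stand-in flag values; the ROM check
done by the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for IORESOURCE_IO and IORESOURCE_MEM. */
#define RES_IO  0x0100UL
#define RES_MEM 0x0200UL

/*
 * Hypothetical helper: should a resource that failed to assign be put on
 * the failed list, given the type mask of the root bus it lives under?
 */
static bool toy_should_add_to_fail_list(unsigned long bus_type_mask,
					unsigned long res_flags)
{
	bool io_without_bus_support;

	assert(res_flags != 0);	/* an unused slot has no type to check */

	io_without_bus_support = !(bus_type_mask & RES_IO) &&
				 (res_flags & RES_IO);

	return !io_without_bus_support;
}
```

On a root bus whose mask is only RES_MEM, a failed io window returns false
here, so it does not force another pass over the mmio windows.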

Apart from those printouts, do you find any function that does not work?

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
  2013-05-08  1:16                           ` Benjamin Herrenschmidt
@ 2013-05-08  3:57                             ` Yinghai Lu
  0 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-08  3:57 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Bjorn Helgaas, Gavin Shan, linux-pci, Linux Kernel Mailing List

On Tue, May 7, 2013 at 6:16 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2013-05-07 at 15:44 -0700, Yinghai Lu wrote:
>> x86 32 socket system, we may need to leave more mmiol for only several
>> sockets to make them work with cards that does not support mmio 64 bit
>> pref.
>
> Ok, while on POWER each root bridge has its own distinct 32-bit space
> (mapped elsewhere in CPU space). So at least we don't have *that*
> specific problem :-)

Hope x86 could support that.

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-07 22:21                       ` Yinghai Lu
       [not found]                         ` <5189c22b.45f7440a.0a88.6b75SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2013-05-17  5:36                         ` Benjamin Herrenschmidt
  2013-05-21 17:28                           ` Bjorn Helgaas
  1 sibling, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-17  5:36 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Gavin Shan, linux-pci, Bjorn Helgaas

On Tue, 2013-05-07 at 15:21 -0700, Yinghai Lu wrote:
> I just sent v3, please test that.
> 
> That should be less invasive for v3.10. As it will skip adding failed
> resource to failed list when the resource is IORESOURCE_IO, and bus
> does not have IORESOURCE_IO.

Any news about merging this ?

Thanks !

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-17  5:36                         ` Benjamin Herrenschmidt
@ 2013-05-21 17:28                           ` Bjorn Helgaas
  2013-05-21 17:39                             ` Yinghai Lu
  2013-05-21 22:01                             ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-21 17:28 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Yinghai Lu, Gavin Shan, linux-pci

On Thu, May 16, 2013 at 11:36 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2013-05-07 at 15:21 -0700, Yinghai Lu wrote:
>> I just sent v3, please test that.
>>
>> That should be less invasive for v3.10. As it will skip adding failed
>> resource to failed list when the resource is IORESOURCE_IO, and bus
>> does not have IORESOURCE_IO.
>
> Any news about merging this ?

I'm starting to look at this.  Do you think this is v3.10 material?
My first impression is that it seems pretty large for that.

Bjorn

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-21 17:28                           ` Bjorn Helgaas
@ 2013-05-21 17:39                             ` Yinghai Lu
  2013-05-21 22:01                             ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-21 17:39 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci

On Tue, May 21, 2013 at 10:28 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Thu, May 16, 2013 at 11:36 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>> On Tue, 2013-05-07 at 15:21 -0700, Yinghai Lu wrote:
>>> I just sent v3, please test that.
>>>
>>> That should be less invasive for v3.10. As it will skip adding failed
>>> resource to failed list when the resource is IORESOURCE_IO, and bus
>>> does not have IORESOURCE_IO.
>>
>> Any news about merging this ?
>
> I'm starting to look at this.  Do you think this is v3.10 material?
> My first impression is that it seems pretty large for that.

Maybe we could put the first two in v3.10, and the other three in v3.11.

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus
  2013-05-06 23:15                   ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
  2013-05-06 23:15                     ` [PATCH 2/2] PCI: Skip IORESOURCE_IO size and allocation for root bus without ioport range Yinghai Lu
  2013-05-07  0:50                     ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Benjamin Herrenschmidt
@ 2013-05-21 20:41                     ` Bjorn Helgaas
  2 siblings, 0 replies; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-21 20:41 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci, linux-kernel

On Mon, May 06, 2013 at 04:15:29PM -0700, Yinghai Lu wrote:
> BenH reported that there is some assign unassigned resource problem
> in powerpc.
> 
> It turns out after
> | commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
> | Date:   Thu Feb 23 19:23:29 2012 -0800
> |
> |    PCI: Retry on IORESOURCE_IO type allocations
> 
> even the root bus does not have io port range, it will keep retrying
> to realloc with mmio.
> 
> Current retry logic is : try with must+optional at first, and if
> it fails will try must then try to extend must with optional.
> That will fail as mmio-non-pref and mmio-pref for bridge will
> be next to each other. So we have no chance to extend mmio-non-pref.
> 
> We should not fall into retry in this case, as root bus does
> not io port range.
> 
> Before we do that we need to split pci_assign_unassiged_resource
> to every root bus, so we can stop early for root bus without ioport
> range, and still continue to retry on buses that do have ioport range.
> 
> This will be become more often when we have x86 8 sockets or 32 sockets
> system, and those system will have one root bus per socket.
> They will have some root buses do not have ioport range.
> 
> For the retry failing, we could allocate mmio-non-pref bottom-up
> and mmio-pref will be top-down, but that could not be material for v3.10.

If I understand correctly, this particular patch makes no functional
changes, so the changelog above should be saved for the patches that *do*
actually fix problems.

> 
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> ---
>  drivers/pci/setup-bus.c |  101 +++++++++++++++++++++++-------------------------
>  1 file changed, 49 insertions(+), 52 deletions(-)
> 
> Index: linux-2.6/drivers/pci/setup-bus.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/setup-bus.c
> +++ linux-2.6/drivers/pci/setup-bus.c
> @@ -1315,21 +1315,6 @@ static int __init pci_bus_get_depth(stru
>  
>  	return depth;
>  }
> -static int __init pci_get_max_depth(void)
> -{
> -	int depth = 0;
> -	struct pci_bus *bus;
> -
> -	list_for_each_entry(bus, &pci_root_buses, node) {
> -		int ret;
> -
> -		ret = pci_bus_get_depth(bus);
> -		if (ret > depth)
> -			depth = ret;
> -	}
> -
> -	return depth;
> -}
>  
>  /*
>   * -1: undefined, will auto detect later
> @@ -1354,34 +1339,41 @@ void __init pci_realloc_get_opt(char *st
>  	else if (!strncmp(str, "on", 2))
>  		pci_realloc_enable = user_enabled;
>  }
> -static bool __init pci_realloc_enabled(void)
> +static bool __init pci_realloc_enabled(enum enable_type enable)
>  {
> -	return pci_realloc_enable >= user_enabled;
> +	return enable >= user_enabled;
>  }
>  
> -static void __init pci_realloc_detect(void)
> +static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
> +			 enum enable_type enable_local)
>  {
>  #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
> -	struct pci_dev *dev = NULL;
> +	struct pci_dev *dev;
>  
> -	if (pci_realloc_enable != undefined)
> -		return;
> +	if (enable_local != undefined)
> +		return enable_local;
>  
> -	for_each_pci_dev(dev) {
> +	list_for_each_entry(dev, &bus->devices, bus_list) {
>  		int i;
>  
>  		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
>  			struct resource *r = &dev->resource[i];
>  
>  			/* Not assigned, or rejected by kernel ? */
> -			if (r->flags && !r->start) {
> -				pci_realloc_enable = auto_enabled;
> -
> -				return;
> -			}
> +			if (r->flags && !r->start)
> +				return auto_enabled;
>  		}
>  	}
> +
> +	list_for_each_entry(dev, &bus->devices, bus_list) {
> +		struct pci_bus *child = dev->subordinate;
> +
> +		if (child &&
> +		    pci_realloc_detect(child, enable_local) == auto_enabled)
> +			return auto_enabled;
> +	}

This uses recursion and basically does the same thing as pci_walk_bus().
I think it will be clearer if you make it look something like this:

        static int count_unassigned_resources(struct pci_dev *dev, void *data)
        {
          int *count = data;
          int i;

          for (i = PCI_IOV_RESOURCES; ...) {
            struct resource *r = &dev->resource[i];

            if (r->flags && !r->start)
              (*count)++;
          }

          return 0;
        }

        static pci_realloc_detect(struct pci_bus *bus, ...  enable_local)
        {
	  int unassigned;

          if (enable_local != undefined)
            return enable_local;

	  unassigned = 0;
	  pci_walk_bus(bus, count_unassigned_resources, &unassigned);
	  if (unassigned)
	    return auto_enabled;

          return enable_local;
        }
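
A self-contained model of this walk-and-count pattern, for anyone who wants
to try it outside the kernel. Everything prefixed toy_ is invented for the
sketch: toy_walk_bus() is a recursive stand-in for pci_walk_bus(), and the
structs keep only the fields the callback needs. Note that the counter has
to be bumped with (*count)++, since *count++ would advance the pointer
instead of the value it points to.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_RES 2	/* illustrative: resources tracked per device */

struct toy_res {
	unsigned long flags;	/* 0: resource not present */
	unsigned long start;	/* 0: not assigned */
};

struct toy_bus;

struct toy_dev {
	struct toy_res res[TOY_NUM_RES];
	struct toy_bus *subordinate;	/* child bus when dev is a bridge */
	struct toy_dev *next;		/* next device on the same bus */
};

struct toy_bus {
	struct toy_dev *devices;
};

typedef int (*toy_walk_cb)(struct toy_dev *dev, void *data);

/* Visit every device on and below "top"; stop early if cb returns non-zero. */
static int toy_walk_bus(struct toy_bus *top, toy_walk_cb cb, void *data)
{
	struct toy_dev *dev;

	for (dev = top->devices; dev; dev = dev->next) {
		int ret = cb(dev, data);

		if (ret)
			return ret;
		if (dev->subordinate) {
			ret = toy_walk_bus(dev->subordinate, cb, data);
			if (ret)
				return ret;
		}
	}

	return 0;
}

/* Callback: count resources that are present but have no address yet. */
static int toy_count_unassigned(struct toy_dev *dev, void *data)
{
	int *count = data;
	int i;

	assert(count != NULL);

	for (i = 0; i < TOY_NUM_RES; i++) {
		const struct toy_res *r = &dev->res[i];

		if (r->flags && !r->start)
			(*count)++;
	}

	return 0;
}
```

A caller zeroes an int, passes its address as "data", and treats a non-zero
count after the walk as "something below this bus is still unassigned".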


>  #endif
> +	return enable_local;
>  }
>  
>  /*
> @@ -1389,10 +1381,9 @@ static void __init pci_realloc_detect(vo
>   * second  and later try will clear small leaf bridge res
>   * will stop till to the max  deepth if can not find good one
>   */
> -void __init
> -pci_assign_unassigned_resources(void)
> +static void __init
> +pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
>  {
> -	struct pci_bus *bus;
>  	LIST_HEAD(realloc_head); /* list of resources that
>  					want additional resources */
>  	struct list_head *add_list = NULL;
> @@ -1403,15 +1394,17 @@ pci_assign_unassigned_resources(void)
>  	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
>  				  IORESOURCE_PREFETCH;
>  	int pci_try_num = 1;
> +	enum enable_type enable_local;
>  
>  	/* don't realloc if asked to do so */
> -	pci_realloc_detect();
> -	if (pci_realloc_enabled()) {
> -		int max_depth = pci_get_max_depth();
> +	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
> +	if (pci_realloc_enabled(enable_local)) {
> +		int max_depth = pci_bus_get_depth(bus);
>  
>  		pci_try_num = max_depth + 1;
> -		printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n",
> -			 max_depth, pci_try_num);
> +		dev_printk(KERN_DEBUG, &bus->dev,
> +			   "max bus depth: %d pci_try_num: %d\n",
> +			   max_depth, pci_try_num);
>  	}
>  
>  again:
> @@ -1423,12 +1416,10 @@ again:
>  		add_list = &realloc_head;
>  	/* Depth first, calculate sizes and alignments of all
>  	   subordinate buses. */
> -	list_for_each_entry(bus, &pci_root_buses, node)
> -		__pci_bus_size_bridges(bus, add_list);
> +	__pci_bus_size_bridges(bus, add_list);
>  
>  	/* Depth last, allocate resources and update the hardware. */
> -	list_for_each_entry(bus, &pci_root_buses, node)
> -		__pci_bus_assign_resources(bus, add_list, &fail_head);
> +	__pci_bus_assign_resources(bus, add_list, &fail_head);
>  	if (add_list)
>  		BUG_ON(!list_empty(add_list));
>  	tried_times++;
> @@ -1438,17 +1429,17 @@ again:
>  		goto enable_and_dump;
>  
>  	if (tried_times >= pci_try_num) {
> -		if (pci_realloc_enable == undefined)
> -			printk(KERN_INFO "Some PCI device resources are unassigned, try booting with pci=realloc\n");
> -		else if (pci_realloc_enable == auto_enabled)
> -			printk(KERN_INFO "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
> +		if (enable_local == undefined)
> +			dev_info(&bus->dev, "Some PCI device resources are unassigned, try booting with pci=realloc\n");
> +		else if (enable_local == auto_enabled)
> +			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");

I think you can add enable_local and the pci_realloc_enabled() parameter
in a separate patch.  That will remove distractions from the main patch.

>  
>  		free_list(&fail_head);
>  		goto enable_and_dump;
>  	}
>  
> -	printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n",
> -			 tried_times + 1);
> +	dev_printk(KERN_DEBUG, &bus->dev,
> +		   "No. %d try to assign unassigned res\n", tried_times + 1);
>  
>  	/* third times and later will not check if it is leaf */
>  	if ((tried_times + 1) > 2)
> @@ -1458,12 +1449,11 @@ again:
>  	 * Try to release leaf bridge's resources that doesn't fit resource of
>  	 * child device under that bridge
>  	 */
> -	list_for_each_entry(fail_res, &fail_head, list) {
> -		bus = fail_res->dev->bus;
> -		pci_bus_release_bridge_resources(bus,
> +	list_for_each_entry(fail_res, &fail_head, list)
> +		pci_bus_release_bridge_resources(fail_res->dev->bus,

This change is gratuitous and distracting.  Please move it to a
separate patch.

>  						 fail_res->flags & type_mask,
>  						 rel_type);
> -	}
> +
>  	/* restore size and flags */
>  	list_for_each_entry(fail_res, &fail_head, list) {
>  		struct resource *res = fail_res->res;
> @@ -1480,12 +1470,19 @@ again:
>  
>  enable_and_dump:
>  	/* Depth last, update the hardware. */
> -	list_for_each_entry(bus, &pci_root_buses, node)
> -		pci_enable_bridges(bus);
> +	pci_enable_bridges(bus);
>  
>  	/* dump the resource on buses */
> +	pci_bus_dump_resources(bus);
> +}
> +
> +void __init
> +pci_assign_unassigned_resources(void)
> +{
> +	struct pci_bus *bus;
> +
>  	list_for_each_entry(bus, &pci_root_buses, node)
> -		pci_bus_dump_resources(bus);
> +		pci_assign_unassigned_root_bus_resources(bus);
>  }
>  
>  void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)

I think this should be split up into something like the following patches
so we can see what's going on here:

    - Remove "bus" temporary when calling pci_bus_release_bridge_resources()

    - Add pci_realloc_enabled() parameter and enable_local

    - Add pci_realloc_detect() parameters.  The "bus" parameter is 
      ignored for now.

    - Change pci_realloc_detect() to iterate over pci_root_buses and
      call pci_walk_bus() to find any unassigned resources instead of
      using for_each_pci_dev().

    - Split pci_assign_unassigned_resources() into iterating over
      pci_root_buses and calling pci_assign_unassigned_root_bus_resources(bus).
      Change pci_realloc_detect() to only walk the supplied bus instead
      of everything in pci_root_buses.  This will be basically just removing
      list_for_each_entry(bus, &pci_root_buses, node) loops.

Bjorn

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-21 17:28                           ` Bjorn Helgaas
  2013-05-21 17:39                             ` Yinghai Lu
@ 2013-05-21 22:01                             ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-21 22:01 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, Gavin Shan, linux-pci

On Tue, 2013-05-21 at 11:28 -0600, Bjorn Helgaas wrote:
> On Thu, May 16, 2013 at 11:36 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Tue, 2013-05-07 at 15:21 -0700, Yinghai Lu wrote:
> >> I just sent v3, please test that.
> >>
> > >> That should be less invasive for v3.10. As it will skip adding failed
> >> resource to failed list when the resource is IORESOURCE_IO, and bus
> >> does not have IORESOURCE_IO.
> >
> > Any news about merging this ?
> 
> I'm starting to look at this.  Do you think this is v3.10 material?
> My first impression is that it seems pretty large for that.

Well, it's problematic for us: assignment fails here and there on
some of our new systems with the new multi-pass code for memory, due
to the lack of IO, which is nasty at best.

We were hoping we could get 3.10 to support those machines since that's
apparently what a well-known enterprise distro will pick up (and trying
to get them to pick up backports can be complete hell).

Cheers,
Ben.





^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource
  2013-05-06 10:48                 ` Benjamin Herrenschmidt
                                     ` (2 preceding siblings ...)
  2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
@ 2013-05-22  6:38                   ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 1/8] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
                                       ` (7 more replies)
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
  4 siblings, 8 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even if the root bus does not have an io port range, it will keep
retrying to realloc with mmio.

The current retry logic is: try with must+optional first, and if
that fails, try must only and then try to extend must with optional.
That will fail because mmio-non-pref and mmio-pref for a bridge will
be next to each other, so we have no chance to extend mmio-non-pref.

This will happen more often on x86 8 socket or 32 socket
systems, which have one root bus per socket.
Some of their root buses will not have an ioport range.

We should not fall into retry in this case, as the root bus does
not have an io port range.

We check whether the root bus has an ioport range, set bus_res_type_mask
accordingly, pass it to assign_resources, and don't add ioport res to the
failed list for a root bus that does not have an ioport range.
So even if the BIOS set wrong values for pci devices and bridges, they
will still get cleared.

Then also check mmio-nonpref resource for root buses.
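
For readers following along, the bus_res_type_mask idea described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code from the series: the RES_* flag values, struct res, root_bus_type_mask() and should_add_to_fail_list() are hypothetical stand-ins for the kernel's IORESOURCE_* flags, struct resource, and the logic in patch 5/8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag values for this sketch; the kernel's real flags differ. */
#define RES_IO       0x1UL
#define RES_MEM      0x2UL
#define RES_PREFETCH 0x4UL

/* Hypothetical, simplified resource: only the type flags matter here. */
struct res {
	unsigned long flags;
};

/* OR together the types of every window the root bus offers. */
unsigned long root_bus_type_mask(const struct res *windows, size_t n)
{
	unsigned long mask = 0;
	size_t i;

	assert(windows != NULL || n == 0);
	for (i = 0; i < n; i++)
		mask |= windows[i].flags & (RES_IO | RES_MEM | RES_PREFETCH);

	return mask;
}

/*
 * A failed io port resource should only trigger a retry when the
 * root bus has an io port window at all.
 */
bool should_add_to_fail_list(const struct res *failed, unsigned long bus_mask)
{
	assert(failed != NULL);
	if ((failed->flags & RES_IO) && !(bus_mask & RES_IO))
		return false;

	return true;
}
```

With a root bus that only exposes an MMIO window, a failed io port BAR is dropped instead of being queued for another pass, while a failed MMIO BAR is still queued.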

First five are for 3.10, and others are targeted to 3.11

-v4: split first patch into 4 patches per Bjorn.

 PCI: Don't use temp bus for pci_bus_release_bridge_resources
 PCI: Use pci_walk_bus to detect unassigned resources
 PCI: Introduce enable_local to prepare per root bus handling
 PCI: Split pci_assign_unassigned_resources to per root bus
 PCI: Skip IORESOURCE_IO allocation for root bus without ioport range
 PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
 PCI: Enable pci bridge when it is needed
 PCI: Retry assign unassigned resources for hotadd root bus

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 1/8] PCI: Don't use temp bus for pci_bus_release_bridge_resources
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 2/8] PCI: Use pci_walk_bus to detect unassigned resources Yinghai Lu
                                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Later, bus can no longer be used as a temp variable once we change
assign unassigned resources to per root bus handling.

Per Bjorn, separated out into a different patch.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1458,12 +1458,11 @@ again:
 	 * Try to release leaf bridge's resources that doesn't fit resource of
 	 * child device under that bridge
 	 */
-	list_for_each_entry(fail_res, &fail_head, list) {
-		bus = fail_res->dev->bus;
-		pci_bus_release_bridge_resources(bus,
+	list_for_each_entry(fail_res, &fail_head, list)
+		pci_bus_release_bridge_resources(fail_res->dev->bus,
 						 fail_res->flags & type_mask,
 						 rel_type);
-	}
+
 	/* restore size and flags */
 	list_for_each_entry(fail_res, &fail_head, list) {
 		struct resource *res = fail_res->res;

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 2/8] PCI: Use pci_walk_bus to detect unassigned resources
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 1/8] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 3/8] PCI: Introduce enable_local to prepare per root bus handling Yinghai Lu
                                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Per Bjorn, use pci_walk_bus instead of for_each_pci_dev or
calling pci_realloc_detect() recursively.

Also separate it into a different patch.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   46 +++++++++++++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 15 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1359,30 +1359,46 @@ static bool __init pci_realloc_enabled(v
 	return pci_realloc_enable >= user_enabled;
 }
 
-static void __init pci_realloc_detect(void)
-{
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
-	struct pci_dev *dev = NULL;
+static int __init check_unassigned_resources(struct pci_dev *dev, void *data)
+{
+	int i;
+	int *unassigned = data;
 
-	if (pci_realloc_enable != undefined)
-		return;
+	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
+		struct resource *r = &dev->resource[i];
 
-	for_each_pci_dev(dev) {
-		int i;
+		/* Not assigned, or rejected by kernel ? */
+		if (r->flags && !r->start) {
+			(*unassigned)++;
+			return 1; /* return early from pci_walk_bus */
+		}
+	}
 
-		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
-			struct resource *r = &dev->resource[i];
+	return 0;
+}
 
-			/* Not assigned, or rejected by kernel ? */
-			if (r->flags && !r->start) {
-				pci_realloc_enable = auto_enabled;
+static void  __init pci_realloc_detect(void)
+{
+	int unassigned = 0;
+	struct pci_bus *bus;
 
-				return;
-			}
+	if (pci_realloc_enable != undefined)
+		return;
+
+	list_for_each_entry(bus, &pci_root_buses, node) {
+		pci_walk_bus(bus, check_unassigned_resources, &unassigned);
+		if (unassigned) {
+			pci_realloc_enable = auto_enabled;
+			return;
 		}
 	}
-#endif
 }
+#else
+static void __init pci_realloc_detect(void)
+{
+}
+#endif
 
 /*
  * first try will not touch pci bridge res

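The callback pattern in this patch relies on the walker stopping as soon as the callback returns nonzero. A minimal userspace sketch of that pattern follows; struct dev, walk_devices() and count_unassigned() are hypothetical names for illustration, and a flat linked list stands in for the bus hierarchy that pci_walk_bus() actually traverses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with a single resource, linked in a flat list. */
struct dev {
	unsigned long flags;	/* nonzero: the resource exists */
	unsigned long start;	/* zero: not assigned yet */
	struct dev *next;
};

typedef int (*walk_cb)(struct dev *dev, void *data);

/* Visit each device; stop as soon as the callback returns nonzero. */
void walk_devices(struct dev *head, walk_cb cb, void *data)
{
	struct dev *d;

	assert(cb != NULL);
	for (d = head; d; d = d->next)
		if (cb(d, data))
			break;
}

/* Same test as in the patch: resource present but without an address. */
int count_unassigned(struct dev *dev, void *data)
{
	int *unassigned = data;

	if (dev->flags && !dev->start) {
		(*unassigned)++;
		return 1;	/* one hit is enough, end the walk early */
	}

	return 0;
}
```

Because the walk ends at the first hit, the counter is effectively a boolean: it never exceeds 1 for a single walk, no matter how many devices are unassigned.
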
^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 3/8] PCI: Introduce enable_local to prepare per root bus handling
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 1/8] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 2/8] PCI: Use pci_walk_bus to detect unassigned resources Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 4/8] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
                                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Add enable_local to prepare for assigning unassigned resources
per root bus.

Per Bjorn, separate it into a different patch.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   40 +++++++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 19 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1354,9 +1354,9 @@ void __init pci_realloc_get_opt(char *st
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
-static bool __init pci_realloc_enabled(void)
+static bool __init pci_realloc_enabled(enum enable_type enable)
 {
-	return pci_realloc_enable >= user_enabled;
+	return enable >= user_enabled;
 }
 
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
@@ -1378,25 +1378,25 @@ static int __init check_unassigned_resou
 	return 0;
 }
 
-static void  __init pci_realloc_detect(void)
+static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+			 enum enable_type enable_local)
 {
 	int unassigned = 0;
-	struct pci_bus *bus;
 
-	if (pci_realloc_enable != undefined)
-		return;
+	if (enable_local != undefined)
+		return enable_local;
 
-	list_for_each_entry(bus, &pci_root_buses, node) {
-		pci_walk_bus(bus, check_unassigned_resources, &unassigned);
-		if (unassigned) {
-			pci_realloc_enable = auto_enabled;
-			return;
-		}
-	}
+	pci_walk_bus(bus, check_unassigned_resources, &unassigned);
+	if (unassigned)
+		return auto_enabled;
+
+	return enable_local;
 }
 #else
-static void __init pci_realloc_detect(void)
+static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+			 enum enable_type enable_local)
 {
+	return enable_local;
 }
 #endif
 
@@ -1419,10 +1419,12 @@ pci_assign_unassigned_resources(void)
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
+	enum enable_type enable_local = pci_realloc_enable;
+
+	list_for_each_entry(bus, &pci_root_buses, node)
+		enable_local = pci_realloc_detect(bus, enable_local);
 
-	/* don't realloc if asked to do so */
-	pci_realloc_detect();
-	if (pci_realloc_enabled()) {
+	if (pci_realloc_enabled(enable_local)) {
 		int max_depth = pci_get_max_depth();
 
 		pci_try_num = max_depth + 1;
@@ -1454,9 +1456,9 @@ again:
 		goto enable_and_dump;
 
 	if (tried_times >= pci_try_num) {
-		if (pci_realloc_enable == undefined)
+		if (enable_local == undefined)
 			printk(KERN_INFO "Some PCI device resources are unassigned, try booting with pci=realloc\n");
-		else if (pci_realloc_enable == auto_enabled)
+		else if (enable_local == auto_enabled)
 			printk(KERN_INFO "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);

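The effect of threading enable_local through pci_realloc_detect() can be sketched as a fold over the root buses. This is a simplified userspace illustration: the ordering of enum enable_type below is an assumption chosen so that the `>= user_enabled` test in the patch makes sense, and detect(), fold_buses() and realloc_enabled() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed ordering: everything from user_enabled upward means "enabled". */
enum enable_type {
	undefined = -1,
	user_disabled,
	auto_disabled,
	user_enabled,
	auto_enabled,
};

/* Only auto-detect while nobody has decided yet. */
enum enable_type detect(int bus_has_unassigned, enum enable_type enable_local)
{
	if (enable_local != undefined)
		return enable_local;

	return bus_has_unassigned ? auto_enabled : enable_local;
}

/* Carry the local state across all root buses, as the loop in the patch does. */
enum enable_type fold_buses(const int *has_unassigned, size_t n,
			    enum enable_type initial)
{
	enum enable_type enable_local = initial;
	size_t i;

	assert(has_unassigned != NULL || n == 0);
	for (i = 0; i < n; i++)
		enable_local = detect(has_unassigned[i], enable_local);

	return enable_local;
}

int realloc_enabled(enum enable_type enable)
{
	return enable >= user_enabled;
}
```

The global setting is only read once as the initial value; an explicit user choice is never overridden by detection.
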
^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 4/8] PCI: Split pci_assign_unassigned_resources to per root bus
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                       ` (2 preceding siblings ...)
  2013-05-22  6:38                     ` [PATCH v4 3/8] PCI: Introduce enable_local to prepare per root bus handling Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 5/8] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range Yinghai Lu
                                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

We need to split pci_assign_unassigned_resources so that it handles each
root bus separately, so in the next patch we can stop early for a root bus
without an ioport range, and still continue to retry on buses that do have
an ioport range.

Also, later we could let root bus hot add and the boot path use the same
code.

-v2: separate enable_local and pci_bus_release_bridge_resources into
     other patches, as requested by Bjorn.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   62 +++++++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 37 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1315,21 +1315,6 @@ static int __init pci_bus_get_depth(stru
 
 	return depth;
 }
-static int __init pci_get_max_depth(void)
-{
-	int depth = 0;
-	struct pci_bus *bus;
-
-	list_for_each_entry(bus, &pci_root_buses, node) {
-		int ret;
-
-		ret = pci_bus_get_depth(bus);
-		if (ret > depth)
-			depth = ret;
-	}
-
-	return depth;
-}
 
 /*
  * -1: undefined, will auto detect later
@@ -1405,10 +1390,9 @@ static enum enable_type __init pci_reall
  * second  and later try will clear small leaf bridge res
  * will stop till to the max  deepth if can not find good one
  */
-void __init
-pci_assign_unassigned_resources(void)
+static void __init
+pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
-	struct pci_bus *bus;
 	LIST_HEAD(realloc_head); /* list of resources that
 					want additional resources */
 	struct list_head *add_list = NULL;
@@ -1419,17 +1403,17 @@ pci_assign_unassigned_resources(void)
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
-	enum enable_type enable_local = pci_realloc_enable;
-
-	list_for_each_entry(bus, &pci_root_buses, node)
-		enable_local = pci_realloc_detect(bus, enable_local);
+	enum enable_type enable_local;
 
+	/* don't realloc if asked to do so */
+	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
 	if (pci_realloc_enabled(enable_local)) {
-		int max_depth = pci_get_max_depth();
+		int max_depth = pci_bus_get_depth(bus);
 
 		pci_try_num = max_depth + 1;
-		printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n",
-			 max_depth, pci_try_num);
+		dev_printk(KERN_DEBUG, &bus->dev,
+			   "max bus depth: %d pci_try_num: %d\n",
+			   max_depth, pci_try_num);
 	}
 
 again:
@@ -1441,12 +1425,10 @@ again:
 		add_list = &realloc_head;
 	/* Depth first, calculate sizes and alignments of all
 	   subordinate buses. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_size_bridges(bus, add_list);
+	__pci_bus_size_bridges(bus, add_list);
 
 	/* Depth last, allocate resources and update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_assign_resources(bus, add_list, &fail_head);
+	__pci_bus_assign_resources(bus, add_list, &fail_head);
 	if (add_list)
 		BUG_ON(!list_empty(add_list));
 	tried_times++;
@@ -1457,16 +1439,16 @@ again:
 
 	if (tried_times >= pci_try_num) {
 		if (enable_local == undefined)
-			printk(KERN_INFO "Some PCI device resources are unassigned, try booting with pci=realloc\n");
+			dev_info(&bus->dev, "Some PCI device resources are unassigned, try booting with pci=realloc\n");
 		else if (enable_local == auto_enabled)
-			printk(KERN_INFO "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
+			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);
 		goto enable_and_dump;
 	}
 
-	printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n",
-			 tried_times + 1);
+	dev_printk(KERN_DEBUG, &bus->dev,
+		   "No. %d try to assign unassigned res\n", tried_times + 1);
 
 	/* third times and later will not check if it is leaf */
 	if ((tried_times + 1) > 2)
@@ -1497,12 +1479,18 @@ again:
 
 enable_and_dump:
 	/* Depth last, update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_enable_bridges(bus);
+	pci_enable_bridges(bus);
 
 	/* dump the resource on buses */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_bus_dump_resources(bus);
+	pci_bus_dump_resources(bus);
+}
+
+void __init pci_assign_unassigned_resources(void)
+{
+	struct pci_bus *root_bus;
+
+	list_for_each_entry(root_bus, &pci_root_buses, node)
+		pci_assign_unassigned_root_bus_resources(root_bus);
 }
 
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)

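One consequence of the split is that the retry budget (pci_try_num = depth + 1) is now computed per root bus rather than from the deepest bus in the system. A rough userspace sketch, assuming a simplified tree where struct bus_node, bus_depth() and tries_for_root() are hypothetical stand-ins for struct pci_bus, pci_bus_get_depth() and the pci_try_num calculation:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bus tree: child buses are chained through next_sibling. */
struct bus_node {
	struct bus_node *first_child;
	struct bus_node *next_sibling;
};

/* Number of bridge levels below this bus; a bus without children has depth 0. */
int bus_depth(const struct bus_node *bus)
{
	const struct bus_node *child;
	int depth = 0;

	assert(bus != NULL);
	for (child = bus->first_child; child; child = child->next_sibling) {
		int ret = bus_depth(child);

		if (ret + 1 > depth)
			depth = ret + 1;
	}

	return depth;
}

/* Without realloc there is a single pass; with it, one pass per level plus one. */
int tries_for_root(const struct bus_node *root, int realloc_enabled)
{
	return realloc_enabled ? bus_depth(root) + 1 : 1;
}
```

A shallow root bus therefore no longer inherits the larger retry count of a deep hierarchy under a different root bus.
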
^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 5/8] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                       ` (3 preceding siblings ...)
  2013-05-22  6:38                     ` [PATCH v4 4/8] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 6/8] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range Yinghai Lu
                                       ` (2 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even if the root bus does not have an io port range, it will keep
retrying to realloc with mmio.

The current retry logic is: try with must+optional first, and if
that fails, try must only and then try to extend must with optional.
That will fail because mmio-non-pref and mmio-pref for a bridge will
be next to each other, so we have no chance to extend mmio-non-pref.

This will happen more often on x86 8 socket or 32 socket
systems, which have one root bus per socket.
Some of their root buses will not have an ioport range.

We should not fall into retry in this case, as the root bus does
not have an io port range.

We check whether the root bus has an ioport range, set bus_res_type_mask
accordingly, pass it to assign_resources, and don't add ioport res to the
failed list for a root bus that does not have an ioport range.
So even if the BIOS set wrong values for pci devices and bridges, they
will still get cleared.

To fix the failing retry itself, we could allocate mmio-non-pref
bottom-up and mmio-pref top-down, but that is not easy and would not be
v3.10 material.

-v2: remove wrong __init with pci_bus_res_type_mask()
     don't check bridge size/flags, and clear bus io resources.
-v3: change to skip adding to failed_list instead.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   86 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 66 insertions(+), 20 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -272,7 +272,8 @@ out:
  * requests that could not satisfied to the failed_list.
  */
 static void assign_requested_resources_sorted(struct list_head *head,
-				 struct list_head *fail_head)
+				 struct list_head *fail_head,
+				 unsigned long bus_res_type_mask)
 {
 	struct resource *res;
 	struct pci_dev_resource *dev_res;
@@ -288,8 +289,19 @@ static void assign_requested_resources_s
 				 * if the failed res is for ROM BAR, and it will
 				 * be enabled later, don't add it to the list
 				 */
-				if (!((idx == PCI_ROM_RESOURCE) &&
-				      (!(res->flags & IORESOURCE_ROM_ENABLE))))
+				bool is_rom_res_not_enabled =
+					 (idx == PCI_ROM_RESOURCE) &&
+					 (!(res->flags & IORESOURCE_ROM_ENABLE));
+				/*
+				 * if the failed res is io port, but bus does
+				 * not have io port support, don't add it
+				 */
+				bool is_ioport_res_without_bus_support =
+					 (!(bus_res_type_mask & IORESOURCE_IO)) &&
+					 (res->flags & IORESOURCE_IO);
+
+				if (!is_rom_res_not_enabled &&
+				    !is_ioport_res_without_bus_support)
 					add_to_list(fail_head,
 						    dev_res->dev, res,
 						    0 /* dont care */,
@@ -302,7 +314,8 @@ static void assign_requested_resources_s
 
 static void __assign_resources_sorted(struct list_head *head,
 				 struct list_head *realloc_head,
-				 struct list_head *fail_head)
+				 struct list_head *fail_head,
+				 unsigned long bus_res_type_mask)
 {
 	/*
 	 * Should not assign requested resources at first.
@@ -336,7 +349,8 @@ static void __assign_resources_sorted(st
 							dev_res->res);
 
 	/* Try updated head list with add_size added */
-	assign_requested_resources_sorted(head, &local_fail_head);
+	assign_requested_resources_sorted(head, &local_fail_head,
+					  bus_res_type_mask);
 
 	/* all assigned with add_size ? */
 	if (list_empty(&local_fail_head)) {
@@ -365,7 +379,8 @@ static void __assign_resources_sorted(st
 
 requested_and_reassign:
 	/* Satisfy the must-have resource requests */
-	assign_requested_resources_sorted(head, fail_head);
+	assign_requested_resources_sorted(head, fail_head,
+					  bus_res_type_mask);
 
 	/* Try to satisfy any additional optional resource
 		requests */
@@ -376,18 +391,21 @@ requested_and_reassign:
 
 static void pdev_assign_resources_sorted(struct pci_dev *dev,
 				 struct list_head *add_head,
-				 struct list_head *fail_head)
+				 struct list_head *fail_head,
+				 unsigned long bus_res_type_mask)
 {
 	LIST_HEAD(head);
 
 	__dev_sort_resources(dev, &head);
-	__assign_resources_sorted(&head, add_head, fail_head);
+	__assign_resources_sorted(&head, add_head, fail_head,
+					bus_res_type_mask);
 
 }
 
 static void pbus_assign_resources_sorted(const struct pci_bus *bus,
 					 struct list_head *realloc_head,
-					 struct list_head *fail_head)
+					 struct list_head *fail_head,
+					 unsigned long bus_res_type_mask)
 {
 	struct pci_dev *dev;
 	LIST_HEAD(head);
@@ -395,7 +413,8 @@ static void pbus_assign_resources_sorted
 	list_for_each_entry(dev, &bus->devices, bus_list)
 		__dev_sort_resources(dev, &head);
 
-	__assign_resources_sorted(&head, realloc_head, fail_head);
+	__assign_resources_sorted(&head, realloc_head, fail_head,
+					bus_res_type_mask);
 }
 
 void pci_setup_cardbus(struct pci_bus *bus)
@@ -1117,19 +1136,22 @@ EXPORT_SYMBOL(pci_bus_size_bridges);
 
 static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
 					 struct list_head *realloc_head,
-					 struct list_head *fail_head)
+					 struct list_head *fail_head,
+					 unsigned long bus_res_type_mask)
 {
 	struct pci_bus *b;
 	struct pci_dev *dev;
 
-	pbus_assign_resources_sorted(bus, realloc_head, fail_head);
+	pbus_assign_resources_sorted(bus, realloc_head, fail_head,
+					bus_res_type_mask);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		b = dev->subordinate;
 		if (!b)
 			continue;
 
-		__pci_bus_assign_resources(b, realloc_head, fail_head);
+		__pci_bus_assign_resources(b, realloc_head, fail_head,
+						 bus_res_type_mask);
 
 		switch (dev->class >> 8) {
 		case PCI_CLASS_BRIDGE_PCI:
@@ -1151,24 +1173,28 @@ static void __ref __pci_bus_assign_resou
 
 void __ref pci_bus_assign_resources(const struct pci_bus *bus)
 {
-	__pci_bus_assign_resources(bus, NULL, NULL);
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	__pci_bus_assign_resources(bus, NULL, NULL, type_mask);
 }
 EXPORT_SYMBOL(pci_bus_assign_resources);
 
 static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
 					 struct list_head *add_head,
-					 struct list_head *fail_head)
+					 struct list_head *fail_head,
+					 unsigned long bus_res_type_mask)
 {
 	struct pci_bus *b;
 
 	pdev_assign_resources_sorted((struct pci_dev *)bridge,
-					 add_head, fail_head);
+				     add_head, fail_head, bus_res_type_mask);
 
 	b = bridge->subordinate;
 	if (!b)
 		return;
 
-	__pci_bus_assign_resources(b, add_head, fail_head);
+	__pci_bus_assign_resources(b, add_head, fail_head, bus_res_type_mask);
 
 	switch (bridge->class >> 8) {
 	case PCI_CLASS_BRIDGE_PCI:
@@ -1385,6 +1411,21 @@ static enum enable_type __init pci_reall
 }
 #endif
 
+static unsigned long pci_bus_res_type_mask(struct pci_bus *bus)
+{
+	int i;
+	struct resource *r;
+	unsigned long mask = 0;
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
+
+	pci_bus_for_each_resource(bus, r, i)
+		if (r)
+			mask |= r->flags & type_mask;
+
+	return mask;
+}
+
 /*
  * first try will not touch pci bridge res
  * second  and later try will clear small leaf bridge res
@@ -1404,6 +1445,7 @@ pci_assign_unassigned_root_bus_resources
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
 	enum enable_type enable_local;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	/* don't realloc if asked to do so */
 	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
@@ -1428,7 +1470,8 @@ again:
 	__pci_bus_size_bridges(bus, add_list);
 
 	/* Depth last, allocate resources and update the hardware. */
-	__pci_bus_assign_resources(bus, add_list, &fail_head);
+	__pci_bus_assign_resources(bus, add_list, &fail_head,
+				   bus_res_type_mask);
 	if (add_list)
 		BUG_ON(!list_empty(add_list));
 	tried_times++;
@@ -1504,10 +1547,12 @@ void pci_assign_unassigned_bridge_resour
 	int retval;
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bridge->bus);
 
 again:
 	__pci_bus_size_bridges(parent, &add_list);
-	__pci_bridge_assign_resources(bridge, &add_list, &fail_head);
+	__pci_bridge_assign_resources(bridge, &add_list, &fail_head,
+					bus_res_type_mask);
 	BUG_ON(!list_empty(&add_list));
 	tried_times++;
 
@@ -1562,6 +1607,7 @@ void pci_assign_unassigned_bus_resources
 	struct pci_dev *dev;
 	LIST_HEAD(add_list); /* list of resources that
 					want additional resources */
+	unsigned long bus_res_type_mask = pci_bus_res_type_mask(bus);
 
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &bus->devices, bus_list)
@@ -1571,6 +1617,6 @@ void pci_assign_unassigned_bus_resources
 				__pci_bus_size_bridges(dev->subordinate,
 							 &add_list);
 	up_read(&pci_bus_sem);
-	__pci_bus_assign_resources(bus, &add_list, NULL);
+	__pci_bus_assign_resources(bus, &add_list, NULL, bus_res_type_mask);
 	BUG_ON(!list_empty(&add_list));
 }

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 6/8] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                       ` (4 preceding siblings ...)
  2013-05-22  6:38                     ` [PATCH v4 5/8] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 7/8] PCI: Enable pci bridge when it is needed Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 8/8] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

On x86 8 socket or 32 socket systems, which have one root bus per socket,
some root buses may not have an mmio non-pref range.

We should not fall into retry in this case, as the root bus does
not have an mmio non-pref range.

We check whether the root bus has an mmio-nonpref range, set
bus_res_type_mask accordingly, pass it to assign_resources, and don't add
mmio-nonpref res to the failed list for a root bus that does not have an
mmio-nonpref range.
So even if the BIOS set wrong values for pci devices and bridges, they
will still get cleared.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -299,9 +299,17 @@ static void assign_requested_resources_s
 				bool is_ioport_res_without_bus_support =
 					 (!(bus_res_type_mask & IORESOURCE_IO)) &&
 					 (res->flags & IORESOURCE_IO);
+				/*
+				 * if the failed res is mmio-nonpref, but bus
+				 * has no mmio-nonpref range, don't add it
+				 */
+				bool is_mmio_nonpref_res_without_bus_support =
+					 (!(bus_res_type_mask & IORESOURCE_MEM)) &&
+					 ((res->flags & (IORESOURCE_MEM | IORESOURCE_PREFETCH)) == IORESOURCE_MEM);
 
 				if (!is_rom_res_not_enabled &&
-				    !is_ioport_res_without_bus_support)
+				    !is_ioport_res_without_bus_support &&
+				    !is_mmio_nonpref_res_without_bus_support)
 					add_to_list(fail_head,
 						    dev_res->dev, res,
 						    0 /* dont care */,
@@ -1416,12 +1424,24 @@ static unsigned long pci_bus_res_type_ma
 	int i;
 	struct resource *r;
 	unsigned long mask = 0;
-	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
-				  IORESOURCE_PREFETCH;
 
-	pci_bus_for_each_resource(bus, r, i)
-		if (r)
-			mask |= r->flags & type_mask;
+	pci_bus_for_each_resource(bus, r, i) {
+		if (!r)
+			continue;
+
+		if (r->flags & IORESOURCE_IO) {
+			mask |= IORESOURCE_IO;
+			continue;
+		}
+		if (r->flags & IORESOURCE_PREFETCH) {
+			mask |= IORESOURCE_PREFETCH;
+			continue;
+		}
+		if ((r->flags & (IORESOURCE_MEM | IORESOURCE_PREFETCH)) == IORESOURCE_MEM) {
+			mask |= IORESOURCE_MEM; /* nonpref only */
+			continue;
+		}
+	}
 
 	return mask;
 }

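The classification this patch introduces treats IORESOURCE_MEM in the bus mask as meaning "non-prefetchable MMIO only". A small userspace sketch of that rule follows; the RES_* values and the helper names is_mmio_nonpref(), bus_type_mask() and skip_failed_res() are hypothetical, chosen only to mirror the logic in the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in flag values for this sketch; the kernel's real flags differ. */
#define RES_IO       0x1UL
#define RES_MEM      0x2UL
#define RES_PREFETCH 0x4UL

/* MEM set and PREFETCH clear: a non-prefetchable MMIO resource. */
int is_mmio_nonpref(unsigned long flags)
{
	return (flags & (RES_MEM | RES_PREFETCH)) == RES_MEM;
}

/* Record which of io / mmio-pref / mmio-nonpref the root bus offers. */
unsigned long bus_type_mask(const unsigned long *window_flags, size_t n)
{
	unsigned long mask = 0;
	size_t i;

	assert(window_flags != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (window_flags[i] & RES_IO)
			mask |= RES_IO;
		else if (window_flags[i] & RES_PREFETCH)
			mask |= RES_PREFETCH;
		else if (is_mmio_nonpref(window_flags[i]))
			mask |= RES_MEM;	/* nonpref only */
	}

	return mask;
}

/* Nonzero when a failed resource should stay off the failed list. */
int skip_failed_res(unsigned long res_flags, unsigned long bus_mask)
{
	if ((res_flags & RES_IO) && !(bus_mask & RES_IO))
		return 1;
	if (is_mmio_nonpref(res_flags) && !(bus_mask & RES_MEM))
		return 1;

	return 0;
}
```

Note that under this rule a root bus with only a prefetchable window yields a mask without RES_MEM, so failed non-prefetchable resources under it are not retried, while failed prefetchable ones still are.
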
^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 7/8] PCI: Enable pci bridge when it is needed
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                       ` (5 preceding siblings ...)
  2013-05-22  6:38                     ` [PATCH v4 6/8] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  2013-05-22  6:38                     ` [PATCH v4 8/8] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Currently we enable bridges after bus scan and resource assignment,
and the calls are spread over a lot of places.

We can move that to where the pci device is enabled; this needs
to go up to the root bus and enable bridges one by one down to the pci
dev.

That delays enabling bridges until they are actually needed,
and also removes one inconsistency between the boot path and the hotplug
path in acpi_pci_root_add().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/arm/kernel/bios32.c           |    5 -----
 arch/m68k/platform/coldfire/pci.c  |    1 -
 arch/mips/pci/pci.c                |    1 -
 arch/sh/drivers/pci/pci.c          |    1 -
 drivers/acpi/pci_root.c            |    4 ----
 drivers/parisc/lba_pci.c           |    1 -
 drivers/pci/bus.c                  |   19 -------------------
 drivers/pci/hotplug/acpiphp_glue.c |    1 -
 drivers/pci/pci.c                  |   20 ++++++++++++++++++++
 drivers/pci/probe.c                |    1 -
 drivers/pci/setup-bus.c            |   10 +++-------
 drivers/pcmcia/cardbus.c           |    1 -
 include/linux/pci.h                |    1 -
 13 files changed, 23 insertions(+), 43 deletions(-)

Index: linux-2.6/arch/arm/kernel/bios32.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/bios32.c
+++ linux-2.6/arch/arm/kernel/bios32.c
@@ -524,11 +524,6 @@ void pci_common_init(struct hw_pci *hw)
 			 * Assign resources.
 			 */
 			pci_bus_assign_resources(bus);
-
-			/*
-			 * Enable bridges
-			 */
-			pci_enable_bridges(bus);
 		}
 
 		/*
Index: linux-2.6/arch/m68k/platform/coldfire/pci.c
===================================================================
--- linux-2.6.orig/arch/m68k/platform/coldfire/pci.c
+++ linux-2.6/arch/m68k/platform/coldfire/pci.c
@@ -319,7 +319,6 @@ static int __init mcf_pci_init(void)
 	pci_fixup_irqs(pci_common_swizzle, mcf_pci_map_irq);
 	pci_bus_size_bridges(rootbus);
 	pci_bus_assign_resources(rootbus);
-	pci_enable_bridges(rootbus);
 	pci_bus_add_devices(rootbus);
 	return 0;
 }
Index: linux-2.6/arch/mips/pci/pci.c
===================================================================
--- linux-2.6.orig/arch/mips/pci/pci.c
+++ linux-2.6/arch/mips/pci/pci.c
@@ -113,7 +113,6 @@ static void pcibios_scanbus(struct pci_c
 		if (!pci_has_flag(PCI_PROBE_ONLY)) {
 			pci_bus_size_bridges(bus);
 			pci_bus_assign_resources(bus);
-			pci_enable_bridges(bus);
 		}
 	}
 }
Index: linux-2.6/arch/sh/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/arch/sh/drivers/pci/pci.c
+++ linux-2.6/arch/sh/drivers/pci/pci.c
@@ -69,7 +69,6 @@ static void pcibios_scanbus(struct pci_c
 
 		pci_bus_size_bridges(bus);
 		pci_bus_assign_resources(bus);
-		pci_enable_bridges(bus);
 	} else {
 		pci_free_resource_list(&resources);
 	}
Index: linux-2.6/drivers/acpi/pci_root.c
===================================================================
--- linux-2.6.orig/drivers/acpi/pci_root.c
+++ linux-2.6/drivers/acpi/pci_root.c
@@ -537,10 +537,6 @@ static int acpi_pci_root_add(struct acpi
 		pci_assign_unassigned_bus_resources(root->bus);
 	}
 
-	/* need to after hot-added ioapic is registered */
-	if (system_state != SYSTEM_BOOTING)
-		pci_enable_bridges(root->bus);
-
 	pci_bus_add_devices(root->bus);
 	return 1;
 
Index: linux-2.6/drivers/parisc/lba_pci.c
===================================================================
--- linux-2.6.orig/drivers/parisc/lba_pci.c
+++ linux-2.6/drivers/parisc/lba_pci.c
@@ -1533,7 +1533,6 @@ lba_driver_probe(struct parisc_device *d
 		lba_dump_res(&lba_dev->hba.lmmio_space, 2);
 #endif
 	}
-	pci_enable_bridges(lba_bus);
 
 	/*
 	** Once PCI register ops has walked the bus, access to config
Index: linux-2.6/drivers/pci/bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/bus.c
+++ linux-2.6/drivers/pci/bus.c
@@ -216,24 +216,6 @@ void pci_bus_add_devices(const struct pc
 	}
 }
 
-void pci_enable_bridges(struct pci_bus *bus)
-{
-	struct pci_dev *dev;
-	int retval;
-
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		if (dev->subordinate) {
-			if (!pci_is_enabled(dev)) {
-				retval = pci_enable_device(dev);
-				if (retval)
-					dev_err(&dev->dev, "Error enabling bridge (%d), continuing\n", retval);
-				pci_set_master(dev);
-			}
-			pci_enable_bridges(dev->subordinate);
-		}
-	}
-}
-
 /** pci_walk_bus - walk devices on/under bus, calling callback.
  *  @top      bus whose devices should be walked
  *  @cb       callback to be called for each device found
@@ -286,4 +268,3 @@ EXPORT_SYMBOL_GPL(pci_walk_bus);
 EXPORT_SYMBOL(pci_bus_alloc_resource);
 EXPORT_SYMBOL_GPL(pci_bus_add_device);
 EXPORT_SYMBOL(pci_bus_add_devices);
-EXPORT_SYMBOL(pci_enable_bridges);
Index: linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
@@ -704,7 +704,6 @@ static int __ref enable_device(struct ac
 	acpiphp_sanitize_bus(bus);
 	acpiphp_set_hpp_values(bus);
 	acpiphp_set_acpi_region(slot);
-	pci_enable_bridges(bus);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		/* Assume that newly added devices are powered on already. */
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -1145,6 +1145,24 @@ int pci_reenable_device(struct pci_dev *
 	return 0;
 }
 
+static void pci_enable_bridge(struct pci_dev *dev)
+{
+	int retval;
+
+	if (!dev)
+		return;
+
+	pci_enable_bridge(dev->bus->self);
+
+	if (pci_is_enabled(dev))
+		return;
+	retval = pci_enable_device(dev);
+	if (retval)
+		dev_err(&dev->dev, "Error enabling bridge (%d), continuing\n",
+			retval);
+	pci_set_master(dev);
+}
+
 static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 {
 	int err;
@@ -1165,6 +1183,8 @@ static int pci_enable_device_flags(struc
 	if (atomic_inc_return(&dev->enable_cnt) > 1)
 		return 0;		/* already enabled */
 
+	pci_enable_bridge(dev->bus->self);
+
 	/* only skip sriov related */
 	for (i = 0; i <= PCI_ROM_RESOURCE; i++)
 		if (dev->resource[i].flags & flags)
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1948,7 +1948,6 @@ unsigned int __ref pci_rescan_bus(struct
 
 	max = pci_scan_child_bus(bus);
 	pci_assign_unassigned_bus_resources(bus);
-	pci_enable_bridges(bus);
 	pci_bus_add_devices(bus);
 
 	return max;
Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1498,7 +1498,7 @@ again:
 
 	/* any device complain? */
 	if (list_empty(&fail_head))
-		goto enable_and_dump;
+		goto dump;
 
 	if (tried_times >= pci_try_num) {
 		if (enable_local == undefined)
@@ -1507,7 +1507,7 @@ again:
 			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);
-		goto enable_and_dump;
+		goto dump;
 	}
 
 	dev_printk(KERN_DEBUG, &bus->dev,
@@ -1540,10 +1540,7 @@ again:
 
 	goto again;
 
-enable_and_dump:
-	/* Depth last, update the hardware. */
-	pci_enable_bridges(bus);
-
+dump:
 	/* dump the resource on buses */
 	pci_bus_dump_resources(bus);
 }
@@ -1618,7 +1615,6 @@ enable_all:
 	if (retval)
 		dev_err(&bridge->dev, "Error reenabling bridge (%d)\n", retval);
 	pci_set_master(bridge);
-	pci_enable_bridges(parent);
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
Index: linux-2.6/drivers/pcmcia/cardbus.c
===================================================================
--- linux-2.6.orig/drivers/pcmcia/cardbus.c
+++ linux-2.6/drivers/pcmcia/cardbus.c
@@ -91,7 +91,6 @@ int __ref cb_alloc(struct pcmcia_socket
 	if (s->tune_bridge)
 		s->tune_bridge(s, bus);
 
-	pci_enable_bridges(bus);
 	pci_bus_add_devices(bus);
 
 	return 0;
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -1040,7 +1040,6 @@ int __must_check pci_bus_alloc_resource(
 						  resource_size_t,
 						  resource_size_t),
 			void *alignf_data);
-void pci_enable_bridges(struct pci_bus *bus);
 
 /* Proper probing supporting hot-pluggable devices */
 int __must_check __pci_register_driver(struct pci_driver *, struct module *,

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v4 8/8] PCI: Retry assign unassigned resources for hotadd root bus
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
                                       ` (6 preceding siblings ...)
  2013-05-22  6:38                     ` [PATCH v4 7/8] PCI: Enable pci bridge when it is needed Yinghai Lu
@ 2013-05-22  6:38                     ` Yinghai Lu
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22  6:38 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Let the root bus hot-add path use the same code as the boot path.
As no driver is loaded yet, we can retry to make sure
all PCI devices get their resources allocated.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/acpi/pci_root.c |    2 +-
 drivers/pci/setup-bus.c |   15 +++++++--------
 include/linux/pci.h     |    1 +
 3 files changed, 9 insertions(+), 9 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1331,7 +1331,7 @@ static void pci_bus_dump_resources(struc
 	}
 }
 
-static int __init pci_bus_get_depth(struct pci_bus *bus)
+static int pci_bus_get_depth(struct pci_bus *bus)
 {
 	int depth = 0;
 	struct pci_dev *dev;
@@ -1365,7 +1365,7 @@ enum enable_type {
 	auto_enabled,
 };
 
-static enum enable_type pci_realloc_enable __initdata = undefined;
+static enum enable_type pci_realloc_enable = undefined;
 void __init pci_realloc_get_opt(char *str)
 {
 	if (!strncmp(str, "off", 3))
@@ -1373,13 +1373,13 @@ void __init pci_realloc_get_opt(char *st
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
-static bool __init pci_realloc_enabled(enum enable_type enable)
+static bool pci_realloc_enabled(enum enable_type enable)
 {
 	return enable >= user_enabled;
 }
 
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
-static int __init check_unassigned_resources(struct pci_dev *dev, void *data)
+static int check_unassigned_resources(struct pci_dev *dev, void *data)
 {
 	int i;
 	int *unassigned = data;
@@ -1397,7 +1397,7 @@ static int __init check_unassigned_resou
 	return 0;
 }
 
-static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+static enum enable_type pci_realloc_detect(struct pci_bus *bus,
 			 enum enable_type enable_local)
 {
 	int unassigned = 0;
@@ -1412,7 +1412,7 @@ static enum enable_type __init pci_reall
 	return enable_local;
 }
 #else
-static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+static enum enable_type pci_realloc_detect(struct pci_bus *bus,
 			 enum enable_type enable_local)
 {
 	return enable_local;
@@ -1451,8 +1451,7 @@ static unsigned long pci_bus_res_type_ma
  * second  and later try will clear small leaf bridge res
  * will stop till to the max  deepth if can not find good one
  */
-static void __init
-pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
+void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
 	LIST_HEAD(realloc_head); /* list of resources that
 					want additional resources */
Index: linux-2.6/drivers/acpi/pci_root.c
===================================================================
--- linux-2.6.orig/drivers/acpi/pci_root.c
+++ linux-2.6/drivers/acpi/pci_root.c
@@ -534,7 +534,7 @@ static int acpi_pci_root_add(struct acpi
 
 	if (system_state != SYSTEM_BOOTING) {
 		pcibios_resource_survey_bus(root->bus);
-		pci_assign_unassigned_bus_resources(root->bus);
+		pci_assign_unassigned_root_bus_resources(root->bus);
 	}
 
 	pci_bus_add_devices(root->bus);
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -1002,6 +1002,7 @@ int pci_claim_resource(struct pci_dev *,
 void pci_assign_unassigned_resources(void);
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
+void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
 void pdev_enable_device(struct pci_dev *);
 int pci_enable_resources(struct pci_dev *, int mask);
 void pci_fixup_irqs(u8 (*)(struct pci_dev *, u8 *),

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
       [not found]               ` <518786a7.64bbec0a.58a0.1f6bSMTPIN_ADDED_BROKEN@mx.google.com>
@ 2013-05-22 14:54                 ` Bjorn Helgaas
  2013-05-22 16:59                   ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-22 14:54 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Yinghai Lu, Benjamin Herrenschmidt, linux-pci

On Mon, May 6, 2013 at 4:31 AM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> On Sun, May 05, 2013 at 08:04:48PM -0700, Yinghai Lu wrote:
>>On Sun, May 5, 2013 at 7:08 PM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
>>> On Sun, May 05, 2013 at 05:52:16PM +1000, Benjamin Herrenschmidt wrote:
>>>>On Sun, 2013-05-05 at 00:09 -0700, Yinghai Lu wrote:
>
> Hi Ben/Yinghai,
>
> Here's what happened for the failure to allocate memory window of P2P
> bridge 0001:01:00.0.
>
> 1. Sizing P2P bridge window.
> 2. Allocate resources in sorted order for the specific PCI bus. Unfortunately,
>    this fails since the PCI bus doesn't have an IO (not MMIO) resource. So in
>    function __assign_resources_sorted(), the resources that have already been
>    allocated are released, then the so-called "must-have" resources are allocated,
>    and finally the additional size for the corresponding resource. However,
>    we have a conflict in the last step (the additional size). Here is one
>    example.
>
>    [0 24MB] Memory window of bus 0001:01     "ROOT"
>    [0 8MB]  Memory window#0 of 0001:01:00.0  "WIN#0"   with additional size 8MB
>    [0 8MB]  Memory window#1 of 0001:01:00.0  "WIN#1"   with additional size 0
>
>    If the last step (mentioned above) fails, the resources would be allocated
>    in the following sequence:
>
>    WIN#0  ->  WIN#1       -> WIN#0 with additional size 8MB
>    [0 8MB]    [8MB 16MB]     No available space for this one (16MB)
>
> A possible (temporary) fix would be to check "local_fail_head" in function
> __assign_resources_sorted() and skip the reassignment if all the elements in
> the list are IO related. In the future, we might introduce a specific flag to
> indicate that the PCI host can't support IO and skip all IO assignment accordingly.

What exactly is the problem here?  Is it just that you don't want to
see the "can't assign" messages?  Or is there a device with a BAR that
*should* be assigned, but isn't?  If so, which device is it?

Bjorn

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 14:54                 ` Resource assignment oddities Bjorn Helgaas
@ 2013-05-22 16:59                   ` Yinghai Lu
  2013-05-22 17:21                     ` Bjorn Helgaas
                                       ` (2 more replies)
  0 siblings, 3 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22 16:59 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Gavin Shan, Benjamin Herrenschmidt, linux-pci

[-- Attachment #1: Type: text/plain, Size: 736 bytes --]

On Wed, May 22, 2013 at 7:54 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> What exactly is the problem here?  Is it just that you don't want to
> see the "can't assign" messages?  Or is there a device with a BAR that
> *should* be assigned, but isn't?  If so, which device is it?

We try must+optional first; then, if any ioport or mmio allocation fails,
we fall back to must-only and then try to extend must to cover optional.
But the mmio and mmio-pref ranges can be adjacent to each other,
so the extend will fail...
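
A toy model of that extend step, using the window sizes from Gavin's example quoted earlier in the thread. This is a sketch only: struct toy_win and toy_can_extend() are hypothetical names, and the real allocator works on struct resource trees rather than on two neighbouring windows.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MB (1024UL * 1024UL)

/* A window placed inside its parent: [start, start + size). */
struct toy_win {
	unsigned long start;
	unsigned long size;
};

/*
 * Can 'w' grow in place by 'add' bytes?  It must not run into 'next'
 * (the window placed right after it, or NULL if there is none) and
 * must not run past 'parent_end'.
 */
static bool toy_can_extend(const struct toy_win *w, unsigned long add,
			   const struct toy_win *next,
			   unsigned long parent_end)
{
	unsigned long limit = next ? next->start : parent_end;

	return w->start + w->size + add <= limit;
}
```

With a 24MB parent, must-only puts WIN#0 at [0, 8MB) and WIN#1 at [8MB, 16MB), so WIN#0 cannot grow by its optional 8MB even though 8MB is still free at the end. Placing WIN#0 with must+optional (16MB) up front would have left [16MB, 24MB) for WIN#1.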

The problem here: some root buses have no ioport range, so ioport allocation
will always fail on them.

Looks like the right fix for v3.9 is the attached patch.
It keeps must+optional for mmio if only ioport fails.
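
The check itself can be sketched in isolation like this. It is a userspace toy: the TOY_* flag values and both function names are assumptions for the example, while the attached patch walks local_fail_head and uses the IORESOURCE_* flags.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values, assumed for this sketch only. */
#define TOY_IO        0x0100UL
#define TOY_MEM       0x0200UL
#define TOY_PREFETCH  0x2000UL
#define TOY_TYPE_MASK (TOY_IO | TOY_MEM | TOY_PREFETCH)

/* OR together the type bits of every resource that failed to assign. */
static unsigned long toy_fail_type(const unsigned long *failed, size_t n)
{
	unsigned long fail_type = 0;
	size_t i;

	for (i = 0; i < n; i++)
		fail_type |= failed[i] & TOY_TYPE_MASK;
	return fail_type;
}

/* True when io port resources, and nothing else, failed. */
static bool toy_only_io_failed(const unsigned long *failed, size_t n)
{
	return toy_fail_type(failed, n) == TOY_IO;
}
```

Only when this returns true are the already assigned non-ioport resources taken off the lists, so they keep their must+optional assignment; any mmio failure still goes through the normal fallback.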

Yinghai

[-- Attachment #2: root_bus_ioport_skip_9.patch --]
[-- Type: application/octet-stream, Size: 2860 bytes --]

Subject: [PATCH] PCI: Don't fall back MMIO to must-only, if io port fail with optional
From: Yinghai Lu <yinghai@kernel.org>

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus has no io port range, the code keeps retrying
the mmio reallocation.

The current retry logic is: try must+optional first, and if that
fails for any ioport or mmio resource, try must-only and then try to
extend must with optional.
That will fail because the mmio-non-pref and mmio-pref windows of a bridge
will be next to each other, so there is no room to extend mmio-non-pref.

We can check the failed type and fall back only for io port; that
lets mmio keep its must+optional assignment.

This will become more common on x86 systems with 8 or 32 sockets,
as those systems have one root bus per socket and
some of those root buses have no ioport range.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -316,7 +316,11 @@ static void __assign_resources_sorted(st
 	LIST_HEAD(save_head);
 	LIST_HEAD(local_fail_head);
 	struct pci_dev_resource *save_res;
-	struct pci_dev_resource *dev_res;
+	struct pci_dev_resource *dev_res, *tmp_res;
+	unsigned long fail_type = 0;
+	struct pci_dev_resource *fail_res;
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
 
 	/* Check if optional add_size is there */
 	if (!realloc_head || list_empty(realloc_head))
@@ -348,6 +352,27 @@ static void __assign_resources_sorted(st
 		return;
 	}
 
+	/* check failed type */
+	list_for_each_entry(fail_res, &local_fail_head, list)
+		fail_type |= fail_res->flags & type_mask;
+	/* only io port fails */
+	if ((fail_type & type_mask) == IORESOURCE_IO) {
+		/* remove assigned non ioport from head list */
+		list_for_each_entry_safe(dev_res, tmp_res, head, list)
+			if (dev_res->res->parent &&
+			    !(dev_res->res->flags & IORESOURCE_IO)) {
+				list_del(&dev_res->list);
+				kfree(dev_res);
+			}
+		/* remove assigned non ioport from saved list */
+		list_for_each_entry_safe(save_res, tmp_res, &save_head, list)
+			if (save_res->res->parent &&
+			    !(save_res->res->flags & IORESOURCE_IO)) {
+				list_del(&save_res->list);
+				kfree(save_res);
+			}
+	}
+
 	free_list(&local_fail_head);
 	/* Release assigned resource */
 	list_for_each_entry(dev_res, head, list)

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 16:59                   ` Yinghai Lu
@ 2013-05-22 17:21                     ` Bjorn Helgaas
  2013-05-22 20:44                       ` Benjamin Herrenschmidt
  2013-05-22 20:43                     ` Benjamin Herrenschmidt
  2013-05-22 20:50                     ` Yinghai Lu
  2 siblings, 1 reply; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-22 17:21 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Gavin Shan, Benjamin Herrenschmidt, linux-pci

On Wed, May 22, 2013 at 10:59 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Wed, May 22, 2013 at 7:54 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> What exactly is the problem here?  Is it just that you don't want to
>> see the "can't assign" messages?  Or is there a device with a BAR that
>> *should* be assigned, but isn't?  If so, which device is it?
>
> We try must+optional first; then, if any ioport or mmio allocation fails,
> we fall back to must-only and then try to extend must to cover optional.
> But the mmio and mmio-pref ranges can be adjacent to each other,
> so the extend will fail...
>
> The problem here: some root buses have no ioport range, so ioport allocation
> will always fail on them.
>
> Looks like the right fix for v3.9 is the attached patch.
> It keeps must+optional for mmio if only ioport fails.

I read all that before, but it didn't answer my questions.  I assume
we're trying to fix a device that doesn't work.  Which device is it?

Bjorn

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 16:59                   ` Yinghai Lu
  2013-05-22 17:21                     ` Bjorn Helgaas
@ 2013-05-22 20:43                     ` Benjamin Herrenschmidt
  2013-05-22 21:00                       ` Yinghai Lu
  2013-05-22 20:50                     ` Yinghai Lu
  2 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-22 20:43 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci

On Wed, 2013-05-22 at 09:59 -0700, Yinghai Lu wrote:
> On Wed, May 22, 2013 at 7:54 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > What exactly is the problem here?  Is it just that you don't want to
> > see the "can't assign" messages?  Or is there a device with a BAR that
> > *should* be assigned, but isn't?  If so, which device is it?
> 
> We try must+optional first; then, if any ioport or mmio allocation fails,
> we fall back to must-only and then try to extend must to cover optional.
> But the mmio and mmio-pref ranges can be adjacent to each other,
> so the extend will fail...
>
> The problem here: some root buses have no ioport range, so ioport allocation
> will always fail on them.
>
> Looks like the right fix for v3.9 is the attached patch.
> It keeps must+optional for mmio if only ioport fails.

The simpler thing to do would have been to do the entire thing in two
separate passes, one for MMIO and one for IO, and skip the second one
entirely if there's no IO at the host bridge level :-)
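
A minimal sketch of what such a split could look like, under stated assumptions: this is a userspace toy with a bump allocator per type, all names (toy_req, toy_assign_pass(), toy_assign_two_pass()) are hypothetical, and the MMIO pool is treated as a single type for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_type { TOY_IO, TOY_MMIO };

struct toy_req {
	enum toy_type type;
	unsigned long size;
	bool assigned;
};

/*
 * Bump-allocate every request of 'type' from a pool of 'pool_size'
 * bytes.  Returns the number of requests that did not fit.
 */
static int toy_assign_pass(struct toy_req *reqs, size_t n,
			   enum toy_type type, unsigned long pool_size)
{
	unsigned long used = 0;
	int failures = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (reqs[i].type != type)
			continue;
		if (reqs[i].size <= pool_size - used) {
			used += reqs[i].size;
			reqs[i].assigned = true;
		} else {
			failures++;
		}
	}
	return failures;
}

/*
 * MMIO pass first, then an IO pass that is skipped outright when the
 * host bridge has no IO space.  Returns the MMIO failure count, the
 * only number that could justify an MMIO retry.
 */
static int toy_assign_two_pass(struct toy_req *reqs, size_t n,
			       unsigned long mmio_size, unsigned long io_size)
{
	int mmio_failures = toy_assign_pass(reqs, n, TOY_MMIO, mmio_size);

	if (io_size != 0)
		(void)toy_assign_pass(reqs, n, TOY_IO, io_size);
	return mmio_failures;
}
```

On a host bridge with io_size == 0 the IO requests simply stay unassigned and never show up as failures that feed back into the MMIO side.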

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 17:21                     ` Bjorn Helgaas
@ 2013-05-22 20:44                       ` Benjamin Herrenschmidt
  2013-05-22 21:01                         ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-22 20:44 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, Gavin Shan, linux-pci

On Wed, 2013-05-22 at 11:21 -0600, Bjorn Helgaas wrote:
> > Looks like the right fix for v3.9 is the attached patch.
> > It keeps must+optional for mmio if only ioport fails.
> 
> I read all that before, but it didn't answer my questions.  I assume
> we're trying to fix a device that doesn't work.  Which device is it?

We have MMIO assignment failures due to the "retry" but so far it hasn't
prevented the devices I have tested from working. It does make me very
nervous however.

Gavin, do you have more details about what actually happens ?

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 16:59                   ` Yinghai Lu
  2013-05-22 17:21                     ` Bjorn Helgaas
  2013-05-22 20:43                     ` Benjamin Herrenschmidt
@ 2013-05-22 20:50                     ` Yinghai Lu
       [not found]                       ` <519dcfbe.89e9420a.4934.488bSMTPIN_ADDED_BROKEN@mx.google.com>
  2 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22 20:50 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Gavin Shan, Benjamin Herrenschmidt, linux-pci

[-- Attachment #1: Type: text/plain, Size: 638 bytes --]

On Wed, May 22, 2013 at 9:59 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> We try must+optional first; then, if any ioport or mmio allocation fails,
> we fall back to must-only and then try to extend must to cover optional.
> But the mmio and mmio-pref ranges can be adjacent to each other,
> so the extend will fail...
>
> The problem here: some root buses have no ioport range, so ioport allocation
> will always fail on them.
>
> Looks like the right fix for v3.9 is the attached patch.
> It keeps must+optional for mmio if only ioport fails.

Looks like I missed the change to the realloc_head list.

Ben/Shan, can you check the attached v2?

thanks

Yinghai

[-- Attachment #2: root_bus_ioport_skip_9_v2.patch --]
[-- Type: application/octet-stream, Size: 2759 bytes --]

Subject: [PATCH] PCI: Don't fall back MMIO to must-only, if io port fail with optional
From: Yinghai Lu <yinghai@kernel.org>

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus has no io port range, the code keeps retrying
the mmio reallocation.

The current retry logic is: try must+optional first, and if that
fails for any ioport or mmio resource, try must-only and then try to
extend must with optional.
That will fail because the mmio-non-pref and mmio-pref windows of a bridge
will be next to each other, so there is no room to extend mmio-non-pref.

We can check the failed type and fall back only for io port; that
lets mmio keep its must+optional assignment.

This will become more common on x86 systems with 8 or 32 sockets,
as those systems have one root bus per socket and
some of those root buses have no ioport range.

-v2: need to remove assigned entries from optional list too.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -317,6 +317,10 @@ static void __assign_resources_sorted(st
 	LIST_HEAD(local_fail_head);
 	struct pci_dev_resource *save_res;
 	struct pci_dev_resource *dev_res;
+	unsigned long fail_type = 0;
+	struct pci_dev_resource *fail_res;
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
 
 	/* Check if optional add_size is there */
 	if (!realloc_head || list_empty(realloc_head))
@@ -348,6 +352,25 @@ static void __assign_resources_sorted(st
 		return;
 	}
 
+	/* check failed type */
+	list_for_each_entry(fail_res, &local_fail_head, list)
+		fail_type |= fail_res->flags & type_mask;
+	/* only io port fails */
+	if ((fail_type & type_mask) == IORESOURCE_IO) {
+		struct pci_dev_resource *tmp_res;
+
+		/* remove assigned non ioport from head list etc */
+		list_for_each_entry_safe(dev_res, tmp_res, head, list)
+			if (dev_res->res->parent &&
+			    !(dev_res->res->flags & IORESOURCE_IO)) {
+				/* remove it from realloc_head list */
+				remove_from_list(realloc_head, dev_res->res);
+				remove_from_list(&save_head, dev_res->res);
+				list_del(&dev_res->list);
+				kfree(dev_res);
+			}
+	}
+
 	free_list(&local_fail_head);
 	/* Release assigned resource */
 	list_for_each_entry(dev_res, head, list)

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 20:43                     ` Benjamin Herrenschmidt
@ 2013-05-22 21:00                       ` Yinghai Lu
  2013-05-22 21:13                         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22 21:00 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci

On Wed, May 22, 2013 at 1:43 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:

> The simpler thing to do would have been to do the entire thing in two
> separate passes, one for MMIO and one for IO, and skip the second one
> entirely if there's no IO at the host bridge level :-)

That will need more changes.

Also, what about a host bridge that only has 64-bit mmio pref?

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 20:44                       ` Benjamin Herrenschmidt
@ 2013-05-22 21:01                         ` Yinghai Lu
  0 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-22 21:01 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci

On Wed, May 22, 2013 at 1:44 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Wed, 2013-05-22 at 11:21 -0600, Bjorn Helgaas wrote:
>> > Looks like the right fix for v3.9 is the attached patch.
>> > It keeps must+optional for mmio if only ioport fails.
>>
>> I read all that before, but it didn't answer my questions.  I assume
>> we're trying to fix a device that doesn't work.  Which device is it?
>
> We have MMIO assignment failures due to the "retry" but so far it hasn't
> prevented the devices I have tested from working. It does make me very
> nervous however.
>

Yes, the io port allocation failure triggers a wrong retry with mmio.

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
  2013-05-22 21:00                       ` Yinghai Lu
@ 2013-05-22 21:13                         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-22 21:13 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci

On Wed, 2013-05-22 at 14:00 -0700, Yinghai Lu wrote:
> > The simpler thing to do would have been to do the entire thing in two
> > separate passes, one for MMIO and one for IO, and skip the second one
> > entirely if there's no IO at the host bridge level :-)
> 
> That will need more changes.
>
> Also, what about a host bridge that only has 64-bit mmio pref?

MMIO pref and non-pref are somewhat linked, I'm not saying you should
separate them or change your design, but IO should have absolutely no
impact.

Anyway, that's how it should have been... probably too late to change.

Ben.



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Resource assignment oddities
       [not found]                       ` <519dcfbe.89e9420a.4934.488bSMTPIN_ADDED_BROKEN@mx.google.com>
@ 2013-05-23 17:08                         ` Yinghai Lu
  2013-05-23 17:12                           ` Bjorn Helgaas
  2013-05-23 17:11                         ` [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional Yinghai Lu
  1 sibling, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-23 17:08 UTC (permalink / raw)
  To: Gavin Shan; +Cc: Bjorn Helgaas, Benjamin Herrenschmidt, linux-pci

On Thu, May 23, 2013 at 1:13 AM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
> On Wed, May 22, 2013 at 01:50:35PM -0700, Yinghai Lu wrote:
>>On Wed, May 22, 2013 at 9:59 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>>> We try must+optional first; then, if any ioport or mmio allocation fails,
>>> we fall back to must-only and then try to extend must to cover optional.
>>> But the mmio and mmio-pref ranges can be adjacent to each other,
>>> so the extend will fail...
>>>
>>> The problem here: some root buses have no ioport range, so ioport allocation
>>> will always fail on them.
>>>
>>> Looks like the right fix for v3.9 is the attached patch.
>>> It keeps must+optional for mmio if only ioport fails.
>>
>>Looks like I missed the change to the realloc_head list.
>>
>>Ben/Shan, can you check the attached v2?
>>
>
> Sorry for the late response. I spent lots of time getting the simulator working
> with PCI stuff. I tried your patch on top of mainline (3.10.RC2).
> Things look good, except that we still see the intended failure message
> about failing to assign I/O ports, as the attached kernel log indicates :-)
>

Good, I will resend this in complete form to Bjorn for v3.10.

Thanks

Yinghai


* [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional
       [not found]                       ` <519dcfbe.89e9420a.4934.488bSMTPIN_ADDED_BROKEN@mx.google.com>
  2013-05-23 17:08                         ` Yinghai Lu
@ 2013-05-23 17:11                         ` Yinghai Lu
  2013-05-24 17:25                           ` Bjorn Helgaas
  1 sibling, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-23 17:11 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus does not have an io port range, it will keep
retrying to realloc with mmio.

The current retry logic is: try with must+optional first, and if
that fails for any ioport or mmio resource, try must-only and then try
to extend must with optional.
That will fail because the mmio-non-pref and mmio-pref windows of a
bridge will be next to each other, so we have no chance to extend
mmio-non-pref.

We can check the failed type and fall back for io port only, which
keeps must+optional for the mmio types.

This will become more common when we have x86 8-socket or 32-socket
systems, which have one root bus per socket.
Some of those root buses will not have an ioport range.
They will have some root buses do not have ioport range.

-v2: need to remove assigned entries from optional list too.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -317,6 +317,10 @@ static void __assign_resources_sorted(st
 	LIST_HEAD(local_fail_head);
 	struct pci_dev_resource *save_res;
 	struct pci_dev_resource *dev_res;
+	unsigned long fail_type = 0;
+	struct pci_dev_resource *fail_res;
+	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+				  IORESOURCE_PREFETCH;
 
 	/* Check if optional add_size is there */
 	if (!realloc_head || list_empty(realloc_head))
@@ -348,6 +352,25 @@ static void __assign_resources_sorted(st
 		return;
 	}
 
+	/* check failed type */
+	list_for_each_entry(fail_res, &local_fail_head, list)
+		fail_type |= fail_res->flags & type_mask;
+	/* only io port fails */
+	if ((fail_type & type_mask) == IORESOURCE_IO) {
+		struct pci_dev_resource *tmp_res;
+
+		/* remove assigned non ioport from head list etc */
+		list_for_each_entry_safe(dev_res, tmp_res, head, list)
+			if (dev_res->res->parent &&
+			    !(dev_res->res->flags & IORESOURCE_IO)) {
+				/* remove it from realloc_head list */
+				remove_from_list(realloc_head, dev_res->res);
+				remove_from_list(&save_head, dev_res->res);
+				list_del(&dev_res->list);
+				kfree(dev_res);
+			}
+	}
+
 	free_list(&local_fail_head);
 	/* Release assigned resource */
 	list_for_each_entry(dev_res, head, list)


* Re: Resource assignment oddities
  2013-05-23 17:08                         ` Yinghai Lu
@ 2013-05-23 17:12                           ` Bjorn Helgaas
  2013-05-23 17:17                             ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-23 17:12 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Gavin Shan, Benjamin Herrenschmidt, linux-pci

On Thu, May 23, 2013 at 11:08 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Thu, May 23, 2013 at 1:13 AM, Gavin Shan <shangw@linux.vnet.ibm.com> wrote:
>> On Wed, May 22, 2013 at 01:50:35PM -0700, Yinghai Lu wrote:
>>>On Wed, May 22, 2013 at 9:59 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>>>> We try must+optional as first, then if there is any ioport or mmio fail
>>>> we will stick to must only then extend must to meet optional.
>>>> but mmio range and mmio-pref could be connected each other,
>>>> so extend will fail...
>>>>
>>>> problem here, some root bus will not have ioport range, so it will always have
>>>> ioport allocation fail.
>>>>
>>>> looks like right fix for v3.9 should be as attached patch.
>>>> it will keep must+optional for mmio, if only ioport fails....
>>>
>>>looks like i missed change to realloc_head list.
>>>
>>>Ben/Shan, can you check attached v2?
>>>
>>
>> Sorry for late response. I spend lots of time to get the simulator working
>> with PCI stuff. I had a try with your patch on top of mainline (3.10.RC2).
>> Things look good except that we still see the intended the failure message
>> of failure to assign I/O ports as the attached kernel log indicates :-)
>>
>
> Good, will resend this as complete form to Bjorn for v3.10.

I haven't seen anything this actually fixes yet, except that maybe it
removes some "can't assign" messages.  That doesn't sound like v3.10
material.

Bjorn


* Re: Resource assignment oddities
  2013-05-23 17:12                           ` Bjorn Helgaas
@ 2013-05-23 17:17                             ` Yinghai Lu
  2013-05-23 19:47                               ` Bjorn Helgaas
  0 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-23 17:17 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Gavin Shan, Benjamin Herrenschmidt, linux-pci

On Thu, May 23, 2013 at 10:12 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Thu, May 23, 2013 at 11:08 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>>
>> Good, will resend this as complete form to Bjorn for v3.10.
>
> I haven't seen anything this actually fixes yet, except that maybe it
> removes some "can't assign" messages.  That doesn't sound like v3.10
> material.

The io port "can't assign" message is still there.

The problem that it fixes: don't wrongly fall back for mmio when ioport
fails with must+optional.

Because if it falls back to mmio must-only, later extending to cover
optional will not find extra space, as the mmio range and the mmio-pref
range in the bridge are connected together.

Yinghai


* Re: Resource assignment oddities
  2013-05-23 17:17                             ` Yinghai Lu
@ 2013-05-23 19:47                               ` Bjorn Helgaas
  2013-05-23 21:00                                 ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-23 19:47 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Gavin Shan, Benjamin Herrenschmidt, linux-pci

On Thu, May 23, 2013 at 11:17 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Thu, May 23, 2013 at 10:12 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Thu, May 23, 2013 at 11:08 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>>>
>>> Good, will resend this as complete form to Bjorn for v3.10.
>>
>> I haven't seen anything this actually fixes yet, except that maybe it
>> removes some "can't assign" messages.  That doesn't sound like v3.10
>> material.
>
> io port "can't assign" still there.

The "can't assign io" message is harmless.  The host bridge doesn't
support I/O port space, so we'll *never* be able to assign IO space.
The best we can do for this part of the problem is to get rid of the
message, and that's not urgent for v3.10.  Actually, I don't think we
*should* get rid of the message; if we can't assign IO space to an IO
BAR, I want to know about it.  Maybe the message should be clearer or
less alarming, but it should be there.

> the problem that it fixed: don't fallback wrongly for mmio when ioport
> fails with must+optional.
>
> because if it fall back to mmio must-only, later extending to cover
> optional will
> not find extra space as mmio range and mmio-pref range in the bridge
> is connected
> together.

I know that we retry when we don't need to.  If the retry results in
MEM assignments that are *worse* than the original ones, that might be
a problem we should fix.  But that problem should have nothing to do
with I/O port space.  Conceptually, if we retry because IO assignment
failed, the MEM assignments should not change.

I could probably grope through your logs and figure out whether this
is happening, but it's your job to do that legwork and convince me
that a change is necessary.

Bjorn


* Re: Resource assignment oddities
  2013-05-23 19:47                               ` Bjorn Helgaas
@ 2013-05-23 21:00                                 ` Yinghai Lu
  2013-05-23 21:23                                   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-23 21:00 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Gavin Shan, Benjamin Herrenschmidt, linux-pci

On Thu, May 23, 2013 at 12:47 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Thu, May 23, 2013 at 11:17 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>> On Thu, May 23, 2013 at 10:12 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> On Thu, May 23, 2013 at 11:08 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>>>>
>>>> Good, will resend this as complete form to Bjorn for v3.10.
>>>
>>> I haven't seen anything this actually fixes yet, except that maybe it
>>> removes some "can't assign" messages.  That doesn't sound like v3.10
>>> material.
>>
>> io port "can't assign" still there.
>
> The "can't assign io" message is harmless.  The host bridge doesn't
> support I/O port space, so we'll *never* be able to assign IO space.
> The best we can do for this part of the problem is to get rid of the
> message, and that's not urgent for v3.10.  Actually, I don't think we
> *should* get rid of the message; if we can't assign IO space to an IO
> BAR, I want to know about it.  Maybe the message should be clearer or
> less alarming, but it should be there.

Agreed, that is only a harmless message.

>
>> the problem that it fixed: don't fallback wrongly for mmio when ioport
>> fails with must+optional.
>>
>> because if it fall back to mmio must-only, later extending to cover
>> optional will
>> not find extra space as mmio range and mmio-pref range in the bridge
>> is connected
>> together.
>
> I know that we retry when we don't need to.  If the retry results in
> MEM assignments that are *worse* than the original ones, that might be
> a problem we should fix.  But that problem should have nothing to do
> with I/O port space.  Conceptually, if we retry because IO assignment
> failed, the MEM assignments should not change.

Yes, that is what the patch tries to do: keep the mmio allocation from
the first try, and only retry with io port.

>
> I could probably grope through your logs and figure out whether this
> is happening, but it's your job to do that legwork and convince me
> that a change is necessary.

sure.

Yinghai


* Re: Resource assignment oddities
  2013-05-23 21:00                                 ` Yinghai Lu
@ 2013-05-23 21:23                                   ` Benjamin Herrenschmidt
  2013-05-23 22:16                                     ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-23 21:23 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci

On Thu, 2013-05-23 at 14:00 -0700, Yinghai Lu wrote:
> Yes, that is what patch try to do. Keep the mmio allocation from first
> try, and only retry with io port.

Can you describe more precisely what happens with the *current* code ?

Ie. The first pass allocates my device MMIO regions fine. The second
pass then spews some error messages about some mem allocation. I can
still observe *something* being assigned to devices and in the (limited)
setup I've been able to test with, so far, the devices seemed to still
work, so I don't have a good grasp of the extent of the risk here.

Is there a chance that this failed "second pass" actually prevents
proper allocation of resources ?

Cheers,
Ben.




* Re: Resource assignment oddities
  2013-05-23 21:23                                   ` Benjamin Herrenschmidt
@ 2013-05-23 22:16                                     ` Yinghai Lu
  2013-05-24 15:59                                       ` Bjorn Helgaas
  0 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-05-23 22:16 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Bjorn Helgaas, Gavin Shan, linux-pci

On Thu, May 23, 2013 at 2:23 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Thu, 2013-05-23 at 14:00 -0700, Yinghai Lu wrote:
>> Yes, that is what patch try to do. Keep the mmio allocation from first
>> try, and only retry with io port.
>
> Can you describe more precisely what happens with the *current* code ?
>
> Ie. The first pass allocates my device MMIO regions fine. The second
> pass them spew some error messages about some mem allocation. I can
> still observe *something* being assigned to devices and in the (limited)
> setup I've been able to test with, so far, the devices seemed to still
> work, so I don't have a good grasp of the extent of the risk here.

After bus_size_bridge, we will have two lists:
a. head, which is for must-have resources
b. realloc_head, which is for optional sizes.

Then in __assign_resources_sorted:
In the first try, clone the head list to a saved list, then update the
head list with the optional sizes from the realloc_head list, and try to
assign resources. Failed resources will be added to local_fail_head.

Retry:
If local_fail_head is empty, that means must+optional are all allocated,
so we are done.

If local_fail_head is not empty, restore the head list from the saved list,
then try to assign resources; failed resources will be added to the global
fail list.
Then try to extend (reassign) resources with the optional size from realloc_size.
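
To make the flow above concrete, here is a minimal toy model in C. It is
only a sketch under loud assumptions: toy_res, toy_windows, toy_assign,
toy_extend and toy_assign_sorted are made-up names, not the kernel's
functions; the allocator just packs resources back to back; and it can
only grow a resource in place, while the real reassign code may also
move it. It shows how a failure in one address space (here io, with an
empty io window) pushes the mmio resources into the must-only path too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; not the kernel's struct pci_dev_resource. */
struct toy_res {
	bool is_io;           /* true: I/O port space, false: MMIO */
	unsigned long must;   /* must-have size */
	unsigned long add;    /* optional extra size */
	unsigned long start;  /* assigned offset inside its window */
	unsigned long size;   /* assigned size, 0 while unassigned */
};

struct toy_windows {
	unsigned long io;     /* size of the I/O window, 0 if the bus has none */
	unsigned long mem;    /* size of the MMIO window */
};

/* Pack resources back to back per window; return how many did not fit. */
size_t toy_assign(struct toy_res *res, size_t n,
		  const struct toy_windows *win, bool with_optional)
{
	unsigned long next_io = 0, next_mem = 0;
	size_t failed = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned long want = res[i].must +
				     (with_optional ? res[i].add : 0);
		unsigned long limit = res[i].is_io ? win->io : win->mem;
		unsigned long *next = res[i].is_io ? &next_io : &next_mem;

		res[i].start = 0;
		res[i].size = 0;
		if (want > limit - *next) {
			failed++;
			continue;
		}
		res[i].start = *next;
		res[i].size = want;
		*next += want;
	}
	return failed;
}

/* Grow assigned resources in place; return how many could not grow. */
size_t toy_extend(struct toy_res *res, size_t n,
		  const struct toy_windows *win)
{
	size_t failed = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned long end, limit;

		if (!res[i].size || !res[i].add)
			continue;
		end = res[i].start + res[i].size;
		limit = res[i].is_io ? win->io : win->mem;
		/* The nearest assigned neighbour above us caps the growth. */
		for (size_t j = 0; j < n; j++)
			if (j != i && res[j].size &&
			    res[j].is_io == res[i].is_io &&
			    res[j].start >= end && res[j].start < limit)
				limit = res[j].start;
		assert(end <= limit);   /* packing never overlaps */
		if (res[i].add > limit - end)
			failed++;
		else
			res[i].size += res[i].add;
	}
	return failed;
}

/* Two-try flow: must+optional first, else must-only and then extend. */
size_t toy_assign_sorted(struct toy_res *res, size_t n,
			 const struct toy_windows *win)
{
	if (toy_assign(res, n, win, true) == 0)
		return 0;
	toy_assign(res, n, win, false);
	return toy_extend(res, n, win);
}
```

With no io window, two mmio resources and one io resource, the first try
fails on io only, the fallback packs both mmio resources at must-have
size, and the first one can no longer grow because the second sits right
after it.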

>
> Is there a chance that this failed "second pass" actually prevents
> proper allocation of resources ?

A failed "second pass" should only fail to extend optional resources,
like the resources for SRIOV, because it fails to allocate resources
compactly.

In your case, the "second pass" assigns must-have, then extends with optional.

pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d01081ffffff]
pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01082000000-0x3d010837fffff pref]
pci 0001:00:00.0: BAR 7: can't assign io (size 0x3000)  ==> cause first try fail
pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d010817fffff]
pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01081800000-0x3d010827fffff pref]
===> go extend now
pci 0001:00:00.0: BAR 8: reassigned [mem 0x3d01082800000-0x3d010847fffff]
pci 0001:00:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref]
==> luckily it will extend ok, but it will stay in the MIDDLE.

then come the poor 0001:01:00.0

pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01082800000-0x3d01083ffffff]
pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010817fffff pref]
pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01084000000-0x3d0108403ffff]
pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000) ===> cause first try fail
pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01082800000-0x3d010837fffff]
pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d01080ffffff pref]
pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01083800000-0x3d0108383ffff]
===> go extend optional now
pci 0001:01:00.0: BAR 8: can't assign mem (size 0x1000000)
pci 0001:01:00.0: failed to add 800000 res[8]=[mem 0x3d01082800000-0x3d010837fffff]
pci 0001:01:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref]
pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000)
===> so BAR 8 cannot extend now, because the other one lies in the middle...


So we need the patch, which will prevent us from falling into the
"allocate must and extend it" trap, as that does not allocate resources
efficiently.

Otherwise, in your case, device pci 0001:01:00.0 will have problems using SRIOV.

Thanks.

Yinghai


* Re: Resource assignment oddities
  2013-05-23 22:16                                     ` Yinghai Lu
@ 2013-05-24 15:59                                       ` Bjorn Helgaas
  2013-05-24 16:33                                         ` Benjamin Herrenschmidt
  2013-05-24 16:34                                         ` Yinghai Lu
  0 siblings, 2 replies; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-24 15:59 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci

On Thu, May 23, 2013 at 4:16 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Thu, May 23, 2013 at 2:23 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>> On Thu, 2013-05-23 at 14:00 -0700, Yinghai Lu wrote:
>>> Yes, that is what patch try to do. Keep the mmio allocation from first
>>> try, and only retry with io port.
>>
>> Can you describe more precisely what happens with the *current* code ?
>>
>> Ie. The first pass allocates my device MMIO regions fine. The second
>> pass them spew some error messages about some mem allocation. I can
>> still observe *something* being assigned to devices and in the (limited)
>> setup I've been able to test with, so far, the devices seemed to still
>> work, so I don't have a good grasp of the extent of the risk here.
>
> after bus_size_bridge, will have two list
> a. head that is for must-have resource
> b. realloc_head that is for optional size.
>
> then __assign_resources_sorted:
> in the first try,
> clone head list to saved list,  then update head list with optional size
> from realloc_head list. and try to assign resources, fail resources
> will be added to local_fail_head.
>
> retry:
> if local_fail_head is empty, mean must+optional are all allocated, so done.
>
> if local_fail_head is not empty, will restore head list from saved list.
> then try to assign resource, fail resources will be add to global resources.
> then try to extend (reassign) resources with optional size from realloc_size.
>
>>
>> Is there a chance that this failed "second pass" actually prevents
>> proper allocation of resources ?
>
> failed "second pass", should only fail to extend optional resources
> like resource
> for SRIOV, because fail to allocate resource compactly.
>
> in your case, "second pass" assign must-have, then extend optional way.
>
> pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d01081ffffff]
> pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01082000000-0x3d010837fffff pref]
> pci 0001:00:00.0: BAR 7: can't assign io (size 0x3000)  ==> cause first try fail
> pci 0001:00:00.0: BAR 8: assigned [mem 0x3d01080000000-0x3d010817fffff]
> pci 0001:00:00.0: BAR 9: assigned [mem 0x3d01081800000-0x3d010827fffff pref]
> ===> go extend now
> pci 0001:00:00.0: BAR 8: reassigned [mem 0x3d01082800000-0x3d010847fffff]
> pci 0001:00:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref]
> ==> luckly it will extend ok, but it will stay in the MIDDLE.
>
> then come the poor 0001:01:00.0
>
> pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01082800000-0x3d01083ffffff]
> pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010817fffff pref]
> pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01084000000-0x3d0108403ffff]
> pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000) ===> cause first try fail
> pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01082800000-0x3d010837fffff]
> pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d01080ffffff pref]
> pci 0001:01:00.0: BAR 0: assigned [mem 0x3d01083800000-0x3d0108383ffff]
> ===> go extend optional now
> pci 0001:01:00.0: BAR 8: can't assign mem (size 0x1000000)
> pci 0001:01:00.0: failed to add 800000 res[8]=[mem
> 0x3d01082800000-0x3d010837fffff]
> pci 0001:01:00.0: BAR 9: reassigned [mem 0x3d01080000000-0x3d010817fffff pref]
> pci 0001:01:00.0: BAR 7: can't assign io (size 0x3000)
> ===> so BAR8 can not extend now...because other lay on the middle...
>
>
> So we need to patch that will prevent us to fall into "allocate must
> and extend it"
> trap as it will not allocate resource efficiently.
>
> otherwise in your case, devices pci0001:01:00.0 will have problem to use SRIOV.

Whew.  I think I finally see what you're getting at.  From the log in
Ben's very first email, it looks like we initially assign two 24MB MEM
windows to the 0001:01:00.0 bridge:

  pci 0001:01:00.0: BAR 8: assigned [mem 0x3d01081800000-0x3d01082ffffff]
  pci 0001:01:00.0: BAR 9: assigned [mem 0x3d01080000000-0x3d010817fffff pref]

After the reassignment brouhaha, the prefetchable window is unchanged,
but the non-prefetchable window has been reduced in size to 8MB:

  pci 0001:01:00.0: PCI bridge to [bus 02-0d]
  pci 0001:01:00.0:   bridge window [mem 0x3d01081800000-0x3d01081ffffff]
  pci 0001:01:00.0:   bridge window [mem 0x3d01080000000-0x3d010817fffff pref]

That should not happen, because the only reason we're reassigning is
for IO space, and we shouldn't be touching the MEM assignments.  I
agree that we should fix that.
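
For anyone who wants to check the window sizes quoted above, the
arithmetic is just end - start + 1 on the inclusive ranges printed in
the log. A small sketch in C; the helper names window_size and
window_size_mb are made up for illustration and are not kernel
functions:

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of an inclusive [start, end] range, as printed in the log. */
uint64_t window_size(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start + 1;
}

/* The same size expressed in whole megabytes (1 MB = 1 << 20 bytes). */
uint64_t window_size_mb(uint64_t start, uint64_t end)
{
	return window_size(start, end) >> 20;
}
```

Feeding it the three ranges above gives 0x1800000 bytes (24MB) for each
of the two initial windows and 0x800000 bytes (8MB) for the final
non-prefetchable window.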

I do not agree that your current patches are the right way to fix
this.  You are trying to make a relatively small patch that we can
wedge into v3.10 in a hurry.  I would prefer that we work out a
cleaner solution that simplifies reassignment, *then* figure out the
best way to backport it as needed.

Bjorn


* Re: Resource assignment oddities
  2013-05-24 15:59                                       ` Bjorn Helgaas
@ 2013-05-24 16:33                                         ` Benjamin Herrenschmidt
  2013-05-24 16:34                                         ` Yinghai Lu
  1 sibling, 0 replies; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-05-24 16:33 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, Gavin Shan, linux-pci

On Fri, 2013-05-24 at 09:59 -0600, Bjorn Helgaas wrote:
> I do not agree that your current patches are the right way to fix
> this.  You are trying to make a relatively small patch that we can
> wedge into v3.10 in a hurry.  I would prefer that we work out a
> cleaner solution that simplifies reassignment, *then* figure out the
> best way to backport it as needed.

I agree. The current stuff seems to be working enough for basic
functionality (read: getting the distro installer to work). If the only
problem we have is with SR-IOV or hotplug I can deal with that.

The right solution is to once and for all consider IO and MEM as two
completely separate address spaces that require completely different
passes IMHO.

Cheers,
Ben.




* Re: Resource assignment oddities
  2013-05-24 15:59                                       ` Bjorn Helgaas
  2013-05-24 16:33                                         ` Benjamin Herrenschmidt
@ 2013-05-24 16:34                                         ` Yinghai Lu
  1 sibling, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-24 16:34 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci

On Fri, May 24, 2013 at 8:59 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Thu, May 23, 2013 at 4:16 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> That should not happen, because the only reason we're reassigning is
> for IO space, and we shouldn't be touching the MEM assignments.  I
> agree that we should fix that.

yes, that is

https://patchwork.kernel.org/patch/2608461/

>
> I do not agree that your current patches are the right way to fix
> this.  You are trying to make a relatively small patch that we can
> wedge into v3.10 in a hurry.  I would prefer that we work out a
> cleaner solution that simplifies reassignment, *then* figure out the
> best way to backport it as needed.

OK, we can leave the per-root-bus method to v3.11.

We can put only one patch in v3.10:

https://patchwork.kernel.org/patch/2608461/
PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional

That should fix the problem that Ben and Shan hit.

Thanks

Yinghai


* Re: [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional
  2013-05-23 17:11                         ` [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional Yinghai Lu
@ 2013-05-24 17:25                           ` Bjorn Helgaas
  2013-05-24 23:31                             ` [PATCH v3] " Yinghai Lu
  2013-05-24 23:34                             ` [PATCH] " Yinghai Lu
  0 siblings, 2 replies; 76+ messages in thread
From: Bjorn Helgaas @ 2013-05-24 17:25 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci, linux-kernel

On Thu, May 23, 2013 at 11:11 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> BenH reported that there is some assign unassigned resource problem
> in powerpc.
>
> It turns out after
> | commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
> | Date:   Thu Feb 23 19:23:29 2012 -0800
> |
> |    PCI: Retry on IORESOURCE_IO type allocations
>
> even the root bus does not have io port range, it will keep retrying
> to realloc with mmio.
>
> Current retry logic is : try with must+optional at first, and if
> it fails with any ioport or mmio, it will try must then try to extend
> must with optional.
> That will fail as mmio-non-pref and mmio-pref for bridge will
> be next to each other. So we have no chance to extend mmio-non-pref.
>
> We can check fail type and only fall back for io port only, that will
> keep mmio type still have must+optional.
>
> This will be become more often when we have x86 8 sockets or 32 sockets
> system, and those system will have one root bus per socket.
> They will have some root buses do not have ioport range.
>
> -v2: need to remove assigned entries from optional list too.
>
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Tested-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>
> ---
>  drivers/pci/setup-bus.c |   23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> Index: linux-2.6/drivers/pci/setup-bus.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/setup-bus.c
> +++ linux-2.6/drivers/pci/setup-bus.c
> @@ -317,6 +317,10 @@ static void __assign_resources_sorted(st
>         LIST_HEAD(local_fail_head);
>         struct pci_dev_resource *save_res;
>         struct pci_dev_resource *dev_res;
> +       unsigned long fail_type = 0;
> +       struct pci_dev_resource *fail_res;
> +       unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
> +                                 IORESOURCE_PREFETCH;
>
>         /* Check if optional add_size is there */
>         if (!realloc_head || list_empty(realloc_head))
> @@ -348,6 +352,25 @@ static void __assign_resources_sorted(st
>                 return;
>         }
>
> +       /* check failed type */
> +       list_for_each_entry(fail_res, &local_fail_head, list)
> +               fail_type |= fail_res->flags & type_mask;
> +       /* only io port fails */
> +       if ((fail_type & type_mask) == IORESOURCE_IO) {
> +               struct pci_dev_resource *tmp_res;
> +
> +               /* remove assigned non ioport from head list etc */
> +               list_for_each_entry_safe(dev_res, tmp_res, head, list)
> +                       if (dev_res->res->parent &&
> +                           !(dev_res->res->flags & IORESOURCE_IO)) {
> +                               /* remove it from realloc_head list */
> +                               remove_from_list(realloc_head, dev_res->res);
> +                               remove_from_list(&save_head, dev_res->res);
> +                               list_del(&dev_res->list);
> +                               kfree(dev_res);
> +                       }
> +       }

The problem we're trying to solve is that when allocation for type X
fails, we retry allocation for type Y.

This patch handles IO specially.  I think it basically says, "if we
only have IO allocation failures, don't retry MEM allocation."  But a
clean strategy would also avoid retrying IO allocation if we only had
MEM allocation failures.
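
As a rough illustration of that symmetric rule, here is a sketch in C.
Everything in it is hypothetical: the TOY_IO/TOY_MEM flags and the two
helpers are made-up names with made-up values, not the kernel's
IORESOURCE_* bits, and it ignores the prefetchable/non-prefetchable
coupling entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type flags; the values are not the kernel's. */
enum {
	TOY_IO  = 1u << 0,
	TOY_MEM = 1u << 1,
	TOY_TYPE_MASK = TOY_IO | TOY_MEM,
};

/* OR together the types of all resources that failed to be assigned. */
unsigned int toy_failed_types(const unsigned int *failed_flags, size_t n)
{
	unsigned int mask = 0;

	assert(failed_flags || n == 0);
	for (size_t i = 0; i < n; i++)
		mask |= failed_flags[i] & TOY_TYPE_MASK;
	return mask;
}

/* Retry a resource only if something of its own type failed. */
bool toy_should_retry(unsigned int failed_mask, unsigned int res_flags)
{
	return (failed_mask & res_flags & TOY_TYPE_MASK) != 0;
}
```

With only IO failures in the list, a MEM resource is left alone and only
IO is retried; with only MEM failures, it is the other way around.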

Bjorn

>         free_list(&local_fail_head);
>         /* Release assigned resource */
>         list_for_each_entry(dev_res, head, list)


* [PATCH v3] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional
  2013-05-24 17:25                           ` Bjorn Helgaas
@ 2013-05-24 23:31                             ` Yinghai Lu
  2013-05-24 23:34                             ` [PATCH] " Yinghai Lu
  1 sibling, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-24 23:31 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources
on powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus does not have an io port range, it will keep
retrying to realloc with mmio.

The current retry logic is: try with must+optional first, and if
that fails for any ioport or mmio resource, try must-only and then try
to extend must with optional. The reassignment will put the extended
mmio or pref mmio in the middle of the parent resource range.
That will prevent sibling resources from getting their must-have
resources or from being extended properly.

We can check the failed type to see whether we need to fall back to
must-have only; resources that do not need to be released stay at
must+optional.

Check the three resource types separately to decide whether we need to
release assigned resources after the requested + add_size try:
1. If an io port assignment fails, release the assigned io ports.
2. If a pref mmio assignment fails, release the assigned pref mmio.
   If an assigned pref mmio's parent is non-pref mmio and a
   non-pref mmio assignment fails, release that assigned pref mmio.
3. If a non-pref mmio assignment or a pref mmio assignment fails,
   release the assigned non-pref mmio.

This will become more common when we have x86 8-socket or 32-socket
systems, which have one root bus per socket.
Some of those root buses will not have an ioport range.

-v2: need to remove assigned entries from optional list too.
-v3: not just checking ioport related, requested by Bjorn.


Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   70 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -300,6 +300,48 @@ static void assign_requested_resources_s
 	}
 }
 
+static unsigned long pci_fail_res_type_mask(struct list_head *fail_head)
+{
+	struct pci_dev_resource *fail_res;
+	unsigned long mask = 0;
+
+	/* check failed type */
+	list_for_each_entry(fail_res, fail_head, list)
+		mask |= fail_res->flags;
+
+	/*
+	 * one pref failed resource will set IORESOURCE_MEM,
+	 * as we can allocate pref in non-pref range.
+	 * Will release all assigned non-pref sibling resources
+	 * according to that bit.
+	 */
+	return mask & (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH);
+}
+
+static bool pci_need_to_release(unsigned long mask, struct resource *res)
+{
+	if (res->flags & IORESOURCE_IO)
+		return !!(mask & IORESOURCE_IO);
+
+	/* check pref at first */
+	if (res->flags & IORESOURCE_PREFETCH) {
+		if (mask & IORESOURCE_PREFETCH)
+			return true;
+		/* count pref if its parent is non-pref */
+		else if ((mask & IORESOURCE_MEM) &&
+			 !(res->parent->flags & IORESOURCE_PREFETCH))
+			return true;
+		else
+			return false;
+	}
+
+	if (res->flags & IORESOURCE_MEM)
+		return !!(mask & IORESOURCE_MEM);
+
+	/* should not get here */
+	return false;
+}
+
 static void __assign_resources_sorted(struct list_head *head,
 				 struct list_head *realloc_head,
 				 struct list_head *fail_head)
@@ -312,11 +354,24 @@ static void __assign_resources_sorted(st
 	 *  if could do that, could get out early.
 	 *  if could not do that, we still try to assign requested at first,
 	 *    then try to reassign add_size for some resources.
+	 *
+	 * Separate three resource type checking if we need to release
+	 *  assigned resource after requested + add_size try.
+	 *	1. if there is io port assign fail, will release assigned
+	 *	   io port.
+	 *	2. if there is pref mmio assign fail, release assigned
+	 *	   pref mmio.
+	 *	   if assigned pref mmio's parent is non-pref mmio and there
+	 *	   is non-pref mmio assign fail, will release that assigned
+	 *	   pref mmio.
+	 *	3. if there is non-pref mmio assign fail or pref mmio
+	 *	   assigned fail, will release assigned non-pref mmio.
 	 */
 	LIST_HEAD(save_head);
 	LIST_HEAD(local_fail_head);
 	struct pci_dev_resource *save_res;
-	struct pci_dev_resource *dev_res;
+	struct pci_dev_resource *dev_res, *tmp_res;
+	unsigned long fail_type;
 
 	/* Check if optional add_size is there */
 	if (!realloc_head || list_empty(realloc_head))
@@ -348,6 +403,19 @@ static void __assign_resources_sorted(st
 		return;
 	}
 
+	/* check failed type */
+	fail_type = pci_fail_res_type_mask(&local_fail_head);
+	/* remove not need to be released assigned res from head list etc */
+	list_for_each_entry_safe(dev_res, tmp_res, head, list)
+		if (dev_res->res->parent &&
+		    !pci_need_to_release(fail_type, dev_res->res)) {
+			/* remove it from realloc_head list */
+			remove_from_list(realloc_head, dev_res->res);
+			remove_from_list(&save_head, dev_res->res);
+			list_del(&dev_res->list);
+			kfree(dev_res);
+		}
+
 	free_list(&local_fail_head);
 	/* Release assigned resource */
 	list_for_each_entry(dev_res, head, list)

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional
  2013-05-24 17:25                           ` Bjorn Helgaas
  2013-05-24 23:31                             ` [PATCH v3] " Yinghai Lu
@ 2013-05-24 23:34                             ` Yinghai Lu
  1 sibling, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-05-24 23:34 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci, linux-kernel

On Fri, May 24, 2013 at 10:25 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> The problem we're trying to solve is that when allocation for type X
> fails, we retry allocation for type Y.
>
> This patch handles IO specially.  I think it basically says, "if we
> only have IO allocation failures, don't retry MEM allocation."  But a
> clean strategy would also avoid retrying IO allocation if we only had
> MEM allocation failures.

Well, that makes it a little more complicated, as in the v3 that I sent
in another mail.

We need to handle three types separately: ioport, mmio, and mmio pref.

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis
  2013-05-06 10:48                 ` Benjamin Herrenschmidt
                                     ` (3 preceding siblings ...)
  2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
@ 2013-06-01  6:03                   ` Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 2/7] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
                                       ` (7 more replies)
  4 siblings, 8 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources on
powerpc.

It turns out after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus does not have an io port range, it keeps retrying
to realloc mmio.

After checking the code, I found that the io port and mmio failure paths
are bound together.
The first patch fixes the problem: mmio no longer falls back to must-only
when only io port assignment fails with must+optional.

While working on that fix, we found that assigning unassigned resources
can be split per root bus.
That makes the code simpler and lets the hotadd path reuse it.

These patches are targeted at 3.11.

-v4: split first patch into 4 patches per Bjorn.
-v5: drop the two patches that passed the root bus resource mask, after
     we found a simpler and less intrusive way to fix the problem.

 PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional
 PCI: Don't use temp bus for pci_bus_release_bridge_resources
 PCI: Use pci_walk_bus to detect unassigned resources
 PCI: Introduce enable_local to prepare per root bus handling
 PCI: Split pci_assign_unassigned_resources to per root bus
 PCI: Enable pci bridge when it is needed
 PCI: Retry assign unassigned resources for hotadd root bus

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 2/7] PCI: Don't use temp bus for pci_bus_release_bridge_resources
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
@ 2013-06-01  6:03                     ` Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources Yinghai Lu
                                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

The bus variable can no longer be used as a temporary once assigning
unassigned resources is handled per root bus.

Per Bjorn, separate this from the big patch that handles assign_unassigned per root bus.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1526,12 +1526,11 @@ again:
 	 * Try to release leaf bridge's resources that doesn't fit resource of
 	 * child device under that bridge
 	 */
-	list_for_each_entry(fail_res, &fail_head, list) {
-		bus = fail_res->dev->bus;
-		pci_bus_release_bridge_resources(bus,
+	list_for_each_entry(fail_res, &fail_head, list)
+		pci_bus_release_bridge_resources(fail_res->dev->bus,
 						 fail_res->flags & type_mask,
 						 rel_type);
-	}
+
 	/* restore size and flags */
 	list_for_each_entry(fail_res, &fail_head, list) {
 		struct resource *res = fail_res->res;

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 2/7] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
@ 2013-06-01  6:03                     ` Yinghai Lu
  2013-06-25 21:15                       ` Bjorn Helgaas
  2013-06-01  6:03                     ` [PATCH v5 4/7] PCI: Introduce enable_local to prepare per root bus handling Yinghai Lu
                                       ` (5 subsequent siblings)
  7 siblings, 1 reply; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Per Bjorn, use pci_walk_bus instead of for_each_pci_dev or
calling pci_realloc_detect() recursively; that makes the code more readable.

Per Bjorn, separate this from the big patch that handles assign_unassigned per root bus.
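
The callback pattern used here, where a nonzero return value stops the
walk early, can be sketched in user space as follows. This is a model
only: struct dev_model, struct walk_stats, walk_devices() and
check_unassigned() are hypothetical names, and the recursive walk is an
assumption made for brevity, not how pci_walk_bus() is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct dev_model {
	int unassigned;            /* nonzero if a resource has flags but no start */
	struct dev_model *child;   /* first device on the subordinate bus, if any */
	struct dev_model *sibling; /* next device on the same bus */
};

struct walk_stats {
	int visited;    /* devices the callback was called for */
	int unassigned; /* devices reported as unassigned */
};

typedef int (*walk_cb)(struct dev_model *dev, void *data);

/* Depth-first walk that stops as soon as a callback returns nonzero. */
static int walk_devices(struct dev_model *first, walk_cb cb, void *data)
{
	struct dev_model *dev;

	assert(cb != NULL);

	for (dev = first; dev; dev = dev->sibling) {
		if (cb(dev, data))
			return 1;
		if (walk_devices(dev->child, cb, data))
			return 1;
	}
	return 0;
}

/* Callback: count the device, and stop the walk at the first hit. */
static int check_unassigned(struct dev_model *dev, void *data)
{
	struct walk_stats *stats = data;

	stats->visited++;
	if (dev->unassigned) {
		stats->unassigned++;
		return 1;
	}
	return 0;
}
```

Devices after the first unassigned one are never visited, which is what
the "return 1" in the patch's callback relies on.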

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   46 +++++++++++++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 15 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1427,30 +1427,46 @@ static bool __init pci_realloc_enabled(v
 	return pci_realloc_enable >= user_enabled;
 }
 
-static void __init pci_realloc_detect(void)
-{
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
-	struct pci_dev *dev = NULL;
+static int __init check_unassigned_resources(struct pci_dev *dev, void *data)
+{
+	int i;
+	int *unassigned = data;
 
-	if (pci_realloc_enable != undefined)
-		return;
+	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
+		struct resource *r = &dev->resource[i];
 
-	for_each_pci_dev(dev) {
-		int i;
+		/* Not assigned, or rejected by kernel ? */
+		if (r->flags && !r->start) {
+			(*unassigned)++;
+			return 1; /* return early from pci_walk_bus */
+		}
+	}
 
-		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
-			struct resource *r = &dev->resource[i];
+	return 0;
+}
 
-			/* Not assigned, or rejected by kernel ? */
-			if (r->flags && !r->start) {
-				pci_realloc_enable = auto_enabled;
+static void  __init pci_realloc_detect(void)
+{
+	int unassigned = 0;
+	struct pci_bus *bus;
 
-				return;
-			}
+	if (pci_realloc_enable != undefined)
+		return;
+
+	list_for_each_entry(bus, &pci_root_buses, node) {
+		pci_walk_bus(bus, check_unassigned_resources, &unassigned);
+		if (unassigned) {
+			pci_realloc_enable = auto_enabled;
+			return;
 		}
 	}
-#endif
 }
+#else
+static void __init pci_realloc_detect(void)
+{
+}
+#endif
 
 /*
  * first try will not touch pci bridge res

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 4/7] PCI: Introduce enable_local to prepare per root bus handling
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 2/7] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources Yinghai Lu
@ 2013-06-01  6:03                     ` Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 5/7] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
                                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Add enable_local to prepare for assigning unassigned resources
per root bus.

Per Bjorn, separate this from the big patch that handles assign_unassigned per root bus.
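
The idea of threading a local enable state through the detection step,
instead of writing to the global, might look like this in a user-space
model. The enum ordering is assumed from the "enable >= user_enabled"
check (user_disabled in particular is an assumption), and
realloc_detect(), detect_all() and the bool array standing in for "this
root bus has unassigned resources" are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Ordering assumed from the "enable >= user_enabled" comparison. */
enum enable_type {
	undefined = -1,
	user_disabled,
	user_enabled,
	auto_enabled,
};

static bool realloc_enabled(enum enable_type enable)
{
	return enable >= user_enabled;
}

/* A user setting always wins; otherwise auto-enable on unassigned resources. */
static enum enable_type realloc_detect(bool bus_has_unassigned,
				       enum enable_type enable_local)
{
	if (enable_local != undefined)
		return enable_local;

	return bus_has_unassigned ? auto_enabled : enable_local;
}

/* Fold the per-bus detection over all root buses, as the boot path does. */
static enum enable_type detect_all(const bool *buses, size_t n,
				   enum enable_type start)
{
	enum enable_type enable_local = start;
	size_t i;

	assert(buses != NULL || n == 0);

	for (i = 0; i < n; i++)
		enable_local = realloc_detect(buses[i], enable_local);

	return enable_local;
}
```

Because the state is a value passed in and returned, the same function
can later be called for a single root bus without touching the global.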

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   40 +++++++++++++++++++++-------------------
 1 file changed, 21 insertions(+), 19 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1422,9 +1422,9 @@ void __init pci_realloc_get_opt(char *st
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
-static bool __init pci_realloc_enabled(void)
+static bool __init pci_realloc_enabled(enum enable_type enable)
 {
-	return pci_realloc_enable >= user_enabled;
+	return enable >= user_enabled;
 }
 
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
@@ -1446,25 +1446,25 @@ static int __init check_unassigned_resou
 	return 0;
 }
 
-static void  __init pci_realloc_detect(void)
+static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+			 enum enable_type enable_local)
 {
 	int unassigned = 0;
-	struct pci_bus *bus;
 
-	if (pci_realloc_enable != undefined)
-		return;
+	if (enable_local != undefined)
+		return enable_local;
 
-	list_for_each_entry(bus, &pci_root_buses, node) {
-		pci_walk_bus(bus, check_unassigned_resources, &unassigned);
-		if (unassigned) {
-			pci_realloc_enable = auto_enabled;
-			return;
-		}
-	}
+	pci_walk_bus(bus, check_unassigned_resources, &unassigned);
+	if (unassigned)
+		return auto_enabled;
+
+	return enable_local;
 }
 #else
-static void __init pci_realloc_detect(void)
+static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+			 enum enable_type enable_local)
 {
+	return enable_local;
 }
 #endif
 
@@ -1487,10 +1487,12 @@ pci_assign_unassigned_resources(void)
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
+	enum enable_type enable_local = pci_realloc_enable;
+
+	list_for_each_entry(bus, &pci_root_buses, node)
+		enable_local = pci_realloc_detect(bus, enable_local);
 
-	/* don't realloc if asked to do so */
-	pci_realloc_detect();
-	if (pci_realloc_enabled()) {
+	if (pci_realloc_enabled(enable_local)) {
 		int max_depth = pci_get_max_depth();
 
 		pci_try_num = max_depth + 1;
@@ -1522,9 +1524,9 @@ again:
 		goto enable_and_dump;
 
 	if (tried_times >= pci_try_num) {
-		if (pci_realloc_enable == undefined)
+		if (enable_local == undefined)
 			printk(KERN_INFO "Some PCI device resources are unassigned, try booting with pci=realloc\n");
-		else if (pci_realloc_enable == auto_enabled)
+		else if (enable_local == auto_enabled)
 			printk(KERN_INFO "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 5/7] PCI: Split pci_assign_unassigned_resources to per root bus
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
                                       ` (2 preceding siblings ...)
  2013-06-01  6:03                     ` [PATCH v5 4/7] PCI: Introduce enable_local to prepare per root bus handling Yinghai Lu
@ 2013-06-01  6:03                     ` Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 6/7] PCI: Enable pci bridge when it is needed Yinghai Lu
                                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

We need to split pci_assign_unassigned_resources per root bus, so that
each root bus can have its own retry count for assign_unassigned.

We also need the root bus hot add and boot paths to use the same code.
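
As a rough illustration of why a per root bus retry count matters: the
number of tries is the bridge depth below that root bus plus one, so a
shallow root bus no longer inherits the count of the deepest one. The
sketch below is a user-space model; struct bus_model, bus_depth() and
try_count() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILD_BUSES 4

/* Hypothetical stand-in for struct pci_bus: only the bridge hierarchy. */
struct bus_model {
	const struct bus_model *child[MAX_CHILD_BUSES];
	size_t nr_children;
};

/* Number of bridge levels below this bus (0 for a bus with no bridges). */
static int bus_depth(const struct bus_model *bus)
{
	int depth = 0;
	size_t i;

	assert(bus != NULL && bus->nr_children <= MAX_CHILD_BUSES);

	for (i = 0; i < bus->nr_children; i++) {
		int ret = bus_depth(bus->child[i]);

		if (ret + 1 > depth)
			depth = ret + 1;
	}
	return depth;
}

/* One try without realloc; depth + 1 tries with it, as in the patch. */
static int try_count(const struct bus_model *root, bool realloc_enabled)
{
	return realloc_enabled ? bus_depth(root) + 1 : 1;
}
```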

-v2: separate enable_local and pci_release_bridge_resources into
     other patches, as requested by Bjorn.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   62 +++++++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 37 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1383,21 +1383,6 @@ static int __init pci_bus_get_depth(stru
 
 	return depth;
 }
-static int __init pci_get_max_depth(void)
-{
-	int depth = 0;
-	struct pci_bus *bus;
-
-	list_for_each_entry(bus, &pci_root_buses, node) {
-		int ret;
-
-		ret = pci_bus_get_depth(bus);
-		if (ret > depth)
-			depth = ret;
-	}
-
-	return depth;
-}
 
 /*
  * -1: undefined, will auto detect later
@@ -1473,10 +1458,9 @@ static enum enable_type __init pci_reall
  * second  and later try will clear small leaf bridge res
  * will stop till to the max  deepth if can not find good one
  */
-void __init
-pci_assign_unassigned_resources(void)
+static void __init
+pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
-	struct pci_bus *bus;
 	LIST_HEAD(realloc_head); /* list of resources that
 					want additional resources */
 	struct list_head *add_list = NULL;
@@ -1487,17 +1471,17 @@ pci_assign_unassigned_resources(void)
 	unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
 				  IORESOURCE_PREFETCH;
 	int pci_try_num = 1;
-	enum enable_type enable_local = pci_realloc_enable;
-
-	list_for_each_entry(bus, &pci_root_buses, node)
-		enable_local = pci_realloc_detect(bus, enable_local);
+	enum enable_type enable_local;
 
+	/* don't realloc if asked to do so */
+	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
 	if (pci_realloc_enabled(enable_local)) {
-		int max_depth = pci_get_max_depth();
+		int max_depth = pci_bus_get_depth(bus);
 
 		pci_try_num = max_depth + 1;
-		printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n",
-			 max_depth, pci_try_num);
+		dev_printk(KERN_DEBUG, &bus->dev,
+			   "max bus depth: %d pci_try_num: %d\n",
+			   max_depth, pci_try_num);
 	}
 
 again:
@@ -1509,12 +1493,10 @@ again:
 		add_list = &realloc_head;
 	/* Depth first, calculate sizes and alignments of all
 	   subordinate buses. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_size_bridges(bus, add_list);
+	__pci_bus_size_bridges(bus, add_list);
 
 	/* Depth last, allocate resources and update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		__pci_bus_assign_resources(bus, add_list, &fail_head);
+	__pci_bus_assign_resources(bus, add_list, &fail_head);
 	if (add_list)
 		BUG_ON(!list_empty(add_list));
 	tried_times++;
@@ -1525,16 +1507,16 @@ again:
 
 	if (tried_times >= pci_try_num) {
 		if (enable_local == undefined)
-			printk(KERN_INFO "Some PCI device resources are unassigned, try booting with pci=realloc\n");
+			dev_info(&bus->dev, "Some PCI device resources are unassigned, try booting with pci=realloc\n");
 		else if (enable_local == auto_enabled)
-			printk(KERN_INFO "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
+			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);
 		goto enable_and_dump;
 	}
 
-	printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n",
-			 tried_times + 1);
+	dev_printk(KERN_DEBUG, &bus->dev,
+		   "No. %d try to assign unassigned res\n", tried_times + 1);
 
 	/* third times and later will not check if it is leaf */
 	if ((tried_times + 1) > 2)
@@ -1565,12 +1547,18 @@ again:
 
 enable_and_dump:
 	/* Depth last, update the hardware. */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_enable_bridges(bus);
+	pci_enable_bridges(bus);
 
 	/* dump the resource on buses */
-	list_for_each_entry(bus, &pci_root_buses, node)
-		pci_bus_dump_resources(bus);
+	pci_bus_dump_resources(bus);
+}
+
+void __init pci_assign_unassigned_resources(void)
+{
+	struct pci_bus *root_bus;
+
+	list_for_each_entry(root_bus, &pci_root_buses, node)
+		pci_assign_unassigned_root_bus_resources(root_bus);
 }
 
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 6/7] PCI: Enable pci bridge when it is needed
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
                                       ` (3 preceding siblings ...)
  2013-06-01  6:03                     ` [PATCH v5 5/7] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
@ 2013-06-01  6:03                     ` Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 7/7] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
                                       ` (2 subsequent siblings)
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Currently we enable bridges after the bus scan and resource assignment,
and the calls are spread over many places.

We can instead enable them later, when a child pci device is enabled:
go up to the root bus and enable the bridges one by one down to the pci
device.

That delays enabling bridges until they are needed, and also removes one
inconsistency between the boot path and the hot-add path in
acpi_pci_root_add().
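
The resulting order, with upstream bridges enabled first and bridges
that lead to no enabled device left alone, can be sketched in user space
as follows. This is a model under assumptions: struct node_model, the
upstream pointer (standing in for dev->bus->self) and the explicit
counter are hypothetical and exist only to make the order observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct node_model {
	struct node_model *upstream; /* bridge leading here, NULL on the root bus */
	int enable_order;            /* 0 = not enabled, else position in the order */
};

/* Enable a bridge only after everything upstream of it is enabled. */
static void enable_bridge(struct node_model *bridge, int *counter)
{
	if (!bridge)
		return;

	enable_bridge(bridge->upstream, counter);

	if (bridge->enable_order)
		return;			/* already enabled */
	bridge->enable_order = ++(*counter);
}

/* Enabling a device first enables the chain of bridges above it. */
static void enable_device(struct node_model *dev, int *counter)
{
	assert(dev != NULL && counter != NULL);

	if (dev->enable_order)
		return;			/* already enabled */

	enable_bridge(dev->upstream, counter);
	dev->enable_order = ++(*counter);
}
```

Enabling one endpoint touches only the bridges on its own path to the
root bus; a sibling bridge with no enabled children stays disabled.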

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/arm/kernel/bios32.c           |    5 -----
 arch/m68k/platform/coldfire/pci.c  |    1 -
 arch/mips/pci/pci.c                |    1 -
 arch/sh/drivers/pci/pci.c          |    1 -
 drivers/acpi/pci_root.c            |    4 ----
 drivers/parisc/lba_pci.c           |    1 -
 drivers/pci/bus.c                  |   19 -------------------
 drivers/pci/hotplug/acpiphp_glue.c |    1 -
 drivers/pci/pci.c                  |   20 ++++++++++++++++++++
 drivers/pci/probe.c                |    1 -
 drivers/pci/setup-bus.c            |   10 +++-------
 drivers/pcmcia/cardbus.c           |    1 -
 include/linux/pci.h                |    1 -
 13 files changed, 23 insertions(+), 43 deletions(-)

Index: linux-2.6/arch/arm/kernel/bios32.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/bios32.c
+++ linux-2.6/arch/arm/kernel/bios32.c
@@ -524,11 +524,6 @@ void pci_common_init(struct hw_pci *hw)
 			 * Assign resources.
 			 */
 			pci_bus_assign_resources(bus);
-
-			/*
-			 * Enable bridges
-			 */
-			pci_enable_bridges(bus);
 		}
 
 		/*
Index: linux-2.6/arch/m68k/platform/coldfire/pci.c
===================================================================
--- linux-2.6.orig/arch/m68k/platform/coldfire/pci.c
+++ linux-2.6/arch/m68k/platform/coldfire/pci.c
@@ -319,7 +319,6 @@ static int __init mcf_pci_init(void)
 	pci_fixup_irqs(pci_common_swizzle, mcf_pci_map_irq);
 	pci_bus_size_bridges(rootbus);
 	pci_bus_assign_resources(rootbus);
-	pci_enable_bridges(rootbus);
 
 	return 0;
 }
Index: linux-2.6/arch/mips/pci/pci.c
===================================================================
--- linux-2.6.orig/arch/mips/pci/pci.c
+++ linux-2.6/arch/mips/pci/pci.c
@@ -113,7 +113,6 @@ static void pcibios_scanbus(struct pci_c
 		if (!pci_has_flag(PCI_PROBE_ONLY)) {
 			pci_bus_size_bridges(bus);
 			pci_bus_assign_resources(bus);
-			pci_enable_bridges(bus);
 		}
 	}
 }
Index: linux-2.6/arch/sh/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/arch/sh/drivers/pci/pci.c
+++ linux-2.6/arch/sh/drivers/pci/pci.c
@@ -69,7 +69,6 @@ static void pcibios_scanbus(struct pci_c
 
 		pci_bus_size_bridges(bus);
 		pci_bus_assign_resources(bus);
-		pci_enable_bridges(bus);
 	} else {
 		pci_free_resource_list(&resources);
 	}
Index: linux-2.6/drivers/parisc/lba_pci.c
===================================================================
--- linux-2.6.orig/drivers/parisc/lba_pci.c
+++ linux-2.6/drivers/parisc/lba_pci.c
@@ -1533,7 +1533,6 @@ lba_driver_probe(struct parisc_device *d
 		lba_dump_res(&lba_dev->hba.lmmio_space, 2);
 #endif
 	}
-	pci_enable_bridges(lba_bus);
 
 	/*
 	** Once PCI register ops has walked the bus, access to config
Index: linux-2.6/drivers/pci/bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/bus.c
+++ linux-2.6/drivers/pci/bus.c
@@ -216,24 +216,6 @@ void pci_bus_add_devices(const struct pc
 	}
 }
 
-void pci_enable_bridges(struct pci_bus *bus)
-{
-	struct pci_dev *dev;
-	int retval;
-
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		if (dev->subordinate) {
-			if (!pci_is_enabled(dev)) {
-				retval = pci_enable_device(dev);
-				if (retval)
-					dev_err(&dev->dev, "Error enabling bridge (%d), continuing\n", retval);
-				pci_set_master(dev);
-			}
-			pci_enable_bridges(dev->subordinate);
-		}
-	}
-}
-
 /** pci_walk_bus - walk devices on/under bus, calling callback.
  *  @top      bus whose devices should be walked
  *  @cb       callback to be called for each device found
@@ -301,4 +283,3 @@ EXPORT_SYMBOL(pci_bus_put);
 EXPORT_SYMBOL(pci_bus_alloc_resource);
 EXPORT_SYMBOL_GPL(pci_bus_add_device);
 EXPORT_SYMBOL(pci_bus_add_devices);
-EXPORT_SYMBOL(pci_enable_bridges);
Index: linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/acpiphp_glue.c
+++ linux-2.6/drivers/pci/hotplug/acpiphp_glue.c
@@ -704,7 +704,6 @@ static int __ref enable_device(struct ac
 	acpiphp_sanitize_bus(bus);
 	acpiphp_set_hpp_values(bus);
 	acpiphp_set_acpi_region(slot);
-	pci_enable_bridges(bus);
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
 		/* Assume that newly added devices are powered on already. */
Index: linux-2.6/drivers/pci/pci.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci.c
+++ linux-2.6/drivers/pci/pci.c
@@ -1145,6 +1145,24 @@ int pci_reenable_device(struct pci_dev *
 	return 0;
 }
 
+static void pci_enable_bridge(struct pci_dev *dev)
+{
+	int retval;
+
+	if (!dev)
+		return;
+
+	pci_enable_bridge(dev->bus->self);
+
+	if (pci_is_enabled(dev))
+		return;
+	retval = pci_enable_device(dev);
+	if (retval)
+		dev_err(&dev->dev, "Error enabling bridge (%d), continuing\n",
+			retval);
+	pci_set_master(dev);
+}
+
 static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 {
 	int err;
@@ -1165,6 +1183,8 @@ static int pci_enable_device_flags(struc
 	if (atomic_inc_return(&dev->enable_cnt) > 1)
 		return 0;		/* already enabled */
 
+	pci_enable_bridge(dev->bus->self);
+
 	/* only skip sriov related */
 	for (i = 0; i <= PCI_ROM_RESOURCE; i++)
 		if (dev->resource[i].flags & flags)
Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1964,7 +1964,6 @@ unsigned int __ref pci_rescan_bus(struct
 
 	max = pci_scan_child_bus(bus);
 	pci_assign_unassigned_bus_resources(bus);
-	pci_enable_bridges(bus);
 	pci_bus_add_devices(bus);
 
 	return max;
Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1503,7 +1503,7 @@ again:
 
 	/* any device complain? */
 	if (list_empty(&fail_head))
-		goto enable_and_dump;
+		goto dump;
 
 	if (tried_times >= pci_try_num) {
 		if (enable_local == undefined)
@@ -1512,7 +1512,7 @@ again:
 			dev_info(&bus->dev, "Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off\n");
 
 		free_list(&fail_head);
-		goto enable_and_dump;
+		goto dump;
 	}
 
 	dev_printk(KERN_DEBUG, &bus->dev,
@@ -1545,10 +1545,7 @@ again:
 
 	goto again;
 
-enable_and_dump:
-	/* Depth last, update the hardware. */
-	pci_enable_bridges(bus);
-
+dump:
 	/* dump the resource on buses */
 	pci_bus_dump_resources(bus);
 }
@@ -1621,7 +1618,6 @@ enable_all:
 	if (retval)
 		dev_err(&bridge->dev, "Error reenabling bridge (%d)\n", retval);
 	pci_set_master(bridge);
-	pci_enable_bridges(parent);
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
Index: linux-2.6/drivers/pcmcia/cardbus.c
===================================================================
--- linux-2.6.orig/drivers/pcmcia/cardbus.c
+++ linux-2.6/drivers/pcmcia/cardbus.c
@@ -91,7 +91,6 @@ int __ref cb_alloc(struct pcmcia_socket
 	if (s->tune_bridge)
 		s->tune_bridge(s, bus);
 
-	pci_enable_bridges(bus);
 	pci_bus_add_devices(bus);
 
 	return 0;
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -1043,7 +1043,6 @@ int __must_check pci_bus_alloc_resource(
 						  resource_size_t,
 						  resource_size_t),
 			void *alignf_data);
-void pci_enable_bridges(struct pci_bus *bus);
 
 /* Proper probing supporting hot-pluggable devices */
 int __must_check __pci_register_driver(struct pci_driver *, struct module *,
Index: linux-2.6/drivers/acpi/pci_root.c
===================================================================
--- linux-2.6.orig/drivers/acpi/pci_root.c
+++ linux-2.6/drivers/acpi/pci_root.c
@@ -537,10 +537,6 @@ static int acpi_pci_root_add(struct acpi
 		pci_assign_unassigned_bus_resources(root->bus);
 	}
 
-	/* need to after hot-added ioapic is registered */
-	if (system_state != SYSTEM_BOOTING)
-		pci_enable_bridges(root->bus);
-
 	pci_bus_add_devices(root->bus);
 	return 1;
 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 7/7] PCI: Retry assign unassigned resources for hotadd root bus
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
                                       ` (4 preceding siblings ...)
  2013-06-01  6:03                     ` [PATCH v5 6/7] PCI: Enable pci bridge when it is needed Yinghai Lu
@ 2013-06-01  6:03                     ` Yinghai Lu
  2013-06-01  6:03                     ` [PATCH v5 1/7] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional Yinghai Lu
  2013-06-22  3:00                     ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

Let the root bus hotadd path use the same code as the boot path.
As no driver is loaded yet, we can retry to make sure
all pci devices get resources allocated.
We need this because during hotadd, firmware may assign some BARs before
handing over.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/acpi/pci_root.c |    2 +-
 drivers/pci/setup-bus.c |   15 +++++++--------
 include/linux/pci.h     |    1 +
 3 files changed, 9 insertions(+), 9 deletions(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1365,7 +1365,7 @@ static void pci_bus_dump_resources(struc
 	}
 }
 
-static int __init pci_bus_get_depth(struct pci_bus *bus)
+static int pci_bus_get_depth(struct pci_bus *bus)
 {
 	int depth = 0;
 	struct pci_dev *dev;
@@ -1399,7 +1399,7 @@ enum enable_type {
 	auto_enabled,
 };
 
-static enum enable_type pci_realloc_enable __initdata = undefined;
+static enum enable_type pci_realloc_enable = undefined;
 void __init pci_realloc_get_opt(char *str)
 {
 	if (!strncmp(str, "off", 3))
@@ -1407,13 +1407,13 @@ void __init pci_realloc_get_opt(char *st
 	else if (!strncmp(str, "on", 2))
 		pci_realloc_enable = user_enabled;
 }
-static bool __init pci_realloc_enabled(enum enable_type enable)
+static bool pci_realloc_enabled(enum enable_type enable)
 {
 	return enable >= user_enabled;
 }
 
 #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
-static int __init check_unassigned_resources(struct pci_dev *dev, void *data)
+static int check_unassigned_resources(struct pci_dev *dev, void *data)
 {
 	int i;
 	int *unassigned = data;
@@ -1431,7 +1431,7 @@ static int __init check_unassigned_resou
 	return 0;
 }
 
-static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+static enum enable_type pci_realloc_detect(struct pci_bus *bus,
 			 enum enable_type enable_local)
 {
 	int unassigned = 0;
@@ -1446,7 +1446,7 @@ static enum enable_type __init pci_reall
 	return enable_local;
 }
 #else
-static enum enable_type __init pci_realloc_detect(struct pci_bus *bus,
+static enum enable_type pci_realloc_detect(struct pci_bus *bus,
 			 enum enable_type enable_local)
 {
 	return enable_local;
@@ -1458,8 +1458,7 @@ static enum enable_type __init pci_reall
  * second  and later try will clear small leaf bridge res
  * will stop till to the max  deepth if can not find good one
  */
-static void __init
-pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
+void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
 	LIST_HEAD(realloc_head); /* list of resources that
 					want additional resources */
Index: linux-2.6/drivers/acpi/pci_root.c
===================================================================
--- linux-2.6.orig/drivers/acpi/pci_root.c
+++ linux-2.6/drivers/acpi/pci_root.c
@@ -534,7 +534,7 @@ static int acpi_pci_root_add(struct acpi
 
 	if (system_state != SYSTEM_BOOTING) {
 		pcibios_resource_survey_bus(root->bus);
-		pci_assign_unassigned_bus_resources(root->bus);
+		pci_assign_unassigned_root_bus_resources(root->bus);
 	}
 
 	pci_bus_add_devices(root->bus);
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -1003,6 +1003,7 @@ int pci_claim_resource(struct pci_dev *,
 void pci_assign_unassigned_resources(void);
 void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus);
+void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus);
 void pdev_enable_device(struct pci_dev *);
 int pci_enable_resources(struct pci_dev *, int mask);
 void pci_fixup_irqs(u8 (*)(struct pci_dev *, u8 *),

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v5 1/7] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
                                       ` (5 preceding siblings ...)
  2013-06-01  6:03                     ` [PATCH v5 7/7] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
@ 2013-06-01  6:03                     ` Yinghai Lu
  2013-06-22  3:00                     ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-01  6:03 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, linux-kernel, Yinghai Lu

BenH reported a problem with assigning unassigned resources on
powerpc.

It turns out that after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date:   Thu Feb 23 19:23:29 2012 -0800
|
|    PCI: Retry on IORESOURCE_IO type allocations

even when the root bus does not have an io port range, the kernel keeps
retrying to reallocate mmio.

The current retry logic is: try with must+optional at first, and if
that fails for any ioport or mmio resource, try must-only and then try
to extend must with optional. The reassignment puts the extended mmio
or pref mmio in the middle of the parent resource range.
That prevents other sibling resources from getting their must-have
resources or from being extended properly.

We can check the type of the failed resources to see whether we need to
fall back to must-have only; resources that do not need to be released
then keep their must+optional assignment.

Check the three resource types separately to decide whether an assigned
resource needs to be released after the requested + add_size try:
1. if an io port assignment failed, release the assigned io port
   resources.
2. if a pref mmio assignment failed, release the assigned pref mmio.
   if an assigned pref mmio's parent is non-pref mmio and a non-pref
   mmio assignment failed, release that assigned pref mmio as well.
3. if a non-pref mmio or a pref mmio assignment failed, release the
   assigned non-pref mmio.
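
The three rules above can be sketched in user space as follows. This is only
an illustration of the decision table, not the kernel code: the sk_ names,
the simplified struct and the flag macros are assumptions made for the
example (the kernel defines the real flags in <linux/ioport.h>).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for IORESOURCE_IO / _MEM / _PREFETCH. */
#define SK_IO		0x0100UL
#define SK_MEM		0x0200UL
#define SK_PREFETCH	0x2000UL

/* Hypothetical, simplified resource: type flags plus the parent window. */
struct sk_res {
	unsigned long flags;
	const struct sk_res *parent;
};

/* OR together the type flags of every resource that failed to assign. */
static unsigned long sk_fail_mask(const unsigned long *failed, size_t n)
{
	unsigned long mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= failed[i];

	return mask & (SK_IO | SK_MEM | SK_PREFETCH);
}

/* Decide whether an already assigned resource has to be released. */
static bool sk_need_release(unsigned long mask, const struct sk_res *res)
{
	/* like the patch, only call this for resources that have a parent */
	assert(res->parent != NULL);

	/* rule 1: io port only depends on io port failures */
	if (res->flags & SK_IO)
		return (mask & SK_IO) != 0;

	/* rule 2: pref mmio, checked before plain mmio */
	if (res->flags & SK_PREFETCH) {
		if (mask & SK_PREFETCH)
			return true;
		/* pref placed in a non-pref window competes with mmio */
		return (mask & SK_MEM) &&
		       !(res->parent->flags & SK_PREFETCH);
	}

	/* rule 3: a failed pref resource also carries the MEM bit */
	if (res->flags & SK_MEM)
		return (mask & SK_MEM) != 0;

	return false;
}
```

With only an io port failure in the mask, sk_need_release() returns false for
every mmio resource, which is the case the patch is meant to fix.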

This will become more common on x86 systems with 8 or 32 sockets,
which have one root bus per socket.
Some of their root buses do not have an ioport range or 32-bit mmio.

-v2: need to remove assigned entries from optional list too.
-v3: not just checking ioport related, requested by Bjorn.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |   70 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -300,6 +300,48 @@ static void assign_requested_resources_s
 	}
 }
 
+static unsigned long pci_fail_res_type_mask(struct list_head *fail_head)
+{
+	struct pci_dev_resource *fail_res;
+	unsigned long mask = 0;
+
+	/* check failed type */
+	list_for_each_entry(fail_res, fail_head, list)
+		mask |= fail_res->flags;
+
+	/*
+	 * A failed pref resource also sets IORESOURCE_MEM, because
+	 * pref can be allocated from a non-pref range.
+	 * All assigned non-pref sibling resources will be released
+	 * according to that bit.
+	 */
+	return mask & (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH);
+}
+
+static bool pci_need_to_release(unsigned long mask, struct resource *res)
+{
+	if (res->flags & IORESOURCE_IO)
+		return !!(mask & IORESOURCE_IO);
+
+	/* check pref at first */
+	if (res->flags & IORESOURCE_PREFETCH) {
+		if (mask & IORESOURCE_PREFETCH)
+			return true;
+		/* count pref if its parent is non-pref */
+		else if ((mask & IORESOURCE_MEM) &&
+			 !(res->parent->flags & IORESOURCE_PREFETCH))
+			return true;
+		else
+			return false;
+	}
+
+	if (res->flags & IORESOURCE_MEM)
+		return !!(mask & IORESOURCE_MEM);
+
+	/* should not get here */
+	return false;
+}
+
 static void __assign_resources_sorted(struct list_head *head,
 				 struct list_head *realloc_head,
 				 struct list_head *fail_head)
@@ -312,11 +354,24 @@ static void __assign_resources_sorted(st
 	 *  if could do that, could get out early.
 	 *  if could not do that, we still try to assign requested at first,
 	 *    then try to reassign add_size for some resources.
+	 *
+	 * After the requested + add_size try fails, decide per resource
+	 *  type whether an assigned resource must be released:
+	 *	1. if an io port assignment failed, release assigned
+	 *	   io port resources.
+	 *	2. if a pref mmio assignment failed, release assigned
+	 *	   pref mmio.
+	 *	   if an assigned pref mmio's parent is non-pref mmio and
+	 *	   a non-pref mmio assignment failed, release that assigned
+	 *	   pref mmio as well.
+	 *	3. if a non-pref mmio or a pref mmio assignment failed,
+	 *	   release assigned non-pref mmio.
 	 */
 	LIST_HEAD(save_head);
 	LIST_HEAD(local_fail_head);
 	struct pci_dev_resource *save_res;
-	struct pci_dev_resource *dev_res;
+	struct pci_dev_resource *dev_res, *tmp_res;
+	unsigned long fail_type;
 
 	/* Check if optional add_size is there */
 	if (!realloc_head || list_empty(realloc_head))
@@ -348,6 +403,19 @@ static void __assign_resources_sorted(st
 		return;
 	}
 
+	/* check failed type */
+	fail_type = pci_fail_res_type_mask(&local_fail_head);
+	/* drop assigned res that need not be released from head list etc */
+	list_for_each_entry_safe(dev_res, tmp_res, head, list)
+		if (dev_res->res->parent &&
+		    !pci_need_to_release(fail_type, dev_res->res)) {
+			/* remove it from realloc_head list */
+			remove_from_list(realloc_head, dev_res->res);
+			remove_from_list(&save_head, dev_res->res);
+			list_del(&dev_res->list);
+			kfree(dev_res);
+		}
+
 	free_list(&local_fail_head);
 	/* Release assigned resource */
 	list_for_each_entry(dev_res, head, list)

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis
  2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
                                       ` (6 preceding siblings ...)
  2013-06-01  6:03                     ` [PATCH v5 1/7] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional Yinghai Lu
@ 2013-06-22  3:00                     ` Yinghai Lu
  7 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-22  3:00 UTC (permalink / raw)
  To: Bjorn Helgaas, Benjamin Herrenschmidt, Gavin Shan
  Cc: linux-pci, Linux Kernel Mailing List, Yinghai Lu

On Fri, May 31, 2013 at 11:03 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> BenH reported that there is some assign unassigned resource problem
> in powerpc.
>
> It turns out after
> | commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
> | Date:   Thu Feb 23 19:23:29 2012 -0800
> |
> |    PCI: Retry on IORESOURCE_IO type allocations
>
> even the root bus does not have io port range, it will keep retrying
> to realloc with mmio.
>
> After checking the code, found that we bound io port and mmio fail
> path together.
> First patch fix the problem, that will not make mmio fall back to must-only
> when only have io port fail with must+optional.
>
> During we found the fix for that problem, found that we can separate assign
> unassigned resources to per root bus.
> that will make the code simple, also could reuse it for hotadd path.
>
> These patches are targeted to 3.11
>
> -v4: split first patch into 4 patches per Bjorn.
> -v5: drop two patches that will pass root bus resource mask after we found
>      simple and less intrusive way to fix the problem.
>
>  PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional
>  PCI: Don't use temp bus for pci_bus_release_bridge_resources
>  PCI: Use pci_walk_bus to detect unassigned resources
>  PCI: Introduce enable_local to prepare per root bus handling
>  PCI: Split pci_assign_unassigned_resources to per root bus
>  PCI: Enable pci bridge when it is needed
>  PCI: Retry assign unassigned resources for hotadd root bus

Hi, Bjorn,

Can you put this patchset in pci/next for 3.11?

I found another pciehp case that needs this patchset too: the pcie bridge
does not have an io port range, and that causes mmio to be cleared and retried.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources
  2013-06-01  6:03                     ` [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources Yinghai Lu
@ 2013-06-25 21:15                       ` Bjorn Helgaas
  2013-06-25 21:38                         ` Benjamin Herrenschmidt
  2013-06-26  7:38                         ` Yinghai Lu
  0 siblings, 2 replies; 76+ messages in thread
From: Bjorn Helgaas @ 2013-06-25 21:15 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci, linux-kernel

On Fri, May 31, 2013 at 11:03:08PM -0700, Yinghai Lu wrote:
> Per Bjorn, use pci_walk_bus instead of for_each_pci_dev or
> calling pci_realloc_detect() recursively, that will make code more readable.
> 
> Per Bjorn, separate it from big patch that handing assign_unssigned per root bus.
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 
> ---
>  drivers/pci/setup-bus.c |   46 +++++++++++++++++++++++++++++++---------------
>  1 file changed, 31 insertions(+), 15 deletions(-)
> 
> Index: linux-2.6/drivers/pci/setup-bus.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/setup-bus.c
> +++ linux-2.6/drivers/pci/setup-bus.c
> @@ -1427,30 +1427,46 @@ static bool __init pci_realloc_enabled(v
>  	return pci_realloc_enable >= user_enabled;
>  }
>  
> -static void __init pci_realloc_detect(void)
> -{
>  #if defined(CONFIG_PCI_IOV) && defined(CONFIG_PCI_REALLOC_ENABLE_AUTO)
> -	struct pci_dev *dev = NULL;
> +static int __init check_unassigned_resources(struct pci_dev *dev, void *data)

I'm not going to add a function named "check_*()" because the name gives no
clue about what the return value means.  If it's a boolean function, the
name should be something like a question that has a yes/no answer.

> +{
> +	int i;
> +	int *unassigned = data;
>  
> -	if (pci_realloc_enable != undefined)
> -		return;
> +	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
> +		struct resource *r = &dev->resource[i];
>  
> -	for_each_pci_dev(dev) {
> -		int i;
> +		/* Not assigned, or rejected by kernel ? */
> +		if (r->flags && !r->start) {
> +			(*unassigned)++;
> +			return 1; /* return early from pci_walk_bus */
> +		}
> +	}
>  
> -		for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
> -			struct resource *r = &dev->resource[i];
> +	return 0;
> +}
>  
> -			/* Not assigned, or rejected by kernel ? */
> -			if (r->flags && !r->start) {
> -				pci_realloc_enable = auto_enabled;
> +static void  __init pci_realloc_detect(void)
> +{
> +	int unassigned = 0;
> +	struct pci_bus *bus;
>  
> -				return;
> -			}
> +	if (pci_realloc_enable != undefined)
> +		return;
> +
> +	list_for_each_entry(bus, &pci_root_buses, node) {
> +		pci_walk_bus(bus, check_unassigned_resources, &unassigned);
> +		if (unassigned) {
> +			pci_realloc_enable = auto_enabled;
> +			return;
>  		}
>  	}
> -#endif
>  }
> +#else
> +static void __init pci_realloc_detect(void)
> +{
> +}
> +#endif
>  
>  /*
>   * first try will not touch pci bridge res

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources
  2013-06-25 21:15                       ` Bjorn Helgaas
@ 2013-06-25 21:38                         ` Benjamin Herrenschmidt
  2013-06-25 21:46                           ` Bjorn Helgaas
  2013-06-26  7:38                         ` Yinghai Lu
  1 sibling, 1 reply; 76+ messages in thread
From: Benjamin Herrenschmidt @ 2013-06-25 21:38 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Yinghai Lu, Gavin Shan, linux-pci, linux-kernel

On Tue, 2013-06-25 at 15:15 -0600, Bjorn Helgaas wrote:
> -     for_each_pci_dev(dev) {
> > -             int i;
> > +             /* Not assigned, or rejected by kernel ? */
> > +             if (r->flags && !r->start) {
> > +                     (*unassigned)++;
> > +                     return 1; /* return early from pci_walk_bus */
> > +             }
> > +     }

BTW. I'm aware you didn't change that logic but ... it's somewhat broken
in the case where the aperture has an offset. You should compare
r->start with the offset, not with 0.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources
  2013-06-25 21:38                         ` Benjamin Herrenschmidt
@ 2013-06-25 21:46                           ` Bjorn Helgaas
  2013-06-26  8:07                             ` Yinghai Lu
  0 siblings, 1 reply; 76+ messages in thread
From: Bjorn Helgaas @ 2013-06-25 21:46 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Yinghai Lu, Gavin Shan, linux-pci, linux-kernel

On Tue, Jun 25, 2013 at 3:38 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2013-06-25 at 15:15 -0600, Bjorn Helgaas wrote:
>> -     for_each_pci_dev(dev) {
>> > -             int i;
>> > +             /* Not assigned, or rejected by kernel ? */
>> > +             if (r->flags && !r->start) {
>> > +                     (*unassigned)++;
>> > +                     return 1; /* return early from pci_walk_bus */
>> > +             }
>> > +     }
>
> BTW. I'm aware you didn't change that logic but ... it's somewhat broken
> in the case where the aperture has an offset. You should compare
> r->start with the offset, not with 0.

Yes, please fix that in a separate patch that contains only the bugfix.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources
  2013-06-25 21:15                       ` Bjorn Helgaas
  2013-06-25 21:38                         ` Benjamin Herrenschmidt
@ 2013-06-26  7:38                         ` Yinghai Lu
  1 sibling, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-26  7:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci, Linux Kernel Mailing List

On Tue, Jun 25, 2013 at 2:15 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> +static int __init check_unassigned_resources(struct pci_dev *dev, void *data)
>
> I'm not going to add a function named "check_*()" because the name gives no
> clue about what the return value means.  If it's a boolean function, the
> name should be something like a question that has a yes/no answer.

The int return type in that prototype is required by pci_walk_bus():

drivers/pci/bus.c:void pci_walk_bus(struct pci_bus *top, int
(*cb)(struct pci_dev *, void *)

Returning 1 makes pci_walk_bus() return early.
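
A minimal user-space sketch of that callback contract, assuming a flat array
stands in for the bus hierarchy; sk_walk() and the sk_dev struct are
hypothetical helpers for illustration, not the kernel's pci_walk_bus():

```c
#include <stddef.h>

/* Hypothetical stand-in for one device resource. */
struct sk_dev {
	unsigned long flags;	/* non-zero: the resource exists */
	unsigned long start;	/* zero: nothing assigned yet */
};

typedef int (*sk_walk_cb)(struct sk_dev *dev, void *data);

/* Call cb for each device; stop as soon as cb returns non-zero. */
static void sk_walk(struct sk_dev *devs, size_t n, sk_walk_cb cb, void *data)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cb(&devs[i], data))
			break;
}

/* Same shape as the callback in the patch: count, then bail out. */
static int sk_check_unassigned(struct sk_dev *dev, void *data)
{
	int *unassigned = data;

	if (dev->flags && !dev->start) {
		(*unassigned)++;
		return 1;	/* stop the walk early */
	}

	return 0;
}
```

Because the walk stops at the first hit, the counter ends up as 0 or 1 and
works as a found flag rather than a real count.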

count_unassigned_resources() is not a good name either, as we bail out early.
find_unassigned_resources() is even more odd; it looks like it wants to return a resource.

Yinghai

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources
  2013-06-25 21:46                           ` Bjorn Helgaas
@ 2013-06-26  8:07                             ` Yinghai Lu
  0 siblings, 0 replies; 76+ messages in thread
From: Yinghai Lu @ 2013-06-26  8:07 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Benjamin Herrenschmidt, Gavin Shan, linux-pci, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2240 bytes --]

On Tue, Jun 25, 2013 at 2:46 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Tue, Jun 25, 2013 at 3:38 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>> On Tue, 2013-06-25 at 15:15 -0600, Bjorn Helgaas wrote:
>>> -     for_each_pci_dev(dev) {
>>> > -             int i;
>>> > +             /* Not assigned, or rejected by kernel ? */
>>> > +             if (r->flags && !r->start) {
>>> > +                     (*unassigned)++;
>>> > +                     return 1; /* return early from pci_walk_bus */
>>> > +             }
>>> > +     }
>>
>> BTW. I'm aware you didn't change that logic but ... it's somewhat broken
>> in the case where the aperture has an offset. You should compare
>> r->start with the offset, not with 0.
>
> Yes, please fix that in a separate patch that contains only the bugfix.

Please check the inline (word-wrapped) patch.

Subject: [PATCH] PCI: check pci bus address for unassigned res

We should compare res->start with the root bus window offset.
Otherwise we will have problems on arches that support a host bridge
resource offset.

BenH pointed that out while reviewing the patchset that separates
assign unassigned resources to per root bus.

Per Bjorn, put the fix in a separate patch.

Use pcibios_resource_to_bus() to get the bus region first, and check
region.start instead.
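
To illustrate why the CPU-side check can miss an unassigned resource, here is
a small user-space sketch. It assumes an unassigned BAR sits at bus address 0,
so its CPU address equals the window offset; the sk_ helpers are hypothetical
and only model the subtraction, they are not the kernel API. The offset value
is taken from the root bus window in BenH's original log (CPU 0x3d00080000000
maps to bus 0x80000000).

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset between CPU and bus addresses for the window in BenH's log. */
#define SK_WINDOW_OFFSET	0x3d00000000000ULL

/* Hypothetical, simplified resource holding a CPU address. */
struct sk_resource {
	uint64_t start;
	unsigned long flags;
};

/* Model of the CPU-to-bus conversion: subtract the window offset. */
static uint64_t sk_resource_to_bus(const struct sk_resource *r,
				   uint64_t offset)
{
	return r->start - offset;
}

/* Old check: looks at the CPU address. */
static bool sk_unassigned_cpu_check(const struct sk_resource *r)
{
	return r->flags && !r->start;
}

/* New check: looks at the bus address. */
static bool sk_unassigned_bus_check(const struct sk_resource *r,
				    uint64_t offset)
{
	if (!r->flags)
		return false;

	return sk_resource_to_bus(r, offset) == 0;
}
```

For a resource whose start equals SK_WINDOW_OFFSET, the old check reports
"assigned" while the bus-address check reports "unassigned".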

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1420,9 +1420,14 @@ static int check_unassigned_resources(st

     for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
         struct resource *r = &dev->resource[i];
+        struct pci_bus_region region;

         /* Not assigned, or rejected by kernel ? */
-        if (r->flags && !r->start) {
+        if (!r->flags)
+            continue;
+
+        pcibios_resource_to_bus(dev, &region, r);
+        if (!region.start) {
             (*unassigned)++;
             return 1; /* return early from pci_walk_bus */
         }

[-- Attachment #2: root_bus_ioport_skip_2_a.patch --]
[-- Type: application/octet-stream, Size: 1300 bytes --]

Subject: [PATCH] PCI: check pci bus address for unassigned res

We should compare res->start with the root bus window offset.
Otherwise we will have problems on arches that support a host bridge
resource offset.

BenH pointed that out while reviewing the patchset that separates
assign unassigned resources to per root bus.

Per Bjorn, put the fix in a separate patch.

Use pcibios_resource_to_bus() to get the bus region first, and check
region.start instead.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/setup-bus.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1420,9 +1420,14 @@ static int check_unassigned_resources(st
 
 	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) {
 		struct resource *r = &dev->resource[i];
+		struct pci_bus_region region;
 
 		/* Not assigned, or rejected by kernel ? */
-		if (r->flags && !r->start) {
+		if (!r->flags)
+			continue;
+
+		pcibios_resource_to_bus(dev, &region, r);
+		if (!region.start) {
 			(*unassigned)++;
 			return 1; /* return early from pci_walk_bus */
 		}

^ permalink raw reply	[flat|nested] 76+ messages in thread

end of thread, other threads:[~2013-06-26  8:08 UTC | newest]

Thread overview: 76+ messages
2013-05-05  0:10 Resource assignment oddities Benjamin Herrenschmidt
2013-05-05  0:15 ` Benjamin Herrenschmidt
2013-05-05  5:18   ` Yinghai Lu
2013-05-05  5:34     ` Benjamin Herrenschmidt
2013-05-05  7:09       ` Yinghai Lu
2013-05-05  7:52         ` Benjamin Herrenschmidt
     [not found]           ` <51871088.4594420a.0ccc.7300SMTPIN_ADDED_BROKEN@mx.google.com>
2013-05-06  3:04             ` Yinghai Lu
     [not found]               ` <20130506103159.GA16927@shangw.(null)>
2013-05-06 10:48                 ` Benjamin Herrenschmidt
2013-05-06 19:56                   ` Yinghai Lu
     [not found]                     ` <5188b791.a110420a.0bea.077eSMTPIN_ADDED_BROKEN@mx.google.com>
2013-05-07 22:21                       ` Yinghai Lu
     [not found]                         ` <5189c22b.45f7440a.0a88.6b75SMTPIN_ADDED_BROKEN@mx.google.com>
2013-05-08  3:42                           ` Yinghai Lu
2013-05-17  5:36                         ` Benjamin Herrenschmidt
2013-05-21 17:28                           ` Bjorn Helgaas
2013-05-21 17:39                             ` Yinghai Lu
2013-05-21 22:01                             ` Benjamin Herrenschmidt
2013-05-06 23:15                   ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
2013-05-06 23:15                     ` [PATCH 2/2] PCI: Skip IORESOURCE_IO size and allocation for root bus without ioport range Yinghai Lu
2013-05-07  0:50                     ` [PATCH 1/2] PCI: Split pci_assign_unassigned_resources to per root bus Benjamin Herrenschmidt
     [not found]                       ` <51885a04.c181440a.37c9.ffffa88eSMTPIN_ADDED_BROKEN@mx.google.com>
2013-05-07  7:34                         ` Yinghai Lu
2013-05-21 20:41                     ` Bjorn Helgaas
2013-05-07 22:17                   ` [PATCH v3 0/5] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
2013-05-07 22:17                     ` [PATCH v3 1/5] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
2013-05-07 22:17                     ` [PATCH v3 2/5] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range Yinghai Lu
2013-05-07 22:17                     ` [PATCH v3 3/5] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range Yinghai Lu
2013-05-07 22:28                       ` Benjamin Herrenschmidt
2013-05-07 22:44                         ` Yinghai Lu
2013-05-08  1:16                           ` Benjamin Herrenschmidt
2013-05-08  3:57                             ` Yinghai Lu
2013-05-07 22:17                     ` [PATCH v3 4/5] PCI: Enable pci bridge when it is needed Yinghai Lu
2013-05-07 22:17                     ` [PATCH v3 5/5] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
2013-05-22  6:38                   ` [PATCH v4 0/8] PCI: Skip resource allocation for root bus without conresponding type resource Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 1/8] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 2/8] PCI: Use pci_walk_bus to detect unassigned resources Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 3/8] PCI: Introduce enable_local to prepare per root bus handling Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 4/8] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 5/8] PCI: Skip IORESOURCE_IO allocation for root bus without ioport range Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 6/8] PCI: Skip IORESOURCE_MMIO allocation for root bus without MMIO range Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 7/8] PCI: Enable pci bridge when it is needed Yinghai Lu
2013-05-22  6:38                     ` [PATCH v4 8/8] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
2013-06-01  6:03                   ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
2013-06-01  6:03                     ` [PATCH v5 2/7] PCI: Don't use temp bus for pci_bus_release_bridge_resources Yinghai Lu
2013-06-01  6:03                     ` [PATCH v5 3/7] PCI: Use pci_walk_bus to detect unassigned resources Yinghai Lu
2013-06-25 21:15                       ` Bjorn Helgaas
2013-06-25 21:38                         ` Benjamin Herrenschmidt
2013-06-25 21:46                           ` Bjorn Helgaas
2013-06-26  8:07                             ` Yinghai Lu
2013-06-26  7:38                         ` Yinghai Lu
2013-06-01  6:03                     ` [PATCH v5 4/7] PCI: Introduce enable_local to prepare per root bus handling Yinghai Lu
2013-06-01  6:03                     ` [PATCH v5 5/7] PCI: Split pci_assign_unassigned_resources to per root bus Yinghai Lu
2013-06-01  6:03                     ` [PATCH v5 6/7] PCI: Enable pci bridge when it is needed Yinghai Lu
2013-06-01  6:03                     ` [PATCH v5 7/7] PCI: Retry assign unassigned resources for hotadd root bus Yinghai Lu
2013-06-01  6:03                     ` [PATCH v5 1/7] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional Yinghai Lu
2013-06-22  3:00                     ` [PATCH v5 0/7] PCI: Change assign unassigned resources per root bus bassis Yinghai Lu
     [not found]               ` <518786a7.64bbec0a.58a0.1f6bSMTPIN_ADDED_BROKEN@mx.google.com>
2013-05-22 14:54                 ` Resource assignment oddities Bjorn Helgaas
2013-05-22 16:59                   ` Yinghai Lu
2013-05-22 17:21                     ` Bjorn Helgaas
2013-05-22 20:44                       ` Benjamin Herrenschmidt
2013-05-22 21:01                         ` Yinghai Lu
2013-05-22 20:43                     ` Benjamin Herrenschmidt
2013-05-22 21:00                       ` Yinghai Lu
2013-05-22 21:13                         ` Benjamin Herrenschmidt
2013-05-22 20:50                     ` Yinghai Lu
     [not found]                       ` <519dcfbe.89e9420a.4934.488bSMTPIN_ADDED_BROKEN@mx.google.com>
2013-05-23 17:08                         ` Yinghai Lu
2013-05-23 17:12                           ` Bjorn Helgaas
2013-05-23 17:17                             ` Yinghai Lu
2013-05-23 19:47                               ` Bjorn Helgaas
2013-05-23 21:00                                 ` Yinghai Lu
2013-05-23 21:23                                   ` Benjamin Herrenschmidt
2013-05-23 22:16                                     ` Yinghai Lu
2013-05-24 15:59                                       ` Bjorn Helgaas
2013-05-24 16:33                                         ` Benjamin Herrenschmidt
2013-05-24 16:34                                         ` Yinghai Lu
2013-05-23 17:11                         ` [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional Yinghai Lu
2013-05-24 17:25                           ` Bjorn Helgaas
2013-05-24 23:31                             ` [PATCH v3] " Yinghai Lu
2013-05-24 23:34                             ` [PATCH] " Yinghai Lu
