On 02/07/2017 13:29, Bjorn Helgaas wrote: > Hi Joshua, > > First of all, apologies for breaking IP27 and the inconvenience this > caused you. And thanks a lot for tracking down the problem and > proposing a solution! Eh, don't worry, IP27's been broken and unbroken over the years. These systems are becoming difficult to find intact, so these breaks tend to go unnoticed. > On Tue, Feb 7, 2017 at 12:13 AM, Joshua Kinard wrote: >> From: Joshua Kinard >> >> Commit 046136170a56 changed things such that setting PCI_PROBE_ONLY >> will now explicitly claim PCI resources instead of skipping the sizing >> of the bridges and assigning resources. This is okay for IP30's PCI >> code, which doesn't use physical address space to access I/O resources. >> >> However, IP27 is completely different in this regard. Instead of using >> ioremapped addresses for I/O, IP27 has a dedicated address range, >> 0x92xxxxxxxxxxxxxx, that is used for all I/O access. Since this is >> uncached physical address space, the generic MIPS PCI code will not >> probe it correctly and thus, the original behavior of PCI_PROBE_ONLY >> needs to be restored only for the IP27 platform to bypass this logic >> and have working PCI, at least for the IO6/IO6G board that houses the >> base devices, until a better solution is found. > > It sounds like there's something different about how ioremap() works > on these platforms and PCI probing is tripping over that. I'd really > like to understand more about this difference to see if we can > converge that instead of adding back the PCI_PROBE_ONLY usage. I'd need to go and dig around in the IP27 headers again for this machine to see what ioremap() is actually doing, but I *think* it returns uncached physical addresses in most instances because of a special feature of the CPU, the R10000-family, which makes uncached access very fast. I think only the IP27 platform uses this capability. Other R1x0-based systems don't (although, I might be wrong about IP27's successor in IP35). > Drivers shouldn't know whether they're running on IP27 or IP30, and > they should be using ioremap() in both cases. Does ioremap() work > differently on IP27 and IP30? Does this have something to do with > plat_ioremap() or fixup_bigphys_addr()? I believe most drivers won't know about this change. The patch I proposed modifies two files very specific to MIPS and SGI systems, one being the PCI module for "legacy" MIPS systems, and the newly-proposed generic BRIDGE driver (created by an earlier patch in this patch series). The BRIDGE ASIC (and its cousin, the XBRIDGE) are only used on SGI systems and they provide a translation layer from Crosstalk address space (Xtalk, the system interconnect bus) to PCI devices by translating PCI ops into something understood by the rest of the system. BRIDGE is the only thing that needs to get setup correctly, and then, I think, other drivers should be unaware of the magic going on behind the scenes. That said, it's still a hack that I proposed. IP27 has long had issues with PCI because it uses addresses differently. I'm no expert in these systems -- just a hobby of mine I do in my free time -- but I'll refer you to these two documents, which are archive copies since SGI retired Techpubs last year: Origin™ and Onyx2™ Theory of Operations Manual: http://csweb.cs.wfu.edu/~torgerse/Kokua/SGI/007-3439-002/sgi_html/index.html Origin™ and Onyx2™ Programmer's Reference Manual http://csweb.cs.wfu.edu/~torgerse/Kokua/SGI/007-3410-001/sgi_html/ch02.html#id5437809 The whole setup is extremely modular, so each node gets a static, fixed portion of physical address space, regardless if that node is actually present or not. By setting various bits in the upper part of a memory address, you can do various things on specific nodes. See Chapter 1 in the programmer's manual for the details. The address space in question here, 0x9200000000000000, is how you tell the HUB (system controller for each node board) to access I/O space and talk to the device peripherals. Since this is physical address space, there's no need to ioremap, so I think deep in the IP27 headers, ioremap just falls through and returns one of these physical address, then you add in the needed offsets (node number, crosstalk port, and PCI address) to get to the device you're trying to reach. IP30, SGI Octane, is similar in that it's like a single-node IP27 system. It has a HEART as a system controller instead of a HUB, but also uses Crosstalk devices. Documentation for the system doesn't really exist -- work on it has been done via manual reverse engineering by people way better than I at this. I've just been maintaining the code in external patches over the years. I think the addresses on that platform *might* also be physical addresses, but with no documentation on HEART, it's unknown if HEART treats certain address spaces in special ways like HUB does. > Is there any chance you can collect complete dmesg logs and > /proc/iomem contents from IP27 and IP30? Maybe "lspci -vv" output, > too? I'm not sure where to look to understand the ioremap() behavior. Due to the size, I'll attach the dmesg and lspci outputs from both platforms as text files, and inline /proc/iomem. There are three dmesg files for IP27. One is a normal boot with enough patches to make PCI work. Another is without the PCI_PROBE_ONLY flag and my hack, so it will try to size/assign the resources, and the last is with PCI_PROBE_ONLY and my hack, so it will try to "claim" the resources only. For the last two, each results in different error paths. A note about IP30, since that's part of an external patch set, the dmesg output is a little different, as it is using the newer BRIDGE/Xtalk code I just sent in as patches. IP27's dmesg output won't be using my full patchset in this test, so it'll more closely resemble what you'd get out of the existing mainline code. Also, IP30 doesn't use the PCI_PROBE_ONLY thing anymore. This causes /proc/iomem to be far more detailed than it used to be. IP30's /proc/iomem: 00000000-00003fff : reserved 1d200000-1d9fffff : Bridge MEM d080000000-d080003fff : 0001:00:03.0 1d204000-1d204fff : 0001:00:02.0 1d205000-1d205fff : 0001:00:02.1 1d206000-1d206fff : 0001:00:02.2 1d207000-1d2070ff : 0001:00:02.3 1d207100-1d20717f : 0001:00:01.0 1f200000-1f9fffff : Bridge MEM f080100000-f0801fffff : 0000:00:02.0 f080010000-f08001ffff : 0000:00:00.0 f080030000-f08003ffff : 0000:00:01.0 f080000000-f080000fff : 0000:00:00.0 f080020000-f080020fff : 0000:00:01.0 20004000-209b8fff : reserved 20004000-206acb13 : Kernel code 206acb14-208affff : Kernel data 209b9000-20efffff : System RAM 20f00000-20ffffff : System RAM 21000000-9fffffff : System RAM f080100000-f0801fffff : ioc3 IP27's /proc/iomem (from the working boot): 0f600000-0f6fffff : ioc3 0f680000-0f687fff : rtc-m48t35 0fa00000-0fafffff : ioc3 920000000b200000-920000000b9fffff : Bridge MEM 920000000c200000-920000000c9fffff : Bridge MEM 920000000f200000-920000000f9fffff : Bridge MEM 920000000f620170-920000000f620177 : serial 920000000f620178-920000000f62017f : serial 920000000fa20170-920000000fa20177 : serial 920000000fa20178-920000000fa2017f : serial > What exactly is the PCI probe failure without this patch? If you have > a console log (with "ignore_loglevel") it might have a clue, too. > >> Fixes: 046136170a56 ("MIPS/PCI: Claim bus resources on PCI_PROBE_ONLY set-ups") >> Signed-off-by: Joshua Kinard >> Cc: Bjorn Helgaas >> Cc: Lorenzo Pieralisi >> Cc: Thomas Bogendoerfer >> --- >> arch/mips/pci/pci-bridge.c | 15 +++++++++++++++ >> arch/mips/pci/pci-legacy.c | 15 +++++++++++++++ >> 2 files changed, 30 insertions(+) >> >> diff --git a/arch/mips/pci/pci-bridge.c b/arch/mips/pci/pci-bridge.c >> index 9df13ce313b5..af7073dba36b 100644 >> --- a/arch/mips/pci/pci-bridge.c >> +++ b/arch/mips/pci/pci-bridge.c >> @@ -62,6 +62,21 @@ bridge_probe(nasid_t nasid, int widget_id, int masterwid) >> unsigned long offset = NODE_OFFSET(nasid); >> struct bridge_controller *bc; >> >> +#ifdef CONFIG_SGI_IP27 > > I don't know how MIPS multi-platform support works. If you enable > CONFIG_SGI_IP27, does that mean the resulting kernel will *only* run > on IP27? Or can you enable several platforms, e.g., SGI_IP22, > SGI_IP27, SGI_IP28, SGI_IP32, etc., and end up with a kernel that will > boot on any of those platforms? From Kconfig, it looks like these > options are not mutually exclusive, so my guess is maybe the latter? > > If so, I would think whatever we do should be based on a run-time test > for SGI_IP27 instead of a compile-time test. Kernels are platform-specific for the SGI machines. So you need a different kernel for IP22, IP27, IP30, IP32, etc. Sometimes, different CPU's in a system needs a different kernel as well (e.g., IP32 w/ R5000 versus IP32 w/ RM7000, different kernels for each). The one *slight* exception is Indy and Indigo2, where one Indigo2 model is just a larger Indy (both are IP22). There's two other Indigo2 variants out there though, the rare Indigo2 Power (IP26) and the Indigo2 Impact (IP28). These classify as distinctly-different systems, mostly, and thus need their own kernels (not that anyone's written code for IP26 yet, but I digress). Other MIPS platforms may be a bit different. I can only speak for the SGI platforms right now. You can see where all the fun is at... >> + /* >> + * Commit 046136170a56 changed things such that setting PCI_PROBE_ONLY >> + * will now explicitly claim PCI resources instead of skipping the >> + * sizing of the bridges and assigning resources. This is okay for >> + * the IP30's PCI code, which uses normal, ioremapped addresses to >> + * do I/O. IP27, however, is different and uses a hardware-specific >> + * address range of 0x92xxxxxxxxxxxxxx for all I/O access. As such, >> + * the generic MIPS PCI code will not probe correctly and thus make >> + * PCI on IP27 completely unusable. Thus, we must restore the >> + * original logic only for IP27 until a better solution can be found. >> + */ >> + pci_set_flags(PCI_PROBE_ONLY); >> +#endif >> + >> /* XXX: Temporary until the IP27 "mega update". */ >> bc = &bridges[num_bridges]; >> if (!num_bridges) >> diff --git a/arch/mips/pci/pci-legacy.c b/arch/mips/pci/pci-legacy.c >> index 68268bbb15b8..5590af4f367f 100644 >> --- a/arch/mips/pci/pci-legacy.c >> +++ b/arch/mips/pci/pci-legacy.c >> @@ -107,6 +107,20 @@ static void pcibios_scanbus(struct pci_controller *hose) >> need_domain_info = 1; >> } >> >> +#ifdef CONFIG_SGI_IP27 >> + /* >> + * Commit 046136170a56 changed things such that setting PCI_PROBE_ONLY >> + * will now explicitly claim PCI resources instead of skipping the >> + * sizing of the bridges and assigning resources. This is okay for >> + * the IP30's PCI code, which uses normal, ioremapped addresses to >> + * do I/O. IP27, however, is different and uses a hardware-specific >> + * address range of 0x92xxxxxxxxxxxxxx for all I/O access. As such, >> + * the generic MIPS PCI code will not probe correctly and thus make >> + * PCI on IP27 completely unusable. Thus, we must restore the >> + * original logic only for IP27 until a better solution can be found. >> + */ >> + if (!pci_has_flag(PCI_PROBE_ONLY)) { >> +#else >> /* >> * We insert PCI resources into the iomem_resource and >> * ioport_resource trees in either pci_bus_claim_resources() >> @@ -115,6 +129,7 @@ static void pcibios_scanbus(struct pci_controller *hose) >> if (pci_has_flag(PCI_PROBE_ONLY)) { >> pci_bus_claim_resources(bus); >> } else { >> +#endif >> pci_bus_size_bridges(bus); >> pci_bus_assign_resources(bus); >> } >> -- >> 2.11.1 >> > -- Joshua Kinard Gentoo/MIPS kumba@gentoo.org 6144R/F5C6C943 2015-04-27 177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943 "The past tempts us, the present confuses us, the future frightens us. And our lives slip away, moment by moment, lost in that vast, terrible in-between." --Emperor Turhan, Centauri Republic