On Fri, May 14, 2021 at 12:03:10PM +1000, Oliver O'Halloran wrote: > On Thu, May 13, 2021 at 2:22 PM David Gibson > wrote: > > > > On Wed, May 05, 2021 at 08:18:27PM +0530, Mahesh Salgaonkar wrote: > > > With upstream kernel, especially after commit 98ba956f6a389 > > > ("powerpc/pseries/eeh: Rework device EEH PE determination") we see that KVM > > > guest isn't able to enable EEH option for PCI pass-through devices anymore. > > > > > > [root@atest-guest ~]# dmesg | grep EEH > > > [ 0.032337] EEH: pSeries platform initialized > > > [ 0.298207] EEH: No capable adapters found: recovery disabled. > > > [root@atest-guest ~]# > > > > > > So far the linux kernel was assuming pe_config_addr equal to device's > > > config_addr and using it to enable EEH on the PE through ibm,set-eeh-option > > > RTAS call. Which wasn't the correct way as per PAPR. The linux kernel > > > commit 98ba956f6a389 fixed this flow. With that fixed, linux now uses PE > > > config address returned by ibm,get-config-addr-info2 RTAS call to enable > > > EEH option per-PE basis instead of per-device basis. However this has > > > uncovered a bug in qemu where ibm,set-eeh-option is treating PE config > > > address as per-device config address. > > > > Huh. To be fair, the stuff about this in PAPR is nearly > > incomprehensible, so we probably used what the kernel was doing as a > > guide instead. > > I found the PAPR documentation made some sense after I learned how EEH > was handled on PCI(-X) systems. What's in Linux never made sense, > unfortunately. Indeed. > > Hmm.. shouldn't we at least check that the supplied config_addr > > matches the one it should be for this PHB, rather than just ignoring > > it? > > I think that'd cause issues with older kernels. Oh, good point. > Prior to the rework > mentioned by Mahesh (linux commit 98ba956f6a389 ("powerpc/pseries/eeh: > Rework device EEH PE determination")) the kernel would call > eeh-set-option for each device in the PE using the device's > config_address as the argument rather than the PE address. If we > return an error from eeh-set-option when the argument isn't a valid PE > address then older kernels will interpret that as EEH not being > supported. That really needs to be called out in a comment though. > Preferably with kernel version numbers, etc. Agreed. > > ..and, looking back at rtas_ibm_get_config_addr_info2(), I think > > that looks wrong in the case of PCI bridges. AFAICT it gives an > > address that depends on the bus, but in other places we assume that > > the entire PHB is a single PE on the guest side, so it really > > shouldn't. > > Yep, get_config_addr_info2 should map every device inside that PE to > the same PE address, even when they're on child busses. Right. > That said, I'm > not sure how well EEH works when there's a mix of real (vfio) and > emulated (qemu bridges) devices in the same PHB. I think it'll kind of work, as long as there's only real devices from a single host PE on there. The emulated devices will basically just ignore EEH, but I think they should still apply ok to the passthrough devices. > Can VFIO pass through > a bridge? I don't think so. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson