linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
To: Sam Bobroff <sbobroff@linux.ibm.com>,
	Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/8]
Date: Fri, 19 Apr 2019 15:36:34 -0700	[thread overview]
Message-ID: <a9d78cf8-4d50-d0f7-7545-4469034ec1d8@linux.vnet.ibm.com> (raw)
In-Reply-To: <20190415034109.GA9045@tungsten.ozlabs.ibm.com>

On 04/14/2019 08:41 PM, Sam Bobroff wrote:
> On Thu, Apr 11, 2019 at 05:55:33PM -0700, Tyrel Datwyler wrote:
>> On 03/19/2019 07:58 PM, Sam Bobroff wrote:
>>> Hi all,
>>>
>>> This patch set adds support for EEH recovery of hot plugged devices on pSeries
>>> machines. Specifically, devices discovered by PCI rescanning using
>>> /sys/bus/pci/rescan, which includes devices hotplugged by QEMU's device_add
>>> command. (pSeries doesn't currently use slot power control for hotplugging.)
>>
>> Slight nit that its not that pSeries doesn't support slot power control
>> hotplugging, its that QEMU pSeries guests don't support it. We most definitely
>> use the slot power control for hotplugging in PowerVM pSeries Linux guests. More
> 
> Ah, I think I see what you mean: pSeries can (and does!) use slot power
> control for hotplugging, it's just that Linux doesn't. Right :-) I'll
> change it to "Linux on pSeries doesn't...." for the next version.

Not quite. A pSeries Linux PowerVM LPAR does use the slot power hotplugging. It
is only the pSeries Linux qemu guest that doesn't. The way that hotplug is
initiated is different for PowerVM and qemu. On PowerVM the command is sent from
the HMC down the RSCT stack which calls drmgr whereas with qemu hotplug
generates an interrupt which logs an event that is picked up by rtas_errd which
then calls drmgr with the "-V" flag. That flag was a special case that we added
to drmgr to work around the slot power hotplugging with rpaphp not working
correctly with qemu guests.

> 
>> specifically we had to work around short comings in the rpaphp driver when
>> dealing with QEMU. This being that while at initial glance the design implies
>> that it had multiple devices per PHB in mind, it didn't, and only actually
>> supported a single slot per PHB. Further, when we developed the QEMU pci hotplug
>> feature we had to deal with only having a single PHB per QEMU guest and as a
>> result needed a way to plug multiple pci devices into a single PHB. Hence, came
>> the pci rescan work around in drmgr.
>>
>> Mike Roth and I have had discussions over the years to get the slot power
>> control hotplugging working properly with QEMU, and while I did get the RPA
>> hotplug driver fixed to register all available slots associated with a PHB, EEH
>> remained an issue. So, I'm very happy to see this patchset get that working with
>> the rescan work around.
>>
>>>
>>> As a side effect this also provides EEH support for devices removed by
>>> /sys/bus/pci/devices/*/remove and re-discovered by writing to /sys/bus/pci/rescan,
>>> on all platforms.
>>
>> +1, this seems like icing on the cake. ;)
> 
> Yes :-)
> 
> Although maybe I should mention that we can't really benefit from this
> on PowerNV *yet* because there seem to be some other problems with
> removing and re-scanning devices: in my tests devices are often unusable
> after being rediscovered.
> 
> (I'm hoping to take a look at that soon.)

Interesting.

> 
>>>
>>> The approach I've taken is to use the fact that the existing
>>> pcibios_bus_add_device() platform hooks (which are used to set up EEH on
>>> Virtual Function devices (VFs)) are actually called for all devices, so I've
>>> widened their scope and made other adjustments necessary to allow them to work
>>> for hotplugged and boot-time devices as well.
>>>
>>> Because some of the changes are in generic PowerPC code, it's
>>> possible that I've disturbed something for another PowerPC platform. I've tried
>>> to minimize this by leaving that code alone as much as possible and so there
>>> are a few cases where eeh_add_device_{early,late}() or eeh_add_sysfs_files() is
>>> called more than once. I think these can be looked at later, as duplicate calls
>>> are not harmful.
>>>
>>> The patch "Convert PNV_PHB_FLAG_EEH" isn't strictly necessary and I'm not sure
>>> if it's better to keep it, because it simplifies the code or drop it, because
>>> we may need a separate flag per PHB later on. Thoughts anyone?
>>>
>>> The first patch is a rework of the pcibios_init reordering patch I posted
>>> earlier, which I've included here because it's necessary for this set.
>>>
>>> I have done some testing for PowerNV on Power9 using a modified pnv_php module
>>> and some testing on pSeries with slot power control using a modified rpaphp
>>> module, and the EEH-related parts seem to work.
>>
>> I'm interested in what modifications with rpaphp. Its unclear if you are saying
>> rpaphp modified so that slot power hotplug works with a QEMU pSeries guest? If
>> thats the case it would be optimal to get that upstream and remove the work
>> rescan workaround for guests that don't need it.
> 
> Unfortunately no, I didn't do enough work to really get it working.  I
> just wanted to get an idea of how that code path interacted with the EEH
> code I was changing, so that hopefully when we get to fixing it, the EEH
> part will be easier to do.
> 
> The hack I tested with was:
> 
> - rtas_errd changed so that it doesn't pass -V to drmgr (-V seems to
>   trigger drmgr to use the PCI rescan system rather that slot power
>   control).

Correct, as I mentioned above we did that on purpose to basically hack around
rpaphp not working with qemu guests.

-Tyrel

> - of_pci_parse_addrs() changed so that if the assigned-addresses node
>   is missing (which it is when the guest is running under QEMU/KVM) we
>   call pci_setup_device() to configure the BARs.
> 
> It did look pretty good -- the EEH part may actually work fine once we get
> the rest sorted out.
> 
>>
>> -Tyrel
>>
>>>
>>> Cheers,
>>> Sam.
>>>
>>> Sam Bobroff (8):
>>>   powerpc/64: Adjust order in pcibios_init()
>>>   powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
>>>   powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag
>>>   powerpc/eeh: Improve debug messages around device addition
>>>   powerpc/eeh: Add eeh_show_enabled()
>>>   powerpc/eeh: Initialize EEH address cache earlier
>>>   powerpc/eeh: EEH for pSeries hot plug
>>>   powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build()
>>>
>>>  arch/powerpc/include/asm/eeh.h               | 19 +++--
>>>  arch/powerpc/kernel/eeh.c                    | 33 ++++-----
>>>  arch/powerpc/kernel/eeh_cache.c              | 29 +-------
>>>  arch/powerpc/kernel/eeh_driver.c             |  4 ++
>>>  arch/powerpc/kernel/of_platform.c            |  3 +-
>>>  arch/powerpc/kernel/pci-common.c             |  4 --
>>>  arch/powerpc/kernel/pci_32.c                 |  4 ++
>>>  arch/powerpc/kernel/pci_64.c                 | 12 +++-
>>>  arch/powerpc/platforms/powernv/eeh-powernv.c | 41 +++++------
>>>  arch/powerpc/platforms/powernv/pci.c         |  7 +-
>>>  arch/powerpc/platforms/powernv/pci.h         |  2 -
>>>  arch/powerpc/platforms/pseries/eeh_pseries.c | 75 +++++++++++---------
>>>  arch/powerpc/platforms/pseries/pci.c         |  7 +-
>>>  13 files changed, 122 insertions(+), 118 deletions(-)
>>>
>>


      reply	other threads:[~2019-04-19 22:38 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-20  2:58 [PATCH 0/8] Sam Bobroff
2019-03-20  2:58 ` [PATCH 1/8] powerpc/64: Adjust order in pcibios_init() Sam Bobroff
2019-03-20  6:03   ` Alexey Kardashevskiy
2019-03-20  2:58 ` [PATCH 2/8] powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-04-08  6:50     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 3/8] powerpc/eeh: Convert PNV_PHB_FLAG_EEH to global flag Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-04-09  1:41     ` Sam Bobroff
2019-04-18  9:51   ` Oliver O'Halloran
2019-04-30  5:30     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 4/8] powerpc/eeh: Improve debug messages around device addition Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-03-20  2:58 ` [PATCH 5/8] powerpc/eeh: Add eeh_show_enabled() Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-03-20  6:24     ` Oliver
2019-04-09  3:30     ` Sam Bobroff
2019-04-18 10:01   ` Oliver O'Halloran
2019-04-30  5:44     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 6/8] powerpc/eeh: Initialize EEH address cache earlier Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-04-18 10:13   ` Oliver O'Halloran
2019-04-30  5:54     ` Sam Bobroff
2019-03-20  2:58 ` [PATCH 7/8] powerpc/eeh: EEH for pSeries hot plug Sam Bobroff
2019-03-20  6:02   ` Alexey Kardashevskiy
2019-03-20  2:58 ` [PATCH 8/8] powerpc/eeh: Remove eeh_probe_devices() and eeh_addr_cache_build() Sam Bobroff
2019-03-20  6:05   ` Alexey Kardashevskiy
2019-04-09  3:31     ` Sam Bobroff
2019-04-12  0:55 ` [PATCH 0/8] Tyrel Datwyler
2019-04-15  3:41   ` Sam Bobroff
2019-04-19 22:36     ` Tyrel Datwyler [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a9d78cf8-4d50-d0f7-7545-4469034ec1d8@linux.vnet.ibm.com \
    --to=tyreld@linux.vnet.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=sbobroff@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).