linuxppc-dev.lists.ozlabs.org archive mirror
From: Alexander Graf <agraf@suse.de>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: aik@ozlabs.ru, kvm-ppc@vger.kernel.org,
	alex.williamson@redhat.com, qiudayu@linux.vnet.ibm.com,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 4/4] powerpc/eeh: Avoid event on passed PE
Date: Tue, 20 May 2014 15:49:57 +0200
Message-ID: <537B5D85.3010305@suse.de>
In-Reply-To: <20140520124504.GB28441@shangw>


On 20.05.14 14:45, Gavin Shan wrote:
> On Tue, May 20, 2014 at 02:14:56PM +0200, Alexander Graf wrote:
>> On 20.05.14 13:56, Gavin Shan wrote:
>>> On Tue, May 20, 2014 at 01:25:11PM +0200, Alexander Graf wrote:
>>>> On 20.05.14 10:30, Gavin Shan wrote:
>>>>> If we detect a frozen state on a PE that has been passed to a guest, we
>>>>> needn't handle it. Instead, we rely on the guest to detect and recover
>>>>> it. The patch avoids the EEH event on the frozen passed PE so that the
>>>>> guest has a chance to handle it.
>>>>>
>>>>> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>>> How does the guest learn about this failure? We'd need to inject an
>>>> error into it, no?
>>>>
>>> When an error exists at the HW level, reads of PCI config space or memory
>>> BARs return 0xFF's. On seeing 0xFF's, the guest retrieves the failure
>>> state, which is captured by HW automatically, via the RTAS call
>>> "ibm,read-slot-reset-state2". If "ibm,read-slot-reset-state2" reports
>>> errors in HW, the guest kernel starts recovery.
>>>
>>> It can be called "passive" reporting. There is possibly one case where
>>> the error is never reported: no device driver is bound to the VFIO
>>> PCI device and nothing accesses the device's config space or memory BARs.
>>> However, it doesn't matter. As we don't use the device, we needn't detect
>>> and recover the error at all.
>> So if the guest is waiting for an interrupt to happen it will wait
>> forever? Not really nice.
>>
> Nope, the error reporting in guest isn't interrupt-driven. It's always
> "polling" :-)

That sucks :).
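
For reference, the "passive" detection described above could be sketched roughly as below. This is only an illustration under assumptions: check_read(), the pe_hw_state codes and the query callback are hypothetical names standing in for the guest's read path and for the "ibm,read-slot-reset-state2" RTAS call; none of them are real kernel or RTAS interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state codes; the real RTAS return values differ. */
enum pe_hw_state { PE_NORMAL = 0, PE_FROZEN = 1 };

/* Hypothetical stand-in for the "ibm,read-slot-reset-state2" query. */
typedef enum pe_hw_state (*query_state_fn)(void *ctx);

/* Stub queries, used only to exercise the sketch. */
static enum pe_hw_state always_frozen(void *ctx) { (void)ctx; return PE_FROZEN; }
static enum pe_hw_state always_normal(void *ctx) { (void)ctx; return PE_NORMAL; }

/*
 * Called with the value of a 32-bit config space or BAR read.
 * Returns 1 when recovery should start, 0 otherwise.
 */
static int check_read(uint32_t value, query_state_fn query, void *ctx)
{
	assert(query != NULL);

	if (value != UINT32_MAX)
		return 0;	/* ordinary data, nothing to ask about */

	/* All-ones can also be legitimate data, so confirm via the query. */
	return query(ctx) == PE_FROZEN;
}
```

The point of the sketch is that nothing happens until somebody reads from the device, which is why an idle device never reports its error.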

>
>>>> I think what you want is an irqfd that the in-kernel eeh code
>>>> notifies when it sees a failure. When such an fd exists, the kernel
>>>> skips its own error handling.
>>>>
>>> Yeah, it's a good idea and something for me to improve in phase II. We
>>> can discuss it more later.
>> I think it makes sense to at least walk into that direction
>> immediately. The reason I brought it up in the context of this patch
>> is that with an irqfd you wouldn't need the passed flag at all.
>>
> I don't see how it can avoid the "passed" flag. Without the flag, any
> PCI config and memory BAR access on host side could trigger EEH recovery
> for those PCI devices passed to guest. That's unexpected behaviour.

Instead of

   if (passed_flag)
     return;

you would do

   if (trigger_irqfd) {
     trigger_irqfd();
     return;
   }

which would be a much nicer, generic interface.
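
To make the shape of that interface a bit more concrete, here is a minimal userspace sketch using eventfd(2). It is not kernel code and makes assumptions: struct pe_state, its fields and pe_report_error() are hypothetical names for illustration, and the "host recovery" path is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical per-PE state; not the kernel's struct eeh_pe. */
struct pe_state {
	int notify_fd;		/* eventfd registered by the PE's owner, or -1 */
	int host_recoveries;	/* errors the host ended up handling itself */
};

/*
 * Returns 1 if the error was forwarded to the owner of the PE,
 * 0 if the host handled it.
 */
static int pe_report_error(struct pe_state *pe)
{
	assert(pe != NULL);

	if (pe->notify_fd >= 0) {
		uint64_t one = 1;

		/* An eventfd write adds the 8-byte value to its counter. */
		if (write(pe->notify_fd, &one, sizeof(one)) ==
		    (ssize_t)sizeof(one))
			return 1;
		/* If the write fails, fall through to host handling. */
	}

	pe->host_recoveries++;	/* placeholder for the host's own recovery */
	return 0;
}
```

Whoever owns the fd reads the counter to learn how many errors were signalled; with no fd registered the host keeps its existing behaviour.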

> For host, we have 2 ways to report errors: interrupt driven and polling.
> For the guest, we only have "polling" :-)

And the interrupt path is powernv specific? Does sPAPR specify anything 
here?

>
>>>   For now, what I have in my head is something
>>> like this:
>>>
>>>        [ Host ] -> Error detected -> irqfd (or eventfd) -> QEMU
>>>                                                             |
>>>                                     -------------(A)---------
>>>                                     |
>>>                          Send one EEH event to guest kernel
>>>                                     |
>>>                          Guest kernel starts the recovery
>>>
>>> (A): I haven't figured out a convenient way to do the EEH event injection yet.
>> How does the guest learn about errors in pHyp?
>>
> It relies on "polling".

Sigh ;).

So how about we just implement this whole thing properly as irqfd?
Whether QEMU can actually do anything with the interrupt is a different
question - we can leave it be for now. But we could model all the code
with the assumption that it should either handle the error itself or
trigger an irqfd write.


Alex
