From: Stefano Stabellini <sstabellini@kernel.org>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: edgar.iglesias@xilinx.com,
	'Stefano Stabellini' <sstabellini@kernel.org>,
	Vikram Sethi <vikrams@codeaurora.org>,
	'Wei Chen' <Wei.Chen@arm.com>,
	'Steve Capper' <Steve.Capper@arm.com>,
	'Andre Przywara' <andre.przywara@arm.com>,
	manish.jaggi@caviumnetworks.com,
	'Julien Grall' <julien.grall@linaro.org>,
	'Vikram Sethi' <vikrams@qti.qualcomm.com>,
	punit.agrawal@arm.com, 'Sameer Goel' <sgoel@qti.qualcomm.com>,
	'xen-devel' <xen-devel@lists.xenproject.org>,
	'Sinan Kaya' <okaya@qti.qualcomm.com>,
	'Dave P Martin' <Dave.Martin@arm.com>,
	'Vijaya Kumar K' <Vijaya.Kumar@caviumnetworks.com>
Subject: Re: [RFC] ARM PCI Passthrough design document
Date: Fri, 7 Jul 2017 14:50:01 -0700 (PDT)
Message-ID: <alpine.DEB.2.10.1707071420300.2919@sstabellini-ThinkPad-X260>
In-Reply-To: <20170707084915.hbl3h4mpqfk7jhpi@dhcp-3-128.uk.xensource.com>

On Fri, 7 Jul 2017, Roger Pau Monné wrote:
> On Thu, Jul 06, 2017 at 03:55:28PM -0500, Vikram Sethi wrote:
> > > > > AER: Will PCIe non-fatal and fatal errors (secondary bus reset for
> > > > > fatal) be recoverable in Xen?
> > > > > Will drivers in doms be notified about fatal errors so they can be
> > > > > quiesced before doing secondary bus reset in Xen?
> > > > > Will Xen support Firmware First Error handling for AER? i.e. when
> > > > > the platform does Firmware First error handling for AER and/or
> > > > > filtering of AER, and sends the associated ACPI HEST logs to Xen.
> > > > > How will AER notification and logs be propagated to the doms:
> > > > > injected ACPI HEST?
> > >
> > > Hm, I'm not sure I follow here, I don't see AER tied to ACPI. AER is a PCIe
> > > capability, and according to the spec can be setup completely independent to
> > > ACPI.
> > >
> > True, it can be independent if not using firmware first AER handling (FFH). But 
> > Firmware tells the OS whether firmware first is in use.
> > If FFH is in use, the AER interrupt goes to firmware and then firmware processes 
> 
> I'm sorry, but how is the firmware supposed to know which interrupt is
> AER using? That's AFAIK setup in the PCI AER capabilities, and
> depends on whether the OS configures the device to use MSI or MSI-X.
> 
> Is there some kind of side-band mechanism that delivers the AER
> interrupt using a different method?
> 
> > the AER logs, filters errors, and sends an ACPI HEST log with the filtered AER 
> > regs to OS along with an ACPI event/interrupt. Kernel is not supposed to touch 
> > the AER registers directly in this case, but act on the register values in the 
> > HEST log.
> > http://elixir.free-electrons.com/linux/latest/source/drivers/pci/pcie/aer/aerdrv_acpi.c#L94
> 
> That's not a problem IMHO, Xen could even mask the AER capability from
> the Dom0/guest completely if needed.
> 
> > If Firmware is using FFH, Xen will get a HEST log with AER registers, and must 
> > parse those registers instead of reading AER config space.
> 
> Xen will not get an event, it's going to be delivered to Dom0 because
> when using ACPI Dom0 is the OSPM (not Xen). I assume this event is
> going to be notified by triggering an interrupt from the ACPI SCI?

It is still possible to get the event in Xen, either by having Dom0 tell
Xen about it, or by moving ACPI SCI handling into Xen. If we move ACPI SCI
handling into Xen, we could still forward a virtual SCI interrupt to Dom0
in cases where Xen decides that Dom0 should be the one handling the
event. In other cases, where Xen knows how to handle the event,
nothing would be sent to Dom0. Would that work?
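
To make the second option a bit more concrete, here is a rough sketch of
the routing decision I have in mind. Every name in it (route_sci, the
event kinds, the stats struct) is made up for illustration and is not
existing Xen code; which events Xen would really claim is exactly the
open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event classes; not taken from Xen or ACPI headers. */
enum sci_event_kind {
    SCI_EVENT_AER_HEST, /* firmware-first AER report delivered via HEST */
    SCI_EVENT_OTHER     /* anything Xen does not know how to handle */
};

enum sci_disposition {
    SCI_HANDLED_IN_XEN,   /* Xen consumed the event; Dom0 sees nothing */
    SCI_FORWARDED_TO_DOM0 /* Xen would inject a virtual SCI into Dom0 */
};

/* Counters standing in for the real side effects of each path. */
struct sci_stats {
    unsigned handled_in_xen;
    unsigned forwarded;
};

/* Assumption: Xen only claims the AER/HEST events discussed here. */
static bool xen_can_handle(enum sci_event_kind kind)
{
    return kind == SCI_EVENT_AER_HEST;
}

/* Decide who handles a physical SCI once Xen owns SCI handling. */
enum sci_disposition route_sci(enum sci_event_kind kind,
                               struct sci_stats *stats)
{
    assert(stats != NULL);

    if (xen_can_handle(kind)) {
        stats->handled_in_xen++;
        return SCI_HANDLED_IN_XEN;
    }

    stats->forwarded++;
    return SCI_FORWARDED_TO_DOM0;
}
```

The point of the sketch is only that Dom0 stays the OSPM for everything
Xen does not explicitly claim.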


> > After the AER registers have been parsed (either from HEST log or native Xen AER 
> > interrupt handler), at least for fatal errors, Xen needs to send notification to 
> > the DOM with the device passthrough so that its driver(s) can be quiesced (via 
> > callbacks to dev->driver->err_handler->error_detected for linux) before hot 
> > reset/secondary bus reset.
> 
> I don't think this is relevant/true given the statement above (Dom0
> being OSPM and receiving the event).
> 
> > Whether FFH is in use or not, Xen has 2 choices in how to present the error to 
> > doms for quiescing before secondary bus reset:
> 
> How is this secondary bus reset performed?

It is based on writing to PCI config space registers
(drivers/pci/pci.c:pci_reset_secondary_bus). If Xen is in charge of
ECAM, it shouldn't be an issue for Xen to do it.
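
For reference, the sequence boils down to something like the sketch
below. It is a standalone model, not Xen or Linux code: the fake_bridge
struct and the cfg_read16/cfg_write16/delay_ms helpers are stand-ins I
made up for real ECAM accesses and timers. The register offset and bit
are the standard Bridge Control ones; the 2 ms / 1 s delays are what I
remember Linux using and should be checked against the PCI spec.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_BRIDGE_CONTROL       0x3E /* Bridge Control, type 1 header */
#define PCI_BRIDGE_CTL_BUS_RESET 0x40 /* Secondary Bus Reset bit */

/* Hypothetical in-memory bridge standing in for real ECAM accesses. */
struct fake_bridge {
    uint8_t  cfg[256];        /* config space image */
    unsigned slept_ms;        /* total time "slept" by the reset path */
    unsigned reset_asserted;  /* writes that had the SBR bit set */
};

static uint16_t cfg_read16(const struct fake_bridge *b, unsigned off)
{
    return (uint16_t)(b->cfg[off] | (b->cfg[off + 1] << 8));
}

static void cfg_write16(struct fake_bridge *b, unsigned off, uint16_t val)
{
    if (off == PCI_BRIDGE_CONTROL && (val & PCI_BRIDGE_CTL_BUS_RESET))
        b->reset_asserted++;
    b->cfg[off] = (uint8_t)(val & 0xFF);
    b->cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Stand-in for a real sleep; each call is a point where a Dom0 vcpu
 * doing this on its own could trap and be descheduled. */
static void delay_ms(struct fake_bridge *b, unsigned ms)
{
    b->slept_ms += ms;
}

/* Pulse Secondary Bus Reset on the bridge, preserving the other bits. */
void secondary_bus_reset(struct fake_bridge *b)
{
    uint16_t ctrl = cfg_read16(b, PCI_BRIDGE_CONTROL);

    cfg_write16(b, PCI_BRIDGE_CONTROL,
                (uint16_t)(ctrl | PCI_BRIDGE_CTL_BUS_RESET));
    delay_ms(b, 2);    /* hold reset asserted (assumed value) */

    cfg_write16(b, PCI_BRIDGE_CONTROL,
                (uint16_t)(ctrl & ~PCI_BRIDGE_CTL_BUS_RESET));
    delay_ms(b, 1000); /* let devices below the bridge settle (assumed) */
}
```

Nothing here is bridge-specific: it only touches the standard type 1
header, which is why owning ECAM is enough for Xen to do it.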


> Is it something specific to each bridge or it's a standard
> interface?
> 
> Can it be done directly by Dom0, or should it be done by Xen?
> 
> > a. Send a HEST log and ACPI interrupt/event to dom if it booted ACPI dom and 
> > linux dom calls aer_recover_queue from ACPI ghes path 
> > http://elixir.free-electrons.com/linux/latest/source/drivers/pci/pcie/aer/aerdrv_core.c#L592
> > b. Present a Root port wired interrupt source in dom ACPI/DT, and inject that 
> > irq in the GIC LR registers. When dom kernel processes the interrupt and queries 
> 
> You lost me here, I have no knowledge of ARM, and I don't know what
> GIC LR is at all.

GIC LRs (List Registers) are registers specific to the ARM Generic
Interrupt Controller that allow a hypervisor to inject interrupts into a
guest.  Vikram is saying that the irq could be injected into the guest.
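
As a rough illustration of what "injecting via an LR" means, the sketch
below fills a free list register with a pending virtual interrupt. It is
a software-only model: the lrs array stands in for the hardware
registers, the function names are made up, and the field layout is the
GICv2 GICH_LR one as I understand it (GICv3 uses a different, 64-bit
layout), so treat the constants as assumptions to be checked against the
GIC architecture spec.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed GICv2 GICH_LR fields: VirtualID [9:0], Priority [27:23],
 * State [29:28] (0 = invalid/free, 1 = pending). */
#define LR_VIRQ_MASK     0x3FFu
#define LR_PRIO_SHIFT    23
#define LR_PRIO_MASK     0x1Fu
#define LR_STATE_SHIFT   28
#define LR_STATE_MASK    0x3u
#define LR_STATE_PENDING 0x1u

/* Build an LR value describing a pending, software-injected vIRQ. */
uint32_t lr_encode_pending(uint32_t virq, uint32_t prio5)
{
    return (virq & LR_VIRQ_MASK) |
           ((prio5 & LR_PRIO_MASK) << LR_PRIO_SHIFT) |
           (LR_STATE_PENDING << LR_STATE_SHIFT);
}

/* An LR whose state field is 0 holds no interrupt and can be reused. */
int lr_is_free(uint32_t lr)
{
    return ((lr >> LR_STATE_SHIFT) & LR_STATE_MASK) == 0;
}

/* Place the vIRQ in the first free LR; return its index, or -1 if all
 * LRs are in use (a real hypervisor would queue the interrupt then). */
int inject_virq(uint32_t *lrs, int nr_lrs, uint32_t virq, uint32_t prio5)
{
    int i;

    for (i = 0; i < nr_lrs; i++) {
        if (lr_is_free(lrs[i])) {
            lrs[i] = lr_encode_pending(virq, prio5);
            return i;
        }
    }
    return -1;
}
```

When the guest vcpu next runs, the GIC's virtual CPU interface presents
whatever is pending in the LRs as if it were a normal interrupt.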


> > config space AER, Xen emulates the AER values it wants the dom to see (in FFH 
> > case based on register values in HEST), and if FFH was in use, not actually 
> > allow the dom to clear out the AER registers.
> > 
> > Option b is probably better/easier since it works for ACPI/DT dom.
> 
> So as I understand it, the flow is the following:
> 
> 1. Hardware generates an error.
> 2. This error triggers an interrupt that's delivered to Dom0 (either
>    using an ACPI SCI or a specific AER MSI vector)
> 3. *Someone* has to do a secondary bus reset.
> 
> My question would be, who (either Xen or Dom0) should perform the bus
> reset? (and why).

I am interested in Vikram's reply; he knows more than I do about this.
However, my gut feeling is that it's best to do it in Xen because
otherwise Xen might end up having to wait for Dom0 to complete
the reset. The operation is not short and it includes a couple of
sleeps: each sleep is an opportunity to trap into Xen again and risk
descheduling the Dom0 vcpu.


> > In my view this is the basic AER error handling leaving the devices 
> > inaccessible.
> > To recover/resume the devices, the owning dom would need to signal Xen once all 
> > its driver(s) have quiesced, letting Xen know it is ok to do the secondary bus 
> > reset (for AER fatal errors). The best way to signal this would be to let the 
> > dom try to hit SBR in the Root port bridge control register in config space, and 
> > Xen traps that and actually does the BCR.SBR write.
> >
> > Since Xen controls the ECAM config space access in Julien's proposed design, I 
> > don't see any fundamental issues with the above flow fitting into the design.
> 
> I think it's very hard for me (or Julien) to know exactly how all the
> PCI capabilities behave and interact with other components (like
> ACPI).
> 
> You seem to have a good amount of knowledge about this stuff, would
> you mind writing your proposal as a diff to Julien's original
> proposal, so that it can be properly reviewed and merged into the
> design document?
