From: Niklas Schnelle <schnelle@linux.ibm.com>
To: linasvepstas@gmail.com
Cc: Bjorn Helgaas <bhelgaas@google.com>, "Oliver O'Halloran" <oohall@gmail.com>, Russell Currey <ruscur@russell.cc>, linuxppc-dev@lists.ozlabs.org, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, linux-s390@vger.kernel.org, Matthew Rosato <mjrosato@linux.ibm.com>, Pierre Morel <pmorel@linux.ibm.com>
Subject: Re: [PATCH 0/5] s390/pci: automatic error recovery
Date: Tue, 07 Sep 2021 09:49:09 +0200
Message-ID: <bddf2d1867585427680cb093cb10d5d15d7aa8d3.camel@linux.ibm.com>
In-Reply-To: <CAHrUA34TK6U4TB34FHejott9TdFvSgAedOpmro-Uj2ZwnvzecQ@mail.gmail.com>

On Mon, 2021-09-06 at 21:05 -0500, Linas Vepstas wrote:
> On Mon, Sep 6, 2021 at 4:49 AM Niklas Schnelle <schnelle@linux.ibm.com>
> wrote:
>
> > I believe we might be the first
> > implementation of PCI device recovery in a virtualized setting requiring
> > us to
> > coordinate the device reset with the hypervisor platform by issuing a
> > disable
> > and re-enable to the platform as well as starting the recovery following
> > a platform event.
> >
>
> I recall none of the details, but SRIOV is a standardized system for
> sharing a PCI device across multiple virtual machines. It has detailed info
> on what the hypervisor must do, and what the local OS instance must do to
> accomplish this.

Yes and in fact on s390 we make heavy use of SR-IOV.

> It's part of the PCI standard, and its more than a decade
> old now, maybe two. Being a part of the PCI standard, it was interoperable
> with error recovery, to the best of my recollection.

Maybe I worded things with a bit too much sensationalism and it might
even be that POWER supports error recovery also with virtualization,
though I'm not sure how far that goes. I believe you are right in that
SR-IOV supports the error recovery, after all this patch set also has
to work together with SRIOV enabled devices.
At least on s390 though until this patch set the error recovery
performed by the hypervisor stopped in the hypervisor. The missing part
added by this patch set is coordinating with device drivers in Linux to
determine where use of a recovered device can pick up after the PCIe
level error recovery is done. As for virtualization this coordination
of course needs to cross the hypervisor/guest boundary and at least for
KVM+QEMU I know for a fact that reporting a PCI error to the guest is
currently just a stub that actually completely stops the guest, so you
definitely don't get smooth error recovery there yet.

> At the time it was
> introduced, it got pushed very aggressively. The x86 hypervisor vendors
> were aiming at the heart of zseries, and were militant about it.

And yet we're still here, use SR-IOV ourselves and even support Linux +
KVM as a hypervisor you can use just the same on a mainframe, an x86,
POWER, or ARM system.

>
> -- Linas
>