From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org
Subject: Re: PCI resource allocation mismatch with BIOS
Date: Wed, 30 Nov 2022 09:57:18 +0200
Message-ID: <Y4cM3qYnaHl3fQsU@black.fi.intel.com>
In-Reply-To: <20221130074347.GC8198@wunner.de>

Hi,

On Wed, Nov 30, 2022 at 08:43:47AM +0100, Lukas Wunner wrote:
> On Tue, Nov 29, 2022 at 09:12:49AM -0700, Alex Williamson wrote:
> > On Tue, 29 Nov 2022 17:06:26 +0100 Lukas Wunner <lukas@wunner.de> wrote:
> > > On Tue, Nov 29, 2022 at 08:46:46AM -0700, Alex Williamson wrote:
> > > > Maybe the elephant in the room is why it's apparently such common
> > > > practice to need to perform a hard reset these devices outside of
> > > > virtualization scenarios...  
> > > 
> > > These GPUs are used as accelerators in cloud environments.
> > > 
> > > They're reset to a pristine state when handed out to another tenant
> > > to avoid info leaks from the previous tenant.
> > > 
> > > That should be a legitimate usage of PCIe reset, no?
> > 
> > Absolutely, but why the whole switch?  Thanks,
> 
> The reset is propagated down the hierarchy, so by resetting the
> Switch Upstream Port, it is guaranteed that all endpoints are
> reset with just a single operation.  Per PCIe r6.0.1 sec 6.6.1:
> 
>    "For a Switch, the following must cause a hot reset to be sent
>     on all Downstream Ports:
>     [...]
>     Receiving a hot reset on the Upstream Port"

Adding the reasons I got from the GPU folks:

In addition to resetting the GPU when it is handed over to another
tenant, this is used for recovery. The "first level" recovery is
handled by the graphics driver, which does a Function Level Reset. If
that does not work, data centers may trigger a reset at a higher level
(root port or switch upstream port) to reset the whole card, the
so-called "big hammer".
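
As a rough illustration only (this is not something the GPU folks
provided), the escalation could be sketched in Python as below. The
names recover(), sysfs_reset() and is_healthy are hypothetical. The
only real interface assumed is the per-device "reset" attribute under
/sys/bus/pci/devices/<BDF>/, where writing "1" asks the kernel to reset
that device. Which reset method the kernel picks, and whether a given
port exposes the attribute at all, depends on the device and kernel, so
the higher-level step is left as a caller-supplied callable.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence, Tuple

# One escalation level: a label plus a callable that performs the reset.
ResetStep = Tuple[str, Callable[[], None]]


def sysfs_reset(bdf: str, sysfs_root: str = "/sys/bus/pci/devices") -> None:
    """Write "1" to the device's sysfs reset attribute (needs root)."""
    (Path(sysfs_root) / bdf / "reset").write_text("1")


def recover(steps: Sequence[ResetStep],
            is_healthy: Callable[[], bool]) -> Optional[str]:
    """Try each reset level in order, smallest hammer first.

    Returns the label of the first level after which is_healthy()
    reports True, or None if every level failed.
    """
    for name, do_reset in steps:
        try:
            do_reset()
        except OSError:
            continue  # this level is unavailable or failed; escalate
        if is_healthy():
            return name
    return None
```

A caller would pass something like [("function", ...), ("card", ...)],
with the first entry resetting a single function and the second one
resetting at the root port or switch upstream port.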

There is another use case too: firmware upgrade. This allows data
centers to upgrade firmware on those cards without needing to reboot.
They just reset the whole card to make it run the new firmware.
