From: Bjorn Helgaas <helgaas@kernel.org>
To: Lyude Paul <lyude@redhat.com>
Cc: linux-pci@vger.kernel.org, nouveau@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org,
	Karol Herbst <kherbst@redhat.com>, Ben Skeggs <skeggsb@gmail.com>,
	stable@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] pci/quirks: Add quirk to reset nvgpu at boot for the Lenovo ThinkPad P50
Date: Thu, 25 Apr 2019 08:01:24 -0500
Message-ID: <20190425130124.GD11428@google.com>
In-Reply-To: <20190212220230.1568-1-lyude@redhat.com>

On Tue, Feb 12, 2019 at 05:02:30PM -0500, Lyude Paul wrote:
> On a very specific subset of ThinkPad P50 SKUs, particularly ones that
> come with a Quadro M1000M chip instead of the M2000M variant, the BIOS
> seems to have a very nasty habit of not always resetting the secondary
> Nvidia GPU between full reboots if the laptop is configured in Hybrid
> Graphics mode. The reason for this happening is unknown, but the
> following steps and possibly a good bit of patience will reproduce the
> issue:
> 
> 1. Boot up the laptop normally in Hybrid graphics mode
> 2. Make sure nouveau is loaded and that the GPU is awake
> 3. Allow the nvidia GPU to runtime suspend itself after being idle
> 4. Reboot the machine, the more sudden the better (e.g. sysrq-b may help)
> 5. If nouveau loads up properly, reboot the machine again and go back to
> step 2 until you reproduce the issue
> 
> This results in some very strange behavior: the GPU will
> quite literally be left in exactly the same state it was in when the
> previously booted kernel started the reboot. This has all sorts of bad
> side effects: for starters, this completely breaks nouveau, starting with a
> mysterious EVO channel failure that happens well before we've actually
> used the EVO channel for anything:
> 
> nouveau 0000:01:00.0: disp: chid 0 mthd 0000 data 00000400 00001000 00000002
> ...

> So to do this, we add a new pci quirk using
> DECLARE_PCI_FIXUP_CLASS_FINAL that will be invoked before the PCI probe
> at boot finishes. From there, we check to make sure that this is indeed
> the specific P50 variant of this GPU. We also make sure that the GPU PCI
> device is advertising NoReset- in order to prevent us from trying to
> reset the GPU when the machine is in Dedicated graphics mode (where the
> GPU being initialized by the BIOS is normal and expected). Finally, we
> try mapping the MMIO space for the GPU which should only work if the GPU
> is actually active in D0 mode. We can then read the magic 0x2240c
> register on the GPU, which will have bit 1 set if the GPU's firmware has
> already been posted during a previous boot. Once we've confirmed all of
> this, we reset the PCI device and re-disable it - bringing the GPU back
> into a healthy state.
> 
> Signed-off-by: Lyude Paul <lyude@redhat.com>
> Cc: nouveau@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: Karol Herbst <kherbst@redhat.com>
> Cc: Ben Skeggs <skeggsb@gmail.com>
> Cc: stable@vger.kernel.org

Applied to pci/misc for v5.2, thanks!

> ---
>  drivers/pci/quirks.c | 65 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 65 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b0a413f3f7ca..948492fda8bf 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5117,3 +5117,68 @@ SWITCHTEC_QUIRK(0x8573);  /* PFXI 48XG3 */
>  SWITCHTEC_QUIRK(0x8574);  /* PFXI 64XG3 */
>  SWITCHTEC_QUIRK(0x8575);  /* PFXI 80XG3 */
>  SWITCHTEC_QUIRK(0x8576);  /* PFXI 96XG3 */
> +
> +/*
> + * On certain Lenovo Thinkpad P50 SKUs, specifically those with a Nvidia
> + * Quadro M1000M, the BIOS will occasionally make the mistake of not resetting
> + * the nvidia GPU between reboots if the system is configured to use hybrid
> + * graphics mode. This results in the GPU being left in whatever state it was
> + * in during the previous boot which causes spurious interrupts from the GPU,
> + * which in turn cause us to disable the wrong IRQs and end up breaking the
> + * touchpad. Unsurprisingly, this also completely breaks nouveau.
> + *
> + * Luckily, it seems a simple reset of the PCI device for the nvidia GPU
> + * manages to bring the GPU back into a clean state and fix all of these
> + * issues. Additionally since the GPU will report NoReset+ when the machine is
> + * configured in Dedicated display mode, we don't need to worry about
> + * accidentally resetting the GPU when it's supposed to already be
> + * initialized.
> + */
> +static void
> +quirk_lenovo_thinkpad_p50_nvgpu_survives_reboot(struct pci_dev *pdev)
> +{
> +	void __iomem *map;
> +	int ret;
> +
> +	if (pdev->subsystem_vendor != PCI_VENDOR_ID_LENOVO ||
> +	    pdev->subsystem_device != 0x222e ||
> +	    !pdev->reset_fn)
> +		return;
> +
> +	/*
> +	 * If we can't enable the device's mmio space, it's probably not even
> +	 * initialized. This is fine, and means we can just skip the quirk
> +	 * entirely.
> +	 */
> +	if (pci_enable_device_mem(pdev)) {
> +		pci_dbg(pdev, "Can't enable device mem, no reset needed\n");
> +		return;
> +	}
> +
> +	/* Taken from drivers/gpu/drm/nouveau/engine/device/base.c */
> +	map = ioremap(pci_resource_start(pdev, 0), 0x102000);
> +	if (!map) {
> +		pci_err(pdev, "Can't map MMIO space, this is probably very bad\n");
> +		goto out_disable;
> +	}
> +
> +	/*
> +	 * Be extra careful, and make sure that the GPU firmware is posted
> +	 * before trying a reset
> +	 */
> +	if (ioread32(map + 0x2240c) & 0x2) {
> +		pci_info(pdev,
> +			 FW_BUG "GPU left initialized by EFI, resetting\n");
> +		ret = pci_reset_function(pdev);
> +		if (ret < 0)
> +			pci_err(pdev, "Failed to reset GPU: %d\n", ret);
> +	}
> +
> +	iounmap(map);
> +out_disable:
> +	pci_disable_device(pdev);
> +}
> +
> +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1,
> +			      PCI_CLASS_DISPLAY_VGA, 8,
> +			      quirk_lenovo_thinkpad_p50_nvgpu_survives_reboot);
> -- 
> 2.20.1
> 
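
For readers who want to follow the quirk's decision flow without a kernel
tree at hand, here is a minimal userspace sketch of the checks described in
the commit message. It is not the patch itself: struct mock_pci_dev and the
functions fixup_matches() and needs_reset() are hypothetical stand-ins, and
the bool fields only simulate the outcome of pdev->reset_fn,
pci_enable_device_mem() and ioread32(map + 0x2240c). The IDs 0x13b1 and
0x222e, the class shift of 8, and the 0x2 mask come from the patch; 0x10de,
0x17aa and 0x0300 are the values I understand the kernel's
PCI_VENDOR_ID_NVIDIA, PCI_VENDOR_ID_LENOVO and PCI_CLASS_DISPLAY_VGA
constants to have.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed numeric values of the kernel constants used by the patch. */
#define VENDOR_NVIDIA     0x10de
#define VENDOR_LENOVO     0x17aa
#define CLASS_DISPLAY_VGA 0x0300

/* Hypothetical stand-in for the struct pci_dev state the quirk looks at. */
struct mock_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint32_t class_code;        /* 24 bits: base class, subclass, prog-if */
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
	bool can_reset;             /* simulates pdev->reset_fn being set */
	bool mmio_enabled;          /* simulates pci_enable_device_mem() == 0 */
	uint32_t reg_2240c;         /* simulates ioread32(map + 0x2240c) */
};

/*
 * Mirrors the DECLARE_PCI_FIXUP_CLASS_FINAL() match: vendor, device, and
 * the class code shifted right by 8 so the prog-if byte is ignored.
 */
bool fixup_matches(const struct mock_pci_dev *d)
{
	return d->vendor == VENDOR_NVIDIA &&
	       d->device == 0x13b1 &&
	       (d->class_code >> 8) == CLASS_DISPLAY_VGA;
}

/* Mirrors the order of the checks in the quirk body. */
bool needs_reset(const struct mock_pci_dev *d)
{
	if (!fixup_matches(d))
		return false;

	/* Only the P50 subsystem ID, and only if a reset method exists. */
	if (d->subsystem_vendor != VENDOR_LENOVO ||
	    d->subsystem_device != 0x222e ||
	    !d->can_reset)
		return false;

	/* MMIO cannot be enabled: GPU is probably not initialized. */
	if (!d->mmio_enabled)
		return false;

	/* Bit 1 set means the GPU firmware was already posted. */
	return (d->reg_2240c & 0x2) != 0;
}
```

In this model a device that passes every ID check but reads back 0x1 from
the register is left alone, which matches the "be extra careful" comment in
the patch: the reset is only attempted when bit 1 is set.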

