linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>,
	Mika Westerberg <mika.westerberg@intel.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Bjorn Helgaas <helgaas@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Linux PCI <linux-pci@vger.kernel.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	nouveau <nouveau@lists.freedesktop.org>,
	Dave Airlie <airlied@gmail.com>,
	Mario Limonciello <Mario.Limonciello@dell.com>
Subject: Re: [PATCH v4] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges
Date: Mon, 9 Dec 2019 12:38:53 +0100	[thread overview]
Message-ID: <CAJZ5v0hcONxiWD+jpBe62H1SZ-84iNxT+QCn8mcesB1C7SVWjw@mail.gmail.com> (raw)
In-Reply-To: <CACO55tu+hT1WGbBn_nxLR=A-X6YWmeuz-UztJKw0QAFQDDV_xg@mail.gmail.com>

On Mon, Dec 9, 2019 at 12:17 PM Karol Herbst <kherbst@redhat.com> wrote:
>
> anybody any other ideas?

Not yet, but I'm trying to collect some more information.

> It seems that both patches don't really fix
> the issue and I have no idea left on my side to try out. The only
> thing left I could do to further investigate would be to reverse
> engineer the Nvidia driver as they support runpm on Turing+ GPUs now,
> but I've heard users having similar issues to the one Lyude told us
> about... and I couldn't verify that the patches help there either in a
> reliable way.

It looks like the newer (8+) versions of Windows expect the GPU driver
to prepare the GPU for power removal in some specific way and the
latter fails if the GPU has not been prepared as expected.

Because testing indicates that the Windows 7 path in the platform
firmware works, it may be worth trying to do what it does to the PCIe
link before invoking the _OFF method for the power resource
controlling the GPU power.

If the Mika's theory that the Win7 path simply turns the PCIe link off
is correct, then whatever the _OFF method tries to do to the link
after that should not matter.

> On Wed, Nov 27, 2019 at 8:55 PM Lyude Paul <lyude@redhat.com> wrote:
> >
> > On Wed, 2019-11-27 at 12:51 +0100, Karol Herbst wrote:
> > > On Wed, Nov 27, 2019 at 12:49 PM Mika Westerberg
> > > <mika.westerberg@intel.com> wrote:
> > > > On Tue, Nov 26, 2019 at 06:10:36PM -0500, Lyude Paul wrote:
> > > > > Hey-this is almost certainly not the right place in this thread to
> > > > > respond,
> > > > > but this thread has gotten so deep evolution can't push the subject
> > > > > further to
> > > > > the right, heh. So I'll just respond here.
> > > >
> > > > :)
> > > >
> > > > > I've been following this and helping out Karol with testing here and
> > > > > there.
> > > > > They had me test Bjorn's PCI branch on the X1 Extreme 2nd generation,
> > > > > which
> > > > > has a turing GPU and 8086:1901 PCI bridge.
> > > > >
> > > > > I was about to say "the patch fixed things, hooray!" but it seems that
> > > > > after
> > > > > trying runtime suspend/resume a couple times things fall apart again:
> > > >
> > > > You mean $subject patch, no?
> > > >
> > >
> > > no, I told Lyude to test the pci/pm branch as the runpm errors we saw
> > > on that machine looked different. Some BAR error the GPU reported
> > > after it got resumed, so I was wondering if the delays were helping
> > > with that. But after some cycles it still caused the same issue, that
> > > the GPU disappeared. Later testing also showed that my patch also
> > > didn't seem to help with this error sadly :/
> > >
> > > > > [  686.883247] nouveau 0000:01:00.0: DRM: suspending object tree...
> > > > > [  752.866484] ACPI Error: Aborting method \_SB.PCI0.PEG0.PEGP.NVPO due
> > > > > to previous error (AE_AML_LOOP_TIMEOUT) (20190816/psparse-529)
> > > > > [  752.866508] ACPI Error: Aborting method \_SB.PCI0.PGON due to
> > > > > previous error (AE_AML_LOOP_TIMEOUT) (20190816/psparse-529)
> > > > > [  752.866521] ACPI Error: Aborting method \_SB.PCI0.PEG0.PG00._ON due
> > > > > to previous error (AE_AML_LOOP_TIMEOUT) (20190816/psparse-529)
> > > >
> > > > This is probably the culprit. The same AML code fails to properly turn
> > > > on the device.
> > > >
> > > > Is acpidump from this system available somewhere?
> >
> > Attached it to this email
> >
> > > >
> > --
> > Cheers,
> >         Lyude Paul
>

  reply	other threads:[~2019-12-09 11:39 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-17 12:19 [PATCH v4] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges Karol Herbst
2019-11-14 19:17 ` Karol Herbst
2019-11-19 20:06 ` Dave Airlie
2019-11-19 21:49 ` Bjorn Helgaas
2019-11-19 22:26   ` Karol Herbst
2019-11-19 22:57     ` Bjorn Helgaas
2019-11-20 10:18     ` Mika Westerberg
2019-11-20 10:52       ` Rafael J. Wysocki
2019-11-20 11:22         ` Mika Westerberg
2019-11-20 11:48           ` Rafael J. Wysocki
2019-11-20 11:51             ` Karol Herbst
2019-11-20 12:06               ` Rafael J. Wysocki
2019-11-20 12:09                 ` Karol Herbst
2019-11-20 12:14                   ` Rafael J. Wysocki
2019-11-20 12:19                     ` Karol Herbst
2019-11-20 12:11                 ` Rafael J. Wysocki
2019-11-20 11:51           ` Mika Westerberg
2019-11-20 11:54             ` Karol Herbst
2019-11-20 11:58               ` Karol Herbst
2019-11-20 12:09                 ` Mika Westerberg
2019-11-20 12:11                   ` Karol Herbst
2019-11-20 15:15                     ` Mika Westerberg
2019-11-20 15:37                       ` Karol Herbst
2019-11-20 15:53                         ` Mika Westerberg
2019-11-20 16:23                           ` Mika Westerberg
2019-11-20 21:36                             ` Karol Herbst
2019-11-21 10:14                               ` Mika Westerberg
2019-11-21 11:03                                 ` Rafael J. Wysocki
2019-11-21 11:08                                   ` Rafael J. Wysocki
2019-11-21 11:15                                     ` Rafael J. Wysocki
2019-11-21 11:17                                   ` Mika Westerberg
2019-11-21 11:31                                     ` Rafael J. Wysocki
2019-11-20 21:37                           ` Rafael J. Wysocki
2019-11-20 21:40                             ` Karol Herbst
2019-11-20 22:29                               ` Rafael J. Wysocki
2019-11-21 11:28                                 ` Mika Westerberg
2019-11-21 11:34                                   ` Rafael J. Wysocki
2019-11-21 11:46                                     ` Mika Westerberg
2019-11-21 12:52                                       ` Mika Westerberg
2019-11-21 12:56                                         ` Karol Herbst
2019-11-21 15:43                                         ` Rafael J. Wysocki
2019-11-21 19:49                                           ` Mika Westerberg
2019-11-21 22:39                                             ` Rafael J. Wysocki
2019-11-21 22:50                                               ` Karol Herbst
2019-11-22  0:13                                                 ` Karol Herbst
2019-11-22  9:07                                                   ` Rafael J. Wysocki
2019-11-22 11:30                                                     ` Karol Herbst
2019-11-22 10:36                                               ` Mika Westerberg
2019-11-22 11:30                                                 ` Rafael J. Wysocki
2019-11-22 11:34                                                   ` Karol Herbst
2019-11-22 11:54                                                     ` Rafael J. Wysocki
2019-11-22 11:52                                                   ` Mika Westerberg
2019-11-22 12:15                                                     ` Rafael J. Wysocki
2019-11-21 12:52                                       ` Karol Herbst
2019-11-21 15:47                                         ` Rafael J. Wysocki
2019-11-21 16:06                                           ` Karol Herbst
2019-11-21 16:39                                             ` Rafael J. Wysocki
2019-11-26 23:10                                               ` Lyude Paul
2019-11-27 11:48                                                 ` Mika Westerberg
2019-11-27 11:51                                                   ` Karol Herbst
2019-11-27 19:51                                                     ` Lyude Paul
2019-12-09 11:17                                                       ` Karol Herbst
2019-12-09 11:38                                                         ` Rafael J. Wysocki [this message]
2019-12-09 12:24                                                           ` Karol Herbst
2019-12-10 19:58                                                           ` Dave Airlie
2019-12-10 20:49                                                             ` Karol Herbst
2020-01-13 15:31                                                               ` Karol Herbst

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJZ5v0hcONxiWD+jpBe62H1SZ-84iNxT+QCn8mcesB1C7SVWjw@mail.gmail.com \
    --to=rafael@kernel.org \
    --cc=Mario.Limonciello@dell.com \
    --cc=airlied@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=helgaas@kernel.org \
    --cc=kherbst@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=lyude@redhat.com \
    --cc=mika.westerberg@intel.com \
    --cc=nouveau@lists.freedesktop.org \
    --cc=rjw@rjwysocki.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).