linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Deucher <alexdeucher@gmail.com>
To: Daniel Drake <drake@endlessm.com>
Cc: "Deucher, Alexander" <alexander.deucher@amd.com>,
	Christian Koenig <christian.koenig@amd.com>,
	Chunming Zhou <David1.Zhou@amd.com>,
	amd-gfx list <amd-gfx@lists.freedesktop.org>,
	Linux PM <linux-pm@vger.kernel.org>
Subject: Re: [PATCH] drm/amdgpu: always reset asic when going into suspend
Date: Fri, 23 Oct 2020 11:04:59 -0400	[thread overview]
Message-ID: <CADnq5_MZLt76XSMOb3oQUDNqDoJeud8_82xikiaqDFE6pwkgGg@mail.gmail.com> (raw)
In-Reply-To: <CADnq5_PNGr4=MqpBeKbhxJ-gpniSCj7L0wO5_V6mjuwpKoaCAg@mail.gmail.com>

On Thu, Jan 16, 2020 at 10:14 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>
> On Wed, Jan 15, 2020 at 2:44 AM Daniel Drake <drake@endlessm.com> wrote:
> >
> > On Thu, Dec 19, 2019 at 10:08 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> > > I think there may be some AMD specific handling needed in
> > > drivers/acpi/sleep.c.  My understanding from reading the modern
> > > standby documents from MS is that each vendor needs to provide a
> > > platform specific PEP driver.  I'm not sure how much of that current
> > > code is Intel specific or not.
> >
> > I don't think there is anything Intel-specific in drivers/acpi/sleep.c.
> >
> > Reading more about PEP, I see that Linux supports PEP devices with
> > ACPI ID INT33A1 or PNP0D80. Indeed the Intel platforms we work with
> > have INT33A1 devices in their ACPI tables.
> >
> > This product has a \_SB.PEP ACPI device with _HID AMD0004 and _CID
> > PNP0D80. Full acpidump:
> > https://gist.github.com/dsd/ff3dfc0f63cdd9eba4a0fbd9e776e8be (see
> > ssdt7)
> >
> > This PEP device responds to a _DSM with UUID argument
> > "e3f32452-febc-43ce-9039-932122d37721", which is not the one
> > documented at https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf
> >
> > Nevertheless, there is some data about the GPU:
> >                     Package (0x04)
> >                     {
> >                         One,
> >                         "\\_SB.PCI0.GP17.VGA",
> >                         Zero,
> >                         0x03
> >                     },
> >
> > However since this data is identical to many other devices that
> > suspend and resume just fine, I wonder if it is really important.
> >
> > The one supported method does offer two calls which may mirror the
> > Display Off/On Notifications in the above spec:
> >                         Case (0x02)
> >                         {
> >                             \_SB.PCI0.SBRG.EC0.CSEE (0xB7)
> >                             Return (Zero)
> >                         }
> >                         Case (0x03)
> >                         {
> >                             \_SB.PCI0.SBRG.EC0.CSEE (0xB8)
> >                             Notify (\_SB.PCI0.SBRG.EC0.LID, 0x80) //
> > Status Change
> >                             Return (Zero)
> >                         }
> >
> > but I tried executing this code after suspending amdgpu, and the
> > problem still stands, amdgpu cannot wakeup correctly.
> >
> > There's nothing else really interesting in the PEP device as far as I can see.
> >
> > PEP things aside, I am still quite suspicious about the fact that
> > calling amdgpu_device_suspend() then amdgpu_device_resume() on
> > multiple products (not just this one) fails. It seems that this code
> > flow is relying on the BIOS doing something in the S3 suspend/resume
> > path in order to make the device resumable by amdgpu_device_resume(),
> > which is why we have only encountered this issue for the first time on
> > our first AMD platform that does not support S3 suspend.
> >
>
> It makes sense.  This is an SOC, not a collection of individual
> devices.  There are more devices than power rails so the sbios (via
> ACPI) has to handle the power rail.  All of the devices using that
> power rail have to suspend and tell the sbios before the platform
> microcontroller can turn off the power rail.  Presumably there is
> something missing that prevents the microcontroller for powering down
> the rail.  If the power rail is kept on, the device is never powered
> off and still retains its current state.
>
> > With that in mind, and lacking any better info, wouldn't it make sense
> > for amdgpu_device_resume() to always call reset? Maybe it's not
> > necessary in the S3 case, but it shouldn't harm anything. Or perhaps
> > it could check if the device is alive and reset it if it's not?
>
> It's just papering over the problem.  It would be better from a power
> perspective for the driver to just not suspend and keep running like
> normal.  When the driver is not suspended runtime things like clock
> and power gating are active which keep the GPU power at a minimum.
>
> >
> > Alternatively do you have any other contacts within AMD that could
> > help us figure out the underlying question of how to correctly suspend
> > and resume these devices? Happy to ship an affected product sample
> > your way.
> >
>
> I talked to our sbios team and they seem to think our S0ix
> implementation works pretty differently from Intel's.  I'm not really
> an expert on this area however.  We have a new team ramping on up this
> for Linux however.

Initial patches for S0ix platform support:
https://patchwork.kernel.org/project/platform-driver-x86/patch/20201023080410.458629-1-Shyam-sundar.S-k@amd.com/
https://patchwork.kernel.org/project/linux-acpi/patch/20201023080315.458570-1-Shyam-sundar.S-k@amd.com/

You also need the following GPU driver patches:
https://patchwork.freedesktop.org/series/82762/

There is also an audio patch required which will be available soon.

Alex

      parent reply	other threads:[~2020-10-23 15:05 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20191015065002.18701-1-drake@endlessm.com>
     [not found] ` <CADnq5_M4Leu0raYS6M72MqTm1+PLg9BjHCHLAYuB2-dEVP56_A@mail.gmail.com>
2019-10-16  7:23   ` [PATCH] drm/amdgpu: always reset asic when going into suspend Daniel Drake
2019-11-22 15:31     ` Alex Deucher
2019-11-25  5:17       ` Daniel Drake
2019-12-16  9:00         ` Daniel Drake
2019-12-19 14:08           ` Alex Deucher
2020-01-15  7:44             ` Daniel Drake
2020-01-16 15:14               ` Alex Deucher
2020-02-07  4:52                 ` Daniel Drake
2020-10-23 15:04                 ` Alex Deucher [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADnq5_MZLt76XSMOb3oQUDNqDoJeud8_82xikiaqDFE6pwkgGg@mail.gmail.com \
    --to=alexdeucher@gmail.com \
    --cc=David1.Zhou@amd.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=christian.koenig@amd.com \
    --cc=drake@endlessm.com \
    --cc=linux-pm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).