* PCI: Revert "PCI: Add runtime PM support for PCIe ports"
@ 2016-12-27 23:57 Bjorn Helgaas
  2016-12-28  9:17 ` Mika Westerberg
                   ` (3 more replies)
  0 siblings, 4 replies; 115+ messages in thread
From: Bjorn Helgaas @ 2016-12-27 23:57 UTC (permalink / raw)
  To: kilian.singer; +Cc: linux-pci, Mika Westerberg, Lukas Wunner, Rafael J. Wysocki

Hi Kilian,

Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
and all the debugging you've done.  Below is a revert of the troublesome
commit.  Can you test it and verify that it also fixes the problem?

I assume Mika is looking at this and will have a better solution soon.
But if not, I'll queue this up for v4.10.


commit e648b1ca2b94d207289fedc2538d33c57cdbc4de
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Dec 27 17:27:30 2016 -0600

    Revert "PCI: Add runtime PM support for PCIe ports"
    
    Revert 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports").
    
    Kilian reported that on a Lenovo W541 with i7-4810MQ, Intel HD Graphics
    4600, and NVIDIA Quadro® K1100M, locking the screen kills all keyboard and
    mouse interaction.  Reverting 006d44e49a25 fixes the problem.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=190861
    Reported-by: kilian.singer@quantumtechnology.info
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: stable@vger.kernel.org	# v4.8+
    CC: Mika Westerberg <mika.westerberg@linux.intel.com>

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 9698289..dcb185c 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -11,7 +11,6 @@
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/pm.h>
-#include <linux/pm_runtime.h>
 #include <linux/string.h>
 #include <linux/slab.h>
 #include <linux/pcieport_if.h>
@@ -343,8 +342,6 @@ static int pcie_device_init(struct pci_dev *pdev, int service, int irq)
 		return retval;
 	}
 
-	pm_runtime_no_callbacks(device);
-
 	return 0;
 }
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 8aa3f14..d3af264 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -85,26 +85,6 @@ static int pcie_port_resume_noirq(struct device *dev)
 	return 0;
 }
 
-static int pcie_port_runtime_suspend(struct device *dev)
-{
-	return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
-}
-
-static int pcie_port_runtime_resume(struct device *dev)
-{
-	return 0;
-}
-
-static int pcie_port_runtime_idle(struct device *dev)
-{
-	/*
-	 * Assume the PCI core has set bridge_d3 whenever it thinks the port
-	 * should be good to go to D3.  Everything else, including moving
-	 * the port to D3, is handled by the PCI core.
-	 */
-	return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
-}
-
 static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 	.suspend	= pcie_port_device_suspend,
 	.resume		= pcie_port_device_resume,
@@ -113,9 +93,6 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 	.poweroff	= pcie_port_device_suspend,
 	.restore	= pcie_port_device_resume,
 	.resume_noirq	= pcie_port_resume_noirq,
-	.runtime_suspend = pcie_port_runtime_suspend,
-	.runtime_resume	= pcie_port_runtime_resume,
-	.runtime_idle	= pcie_port_runtime_idle,
 };
 
 #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)
@@ -149,31 +126,11 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
 		return status;
 
 	pci_save_state(dev);
-
-	if (pci_bridge_d3_possible(dev)) {
-		/*
-		 * Keep the port resumed 100ms to make sure things like
-		 * config space accesses from userspace (lspci) will not
-		 * cause the port to repeatedly suspend and resume.
-		 */
-		pm_runtime_set_autosuspend_delay(&dev->dev, 100);
-		pm_runtime_use_autosuspend(&dev->dev);
-		pm_runtime_mark_last_busy(&dev->dev);
-		pm_runtime_put_autosuspend(&dev->dev);
-		pm_runtime_allow(&dev->dev);
-	}
-
 	return 0;
 }
 
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
-	if (pci_bridge_d3_possible(dev)) {
-		pm_runtime_forbid(&dev->dev);
-		pm_runtime_get_noresume(&dev->dev);
-		pm_runtime_dont_use_autosuspend(&dev->dev);
-	}
-
 	pcie_port_device_remove(dev);
 }
 

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-27 23:57 PCI: Revert "PCI: Add runtime PM support for PCIe ports" Bjorn Helgaas
@ 2016-12-28  9:17 ` Mika Westerberg
  2016-12-28 11:29 ` Lukas Wunner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 115+ messages in thread
From: Mika Westerberg @ 2016-12-28  9:17 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: kilian.singer, linux-pci, Lukas Wunner, Rafael J. Wysocki

On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> I assume Mika is looking at this and will have a better solution soon.

I'll take a look at the issue next week. I'm currently on vacation.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-27 23:57 PCI: Revert "PCI: Add runtime PM support for PCIe ports" Bjorn Helgaas
  2016-12-28  9:17 ` Mika Westerberg
@ 2016-12-28 11:29 ` Lukas Wunner
  2016-12-28 16:18   ` Bjorn Helgaas
  2017-01-17 14:56 ` Bjorn Helgaas
  2017-01-25 17:58 ` Bjorn Helgaas
  3 siblings, 1 reply; 115+ messages in thread
From: Lukas Wunner @ 2016-12-28 11:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: kilian.singer, linux-pci, Mika Westerberg, Rafael J. Wysocki

On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> and all the debugging you've done.  Below is a revert of the troublesome
> commit.  Can you test it and verify that it also fixes the problem?
> 
> I assume Mika is looking at this and will have a better solution soon.
> But if not, I'll queue this up for v4.10.

@Kilian:  Are you using the proprietary nvidia driver?  If so,
does the issue go away if you blacklist that driver or use nouveau
instead?


With a bit of googling I found dmesg and lspci output for this model:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1437386

The keyboard and mouse seem to be PS/2 devices, accessed via I/O ports
0x60, 0x64.  I assume they're located behind the LPC bridge?

The proprietary nvidia driver has a bug: it locks the legacy PCI VGA
registers with vga_tryget() but never releases that lock.  Intel
chipsets have a quirk wherein I/O ports are routed to the bus to
which the legacy PCI VGA registers are locked.  So once vga_tryget()
is called by the nvidia driver, access to the keyboard and mouse is
routed to bus 01 (on which the Nvidia card resides) and not to bus 00
(on which the LPC bridge resides).

My theory would be that if you lock the screen, the Nvidia card
runtime suspends, allowing the root port above it to suspend,
and then the I/O ports are no longer accessible.
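
The dependency in this theory (child suspends first, then the port
above it becomes eligible) can be sketched as a toy userspace model.
This is not kernel code: struct toy_dev, toy_can_suspend() and
toy_suspend() are hypothetical names, and only the bridge_d3 check is
modeled on the callbacks removed by the revert.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a device in a PCI hierarchy (hypothetical, userspace only). */
struct toy_dev {
	struct toy_dev *parent;	/* upstream port, NULL at the top */
	int active_children;	/* children that are not runtime suspended */
	bool suspended;
	bool is_port;		/* true for a PCIe port, false for an endpoint */
	bool bridge_d3;		/* port only: D3 considered allowed */
};

/*
 * A device may suspend only if none of its children is active.  A port
 * additionally needs bridge_d3, like the reverted
 * pcie_port_runtime_suspend(): bridge_d3 ? 0 : -EBUSY.
 */
static int toy_can_suspend(const struct toy_dev *dev)
{
	if (dev->active_children > 0)
		return -EBUSY;
	if (dev->is_port && !dev->bridge_d3)
		return -EBUSY;
	return 0;
}

/* Suspend dev, then give its parent a chance to suspend as well. */
static int toy_suspend(struct toy_dev *dev)
{
	int ret;

	assert(dev != NULL);
	if (dev->suspended)
		return 0;

	ret = toy_can_suspend(dev);
	if (ret)
		return ret;

	dev->suspended = true;
	if (dev->parent) {
		dev->parent->active_children--;
		/* The parent may refuse; that is not an error for dev. */
		(void)toy_suspend(dev->parent);
	}
	return 0;
}
```

In this model a root port with one active GPU below it refuses to
suspend with -EBUSY; once the GPU suspends, the port follows, provided
bridge_d3 is set.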

We have a similar issue on dual GPU MacBook Pros:  Backlight brightness
is adjusted by writing to I/O ports of a gmux controller situated below
the LPC bridge.  The nvidia driver locks the legacy VGA registers and
from that point reads from the I/O ports always return 0xff.  Commit
4eebd5a4e726 ("apple-gmux: lock iGP IO to protect from vgaarb changes")
sought to fix it but caused other breakage which remains unfixed so far:
https://bugzilla.kernel.org/show_bug.cgi?id=105051
https://bugzilla.kernel.org/show_bug.cgi?id=88861#c11

I've always wondered if the Intel chipsets would behave more sensibly
if the LPC bridge had BARs specifying the I/O regions used by devices
below it.

Reverting runtime suspend for PCIe ports is not a good solution as it's
needed for Thunderbolt runtime PM on Macs.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-28 11:29 ` Lukas Wunner
@ 2016-12-28 16:18   ` Bjorn Helgaas
  2016-12-29  9:58     ` Kilian Singer
  2016-12-30  0:19     ` Rafael J. Wysocki
  0 siblings, 2 replies; 115+ messages in thread
From: Bjorn Helgaas @ 2016-12-28 16:18 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: kilian.singer, linux-pci, Mika Westerberg, Rafael J. Wysocki

On Wed, Dec 28, 2016 at 12:29:54PM +0100, Lukas Wunner wrote:
> On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> > Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> > and all the debugging you've done.  Below is a revert of the troublesome
> > commit.  Can you test it and verify that it also fixes the problem?
> > 
> > I assume Mika is looking at this and will have a better solution soon.
> > But if not, I'll queue this up for v4.10.
> 
> @Kilian:  Are you using the proprietary nvidia driver?  If so,
> does the issue go away if you blacklist that driver or use nouveau
> instead?
> 
> 
> With a bit of googling I found dmesg and lspci output for this model:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1437386
> 
> The keyboard and mouse seem to be PS/2 devices, accessed via I/O ports
> 0x60, 0x64.  I assume they're located behind the LPC bridge?
> 
> The proprietary nvidia driver has a bug, it locks the legacy PCI VGA
> registers with vga_tryget() but never releases that lock.  Intel
> chipsets have a quirk wherein I/O ports are routed to the bus to
> which the legacy PCI VGA registers are locked.  So once vga_tryget()
> is called by the nvidia driver, access to the keyboard and mouse is
> routed to bus 01 (on which the Nvidia card resides) and not to bus 00
> (on which the LPC bridge resides).

Interesting.  A spec reference would be a good addition to whatever
fix is proposed.

> My theory would be that if you lock the screen, the Nvidia card
> runtime suspends, allowing the root port above it to suspend,
> and then the I/O ports are no longer accessible.
> 
> We have a similar issue on dual GPU MacBook Pros:  Backlight brightness
> is adjusted by writing to I/O ports of a gmux controller situated below
> the LPC bridge.  The nvidia driver locks the legacy VGA registers and
> from that point reads from the I/O ports always return 0xff.  Commit
> 4eebd5a4e726 ("apple-gmux: lock iGP IO to protect from vgaarb changes")
> sought to fix it but caused other breakage which remains unfixed so far:
> https://bugzilla.kernel.org/show_bug.cgi?id=105051
> https://bugzilla.kernel.org/show_bug.cgi?id=88861#c11
> 
> I've always wondered if the Intel chipsets would behave more sensibly
> if the LPC bridge had BARs specifying the I/O regions used by devices
> below it.
> 
> Reverting runtime suspend for PCIe ports is not a good solution as it's
> needed for Thunderbolt runtime PM on Macs.

The choices are:

  1) Fix the regression and preserve runtime PM for PCIe ports
  2) Fix the regression by reverting runtime PM for PCIe ports

Obviously we hope for 1).  Preserving runtime PM without fixing the
regression isn't even on the list.  I know this is Linux 101, so I
apologize for restating the obvious.

Bjorn

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-28 16:18   ` Bjorn Helgaas
@ 2016-12-29  9:58     ` Kilian Singer
  2016-12-29 16:02       ` Kilian Singer
  2016-12-30  0:19     ` Rafael J. Wysocki
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-29  9:58 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Lukas Wunner, linux-pci, Mika Westerberg, Rafael J. Wysocki

Dear Lukas and Bjorn,
I am using the nouveau driver.
dmesg and lspci -vv output are attached as comments to the bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=190861#c15
https://bugzilla.kernel.org/show_bug.cgi?id=190861#c16

I could completely deactivate the Nvidia GPU in the BIOS...

Best regards
Kilian

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29  9:58     ` Kilian Singer
@ 2016-12-29 16:02       ` Kilian Singer
  2016-12-29 16:20         ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-29 16:02 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Lukas Wunner, linux-pci, Mika Westerberg, Rafael J. Wysocki

One thing that has always been odd on my Debian system: even with a
working lock screen on the 4.7.0-1 kernel, the lock screen is not a
black screen but instead seems to be a static screenshot of the
desktop.

I currently have no system to compare with, but this might be
abnormal behavior.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 16:02       ` Kilian Singer
@ 2016-12-29 16:20         ` Kilian Singer
  2016-12-29 17:50           ` Lukas Wunner
  0 siblings, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-29 16:20 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Lukas Wunner, linux-pci, Mika Westerberg, Rafael J. Wysocki


I know this repeats what I have written above, but this behaviour
(comment 19) should be contrasted with the behaviour on the 4.8 and
4.9 kernels, which make my system unresponsive. There the desktop is
not static: I can see xclock ticking and the mouse pointer moves, but
keyboard input and mouse clicks no longer work.

By blacklisting them, I checked that this behaviour is not related to
the nvidia or nouveau kernel drivers.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 16:20         ` Kilian Singer
@ 2016-12-29 17:50           ` Lukas Wunner
  2016-12-29 22:52             ` Kilian Singer
  2016-12-29 23:20             ` Kilian Singer
  0 siblings, 2 replies; 115+ messages in thread
From: Lukas Wunner @ 2016-12-29 17:50 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 1384 bytes --]

On Thu, Dec 29, 2016 at 05:20:22PM +0100, Kilian Singer wrote:
> One thing that was always weird in my debian system is, 
> that even with working lock screen on the 4.7.0-1 version.
> The lock screen is not a black screen but instead seems to
> be a static screenshot of the desktop.

This sounds like an issue with the i915 driver.  When the static
screenshot is shown, i915 may have turned on panel self-refresh
(PSR).  There were numerous PSR issues.


> I know it is a repetition of what I have written above but this behaviour
> (comment 19) should be contrasted to the behaviour on the 4.8 and 4.9
> kernel which make my system unresponsive:
> Here the desktop is non static. I can see xclock ticking. The mouse
> moves. But any keyboard interaction or mouse click is not possible anymore.

It's very odd that this should be related to a root port suspending.
If mouse movements are still visible, the I/O ports of the keyboard
and mouse must still be accessible.

Perhaps you could apply the attached small debug patch, this will
log a message whenever a device runtime suspends/resumes, so it
should log when the root port that's causing trouble goes to D3.
Then we would at least know which one it is.
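
A minimal sketch of how the resulting log could be filtered for the
device that suspends.  It assumes dev_info() lines of the usual form
"<driver> <device>: rpm_suspend", optionally preceded by a timestamp;
rpm_suspend_device() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If line ends a device prefix with ": rpm_suspend", copy the device
 * name (the token right before the colon) into dev and return 0.
 * Return -1 if the tag is absent or the name does not fit into dev.
 */
static int rpm_suspend_device(const char *line, char *dev, size_t len)
{
	static const char tag[] = ": rpm_suspend";
	const char *end, *start;
	size_t n;

	assert(line != NULL && dev != NULL && len > 0);

	end = strstr(line, tag);
	if (!end)
		return -1;

	/* Walk back to the space that precedes the device name. */
	start = end;
	while (start > line && start[-1] != ' ')
		start--;

	n = (size_t)(end - start);
	if (n == 0 || n >= len)
		return -1;

	memcpy(dev, start, n);
	dev[n] = '\0';
	return 0;
}
```

Feeding each kernel log line through this helper yields the names of
the devices that logged rpm_suspend; rpm_idle and rpm_resume lines are
ignored.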

My money is on the root port above the Nvidia card, you can also
try to keep that one awake with
echo on > /sys/bus/pci/devices/0000:00:01.0/power/control
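
The same sysfs attributes can be accessed from a small C helper, for
example to read power/runtime_status before and after locking the
screen.  This is only a sketch: the function names are hypothetical,
and the sysfs root is a parameter so the code can be exercised without
a real device.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<root>/bus/pci/devices/<bdf>/power/<attr>" in buf.
 * Returns 0 on success, -1 if the path does not fit.
 */
static int pci_power_attr_path(char *buf, size_t len, const char *sysfs_root,
			       const char *bdf, const char *attr)
{
	int n;

	assert(buf != NULL && sysfs_root != NULL && bdf != NULL && attr != NULL);
	n = snprintf(buf, len, "%s/bus/pci/devices/%s/power/%s",
		     sysfs_root, bdf, attr);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the first line of an attribute and strip the trailing newline. */
static int read_power_attr(const char *path, char *out, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}

/* Write a value such as "on" or "auto" to an attribute (needs root). */
static int write_power_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

With sysfs_root "/sys", bdf "0000:00:01.0" and attr "control", writing
"on" matches the echo command above; reading attr "runtime_status"
should show whether the port is currently active or suspended.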

Thanks,

Lukas

[-- Attachment #2: runpm_debug.patch --]
[-- Type: text/plain, Size: 1072 bytes --]

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 4c7055009bd6..9eba9686e302 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -345,9 +345,10 @@ static int rpm_idle(struct device *dev, int rpmflags)
 
 	callback = RPM_GET_CALLBACK(dev, runtime_idle);
 
-	if (callback)
+	if (callback) {
+		dev_info(dev, "rpm_idle\n");
 		retval = __rpm_callback(callback, dev);
-
+	}
 	dev->power.idle_notification = false;
 	wake_up_all(&dev->power.wait_queue);
 
@@ -516,6 +517,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
 	callback = RPM_GET_CALLBACK(dev, runtime_suspend);
 
 	dev_pm_enable_wake_irq(dev);
+	dev_info(dev, "rpm_suspend\n");
 	retval = rpm_callback(callback, dev);
 	if (retval)
 		goto fail;
@@ -738,6 +740,7 @@ static int rpm_resume(struct device *dev, int rpmflags)
 	callback = RPM_GET_CALLBACK(dev, runtime_resume);
 
 	dev_pm_disable_wake_irq(dev);
+	dev_info(dev, "rpm_resume\n");
 	retval = rpm_callback(callback, dev);
 	if (retval) {
 		__update_runtime_status(dev, RPM_SUSPENDED);

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 17:50           ` Lukas Wunner
@ 2016-12-29 22:52             ` Kilian Singer
  2016-12-29 23:02               ` Kilian Singer
  2016-12-29 23:48               ` Lukas Wunner
  2016-12-29 23:20             ` Kilian Singer
  1 sibling, 2 replies; 115+ messages in thread
From: Kilian Singer @ 2016-12-29 22:52 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 2478 bytes --]

Just to be sure, I am currently using this repository:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

using
commit 2d706e790f0508dff4fb72eca9b4892b79757feb
Merge: 8f18e4d03ed8 8759fec4af22
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Dec 27 17:51:36 2016 -0800

The provided patch fails to apply; I can locate the positions by hand, though.

Shall I use another repository or commit?

Best regards

PS:
In order to compile on Debian, I use the Makefile patch to disable PIE:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8-rc2/0002-UBUNTU-SAUCE-no-up-disable-pie-when-gcc-has-it-enabl.patch

But I guess that this does not matter.


^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 22:52             ` Kilian Singer
@ 2016-12-29 23:02               ` Kilian Singer
  2016-12-29 23:05                 ` Kilian Singer
  2016-12-29 23:48               ` Lukas Wunner
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-29 23:02 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

The patch fails at 2 insert points, but I applied it by hand.
Should I use another repo or commit?

I also patched the Makefile due to some gcc issues:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=841420
I guess that does not matter.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 23:02               ` Kilian Singer
@ 2016-12-29 23:05                 ` Kilian Singer
  0 siblings, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2016-12-29 23:05 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

Sorry for repeating the last part, my webmailer seems to strip
away some text in the overview pane and I thought it got cut away. 
So I sent the "missing" text again...

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 17:50           ` Lukas Wunner
  2016-12-29 22:52             ` Kilian Singer
@ 2016-12-29 23:20             ` Kilian Singer
  2016-12-30  0:07               ` Lukas Wunner
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-29 23:20 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

The echo on > /sys/bus/pci/devices/0000:00:01.0/power/control

did not help.

I also noticed that on each boot, directly after the initramfs, I get
mmc0: Unknown controller version. You may experience problems.

This happens on all versions of the kernel.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 22:52             ` Kilian Singer
  2016-12-29 23:02               ` Kilian Singer
@ 2016-12-29 23:48               ` Lukas Wunner
  1 sibling, 0 replies; 115+ messages in thread
From: Lukas Wunner @ 2016-12-29 23:48 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 1824 bytes --]

On Thu, Dec 29, 2016 at 11:52:30PM +0100, Kilian Singer wrote:
> Just to be sure I am currently using this repository:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> using
> commit 2d706e790f0508dff4fb72eca9b4892b79757feb

That's the tip of Linus' master branch (of 45 hours ago),
quite adventurous. :-)


> The provided patch fails I can locate the positions by hand though.

The context changed a bit, but you can work around that by increasing
the fuzz, i.e. "patch -p1 -F3".  The context of the patch is just 3 lines,
so there is a risk of the patch being applied in the wrong place, but that
does not happen in this particular case.  I'm also attaching the same patch
rebased on the commit you mentioned above.


> Shall I use another repository or commit?

Linus' master branch is currently at rc1, i.e. bleeding edge.  You may
experience lots of (other) breakage there, whereas e.g. v4.9 can be
expected to be more stable.  You can switch to the v4.9 tag
with "git checkout v4.9".


To provide some background information, the tiny patch I've attached
does nothing more than log when a device runtime suspends/resumes.
So after locking the screen you should theoretically get a message
in dmesg showing which PCIe root port runtime suspended and somehow
interfered with the keyboard/mouse.

There are five root ports in your machine:
0000:00:01.0	bus 01, Nvidia GPU
0000:00:1c.0	bus 02, SD/MMC Card Reader
0000:00:1c.1	bus 03, Wireless Card
0000:00:1c.2	bus 04, Unused Hotplug Port
0000:00:1c.4	bus 06, Unused Hotplug Port

Runtime suspending these ports was newly added in v4.8 and is supposed
to save energy.  How that can interfere with a keyboard/mouse is a bit
mysterious.  The only thing I can imagine is inaccessibility of their
I/O ports due to bad routing or usage by another device.
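
To make the dmesg check less tedious, here is a rough Python sketch that picks the rpm_idle/rpm_suspend/rpm_resume messages for the five root ports out of a log.  The function name root_port_events is made up, and the "pcieport" driver name in the sample comment is an assumption; the regex only relies on the usual dev_info layout of "[timestamp] driver address: message".

```python
import re
from collections import defaultdict

# Root ports from the list above (PCI address -> what sits below it).
ROOT_PORTS = {
    "0000:00:01.0": "Nvidia GPU",
    "0000:00:1c.0": "SD/MMC Card Reader",
    "0000:00:1c.1": "Wireless Card",
    "0000:00:1c.2": "Unused Hotplug Port",
    "0000:00:1c.4": "Unused Hotplug Port",
}

# Matches e.g. "[  123.456789] pcieport 0000:00:1c.0: rpm_suspend"
# (the sample line is hypothetical, not taken from a real log).
EVENT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+"
    r"(?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+"
    r"(?P<event>rpm_idle|rpm_suspend|rpm_resume)\s*$"
)


def root_port_events(lines):
    """Return {root port address: [(timestamp, event), ...]} in log order."""
    events = defaultdict(list)
    for line in lines:
        match = EVENT_RE.search(line)
        if match and match.group("addr") in ROOT_PORTS:
            events[match.group("addr")].append(
                (float(match.group("ts")), match.group("event"))
            )
    return dict(events)


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for addr, seen in root_port_events(sys.stdin).items():
        print(addr, ROOT_PORTS[addr], seen)
```

Messages from other devices (USB ports, SCSI hosts and so on) are dropped, so whatever remains after locking the screen points at the root ports worth a closer look.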

Best regards,

Lukas

[-- Attachment #2: runpm_debug_v4.10-rc1.patch --]
[-- Type: text/plain, Size: 1080 bytes --]

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 872eac4..c563eaf 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -422,9 +422,10 @@ static int rpm_idle(struct device *dev, int rpmflags)
 
 	callback = RPM_GET_CALLBACK(dev, runtime_idle);
 
-	if (callback)
+	if (callback) {
+		dev_info(dev, "rpm_idle\n");
 		retval = __rpm_callback(callback, dev);
-
+	}
 	dev->power.idle_notification = false;
 	wake_up_all(&dev->power.wait_queue);
 
@@ -593,6 +594,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
 	callback = RPM_GET_CALLBACK(dev, runtime_suspend);
 
 	dev_pm_enable_wake_irq_check(dev, true);
+	dev_info(dev, "rpm_suspend\n");
 	retval = rpm_callback(callback, dev);
 	if (retval)
 		goto fail;
@@ -815,6 +817,7 @@ static int rpm_resume(struct device *dev, int rpmflags)
 	callback = RPM_GET_CALLBACK(dev, runtime_resume);
 
 	dev_pm_disable_wake_irq_check(dev);
+	dev_info(dev, "rpm_resume\n");
 	retval = rpm_callback(callback, dev);
 	if (retval) {
 		__update_runtime_status(dev, RPM_SUSPENDED);

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-29 23:20             ` Kilian Singer
@ 2016-12-30  0:07               ` Lukas Wunner
  2016-12-30  0:16                 ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: Lukas Wunner @ 2016-12-30  0:07 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

On Fri, Dec 30, 2016 at 12:20:34AM +0100, Kilian Singer wrote:
> The echo on > /sys/bus/pci/devices/0000:00:01.0/power/control
> 
> did not help.
> 
> Also I noticed on each boot directly after initramfs I get
> mmc0: Unknown controller version. You may experience problems.
> 
> On all versions of the kernel.

Hm, that rings a bell.  The MMC controller is located below
root port 0000:00:1c.0, which has vendor/device ID 8086:8c10.
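
In case it is useful for checking other machines, here is a small Python sketch that reads the vendor/device ID of a PCI device from sysfs.  The helper name pci_ids and the sysfs_root parameter are invented for illustration; the sketch assumes the usual "vendor" and "device" attribute files, each holding a hex value such as 0x8086.

```python
from pathlib import Path


def pci_ids(pci_addr: str, sysfs_root: str = "/sys") -> str:
    """Return the "vendor:device" ID pair of a PCI device, e.g. "8086:8c10".

    `sysfs_root` is a hypothetical parameter so the helper can be tried
    against a scratch directory instead of the live sysfs tree.
    """
    dev = Path(sysfs_root) / "bus/pci/devices" / pci_addr
    # The attribute files contain hex strings like "0x8086\n".
    vendor = int((dev / "vendor").read_text().strip(), 16)
    device = int((dev / "device").read_text().strip(), 16)
    return f"{vendor:04x}:{device:04x}"


if __name__ == "__main__":
    # Root port above the MMC controller, as named above.
    print(pci_ids("0000:00:1c.0"))
```

On the machine discussed here this would be expected to report 8086:8c10 for 0000:00:1c.0.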

We're having trouble with the exact same root port on 2015
MacBook Pros where it mysteriously prevents them from powering off:
https://bugzilla.kernel.org/show_bug.cgi?id=103211
http://www.spinics.net/lists/linux-pci/msg53460.html

Does this help:
echo on > /sys/bus/pci/devices/0000:00:1c.0/power/control

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:07               ` Lukas Wunner
@ 2016-12-30  0:16                 ` Kilian Singer
  2016-12-30  0:24                   ` Kilian Singer
  2017-01-02 11:40                   ` Lukas Wunner
  0 siblings, 2 replies; 115+ messages in thread
From: Kilian Singer @ 2016-12-30  0:16 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki


I will check the echo ... command on my next reboot.
I ran the debug patch on 4.10-rc1 for now. I could go back to 4.9 if
that helps, but it needs some time to compile again.
The debug messages from the first rpm_... to the crash are:
Dec 30 00:48:05 klaptop kernel: [    3.944157] usb usb1-port1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.944158] usb usb1-port2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.944159] usb usb1-port3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.944605] ehci-pci 0000:00:1d.0: EHCI Host Controller
Dec 30 00:48:05 klaptop kernel: [    3.944610] ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 2
Dec 30 00:48:05 klaptop kernel: [    3.944621] ehci-pci 0000:00:1d.0: debug port 2
Dec 30 00:48:05 klaptop kernel: [    3.948526] ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
Dec 30 00:48:05 klaptop kernel: [    3.948624] ehci-pci 0000:00:1d.0: irq 23, io mem 0xb4a3d000
Dec 30 00:48:05 klaptop kernel: [    3.962387] ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
Dec 30 00:48:05 klaptop kernel: [    3.962427] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
Dec 30 00:48:05 klaptop kernel: [    3.962428] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
Dec 30 00:48:05 klaptop kernel: [    3.962429] usb usb2: Product: EHCI Host Controller
Dec 30 00:48:05 klaptop kernel: [    3.962430] usb usb2: Manufacturer: Linux 4.10.0-rc1+ ehci_hcd
Dec 30 00:48:05 klaptop kernel: [    3.962431] usb usb2: SerialNumber: 0000:00:1d.0
Dec 30 00:48:05 klaptop kernel: [    3.962564] hub 2-0:1.0: USB hub found
Dec 30 00:48:05 klaptop kernel: [    3.962569] hub 2-0:1.0: 3 ports detected
Dec 30 00:48:05 klaptop kernel: [    3.962664] usb usb2-port1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.962665] usb usb2-port2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.962666] usb usb2-port3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.962851] xhci_hcd 0000:00:14.0: xHCI Host Controller
Dec 30 00:48:05 klaptop kernel: [    3.962857] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3
Dec 30 00:48:05 klaptop kernel: [    3.963949] xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x00009810
Dec 30 00:48:05 klaptop kernel: [    3.963955] xhci_hcd 0000:00:14.0: cache line size of 64 is not supported
Dec 30 00:48:05 klaptop kernel: [    3.964888] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
Dec 30 00:48:05 klaptop kernel: [    3.964889] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
Dec 30 00:48:05 klaptop kernel: [    3.964891] usb usb3: Product: xHCI Host Controller
Dec 30 00:48:05 klaptop kernel: [    3.964891] usb usb3: Manufacturer: Linux 4.10.0-rc1+ xhci-hcd
Dec 30 00:48:05 klaptop kernel: [    3.964892] usb usb3: SerialNumber: 0000:00:14.0
Dec 30 00:48:05 klaptop kernel: [    3.965012] hub 3-0:1.0: USB hub found
Dec 30 00:48:05 klaptop kernel: [    3.965031] hub 3-0:1.0: 15 ports detected
Dec 30 00:48:05 klaptop kernel: [    3.968107] usb usb3-port1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968109] usb usb3-port2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968110] usb usb3-port3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968111] usb usb3-port4: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968112] usb usb3-port5: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968113] usb usb3-port6: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968114] usb usb3-port7: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968115] usb usb3-port8: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968116] usb usb3-port9: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968117] usb usb3-port10: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968118] usb usb3-port11: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968119] usb usb3-port12: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968119] usb usb3-port13: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968120] usb usb3-port14: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968121] usb usb3-port15: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.968575] xhci_hcd 0000:00:14.0: xHCI Host Controller
Dec 30 00:48:05 klaptop kernel: [    3.968578] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 4
Dec 30 00:48:05 klaptop kernel: [    3.968623] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003
Dec 30 00:48:05 klaptop kernel: [    3.968624] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
Dec 30 00:48:05 klaptop kernel: [    3.968625] usb usb4: Product: xHCI Host Controller
Dec 30 00:48:05 klaptop kernel: [    3.968626] usb usb4: Manufacturer: Linux 4.10.0-rc1+ xhci-hcd
Dec 30 00:48:05 klaptop kernel: [    3.968627] usb usb4: SerialNumber: 0000:00:14.0
Dec 30 00:48:05 klaptop kernel: [    3.968749] hub 4-0:1.0: USB hub found
Dec 30 00:48:05 klaptop kernel: [    3.968973] hub 4-0:1.0: 6 ports detected
Dec 30 00:48:05 klaptop kernel: [    3.969256] usb usb3-port1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.969492] usb usb3-port2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.969768] usb usb3-port3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.970036] usb: port power management may be unreliable
Dec 30 00:48:05 klaptop kernel: [    3.970634] usb usb3-port9: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.970875] usb usb4-port4: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.970877] usb usb4-port6: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    3.971104] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Dec 30 00:48:05 klaptop kernel: [    4.074129] usb usb4: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.074132] usb usb4: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.074728] e1000e 0000:00:19.0 0000:00:19.0 (uninitialized): registered PHC clock
Dec 30 00:48:05 klaptop kernel: [    4.179267] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 54:ee:75:4d:4e:6d
Dec 30 00:48:05 klaptop kernel: [    4.179268] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
Dec 30 00:48:05 klaptop kernel: [    4.179304] e1000e 0000:00:19.0 eth0: MAC: 11, PHY: 12, PBA No: 1000FF-0FF
Dec 30 00:48:05 klaptop kernel: [    4.179432] i801_smbus 0000:00:1f.3: SPD Write Disable is set
Dec 30 00:48:05 klaptop kernel: [    4.179455] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
Dec 30 00:48:05 klaptop kernel: [    4.180279] i801_smbus 0000:00:1f.3: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.180281] i801_smbus 0000:00:1f.3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.180326] ahci 0000:00:1f.2: version 3.0
Dec 30 00:48:05 klaptop kernel: [    4.180486] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x21 impl SATA mode
Dec 30 00:48:05 klaptop kernel: [    4.180488] ahci 0000:00:1f.2: flags: 64bit ncq ilck pm led clo pio slum part ems apst
Dec 30 00:48:05 klaptop kernel: [    4.180547] e1000e 0000:00:19.0 enp0s25: renamed from eth0
Dec 30 00:48:05 klaptop kernel: [    4.190944] scsi host0: ahci
Dec 30 00:48:05 klaptop kernel: [    4.191011] scsi host0: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.191013] scsi host0: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.191267] scsi host1: ahci
Dec 30 00:48:05 klaptop kernel: [    4.191325] scsi host1: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.191326] scsi host1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.191492] scsi host2: ahci
Dec 30 00:48:05 klaptop kernel: [    4.191550] scsi host2: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.191551] scsi host2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.191601] scsi host3: ahci
Dec 30 00:48:05 klaptop kernel: [    4.191658] scsi host3: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.191659] scsi host3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.191740] scsi host4: ahci
Dec 30 00:48:05 klaptop kernel: [    4.191785] scsi host4: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.191786] scsi host4: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.191837] scsi host5: ahci
Dec 30 00:48:05 klaptop kernel: [    4.191879] scsi host5: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.191880] scsi host5: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.191886] ata1: SATA max UDMA/133 abar m2048@0xb4a3c000 port 0xb4a3c100 irq 29
Dec 30 00:48:05 klaptop kernel: [    4.191887] ata2: DUMMY
Dec 30 00:48:05 klaptop kernel: [    4.191888] ata3: DUMMY
Dec 30 00:48:05 klaptop kernel: [    4.191888] ata4: DUMMY
Dec 30 00:48:05 klaptop kernel: [    4.191889] ata5: DUMMY
Dec 30 00:48:05 klaptop kernel: [    4.191891] ata6: SATA max UDMA/133 abar m2048@0xb4a3c000 port 0xb4a3c380 irq 29
Dec 30 00:48:05 klaptop kernel: [    4.270186] usb 1-1: new high-speed USB device number 2 using ehci-pci
Dec 30 00:48:05 klaptop kernel: [    4.290184] usb 2-1: new high-speed USB device number 2 using ehci-pci
Dec 30 00:48:05 klaptop kernel: [    4.294182] usb 3-1: new low-speed USB device number 2 using xhci_hcd
Dec 30 00:48:05 klaptop kernel: [    4.420368] usb 1-1: New USB device found, idVendor=8087, idProduct=8008
Dec 30 00:48:05 klaptop kernel: [    4.420371] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Dec 30 00:48:05 klaptop kernel: [    4.420921] hub 1-1:1.0: USB hub found
Dec 30 00:48:05 klaptop kernel: [    4.421097] hub 1-1:1.0: 6 ports detected
Dec 30 00:48:05 klaptop kernel: [    4.422531] usb 1-1-port1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.422535] usb 1-1-port2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.422537] usb 1-1-port3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.422539] usb 1-1-port4: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.422541] usb 1-1-port5: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.422542] usb 1-1-port6: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.438635] usb 2-1: New USB device found, idVendor=8087, idProduct=8000
Dec 30 00:48:05 klaptop kernel: [    4.438638] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Dec 30 00:48:05 klaptop kernel: [    4.439158] hub 2-1:1.0: USB hub found
Dec 30 00:48:05 klaptop kernel: [    4.439346] hub 2-1:1.0: 8 ports detected
Dec 30 00:48:05 klaptop kernel: [    4.440868] usb 2-1-port1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.440870] usb 2-1-port2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.440872] usb 2-1-port3: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.440873] usb 2-1-port4: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.440874] usb 2-1-port5: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.440875] usb 2-1-port6: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.440876] usb 2-1-port7: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.440877] usb 2-1-port8: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.454746] usb 3-1: New USB device found, idVendor=045e, idProduct=00db
Dec 30 00:48:05 klaptop kernel: [    4.454749] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Dec 30 00:48:05 klaptop kernel: [    4.454751] usb 3-1: Product: Natural® Ergonomic Keyboard 4000
Dec 30 00:48:05 klaptop kernel: [    4.454753] usb 3-1: Manufacturer: Microsoft
Dec 30 00:48:05 klaptop kernel: [    4.459668] hidraw: raw HID events driver (C) Jiri Kosina
Dec 30 00:48:05 klaptop kernel: [    4.477166] usbcore: registered new interface driver usbhid
Dec 30 00:48:05 klaptop kernel: [    4.477168] usbhid: USB HID core driver
Dec 30 00:48:05 klaptop kernel: [    4.479926] input: Microsoft Natural® Ergonomic Keyboard 4000 as /devices/pci0000:00/0000:00:14.0/usb3/3-1/3-1:1.0/0003:045E:00DB.0001/input/input3
Dec 30 00:48:05 klaptop kernel: [    4.506854] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 30 00:48:05 klaptop kernel: [    4.506895] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Dec 30 00:48:05 klaptop kernel: [    4.510821] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 30 00:48:05 klaptop kernel: [    4.510826] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 30 00:48:05 klaptop kernel: [    4.511143] ata1.00: supports DRM functions and may not be fully accessible
Dec 30 00:48:05 klaptop kernel: [    4.515430] ata6.00: ACPI cmd e3/00:1f:00:00:00:a0 (IDLE) succeeded
Dec 30 00:48:05 klaptop kernel: [    4.516018] ata6.00: ACPI cmd e3/00:02:00:00:00:a0 (IDLE) succeeded
Dec 30 00:48:05 klaptop kernel: [    4.516302] ata6.00: ATAPI: PLDS DVD-RW DU8A5SH, BU51, max UDMA/100
Dec 30 00:48:05 klaptop kernel: [    4.517574] ata1.00: NCQ Send/Recv Log not supported
Dec 30 00:48:05 klaptop kernel: [    4.517578] ata1.00: ATA-9: Samsung SSD 850 EVO 1TB, EMT01B6Q, max UDMA/133
Dec 30 00:48:05 klaptop kernel: [    4.517582] ata1.00: 1953525168 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
Dec 30 00:48:05 klaptop kernel: [    4.517902] random: fast init done
Dec 30 00:48:05 klaptop kernel: [    4.519473] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Dec 30 00:48:05 klaptop kernel: [    4.519478] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Dec 30 00:48:05 klaptop kernel: [    4.519754] ata1.00: supports DRM functions and may not be fully accessible
Dec 30 00:48:05 klaptop kernel: [    4.523127] ata6.00: ACPI cmd e3/00:1f:00:00:00:a0 (IDLE) succeeded
Dec 30 00:48:05 klaptop kernel: [    4.523880] ata6.00: ACPI cmd e3/00:02:00:00:00:a0 (IDLE) succeeded
Dec 30 00:48:05 klaptop kernel: [    4.524169] ata6.00: configured for UDMA/100
Dec 30 00:48:05 klaptop kernel: [    4.526167] ata1.00: NCQ Send/Recv Log not supported
Dec 30 00:48:05 klaptop kernel: [    4.526262] ata1.00: configured for UDMA/133
Dec 30 00:48:05 klaptop kernel: [    4.526333] scsi host0: rpm_resume
Dec 30 00:48:05 klaptop kernel: [    4.526889] scsi 0:0:0:0: Direct-Access     ATA      Samsung SSD 850  1B6Q PQ: 0 ANSI: 5
Dec 30 00:48:05 klaptop kernel: [    4.532745] usb 1-1: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.532749] usb 1-1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.538511] microsoft 0003:045E:00DB.0001: input,hidraw0: USB HID v1.11 Keyboard [Microsoft Natural® Ergonomic Keyboard 4000] on usb-0000:00:14.0-1/input0
Dec 30 00:48:05 klaptop kernel: [    4.539775] input: Microsoft Natural® Ergonomic Keyboard 4000 as /devices/pci0000:00/0000:00:14.0/usb3/3-1/3-1:1.1/0003:045E:00DB.0002/input/input4
Dec 30 00:48:05 klaptop kernel: [    4.547582] usb 2-1: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.547586] usb 2-1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.558184] usb usb1: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.558187] usb usb1: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.566184] usb usb2: rpm_idle
Dec 30 00:48:05 klaptop kernel: [    4.566188] usb usb2: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    4.574194] usb 3-3: new full-speed USB device number 3 using xhci_hcd
Dec 30 00:48:05 klaptop kernel: [    4.578617] scsi host5: rpm_resume
Dec 30 00:48:05 klaptop kernel: [    4.582495] scsi 5:0:0:0: CD-ROM            PLDS     DVD-RW DU8A5SH   BU51 PQ: 0 ANSI: 5
Dec 30 00:48:05 klaptop kernel: [    4.598318] microsoft 0003:045E:00DB.0002: input,hidraw1: USB HID v1.11 Device [Microsoft Natural® Ergonomic Keyboard 4000] on usb-0000:00:14.0-1/input1
Dec 30 00:48:05 klaptop kernel: [    4.608089] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Dec 30 00:48:05 klaptop kernel: [    4.608109] sd 0:0:0:0: [sda] Write Protect is off
Dec 30 00:48:05 klaptop kernel: [    4.608111] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Dec 30 00:48:05 klaptop kernel: [    4.608141] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 30 00:48:05 klaptop kernel: [    4.610618]  sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
Dec 30 00:48:05 klaptop kernel: [    4.611296] sd 0:0:0:0: [sda] Attached SCSI disk
Dec 30 00:48:05 klaptop kernel: [    4.628295] sr 5:0:0:0: [sr0] scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray
Dec 30 00:48:05 klaptop kernel: [    4.628298] cdrom: Uniform CD-ROM driver Revision: 3.20
Dec 30 00:48:05 klaptop kernel: [    4.628564] sr 5:0:0:0: Attached scsi CD-ROM sr0
Dec 30 00:48:05 klaptop kernel: [    4.716694] usb 3-3: New USB device found, idVendor=062a, idProduct=7223
Dec 30 00:48:05 klaptop kernel: [    4.716697] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Dec 30 00:48:05 klaptop kernel: [    4.716699] usb 3-3: Product: Full-Speed Mouse
Dec 30 00:48:05 klaptop kernel: [    4.716700] usb 3-3: Manufacturer: Full-Speed Mouse
Dec 30 00:48:05 klaptop kernel: [    4.722626] input: Full-Speed Mouse Full-Speed Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-3/3-3:1.0/0003:062A:7223.0003/input/input5
Dec 30 00:48:05 klaptop kernel: [    4.722734] hid-generic 0003:062A:7223.0003: input,hidraw2: USB HID v1.10 Mouse [Full-Speed Mouse Full-Speed Mouse] on usb-0000:00:14.0-3/input0
Dec 30 00:48:05 klaptop kernel: [    4.722847] input: Full-Speed Mouse Full-Speed Mouse as /devices/pci0000:00/0000:00:14.0/usb3/3-3/3-3:1.1/0003:062A:7223.0004/input/input6
Dec 30 00:48:05 klaptop kernel: [    4.750174] psmouse serio1: synaptics: queried max coordinates: x [..5676], y [..4758]
Dec 30 00:48:05 klaptop kernel: [    4.783218] psmouse serio1: synaptics: queried min coordinates: x [1266..], y [1096..]
Dec 30 00:48:05 klaptop kernel: [    4.786447] hid-generic 0003:062A:7223.0004: input,hidraw3: USB HID v1.10 Keyboard [Full-Speed Mouse Full-Speed Mouse] on usb-0000:00:14.0-3/input1
Dec 30 00:48:05 klaptop kernel: [    4.838194] usb 3-7: new full-speed USB device number 4 using xhci_hcd
Dec 30 00:48:05 klaptop kernel: [    4.847940] psmouse serio1: synaptics: Touchpad model: 1, fw: 8.1, id: 0x1e2b1, caps: 0xf003a3/0x943300/0x12e800/0x10000, board id: 3053, fw id: 2560
Dec 30 00:48:05 klaptop kernel: [    4.847955] psmouse serio1: synaptics: serio: Synaptics pass-through port at isa0060/serio1/input0
Dec 30 00:48:05 klaptop kernel: [    4.889663] input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio1/input/input2
Dec 30 00:48:05 klaptop kernel: [    4.916042] VMware vmxnet3 virtual NIC driver - version 1.4.a.0-k-NAPI
Dec 30 00:48:05 klaptop kernel: [    4.916827] VMware PVSCSI driver - version 1.0.7.0-k
Dec 30 00:48:05 klaptop kernel: [    4.986126] raid6: sse2x1   gen()  9919 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.054130] raid6: sse2x1   xor()  4853 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.122133] raid6: sse2x2   gen() 10571 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.190135] raid6: sse2x2   xor()  5995 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.258140] raid6: sse2x4   gen() 11260 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.326146] raid6: sse2x4   xor()  6655 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.394146] raid6: avx2x1   gen() 15285 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.462150] raid6: avx2x1   xor()  9592 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.530152] raid6: avx2x2   gen() 19419 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.598159] raid6: avx2x2   xor() 11276 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.666162] raid6: avx2x4   gen() 20378 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.734166] raid6: avx2x4   xor() 12890 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.734166] raid6: using algorithm avx2x4 gen() 20378 MB/s
Dec 30 00:48:05 klaptop kernel: [    5.734167] raid6: .... xor() 12890 MB/s, rmw enabled
Dec 30 00:48:05 klaptop kernel: [    5.734167] raid6: using avx2x2 recovery algorithm
Dec 30 00:48:05 klaptop kernel: [    5.734179] clocksource: Switched to clocksource tsc
Dec 30 00:48:05 klaptop kernel: [    5.734435] xor: automatically using best checksumming function   avx
Dec 30 00:48:05 klaptop kernel: [    5.736801] Btrfs loaded, crc32c=crc32c-intel
Dec 30 00:48:05 klaptop kernel: [    5.755286] usb 3-7: New USB device foun=
d, idVendor=3D138a, idProduct=3D0017
Dec 30 00:48:05 klaptop kernel: [    5.755287] usb 3-7: New USB device stri=
ngs: Mfr=3D0, Product=3D0, SerialNumber=3D1
Dec 30 00:48:05 klaptop kernel: [    5.755288] usb 3-7: SerialNumber: 82f9b=
467acb7
Dec 30 00:48:05 klaptop kernel: [    5.812491] BTRFS: device fsid f517ae30-=
e509-4bfb-9554-7fe60f091b0e devid 1 transid 220098 /dev/sda5
Dec 30 00:48:05 klaptop kernel: [    5.813964] PM: Starting manual resume f=
rom disk
Dec 30 00:48:05 klaptop kernel: [    5.813966] PM: Hibernation image partit=
ion 8:6 present
Dec 30 00:48:05 klaptop kernel: [    5.813966] PM: Looking for hibernation =
image.
Dec 30 00:48:05 klaptop kernel: [    5.814140] PM: Image not found (code -2=
2)
Dec 30 00:48:05 klaptop kernel: [    5.814141] PM: Hibernation image not pr=
esent or could not be loaded.
Dec 30 00:48:05 klaptop kernel: [    5.821080] BTRFS info (device sda5): di=
sk space caching is enabled
Dec 30 00:48:05 klaptop kernel: [    5.821081] BTRFS info (device sda5): ha=
s skinny extents
Dec 30 00:48:05 klaptop kernel: [    5.832058] BTRFS info (device sda5): de=
tected SSD devices, enabling SSD mode
Dec 30 00:48:05 klaptop kernel: [    5.878254] usb 3-11: new full-speed USB=
 device number 5 using xhci_hcd
Dec 30 00:48:05 klaptop kernel: [    5.975558] ip_tables: (C) 2000-2006 Net=
filter Core Team
Dec 30 00:48:05 klaptop kernel: [    6.023264] usb 3-11: New USB device fou=
nd, idVendor=3D8087, idProduct=3D07dc
Dec 30 00:48:05 klaptop kernel: [    6.023265] usb 3-11: New USB device str=
ings: Mfr=3D0, Product=3D0, SerialNumber=3D0
Dec 30 00:48:05 klaptop kernel: [    6.048338] BTRFS info (device sda5): di=
sk space caching is enabled
Dec 30 00:48:05 klaptop kernel: [    6.058759] RPC: Registered named UNIX s=
ocket transport module.
Dec 30 00:48:05 klaptop kernel: [    6.058761] RPC: Registered udp transpor=
t module.
Dec 30 00:48:05 klaptop kernel: [    6.058761] RPC: Registered tcp transpor=
t module.
Dec 30 00:48:05 klaptop kernel: [    6.058762] RPC: Registered tcp NFSv4.1 =
backchannel transport module.
Dec 30 00:48:05 klaptop kernel: [    6.065054] Installing knfsd (copyright =
(C) 1996 okir@monad.swb.de).
Dec 30 00:48:05 klaptop kernel: [    6.142203] usb 3-12: new high-speed USB=
 device number 6 using xhci_hcd
Dec 30 00:48:05 klaptop kernel: [    6.179734] EDAC MC: Ver: 3.0.0
Dec 30 00:48:05 klaptop kernel: [    6.179777] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
Dec 30 00:48:05 klaptop kernel: [    6.182575] ACPI Warning: SystemIO range 0x0000000000001828-0x000000000000182F conflicts with OpRegion 0x0000000000001800-0x000000000000187F (\_SB.PCI0.LPC.PMIO) (20160930/utaddress-247)
Dec 30 00:48:05 klaptop kernel: [    6.182581] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Dec 30 00:48:05 klaptop kernel: [    6.182657] ACPI Warning: SystemIO range 0x0000000000000840-0x000000000000084F conflicts with OpRegion 0x0000000000000800-0x000000000000087F (\_SB.PCI0.LPC.LPIO) (20160930/utaddress-247)
Dec 30 00:48:05 klaptop kernel: [    6.182660] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Dec 30 00:48:05 klaptop kernel: [    6.182660] ACPI Warning: SystemIO range 0x0000000000000830-0x000000000000083F conflicts with OpRegion 0x0000000000000800-0x000000000000087F (\_SB.PCI0.LPC.LPIO) (20160930/utaddress-247)
Dec 30 00:48:05 klaptop kernel: [    6.182662] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Dec 30 00:48:05 klaptop kernel: [    6.182662] ACPI Warning: SystemIO range 0x0000000000000800-0x000000000000082F conflicts with OpRegion 0x0000000000000800-0x000000000000087F (\_SB.PCI0.LPC.LPIO) (20160930/utaddress-247)
Dec 30 00:48:05 klaptop kernel: [    6.182664] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Dec 30 00:48:05 klaptop kernel: [    6.182664] lpc_ich: Resource conflict(s) found affecting gpio_ich
Dec 30 00:48:05 klaptop kernel: [    6.182711] input: Lid Switch as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0D:00/input/input8
Dec 30 00:48:05 klaptop kernel: [    6.183106] ACPI: Lid Switch [LID]
Dec 30 00:48:05 klaptop kernel: [    6.183158] input: Sleep Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0E:00/input/input9
Dec 30 00:48:05 klaptop kernel: [    6.183161] ACPI: Sleep Button [SLPB]
Dec 30 00:48:05 klaptop kernel: [    6.183223] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input10
Dec 30 00:48:05 klaptop kernel: [    6.183225] ACPI: Power Button [PWRF]
Dec 30 00:48:05 klaptop kernel: [    6.186274] EDAC ie31200: No ECC support
Dec 30 00:48:05 klaptop kernel: [    6.187649] [drm] Initialized
Dec 30 00:48:05 klaptop kernel: [    6.197452] Non-volatile memory driver v1.3
Dec 30 00:48:05 klaptop kernel: [    6.198264] tpm_tis 00:05: 1.2 TPM (device-id 0x0, rev-id 78)
Dec 30 00:48:05 klaptop kernel: [    6.199581] thinkpad_acpi: ThinkPad ACPI Extras v0.25
Dec 30 00:48:05 klaptop kernel: [    6.199581] thinkpad_acpi: http://ibm-acpi.sf.net/
Dec 30 00:48:05 klaptop kernel: [    6.199582] thinkpad_acpi: ThinkPad BIOS GNET80WW (2.28 ), EC unknown
Dec 30 00:48:05 klaptop kernel: [    6.199582] thinkpad_acpi: Lenovo ThinkPad W541, model 20EG000BGB
Dec 30 00:48:05 klaptop kernel: [    6.200548] thinkpad_hwmon thinkpad_hwmon: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Dec 30 00:48:05 klaptop kernel: [    6.200989] thinkpad_acpi: radio switch found; radios are enabled
Dec 30 00:48:05 klaptop kernel: [    6.201138] thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness control, supported by the ACPI video driver
Dec 30 00:48:05 klaptop kernel: [    6.201139] thinkpad_acpi: Disabling thinkpad-acpi brightness events by default...
Dec 30 00:48:05 klaptop kernel: [    6.202724] [drm] Memory usable by graphics device = 2048M
Dec 30 00:48:05 klaptop kernel: [    6.202725] [drm] Replacing VGA console driver
Dec 30 00:48:05 klaptop kernel: [    6.203520] Console: switching to colour dummy device 80x25
Dec 30 00:48:05 klaptop kernel: [    6.205807] wmi: Mapper loaded
Dec 30 00:48:05 klaptop kernel: [    6.206690] ACPI: Battery Slot [BAT0] (battery present)
Dec 30 00:48:05 klaptop kernel: [    6.206873] ACPI: AC Adapter [AC] (on-line)
Dec 30 00:48:05 klaptop kernel: [    6.207676] thinkpad_acpi: rfkill switch tpacpi_bluetooth_sw: radio is unblocked
Dec 30 00:48:05 klaptop kernel: [    6.211290] input: ThinkPad Extra Buttons as /devices/platform/thinkpad_acpi/input/input11
Dec 30 00:48:05 klaptop kernel: [    6.211735] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Dec 30 00:48:05 klaptop kernel: [    6.211736] [drm] Driver supports precise vblank timestamp query.
Dec 30 00:48:05 klaptop kernel: [    6.211906] snd_hda_codec_realtek hdaudioC1D0: autoconfig for ALC3232: line_outs=1 (0x14/0x0/0x0/0x0/0x0) type:speaker
Dec 30 00:48:05 klaptop kernel: [    6.211907] snd_hda_codec_realtek hdaudioC1D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
Dec 30 00:48:05 klaptop kernel: [    6.211908] snd_hda_codec_realtek hdaudioC1D0:    hp_outs=2 (0x16/0x15/0x0/0x0/0x0)
Dec 30 00:48:05 klaptop kernel: [    6.211909] snd_hda_codec_realtek hdaudioC1D0:    mono: mono_out=0x0
Dec 30 00:48:05 klaptop kernel: [    6.211910] snd_hda_codec_realtek hdaudioC1D0:    inputs:
Dec 30 00:48:05 klaptop kernel: [    6.211911] snd_hda_codec_realtek hdaudioC1D0:      Dock Mic=0x19
Dec 30 00:48:05 klaptop kernel: [    6.211912] snd_hda_codec_realtek hdaudioC1D0:      Mic=0x1a
Dec 30 00:48:05 klaptop kernel: [    6.211913] snd_hda_codec_realtek hdaudioC1D0:      Internal Mic=0x12
Dec 30 00:48:05 klaptop kernel: [    6.212434] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Dec 30 00:48:05 klaptop kernel: [    6.232038] input: HDA Digital PCBeep as /devices/pci0000:00/0000:00:1b.0/sound/card1/input12
Dec 30 00:48:05 klaptop kernel: [    6.232061] snd_hda_codec_realtek hdaudioC1D0: rpm_suspend
Dec 30 00:48:05 klaptop kernel: [    6.232150] input: HDA Intel PCH Dock Mic as /devices/pci0000:00/0000:00:1b.0/sound/card1/input13
Dec 30 00:48:05 klaptop kernel: [    6.232179] input: HDA Intel PCH Mic as /devices/pci0000:00/0000:00:1b.0/sound/card1/input14
Dec 30 00:48:05 klaptop kernel: [    6.232206] input: HDA Intel PCH Dock Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card1/input15
Dec 30 00:48:05 klaptop kernel: [    6.232233] input: HDA Intel PCH Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card1/input16
Dec 30 00:48:05 klaptop kernel: [    6.250290] ACPI Warning: \_SB.PCI0.PEG.VID._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160930/nsarguments-95)
Dec 30 00:48:05 klaptop kernel: [    6.250569] ACPI Warning: \_SB.PCI0.PEG.VID._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160930/nsarguments-95)
Dec 30 00:48:05 klaptop kernel: [    6.250724] snd_hda_codec_realtek hdaudioC1D0: rpm_resume
Dec 30 00:48:05 klaptop kernel: [    6.250838] pci 0000:01:00.0: optimus capabilities: enabled, status dynamic power, hda bios codec supported
Dec 30 00:48:05 klaptop kernel: [    6.250844] VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG_.VID_ handle
Dec 30 00:48:05 klaptop kernel: [    6.250844] nouveau: detected PR support, will not use DSM
Dec 30 00:48:05 klaptop kernel: [    6.250873] nouveau 0000:01:00.0: enabling device (0000 -> 0003)
Dec 30 00:48:05 klaptop kernel: [    6.251018] nouveau 0000:01:00.0: NVIDIA GK107 (0e7360a2)
Dec 30 00:48:05 klaptop kernel: [    6.253864] sd 0:0:0:0: Attached scsi generic sg0 type 0
Dec 30 00:48:05 klaptop kernel: [    6.253910] sr 5:0:0:0: Attached scsi generic sg1 type 5
Dec 30 00:48:05 klaptop kernel: [    6.254400] Intel(R) Wireless WiFi driver for Linux
Dec 30 00:48:05 klaptop kernel: [    6.254400] Copyright(c) 2003- 2015 Intel Corporation
Dec 30 00:48:05 klaptop kernel: [    6.258825] iwlwifi 0000:03:00.0: loaded firmware version 17.352738.0 op_mode iwlmvm
Dec 30 00:48:05 klaptop kernel: [    6.266705] input: PC Speaker as /devices/platform/pcspkr/input/input17
Dec 30 00:48:05 klaptop kernel: [    6.267356] iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0x144
Dec 30 00:48:05 klaptop kernel: [    6.267618] Error: Driver 'pcspkr' is already registered, aborting...
Dec 30 00:48:05 klaptop kernel: [    6.269505] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:05 klaptop kernel: [    6.270062] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:05 klaptop kernel: [    6.290659] AVX2 version of gcm_enc/dec engaged.
Dec 30 00:48:05 klaptop kernel: [    6.290659] AES CTR mode by8 optimization enabled
Dec 30 00:48:05 klaptop kernel: [    6.298239] alg: No test for pcbc(aes) (pcbc-aes-aesni)
Dec 30 00:48:05 klaptop kernel: [    6.303593] psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 3/3
Dec 30 00:48:05 klaptop kernel: [    6.321411] Adding 524284k swap on /dev/sda6.  Priority:-1 extents:1 across:524284k SSFS
Dec 30 00:48:05 klaptop kernel: [    6.358568] usb 3-12: New USB device found, idVendor=04ca, idProduct=7035
Dec 30 00:48:05 klaptop kernel: [    6.358570] usb 3-12: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Dec 30 00:48:05 klaptop kernel: [    6.358572] usb 3-12: Product: Integrated Camera
Dec 30 00:48:05 klaptop kernel: [    6.358573] usb 3-12: Manufacturer: 8SSC20F26960L1GZ523029G
Dec 30 00:48:05 klaptop kernel: [    6.365611] nouveau 0000:01:00.0: bios: version 80.07.ac.00.20
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.0862] NetworkManager (version 1.4.2) is starting...
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.0863] Read config: /etc/NetworkManager/NetworkManager.conf
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.0906] manager[0x55b0228c90a0]: monitoring kernel firmware directory '/lib/firmware'.
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.0906] monitoring ifupdown state file '/run/network/ifstate'.
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.0932] dns-mgr[0x55b0228c42a0]: init: dns=default, rc-manager=resolvconf
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.0938] manager[0x55b0228c90a0]: WiFi hardware radio set enabled
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.0939] manager[0x55b0228c90a0]: WWAN hardware radio set enabled
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1141] init!
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1143] management mode: unmanaged
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1146] devices added (path: /sys/devices/pci0000:00/0000:00:19.0/net/enp0s25, iface: enp0s25)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1146] device added (path: /sys/devices/pci0000:00/0000:00:19.0/net/enp0s25, iface: enp0s25): no ifupdown configuration found.
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1146] devices added (path: /sys/devices/virtual/net/lo, iface: lo)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1146] device added (path: /sys/devices/virtual/net/lo, iface: lo): no ifupdown configuration found.
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1146] end _init.
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1146] settings: loaded plugin ifupdown: (C) 2008 Canonical Ltd.  To report bugs please use the NetworkManager mailing list. (/usr/lib/x86_64-linux-gnu/NetworkManager/libnm-settings-plugin-ifupdown.so)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1147] settings: loaded plugin keyfile: (c) 2007 - 2015 Red Hat, Inc.  To report bugs please use the NetworkManager mailing list.
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1147] (579818704) ... get_connections.
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1147] (579818704) ... get_connections (managed=false): return empty list.
Dec 30 00:48:06 klaptop kernel: [    6.506270] input: TPPS/2 IBM TrackPoint as /devices/platform/i8042/serio1/serio2/input/input7
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1476] keyfile: new connection /etc/NetworkManager/system-connections/WB2 (84e0c20b-b085-4370-929b-8d754af80d8e,"WB2")
Dec 30 00:48:06 klaptop kernel: [    6.526660] iTCO_vendor_support: vendor-support=0
Dec 30 00:48:06 klaptop kernel: [    6.528125] Bluetooth: Core ver 2.22
Dec 30 00:48:06 klaptop kernel: [    6.528133] NET: Registered protocol family 31
Dec 30 00:48:06 klaptop kernel: [    6.528133] Bluetooth: HCI device and connection manager initialized
Dec 30 00:48:06 klaptop kernel: [    6.528135] Bluetooth: HCI socket layer initialized
Dec 30 00:48:06 klaptop kernel: [    6.528137] Bluetooth: L2CAP socket layer initialized
Dec 30 00:48:06 klaptop kernel: [    6.528140] Bluetooth: SCO socket layer initialized
Dec 30 00:48:06 klaptop kernel: [    6.529698] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
Dec 30 00:48:06 klaptop kernel: [    6.531732] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
Dec 30 00:48:06 klaptop kernel: [    6.531795] iTCO_wdt: Found a Lynx Point TCO device (Version=2, TCOBASE=0x1860)
Dec 30 00:48:06 klaptop kernel: [    6.531934] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1625] keyfile: new connection /etc/NetworkManager/system-connections/WBP 1 (7fc51fbc-eac7-4a50-bc93-193095df185e,"WBP 1")
Dec 30 00:48:06 klaptop kernel: [    6.540832] intel_rapl: Found RAPL domain package
Dec 30 00:48:06 klaptop kernel: [    6.540834] intel_rapl: Found RAPL domain dram
Dec 30 00:48:06 klaptop kernel: [    6.546465] usbcore: registered new interface driver btusb
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.1825] keyfile: new connection /etc/NetworkManager/system-connections/WLAN-810157 (d0d783b6-4e71-41ec-8454-6288455b5a9c,"WLAN-810157")
Dec 30 00:48:06 klaptop kernel: [    6.572225] Bluetooth: hci0: read Intel version: 3707100180012d0d00
Dec 30 00:48:06 klaptop kernel: [    6.573114] Bluetooth: hci0: Intel Bluetooth firmware file: intel/ibt-hw-37.7.10-fw-1.80.1.2d.d.bseq
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2145] keyfile: new connection /etc/NetworkManager/system-connections/eduroam (f5885650-e54a-48c0-9fda-810c33a92c38,"eduroam")
Dec 30 00:48:06 klaptop kernel: [    6.588924] iwlwifi 0000:03:00.0 wlp3s0: renamed from wlan0
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2412] keyfile: new connection /etc/NetworkManager/system-connections/H18+ (07870f15-63d2-4069-9b7b-7d9af7728479,"H18+")
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2499] keyfile: new connection /etc/NetworkManager/system-connections/Vodafone Hotspot (60f7b168-b7db-4a8e-8a78-8d78f6d182f5,"Vodafone Hotspot")
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2609] keyfile: new connection /etc/NetworkManager/system-connections/Androidks (0fe04bb6-7e72-46aa-950f-8b266d918691,"Androidks")
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2697] keyfile: new connection /etc/NetworkManager/system-connections/Wired connection 1 (69f85f23-0764-4afc-b23c-b1a5ce7ea215,"Wired connection 1")
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2782] keyfile: new connection /etc/NetworkManager/system-connections/Casablanca (e239d612-895e-4fd9-9c86-2b49615fd1f5,"Casablanca")
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2901] keyfile: new connection /etc/NetworkManager/system-connections/casablanca (3bdcd57d-d3e6-44e5-bb8c-dbaae0803d26,"casablanca")
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.2998] keyfile: new connection /etc/NetworkManager/system-connections/WBP (d8910d0b-4a65-4b7e-a4f9-2a9dcc737e9e,"WBP")
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.3016] get unmanaged devices count: 0
Dec 30 00:48:06 klaptop kernel: [    6.682894] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
Dec 30 00:48:06 klaptop kernel: [    6.682895] Bluetooth: BNEP filters: protocol multicast
Dec 30 00:48:06 klaptop kernel: [    6.682898] Bluetooth: BNEP socket layer initialized
Dec 30 00:48:06 klaptop kernel: [    6.682899] media: Linux media interface: v0.10
Dec 30 00:48:06 klaptop kernel: [    6.686284] Linux video capture interface: v2.00
Dec 30 00:48:06 klaptop kernel: [    6.690577] uvcvideo: Found UVC 1.00 device Integrated Camera (04ca:7035)
Dec 30 00:48:06 klaptop kernel: [    6.699125] uvcvideo 3-12:1.0: Entity type for entity Extension 4 was not initialized!
Dec 30 00:48:06 klaptop kernel: [    6.699126] uvcvideo 3-12:1.0: Entity type for entity Extension 3 was not initialized!
Dec 30 00:48:06 klaptop kernel: [    6.699127] uvcvideo 3-12:1.0: Entity type for entity Processing 2 was not initialized!
Dec 30 00:48:06 klaptop kernel: [    6.699128] uvcvideo 3-12:1.0: Entity type for entity Camera 1 was not initialized!
Dec 30 00:48:06 klaptop kernel: [    6.699183] input: Integrated Camera as /devices/pci0000:00/0000:00:14.0/usb3/3-12/3-12:1.0/input/input18
Dec 30 00:48:06 klaptop kernel: [    6.699231] usb 3-12: rpm_idle
Dec 30 00:48:06 klaptop kernel: [    6.699241] usbcore: registered new interface driver uvcvideo
Dec 30 00:48:06 klaptop kernel: [    6.699242] USB Video Class driver (1.1.1)
Dec 30 00:48:06 klaptop kernel: [    6.701468] usb 3-12: rpm_idle
Dec 30 00:48:06 klaptop kernel: [    6.719226] Bluetooth: hci0: Intel Bluetooth firmware patch completed and activated
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5109] settings: hostname: using hostnamed
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5110] settings: hostname changed from (none) to "klaptop"
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5114] dhcp-init: Using DHCP client 'dhclient'
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5114] manager: WiFi enabled by radio killswitch; enabled by state file
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5114] manager: WWAN enabled by radio killswitch; enabled by state file
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5114] manager: Networking is enabled by state file
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5115] Loaded device plugin: NMVxlanFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5115] Loaded device plugin: NMVlanFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5115] Loaded device plugin: NMVethFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5115] Loaded device plugin: NMTunFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5115] Loaded device plugin: NMMacvlanFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5116] Loaded device plugin: NMIPTunnelFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5116] Loaded device plugin: NMInfinibandFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5116] Loaded device plugin: NMEthernetFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5116] Loaded device plugin: NMBridgeFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5116] Loaded device plugin: NMBondFactory (internal)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5125] Loaded device plugin: NMAtmManager (/usr/lib/x86_64-linux-gnu/NetworkManager/libnm-device-plugin-adsl.so)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5157] Loaded device plugin: NMBluezManager (/usr/lib/x86_64-linux-gnu/NetworkManager/libnm-device-plugin-bluetooth.so)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5178] Loaded device plugin: NMTeamFactory (/usr/lib/x86_64-linux-gnu/NetworkManager/libnm-device-plugin-team.so)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5186] Loaded device plugin: NMWifiFactory (/usr/lib/x86_64-linux-gnu/NetworkManager/libnm-device-plugin-wifi.so)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5191] Loaded device plugin: NMWwanFactory (/usr/lib/x86_64-linux-gnu/NetworkManager/libnm-device-plugin-wwan.so)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5198] device (lo): link connected
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5207] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/0)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5222] manager: (enp0s25): new Ethernet device (/org/freedesktop/NetworkManager/Devices/1)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.5235] device (enp0s25): state change: unmanaged -> unavailable (reason 'managed') [10 20 2]
Dec 30 00:48:06 klaptop kernel: [    6.895239] IPv6: ADDRCONF(NETDEV_UP): enp0s25: link is not ready
Dec 30 00:48:06 klaptop kernel: [    7.146750] IPv6: ADDRCONF(NETDEV_UP): enp0s25: link is not ready
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.7772] rfkill1: found WiFi radio killswitch (at /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/ieee80211/phy0/rfkill1) (driver iwlwifi)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.7776] devices added (path: /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/net/wlp3s0, iface: wlp3s0)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.7776] device added (path: /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/net/wlp3s0, iface: wlp3s0): no ifupdown configuration found.
Dec 30 00:48:06 klaptop kernel: [    7.148434] ip6_tables: (C) 2000-2006 Netfilter Core Team
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.7806] (wlp3s0): using nl80211 for WiFi device control
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.7807] device (wlp3s0): driver supports Access Point (AP) mode
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.7814] manager: (wlp3s0): new 802.11 WiFi device (/org/freedesktop/NetworkManager/Devices/2)
Dec 30 00:48:06 klaptop NetworkManager[1918]: <info>  [1483055286.7820] device (wlp3s0): state change: unmanaged -> unavailable (reason 'managed') [10 20 2]
Dec 30 00:48:06 klaptop kernel: [    7.153381] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
Dec 30 00:48:06 klaptop kernel: [    7.155543] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:06 klaptop kernel: [    7.156063] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:06 klaptop kernel: [    7.159205] Ebtables v2.0 registered
Dec 30 00:48:06 klaptop kernel: [    7.351582] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:06 klaptop kernel: [    7.351837] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:07 klaptop kernel: [    7.372728] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
Dec 30 00:48:07 klaptop kernel: [    7.444450] nouveau 0000:01:00.0: fb: 2048 MiB GDDR5
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.0794] device (wlp3s0): set-hw-addr: set MAC address to 2A:65:07:A2:E5:F6 (scanning)
Dec 30 00:48:07 klaptop kernel: [    7.452958] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:07 klaptop kernel: [    7.453891] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:07 klaptop kernel: [    7.655344] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:07 klaptop kernel: [    7.656268] iwlwifi 0000:03:00.0: L1 Enabled - LTR Enabled
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.3014] bluez: use BlueZ version 5
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.3018] ModemManager available in the bus
Dec 30 00:48:07 klaptop kernel: [    7.671864] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.3147] supplicant: wpa_supplicant running
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.3147] device (wlp3s0): supplicant interface state: init -> starting
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.4803] sup-iface[0x55b02292d960,wlp3s0]: supports 5 scan SSIDs
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.4809] device (wlp3s0): supplicant interface state: starting -> ready
Dec 30 00:48:07 klaptop NetworkManager[1918]: <info>  [1483055287.4810] device (wlp3s0): state change: unavailable -> disconnected (reason 'supplicant-available') [20 30 42]
Dec 30 00:48:07 klaptop kernel: [    7.852392] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
Dec 30 00:48:08 klaptop kernel: [    8.746563] vga_switcheroo: enabled
Dec 30 00:48:08 klaptop kernel: [    8.746819] [TTM] Zone  kernel: Available graphics memory: 10092776 kiB
Dec 30 00:48:08 klaptop kernel: [    8.746819] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
Dec 30 00:48:08 klaptop kernel: [    8.746820] [TTM] Initializing pool allocator
Dec 30 00:48:08 klaptop kernel: [    8.746824] [TTM] Initializing DMA pool allocator
Dec 30 00:48:08 klaptop kernel: [    8.746852] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
Dec 30 00:48:08 klaptop kernel: [    8.746853] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
Dec 30 00:48:08 klaptop kernel: [    8.746855] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
Dec 30 00:48:08 klaptop kernel: [    8.746856] nouveau 0000:01:00.0: DRM: DCB version 4.0
Dec 30 00:48:08 klaptop kernel: [    8.746857] nouveau 0000:01:00.0: DRM: DCB outp 00: 08800fc6 0f420010
Dec 30 00:48:08 klaptop kernel: [    8.746858] nouveau 0000:01:00.0: DRM: DCB outp 01: 08000f82 00020010
Dec 30 00:48:08 klaptop kernel: [    8.746859] nouveau 0000:01:00.0: DRM: DCB conn 00: 01000046
Dec 30 00:48:08 klaptop kernel: [    8.781822] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Dec 30 00:48:08 klaptop kernel: [    8.781823] [drm] Driver supports precise vblank timestamp query.
Dec 30 00:48:08 klaptop kernel: [    8.781913] nouveau 0000:01:00.0: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Dec 30 00:48:08 klaptop kernel: [    8.949744] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
Dec 30 00:48:08 klaptop kernel: [    8.958369] usb 3-12: rpm_suspend
Dec 30 00:48:08 klaptop kernel: [    8.998373] usb usb3-port12: rpm_suspend
Dec 30 00:48:08 klaptop kernel: [    9.010375] [drm] Cannot find any crtc or sizes - going 1024x768
Dec 30 00:48:08 klaptop kernel: [    9.064123] nouveau 0000:01:00.0: DRM: allocated 1024x768 fb: 0x60000, bo ffff9bb72238bc00
Dec 30 00:48:08 klaptop kernel: [    9.064392] Console: switching to colour frame buffer device 128x48
Dec 30 00:48:08 klaptop kernel: [    9.065116] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
Dec 30 00:48:08 klaptop kernel: [    9.082458] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1
Dec 30 00:48:08 klaptop kernel: [    9.083512] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
Dec 30 00:48:08 klaptop kernel: [    9.083980] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input19
Dec 30 00:48:08 klaptop kernel: [    9.084137] ACPI: Video Device [VID1] (multi-head: yes  rom: yes  post: no)
Dec 30 00:48:08 klaptop kernel: [    9.084291] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:09/LNXVIDEO:01/input/input20
Dec 30 00:48:08 klaptop kernel: [    9.084406] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
Dec 30 00:48:08 klaptop kernel: [    9.084413] [drm] Initialized i915 1.6.0 20161121 for 0000:00:02.0 on minor 0
Dec 30 00:48:10 klaptop kernel: [    9.459001] fbcon: inteldrmfb (fb1) is primary device
Dec 30 00:48:10 klaptop kernel: [    9.459002] fbcon: Remapping primary device, fb1, to tty 1-63
Dec 30 00:48:10 klaptop kernel: [   10.799770] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:10 klaptop kernel: [   10.815681] i915 0000:00:02.0: fb1: inteldrmfb frame buffer device
Dec 30 00:48:10 klaptop kernel: [   10.835266] snd_hda_codec_hdmi hdaudioC0D0: rpm_suspend
Dec 30 00:48:10 klaptop kernel: [   10.835434] input: HDA Intel HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.0/sound/card0/input21
Dec 30 00:48:10 klaptop kernel: [   10.835480] input: HDA Intel HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.0/sound/card0/input22
Dec 30 00:48:10 klaptop kernel: [   10.835523] input: HDA Intel HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.0/sound/card0/input23
Dec 30 00:48:10 klaptop kernel: [   10.836029] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:10 klaptop kernel: [   10.859089] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:10 klaptop kernel: [   10.859150] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:10 klaptop NetworkManager[1918]: <info>  [1483055290.7948] device (wlp3s0): supplicant interface state: ready -> inactive
Dec 30 00:48:11 klaptop NetworkManager[1918]: <info>  [1483055291.6638] manager: startup complete
Dec 30 00:48:12 klaptop kernel: [   13.122178] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:12 klaptop kernel: [   13.122216] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:12 klaptop kernel: [   13.267619] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:12 klaptop kernel: [   13.267686] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:12 klaptop kernel: [   13.267814] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:12 klaptop kernel: [   13.267870] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:13 klaptop kernel: [   13.550640] nouveau 0000:01:00.0: rpm_idle
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9375] policy: auto-activating connection 'casablanca'
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9384] device (wlp3s0): Activation: starting connection 'casablanca' (3bdcd57d-d3e6-44e5-bb8c-dbaae0803d26)
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9385] device (wlp3s0): state change: disconnected -> prepare (reason 'none') [30 40 0]
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9386] manager: NetworkManager state is now CONNECTING
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9392] device (wlp3s0): set-hw-addr: set-cloned MAC address to CC:3D:82:59:89:F4 (permanent)
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9410] device (wlp3s0): state change: prepare -> config (reason 'none') [40 50 0]
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9411] device (wlp3s0): Activation: (wifi) access point 'casablanca' has security, but secrets are required.
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9411] device (wlp3s0): state change: config -> need-auth (reason 'none') [50 60 0]
Dec 30 00:48:17 klaptop kernel: [   18.312115] IPv6: ADDRCONF(NETDEV_UP): wlp3s0: link is not ready
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9434] device (wlp3s0): state change: need-auth -> prepare (reason 'none') [60 40 0]
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9436] device (wlp3s0): state change: prepare -> config (reason 'none') [40 50 0]
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9437] device (wlp3s0): Activation: (wifi) connection 'casablanca' has security, and secrets exist.  No new secrets needed.
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9438] Config: added 'ssid' value 'casablanca'
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9438] Config: added 'scan_ssid' value '1'
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9438] Config: added 'key_mgmt' value 'WPA-PSK'
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9438] Config: added 'auth_alg' value 'OPEN'
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9438] Config: added 'psk' value '<omitted>'
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9874] device (wlp3s0): supplicant interface state: inactive -> disconnected
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9875] sup-iface[0x55b02292d960,wlp3s0]: config: set interface ap_scan to 1
Dec 30 00:48:17 klaptop NetworkManager[1918]: <info>  [1483055297.9952] dev=
ice (wlp3s0): supplicant interface state: disconnected -> inactive
Dec 30 00:48:17 klaptop kernel: [   18.367411] wlp3s0: authenticate with d4:21:22:cb:cb:85
Dec 30 00:48:17 klaptop kernel: [   18.370259] wlp3s0: send auth to d4:21:22:cb:cb:85 (try 1/3)
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0008] device (wlp3s0): supplicant interface state: inactive -> associating
Dec 30 00:48:18 klaptop kernel: [   18.371146] wlp3s0: authenticated
Dec 30 00:48:18 klaptop kernel: [   18.374934] wlp3s0: associate with d4:21:22:cb:cb:85 (try 1/3)
Dec 30 00:48:18 klaptop kernel: [   18.379137] wlp3s0: RX AssocResp from d4:21:22:cb:cb:85 (capab=0x11 status=0 aid=1)
Dec 30 00:48:18 klaptop kernel: [   18.380611] wlp3s0: associated
Dec 30 00:48:18 klaptop kernel: [   18.380619] IPv6: ADDRCONF(NETDEV_CHANGE): wlp3s0: link becomes ready
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0144] device (wlp3s0): supplicant interface state: associating -> associated
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0199] device (wlp3s0): supplicant interface state: associated -> 4-way handshake
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0358] device (wlp3s0): supplicant interface state: 4-way handshake -> completed
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0358] device (wlp3s0): Activation: (wifi) Stage 2 of 5 (Device Configure) successful.  Connected to wireless network 'casablanca'.
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0359] device (wlp3s0): state change: config -> ip-config (reason 'none') [50 70 0]
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0367] dhcp4 (wlp3s0): activation: beginning transaction (timeout in 45 seconds)
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0394] dhcp4 (wlp3s0): dhclient started with pid 3064
Dec 30 00:48:18 klaptop kernel: [   18.438644] wlp3s0: Limiting TX power to 30 (30 - 0) dBm as advertised by d4:21:22:cb:cb:85
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   address 192.168.2.104
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   plen 24 (255.255.255.0)
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   gateway 192.168.2.1
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   server identifier 192.168.2.1
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   lease time 1814400
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   nameserver '192.168.2.1'
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   nameserver '192.168.2.1'
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0):   domain name 'Speedport_W_724V_09011603_00_025'
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0886] dhcp4 (wlp3s0): state changed unknown -> bound
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0897] device (wlp3s0): state change: ip-config -> ip-check (reason 'none') [70 80 0]
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0900] device (wlp3s0): state change: ip-check -> secondaries (reason 'none') [80 90 0]
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0902] device (wlp3s0): state change: secondaries -> activated (reason 'none') [90 100 0]
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0902] manager: NetworkManager state is now CONNECTED_LOCAL
Dec 30 00:48:18 klaptop at-spi-bus-laun[3085]: Failed to register client: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.gnome.SessionManager was not provided by any .service files
Dec 30 00:48:18 klaptop at-spi-bus-laun[3085]: g_dbus_proxy_call_internal: assertion 'G_IS_DBUS_PROXY (proxy)' failed
Dec 30 00:48:18 klaptop at-spi-bus-laun[3085]: invalid unclassed pointer in cast to 'GObject'
Dec 30 00:48:18 klaptop at-spi-bus-laun[3085]: instance with invalid (NULL) class pointer
Dec 30 00:48:18 klaptop at-spi-bus-laun[3085]: g_signal_connect_data: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0913] manager: NetworkManager state is now CONNECTED_GLOBAL
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0913] policy: set 'casablanca' (wlp3s0) as default for IPv4 routing and DNS
Dec 30 00:48:18 klaptop NetworkManager[1918]: <info>  [1483055298.0936] device (wlp3s0): Activation: successful, device activated.
Dec 30 00:48:19 klaptop NetworkManager[1918]: <info>  [1483055299.6746] dhcp6 (wlp3s0): activation: beginning transaction (timeout in 45 seconds)
Dec 30 00:48:19 klaptop NetworkManager[1918]: <warn>  [1483055299.6746] dhcp6 (wlp3s0): hostname is not a FQDN, it will be ignored
Dec 30 00:48:19 klaptop NetworkManager[1918]: <info>  [1483055299.6762] dhcp6 (wlp3s0): dhclient started with pid 3220
Dec 30 00:48:19 klaptop NetworkManager[1918]: <info>  [1483055299.6771] policy: set 'casablanca' (wlp3s0) as default for IPv6 routing and DNS
Dec 30 00:48:20 klaptop kernel: [   20.583465] snd_hda_codec_hdmi hdaudioC0D0: rpm_resume
Dec 30 00:48:20 klaptop NetworkManager[1918]: <info>  [1483055300.3050] dhcp6 (wlp3s0):   nameserver 'fe80::1'
Dec 30 00:48:20 klaptop NetworkManager[1918]: <info>  [1483055300.3051] dhcp6 (wlp3s0): state changed unknown -> bound
Dec 30 00:48:20 klaptop NetworkManager[1918]: <info>  [1483055300.3070] dhcp6 (wlp3s0): client pid 3220 exited with status 0
Dec 30 00:48:20 klaptop NetworkManager[1918]: <info>  [1483055300.3071] dhcp6 (wlp3s0): state changed bound -> done
Dec 30 00:48:20 klaptop kernel: [   20.839240] snd_hda_intel 0000:00:1b.0: IRQ timing workaround is activated for card #1. Suggest a bigger bdl_pos_adj.
Dec 30 00:48:20 klaptop clock-applet[3304]: /build/glib2.0-m2w47E/glib2.0-2.50.2/./gobject/gsignal.c:2523: signal 'size_request' is invalid for instance '0x564aac92bd20' of type 'GtkLabel'
Dec 30 00:48:21 klaptop kernel: [   22.114185] random: crng init done
Dec 30 00:48:24 klaptop kernel: [   24.831417] nouveau 0000:01:00.0: rpm_suspend
Dec 30 00:48:24 klaptop kernel: [   24.831427] nouveau 0000:01:00.0: DRM: suspending console...
Dec 30 00:48:24 klaptop kernel: [   24.831432] nouveau 0000:01:00.0: DRM: suspending display...
Dec 30 00:48:24 klaptop kernel: [   24.831477] nouveau 0000:01:00.0: DRM: evicting buffers...
Dec 30 00:48:24 klaptop kernel: [   24.865243] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
Dec 30 00:48:24 klaptop kernel: [   24.865269] nouveau 0000:01:00.0: DRM: suspending client object trees...
Dec 30 00:48:24 klaptop kernel: [   24.870724] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
Dec 30 00:48:25 klaptop kernel: [   26.080300] thinkpad_acpi: EC reports that Thermal Table has changed
Dec 30 00:48:25 klaptop kernel: [   26.207691] pcieport 0000:00:01.0: rpm_idle
Dec 30 00:48:25 klaptop kernel: [   26.207693] pcieport 0000:00:01.0: rpm_suspend
Dec 30 00:48:28 klaptop kernel: [   28.927640] snd_hda_codec_hdmi hdaudioC0D0: rpm_suspend
SYSTEM IS NOW NOT RESPONSIVE
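
For anyone repeating this, here is a minimal sketch for pulling only the runtime PM trace lines out of a saved log. It assumes each kernel message sits on a single line and ends in rpm_idle, rpm_suspend or rpm_resume, as the debug messages in this log do. The function name rpm_events is made up for this illustration.

```shell
# Filter dmesg-style text on stdin down to the runtime PM trace lines.
# Assumption: the debug messages end the line with the event name,
# e.g. "nouveau 0000:01:00.0: rpm_suspend".
rpm_events() {
    grep -E ': rpm_(idle|suspend|resume)$'
}
```

Something like "dmesg | rpm_events" or "rpm_events < saved.log" should then show the order in which the GPU and the root port above it went idle and suspended.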


----- Original Message -----
From: "Lukas Wunner" <lukas@wunner.de>
To: "Kilian Singer" <kilian.singer@quantumtechnology.info>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Sent: Friday, December 30, 2016 1:07:31 AM
Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"

On Fri, Dec 30, 2016 at 12:20:34AM +0100, Kilian Singer wrote:
> The echo on > /sys/bus/pci/devices/0000:00:01.0/power/control
>
> did not help.
>
> Also I noticed on each boot directly after initramfs I get
> mmc0: Unknown controller version. You may experience problems.
>
> On all versions of the kernel.

Hm, that rings a bell.  The MMC controller is located below
root port 0000:00:1c.0, which has vendor/device ID 8086:8c10.
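
As a sketch of how such an ID can be read back from sysfs: the vendor and device attribute files of a PCI device hold the two IDs as hex values with a 0x prefix. The helper name pci_id is hypothetical and the argument is the device's sysfs directory.

```shell
# Print "vendor:device" for one PCI device directory,
# e.g. /sys/bus/pci/devices/0000:00:1c.0 (path passed by the caller).
pci_id() {
    dev="$1"
    vendor=$(cat "$dev/vendor")
    device=$(cat "$dev/device")
    # sysfs reports the IDs as "0x8086"; strip the "0x" prefix.
    echo "${vendor#0x}:${device#0x}"
}
```

Running "lspci -nn -s 00:1c.0" is another way to see the same pair of IDs.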

We're having trouble with the exact same root port on 2015
MacBook Pros where it mysteriously prevents them from powering off:
https://bugzilla.kernel.org/show_bug.cgi?id=103211
http://www.spinics.net/lists/linux-pci/msg53460.html

Does this help:
echo on > /sys/bus/pci/devices/0000:00:1c.0/power/control

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-28 16:18   ` Bjorn Helgaas
  2016-12-29  9:58     ` Kilian Singer
@ 2016-12-30  0:19     ` Rafael J. Wysocki
  2016-12-30 14:48       ` Rafael J. Wysocki
  1 sibling, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30  0:19 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lukas Wunner, kilian.singer, linux-pci, Mika Westerberg,
	Rafael J. Wysocki

On Wednesday, December 28, 2016 10:18:16 AM Bjorn Helgaas wrote:
> On Wed, Dec 28, 2016 at 12:29:54PM +0100, Lukas Wunner wrote:
> > On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> > > Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> > > and all the debugging you've done.  Below is a revert of the troublesome
> > > commit.  Can you test it and verify that it also fixes the problem?
> > > 
> > > I assume Mika is looking at this and will have a better solution soon.
> > > But if not, I'll queue this up for v4.10.
> > 
> > @Kilian:  Are you using the proprietary nvidia driver?  If so,
> > does the issue go away if you blacklist that driver or use nouveau
> > instead?
> > 
> > 
> > With a bit of googling I found dmesg and lspci output for this model:
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1437386
> > 
> > The keyboard and mouse seem to be PS/2 devices, accessed via I/O ports
> > 0x60, 0x64.  I assume they're located behind the LPC bridge?
> > 
> > The proprietary nvidia driver has a bug, it locks the legacy PCI VGA
> > registers with vga_tryget() but never releases that lock.  Intel
> > chipsets have a quirk wherein I/O ports are routed to the bus to
> > which the legacy PCI VGA registers are locked.  So once vga_tryget()
> > is called by the nvidia driver, access to the keyboard and mouse is
> > routed to bus 01 (on which the Nvidia card resides) and not to bus 00
> > (on which the LPC bridge resides).
> 
> Interesting.  A spec reference would be a good addition to whatever
> fix is proposed.
> 
> > My theory would be that if you lock the screen, the Nvidia card
> > runtime suspends, allowing the root port above it to suspend,
> > and then the I/O ports are no longer accessible.
> > 
> > We have a similar issue on dual GPU MacBook Pros:  Backlight brightness
> > is adjusted by writing to I/O ports of a gmux controller situated below
> > the LPC bridge.  The nvidia driver locks the legacy VGA registers and
> > from that point reads from the I/O ports always return 0xff.  Commit
> > 4eebd5a4e726 ("apple-gmux: lock iGP IO to protect from vgaarb changes")
> > sought to fix it but caused other breakage which remains unfixed so far:
> > https://bugzilla.kernel.org/show_bug.cgi?id=105051
> > https://bugzilla.kernel.org/show_bug.cgi?id=88861#c11
> > 
> > I've always wondered if the Intel chipsets would behave more sensibly
> > if the LPC bridge had BARs specifying the I/O regions used by devices
> > below it.
> > 
> > Reverting runtime suspend for PCIe ports is not a good solution as it's
> > needed for Thunderbolt runtime PM on Macs.
> 
> The choices are:
> 
>   1) Fix the regression and preserve runtime PM for PCIe ports
>   2) Fix the regression by reverting runtime PM for PCIe ports
> 
> Obviously we hope for 1).  Preserving runtime PM without fixing the
> regression isn't even on the list.  I know this is Linux 101, so I
> apologize for restating the obvious.

There are a couple of obvious things we can do other than reverting, though.

For example, changing the cutoff date we have in there to cover Kilian's
system.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:24                   ` Kilian Singer
@ 2016-12-30  0:22                     ` Rafael J. Wysocki
  2016-12-30  0:39                       ` Kilian Singer
  2016-12-30  0:45                       ` Kilian Singer
  0 siblings, 2 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30  0:22 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg,
	Rafael J. Wysocki

On Friday, December 30, 2016 01:24:19 AM Kilian Singer wrote:
> The
> 
> echo on > /sys/bus/pci/devices/0000:00:1c.0/power/control
> 
> did not help either.

So does it help if you do "echo on > .../power/control" for all devices under
/sys/bus/pci/devices/ ?
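
A minimal sketch of that loop, for convenience. The function name and the optional root argument are inventions of this example; the argument is there only so the loop can be tried on a scratch directory before pointing it at the real sysfs tree.

```shell
# Write "on" to power/control for every device under a sysfs-style root.
# The root defaults to /sys/bus/pci/devices; writing there needs root.
pci_runtime_pm_all_on() {
    root="${1:-/sys/bus/pci/devices}"
    for ctl in "$root"/*/power/control; do
        # Skip the unexpanded glob when there are no device directories.
        [ -e "$ctl" ] || continue
        # Report a failed write and carry on with the next device.
        echo on > "$ctl" || echo "could not write $ctl" >&2
    done
}
```

Calling pci_runtime_pm_all_on without arguments, as root, would forbid runtime suspend for all PCI devices until the next boot or until "auto" is written back.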

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:16                 ` Kilian Singer
@ 2016-12-30  0:24                   ` Kilian Singer
  2016-12-30  0:22                     ` Rafael J. Wysocki
  2017-01-02 11:40                   ` Lukas Wunner
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-30  0:24 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

The

echo on > /sys/bus/pci/devices/0000:00:1c.0/power/control

did not help either.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:22                     ` Rafael J. Wysocki
@ 2016-12-30  0:39                       ` Kilian Singer
  2016-12-30  0:41                         ` Rafael J. Wysocki
  2016-12-30  0:45                       ` Kilian Singer
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-30  0:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg,
	Rafael J. Wysocki

I did that for all of them. It worked for every device except

0000:01:00.0

where bash on the MATE desktop did not return to a prompt.

I opened a second terminal and tried the same device again; it did not
return to a prompt either.

But after clicking on lock screen I still had the same issue.


On 30-Dec-16 01:22, Rafael J. Wysocki wrote:
> On Friday, December 30, 2016 01:24:19 AM Kilian Singer wrote:
>> The
>>
>> echo on > /sys/bus/pci/devices/0000:00:1c.0/power/control
>>
>> did not help either.
> So does it help if you do "echo on > .../power/control" for all devices under
> /sys/bus/pci/devices/ ?
>
> Thanks,
> Rafael
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:39                       ` Kilian Singer
@ 2016-12-30  0:41                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30  0:41 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg,
	Rafael J. Wysocki

On Friday, December 30, 2016 01:39:46 AM Kilian Singer wrote:
> I did that for all of them. It worked for every device except
> 
> 0000:01:00.0
> 
> where bash on the MATE desktop did not return to a prompt.
> 
> I opened a second terminal and tried the same device again; it did not
> return to a prompt either.
> 
> But after clicking on lock screen I still had the same issue.

Something fishy is going on here.

What kernel did you test it with?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:22                     ` Rafael J. Wysocki
  2016-12-30  0:39                       ` Kilian Singer
@ 2016-12-30  0:45                       ` Kilian Singer
  2016-12-30  1:40                         ` Rafael J. Wysocki
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-30  0:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg,
	Rafael J. Wysocki

The touchpad and built-in keyboard of the laptop fail.
But a connected USB keyboard and USB mouse fail as well.

In both cases the mouse pointer can still be moved.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:45                       ` Kilian Singer
@ 2016-12-30  1:40                         ` Rafael J. Wysocki
  2016-12-30  1:50                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30  1:40 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg,
	Rafael J. Wysocki

On Friday, December 30, 2016 01:45:53 AM Kilian Singer wrote:
> The touchpad and builtin keyboard of laptop fail.
> But also connected usb keyboard and usb mouse fail.
> 
> In both cases mouse pointer is still moveable.

It may just be too late to turn the ports "on" once they have been suspended at
least once on your system.

Please check if the appended patch makes any difference.

Thanks,
Rafael


---
 drivers/pci/pcie/portdrv_pci.c |    2 --
 1 file changed, 2 deletions(-)

Index: linux-pm/drivers/pci/pcie/portdrv_pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pcie/portdrv_pci.c
+++ linux-pm/drivers/pci/pcie/portdrv_pci.c
@@ -160,7 +160,6 @@ static int pcie_portdrv_probe(struct pci
 		pm_runtime_use_autosuspend(&dev->dev);
 		pm_runtime_mark_last_busy(&dev->dev);
 		pm_runtime_put_autosuspend(&dev->dev);
-		pm_runtime_allow(&dev->dev);
 	}
 
 	return 0;
@@ -169,7 +168,6 @@ static int pcie_portdrv_probe(struct pci
 static void pcie_portdrv_remove(struct pci_dev *dev)
 {
 	if (pci_bridge_d3_possible(dev)) {
-		pm_runtime_forbid(&dev->dev);
 		pm_runtime_get_noresume(&dev->dev);
 		pm_runtime_dont_use_autosuspend(&dev->dev);
 	}

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  1:40                         ` Rafael J. Wysocki
@ 2016-12-30  1:50                           ` Rafael J. Wysocki
  2016-12-30  1:52                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30  1:50 UTC (permalink / raw)
  To: Kilian Singer; +Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg

On Friday, December 30, 2016 02:40:45 AM Rafael J. Wysocki wrote:
> On Friday, December 30, 2016 01:45:53 AM Kilian Singer wrote:
> > The touchpad and builtin keyboard of laptop fail.
> > But also connected usb keyboard and usb mouse fail.
> > 
> > In both cases mouse pointer is still moveable.
> 
> It may just be too late to turn the ports "on" once they have been suspended at
> least once on your system.
> 
> Please check if the appended patch makes any difference.

Actually, please first check if booting with pci_port_pm=off in the kernel
command line makes any difference.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  1:50                           ` Rafael J. Wysocki
@ 2016-12-30  1:52                             ` Rafael J. Wysocki
  2016-12-30 13:37                               ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30  1:52 UTC (permalink / raw)
  To: Kilian Singer; +Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg

On Friday, December 30, 2016 02:50:45 AM Rafael J. Wysocki wrote:
> On Friday, December 30, 2016 02:40:45 AM Rafael J. Wysocki wrote:
> > On Friday, December 30, 2016 01:45:53 AM Kilian Singer wrote:
> > > The touchpad and builtin keyboard of laptop fail.
> > > But also connected usb keyboard and usb mouse fail.
> > > 
> > > In both cases mouse pointer is still moveable.
> > 
> > It may just be too late to turn the ports "on" once they have been suspended at
> > least once on your system.
> > 
> > Please check if the appended patch makes any difference.
> 
> Actually, please first check if booting with pci_port_pm=off in the kernel
> command line makes any difference.

The command line option should be pcie_port_pm=off ("e" was missing), sorry.
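
Since the two spellings differ by a single letter, it may be worth confirming after reboot which one the kernel was actually booted with. A small sketch follows; the helper name and the optional file argument are assumptions of this example, with /proc/cmdline as the default source.

```shell
# Succeed if the given parameter appears as a whole word on the
# kernel command line (default: /proc/cmdline).
has_cmdline_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # One word per line, then an exact fixed-string match, so that
    # "pci_port_pm=off" is not mistaken for "pcie_port_pm=off".
    tr -s ' ' '\n' < "$file" | grep -qxF -- "$param"
}
```

For example: has_cmdline_param pcie_port_pm=off && echo "port PM disabled on the command line".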

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  1:52                             ` Rafael J. Wysocki
@ 2016-12-30 13:37                               ` Kilian Singer
  2016-12-30 13:59                                 ` Kilian Singer
  2016-12-30 14:47                                 ` Rafael J. Wysocki
  0 siblings, 2 replies; 115+ messages in thread
From: Kilian Singer @ 2016-12-30 13:37 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg

Yes,
booting with pcie_port_pm=off
fixes both the Firefox issue and the lock screen issue. Suspend/resume also works.
Tested on 4.9.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30 13:37                               ` Kilian Singer
@ 2016-12-30 13:59                                 ` Kilian Singer
  2016-12-30 14:44                                   ` Rafael J. Wysocki
  2016-12-30 14:47                                 ` Rafael J. Wysocki
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2016-12-30 13:59 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg

The proposed patch alone did not change the issue.


^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30 13:59                                 ` Kilian Singer
@ 2016-12-30 14:44                                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30 14:44 UTC (permalink / raw)
  To: Kilian Singer; +Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg

On Friday, December 30, 2016 02:59:53 PM Kilian Singer wrote:
> The proposed patch alone did not change the issue.

I guess you mean https://patchwork.kernel.org/patch/9491693/.  If so, that
probably means that the runtime PM of the PCIe ports is enabled by user space
anyway.
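
One way to check that from user space, sketched under the assumption that the standard power/control and power/runtime_status sysfs attributes are present ("auto" in control means runtime PM is allowed, "on" means it is forbidden). The function name and the optional root argument are made up for this example.

```shell
# Print "<device> <control> <runtime_status>" for each device under a
# sysfs-style root (default: /sys/bus/pci/devices).
pci_rpm_state() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/power/control" ] || continue
        control=$(cat "$dev/power/control")
        # runtime_status may be missing on kernels built without PM support.
        status=$(cat "$dev/power/runtime_status" 2>/dev/null || echo unknown)
        echo "${dev##*/} $control $status"
    done
}
```

A port that shows "auto" even with the patch applied would suggest that something in user space wrote it after boot.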

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30 13:37                               ` Kilian Singer
  2016-12-30 13:59                                 ` Kilian Singer
@ 2016-12-30 14:47                                 ` Rafael J. Wysocki
  2017-01-02 12:22                                   ` Mika Westerberg
  1 sibling, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30 14:47 UTC (permalink / raw)
  To: Kilian Singer; +Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Mika Westerberg

On Friday, December 30, 2016 02:37:17 PM Kilian Singer wrote:
> Yes,
> booting with pcie_port_pm=off
> fixes both the Firefox issue and the lock screen issue. Suspend/resume also works.
> Tested on 4.9.

OK, thanks!

Please use that as a manual workaround for the time being.

I looked at the acpidump attached to the BZ entry, but nothing jumped out at me
immediately.  I'll let Mika take care of this going forward when he's back.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:19     ` Rafael J. Wysocki
@ 2016-12-30 14:48       ` Rafael J. Wysocki
  0 siblings, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2016-12-30 14:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Lukas Wunner, kilian.singer, linux-pci, Mika Westerberg,
	Rafael J. Wysocki

On Friday, December 30, 2016 01:19:14 AM Rafael J. Wysocki wrote:
> On Wednesday, December 28, 2016 10:18:16 AM Bjorn Helgaas wrote:
> > On Wed, Dec 28, 2016 at 12:29:54PM +0100, Lukas Wunner wrote:
> > > On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> > > > Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> > > > and all the debugging you've done.  Below is a revert of the troublesome
> > > > commit.  Can you test it and verify that it also fixes the problem?
> > > > 
> > > > I assume Mika is looking at this and will have a better solution soon.
> > > > But if not, I'll queue this up for v4.10.
> > > 
> > > @Kilian:  Are you using the proprietary nvidia driver?  If so,
> > > does the issue go away if you blacklist that driver or use nouveau
> > > instead?
> > > 
> > > 
> > > With a bit of googling I found dmesg and lspci output for this model:
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1437386
> > > 
> > > The keyboard and mouse seem to be PS/2 devices, accessed via I/O ports
> > > 0x60, 0x64.  I assume they're located behind the LPC bridge?
> > > 
> > > The proprietary nvidia driver has a bug, it locks the legacy PCI VGA
> > > registers with vga_tryget() but never releases that lock.  Intel
> > > chipsets have a quirk wherein I/O ports are routed to the bus to
> > > which the legacy PCI VGA registers are locked.  So once vga_tryget()
> > > is called by the nvidia driver, access to the keyboard and mouse is
> > > routed to bus 01 (on which the Nvidia card resides) and not to bus 00
> > > (on which the LPC bridge resides).
> > 
> > Interesting.  A spec reference would be a good addition to whatever
> > fix is proposed.
> > 
> > > My theory would be that if you lock the screen, the Nvidia card
> > > runtime suspends, allowing the root port above it to suspend,
> > > and then the I/O ports are no longer accessible.
> > > 
> > > We have a similar issue on dual GPU MacBook Pros:  Backlight brightness
> > > is adjusted by writing to I/O ports of a gmux controller situated below
> > > the LPC bridge.  The nvidia driver locks the legacy VGA registers and
> > > from that point reads from the I/O ports always return 0xff.  Commit
> > > 4eebd5a4e726 ("apple-gmux: lock iGP IO to protect from vgaarb changes")
> > > sought to fix it but caused other breakage which remains unfixed so far:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=105051
> > > https://bugzilla.kernel.org/show_bug.cgi?id=88861#c11
> > > 
> > > I've always wondered if the Intel chipsets would behave more sensibly
> > > if the LPC bridge had BARs specifying the I/O regions used by devices
> > > below it.
> > > 
> > > Reverting runtime suspend for PCIe ports is not a good solution as it's
> > > needed for Thunderbolt runtime PM on Macs.
> > 
> > The choices are:
> > 
> >   1) Fix the regression and preserve runtime PM for PCIe ports
> >   2) Fix the regression by reverting runtime PM for PCIe ports
> > 
> > Obviously we hope for 1).  Preserving runtime PM without fixing the
> > regression isn't even on the list.  I know this is Linux 101, so I
> > apologize for restating the obvious.
> 
> There are a couple of obvious things we can do other than reverting, though.
> 
> For example, we could change the cutoff date we have in there to cover Kilian's
> system.

And I hope you realize that this revert isn't even sufficient to fix Kilian's
machine entirely (system suspend/resume issues will remain after it).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30  0:16                 ` Kilian Singer
  2016-12-30  0:24                   ` Kilian Singer
@ 2017-01-02 11:40                   ` Lukas Wunner
  2017-01-02 12:10                     ` Mika Westerberg
                                       ` (2 more replies)
  1 sibling, 3 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-02 11:40 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 2014 bytes --]

On Fri, Dec 30, 2016 at 01:16:17AM +0100, Kilian Singer wrote:
> I did the debug message on the 4.10-rc1 for now. I could go back to 4.9
> if that helps but needs some time again to compile.
> The debug messages from the first rpm_... to the crash are:
[...]
> [   24.831417] nouveau 0000:01:00.0: rpm_suspend
> [   24.831427] nouveau 0000:01:00.0: DRM: suspending console...
> [   24.831432] nouveau 0000:01:00.0: DRM: suspending display...
> [   24.831477] nouveau 0000:01:00.0: DRM: evicting buffers...
> [   24.865243] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
> [   24.865269] nouveau 0000:01:00.0: DRM: suspending client object trees...
> [   24.870724] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
> [   26.080300] thinkpad_acpi: EC reports that Thermal Table has changed
> [   26.207691] pcieport 0000:00:01.0: rpm_idle
> [   26.207693] pcieport 0000:00:01.0: rpm_suspend
> [   28.927640] snd_hda_codec_hdmi hdaudioC0D0: rpm_suspend
> SYSTEM IS NOW NOT RESPONSIVE

So two seconds before the system became unresponsive, the root port above
the discrete GPU suspended, suggesting that's the culprit.  Could you test
either of the attached patches to confirm this theory?  They disable
runtime PM on this specific root port but allow it on all the others.

You've got an Optimus laptop, i.e. power to the discrete GPU can be cut.
Traditionally this is achieved by invoking an ACPI _DSM (Device Specific
Method).  That's what we did up until v4.7.

However on newer laptops Windows no longer cuts power to the discrete GPU
by invoking the _DSM, but rather by suspending the root port above the
GPU.  (More specifically by turning off Power Resources required for D3
of the root port, those are specified in a _PR3 object.)  We started
supporting this with v4.8.

If the above theory is correct, we need to involve Optimus experts
because this is not an issue then with powering down root ports in
general, but rather specific to this Optimus use case.

Thanks,

Lukas

[-- Attachment #2: disable_pm_v4.9.patch --]
[-- Type: text/plain, Size: 439 bytes --]

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ba34907..c9dd1e0 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2242,6 +2242,10 @@ static bool pci_bridge_d3_possible(struct pci_dev *bridge)
 		if (pci_bridge_d3_force)
 			return true;
 
+		if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
+		    bridge->device == 0x0c01)
+			return false;
+
 		/*
 		 * It should be safe to put PCIe ports from 2015 or newer
 		 * to D3.

[-- Attachment #3: disable_pm_v4.10-rc1.patch --]
[-- Type: text/plain, Size: 494 bytes --]

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a881c0d..2e4b32bd 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2240,6 +2240,10 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
 		if (pci_bridge_d3_disable)
 			return false;
 
+		if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
+		    bridge->device == 0x0c01)
+			return false;
+
 		/*
 		 * Hotplug ports handled by firmware in System Management Mode
 		 * may not be put into D3 by the OS (Thunderbolt on non-Macs).

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 11:40                   ` Lukas Wunner
@ 2017-01-02 12:10                     ` Mika Westerberg
  2017-01-02 13:53                       ` Mika Westerberg
                                         ` (2 more replies)
  2017-01-03 16:59                     ` Kilian Singer
  2017-01-03 17:08                     ` Kilian Singer
  2 siblings, 3 replies; 115+ messages in thread
From: Mika Westerberg @ 2017-01-02 12:10 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Kilian Singer, Bjorn Helgaas, linux-pci, Rafael J. Wysocki

On Mon, Jan 02, 2017 at 12:40:40PM +0100, Lukas Wunner wrote:
> On Fri, Dec 30, 2016 at 01:16:17AM +0100, Kilian Singer wrote:
> > I did the debug message on the 4.10-rc1 for now. I could go back to 4.9
> > if that helps but needs some time again to compile.
> > The debug messages from the first rpm_... to the crash are:
> [...]
> > [   24.831417] nouveau 0000:01:00.0: rpm_suspend
> > [   24.831427] nouveau 0000:01:00.0: DRM: suspending console...
> > [   24.831432] nouveau 0000:01:00.0: DRM: suspending display...
> > [   24.831477] nouveau 0000:01:00.0: DRM: evicting buffers...
> > [   24.865243] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
> > [   24.865269] nouveau 0000:01:00.0: DRM: suspending client object trees...
> > [   24.870724] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
> > [   26.080300] thinkpad_acpi: EC reports that Thermal Table has changed
> > [   26.207691] pcieport 0000:00:01.0: rpm_idle
> > [   26.207693] pcieport 0000:00:01.0: rpm_suspend
> > [   28.927640] snd_hda_codec_hdmi hdaudioC0D0: rpm_suspend
> > SYSTEM IS NOW NOT RESPONSIVE
> 
> So two seconds before the system became unresponsive, the root port above
> the discrete GPU suspended, suggesting that's the culprit.  Could you test
> either of the attached patches to confirm this theory?  They disable
> runtime PM on this specific root port but allow it on all the others.
> 
> You've got an Optimus laptop, i.e. power to the discrete GPU can be cut.
> Traditionally this is achieved by invoking an ACPI _DSM (Device Specific
> Method).  That's what we did up until v4.7.
> 
> However on newer laptops Windows no longer cuts power to the discrete GPU
> by invoking the _DSM, but rather by suspending the root port above the
> GPU.  (More specifically by turning off Power Resources required for D3
> of the root port, those are specified in a _PR3 object.)  We started
> supporting this with v4.8.
> 
> If the above theory is correct, we need to involve Optimus experts
> because this is not an issue then with powering down root ports in
> general, but rather specific to this Optimus use case.

[Back from vacation now]

I've checked the acpidump of this machine and it does not seem to be a
traditional Optimus machine. At least this one is missing the magic _DSM
which is used to gather capabilities of the graphics device.

However, it does have _PR3 and it is attached to the device
(_SB.PCI0.PEG) itself, not the root port.

One thing you could try in addition to Lukas' patches is just to prevent
D3cold from the device by doing this:

  # echo 0 > /sys/bus/pci/devices/0000:01:00.0/d3cold_allowed

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-30 14:47                                 ` Rafael J. Wysocki
@ 2017-01-02 12:22                                   ` Mika Westerberg
  2017-01-03 17:12                                     ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: Mika Westerberg @ 2017-01-02 12:22 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Kilian Singer, Lukas Wunner, Bjorn Helgaas, linux-pci

On Fri, Dec 30, 2016 at 03:47:31PM +0100, Rafael J. Wysocki wrote:
> On Friday, December 30, 2016 02:37:17 PM Kilian Singer wrote:
> > Yes,
> > the pci_port_pm=off
> > fixes both the firefox issue and the lock screen issue. Also suspend/resume work.
> > Tested on 4.9
> 
> OK, thanks!
> 
> Please use that as a manual workaround for the time being.
> 
> I looked at the acpidump attached to the BZ entry, but nothing jumped up at me
> immediately.  I'll let Mika take care of this going forward when he's back.

Thanks Rafael and Lukas for the help. I'm now back from my vacation so I
can start investigating this as well.

Kilian, can you attach full output of 'sudo lspci -vv' to the bug? The
one in the comments is pretty hard to read.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 12:10                     ` Mika Westerberg
@ 2017-01-02 13:53                       ` Mika Westerberg
  2017-01-02 14:48                       ` Mika Westerberg
  2017-01-03 17:10                       ` Kilian Singer
  2 siblings, 0 replies; 115+ messages in thread
From: Mika Westerberg @ 2017-01-02 13:53 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Kilian Singer, Bjorn Helgaas, linux-pci, Rafael J. Wysocki

On Mon, Jan 02, 2017 at 02:10:19PM +0200, Mika Westerberg wrote:
> I've checked the acpidump of this machine and it does not seem to be a
> traditional Optimus machine. At least this one is missing the magic _DSM
> which is used to gather capabilities of the graphics device.
> 
> However, it does have _PR3 and it is attached to the device
> (_SB.PCI0.PEG) itself, not the root port.
> 
> One thing you could try in addition to Lukas' patches is just to prevent
> D3cold from the device by doing this:
> 
>   # echo 0 > /sys/bus/pci/devices/0000:01:00.0/d3cold_allowed

The following messages suggest that the device fails to resume properly from D3cold:

Dec 30 08:45:06 klaptop kernel: [   27.775949] nouveau 0000:01:00.0: rpm_resume
Dec 30 08:45:06 klaptop kernel: [   27.776316] nouveau 0000:01:00.0: Refused to change power state, currently in D3
Dec 30 08:45:06 klaptop kernel: [   27.836049] nouveau 0000:01:00.0: Refused to change power state, currently in D3
Dec 30 08:45:06 klaptop kernel: [   27.836053] nouveau 0000:01:00.0: Refused to change power state, currently in D3

This happens if we read back 0xffffffff from PM register.

Dec 30 08:45:06 klaptop kernel: [   27.836055] nouveau 0000:01:00.0: DRM: resuming kernel object tree...
Dec 30 08:45:06 klaptop kernel: [   27.836127] nouveau 0000:01:00.0: pci: failed to adjust cap speed
Dec 30 08:45:06 klaptop kernel: [   27.836131] nouveau 0000:01:00.0: pci: failed to adjust lnkctl speed

Preventing D3cold should at least show some difference on resume path.
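
As far as I can tell, the "currently in D3" wording follows from how the
power state is decoded: the D-state lives in the low two bits of the PM
control/status register, so a read that comes back as all ones (which is
what an inaccessible device typically returns) decodes as state 3. A
minimal userspace sketch, assuming the standard state mask of 0x0003; the
helper names are hypothetical and this is not the kernel's own code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Low two bits of the PCI PM control/status register hold the D-state. */
#define PM_CTRL_STATE_MASK 0x0003u

/* Hypothetical helper: decode the D-state (0..3) from a PMCSR value. */
static unsigned int pmcsr_to_dstate(uint16_t pmcsr)
{
	return pmcsr & PM_CTRL_STATE_MASK;
}

/* Hypothetical helper: an all-ones read suggests the device did not respond. */
static bool pmcsr_read_looks_dead(uint16_t pmcsr)
{
	return pmcsr == 0xffffu;
}
```

With these helpers, an all-ones value is both flagged as a dead read and
decoded as D3, which would match the repeated "Refused to change power
state" lines above even though the device never actually answered.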

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 12:10                     ` Mika Westerberg
  2017-01-02 13:53                       ` Mika Westerberg
@ 2017-01-02 14:48                       ` Mika Westerberg
  2017-01-02 21:31                         ` Rafael J. Wysocki
  2017-01-03 17:10                       ` Kilian Singer
  2 siblings, 1 reply; 115+ messages in thread
From: Mika Westerberg @ 2017-01-02 14:48 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Kilian Singer, Bjorn Helgaas, linux-pci, Rafael J. Wysocki

On Mon, Jan 02, 2017 at 02:10:19PM +0200, Mika Westerberg wrote:
> I've checked the acpidump of this machine and it does not seem to be a
> traditional Optimus machine. At least this one is missing the magic _DSM
> which is used to gather capabilities of the graphics device.
> 
> However, it does have _PR3 and it is attached to the device
> (_SB.PCI0.PEG) itself, not the root port.

Nah, actually PEG is the root port. So it certainly looks like
a traditional Optimus machine.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 14:48                       ` Mika Westerberg
@ 2017-01-02 21:31                         ` Rafael J. Wysocki
  2017-01-03  9:51                           ` Mika Westerberg
  0 siblings, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-02 21:31 UTC (permalink / raw)
  To: Mika Westerberg; +Cc: Lukas Wunner, Kilian Singer, Bjorn Helgaas, linux-pci

On Monday, January 02, 2017 04:48:52 PM Mika Westerberg wrote:
> On Mon, Jan 02, 2017 at 02:10:19PM +0200, Mika Westerberg wrote:
> > I've checked the acpidump of this machine and it does not seem to be a
> > traditional Optimus machine. At least this one is missing the magic _DSM
> > which is used to gather capabilities of the graphics device.
> > 
> > However, it does have _PR3 and it is attached to the device
> > (_SB.PCI0.PEG) itself, not the root port.
> 
> Nah, actually PEG is the root port. So it certainly looks like
> a traditional Optimus machine.

So can we quirk that thing somehow and see if that helps (for debugging
purposes at least)?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 21:31                         ` Rafael J. Wysocki
@ 2017-01-03  9:51                           ` Mika Westerberg
  2017-01-03 15:15                             ` Peter Wu
  0 siblings, 1 reply; 115+ messages in thread
From: Mika Westerberg @ 2017-01-03  9:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lukas Wunner, Kilian Singer, Bjorn Helgaas, linux-pci, Peter Wu

On Mon, Jan 02, 2017 at 10:31:07PM +0100, Rafael J. Wysocki wrote:
> On Monday, January 02, 2017 04:48:52 PM Mika Westerberg wrote:
> > On Mon, Jan 02, 2017 at 02:10:19PM +0200, Mika Westerberg wrote:
> > > I've checked the acpidump of this machine and it does not seem to be a
> > > traditional Optimus machine. At least this one is missing the magic _DSM
> > > which is used to gather capabilities of the graphics device.
> > > 
> > > However, it does have _PR3 and it is attached to the device
> > > (_SB.PCI0.PEG) itself, not the root port.
> > 
> > Nah, actually PEG is the root port. So it certainly looks like
> > a traditional Optimus machine.
> 
> So can we quirk that thing somehow and see if that helps (for debugging
> purposes at least)?

I was kind of hoping disabling D3cold would do that (prevent it from
turning off power resources). But we can also just force it to use _DSM
instead and see if it makes a difference.

I guess the reason why keyboard and mouse become unresponsive is because
the driver tries to resume the device and hogs the CPU. At least it
looks like so from the dmesg in comment 27 (of the bugzilla bug) where
NMI watchdog is triggered.

Since this might be related to nouveau, adding Peter Wu to the loop.
Peter the bug in question is https://bugzilla.kernel.org/show_bug.cgi?id=190861.

Kilian, can you try the following hack as well?

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 193573d191e5..50482d5c8072 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -282,7 +282,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle *dhandle_out
 			 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " : "",
 			 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec supported" : "");
 
-		*has_pr3 = nouveau_pr3_present(pdev);
+//		*has_pr3 = nouveau_pr3_present(pdev);
 	}
 }
 

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03  9:51                           ` Mika Westerberg
@ 2017-01-03 15:15                             ` Peter Wu
  2017-01-03 16:11                               ` Lukas Wunner
  2017-01-03 17:37                               ` Kilian Singer
  0 siblings, 2 replies; 115+ messages in thread
From: Peter Wu @ 2017-01-03 15:15 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Rafael J. Wysocki, Lukas Wunner, Kilian Singer, Bjorn Helgaas, linux-pci

(replying to earlier comments in the thread:)

Changing (lowering?) the cut-off date would not help as the laptop has
DMI year 2016. (For the long-term, it would probably be desirable to
lower the date or otherwise add detection of _PR3, see
https://bugs.freedesktop.org/show_bug.cgi?id=98505#c23).
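
As a rough illustration of why moving the cut-off earlier cannot exclude
this laptop: the check quoted in the patches earlier in the thread allows
D3 for ports from "2015 or newer". A minimal sketch, with a hypothetical
helper name and the year passed in directly rather than read from DMI:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the BIOS-year test; not the kernel function. */
static bool bridge_d3_allowed_by_year(int bios_year, int cutoff_year)
{
	return bios_year >= cutoff_year;
}
```

With a BIOS year of 2016, a cutoff of 2015 or anything lower still returns
true. Only a cutoff above 2016 would return false, and that would also
exclude every other machine of the same vintage.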

Reverting the patch is not a good idea either: it would reintroduce the
memory corruption that has plagued some Lenovo models
(https://bugs.freedesktop.org/show_bug.cgi?id=78530).

On Tue, Jan 03, 2017 at 11:51:58AM +0200, Mika Westerberg wrote:
> On Mon, Jan 02, 2017 at 10:31:07PM +0100, Rafael J. Wysocki wrote:
> > On Monday, January 02, 2017 04:48:52 PM Mika Westerberg wrote:
> > > On Mon, Jan 02, 2017 at 02:10:19PM +0200, Mika Westerberg wrote:
> > > > I've checked the acpidump of this machine and it does not seem to be a
> > > > traditional Optimus machine. At least this one is missing the magic _DSM
> > > > which is used to gather capabilities of the graphics device.
> > > > 
> > > > However, it does have _PR3 and it is attached to the device
> > > > (_SB.PCI0.PEG) itself, not the root port.
> > > 
> > > Nah, actually PEG is the root port. So it certainly looks like
> > > a traditional Optimus machine.
> > 
> > So can we quirk that thing somehow and see if that helps (for debugging
> > purposes at least)?
> 
> I was kind of hoping disabling D3cold would do that (prevent it from
> turning off power resources). But we can also just force it to use _DSM
> instead and see if it makes a difference.

Disabling d3cold that way might be too late due to the short RPM suspend
delay. You would need a udev rule to activate this ASAP. E.g., create
/etc/udev/rules.d/42-nvidia-rpm.rules with:

    SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", ATTR{power/d3cold_allowed}="0"

This disables D3cold on the child device (which should also prevent the
parent PCIe port from using D3cold).

Alternatively, can you try to boot with nouveau.runpm=0 and see if it
makes any difference? When runpm is disabled, then the PCIe port and
Nvidia device should not be suspended and therefore prevent the issue
from being triggered.

> I guess the reason why keyboard and mouse become unresponsive is because
> the driver tries to resume the device and hogs the CPU. At least it
> looks like so from the dmesg in comment 27 (of the bugzilla bug) where
> NMI watchdog is triggered.
> 
> Since this might be related to nouveau, adding Peter Wu to the loop.
> Peter the bug in question is https://bugzilla.kernel.org/show_bug.cgi?id=190861.

Kilian, in the bug you had the issue with Firefox. The trace suggests
that runtime resume was triggered, so you should have this problem too
when using lspci. Can you try:

 1. Switch to a text console (e.g. Ctrl-Alt-F2).
 2. sleep 5; lspci

If that command does not return immediately, you likely have triggered
the same issue.

The acpidump from the bug does not show known issues, it *looks* fine.
There have been other issues related to resuming power on newer Nvidia
hardware (https://bugs.freedesktop.org/show_bug.cgi?id=94725,
https://bugzilla.kernel.org/show_bug.cgi?id=156341) but there is not
much progress here.  (The last time I traced the PCIe register accesses
(via kprobes) and tried to disable some of those, it still did not help
with preventing the power issue.)

> Kilian, can you try the following hack as well?
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> index 193573d191e5..50482d5c8072 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> @@ -282,7 +282,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle *dhandle_out
>  			 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " : "",
>  			 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec supported" : "");
>  
> -		*has_pr3 = nouveau_pr3_present(pdev);
> +//		*has_pr3 = nouveau_pr3_present(pdev);
>  	}
>  }
>  

This would not disable D3cold support and as a result both PR3 and DSM
would be active. Try the above with this line added to force DSM:

    pci_d3cold_disable(pdev);

(This should have the same effect as setting d3cold_allowed=0.)
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 15:15                             ` Peter Wu
@ 2017-01-03 16:11                               ` Lukas Wunner
  2017-01-03 16:31                                 ` Peter Wu
  2017-01-03 21:26                                 ` Rafael J. Wysocki
  2017-01-03 17:37                               ` Kilian Singer
  1 sibling, 2 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-03 16:11 UTC (permalink / raw)
  To: Peter Wu
  Cc: Mika Westerberg, Rafael J. Wysocki, Kilian Singer, Bjorn Helgaas,
	linux-pci, Dave Airlie

[cc += Dave Airlie:

Dave, we're about to lose support for newer Optimus laptops which use
_PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
a commit on his for-linus branch to remove runtime PM for PCIe ports.
This fixes a regression on Kilian Singer's laptop on which locking the
screen breaks USB and PS/2 input devices:  Mouse movements are still
visible, but button or key presses no longer have any effect.  The GPU
is powered down upon locking the screen and the current theory is that
this causes the issues.]

On Tue, Jan 03, 2017 at 04:15:47PM +0100, Peter Wu wrote:
> The acpidump from the bug does not show known issues, it *looks* fine.
> There have been other issues related to resuming power on newer Nvidia
> hardware (https://bugs.freedesktop.org/show_bug.cgi?id=94725,
> https://bugzilla.kernel.org/show_bug.cgi?id=156341) but there is not
> much progress here.  (The last time I traced the PCIe register accesses
> (via kprobes) and tried to disable some of those, it still did not help
> with preventing the power issue.)

It seems that the _DSM method works on Kilian's laptop.  Would it be
viable to default to _DSM if it's available, and only use _PR3 if not?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 16:11                               ` Lukas Wunner
@ 2017-01-03 16:31                                 ` Peter Wu
  2017-01-03 16:44                                   ` Deucher, Alexander
                                                     ` (2 more replies)
  2017-01-03 21:26                                 ` Rafael J. Wysocki
  1 sibling, 3 replies; 115+ messages in thread
From: Peter Wu @ 2017-01-03 16:31 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Mika Westerberg, Rafael J. Wysocki, Kilian Singer, Bjorn Helgaas,
	linux-pci, Alex Deucher, Dave Airlie

On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> [cc += Dave Airlie:
> 
> Dave, we're about to lose support for newer Optimus laptops which use
> _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> a commit on his for-linus branch to remove runtime PM for PCIe ports.
> This fixes a regression on Kilian Singer's laptop on which locking the
> screen breaks USB and PS/2 input devices:  Mouse movements are still
> visible, but button or key presses no longer have any effect.  The GPU
> is powered down upon locking the screen and the current theory is that
> this causes the issues.]

(+cc Alex: this might affect amdgpu/radeon too.)

Bjorn, please reconsider the rpm patch. Reverting support would
introduce other regressions (see issues below) and make future
Thunderbolt work harder (according to Lukas). If Kilian's laptop has
issues, what about a "temporary" quirk?

> On Tue, Jan 03, 2017 at 04:15:47PM +0100, Peter Wu wrote:
> > The acpidump from the bug does not show known issues, it *looks* fine.
> > There have been other issues related to resuming power on newer Nvidia
> > hardware (https://bugs.freedesktop.org/show_bug.cgi?id=94725,
> > https://bugzilla.kernel.org/show_bug.cgi?id=156341) but there is not
> > much progress here.  (The last time I traced the PCIe register accesses
> > (via kprobes) and tried to disable some of those, it still did not help
> > with preventing the power issue.)
> 
> It seems that the _DSM method works on Kilian's laptop.  Would it be
> viable to default to _DSM if it's available, and only use _PR3 if not?

DSM should not be preferred when PR3 is available:

 - After MS introduced D3cold (PR3) support to Win8+, vendors are
   unlikely to test legacy DSM and the likelihood of breakage increases.
 - On one Lenovo laptop, the DSM method causes memory corruption while
   PR3 fixes this problem.
 - On some laptops, DSM keeps the fan on while PR3 stopped the noise.
 - On some laptops, DSM does not really power off the GPU and results in
   increased power consumption during runtime/system sleep. PR3 fully
   removes the power, as desired.
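
The preference argued for above, combined with a per-machine fallback,
could be sketched as a small selection policy. This is only an
illustration: the enum and function names are hypothetical, it is not
nouveau's actual code, and the d3cold_allowed input stands in for a quirk
or the sysfs override discussed earlier in the thread:

```c
#include <stdbool.h>

enum gpu_power_method {
	GPU_POWER_NONE,	/* no known way to cut power */
	GPU_POWER_DSM,	/* legacy Optimus _DSM */
	GPU_POWER_PR3,	/* power resources on the parent port (_PR3) */
};

/* Prefer _PR3 when present and permitted; otherwise fall back to _DSM. */
static enum gpu_power_method
pick_power_method(bool has_pr3, bool has_dsm, bool d3cold_allowed)
{
	if (has_pr3 && d3cold_allowed)
		return GPU_POWER_PR3;
	if (has_dsm)
		return GPU_POWER_DSM;
	return GPU_POWER_NONE;
}
```

Under this sketch, a machine with both methods uses _PR3 by default, and
only a machine explicitly marked as unable to use D3cold drops back to
_DSM.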
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl

^ permalink raw reply	[flat|nested] 115+ messages in thread

* RE: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 16:31                                 ` Peter Wu
@ 2017-01-03 16:44                                   ` Deucher, Alexander
  2017-01-03 18:09                                   ` Lukas Wunner
  2017-01-03 18:12                                   ` Bjorn Helgaas
  2 siblings, 0 replies; 115+ messages in thread
From: Deucher, Alexander @ 2017-01-03 16:44 UTC (permalink / raw)
  To: 'Peter Wu', Lukas Wunner
  Cc: Mika Westerberg, Rafael J. Wysocki, Kilian Singer, Bjorn Helgaas,
	linux-pci, Dave Airlie

> -----Original Message-----
> From: Peter Wu [mailto:peter@lekensteyn.nl]
> Sent: Tuesday, January 03, 2017 11:32 AM
> To: Lukas Wunner
> Cc: Mika Westerberg; Rafael J. Wysocki; Kilian Singer; Bjorn Helgaas; linux-pci;
> Deucher, Alexander; Dave Airlie
> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
>
> On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > [cc += Dave Airlie:
> >
> > Dave, we're about to lose support for newer Optimus laptops which use
> > _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> > a commit on his for-linus branch to remove runtime PM for PCIe ports.
> > This fixes a regression on Kilian Singer's laptop on which locking the
> > screen breaks USB and PS/2 input devices:  Mouse movements are still
> > visible, but button or key presses no longer have any effect.  The GPU
> > is powered down upon locking the screen and the current theory is that
> > this causes the issues.]
>
> (+cc Alex: this might affect amdgpu/radeon too.)
>
> Bjorn, please reconsider the rpm patch. Reverting support would
> introduce other regressions (see issues below) and make future
> Thunderbolt work harder (according to Lukas). If Kilian's laptop has
> issues, what about a "temporary" quirk?
>
> > On Tue, Jan 03, 2017 at 04:15:47PM +0100, Peter Wu wrote:
> > > The acpidump from the bug does not show known issues, it *looks* fine.
> > > There have been other issues related to resuming power on newer Nvidia
> > > hardware (https://bugs.freedesktop.org/show_bug.cgi?id=94725,
> > > https://bugzilla.kernel.org/show_bug.cgi?id=156341) but there is not
> > > much progress here.  (The last time I traced the PCIe register accesses
> > > (via kprobes) and tried to disable some of those, it still did not help
> > > with preventing the power issue.)
> >
> > It seems that the _DSM method works on Kilian's laptop.  Would it be
> > viable to default to _DSM if it's available, and only use _PR3 if not?
>
> DSM should not be preferred when PR3 is available:
>
>  - After MS introduced D3cold (PR3) support to Win8+, vendors are
>    unlikely to test legacy DSM and the likelihood of breakage increases.
>  - On one Lenovo laptop, the DSM method causes memory corruption while
>    PR3 fixes this problem.
>  - On some laptops, DSM keeps the fan on while PR3 stopped the noise.
>  - On some laptops, DSM does not really power off the GPU and results in
>    increased power consumption during runtime/system sleep. PR3 fully
>    removes the power, as desired.

Yes, this will affect just about all AMD multi-GPU laptops from late 2013
onward.  I'd much prefer a temporary quirk for this specific laptop than to
disable PR3 for everything.

Alex

> --
> Kind regards,
> Peter Wu
> https://lekensteyn.nl

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 11:40                   ` Lukas Wunner
  2017-01-02 12:10                     ` Mika Westerberg
@ 2017-01-03 16:59                     ` Kilian Singer
  2017-01-03 17:08                     ` Kilian Singer
  2 siblings, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-03 16:59 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

I tried the 4.9.0 kernel and the patch fixes both the screen lock and firefox issue.


----- Original Message -----
From: "Lukas Wunner" <lukas@wunner.de>
To: "Kilian Singer" <kilian.singer@quantumtechnology.info>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Sent: Monday, January 2, 2017 12:40:40 PM
Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"

On Fri, Dec 30, 2016 at 01:16:17AM +0100, Kilian Singer wrote:
> I did the debug message on the 4.10-rc1 for now. I could go back to 4.9
> if that helps but needs some time again to compile.
> The debug messages from the first rpm_... to the crash are:
[...]
> [   24.831417] nouveau 0000:01:00.0: rpm_suspend
> [   24.831427] nouveau 0000:01:00.0: DRM: suspending console...
> [   24.831432] nouveau 0000:01:00.0: DRM: suspending display...
> [   24.831477] nouveau 0000:01:00.0: DRM: evicting buffers...
> [   24.865243] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
> [   24.865269] nouveau 0000:01:00.0: DRM: suspending client object trees...
> [   24.870724] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
> [   26.080300] thinkpad_acpi: EC reports that Thermal Table has changed
> [   26.207691] pcieport 0000:00:01.0: rpm_idle
> [   26.207693] pcieport 0000:00:01.0: rpm_suspend
> [   28.927640] snd_hda_codec_hdmi hdaudioC0D0: rpm_suspend
> SYSTEM IS NOW NOT RESPONSIVE

So two seconds before the system became unresponsive, the root port above
the discrete GPU suspended, suggesting that's the culprit.  Could you test
either of the attached patches to confirm this theory?  They disable
runtime PM on this specific root port but allow it on all the others.

You've got an Optimus laptop, i.e. power to the discrete GPU can be cut.
Traditionally this is achieved by invoking an ACPI _DSM (Device Specific
Method).  That's what we did up until v4.7.

However on newer laptops Windows no longer cuts power to the discrete GPU
by invoking the _DSM, but rather by suspending the root port above the
GPU.  (More specifically by turning off Power Resources required for D3
of the root port, those are specified in a _PR3 object.)  We started
supporting this with v4.8.

If the above theory is correct, we need to involve Optimus experts
because this is not an issue then with powering down root ports in
general, but rather specific to this Optimus use case.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 11:40                   ` Lukas Wunner
  2017-01-02 12:10                     ` Mika Westerberg
  2017-01-03 16:59                     ` Kilian Singer
@ 2017-01-03 17:08                     ` Kilian Singer
  2 siblings, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-03 17:08 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, linux-pci, Mika Westerberg, Rafael J. Wysocki

Sorry, I should have mentioned which patch I used.
I tried the 4.9.0 kernel, and the patch
disable_pm_v4.9.patch
fixes both the screen lock and Firefox issues.

----- Original Message -----
From: "Lukas Wunner" <lukas@wunner.de>
To: "Kilian Singer" <kilian.singer@quantumtechnology.info>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Sent: Monday, January 2, 2017 12:40:40 PM
Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"

On Fri, Dec 30, 2016 at 01:16:17AM +0100, Kilian Singer wrote:
> I did the debug message on the 4.10-rc1 for now. I could go back to 4.9
> if that helps but needs some time again to compile.
> The debug messages from the first rpm_... to the crash are:
[...]
> [   24.831417] nouveau 0000:01:00.0: rpm_suspend
> [   24.831427] nouveau 0000:01:00.0: DRM: suspending console...
> [   24.831432] nouveau 0000:01:00.0: DRM: suspending display...
> [   24.831477] nouveau 0000:01:00.0: DRM: evicting buffers...
> [   24.865243] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
> [   24.865269] nouveau 0000:01:00.0: DRM: suspending client object trees...
> [   24.870724] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
> [   26.080300] thinkpad_acpi: EC reports that Thermal Table has changed
> [   26.207691] pcieport 0000:00:01.0: rpm_idle
> [   26.207693] pcieport 0000:00:01.0: rpm_suspend
> [   28.927640] snd_hda_codec_hdmi hdaudioC0D0: rpm_suspend
> SYSTEM IS NOW NOT RESPONSIVE

So two seconds before the system became unresponsive, the root port above
the discrete GPU suspended, suggesting that's the culprit.  Could you test
either of the attached patches to confirm this theory?  They disable
runtime PM on this specific root port but allow it on all the others.

You've got an Optimus laptop, i.e. power to the discrete GPU can be cut.
Traditionally this is achieved by invoking an ACPI _DSM (Device Specific
Method).  That's what we did up until v4.7.

However on newer laptops Windows no longer cuts power to the discrete GPU
by invoking the _DSM, but rather by suspending the root port above the
GPU.  (More specifically by turning off Power Resources required for D3
of the root port, those are specified in a _PR3 object.)  We started
supporting this with v4.8.

If the above theory is correct, we need to involve Optimus experts
because this is not an issue then with powering down root ports in
general, but rather specific to this Optimus use case.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 12:10                     ` Mika Westerberg
  2017-01-02 13:53                       ` Mika Westerberg
  2017-01-02 14:48                       ` Mika Westerberg
@ 2017-01-03 17:10                       ` Kilian Singer
  2 siblings, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-03 17:10 UTC (permalink / raw)
  To: Mika Westerberg; +Cc: Lukas Wunner, Bjorn Helgaas, linux-pci, Rafael J. Wysocki

This locks up the bash shell in which it was executed, but the system stays responsive.
Locking the screen still makes the system unresponsive.

----- Original Message -----
From: "Mika Westerberg" <mika.westerberg@linux.intel.com>
To: "Lukas Wunner" <lukas@wunner.de>
Cc: "Kilian Singer" <kilian.singer@quantumtechnology.info>, "Bjorn Helgaas" <helgaas@kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>, "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Sent: Monday, January 2, 2017 1:10:19 PM
Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"

On Mon, Jan 02, 2017 at 12:40:40PM +0100, Lukas Wunner wrote:
> On Fri, Dec 30, 2016 at 01:16:17AM +0100, Kilian Singer wrote:
> > I did the debug message on the 4.10-rc1 for now. I could go back to 4.9
> > if that helps but needs some time again to compile.
> > The debug messages from the first rpm_... to the crash are:
> [...]
> > [   24.831417] nouveau 0000:01:00.0: rpm_suspend
> > [   24.831427] nouveau 0000:01:00.0: DRM: suspending console...
> > [   24.831432] nouveau 0000:01:00.0: DRM: suspending display...
> > [   24.831477] nouveau 0000:01:00.0: DRM: evicting buffers...
> > [   24.865243] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
> > [   24.865269] nouveau 0000:01:00.0: DRM: suspending client object trees...
> > [   24.870724] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
> > [   26.080300] thinkpad_acpi: EC reports that Thermal Table has changed
> > [   26.207691] pcieport 0000:00:01.0: rpm_idle
> > [   26.207693] pcieport 0000:00:01.0: rpm_suspend
> > [   28.927640] snd_hda_codec_hdmi hdaudioC0D0: rpm_suspend
> > SYSTEM IS NOW NOT RESPONSIVE
> 
> So two seconds before the system became unresponsive, the root port above
> the discrete GPU suspended, suggesting that's the culprit.  Could you test
> either of the attached patches to confirm this theory?  They disable
> runtime PM on this specific root port but allow it on all the others.
> 
> You've got an Optimus laptop, i.e. power to the discrete GPU can be cut.
> Traditionally this is achieved by invoking an ACPI _DSM (Device Specific
> Method).  That's what we did up until v4.7.
> 
> However on newer laptops Windows no longer cuts power to the discrete GPU
> by invoking the _DSM, but rather by suspending the root port above the
> GPU.  (More specifically by turning off Power Resources required for D3
> of the root port, those are specified in a _PR3 object.)  We started
> supporting this with v4.8.
> 
> If the above theory is correct, we need to involve Optimus experts
> because this is not an issue then with powering down root ports in
> general, but rather specific to this Optimus use case.

[Back from vacation now]

I've checked the acpidump of this machine and it does not seem to be a
traditional Optimus machine. At least this one is missing the magic _DSM
which is used to gather capabilities of the graphics device.

However, it does have _PR3 and it is attached to the device
(_SB.PCI0.PEG) itself, not the root port.

One thing you could try in addition to Lukas' patches is just to prevent
D3cold on the device by doing this:

  # echo 0 > /sys/bus/pci/devices/0000:01:00.0/d3cold_allowed

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-02 12:22                                   ` Mika Westerberg
@ 2017-01-03 17:12                                     ` Kilian Singer
  0 siblings, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-03 17:12 UTC (permalink / raw)
  To: Mika Westerberg; +Cc: Rafael J. Wysocki, Lukas Wunner, Bjorn Helgaas, linux-pci

I just attached the output to the bug report.

----- Original Message -----
From: "Mika Westerberg" <mika.westerberg@linux.intel.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: "Kilian Singer" <kilian.singer@quantumtechnology.info>, "Lukas Wunner" <lukas@wunner.de>, "Bjorn Helgaas" <helgaas@kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>
Sent: Monday, January 2, 2017 1:22:28 PM
Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"

On Fri, Dec 30, 2016 at 03:47:31PM +0100, Rafael J. Wysocki wrote:
> On Friday, December 30, 2016 02:37:17 PM Kilian Singer wrote:
> > Yes,
> > the pci_port_pm=off
> > fixes both the firefox issue and the lock screen issue. Also suspend/resume work.
> > Tested on 4.9
> 
> OK, thanks!
> 
> Please use that as a manual workaround for the time being.
> 
> I looked at the acpidump attached to the BZ entry, but nothing jumped up at me
> immediately.  I'll let Mika take care of this going forward when he's back.

Thanks Rafael and Lukas for the help. I'm now back from my vacation so I
can start investigating this as well.

Kilian, can you attach full output of 'sudo lspci -vv' to the bug? The
one in the comments is pretty hard to read.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 15:15                             ` Peter Wu
  2017-01-03 16:11                               ` Lukas Wunner
@ 2017-01-03 17:37                               ` Kilian Singer
  1 sibling, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-03 17:37 UTC (permalink / raw)
  To: Peter Wu
  Cc: Mika Westerberg, Rafael J. Wysocki, Lukas Wunner, Bjorn Helgaas,
	linux-pci

I applied the patch to nouveau_acpi.c with the additional
pci_d3cold_disable(pdev);
right after
//		*has_pr3 = nouveau_pr3_present(pdev);
and both the Firefox and screen lock issues are resolved.



----- Original Message -----
From: "Peter Wu" <peter@lekensteyn.nl>
To: "Mika Westerberg" <mika.westerberg@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Lukas Wunner" <lukas@wunner.de>, "Kilian Singer" <kilian.singer@quantumtechnology.info>, "Bjorn Helgaas" <helgaas@kernel.org>, "linux-pci" <linux-pci@vger.kernel.org>
Sent: Tuesday, January 3, 2017 4:15:47 PM
Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"

(replying to earlier comments in the thread:)

Changing (lowering?) the cut-off date would not help as the laptop has
DMI year 2016. (For the long-term, it would probably be desirable to
lower the date or otherwise add detection of _PR3, see
https://bugs.freedesktop.org/show_bug.cgi?id=98505#c23).

Reverting the patch is not a good idea either, it would reintroduce the
memory corruption that has plagued some Lenovo models
(https://bugs.freedesktop.org/show_bug.cgi?id=78530).

On Tue, Jan 03, 2017 at 11:51:58AM +0200, Mika Westerberg wrote:
> On Mon, Jan 02, 2017 at 10:31:07PM +0100, Rafael J. Wysocki wrote:
> > On Monday, January 02, 2017 04:48:52 PM Mika Westerberg wrote:
> > > On Mon, Jan 02, 2017 at 02:10:19PM +0200, Mika Westerberg wrote:
> > > > I've checked the acpidump of this machine and it does not seem to be a
> > > > traditional Optimus machine. At least this one is missing the magic _DSM
> > > > which is used to gather capabilities of the graphics device.
> > > > 
> > > > However, it does have _PR3 and it is attached to the device
> > > > (_SB.PCI0.PEG) itself, not the root port.
> > > 
> > > Nah, actually PEG is the root port. So it certainly looks like
> > > a traditional Optimus machine.
> > 
> > So can we quirk that thing somehow and see if that helps (for debugging
> > purposes at least)?
> 
> I was kind of hoping disabling D3cold would do that (prevent it from
> turning off power resources). But we can also just force it to use _DSM
> instead and see if it makes a difference.

Disabling d3cold that way might be too late due to the short RPM suspend
delay. You would need a udev rule to activate this ASAP. E.g., create
/etc/udev/rules.d/42-nvidia-rpm.rules with:

    SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", ATTR{d3cold_allowed}="0"

This disables D3cold on the child device (which should also prevent the
parent PCIe port from using D3cold).
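
To confirm that the rule took effect, the attribute can be read back for
every matching device.  The following is only a sketch: the function name
list_d3cold and its optional directory argument are made up for
illustration, and it assumes d3cold_allowed sits directly in the device's
sysfs directory, as in the echo command suggested earlier.

```shell
#!/bin/sh
# Print d3cold_allowed for every PCI device matching a vendor and class.
# Usage: list_d3cold VENDOR CLASS [SYSFS_PCI_DIR]
# (list_d3cold and the directory argument are illustrative, not an
# existing tool.)
list_d3cold() {
    vendor=$1
    class=$2
    dir=${3:-/sys/bus/pci/devices}
    for dev in "$dir"/*; do
        # Skip entries that do not look like PCI device directories.
        [ -r "$dev/vendor" ] && [ -r "$dev/class" ] || continue
        [ "$(cat "$dev/vendor")" = "$vendor" ] || continue
        [ "$(cat "$dev/class")" = "$class" ] || continue
        # The attribute may be absent, depending on the kernel config.
        if [ -r "$dev/d3cold_allowed" ]; then
            echo "${dev##*/} d3cold_allowed=$(cat "$dev/d3cold_allowed")"
        else
            echo "${dev##*/} d3cold_allowed=unavailable"
        fi
    done
}

# Vendor and class values taken from the udev rule.
list_d3cold 0x10de 0x030000
```

If no device matches, nothing is printed.  A value of 0 means D3cold has
been disallowed for that device.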

Alternatively, can you try to boot with nouveau.runpm=0 and see if it
makes any difference? When runpm is disabled, then the PCIe port and
Nvidia device should not be suspended and therefore prevent the issue
from being triggered.

> I guess the reason why keyboard and mouse become unresponsive is because
> the driver tries to resume the device and hogs the CPU. At least it
> looks like so from the dmesg in comment 27 (of the bugzilla bug) where
> NMI watchdog is triggered.
> 
> Since this might be related to nouveau, adding Peter Wu to the loop.
> Peter the bug in question is https://bugzilla.kernel.org/show_bug.cgi?id=190861.

Kilian, in the bug you had the issue with Firefox. The trace suggests
that runtime resume was triggered, so you should have this problem too
when using lspci. Can you try:

 1. Switch to a text console (e.g. Ctrl-Alt-F2).
 2. sleep 5; lspci

If that command does not return immediately, you likely have triggered
the same issue.
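
The check can be made a little less manual by putting a time limit on
the command.  This is a rough sketch, assuming GNU timeout is installed;
returns_promptly is a hypothetical helper name.  Note that if the process
ends up in uninterruptible sleep inside the kernel, timeout may not be
able to terminate it, so a hang of the helper itself is also a symptom.

```shell
#!/bin/sh
# Run a command with a time limit and report whether it finished in time.
# Usage: returns_promptly SECONDS COMMAND [ARGS...]
# (returns_promptly is an illustrative helper, not an existing tool.)
returns_promptly() {
    limit=$1
    shift
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "ok"
    else
        status=$?
        # GNU timeout exits with 124 when the time limit was reached.
        if [ "$status" -eq 124 ]; then
            echo "timed out"
        else
            echo "failed ($status)"
        fi
    fi
}
```

From a text console, one would then run something like:

    sleep 5; returns_promptly 10 lspci

where the 10 second limit is an arbitrary choice.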

The acpidump from the bug does not show known issues, it *looks* fine.
There have been other issues related to resuming power on newer Nvidia
hardware (https://bugs.freedesktop.org/show_bug.cgi?id=94725,
https://bugzilla.kernel.org/show_bug.cgi?id=156341) but there is not
much progress here.  (The last time I traced the PCIe register accesses
(via kprobes) and tried to disable some of those, it still did not help
with preventing the power issue.)

> Kilian, can you try the following hack as well?
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> index 193573d191e5..50482d5c8072 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> @@ -282,7 +282,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle *dhandle_out
>  			 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " : "",
>  			 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec supported" : "");
>  
> -		*has_pr3 = nouveau_pr3_present(pdev);
> +//		*has_pr3 = nouveau_pr3_present(pdev);
>  	}
>  }
>  

This would not disable D3cold support and as a result both PR3 and DSM
would be active. Try the above with this line added to force DSM:

    pci_d3cold_disable(pdev);

(This should have the same effect as setting d3cold_allowed=0.)
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 16:31                                 ` Peter Wu
  2017-01-03 16:44                                   ` Deucher, Alexander
@ 2017-01-03 18:09                                   ` Lukas Wunner
  2017-01-03 18:12                                   ` Bjorn Helgaas
  2 siblings, 0 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-03 18:09 UTC (permalink / raw)
  To: Peter Wu
  Cc: Mika Westerberg, Rafael J. Wysocki, Kilian Singer, Bjorn Helgaas,
	linux-pci, Alex Deucher, Dave Airlie

On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
> On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > On Tue, Jan 03, 2017 at 04:15:47PM +0100, Peter Wu wrote:
> > > The acpidump from the bug does not show known issues, it *looks* fine.
> > > There have been other issues related to resuming power on newer Nvidia
> > > hardware (https://bugs.freedesktop.org/show_bug.cgi?id=94725,
> > > https://bugzilla.kernel.org/show_bug.cgi?id=156341) but there is not
> > > much progress here.  (The last time I traced the PCIe register accesses
> > > (via kprobes) and tried to disable some of those, it still did not help
> > > with preventing the power issue.)
> > 
> > It seems that the _DSM method works on Kilian's laptop.  Would it be
> > viable to default to _DSM if it's available, and only use _PR3 if not?
> 
> DSM should not be preferred when PR3 is available:
> 
>  - After MS introduced D3cold (PR3) support to Win8+, vendors are
>    unlikely to test legacy DSM and the likelihood of breakage increases.
>  - On one Lenovo laptop, the DSM method causes memory corruption while
>    PR3 fixes this problem.
>  - On some laptops, DSM keeps the fan on while PR3 stopped the noise.
>  - On some laptops, DSM does not really power off the GPU and results in
>    increased power consumption during runtime/system sleep. PR3 fully
>    removes the power, as desired.

I see.  How about adding an "optimus" module_param to nouveau which
allows users to force either DSM or PR3 on machines where the method
selected by default doesn't work properly?  At least until we've
figured out how to always select the correct method, or have debugged
the remaining issues with PR3?

When selecting DSM, I guess pci_d3cold_disable() would have to be
called to prevent usage of both methods, right?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 16:31                                 ` Peter Wu
  2017-01-03 16:44                                   ` Deucher, Alexander
  2017-01-03 18:09                                   ` Lukas Wunner
@ 2017-01-03 18:12                                   ` Bjorn Helgaas
  2017-01-03 21:38                                     ` Rafael J. Wysocki
  2 siblings, 1 reply; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-03 18:12 UTC (permalink / raw)
  To: Peter Wu
  Cc: Lukas Wunner, Mika Westerberg, Rafael J. Wysocki, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
> On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > [cc += Dave Airlie:
> > 
> > Dave, we're about to lose support for newer Optimus laptops which use
> > _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> > a commit on his for-linus branch to remove runtime PM for PCIe ports.
> > This fixes a regression on Kilian Singer's laptop on which locking the
> > screen breaks USB and PS/2 input devices:  Mouse movements are still
> > visible, but button or key presses no longer have any effect.  The GPU
> > is powered down upon locking the screen and the current theory is that
> > this causes the issues.]
> 
> (+cc Alex: this might affect amdgpu/radeon too.)
> 
> Bjorn, please reconsider the rpm patch. Reverting support would
> introduce other regressions (see issues below) and make future
> Thunderbolt work harder (according to Lukas). If Kilian's laptop has
> issues, what about a "temporary" quirk?

As I mentioned at the beginning, the outcome I'm hoping for is a patch
that fixes Kilian's laptop while preserving the runtime PM support.

As I also mentioned at the beginning, preserving the runtime PM
support at the expense of breaking Kilian's laptop is not one of the
options.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 16:11                               ` Lukas Wunner
  2017-01-03 16:31                                 ` Peter Wu
@ 2017-01-03 21:26                                 ` Rafael J. Wysocki
  1 sibling, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-03 21:26 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Peter Wu, Mika Westerberg, Kilian Singer, Bjorn Helgaas,
	linux-pci, Dave Airlie

On Tuesday, January 03, 2017 05:11:23 PM Lukas Wunner wrote:
> [cc += Dave Airlie:
> 
> Dave, we're about to lose support for newer Optimus laptops which use
> _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> a commit on his for-linus branch to remove runtime PM for PCIe ports.
> This fixes a regression on Kilian Singer's laptop on which locking the
> screen breaks USB and PS/2 input devices:

It doesn't fix the broken system suspend/resume on his system, though.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 18:12                                   ` Bjorn Helgaas
@ 2017-01-03 21:38                                     ` Rafael J. Wysocki
  2017-01-03 21:52                                       ` Kilian Singer
  2017-01-03 22:25                                       ` Bjorn Helgaas
  0 siblings, 2 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-03 21:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Peter Wu, Lukas Wunner, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Tuesday, January 03, 2017 12:12:21 PM Bjorn Helgaas wrote:
> On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
> > On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > > [cc += Dave Airlie:
> > > 
> > > Dave, we're about to lose support for newer Optimus laptops which use
> > > _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> > > a commit on his for-linus branch to remove runtime PM for PCIe ports.
> > > This fixes a regression on Kilian Singer's laptop on which locking the
> > > screen breaks USB and PS/2 input devices:  Mouse movements are still
> > > visible, but button or key presses no longer have any effect.  The GPU
> > > is powered down upon locking the screen and the current theory is that
> > > this causes the issues.]
> > 
> > (+cc Alex: this might affect amdgpu/radeon too.)
> > 
> > Bjorn, please reconsider the rpm patch. Reverting support would
> > introduce other regressions (see issues below) and make future
> > Thunderbolt work harder (according to Lukas). If Kilian's laptop has
> > issues, what about a "temporary" quirk?
> 
> As I mentioned at the beginning, the outcome I'm hoping for is a patch
> that fixes Kilian's laptop while preserving the runtime PM support.
> 
> As I also mentioned at the beginning, preserving the runtime PM
> support at the expense of breaking Kilian's laptop is not one of the
> options.

But the revert doesn't really help.

It doesn't fix system suspend/resume on that laptop, which also breaks when
PCIe ports PM is enabled on it.

If you really want to use a sledgehammer approach here (which I don't recommend,
but that's your call), you can change the initial value of pci_bridge_d3_disable to
"true" (and update the pcie_port_pm= command line to take "on" in case
someone wants to enable the feature).  That at least will take care of the
regression entirely and not just partly.
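
When testing with or without such a default, it helps to know which
setting the running kernel was actually booted with.  A minimal sketch,
assuming the option is spelled pcie_port_pm= on the command line;
pcie_port_pm_setting is a made-up helper name:

```shell
#!/bin/sh
# Print the value of pcie_port_pm= from a kernel command line string,
# or "default" if the option is not present.  If the option appears more
# than once, the last occurrence is reported.
# (pcie_port_pm_setting is an illustrative helper, not an existing tool.)
pcie_port_pm_setting() {
    setting=default
    # Intentionally unquoted: split the command line into words.
    for word in $1; do
        case $word in
            pcie_port_pm=*) setting=${word#pcie_port_pm=} ;;
        esac
    done
    echo "$setting"
}

if [ -r /proc/cmdline ]; then
    pcie_port_pm_setting "$(cat /proc/cmdline)"
fi
```

An output of "default" only says that nothing was passed on the command
line; it does not say what the built-in default of the kernel is.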

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 21:38                                     ` Rafael J. Wysocki
@ 2017-01-03 21:52                                       ` Kilian Singer
  2017-01-03 22:07                                         ` Rafael J. Wysocki
  2017-01-03 22:25                                       ` Bjorn Helgaas
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2017-01-03 21:52 UTC (permalink / raw)
  To: Rafael J. Wysocki, Bjorn Helgaas
  Cc: Peter Wu, Lukas Wunner, Mika Westerberg, linux-pci, Alex Deucher,
	Dave Airlie

I have not checked whether suspend/resume is broken.
Which patch should I check suspend/resume for?
When I wrote that the Firefox and lock screen issues were resolved,
I forgot to check suspend/resume.


On 03-Jan-17 22:38, Rafael J. Wysocki wrote:
> On Tuesday, January 03, 2017 12:12:21 PM Bjorn Helgaas wrote:
>> On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
>>> On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
>>>> [cc += Dave Airlie:
>>>>
>>>> Dave, we're about to lose support for newer Optimus laptops which use
>>>> _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
>>>> a commit on his for-linus branch to remove runtime PM for PCIe ports.
>>>> This fixes a regression on Kilian Singer's laptop on which locking the
>>>> screen breaks USB and PS/2 input devices:  Mouse movements are still
>>>> visible, but button or key presses no longer have any effect.  The GPU
>>>> is powered down upon locking the screen and the current theory is that
>>>> this causes the issues.]
>>> (+cc Alex: this might affect amdgpu/radeon too.)
>>>
>>> Bjorn, please reconsider the rpm patch. Reverting support would
>>> introduce other regressions (see issues below) and make future
>>> Thunderbolt work harder (according to Lukas). If Kilian's laptop has
>>> issues, what about a "temporary" quirk?
>> As I mentioned at the beginning, the outcome I'm hoping for is a patch
>> that fixes Kilian's laptop while preserving the runtime PM support.
>>
>> As I also mentioned at the beginning, preserving the runtime PM
>> support at the expense of breaking Kilian's laptop is not one of the
>> options.
> But the revert doesn't really help.
>
> It doesn't fix system suspend/resume on that laptop, which also breaks when
> PCIe ports PM is enabled on it.
>
> If you really want to use a sledgehammer approach here (which I don't recommend,
> but that's your call), you can change the initial value of pci_bridge_d3_disable to
> "true" (and update the pcie_port_pm= command line to take "on" in case
> someone wants to enable the feature).  That at least will take care of the
> regression entirely and not just partly.
>
> Thanks,
> Rafael
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 21:52                                       ` Kilian Singer
@ 2017-01-03 22:07                                         ` Rafael J. Wysocki
  2017-01-03 22:25                                           ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-03 22:07 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Bjorn Helgaas, Peter Wu, Lukas Wunner, Mika Westerberg,
	linux-pci, Alex Deucher, Dave Airlie

On Tuesday, January 03, 2017 10:52:48 PM Kilian Singer wrote:
> I have not checked if suspend/resume broken.
> Which patch should I check suspend resume for.
> When I wrote about firefox and lockscreen issue resolved
> I forgot to check about the suspend/resume.

Well, here:

https://bugzilla.kernel.org/show_bug.cgi?id=190861#c26

you said:

"Shutdown and suspend/resume work on 4.7.0-1 but fail on 4.8. and 4.9."

and here:

https://bugzilla.kernel.org/show_bug.cgi?id=190861#c29

you said:

"the pci_port_pm=off
fixes both the firefox issue and the lock screen issue. Also suspend/resume work.
Tested on 4.9"

which to me means that suspend/resume is also broken for you
due to the PCIe ports PM support enabled (the command line option
should be pcie_port_pm=off, BTW, but I'm assuming that this is what you
actually used).

Now, it may be broken due to the runtime PM breakage triggering after system
resume (which may be verified by checking if the revert actually fixes system
suspend-resume too on your machine), but even then I wouldn't revert that
particular commit.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 21:38                                     ` Rafael J. Wysocki
  2017-01-03 21:52                                       ` Kilian Singer
@ 2017-01-03 22:25                                       ` Bjorn Helgaas
  2017-01-03 23:13                                         ` Rafael J. Wysocki
  1 sibling, 1 reply; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-03 22:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Wu, Lukas Wunner, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Tue, Jan 03, 2017 at 10:38:24PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, January 03, 2017 12:12:21 PM Bjorn Helgaas wrote:
> > On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
> > > On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > > > [cc += Dave Airlie:
> > > > 
> > > > Dave, we're about to lose support for newer Optimus laptops which use
> > > > _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> > > > a commit on his for-linus branch to remove runtime PM for PCIe ports.
> > > > This fixes a regression on Kilian Singer's laptop on which locking the
> > > > screen breaks USB and PS/2 input devices:  Mouse movements are still
> > > > visible, but button or key presses no longer have any effect.  The GPU
> > > > is powered down upon locking the screen and the current theory is that
> > > > this causes the issues.]
> > > 
> > > (+cc Alex: this might affect amdgpu/radeon too.)
> > > 
> > > Bjorn, please reconsider the rpm patch. Reverting support would
> > > introduce other regressions (see issues below) and make future
> > > Thunderbolt work harder (according to Lukas). If Kilian's laptop has
> > > issues, what about a "temporary" quirk?
> > 
> > As I mentioned at the beginning, the outcome I'm hoping for is a patch
> > that fixes Kilian's laptop while preserving the runtime PM support.
> > 
> > As I also mentioned at the beginning, preserving the runtime PM
> > support at the expense of breaking Kilian's laptop is not one of the
> > options.
> 
> But the revert doesn't really help.
> 
> It doesn't fix system suspend/resume on that laptop, which also breaks when
> PCIe ports PM is enabled on it.
> 
> If you really want to use a sledgehammer approach here (which I don't recommend,
> but that's your call), you can change the initial value of pci_bridge_d3_disable to
> "true" (and update the pcie_port_pm= command line to take "on" in case
> someone wants to enable the feature).  That at least will take care of the
> regression entirely and not just partly.

What the heck is the problem here?  I'm not trying to be difficult,
but I didn't write this code and I'm not really interested in figuring
out how to fix it, so my only real option is to solicit fixes and, if
none appear, revert changes that break things.

As I've said more than once, I hope and expect that there is a better
solution than reverting the patch.  But *I* am not going to write it.
As soon as somebody proposes a better patch, I'll use it instead of
the revert.

If you want to fix the regression by changing the
pci_bridge_d3_disable value, all you have to do is post a patch doing
that.

I really don't understand why people are so wrapped around the axle
about this.  This is just the way Linux works -- we try really hard
not to cause regressions on platforms that used to work.  I *SAID* in
the very first posting of the revert that I assume Mika will have a
better solution soon.

When a better patch appears, I'll take that and drop the revert.
What's the problem with that?

Bjorn

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 22:07                                         ` Rafael J. Wysocki
@ 2017-01-03 22:25                                           ` Kilian Singer
  0 siblings, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-03 22:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjorn Helgaas, Peter Wu, Lukas Wunner, Mika Westerberg,
	linux-pci, Alex Deucher, Dave Airlie

Yes, that is true:

"Shutdown and suspend/resume work on 4.7.0-1 but fail on 4.8. and 4.9."


I just did not check whether the proposed patches fix suspend/resume.

On 03-Jan-17 23:07, Rafael J. Wysocki wrote:
> On Tuesday, January 03, 2017 10:52:48 PM Kilian Singer wrote:
>> I have not checked if suspend/resume broken.
>> Which patch should I check suspend resume for.
>> When I wrote about firefox and lockscreen issue resolved
>> I forgot to check about the suspend/resume.
> Well, here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=190861#c26
>
> you said:
>
> "Shutdown and suspend/resume work on 4.7.0-1 but fail on 4.8. and 4.9."
>
> and here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=190861#c29
>
> you said:
>
> "the pci_port_pm=off
> fixes both the firefox issue and the lock screen issue. Also suspend/resume work.
> Tested on 4.9"
>
> which to me means that suspend/resume is also broken for you
> due to the PCIe ports PM support enabled (the command line option
> should be pcie_port_pm=off, BTW, but I'm assuming that this is what you
> actually used).
>
> Now, it may be broken due to the runtime PM breakage triggering after system
> resume (which may be verified by checking if the revert actually fixes system
> suspend-resume too on your machine), but even then I wouldn't revert that
> particular commit.
>
> Thanks,
> Rafael
>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 22:25                                       ` Bjorn Helgaas
@ 2017-01-03 23:13                                         ` Rafael J. Wysocki
  2017-01-04  0:05                                           ` Bjorn Helgaas
  0 siblings, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-03 23:13 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Peter Wu, Lukas Wunner, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Tuesday, January 03, 2017 04:25:09 PM Bjorn Helgaas wrote:
> On Tue, Jan 03, 2017 at 10:38:24PM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, January 03, 2017 12:12:21 PM Bjorn Helgaas wrote:
> > > On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
> > > > On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > > > > [cc += Dave Airlie:
> > > > > 
> > > > > Dave, we're about to lose support for newer Optimus laptops which use
> > > > > _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> > > > > a commit on his for-linus branch to remove runtime PM for PCIe ports.
> > > > > This fixes a regression on Kilian Singer's laptop on which locking the
> > > > > screen breaks USB and PS/2 input devices:  Mouse movements are still
> > > > > visible, but button or key presses no longer have any effect.  The GPU
> > > > > is powered down upon locking the screen and the current theory is that
> > > > > this causes the issues.]
> > > > 
> > > > > [+cc Alex: this might affect amdgpu/radeon too.]
> > > > 
> > > > Bjorn, please reconsider the rpm patch. Reverting support would
> > > > introduce other regressions (see issues below) and make future
> > > > Thunderbolt work harder (according to Lukas). If Kilian's laptop has
> > > > issues, what about a "temporary" quirk?
> > > 
> > > As I mentioned at the beginning, the outcome I'm hoping for is a patch
> > > that fixes Kilian's laptop while preserving the runtime PM support.
> > > 
> > > As I also mentioned at the beginning, preserving the runtime PM
> > > support at the expense of breaking Kilian's laptop is not one of the
> > > options.
> > 
> > But the revert doesn't really help.
> > 
> > It doesn't fix system suspend/resume on that laptop, which also breaks when
> > PCIe ports PM is enabled on it.
> > 
> > If you really want to use a sledgehammer approach here (which I don't recommend,
> > but that's your call), you can change the initial value of pci_bridge_d3_disable to
> > "true" (and update the pcie_ports_pm= command line to take "on" in case
> > someone wants to enable the feature).  That at least will take care of the
> > regression entirely and not just partly.
> 
> What the heck is the problem here?  I'm not trying to be difficult,
> but I didn't write this code and I'm not really interested in figuring
> out how to fix it, so my only real option is to solicit fixes and, if
> none appear, revert changes that break things.
> 
> As I've said more than once, I hope and expect that there is a better
> solution than reverting the patch.  But *I* am not going to write it.
> As soon as somebody proposes a better patch, I'll use it instead of
> the revert.
> 
> If you want to fix the regression by changing the
> pci_bridge_d3_disable value, all you have to do is post a patch doing
> that.

OK, please find appended.

> I really don't understand why people are so wrapped around the axle
> about this.  This is just the way Linux works -- we try really hard
> not to cause regressions on platforms that used to work.

I haven't seen anyone in this thread questioning that.

IMO the point people are trying to make is that reverting stuff may not really be
the way to go.

> I *SAID* in the very first posting of the revert that I assume Mika will have a
> better solution soon.

In which case I wouldn't have queued up a revert had I been you.

> When a better patch appears, I'll take that and drop the revert.
> What's the problem with that?

There are people for whom the commit in question fixed serious issues and the
revert would just take that away from them without any option to make their
systems work.

Thanks,
Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] PCI / PM: Disable power management of PCIe ports by default

Due to regressions introduced by enabling power management of
PCIe ports by default, disable it for the time being, but still
allow it to be enabled via a kernel command line option.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=190861
Tentatively-signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This particular patch hasn't been tested, but the result of it should be the
same as passing pcie_port_pm=off in the kernel command line, which has
been tested in the BZ entry above.

---
 Documentation/admin-guide/kernel-parameters.txt |    3 --
 drivers/pci/pci.c                               |   26 +++++-------------------
 2 files changed, 7 insertions(+), 22 deletions(-)

Index: linux-pm/drivers/pci/pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pci.c
+++ linux-pm/drivers/pci/pci.c
@@ -108,17 +108,14 @@ unsigned int pcibios_max_latency = 255;
 /* If set, the PCIe ARI capability will not be used. */
 static bool pcie_ari_disabled;
 
-/* Disable bridge_d3 for all PCIe ports */
-static bool pci_bridge_d3_disable;
-/* Force bridge_d3 for all PCIe ports */
-static bool pci_bridge_d3_force;
+/* Enable bridge_d3 for all PCIe ports */
+static bool pci_bridge_d3_enable;
 
 static int __init pcie_port_pm_setup(char *str)
 {
-	if (!strcmp(str, "off"))
-		pci_bridge_d3_disable = true;
-	else if (!strcmp(str, "force"))
-		pci_bridge_d3_force = true;
+	if (!strcmp(str, "on"))
+		pci_bridge_d3_enable = true;
+
 	return 1;
 }
 __setup("pcie_port_pm=", pcie_port_pm_setup);
@@ -2237,7 +2234,7 @@ bool pci_bridge_d3_possible(struct pci_d
 	case PCI_EXP_TYPE_ROOT_PORT:
 	case PCI_EXP_TYPE_UPSTREAM:
 	case PCI_EXP_TYPE_DOWNSTREAM:
-		if (pci_bridge_d3_disable)
+		if (!pci_bridge_d3_enable)
 			return false;
 
 		/*
@@ -2247,17 +2244,6 @@ bool pci_bridge_d3_possible(struct pci_d
 		if (bridge->is_hotplug_bridge && !pciehp_is_native(bridge))
 			return false;
 
-		if (pci_bridge_d3_force)
-			return true;
-
-		/*
-		 * It should be safe to put PCIe ports from 2015 or newer
-		 * to D3.
-		 */
-		if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) &&
-		    year >= 2015) {
-			return true;
-		}
 		break;
 	}
 
Index: linux-pm/Documentation/admin-guide/kernel-parameters.txt
===================================================================
--- linux-pm.orig/Documentation/admin-guide/kernel-parameters.txt
+++ linux-pm/Documentation/admin-guide/kernel-parameters.txt
@@ -2984,8 +2984,7 @@
 			ports driver.
 
 	pcie_port_pm=	[PCIE] PCIe port power management handling:
-		off	Disable power management of all PCIe ports
-		force	Forcibly enable power management of all PCIe ports
+		on	Enable power management of PCIe ports
 
 	pcie_pme=	[PCIE,PM] Native PCIe PME signaling options:
 		nomsi	Do not use MSI for native PCIe PME signaling (this makes
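
The change to the pcie_port_pm= option handling in the patch above can be
modelled in userspace as a rough sketch.  The struct and function names
below (old_flags, new_flags, parse_old, parse_new) are hypothetical and
exist only for illustration; only the string comparisons mirror the hunk
in pcie_port_pm_setup().

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the static flags in pci.c. */
struct old_flags {
	bool d3_disable;	/* models pci_bridge_d3_disable */
	bool d3_force;		/* models pci_bridge_d3_force */
};

struct new_flags {
	bool d3_enable;		/* models pci_bridge_d3_enable */
};

/*
 * Before the patch: "off" disables port PM, "force" forces it, and any
 * other value leaves both flags untouched.
 */
static void parse_old(const char *str, struct old_flags *f)
{
	if (!strcmp(str, "off"))
		f->d3_disable = true;
	else if (!strcmp(str, "force"))
		f->d3_force = true;
}

/*
 * After the patch: only "on" is recognised.  The flag starts out false,
 * so "off" and "force" are ignored and port PM stays disabled.
 */
static void parse_new(const char *str, struct new_flags *f)
{
	if (!strcmp(str, "on"))
		f->d3_enable = true;
}
```

With this model, a command line that still carries pcie_port_pm=off or
pcie_port_pm=force leaves d3_enable false, so only an explicit
pcie_port_pm=on sets the flag.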

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-03 23:13                                         ` Rafael J. Wysocki
@ 2017-01-04  0:05                                           ` Bjorn Helgaas
  2017-01-04  1:09                                             ` Rafael J. Wysocki
  2017-01-04  8:16                                             ` Lukas Wunner
  0 siblings, 2 replies; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-04  0:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Wu, Lukas Wunner, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Wed, Jan 04, 2017 at 12:13:18AM +0100, Rafael J. Wysocki wrote:
> On Tuesday, January 03, 2017 04:25:09 PM Bjorn Helgaas wrote:
> > On Tue, Jan 03, 2017 at 10:38:24PM +0100, Rafael J. Wysocki wrote:
> > > On Tuesday, January 03, 2017 12:12:21 PM Bjorn Helgaas wrote:
> > > > On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
> > > > > On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > > > > > [cc += Dave Airlie:
> > > > > > 
> > > > > > Dave, we're about to lose support for newer Optimus laptops which use
> > > > > > _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> > > > > > a commit on his for-linus branch to remove runtime PM for PCIe ports.
> > > > > > This fixes a regression on Kilian Singer's laptop on which locking the
> > > > > > screen breaks USB and PS/2 input devices:  Mouse movements are still
> > > > > > visible, but button or key presses no longer have any effect.  The GPU
> > > > > > is powered down upon locking the screen and the current theory is that
> > > > > > this causes the issues.]
> > > > > 
> > > > > > [+cc Alex: this might affect amdgpu/radeon too.]
> > > > > 
> > > > > Bjorn, please reconsider the rpm patch. Reverting support would
> > > > > introduce other regressions (see issues below) and make future
> > > > > Thunderbolt work harder (according to Lukas). If Kilian's laptop has
> > > > > issues, what about a "temporary" quirk?
> > > > 
> > > > As I mentioned at the beginning, the outcome I'm hoping for is a patch
> > > > that fixes Kilian's laptop while preserving the runtime PM support.
> > > > 
> > > > As I also mentioned at the beginning, preserving the runtime PM
> > > > support at the expense of breaking Kilian's laptop is not one of the
> > > > options.
> > > 
> > > But the revert doesn't really help.
> > > 
> > > It doesn't fix system suspend/resume on that laptop, which also breaks when
> > > PCIe ports PM is enabled on it.
> > > 
> > > If you really want to use a sledgehammer approach here (which I don't recommend,
> > > but that's your call), you can change the initial value of pci_bridge_d3_disable to
> > > "true" (and update the pcie_ports_pm= command line to take "on" in case
> > > someone wants to enable the feature).  That at least will take care of the
> > > regression entirely and not just partly.
> > 
> > What the heck is the problem here?  I'm not trying to be difficult,
> > but I didn't write this code and I'm not really interested in figuring
> > out how to fix it, so my only real option is to solicit fixes and, if
> > none appear, revert changes that break things.
> > 
> > As I've said more than once, I hope and expect that there is a better
> > solution than reverting the patch.  But *I* am not going to write it.
> > As soon as somebody proposes a better patch, I'll use it instead of
> > the revert.
> > 
> > If you want to fix the regression by changing the
> > pci_bridge_d3_disable value, all you have to do is post a patch doing
> > that.
> 
> OK, please find appended.
> 
> > I really don't understand why people are so wrapped around the axle
> > about this.  This is just the way Linux works -- we try really hard
> > not to cause regressions on platforms that used to work.
> 
> I haven't seen anyone in this thread questioning that.
> 
> IMO the point people are trying to make is that reverting stuff may not really be
> the way to go.
> 
> > I *SAID* in the very first posting of the revert that I assume Mika will have a
> > better solution soon.
> 
> In which case I wouldn't have queued up a revert had I been you.
> 
> > When a better patch appears, I'll take that and drop the revert.
> > What's the problem with that?
> 
> There are people for whom the commit in question fixed serious issues and the
> revert would just take that away from them without any option to make their
> systems work.

I don't *want* to apply the revert.  It's on my for-linus branch as a
worst-case scenario change if we can't figure out a better fix.

The patch below is preferable, but I'd rather not take even it,
because it takes away functionality and forces people to use a boot
parameter to restore it.  I expect that somebody will figure out how
to fix the regression Kilian found and also keep the new functionality
(without requiring boot parameters) before v4.10.

Of course, if a better fix is far off and the patch below is much
better in the interim (avoids memory corruption, fixes problems for
more people, etc.), I will replace the revert with it.  I just
haven't seen the argument for doing that.

My main point is that Kilian found a pretty serious regression and
spent a lot of time bisecting it and testing things, and we need to
address it in some way before v4.10.

> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] PCI / PM: Disable power management of PCIe ports by default
> 
> Due to regressions introduced by enabling power management of
> PCIe ports by default, disable it for the time being, but still
> allow it to be enabled via a kernel command line option.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=190861
> Tentatively-signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> This particular patch hasn't been tested, but the result of it should be the
> same as passing pcie_port_pm=off in the kernel command line, which has
> been tested in the BZ entry above.
> 
> ---
>  Documentation/admin-guide/kernel-parameters.txt |    3 --
>  drivers/pci/pci.c                               |   26 +++++-------------------
>  2 files changed, 7 insertions(+), 22 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -108,17 +108,14 @@ unsigned int pcibios_max_latency = 255;
>  /* If set, the PCIe ARI capability will not be used. */
>  static bool pcie_ari_disabled;
>  
> -/* Disable bridge_d3 for all PCIe ports */
> -static bool pci_bridge_d3_disable;
> -/* Force bridge_d3 for all PCIe ports */
> -static bool pci_bridge_d3_force;
> +/* Enable bridge_d3 for all PCIe ports */
> +static bool pci_bridge_d3_enable;
>  
>  static int __init pcie_port_pm_setup(char *str)
>  {
> -	if (!strcmp(str, "off"))
> -		pci_bridge_d3_disable = true;
> -	else if (!strcmp(str, "force"))
> -		pci_bridge_d3_force = true;
> +	if (!strcmp(str, "on"))
> +		pci_bridge_d3_enable = true;
> +
>  	return 1;
>  }
>  __setup("pcie_port_pm=", pcie_port_pm_setup);
> @@ -2237,7 +2234,7 @@ bool pci_bridge_d3_possible(struct pci_d
>  	case PCI_EXP_TYPE_ROOT_PORT:
>  	case PCI_EXP_TYPE_UPSTREAM:
>  	case PCI_EXP_TYPE_DOWNSTREAM:
> -		if (pci_bridge_d3_disable)
> +		if (!pci_bridge_d3_enable)
>  			return false;
>  
>  		/*
> @@ -2247,17 +2244,6 @@ bool pci_bridge_d3_possible(struct pci_d
>  		if (bridge->is_hotplug_bridge && !pciehp_is_native(bridge))
>  			return false;
>  
> -		if (pci_bridge_d3_force)
> -			return true;
> -
> -		/*
> -		 * It should be safe to put PCIe ports from 2015 or newer
> -		 * to D3.
> -		 */
> -		if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) &&
> -		    year >= 2015) {
> -			return true;
> -		}
>  		break;
>  	}
>  
> Index: linux-pm/Documentation/admin-guide/kernel-parameters.txt
> ===================================================================
> --- linux-pm.orig/Documentation/admin-guide/kernel-parameters.txt
> +++ linux-pm/Documentation/admin-guide/kernel-parameters.txt
> @@ -2984,8 +2984,7 @@
>  			ports driver.
>  
>  	pcie_port_pm=	[PCIE] PCIe port power management handling:
> -		off	Disable power management of all PCIe ports
> -		force	Forcibly enable power management of all PCIe ports
> +		on	Enable power management of PCIe ports
>  
>  	pcie_pme=	[PCIE,PM] Native PCIe PME signaling options:
>  		nomsi	Do not use MSI for native PCIe PME signaling (this makes
> 

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04  0:05                                           ` Bjorn Helgaas
@ 2017-01-04  1:09                                             ` Rafael J. Wysocki
  2017-01-04  8:16                                             ` Lukas Wunner
  1 sibling, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-04  1:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Peter Wu, Lukas Wunner, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Tuesday, January 03, 2017 06:05:57 PM Bjorn Helgaas wrote:
> On Wed, Jan 04, 2017 at 12:13:18AM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, January 03, 2017 04:25:09 PM Bjorn Helgaas wrote:
> > > On Tue, Jan 03, 2017 at 10:38:24PM +0100, Rafael J. Wysocki wrote:
> > > > On Tuesday, January 03, 2017 12:12:21 PM Bjorn Helgaas wrote:
> > > > > On Tue, Jan 03, 2017 at 05:31:30PM +0100, Peter Wu wrote:
> > > > > > On Tue, Jan 03, 2017 at 05:11:23PM +0100, Lukas Wunner wrote:
> > > > > > > [cc += Dave Airlie:
> > > > > > > 
> > > > > > > Dave, we're about to lose support for newer Optimus laptops which use
> > > > > > > _PR3 to cut power to the discrete GPU because Bjorn Helgaas has queued
> > > > > > > a commit on his for-linus branch to remove runtime PM for PCIe ports.
> > > > > > > This fixes a regression on Kilian Singer's laptop on which locking the
> > > > > > > screen breaks USB and PS/2 input devices:  Mouse movements are still
> > > > > > > visible, but button or key presses no longer have any effect.  The GPU
> > > > > > > is powered down upon locking the screen and the current theory is that
> > > > > > > this causes the issues.]
> > > > > > 
> > > > > > > [+cc Alex: this might affect amdgpu/radeon too.]
> > > > > > 
> > > > > > Bjorn, please reconsider the rpm patch. Reverting support would
> > > > > > introduce other regressions (see issues below) and make future
> > > > > > Thunderbolt work harder (according to Lukas). If Kilian's laptop has
> > > > > > issues, what about a "temporary" quirk?
> > > > > 
> > > > > As I mentioned at the beginning, the outcome I'm hoping for is a patch
> > > > > that fixes Kilian's laptop while preserving the runtime PM support.
> > > > > 
> > > > > As I also mentioned at the beginning, preserving the runtime PM
> > > > > support at the expense of breaking Kilian's laptop is not one of the
> > > > > options.
> > > > 
> > > > But the revert doesn't really help.
> > > > 
> > > > It doesn't fix system suspend/resume on that laptop, which also breaks when
> > > > PCIe ports PM is enabled on it.
> > > > 
> > > > If you really want to use a sledgehammer approach here (which I don't recommend,
> > > > but that's your call), you can change the initial value of pci_bridge_d3_disable to
> > > > "true" (and update the pcie_ports_pm= command line to take "on" in case
> > > > someone wants to enable the feature).  That at least will take care of the
> > > > regression entirely and not just partly.
> > > 
> > > What the heck is the problem here?  I'm not trying to be difficult,
> > > but I didn't write this code and I'm not really interested in figuring
> > > out how to fix it, so my only real option is to solicit fixes and, if
> > > none appear, revert changes that break things.
> > > 
> > > As I've said more than once, I hope and expect that there is a better
> > > solution than reverting the patch.  But *I* am not going to write it.
> > > As soon as somebody proposes a better patch, I'll use it instead of
> > > the revert.
> > > 
> > > If you want to fix the regression by changing the
> > > pci_bridge_d3_disable value, all you have to do is post a patch doing
> > > that.
> > 
> > OK, please find appended.
> > 
> > > I really don't understand why people are so wrapped around the axle
> > > about this.  This is just the way Linux works -- we try really hard
> > > not to cause regressions on platforms that used to work.
> > 
> > I haven't seen anyone in this thread questioning that.
> > 
> > IMO the point people are trying to make is that reverting stuff may not really be
> > the way to go.
> > 
> > > I *SAID* in the very first posting of the revert that I assume Mika will have a
> > > better solution soon.
> > 
> > In which case I wouldn't have queued up a revert had I been you.
> > 
> > > When a better patch appears, I'll take that and drop the revert.
> > > What's the problem with that?
> > 
> > There are people for whom the commit in question fixed serious issues and the
> > revert would just take that away from them without any option to make their
> > systems work.
> 
> I don't *want* to apply the revert.  It's on my for-linus branch as a
> worst-case scenario change if we can't figure out a better fix.
> 
> The patch below is preferable, but I'd rather not take even it,
> because it takes away functionality and forces people to use a boot
> parameter to restore it.  I expect that somebody will figure out how
> to fix the regression Kilian found and also keep the new functionality
> (without requiring boot parameters) before v4.10.
> 
> Of course, if a better fix is far off and the patch below is much
> better in the interim (avoids memory corruption, fixes problems for
> more people, etc.), I will replace the revert with it.  I just
> haven't seen the argument for doing that.

That is very simple.

If you revert, runtime PM will not work for any PCIe ports no matter what and
there is no way to enable it whatsoever.  Therefore, if there's anyone who
depends on it for whatever reason, they have no way to enable it other than
patching the kernel and rebuilding it.  There are users who may not be able
to do that.

With this patch, in turn, they at least have a kernel command line option to
enable the feature if they need it.  To me, this would be a good enough reason
to apply this patch instead of the revert.

> My main point is that Kilian found a pretty serious regression and
> spent a lot of time bisecting it and testing things, and we need to
> address it in some way before v4.10.

Let me repeat then that nobody here is questioning the need to address the
issue.

That said, to me reverting would be almost as bad as leaving it unfixed.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04  0:05                                           ` Bjorn Helgaas
  2017-01-04  1:09                                             ` Rafael J. Wysocki
@ 2017-01-04  8:16                                             ` Lukas Wunner
  2017-01-04 10:33                                               ` Kilian Singer
                                                                 ` (3 more replies)
  1 sibling, 4 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-04  8:16 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Rafael J. Wysocki, Peter Wu, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> I don't *want* to apply the revert.  It's on my for-linus branch as a
> worst-case scenario change if we can't figure out a better fix.
> 
> The patch below is preferable, but I'd rather not take even it,
> because it takes away functionality and forces people to use a boot
> parameter to restore it.  I expect that somebody will figure out how
> to fix the regression Kilian found and also keep the new functionality
> (without requiring boot parameters) before v4.10.

The issue is constrained to hybrid graphics laptops with Nvidia discrete
GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
PCI core.

(AFAIUI, laptops with AMD discrete GPU are not affected as it is known
when and how to call an ACPI method versus using PR3.)

(Neither are laptops using the Nvidia proprietary driver as it doesn't
runtime suspend the card.  But battery life will be terrible then.)

We're at rc2 so the time frame for coming up with a fix is probably
4 weeks.  Peter and others have tried for months to reverse-engineer
how to handle runtime PM on newer Nvidia cards.  It seems likely that
we'll not find the ultimate solution to the problem within 4 weeks.

The way it is now, i.e. defaulting to PR3 when available, regresses
certain laptops such as Kilian's.  If on the other hand we default to
DSM when available, we'll regress certain other laptops, as Peter has
pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
approach either, ideally we'd want to use PR3 as Windows does.

As said, the only short-term solution I see is to add an "optimus"
module_param to nouveau to allow users to select which method to use.
So in Kilian's case an additional command line parameter would be
necessary to fix the issue.

Does anyone see a better solution or can we agree on this one?  If so
I can come up with a patch.  This could go in via Dave Airlie's tree.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04  8:16                                             ` Lukas Wunner
@ 2017-01-04 10:33                                               ` Kilian Singer
  2017-01-04 12:29                                                 ` Mika Westerberg
  2017-01-04 15:50                                               ` Deucher, Alexander
                                                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2017-01-04 10:33 UTC (permalink / raw)
  To: Lukas Wunner, Bjorn Helgaas
  Cc: Rafael J. Wysocki, Peter Wu, Mika Westerberg, linux-pci,
	Alex Deucher, Dave Airlie

Dear all,

the weird thing is also that when calling:

echo 0 > /sys/bus/pci/devices/0000:01:00.0/d3cold_allowed
or
echo on > /sys/bus/pci/devices/0000:01:00.0/power/control

on a command line then the command line crashes. The system stays responsive but still
becomes unresponsive when I lock the screen.

Maybe a similar thing is happening in the kernel. Maybe one could use the above fact
and a timeout to test for the problem and then take measures to deactivate the pm.

I hope this gives you some more clues.

Best regards
Kilian

PS: let me know which patches I should test. I am currently terribly busy with work. So if you would select the most urgent patches I would be delighted.



On 04-Jan-17 09:16, Lukas Wunner wrote:
> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
>> I don't *want* to apply the revert.  It's on my for-linus branch as a
>> worst-case scenario change if we can't figure out a better fix.
>>
>> The patch below is preferable, but I'd rather not take even it,
>> because it takes away functionality and forces people to use a boot
>> parameter to restore it.  I expect that somebody will figure out how
>> to fix the regression Kilian found and also keep the new functionality
>> (without requiring boot parameters) before v4.10.
> The issue is constrained to hybrid graphics laptops with Nvidia discrete
> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> PCI core.
>
> (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> when and how to call an ACPI method versus using PR3.)
>
> (Neither are laptops using the Nvidia proprietary driver as it doesn't
> runtime suspend the card.  But battery life will be terrible then.)
>
> We're at rc2 so the time frame for coming up with a fix is probably
> 4 weeks.  Peter and others have tried for months to reverse-engineer
> how to handle runtime PM on newer Nvidia cards.  It seems likely that
> we'll not find the ultimate solution to the problem within 4 weeks.
>
> The way it is now, i.e. defaulting to PR3 when available, regresses
> certain laptops such as Kilian's.  If on the other hand we default to
> DSM when available, we'll regress certain other laptops, as Peter has
> pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> approach either, ideally we'd want to use PR3 as Windows does.
>
> As said, the only short-term solution I see is to add an "optimus"
> module_param to nouveau to allow users to select which method to use.
> So in Kilian's case an additional command line parameter would be
> necessary to fix the issue.
>
> Does anyone see a better solution or can we agree on this one?  If so
> I can come up with a patch.  This could go in via Dave Airlie's tree.
>
> Thanks,
>
> Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04 10:33                                               ` Kilian Singer
@ 2017-01-04 12:29                                                 ` Mika Westerberg
  0 siblings, 0 replies; 115+ messages in thread
From: Mika Westerberg @ 2017-01-04 12:29 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Lukas Wunner, Bjorn Helgaas, Rafael J. Wysocki, Peter Wu,
	linux-pci, Alex Deucher, Dave Airlie

On Wed, Jan 04, 2017 at 11:33:16AM +0100, Kilian Singer wrote:
> Dear all,
> 
> the weird thing is also that when calling:
> 
> echo 0 > /sys/bus/pci/devices/0000:01:00.0/d3cold_allowed
> or
> echo on > /sys/bus/pci/devices/0000:01:00.0/power/control
> 
> on a command line then the command line crashes. The system stays responsive but still
> becomes unresponsive when I lock the screen.

Most probably because the device has already been runtime suspended.
Setting them from command line may be too late.

> Maybe a similar thing is happening in the kernel. Maybe one could use the above fact
> and a timeout to test for the problem and then take measures to deactivate the pm.
> 
> I hope this gives you some more clues.

You could try to run 'lspci -vv' once the machine is unresponsive (if
you can do anything anymore). That should show whether the device and
the root port are still in D3.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* RE: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04  8:16                                             ` Lukas Wunner
  2017-01-04 10:33                                               ` Kilian Singer
@ 2017-01-04 15:50                                               ` Deucher, Alexander
  2017-01-04 21:09                                               ` Peter Wu
  2017-01-04 21:55                                               ` Rafael J. Wysocki
  3 siblings, 0 replies; 115+ messages in thread
From: Deucher, Alexander @ 2017-01-04 15:50 UTC (permalink / raw)
  To: 'Lukas Wunner', Bjorn Helgaas
  Cc: Rafael J. Wysocki, Peter Wu, Mika Westerberg, Kilian Singer,
	linux-pci, Dave Airlie

> -----Original Message-----
> From: Lukas Wunner [mailto:lukas@wunner.de]
> Sent: Wednesday, January 04, 2017 3:17 AM
> To: Bjorn Helgaas
> Cc: Rafael J. Wysocki; Peter Wu; Mika Westerberg; Kilian Singer; linux-pci;
> Deucher, Alexander; Dave Airlie
> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
>
> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > I don't *want* to apply the revert.  It's on my for-linus branch as a
> > worst-case scenario change if we can't figure out a better fix.
> >
> > The patch below is preferable, but I'd rather not take even it,
> > because it takes away functionality and forces people to use a boot
> > parameter to restore it.  I expect that somebody will figure out how
> > to fix the regression Kilian found and also keep the new functionality
> > (without requiring boot parameters) before v4.10.
>
> The issue is constrained to hybrid graphics laptops with Nvidia discrete
> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> PCI core.
>
> (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> when and how to call an ACPI method versus using PR3.)
>
> (Neither are laptops using the Nvidia proprietary driver as it doesn't
> runtime suspend the card.  But battery life will be terrible then.)
>
> We're at rc2 so the time frame for coming up with a fix is probably
> 4 weeks.  Peter and others have tried for months to reverse-engineer
> how to handle runtime PM on newer Nvidia cards.  It seems likely that
> we'll not find the ultimate solution to the problem within 4 weeks.
>
> The way it is now, i.e. defaulting to PR3 when available, regresses
> certain laptops such as Kilian's.  If on the other hand we default to
> DSM when available, we'll regress certain other laptops, as Peter has
> pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> approach either, ideally we'd want to use PR3 as Windows does.
>
> As said, the only short-term solution I see is to add an "optimus"
> module_param to nouveau to allow users to select which method to use.
> So in Kilian's case an additional command line parameter would be
> necessary to fix the issue.
>
> Does anyone see a better solution or can we agree on this one?  If so
> I can come up with a patch.  This could go in via Dave Airlie's tree.

I think an option may be useful for testing, but I think the best solution
is probably a quirk for Kilian's system unless there are a lot of users
having similar problems to Kilian.  PR3 standardizes dGPU power control so
things should get better across the board.

Alex

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04  8:16                                             ` Lukas Wunner
  2017-01-04 10:33                                               ` Kilian Singer
  2017-01-04 15:50                                               ` Deucher, Alexander
@ 2017-01-04 21:09                                               ` Peter Wu
  2017-01-04 21:58                                                 ` Rafael J. Wysocki
  2017-01-05 14:42                                                 ` Lukas Wunner
  2017-01-04 21:55                                               ` Rafael J. Wysocki
  3 siblings, 2 replies; 115+ messages in thread
From: Peter Wu @ 2017-01-04 21:09 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > I don't *want* to apply the revert.  It's on my for-linus branch as a
> > worst-case scenario change if we can't figure out a better fix.
> > 
> > The patch below is preferable, but I'd rather not take even it,
> > because it takes away functionality and forces people to use a boot
> > parameter to restore it.  I expect that somebody will figure out how
> > to fix the regression Kilian found and also keep the new functionality
> > (without requiring boot parameters) before v4.10.
> 
> The issue is constrained to hybrid graphics laptops with Nvidia discrete
> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> PCI core.

The problem is not necessarily in the nouveau driver, the same problem
occurs when you enable RPM without loading nouveau. The issue is limited
though to some newer hybrid graphics laptops with Nvidia GPUs. While a
quirk can be added to nouveau, I think that a (temporary) quirk in core
would also be reasonable (since it also occurs without nouveau).

> (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> when and how to call an ACPI method versus using PR3.)
> 
> (Neither are laptops using the Nvidia proprietary driver as it doesn't
> runtime suspend the card.  But battery life will be terrible then.)
> 
> We're at rc2 so the time frame for coming up with a fix is probably
> 4 weeks.  Peter and others have tried for months to reverse-engineer
> how to handle runtime PM on newer Nvidia cards.  It seems likely that
> we'll not find the ultimate solution to the problem within 4 weeks.

Yep, a quick proper fix seems unlikely.
[ Help/ideas are welcome, I suspect that these failures to restore power
on laptops designed for Win8+ all have the same cause, related to some
unknown interaction between ACPI and PCI. Some links:
https://bugzilla.kernel.org/show_bug.cgi?id=190861
https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]

> The way it is now, i.e. defaulting to PR3 when available, regresses
> certain laptops such as Kilian's.  If on the other hand we default to
> DSM when available, we'll regress certain other laptops, as Peter has
> pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> approach either, ideally we'd want to use PR3 as Windows does.
> 
> As said, the only short-term solution I see is to add an "optimus"
> module_param to nouveau to allow users to select which method to use.
> So in Kilian's case an additional command line parameter would be
> necessary to fix the issue.
> 
> Does anyone see a better solution or can we agree on this one?  If so
> I can come up with a patch.  This could go in via Dave Airlie's tree.

As pcie_port_pm=off already reverts to DSM, I do not think that an
additional (temporary) nouveau module parameter is going to help. I
instead propose a (hopefully temporary) quirk in pci core that disables
D3cold RPM for just Kilian's Lenovo laptop (basically defaulting to
pcie_port_pm=off). Then the option pcie_port_pm=force can still be used
to test possible solutions in the future.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04  8:16                                             ` Lukas Wunner
                                                                 ` (2 preceding siblings ...)
  2017-01-04 21:09                                               ` Peter Wu
@ 2017-01-04 21:55                                               ` Rafael J. Wysocki
  3 siblings, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-04 21:55 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Peter Wu, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Wednesday, January 04, 2017 09:16:39 AM Lukas Wunner wrote:
> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > I don't *want* to apply the revert.  It's on my for-linus branch as a
> > worst-case scenario change if we can't figure out a better fix.
> > 
> > The patch below is preferable, but I'd rather not take even it,
> > because it takes away functionality and forces people to use a boot
> > parameter to restore it.  I expect that somebody will figure out how
> > to fix the regression Kilian found and also keep the new functionality
> > (without requiring boot parameters) before v4.10.
> 
> The issue is constrained to hybrid graphics laptops with Nvidia discrete
> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> PCI core.

But it may boil down to the fact that on some systems some ACPI power
resources are not usable to us.  They shouldn't be used for power management
at all then and I'm not sure whether or not addressing that in nouveau alone is
entirely viable.

> (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> when and how to call an ACPI method versus using PR3.)
> 
> (Neither are laptops using the Nvidia proprietary driver as it doesn't
> runtime suspend the card.  But battery life will be terrible then.)
> 
> We're at rc2 so the time frame for coming up with a fix is probably
> 4 weeks.  Peter and others have tried for months to reverse-engineer
> how to handle runtime PM on newer Nvidia cards.  It seems likely that
> we'll not find the ultimate solution to the problem within 4 weeks.
> 
> The way it is now, i.e. defaulting to PR3 when available, regresses
> certain laptops such as Kilian's.  If on the other hand we default to
> DSM when available, we'll regress certain other laptops, as Peter has
> pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> approach either, ideally we'd want to use PR3 as Windows does.
> 
> As said, the only short-term solution I see is to add an "optimus"
> module_param to nouveau to allow users to select which method to use.
> So in Kilian's case an additional command line parameter would be
> necessary to fix the issue.

There is a command line arg he can use for that already, so adding just
another one for the same purpose doesn't look like a great improvement to me.

> Does anyone see a better solution or can we agree on this one?  If so
> I can come up with a patch.  This could go in via Dave Airlie's tree.

We basically need to quirk ACPI power resources on those systems somehow.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04 21:09                                               ` Peter Wu
@ 2017-01-04 21:58                                                 ` Rafael J. Wysocki
  2017-01-04 23:21                                                   ` David Airlie
  2017-01-05 10:49                                                   ` Mika Westerberg
  2017-01-05 14:42                                                 ` Lukas Wunner
  1 sibling, 2 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-04 21:58 UTC (permalink / raw)
  To: Peter Wu
  Cc: Lukas Wunner, Bjorn Helgaas, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > > I don't *want* to apply the revert.  It's on my for-linus branch as a
> > > worst-case scenario change if we can't figure out a better fix.
> > > 
> > > The patch below is preferable, but I'd rather not take even it,
> > > because it takes away functionality and forces people to use a boot
> > > parameter to restore it.  I expect that somebody will figure out how
> > > to fix the regression Kilian found and also keep the new functionality
> > > (without requiring boot parameters) before v4.10.
> > 
> > The issue is constrained to hybrid graphics laptops with Nvidia discrete
> > GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> > PCI core.
> 
> The problem is not necessarily in the nouveau driver, the same problem
> occurs when you enable RPM without loading nouveau. The issue is limited
> though to some newer hybrid graphics laptops with Nvidia GPUs. While a
> quirk can be added to nouveau, I think that a (temporary) quirk in core
> would also be reasonable (since it also occurs without nouveau).
> 
> > (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> > when and how to call an ACPI method versus using PR3.)
> > 
> > (Neither are laptops using the Nvidia proprietary driver as it doesn't
> > runtime suspend the card.  But battery life will be terrible then.)
> > 
> > We're at rc2 so the time frame for coming up with a fix is probably
> > 4 weeks.  Peter and others have tried for months to reverse-engineer
> > how to handle runtime PM on newer Nvidia cards.  It seems likely that
> > we'll not find the ultimate solution to the problem within 4 weeks.
> 
> Yep, a quick proper fix seems unlikely.
> [ Help/ideas are welcome, I suspect that these failures to restore power
> on laptops designed for Win8+ all have the same cause, related to some
> unknown interaction between ACPI and PCI. Some links:
> https://bugzilla.kernel.org/show_bug.cgi?id=190861
> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> 
> > The way it is now, i.e. defaulting to PR3 when available, regresses
> > certain laptops such as Kilian's.  If on the other hand we default to
> > DSM when available, we'll regress certain other laptops, as Peter has
> > pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> > approach either, ideally we'd want to use PR3 as Windows does.
> > 
> > As said, the only short-term solution I see is to add an "optimus"
> > module_param to nouveau to allow users to select which method to use.
> > So in Kilian's case an additional command line parameter would be
> > necessary to fix the issue.
> > 
> > Does anyone see a better solution or can we agree on this one?  If so
> > I can come up with a patch.  This could go in via Dave Airlie's tree.
> 
> As pcie_port_pm=off already reverts to DSM, I do not think that an
> additional (temporary) nouveau module parameter is going to help. I
> instead propose a (hopefully temporary) quirk in pci core that disables
> D3cold RPM for just Kilian's Lenovo laptop (basically defaulting to
> pcie_port_pm=off). Then the option pcie_port_pm=force can still be used
> to test possible solutions in the future.

I would rather add a quirk to the ACPI core to prevent the power resources in
question from being enumerated.  Or even to prevent ACPI PM from being
used for the port in question.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04 21:58                                                 ` Rafael J. Wysocki
@ 2017-01-04 23:21                                                   ` David Airlie
  2017-01-05 15:06                                                     ` Lukas Wunner
  2017-01-05 10:49                                                   ` Mika Westerberg
  1 sibling, 1 reply; 115+ messages in thread
From: David Airlie @ 2017-01-04 23:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Wu, Lukas Wunner, Bjorn Helgaas, Mika Westerberg,
	Kilian Singer, linux-pci, Alex Deucher


> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > > > I don't *want* to apply the revert.  It's on my for-linus branch as a
> > > > worst-case scenario change if we can't figure out a better fix.
> > > > 
> > > > The patch below is preferable, but I'd rather not take even it,
> > > > because it takes away functionality and forces people to use a boot
> > > > parameter to restore it.  I expect that somebody will figure out how
> > > > to fix the regression Kilian found and also keep the new functionality
> > > > (without requiring boot parameters) before v4.10.
> > > 
> > > The issue is constrained to hybrid graphics laptops with Nvidia discrete
> > > GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> > > PCI core.
> > 
> > The problem is not necessarily in the nouveau driver, the same problem
> > occurs when you enable RPM without loading nouveau. The issue is limited
> > though to some newer hybrid graphics laptops with Nvidia GPUs. While a
> > quirk can be added to nouveau, I think that a (temporary) quirk in core
> > would also be reasonable (since it also occurs without nouveau).
> > 
> > > (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> > > when and how to call an ACPI method versus using PR3.)
> > > 
> > > (Neither are laptops using the Nvidia proprietary driver as it doesn't
> > > runtime suspend the card.  But battery life will be terrible then.)
> > > 
> > > We're at rc2 so the time frame for coming up with a fix is probably
> > > 4 weeks.  Peter and others have tried for months to reverse-engineer
> > > how to handle runtime PM on newer Nvidia cards.  It seems likely that
> > > we'll not find the ultimate solution to the problem within 4 weeks.
> > 
> > Yep, a quick proper fix seems unlikely.
> > [ Help/ideas are welcome, I suspect that these failures to restore power
> > on laptops designed for Win8+ all have the same cause, related to some
> > unknown interaction between ACPI and PCI. Some links:
> > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > 
> > > The way it is now, i.e. defaulting to PR3 when available, regresses
> > > certain laptops such as Kilian's.  If on the other hand we default to
> > > DSM when available, we'll regress certain other laptops, as Peter has
> > > pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> > > approach either, ideally we'd want to use PR3 as Windows does.
> > > 
> > > As said, the only short-term solution I see is to add an "optimus"
> > > module_param to nouveau to allow users to select which method to use.
> > > So in Kilian's case an additional command line parameter would be
> > > necessary to fix the issue.
> > > 
> > > Does anyone see a better solution or can we agree on this one?  If so
> > > I can come up with a patch.  This could go in via Dave Airlie's tree.
> > 
> > As pcie_port_pm=off already reverts to DSM, I do not think that an
> > additional (temporary) nouveau module parameter is going to help. I
> > instead propose a (hopefully temporary) quirk in pci core that disables
> > D3cold RPM for just Kilian's Lenovo laptop (basically defaulting to
> > pcie_port_pm=off). Then the option pcie_port_pm=force can still be used
> > to test possible solutions in the future.
> 
> I would rather add a quirk to the ACPI core to prevent the power resources in
> question from being enumerated.  Or even to prevent ACPI PM from being
> used for the port in question.

I do have a W541 in a cupboard in the office somewhere, but I won't be close to
it for a couple of weeks. The W541 was the first place I tested the pm patches
so I'm kinda wondering whether it's all W541's or just some specific model/bios
combo.

However I'm pretty much unavailable to do anything much until late Jan on this.

Dave.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04 21:58                                                 ` Rafael J. Wysocki
  2017-01-04 23:21                                                   ` David Airlie
@ 2017-01-05 10:49                                                   ` Mika Westerberg
  2017-01-05 14:19                                                     ` Rafael J. Wysocki
  2017-01-05 14:20                                                     ` Mika Westerberg
  1 sibling, 2 replies; 115+ messages in thread
From: Mika Westerberg @ 2017-01-05 10:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Wu, Lukas Wunner, Bjorn Helgaas, Kilian Singer, linux-pci,
	Alex Deucher, Dave Airlie

On Wed, Jan 04, 2017 at 10:58:10PM +0100, Rafael J. Wysocki wrote:
> I would rather add a quirk to the ACPI core to prevent the power resources in
> question from being enumerated.  Or even to prevent ACPI PM from being
> used for the port in question.

If we are going to add a quirk, I agree that it should be put to the
ACPI core.

However, Windows seems to be able to use _PR3 just fine. So there is
something that we are missing or do not implement properly which causes
all the troubles. IMHO we should try to find out what that difference is
and fix that if possible.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 10:49                                                   ` Mika Westerberg
@ 2017-01-05 14:19                                                     ` Rafael J. Wysocki
  2017-01-05 14:20                                                     ` Mika Westerberg
  1 sibling, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-05 14:19 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Peter Wu, Lukas Wunner, Bjorn Helgaas, Kilian Singer, linux-pci,
	Alex Deucher, Dave Airlie

On Thursday, January 05, 2017 12:49:40 PM Mika Westerberg wrote:
> On Wed, Jan 04, 2017 at 10:58:10PM +0100, Rafael J. Wysocki wrote:
> > I would rather add a quirk to the ACPI core to prevent the power resources in
> > question from being enumerated.  Or even to prevent ACPI PM from being
> > used for the port in question.
> 
> If we are going to add a quirk, I agree that it should be put to the
> ACPI core.
> 
> However, Windows seems to be able to use _PR3 just fine. So there is
> something that we are missing or do not implement properly which causes
> all the troubles. IMHO we should try to find out what that difference is
> and fix that if possible.

But we are time-constrained and that may take forever.

In the particular case of Kilian's system, the power resource used by
_PR0 and _PR3 for the port actually operates the same hardware registers
as _PS0 and _PS3 for the VID device under the port (if I remember the name
correctly), so there is something in Windows switching between the two
depending on something.

I guess that depends on the version of Windows, but that's pure speculation.

We don't know what that is and have little hope of learning about that, so let's
just say that this power resource is fishy and don't use it until we find out
(which frankly may or may not happen).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 10:49                                                   ` Mika Westerberg
  2017-01-05 14:19                                                     ` Rafael J. Wysocki
@ 2017-01-05 14:20                                                     ` Mika Westerberg
  2017-01-05 14:23                                                       ` Rafael J. Wysocki
  1 sibling, 1 reply; 115+ messages in thread
From: Mika Westerberg @ 2017-01-05 14:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Wu, Lukas Wunner, Bjorn Helgaas, Kilian Singer, linux-pci,
	Alex Deucher, Dave Airlie

On Thu, Jan 05, 2017 at 12:49:40PM +0200, Mika Westerberg wrote:
> On Wed, Jan 04, 2017 at 10:58:10PM +0100, Rafael J. Wysocki wrote:
> > I would rather add a quirk to the ACPI core to prevent the power resources in
> > question from being enumerated.  Or even to prevent ACPI PM from being
> > used for the port in question.
> 
> If we are going to add a quirk, I agree that it should be put to the
> ACPI core.
> 
> However, Windows seems to be able to use _PR3 just fine. So there is
> something that we are missing or do not implement properly which causes
> all the troubles. IMHO we should try to find out what that difference is
> and fix that if possible.

Here is one idea. The _OSC method is used as a handshake between OS and
the BIOS to enable/disable certain features. One of those features is
_PR3 support (ACPI specification 6.1 p.328):

  This bit is set if OSPM supports reading _PR3 and using power
  resources to switch power. Note this handshake translates to an
  operating model that the platform and OSPM supports both the power
  model containing both D3hot and D3.

Some of our development platforms have a BIOS option "RTD3 enable" which
is used to enable/disable this flag (among other things). The BIOS
should return acked caps when _OSC returns but we never check those in
Linux.

Kilian, can you try the below patch and send back dmesg when the system
has been booted? It should show if the BIOS acks _PR3 or not.

diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 95855cb9d6fb..463eb2d69271 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -345,8 +345,15 @@ static void acpi_bus_osc_support(void)
 		capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_APEI_SUPPORT;
 	if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
 		return;
+
+	acpi_handle_info(handle, "Supported caps: 0x%08x\n", capbuf[1]);
+
 	if (ACPI_SUCCESS(acpi_run_osc(handle, &context))) {
 		u32 *capbuf_ret = context.ret.pointer;
+
+		acpi_handle_info(handle, "Acked caps: 0x%08x (_PR3: %s)\n", capbuf_ret[1],
+				 capbuf_ret[1] & OSC_SB_PR3_SUPPORT ? "on" : "off");
+
 		if (context.ret.length > OSC_SUPPORT_DWORD) {
 			osc_sb_apei_support_acked =
 				capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT;
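
For reference, the check that the debug patch above prints can be
sketched in plain C outside the kernel.  This is only an illustration:
the constant values are assumed to match include/linux/acpi.h, and
osc_pr3_acked() is a hypothetical helper, not an existing kernel
function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror include/linux/acpi.h; verify against your tree. */
#define OSC_SUPPORT_DWORD   1           /* index of the support DWORD */
#define OSC_SB_PR3_SUPPORT  0x00000004u /* _PR3 support bit */

/*
 * Hypothetical helper: returns true if the capability buffer returned
 * by _OSC is long enough to hold the support DWORD and the firmware
 * left the _PR3 bit set in it.
 */
static bool osc_pr3_acked(const uint32_t *capbuf_ret, size_t n_dwords)
{
	if (n_dwords <= OSC_SUPPORT_DWORD)
		return false;

	return (capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_PR3_SUPPORT) != 0;
}
```

If the firmware clears the bit in the returned buffer, the helper
reports that _PR3 was not acknowledged, which is the case the dmesg
line is meant to expose.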

^ permalink raw reply related	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 14:20                                                     ` Mika Westerberg
@ 2017-01-05 14:23                                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-05 14:23 UTC (permalink / raw)
  To: Mika Westerberg
  Cc: Peter Wu, Lukas Wunner, Bjorn Helgaas, Kilian Singer, linux-pci,
	Alex Deucher, Dave Airlie

On Thursday, January 05, 2017 04:20:29 PM Mika Westerberg wrote:
> On Thu, Jan 05, 2017 at 12:49:40PM +0200, Mika Westerberg wrote:
> > On Wed, Jan 04, 2017 at 10:58:10PM +0100, Rafael J. Wysocki wrote:
> > > I would rather add a quirk to the ACPI core to prevent the power resources in
> > > question from being enumerated.  Or even to prevent ACPI PM from being
> > > used for the port in question.
> > 
> > If we are going to add a quirk, I agree that it should be put to the
> > ACPI core.
> > 
> > However, Windows seems to be able to use _PR3 just fine. So there is
> > something that we are missing or do not implement properly which causes
> > all the troubles. IMHO we should try to find out what that difference is
> > and fix that if possible.
> 
> Here is one idea. The _OSC method is used as a handshake between OS and
> the BIOS to enable/disable certain features. One of those features is
> _PR3 support (ACPI specification 6.1 p.328):
> 
>   This bit is set if OSPM supports reading _PR3 and using power
>   resources to switch power. Note this handshake translates to an
>   operating model that the platform and OSPM supports both the power
>   model containing both D3hot and D3.

Ah, good catch!

I forgot about this one and we definitely should do this handshake.

> Some of our development platforms have a BIOS option "RTD3 enable" which
> is used to enable/disable this flag (among other things). The BIOS
> should return acked caps when _OSC returns but we never check those in
> Linux.
> 
> Kilian, can you try the below patch and send back dmesg when the system
> has been booted? It should show if the BIOS acks _PR3 or not.
> 
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index 95855cb9d6fb..463eb2d69271 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -345,8 +345,15 @@ static void acpi_bus_osc_support(void)
>  		capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_APEI_SUPPORT;
>  	if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
>  		return;
> +
> +	acpi_handle_info(handle, "Supported caps: 0x%08x\n", capbuf[1]);
> +
>  	if (ACPI_SUCCESS(acpi_run_osc(handle, &context))) {
>  		u32 *capbuf_ret = context.ret.pointer;
> +
> +		acpi_handle_info(handle, "Acked caps: 0x%08x (_PR3: %s)\n", capbuf_ret[1],
> +				 capbuf_ret[1] & OSC_SB_PR3_SUPPORT ? "on" : "off");
> +
>  		if (context.ret.length > OSC_SUPPORT_DWORD) {
>  			osc_sb_apei_support_acked =
>  				capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT;
> --

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04 21:09                                               ` Peter Wu
  2017-01-04 21:58                                                 ` Rafael J. Wysocki
@ 2017-01-05 14:42                                                 ` Lukas Wunner
  2017-01-06  1:21                                                   ` Rafael J. Wysocki
  2017-01-07 11:35                                                   ` Peter Wu
  1 sibling, 2 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-05 14:42 UTC (permalink / raw)
  To: Peter Wu
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Wed, Jan 04, 2017 at 10:09:54PM +0100, Peter Wu wrote:
> [ Help/ideas are welcome, I suspect that these failures to restore power
> on laptops designed for Win8+ all have the same cause, related to some
> unknown interaction between ACPI and PCI. Some links:
> https://bugzilla.kernel.org/show_bug.cgi?id=190861
> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]

Looking at Kilian's acpidump again I notice that the methods to power
the GPU on or off (GPON / GPOF) are called from two places:

- From the _PS0 and _PS3 methods of the GPU and
- from the _PR3 power resource of the root port above the GPU.

In the former case they're called for pre Windows 2013 or if VDAD is true.
In the latter case they're called unconditionally but GPOF becomes a no-op
in the pre Windows 2013 case.

This means that GPOF would be executed *twice* on Windows 2013+ if VDAD
is true.  I could imagine this to cause issues.
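
To make the double call concrete, here is a minimal model of the two
call sites as I read them.  It is a sketch under assumptions:
gpof_executions() is a made-up name, it assumes that both _PS3 of the
GPU and the _PR3 power resource run during one power-off, and it
ignores any further conditions inside GPOF itself.

```c
#include <stdbool.h>

/*
 * Hypothetical model, not derived line by line from the DSDT.
 * Returns how many times GPOF would actually do work during one
 * power-off sequence.
 */
static int gpof_executions(bool win2013_or_later, bool vdad)
{
	int count = 0;

	/* _PS3 of the GPU: calls GPOF pre Windows 2013 or if VDAD is true. */
	if (!win2013_or_later || vdad)
		count++;

	/*
	 * _PR3 power resource of the root port: calls GPOF
	 * unconditionally, but GPOF is a no-op pre Windows 2013.
	 */
	if (win2013_or_later)
		count++;

	return count;
}
```

Only the combination "Windows 2013+ and VDAD true" yields two
executions in this model; every other combination yields one.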

VDAD is at 0x7CE7D018 + 0xEE2 + 6. It's not set in the DSDT.

@Kilian, what do you get if you execute this as root:

dd iflag=skip_bytes,count_bytes skip=$((0x7CE7D018 + 0xEE2 + 6)) count=1 \
  if=/dev/mem 2>/dev/null | hexdump
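
The skip value is just the sum of the three terms given above.  A
minimal sketch of the arithmetic, with macro names made up purely for
illustration (I'm only assuming the terms are a base address plus two
offsets):

```c
#include <stdint.h>

/* The three terms from the expression above; names are hypothetical. */
#define VDAD_TERM_BASE    0x7CE7D018ull
#define VDAD_TERM_OFFSET  0xEE2ull
#define VDAD_TERM_EXTRA   6ull

/* Physical address of the single byte the dd command reads. */
static uint64_t vdad_phys_addr(void)
{
	/* 0x7CE7D018 + 0xEE2 = 0x7CE7DEFA, plus 6 = 0x7CE7DF00 */
	return VDAD_TERM_BASE + VDAD_TERM_OFFSET + VDAD_TERM_EXTRA;
}
```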


Another oddity I've noticed is that when calling the Optimus DSM with
the capabilities function number (0x1A, NOUVEAU_DSM_OPTIMUS_CAPS) and
a special argument, it's possible to influence the behaviour of GPOF
(the method to power the GPU off):  GPOF is a no-op unless it's running
on Windows 2013+ or OMPR has value 0x03.  Initially OMPR has value 0x02,
but by setting bits 18 and 19 in the argument given to the capabilities
function, it can be set to 0x3.  After GPOF has finished, OMPR reverts
back to 0x02.  This means that pre Windows 2013, GPOF only has any effect
if the DSM capabilities function is called with an appropriate argument.
The same functionality can be seen in the Clevo P651RA ssdt3/7.  What
confuses me is that the bits are at position 18 and 19, but in
nouveau_switcheroo_optimus_dsm() we're setting bits 0 and 1 as well as
bits 24 and 25?  This may be a dumb question, I'm not familiar with
Optimus, only Macs.
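
To illustrate what confuses me, here are the two sets of bits written
out as masks.  This is a sketch only: the macro and function names are
made up, and the bit positions are simply the ones mentioned above, not
values copied from the nouveau source.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Bits 18 and 19: the ones that appear to switch OMPR to 0x3. */
#define DSM_ARG_OMPR_BITS     (BIT(18) | BIT(19))

/* Bits 0, 1, 24 and 25: the ones mentioned for nouveau above. */
#define DSM_ARG_NOUVEAU_BITS  (BIT(0) | BIT(1) | BIT(24) | BIT(25))

/* Non-zero only if the two sets share at least one bit. */
static uint32_t dsm_arg_overlap(void)
{
	return DSM_ARG_OMPR_BITS & DSM_ARG_NOUVEAU_BITS;
}
```

The two masks do not share any bit, so an argument built only from the
second set would leave bits 18 and 19 clear.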


Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-04 23:21                                                   ` David Airlie
@ 2017-01-05 15:06                                                     ` Lukas Wunner
  2017-01-05 18:13                                                       ` Peter Jones
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-05 15:06 UTC (permalink / raw)
  To: David Airlie
  Cc: Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	Kilian Singer, linux-pci, Alex Deucher, Hans de Goede,
	Peter Jones

On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > > > > I don't *want* to apply the revert.  It's on my for-linus branch as a
> > > > > worst-case scenario change if we can't figure out a better fix.
> > > > > 
> > > > > The patch below is preferable, but I'd rather not take even it,
> > > > > because it takes away functionality and forces people to use a boot
> > > > > parameter to restore it.  I expect that somebody will figure out how
> > > > > to fix the regression Kilian found and also keep the new functionality
> > > > > (without requiring boot parameters) before v4.10.
> > > > 
> > > > The issue is constrained to hybrid graphics laptops with Nvidia discrete
> > > > GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> > > > PCI core.
> > > 
> > > The problem is not necessarily in the nouveau driver, the same problem
> > > occurs when you enable RPM without loading nouveau. The issue is limited
> > > though to some newer hybrid graphics laptops with Nvidia GPUs. While a
> > > quirk can be added to nouveau, I think that a (temporary) quirk in core
> > > would also be reasonable (since it also occurs without nouveau).
> > > 
> > > > (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> > > > when and how to call an ACPI method versus using PR3.)
> > > > 
> > > > (Neither are laptops using the Nvidia proprietary driver as it doesn't
> > > > runtime suspend the card.  But battery life will be terrible then.)
> > > > 
> > > > We're at rc2 so the time frame for coming up with a fix is probably
> > > > 4 weeks.  Peter and others have tried for months to reverse-engineer
> > > > how to handle runtime PM on newer Nvidia cards.  It seems likely that
> > > > we'll not find the ultimate solution to the problem within 4 weeks.
> > > 
> > > Yep, a quick proper fix seems unlikely.
> > > [ Help/ideas are welcome, I suspect that these failures to restore power
> > > on laptops designed for Win8+ all have the same cause, related to some
> > > unknown interaction between ACPI and PCI. Some links:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > 
> > > > The way it is now, i.e. defaulting to PR3 when available, regresses
> > > > certain laptops such as Kilian's.  If on the other hand we default to
> > > > DSM when available, we'll regress certain other laptops, as Peter has
> > > > pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> > > > approach either, ideally we'd want to use PR3 as Windows does.
> > > > 
> > > > As said, the only short-term solution I see is to add an "optimus"
> > > > module_param to nouveau to allow users to select which method to use.
> > > > So in Kilian's case an additional command line parameter would be
> > > > necessary to fix the issue.
> > > > 
> > > > Does anyone see a better solution or can we agree on this one?  If so
> > > > I can come up with a patch.  This could go in via Dave Airlie's tree.
> > > 
> > > As pcie_port_pm=off already reverts to DSM, I do not think that an
> > > additional (temporary) nouveau module parameter is going to help. I
> > > instead propose a (hopefully temporary) quirk in pci core that disables
> > > D3cold RPM for just Kilians Lenovo laptop (basically defaulting to
> > > pcie_port_pm=off). Then the option pcie_port_pm=force can still be used
> > > to test possible solutions in the future.
> > 
> > I would rather add a quirk to the ACPI core to prevent the power resources in
> > question from being enumerated.  Or even to prevent ACPI PM from being
> > used for the port in question.
> 
> I do have a W541 in a cupboard in the office somewhere, but I won't be close to
> it for a couple of weeks. The W541 was the first place I tested the pm patches
> so I'm kinda wondering whether it's all W541's or just some specific model/bios
> combo.
> 
> However I'm pretty much unavailable to do anything much until late Jan on this.

Is there anyone else at Red Hat who might be able to look into this?

ISTR that Hans de Goede is working on improving laptop support in Fedora,
and Peter Jones recently got a patch merged for the W541 with the exact
same firmware Kilian is using to work around a botched EFI memory map.
Adding them to cc: in the hope that they may be able to help.

@Peter, have you noticed issues with the discrete Nvidia GPU on your W541
related to runtime suspend and system sleep?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 15:06                                                     ` Lukas Wunner
@ 2017-01-05 18:13                                                       ` Peter Jones
  2017-01-05 19:36                                                         ` David Airlie
  2017-01-07 11:45                                                       ` Hans de Goede
  2017-01-11 11:04                                                       ` Hans de Goede
  2 siblings, 1 reply; 115+ messages in thread
From: Peter Jones @ 2017-01-05 18:13 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David Airlie, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher,
	Hans de Goede

On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> > > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> > > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> > > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > > > > > I don't *want* to apply the revert.  It's on my for-linus branch as a
> > > > > > worst-case scenario change if we can't figure out a better fix.
> > > > > > 
> > > > > > The patch below is preferable, but I'd rather not take even it,
> > > > > > because it takes away functionality and forces people to use a boot
> > > > > > parameter to restore it.  I expect that somebody will figure out how
> > > > > > to fix the regression Kilian found and also keep the new functionality
> > > > > > (without requiring boot parameters) before v4.10.
> > > > > 
> > > > > The issue is constrained to hybrid graphics laptops with Nvidia discrete
> > > > > GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
> > > > > PCI core.
> > > > 
> > > > The problem is not necessarily in the nouveau driver, the same problem
> > > > occurs when you enable RPM without loading nouveau. The issue is limited
> > > > though to some newer hybrid graphics laptops with Nvidia GPUs. While a
> > > > quirk can be added to nouveau, I think that a (temporary) quirk in core
> > > > would also be reasonable (since it also occurs without nouveau).
> > > > 
> > > > > (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
> > > > > when and how to call an ACPI method versus using PR3.)
> > > > > 
> > > > > (Neither are laptops using the Nvidia proprietary driver as it doesn't
> > > > > runtime suspend the card.  But battery life will be terrible then.)
> > > > > 
> > > > > We're at rc2 so the time frame for coming up with a fix is probably
> > > > > 4 weeks.  Peter and others have tried for months to reverse-engineer
> > > > > how to handle runtime PM on newer Nvidia cards.  It seems likely that
> > > > > we'll not find the ultimate solution to the problem within 4 weeks.
> > > > 
> > > > Yep, a quick proper fix seems unlikely.
> > > > [ Help/ideas are welcome, I suspect that these failures to restore power
> > > > on laptops designed for Win8+ all have the same cause, related to some
> > > > unknown interaction between ACPI and PCI. Some links:
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > 
> > > > > The way it is now, i.e. defaulting to PR3 when available, regresses
> > > > > certain laptops such as Kilian's.  If on the other hand we default to
> > > > > DSM when available, we'll regress certain other laptops, as Peter has
> > > > > pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
> > > > > approach either, ideally we'd want to use PR3 as Windows does.
> > > > > 
> > > > > As said, the only short-term solution I see is to add an "optimus"
> > > > > module_param to nouveau to allow users to select which method to use.
> > > > > So in Kilian's case an additional command line parameter would be
> > > > > necessary to fix the issue.
> > > > > 
> > > > > Does anyone see a better solution or can we agree on this one?  If so
> > > > > I can come up with a patch.  This could go in via Dave Airlie's tree.
> > > > 
> > > > As pcie_port_pm=off already reverts to DSM, I do not think that an
> > > > additional (temporary) nouveau module parameter is going to help. I
> > > > instead propose a (hopefully temporary) quirk in pci core that disables
> > > > D3cold RPM for just Kilians Lenovo laptop (basically defaulting to
> > > > pcie_port_pm=off). Then the option pcie_port_pm=force can still be used
> > > > to test possible solutions in the future.
> > > 
> > > I would rather add a quirk to the ACPI core to prevent the power resources in
> > > question from being enumerated.  Or even to prevent ACPI PM from being
> > > used for the port in question.
> > 
> > I do have a W541 in a cupboard in the office somewhere, but I won't be close to
> > it for a couple of weeks. The W541 was the first place I tested the pm patches
> > so I'm kinda wondering whether it's all W541's or just some specific model/bios
> > combo.

They seem to all ship with the 1.10 firmware, and 2.80 is current (there
are a bunch of intermediate 2.xx versions).  Somewhere along the line
they introduced some bugs in the UEFI stuff, so it wouldn't be
surprising if there are bugs introduced elsewhere as well.

> > However I'm pretty much unavailable to do anything much until late Jan on this.
> 
> Is there anyone else at Red Hat who might be able to look into this?
> 
> ISTR that Hans de Goede is working on improving laptop support in Fedora,
> and Peter Jones recently got a patch merged for the W541 with the exact
> same firmware Kilian is using to work around a botched EFI memory map.
> Adding them to cc: in the hope that they may be able to help.
> 
> @Peter, have you noticed issues with the discrete Nvidia GPU on your W541
> related to runtime suspend and system sleep?

I was using a borrowed one (I can certainly find it again, but I'm not
working on graphics/pm really), but yeah - shutdown and lspci both broke
sometime after pci_pm_runtime_resume().  Here's the traceback from
SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67

Dave, if you know who in Westford should have a look at this, I can see
about getting them hardware.  I am more or less surrounded by that team.

-- 
        Peter

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 18:13                                                       ` Peter Jones
@ 2017-01-05 19:36                                                         ` David Airlie
  2017-01-09 15:11                                                           ` Lyude Paul
  0 siblings, 1 reply; 115+ messages in thread
From: David Airlie @ 2017-01-05 19:36 UTC (permalink / raw)
  To: Peter Jones
  Cc: Lukas Wunner, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher,
	Hans de Goede, Lyude


(cc'ing Lyude, who has the hw also I think).

----- Original Message -----
> From: "Peter Jones" <pjones@redhat.com>
> To: "Lukas Wunner" <lukas@wunner.de>
> Cc: "David Airlie" <airlied@redhat.com>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
> "Bjorn Helgaas" <helgaas@kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "Kilian Singer"
> <kilian.singer@quantumtechnology.info>, "linux-pci" <linux-pci@vger.kernel.org>, "Alex Deucher"
> <alexander.deucher@amd.com>, "Hans de Goede" <hdegoede@redhat.com>
> Sent: Friday, 6 January, 2017 4:13:23 AM
> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> 
> On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
> > On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> > > > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> > > > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> > > > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > > > > > > I don't *want* to apply the revert.  It's on my for-linus branch
> > > > > > > as a
> > > > > > > worst-case scenario change if we can't figure out a better fix.
> > > > > > > 
> > > > > > > The patch below is preferable, but I'd rather not take even it,
> > > > > > > because it takes away functionality and forces people to use a
> > > > > > > boot
> > > > > > > parameter to restore it.  I expect that somebody will figure out
> > > > > > > how
> > > > > > > to fix the regression Kilian found and also keep the new
> > > > > > > functionality
> > > > > > > (without requiring boot parameters) before v4.10.
> > > > > > 
> > > > > > The issue is constrained to hybrid graphics laptops with Nvidia
> > > > > > discrete
> > > > > > GPU using nouveau.  Hence it needs to be fixed in nouveau, not in
> > > > > > the
> > > > > > PCI core.
> > > > > 
> > > > > The problem is not necessarily in the nouveau driver, the same
> > > > > problem
> > > > > occurs when you enable RPM without loading nouveau. The issue is
> > > > > limited
> > > > > though to some newer hybrid graphics laptops with Nvidia GPUs. While
> > > > > a
> > > > > quirk can be added to nouveau, I think that a (temporary) quirk in
> > > > > core
> > > > > would also be reasonable (since it also occurs without nouveau).
> > > > > 
> > > > > > (AFAIUI, laptops with AMD discrete GPU are not affected as it is
> > > > > > known
> > > > > > when and how to call an ACPI method versus using PR3.)
> > > > > > 
> > > > > > (Neither are laptops using the Nvidia proprietary driver as it
> > > > > > doesn't
> > > > > > runtime suspend the card.  But battery life will be terrible then.)
> > > > > > 
> > > > > > We're at rc2 so the time frame for coming up with a fix is probably
> > > > > > 4 weeks.  Peter and others have tried for months to
> > > > > > reverse-engineer
> > > > > > how to handle runtime PM on newer Nvidia cards.  It seems likely
> > > > > > that
> > > > > > we'll not find the ultimate solution to the problem within 4 weeks.
> > > > > 
> > > > > Yep, a quick proper fix seems unlikely.
> > > > > [ Help/ideas are welcome, I suspect that these failures to restore
> > > > > power
> > > > > on laptops designed for Win8+ all have the same cause, related to
> > > > > some
> > > > > unknown interaction between ACPI and PCI. Some links:
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > > 
> > > > > > The way it is now, i.e. defaulting to PR3 when available, regresses
> > > > > > certain laptops such as Kilian's.  If on the other hand we default
> > > > > > to
> > > > > > DSM when available, we'll regress certain other laptops, as Peter
> > > > > > has
> > > > > > pointed out.  Whitelisting or blacklisting laptops doesn't seem a
> > > > > > good
> > > > > > approach either, ideally we'd want to use PR3 as Windows does.
> > > > > > 
> > > > > > As said, the only short-term solution I see is to add an "optimus"
> > > > > > module_param to nouveau to allow users to select which method to
> > > > > > use.
> > > > > > So in Kilian's case an additional command line parameter would be
> > > > > > necessary to fix the issue.
> > > > > > 
> > > > > > Does anyone see a better solution or can we agree on this one?  If
> > > > > > so
> > > > > > I can come up with a patch.  This could go in via Dave Airlie's
> > > > > > tree.
> > > > > 
> > > > > As pcie_port_pm=off already reverts to DSM, I do not think that an
> > > > > additional (temporary) nouveau module parameter is going to help. I
> > > > > instead propose a (hopefully temporary) quirk in pci core that
> > > > > disables
> > > > > D3cold RPM for just Kilians Lenovo laptop (basically defaulting to
> > > > > pcie_port_pm=off). Then the option pcie_port_pm=force can still be
> > > > > used
> > > > > to test possible solutions in the future.
> > > > 
> > > > I would rather add a quirk to the ACPI core to prevent the power
> > > > resources in
> > > > question from being enumerated.  Or even to prevent ACPI PM from being
> > > > used for the port in question.
> > > 
> > > I do have a W541 in a cupboard in the office somewhere, but I won't be
> > > close to
> > > it for a couple of weeks. The W541 was the first place I tested the pm
> > > patches
> > > so I'm kinda wondering whether it's all W541's or just some specific
> > > model/bios
> > > combo.
> 
> They seem to all ship with the 1.10 firmware, and 2.80 is current (there
> are a bunch of intermediate 2.xx versions).  Somewhere along the line
> they introduced some bugs in the UEFI stuff, so it wouldn't be
> surprising if there's bugs introduced elsewhere as well.
> 
> > > However I'm pretty much unavailable to do anything much until late Jan on
> > > this.
> > 
> > Is there anyone else at Red Hat who might be able to look into this?
> > 
> > ISTR that Hans de Goede is working on improving laptop support in Fedora,
> > and Peter Jones recently got a patch merged for the W541 with the exact
> > same firmware Kilian is using to work around a botched EFI memory map.
> > Adding them to cc: in the hope that they may be able to help.
> > 
> > @Peter, have you noticed issues with the discrete Nvidia GPU on your W541
> > related to runtime suspend and system sleep?
> 
> I was using a borrowed one (I can certainly find it again, but I'm not
> working on graphics/pm really), but yeah - shutdown and lspci both broke
> sometime after pci_pm_runtime_resume().  Here's the traceback from
> SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
> 
> Dave, if you know who in Westford should have a look at this, I can see
> about getting them hardware.  I am more or less surrounded by that team.
> 
> --
>         Peter
> 

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 14:42                                                 ` Lukas Wunner
@ 2017-01-06  1:21                                                   ` Rafael J. Wysocki
  2017-01-07  6:50                                                     ` Mika Westerberg
  2017-01-07 11:35                                                   ` Peter Wu
  1 sibling, 1 reply; 115+ messages in thread
From: Rafael J. Wysocki @ 2017-01-06  1:21 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Peter Wu, Bjorn Helgaas, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Thursday, January 05, 2017 03:42:20 PM Lukas Wunner wrote:
> On Wed, Jan 04, 2017 at 10:09:54PM +0100, Peter Wu wrote:
> > [ Help/ideas are welcome, I suspect that these failures to restore power
> > on laptops designed for Win8+ all have the same cause, related to some
> > unknown interaction between ACPI and PCI. Some links:
> > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> 
> Looking at Kilian's acpidump again I notice that the methods to power
> the GPU on or off (GPON / GPOF) are called from two places:
> 
> - From the _PS0 and _PS3 methods of the GPU and
> - from the _PR3 power resource of the root port above the GPU.
> 
> In the former case they're called for pre Windows 2013 or if VDAD is true.
> In the latter case they're called unconditionally but GPOF becomes a no-op
> in the pre Windows 2013 case.
> 
> This means that GPOF would be executed *twice* on Windows 2013+ if VDAD
> is true.  I could imagine this to cause issues.

Right.  Exactly my observation (http://marc.info/?l=linux-pci&m=148362622326066&w=2).

So on (newer) Windows something is done in order to make it work in addition to
the _PR3 _OSC handshake.

So, I'd like to try to follow Mika's suggestion to use the response we get
from the _OSC handshake for \_SB and if that says "no _PR3", ignore power
resources for PCIe ports at least.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-06  1:21                                                   ` Rafael J. Wysocki
@ 2017-01-07  6:50                                                     ` Mika Westerberg
  0 siblings, 0 replies; 115+ messages in thread
From: Mika Westerberg @ 2017-01-07  6:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Lukas Wunner, Peter Wu, Bjorn Helgaas, Kilian Singer, linux-pci,
	Alex Deucher, Dave Airlie

On Fri, Jan 06, 2017 at 02:21:11AM +0100, Rafael J. Wysocki wrote:
> On Thursday, January 05, 2017 03:42:20 PM Lukas Wunner wrote:
> > On Wed, Jan 04, 2017 at 10:09:54PM +0100, Peter Wu wrote:
> > > [ Help/ideas are welcome, I suspect that these failures to restore power
> > > on laptops designed for Win8+ all have the same cause, related to some
> > > unknown interaction between ACPI and PCI. Some links:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > 
> > Looking at Kilian's acpidump again I notice that the methods to power
> > the GPU on or off (GPON / GPOF) are called from two places:
> > 
> > - From the _PS0 and _PS3 methods of the GPU and
> > - from the _PR3 power resource of the root port above the GPU.
> > 
> > In the former case they're called for pre Windows 2013 or if VDAD is true.
> > In the latter case they're called unconditionally but GPOF becomes a no-op
> > in the pre Windows 2013 case.
> > 
> > This means that GPOF would be executed *twice* on Windows 2013+ if VDAD
> > is true.  I could imagine this to cause issues.
> 
> Right.  Exactly my observation (http://marc.info/?l=linux-pci&m=148362622326066&w=2).
> 
> So on (newer) Windows something is done in order to make it work in addition to
> the _PR3 _OSC handshake.
> 
> So, I'd like to try to follow the Mika's suggestion to use the response we get
> from the _OSC handshake for \_SB and if that says "no _PR3", ignore power
> resources for PCIe ports at least.

Kilian sent the dmesg back to me and unfortunately the BIOS on that
machine acks use of _PR3:

[    0.699776] ACPI: Added _OSI(Module Device)
[    0.699777] ACPI: Added _OSI(Processor Device)
[    0.699777] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.699778] ACPI: Added _OSI(Processor Aggregator Device)
[    0.699783] ACPI : EC: EC started
[    0.700466] ACPI: \: Used as first EC
[    0.700467] ACPI: \: GPE=0x11, EC_CMD/EC_SC=0x66, EC_DATA=0x62
[    0.700468] ACPI: \: Used as boot ECDT EC to handle transactions
[    0.703905] ACPI: [Firmware Bug]: BIOS _OSI(Linux) query ignored
[    0.880246] ACPI: \_SB_: Supported caps: 0x0000009f
[    0.880278] ACPI: \_SB_: Acked caps: 0x0000009f (_PR3: on)

So ignoring _PR3 based on that will not solve the issue :-(
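
For reference, the "(_PR3: on)" in the last line is simply one bit of the
acked capabilities mask.  A minimal sketch of the decoding in plain C,
assuming _PR3 is bit 2 as in OSC_SB_PR3_SUPPORT from include/linux/acpi.h
(the macro and helper names below are made up for illustration and are not
kernel API):

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror OSC_SB_PR3_SUPPORT in include/linux/acpi.h. */
#define SB_OSC_PR3_BIT UINT32_C(0x00000004)

/* Hypothetical helper: true if the firmware acked _PR3 in its \_SB _OSC reply. */
static bool sb_osc_pr3_acked(uint32_t acked_caps)
{
	return (acked_caps & SB_OSC_PR3_BIT) != 0;
}

/* Hypothetical helper: true if the firmware cleared any capability we offered. */
static bool sb_osc_caps_masked(uint32_t supported, uint32_t acked)
{
	return (supported & ~acked) != 0;
}
```

With the values from the dmesg above (supported 0x9f, acked 0x9f) the first
helper returns true and the second returns false, i.e. the firmware masked
nothing out, so there is nothing in the _OSC reply to key a quirk on.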

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 14:42                                                 ` Lukas Wunner
  2017-01-06  1:21                                                   ` Rafael J. Wysocki
@ 2017-01-07 11:35                                                   ` Peter Wu
  2017-01-07 12:19                                                     ` Lukas Wunner
  1 sibling, 1 reply; 115+ messages in thread
From: Peter Wu @ 2017-01-07 11:35 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Thu, Jan 05, 2017 at 03:42:20PM +0100, Lukas Wunner wrote:
> On Wed, Jan 04, 2017 at 10:09:54PM +0100, Peter Wu wrote:
> > [ Help/ideas are welcome, I suspect that these failures to restore power
> > on laptops designed for Win8+ all have the same cause, related to some
> > unknown interaction between ACPI and PCI. Some links:
> > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> 
> Looking at Kilian's acpidump again I notice that the methods to power
> the GPU on or off (GPON / GPOF) are called from two places:
> 
> - From the _PS0 and _PS3 methods of the GPU and
> - from the _PR3 power resource of the root port above the GPU.
> 
> In the former case they're called for pre Windows 2013 or if VDAD is true.
> In the latter case they're called unconditionally but GPOF becomes a no-op
> in the pre Windows 2013 case.
> 
> This means that GPOF would be executed *twice* on Windows 2013+ if VDAD
> is true.  I could imagine this to cause issues.

There is a flag "DGOS" which is set when PGON/PGOF are called, so
multiple invocations should not matter for the powerdown/up sequence.
There are some SMI calls that might have side-effects though.

> VDAD is at 0x7CE7D018 + 0xEE2 + 6. It's not set in the DSDT.
> 
> @Kilian, what do you get if you execute this as root:
> 
> dd iflag=skip_bytes,count_bytes skip=$((0x7CE7D018 + 0xEE2 + 6)) count=1 \
>   if=/dev/mem 2>/dev/null | hexdump
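
(For anyone double-checking the skip= value in that dd invocation: the three
terms add up to 0x7CE7DF00.  A minimal sketch of the arithmetic in C; the
macro names, and the meaning each one suggests for its term, are only my
reading of the command and are not taken from the DSDT.)

```c
#include <stdint.h>

/* Terms copied from the dd command; what each one represents is an assumption. */
#define VDAD_REGION_BASE  UINT64_C(0x7CE7D018)
#define VDAD_BLOCK_OFFSET UINT64_C(0xEE2)
#define VDAD_BYTE_OFFSET  UINT64_C(6)

/* Hypothetical helper: physical address of the single byte that dd reads. */
static uint64_t vdad_phys_addr(void)
{
	return VDAD_REGION_BASE + VDAD_BLOCK_OFFSET + VDAD_BYTE_OFFSET;
}
```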
> 
> Another oddity I've noticed is that when calling the Optimus DSM with
> the capabilities function number (0x1A, NOUVEAU_DSM_OPTIMUS_CAPS) and
> a special argument, it's possible to influence the behaviour of GPOF
> (the method to power the GPU off):  GPOF is a no-op unless it's running
> on Windows 2013+ or OMPR has value 0x03.  Initially OMPR has value 0x02,
> but by setting bits 18 and 19 in the argument given to the capabilities
> function, it can be set to 0x3.  After GPOF has finished, OMPR reverts
> back to 0x02.  This means that pre Windows 2013, GPOF only has any effect
> if the DSM capabilities function is called with an appropriate argument.

Pre-Windows 2013 (Win8), the DSM method was used to regulate power.
Value 3 means that _PS3 should power down the dGPU.  Value 2 means that
the platform should not do that.

Starting from Win8, PR3 is supported so this is used instead of DSM.

> The same functionality can be seen in the Clevo P651RA ssdt3/7.  What
> confuses me is that the bits are at position 18 and 19, but in
> nouveau_switcheroo_optimus_dsm() we're setting bits 0 and 1 as well as
> bits 24 and 25?  This may be a dumb question, I'm not familiar with
> Optimus, only Macs.

nouveau_switcheroo_optimus_dsm calls two different functions:

 - NOUVEAU_DSM_OPTIMUS_CAPS (0x1A) with bits 25:24 set (value 3 << 24).
   This enables powering down in _PS3.
 - NOUVEAU_DSM_OPTIMUS_FLAGS (0x1B) with bits 1:0 set (value 3).
   This enables the "dGPU audio codec flag" via SMI.

When the old DSM method is in use, these functions are always invoked
before _PS3.
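
To make the bit positions concrete, here is a minimal sketch in C of how the
two arguments described above are built.  The function numbers and field
values are the ones from this mail; dsm_field() and the two wrapper functions
are hypothetical helpers for illustration and are not how nouveau spells it:

```c
#include <assert.h>
#include <stdint.h>

#define DSM_OPTIMUS_CAPS  0x1A	/* NOUVEAU_DSM_OPTIMUS_CAPS */
#define DSM_OPTIMUS_FLAGS 0x1B	/* NOUVEAU_DSM_OPTIMUS_FLAGS */

/* Hypothetical helper: place `value` into bits hi:lo of a 32-bit DSM argument. */
static uint32_t dsm_field(unsigned int hi, unsigned int lo, uint32_t value)
{
	assert(hi < 32 && lo <= hi);

	uint32_t width = hi - lo + 1;
	uint32_t mask = (width == 32) ? UINT32_MAX
				      : ((UINT32_C(1) << width) - 1);

	return (value & mask) << lo;
}

/* Argument for function 0x1A as described above: value 3 in bits 25:24. */
static uint32_t optimus_caps_arg(void)
{
	return dsm_field(25, 24, 3);
}

/* Argument for function 0x1B as described above: value 3 in bits 1:0. */
static uint32_t optimus_flags_arg(void)
{
	return dsm_field(1, 0, 3);
}
```

The first argument works out to 0x03000000 and the second to 0x00000003.
For comparison, setting bits 19:18 as in your question would be
dsm_field(19, 18, 3) = 0x000C0000, which is a different field from the 25:24
one.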
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 15:06                                                     ` Lukas Wunner
  2017-01-05 18:13                                                       ` Peter Jones
@ 2017-01-07 11:45                                                       ` Hans de Goede
  2017-01-07 12:16                                                         ` Lukas Wunner
  2017-01-09 23:00                                                         ` Peter Jones
  2017-01-11 11:04                                                       ` Hans de Goede
  2 siblings, 2 replies; 115+ messages in thread
From: Hans de Goede @ 2017-01-07 11:45 UTC (permalink / raw)
  To: Lukas Wunner, David Airlie
  Cc: Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	Kilian Singer, linux-pci, Alex Deucher, Peter Jones

Hi,

On 05-01-17 16:06, Lukas Wunner wrote:
> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
>>>>>> I don't *want* to apply the revert.  It's on my for-linus branch as a
>>>>>> worst-case scenario change if we can't figure out a better fix.
>>>>>>
>>>>>> The patch below is preferable, but I'd rather not take even it,
>>>>>> because it takes away functionality and forces people to use a boot
>>>>>> parameter to restore it.  I expect that somebody will figure out how
>>>>>> to fix the regression Kilian found and also keep the new functionality
>>>>>> (without requiring boot parameters) before v4.10.
>>>>>
>>>>> The issue is constrained to hybrid graphics laptops with Nvidia discrete
>>>>> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
>>>>> PCI core.
>>>>
>>>> The problem is not necessarily in the nouveau driver, the same problem
>>>> occurs when you enable RPM without loading nouveau. The issue is limited
>>>> though to some newer hybrid graphics laptops with Nvidia GPUs. While a
>>>> quirk can be added to nouveau, I think that a (temporary) quirk in core
>>>> would also be reasonable (since it also occurs without nouveau).
>>>>
>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
>>>>> when and how to call an ACPI method versus using PR3.)
>>>>>
>>>>> (Neither are laptops using the Nvidia proprietary driver as it doesn't
>>>>> runtime suspend the card.  But battery life will be terrible then.)
>>>>>
>>>>> We're at rc2 so the time frame for coming up with a fix is probably
>>>>> 4 weeks.  Peter and others have tried for months to reverse-engineer
>>>>> how to handle runtime PM on newer Nvidia cards.  It seems likely that
>>>>> we'll not find the ultimate solution to the problem within 4 weeks.
>>>>
>>>> Yep, a quick proper fix seems unlikely.
>>>> [ Help/ideas are welcome, I suspect that these failures to restore power
>>>> on laptops designed for Win8+ all have the same cause, related to some
>>>> unknown interaction between ACPI and PCI. Some links:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
>>>>
>>>>> The way it is now, i.e. defaulting to PR3 when available, regresses
>>>>> certain laptops such as Kilian's.  If on the other hand we default to
>>>>> DSM when available, we'll regress certain other laptops, as Peter has
>>>>> pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
>>>>> approach either, ideally we'd want to use PR3 as Windows does.
>>>>>
>>>>> As said, the only short-term solution I see is to add an "optimus"
>>>>> module_param to nouveau to allow users to select which method to use.
>>>>> So in Kilian's case an additional command line parameter would be
>>>>> necessary to fix the issue.
>>>>>
>>>>> Does anyone see a better solution or can we agree on this one?  If so
>>>>> I can come up with a patch.  This could go in via Dave Airlie's tree.
>>>>
>>>> As pcie_port_pm=off already reverts to DSM, I do not think that an
>>>> additional (temporary) nouveau module parameter is going to help. I
>>>> instead propose a (hopefully temporary) quirk in pci core that disables
>>>> D3cold RPM for just Kilians Lenovo laptop (basically defaulting to
>>>> pcie_port_pm=off). Then the option pcie_port_pm=force can still be used
>>>> to test possible solutions in the future.
>>>
>>> I would rather add a quirk to the ACPI core to prevent the power resources in
>>> question from being enumerated.  Or even to prevent ACPI PM from being
>>> used for the port in question.
>>
>> I do have a W541 in a cupboard in the office somewhere, but I won't be close to
>> it for a couple of weeks. The W541 was the first place I tested the pm patches
>> so I'm kinda wondering whether it's all W541's or just some specific model/bios
>> combo.
>>
>> However I'm pretty much unavailable to do anything much until late Jan on this.
>
> Is there anyone else at Red Hat who might be able to look into this?
>
> ISTR that Hans de Goede is working on improving laptop support in Fedora,
> and Peter Jones recently got a patch merged for the W541 with the exact
> same firmware Kilian is using to work around a botched EFI memory map.
> Adding them to cc: in the hope that they may be able to help.
>
> @Peter, have you noticed issues with the discrete Nvidia GPU on your W541
> related to runtime suspend and system sleep?

I've a W541 sitting in my home office as well. I will take it through
some gpu runtime suspend/resume testing. Which kernel introduces the
problem I'm looking for ?

I believe mine has the old BIOS / EFI which is less troublesome so I
will first see if I can reproduce the problem with that and then upgrade
to see if that introduces the problem.

Peter IIRC you said that after upgrading the firmware I need a new enough
kernel to be able to even boot, from which kernel onwards will the machine
boot with the new firmware ?

Also is it possible to downgrade the EFI again ? ...

Regards,

Hans

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-07 11:45                                                       ` Hans de Goede
@ 2017-01-07 12:16                                                         ` Lukas Wunner
  2017-01-09 23:00                                                         ` Peter Jones
  1 sibling, 0 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-07 12:16 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David Airlie, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher,
	Peter Jones

On Sat, Jan 07, 2017 at 12:45:35PM +0100, Hans de Goede wrote:
> I've a W541 sitting in my home office as well. I will take it through
> some gpu runtime suspend/resume testing. Which kernel introduces the
> problem I'm looking for ?

v4.8, it adds runtime PM for PCIe ports.  Or anything newer.


> I believe mine has the old BIOS / EFI which is less troublesome so I
> will first see if I can reproduce the problem with that and then upgrade
> to see if that introduces the problem.

Please be sure to make an acpidump before upgrading the BIOS
so that we can compare what they've changed.
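
(A minimal sketch of how the two dumps could be compared, assuming each
one has already been extracted and disassembled, for example with
acpidump, acpixtract and iasl -d.  The file names dsdt-old.dsl and
dsdt-new.dsl and the helper diff_dsl() are made up for illustration.)

```python
import difflib
import sys


def diff_dsl(old_text, new_text,
             old_name="dsdt-old.dsl", new_name="dsdt-new.dsl"):
    """Return a unified diff (as a list of lines) of two disassembled tables."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


# Only runs when called as: python3 diff_dsl.py dsdt-old.dsl dsdt-new.dsl
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as old_file, open(sys.argv[2]) as new_file:
        for line in diff_dsl(old_file.read(), new_file.read(),
                             sys.argv[1], sys.argv[2]):
            print(line)
```

An empty diff would mean the BIOS update left that table untouched.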

Thanks!

Lukas


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-07 11:35                                                   ` Peter Wu
@ 2017-01-07 12:19                                                     ` Lukas Wunner
  2017-01-07 12:36                                                       ` Peter Wu
  0 siblings, 1 reply; 115+ messages in thread
From: Lukas Wunner @ 2017-01-07 12:19 UTC (permalink / raw)
  To: Peter Wu
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Sat, Jan 07, 2017 at 12:35:10PM +0100, Peter Wu wrote:
> On Thu, Jan 05, 2017 at 03:42:20PM +0100, Lukas Wunner wrote:
> > On Wed, Jan 04, 2017 at 10:09:54PM +0100, Peter Wu wrote:
> > > [ Help/ideas are welcome, I suspect that these failures to restore power
> > > on laptops designed for Win8+ all have the same cause, related to some
> > > unknown interaction between ACPI and PCI. Some links:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > 
> > Looking at Kilian's acpidump again I notice that the methods to power
> > the GPU on or off (GPON / GPOF) are called from two places:
> > 
> > - From the _PS0 and _PS3 methods of the GPU and
> > - from the _PR3 power resource of the root port above the GPU.
> > 
> > In the former case they're called for pre Windows 2013 or if VDAD is true.
> > In the latter case they're called unconditionally but GPOF becomes a no-op
> > in the pre Windows 2013 case.
> > 
> > This means that GPOF would be executed *twice* on Windows 2013+ if VDAD
> > is true.  I could imagine this to cause issues.
> 
> There is a flag "DGOS" which is set when PGON/PGOF are called, so
> multiple invocations should not matter for the powerdown/up sequence.
> > There are some SMI calls, though, that might have side effects.

The PGON method becomes a no-op if DGOS is true.  But the PGOF method
doesn't check DGOS.

Thanks,

Lukas


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-07 12:19                                                     ` Lukas Wunner
@ 2017-01-07 12:36                                                       ` Peter Wu
  2017-01-08 14:05                                                         ` Lukas Wunner
  0 siblings, 1 reply; 115+ messages in thread
From: Peter Wu @ 2017-01-07 12:36 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Sat, Jan 07, 2017 at 01:19:59PM +0100, Lukas Wunner wrote:
> On Sat, Jan 07, 2017 at 12:35:10PM +0100, Peter Wu wrote:
> > On Thu, Jan 05, 2017 at 03:42:20PM +0100, Lukas Wunner wrote:
> > > On Wed, Jan 04, 2017 at 10:09:54PM +0100, Peter Wu wrote:
> > > > [ Help/ideas are welcome, I suspect that these failures to restore power
> > > > on laptops designed for Win8+ all have the same cause, related to some
> > > > unknown interaction between ACPI and PCI. Some links:
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > 
> > > Looking at Kilian's acpidump again I notice that the methods to power
> > > the GPU on or off (GPON / GPOF) are called from two places:
> > > 
> > > - From the _PS0 and _PS3 methods of the GPU and
> > > - from the _PR3 power resource of the root port above the GPU.
> > > 
> > > In the former case they're called for pre Windows 2013 or if VDAD is true.
> > > In the latter case they're called unconditionally but GPOF becomes a no-op
> > > in the pre Windows 2013 case.
> > > 
> > > This means that GPOF would be executed *twice* on Windows 2013+ if VDAD
> > > is true.  I could imagine this to cause issues.
> > 
> > There is a flag "DGOS" which is set when PGON/PGOF are called, so
> > multiple invocations should not matter for the powerdown/up sequence.
> > > There are some SMI calls, though, that might have side effects.
> 
> The PGON method becomes a no-op if DGOS is true.  But the PGOF method
> doesn't check DGOS.

You are right, PGOF does not check that. Hopefully "VDAD" is 0 when
_PS3 is called, otherwise it might invoke PGOF multiple times (through
NVP3._OFF and through _PS3).
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-07 12:36                                                       ` Peter Wu
@ 2017-01-08 14:05                                                         ` Lukas Wunner
  0 siblings, 0 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-08 14:05 UTC (permalink / raw)
  To: Peter Wu
  Cc: Bjorn Helgaas, Rafael J. Wysocki, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher, Dave Airlie

On Sat, Jan 07, 2017 at 01:36:35PM +0100, Peter Wu wrote:
> On Sat, Jan 07, 2017 at 01:19:59PM +0100, Lukas Wunner wrote:
> > On Sat, Jan 07, 2017 at 12:35:10PM +0100, Peter Wu wrote:
> > > On Thu, Jan 05, 2017 at 03:42:20PM +0100, Lukas Wunner wrote:
> > > > On Wed, Jan 04, 2017 at 10:09:54PM +0100, Peter Wu wrote:
> > > > > [ Help/ideas are welcome, I suspect that these failures to restore
> > > > > power on laptops designed for Win8+ all have the same cause,
> > > > > related to some unknown interaction between ACPI and PCI.
> > > > > Some links:
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > 
> > > > Looking at Kilian's acpidump again I notice that the methods to power
> > > > the GPU on or off (GPON / GPOF) are called from two places:
> > > > 
> > > > - From the _PS0 and _PS3 methods of the GPU and
> > > > - from the _PR3 power resource of the root port above the GPU.
> > > > 
> > > > In the former case they're called for pre Windows 2013 or if VDAD is
> > > > true.  In the latter case they're called unconditionally but GPOF
> > > > becomes a no-op in the pre Windows 2013 case.
> > > > 
> > > > This means that GPOF would be executed *twice* on Windows 2013+ if
> > > > VDAD is true.  I could imagine this to cause issues.
> > > 
> > > There is a flag "DGOS" which is set when PGON/PGOF are called, so
> > > multiple invocations should not matter for the powerdown/up sequence.
> > > > There are some SMI calls, though, that might have side effects.
> > 
> > The PGON method becomes a no-op if DGOS is true.  But the PGOF method
> > doesn't check DGOS.
> 
> You are right, PGOF does not check that. Hopefully "VDAD" is 0 when
> _PS3 is called, otherwise it might invoke PGOF multiple times (through
> NVP3._OFF and through _PS3).

Kilian responded off-list that the byte containing the VDAD bit has
value 0x01.  So one of the 8 bits is set but I'm not sure if that's
the VDAD bit.  In the DSDT the bit is defined thus:

    OperationRegion (MNVS, SystemMemory, 0x7CE7D018, 0x1000)
    Field (MNVS, DWordAcc, NoLock, Preserve)
    {
        Offset (0xEE2),
        TPPP,   8,
        TPPC,   8,
        WKRS,   8,
        FNWK,   8,
        USBC,   8,
        ODDF,   8,
        VDAD,   1,                    <-------
        Offset (0xEE9),

Does this mean that VDAD is the most significant or least significant
bit?  The value read on Kilian's laptop has the least significant bit
set.
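
For what it's worth, my understanding of the ACPI spec is that field
units are packed starting at the least significant bit of the region,
so a 1-bit field at the start of a byte would be bit 0.  The following
sketch only redoes the arithmetic for the Field excerpt above;
field_layout() and extract() are illustrative helpers, not anything
from the kernel or ACPICA:

```python
def field_layout(start_byte, fields):
    """Map each field name to (byte offset, bit within byte, width in bits).

    Assumes field units are packed back to back, counting bits from the
    least significant bit (bit 0) of each byte.
    """
    layout = {}
    bit = start_byte * 8
    for name, width in fields:
        layout[name] = (bit // 8, bit % 8, width)
        bit += width
    return layout


def extract(byte_value, bit_off, width):
    """Extract a field of the given width starting at bit_off (LSB = 0)."""
    return (byte_value >> bit_off) & ((1 << width) - 1)


# Field units following Offset (0xEE2) in the excerpt above.
FIELDS = [("TPPP", 8), ("TPPC", 8), ("WKRS", 8), ("FNWK", 8),
          ("USBC", 8), ("ODDF", 8), ("VDAD", 1)]

byte_off, bit_off, width = field_layout(0xEE2, FIELDS)["VDAD"]
print(hex(byte_off), bit_off, extract(0x01, bit_off, width))
```

Under that assumption VDAD would be bit 0 of the byte at offset 0xEE8,
which is consistent with the Offset (0xEE9) that follows, and a byte
value of 0x01 would mean VDAD is set.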

In any case, he cleared that bit but locking the screen still breaks
keyboard/mouse input.

I'm running out of ideas, the only viable solution I can see is to
add a model-specific quirk which causes nouveau to fall back to DSM. :-(

Best regards,

Lukas


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 19:36                                                         ` David Airlie
@ 2017-01-09 15:11                                                           ` Lyude Paul
  2017-01-09 15:21                                                             ` Hans de Goede
  0 siblings, 1 reply; 115+ messages in thread
From: Lyude Paul @ 2017-01-09 15:11 UTC (permalink / raw)
  To: David Airlie, Peter Jones
  Cc: Lukas Wunner, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher,
	Hans de Goede, Lyude

FWIW, I just tried this on my W541, which is running
4.8.15-300.fc25.x86_64, and so far it seems to suspend/resume just fine
with firmware version 2.19.

On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
> (cc'ing Lyude, who has the hw also I think).
> 
> ----- Original Message -----
> > From: "Peter Jones" <pjones@redhat.com>
> > To: "Lukas Wunner" <lukas@wunner.de>
> > Cc: "David Airlie" <airlied@redhat.com>,
> > "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
> > "Bjorn Helgaas" <helgaas@kernel.org>,
> > "Mika Westerberg" <mika.westerberg@linux.intel.com>,
> > "Kilian Singer" <kilian.singer@quantumtechnology.info>,
> > "linux-pci" <linux-pci@vger.kernel.org>,
> > "Alex Deucher" <alexander.deucher@amd.com>, "Hans de Goede" <hdegoede@redhat.com>
> > Sent: Friday, 6 January, 2017 4:13:23 AM
> > Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> > 
> > On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
> > > On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> > > > > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> > > > > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner
> > > > > > wrote:
> > > > > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas
> > > > > > > wrote:
> > > > > > > > I don't *want* to apply the revert.  It's on my for-
> > > > > > > > linus branch
> > > > > > > > as a
> > > > > > > > worst-case scenario change if we can't figure out a
> > > > > > > > better fix.
> > > > > > > > 
> > > > > > > > The patch below is preferable, but I'd rather not take
> > > > > > > > even it,
> > > > > > > > because it takes away functionality and forces people
> > > > > > > > to use a
> > > > > > > > boot
> > > > > > > > parameter to restore it.  I expect that somebody will
> > > > > > > > figure out
> > > > > > > > how
> > > > > > > > to fix the regression Kilian found and also keep the
> > > > > > > > new
> > > > > > > > functionality
> > > > > > > > (without requiring boot parameters) before v4.10.
> > > > > > > 
> > > > > > > The issue is constrained to hybrid graphics laptops with
> > > > > > > Nvidia
> > > > > > > discrete
> > > > > > > GPU using nouveau.  Hence it needs to be fixed in
> > > > > > > nouveau, not in
> > > > > > > the
> > > > > > > PCI core.
> > > > > > 
> > > > > > The problem is not necessarily in the nouveau driver, the
> > > > > > same
> > > > > > problem
> > > > > > occurs when you enable RPM without loading nouveau. The
> > > > > > issue is
> > > > > > limited
> > > > > > though to some newer hybrid graphics laptops with Nvidia
> > > > > > GPUs. While
> > > > > > a
> > > > > > quirk can be added to nouveau, I think that a (temporary)
> > > > > > quirk in
> > > > > > core
> > > > > > would also be reasonable (since it also occurs without
> > > > > > nouveau).
> > > > > > 
> > > > > > > (AFAIUI, laptops with AMD discrete GPU are not affected
> > > > > > > as it is
> > > > > > > known
> > > > > > > when and how to call an ACPI method versus using PR3.)
> > > > > > > 
> > > > > > > (Neither are laptops using the Nvidia proprietary driver
> > > > > > > as it
> > > > > > > doesn't
> > > > > > > runtime suspend the card.  But battery life will be
> > > > > > > terrible then.)
> > > > > > > 
> > > > > > > We're at rc2 so the time frame for coming up with a fix
> > > > > > > is probably
> > > > > > > 4 weeks.  Peter and others have tried for months to
> > > > > > > reverse-engineer
> > > > > > > how to handle runtime PM on newer Nvidia cards.  It seems
> > > > > > > likely
> > > > > > > that
> > > > > > > we'll not find the ultimate solution to the problem
> > > > > > > within 4 weeks.
> > > > > > 
> > > > > > Yep, a quick proper fix seems unlikely.
> > > > > > [ Help/ideas are welcome, I suspect that these failures to
> > > > > > restore
> > > > > > power
> > > > > > on laptops designed for Win8+ all have the same cause,
> > > > > > related to
> > > > > > some
> > > > > > unknown interaction between ACPI and PCI. Some links:
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > > > 
> > > > > > > The way it is now, i.e. defaulting to PR3 when available,
> > > > > > > regresses
> > > > > > > certain laptops such as Kilian's.  If on the other hand
> > > > > > > we default
> > > > > > > to
> > > > > > > DSM when available, we'll regress certain other laptops,
> > > > > > > as Peter
> > > > > > > has
> > > > > > > pointed out.  Whitelisting or blacklisting laptops
> > > > > > > doesn't seem a
> > > > > > > good
> > > > > > > approach either, ideally we'd want to use PR3 as Windows
> > > > > > > does.
> > > > > > > 
> > > > > > > As said, the only short-term solution I see is to add an
> > > > > > > "optimus"
> > > > > > > module_param to nouveau to allow users to select which
> > > > > > > method to
> > > > > > > use.
> > > > > > > So in Kilian's case an additional command line parameter
> > > > > > > would be
> > > > > > > necessary to fix the issue.
> > > > > > > 
> > > > > > > Does anyone see a better solution or can we agree on this
> > > > > > > one?  If
> > > > > > > so
> > > > > > > I can come up with a patch.  This could go in via Dave
> > > > > > > Airlie's
> > > > > > > tree.
> > > > > > 
> > > > > > As pcie_port_pm=off already reverts to DSM, I do not think
> > > > > > that an
> > > > > > additional (temporary) nouveau module parameter is going to
> > > > > > help. I
> > > > > > instead propose a (hopefully temporary) quirk in pci core
> > > > > > that
> > > > > > disables
> > > > > > D3cold RPM for just Kilians Lenovo laptop (basically
> > > > > > defaulting to
> > > > > > pcie_port_pm=off). Then the option pcie_port_pm=force can
> > > > > > still be
> > > > > > used
> > > > > > to test possible solutions in the future.
> > > > > 
> > > > > I would rather add a quirk to the ACPI core to prevent the
> > > > > power
> > > > > resources in
> > > > > question from being enumerated.  Or even to prevent ACPI PM
> > > > > from being
> > > > > used for the port in question.
> > > > 
> > > > I do have a W541 in a cupboard in the office somewhere, but I
> > > > won't be
> > > > close to
> > > > it for a couple of weeks. The W541 was the first place I tested
> > > > the pm
> > > > patches
> > > > so I'm kinda wondering whether it's all W541's or just some
> > > > specific
> > > > model/bios
> > > > combo.
> > 
> > They seem to all ship with the 1.10 firmware, and 2.80 is current
> > (there
> > are a bunch of intermediate 2.xx versions).  Somewhere along the
> > line
> > they introduced some bugs in the UEFI stuff, so it wouldn't be
> > surprising if there's bugs introduced elsewhere as well.
> > 
> > > > However I'm pretty much unavailable to do anything much until
> > > > late Jan on
> > > > this.
> > > 
> > > Is there anyone else at Red Hat who might be able to look into
> > > this?
> > > 
> > > ISTR that Hans de Goede is working on improving laptop support in
> > > Fedora,
> > > and Peter Jones recently got a patch merged for the W541 with the
> > > exact
> > > same firmware Kilian is using to work around a botched EFI memory
> > > map.
> > > Adding them to cc: in the hope that they may be able to help.
> > > 
> > > @Peter, have you noticed issues with the discrete Nvidia GPU on
> > > your W541
> > > related to runtime suspend and system sleep?
> > 
> > I was using a borrowed one (I can certainly find it again, but I'm
> > not
> > working on graphics/pm really), but yeah - shutdown and lspci both
> > broke
> > sometime after pci_pm_runtime_resume().  Here's the traceback from
> > SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
> > 
> > Dave, if you know who in Westford should have a look at this, I can
> > see
> > about getting them hardware.  I am more or less surrounded by that
> > team.
> > 
> > --
> >         Peter
> > 


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-09 15:11                                                           ` Lyude Paul
@ 2017-01-09 15:21                                                             ` Hans de Goede
  2017-01-09 18:48                                                               ` Kilian Singer
  2017-01-11 20:40                                                               ` Lyude Paul
  0 siblings, 2 replies; 115+ messages in thread
From: Hans de Goede @ 2017-01-09 15:21 UTC (permalink / raw)
  To: Lyude Paul, David Airlie, Peter Jones
  Cc: Lukas Wunner, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher, Lyude

Hi Lyude,

On 09-01-17 16:11, Lyude Paul wrote:
> FWIW, I just tried this on my W541, which is running
> 4.8.15-300.fc25.x86_64, and so far it seems to suspend/resume just fine
> with firmware version 2.19.

Note this is not about normal suspend resume, but runtime
suspend/resume of the nvidia discrete GPU...

Try running glxgears like this:

DRI_PRIME=1 glxgears -info | grep REND

(the grep is to check you're really running on the nvidia GPU).

You should then see messages in dmesg about nouveau resuming the GPU.
Then kill glxgears and wait for 5 seconds; the nouveau driver should
now report that the GPU is suspending, etc.

If it never runtime suspends, then make sure you are not using
any external screens, only the built-in laptop screen.
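
The runtime PM state can also be read from sysfs instead of waiting
for dmesg output.  A rough sketch, assuming the discrete GPU sits at
PCI address 0000:01:00.0 (an assumption, check lspci for the real
address); runtime_pm_state() is a hypothetical helper that just reads
the power/control and power/runtime_status attributes:

```python
from pathlib import Path


def runtime_pm_state(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Read the runtime PM attributes of one PCI device from sysfs."""
    power = Path(sysfs_root) / pci_addr / "power"
    return {
        # "auto" means runtime PM is allowed, "on" means it is disabled.
        "control": (power / "control").read_text().strip(),
        # e.g. "active", "suspended", "suspending", "resuming".
        "runtime_status": (power / "runtime_status").read_text().strip(),
    }


if __name__ == "__main__":
    gpu = "0000:01:00.0"  # assumed address of the Nvidia GPU
    try:
        print(gpu, runtime_pm_state(gpu))
    except OSError as err:
        print("could not read runtime PM state:", err)
```

With control set to "auto", runtime_status should read "suspended" a
few seconds after glxgears exits; if it stays "active", the GPU has
not runtime suspended.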

Regards,

Hans


>
> On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
>> (cc'ing Lyude, who has the hw also I think).
>>
>> ----- Original Message -----
>>> From: "Peter Jones" <pjones@redhat.com>
>>> To: "Lukas Wunner" <lukas@wunner.de>
>>> Cc: "David Airlie" <airlied@redhat.com>,
>>> "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
>>> "Bjorn Helgaas" <helgaas@kernel.org>,
>>> "Mika Westerberg" <mika.westerberg@linux.intel.com>,
>>> "Kilian Singer" <kilian.singer@quantumtechnology.info>,
>>> "linux-pci" <linux-pci@vger.kernel.org>,
>>> "Alex Deucher" <alexander.deucher@amd.com>, "Hans de Goede" <hdegoede@redhat.com>
>>> Sent: Friday, 6 January, 2017 4:13:23 AM
>>> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
>>>
>>> On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
>>>> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
>>>>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
>>>>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner
>>>>>>> wrote:
>>>>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas
>>>>>>>> wrote:
>>>>>>>>> I don't *want* to apply the revert.  It's on my for-
>>>>>>>>> linus branch
>>>>>>>>> as a
>>>>>>>>> worst-case scenario change if we can't figure out a
>>>>>>>>> better fix.
>>>>>>>>>
>>>>>>>>> The patch below is preferable, but I'd rather not take
>>>>>>>>> even it,
>>>>>>>>> because it takes away functionality and forces people
>>>>>>>>> to use a
>>>>>>>>> boot
>>>>>>>>> parameter to restore it.  I expect that somebody will
>>>>>>>>> figure out
>>>>>>>>> how
>>>>>>>>> to fix the regression Kilian found and also keep the
>>>>>>>>> new
>>>>>>>>> functionality
>>>>>>>>> (without requiring boot parameters) before v4.10.
>>>>>>>>
>>>>>>>> The issue is constrained to hybrid graphics laptops with
>>>>>>>> Nvidia
>>>>>>>> discrete
>>>>>>>> GPU using nouveau.  Hence it needs to be fixed in
>>>>>>>> nouveau, not in
>>>>>>>> the
>>>>>>>> PCI core.
>>>>>>>
>>>>>>> The problem is not necessarily in the nouveau driver, the
>>>>>>> same
>>>>>>> problem
>>>>>>> occurs when you enable RPM without loading nouveau. The
>>>>>>> issue is
>>>>>>> limited
>>>>>>> though to some newer hybrid graphics laptops with Nvidia
>>>>>>> GPUs. While
>>>>>>> a
>>>>>>> quirk can be added to nouveau, I think that a (temporary)
>>>>>>> quirk in
>>>>>>> core
>>>>>>> would also be reasonable (since it also occurs without
>>>>>>> nouveau).
>>>>>>>
>>>>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected
>>>>>>>> as it is
>>>>>>>> known
>>>>>>>> when and how to call an ACPI method versus using PR3.)
>>>>>>>>
>>>>>>>> (Neither are laptops using the Nvidia proprietary driver
>>>>>>>> as it
>>>>>>>> doesn't
>>>>>>>> runtime suspend the card.  But battery life will be
>>>>>>>> terrible then.)
>>>>>>>>
>>>>>>>> We're at rc2 so the time frame for coming up with a fix
>>>>>>>> is probably
>>>>>>>> 4 weeks.  Peter and others have tried for months to
>>>>>>>> reverse-engineer
>>>>>>>> how to handle runtime PM on newer Nvidia cards.  It seems
>>>>>>>> likely
>>>>>>>> that
>>>>>>>> we'll not find the ultimate solution to the problem
>>>>>>>> within 4 weeks.
>>>>>>>
>>>>>>> Yep, a quick proper fix seems unlikely.
>>>>>>> [ Help/ideas are welcome, I suspect that these failures to
>>>>>>> restore
>>>>>>> power
>>>>>>> on laptops designed for Win8+ all have the same cause,
>>>>>>> related to
>>>>>>> some
>>>>>>> unknown interaction between ACPI and PCI. Some links:
>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
>>>>>>>
>>>>>>>> The way it is now, i.e. defaulting to PR3 when available,
>>>>>>>> regresses
>>>>>>>> certain laptops such as Kilian's.  If on the other hand
>>>>>>>> we default
>>>>>>>> to
>>>>>>>> DSM when available, we'll regress certain other laptops,
>>>>>>>> as Peter
>>>>>>>> has
>>>>>>>> pointed out.  Whitelisting or blacklisting laptops
>>>>>>>> doesn't seem a
>>>>>>>> good
>>>>>>>> approach either, ideally we'd want to use PR3 as Windows
>>>>>>>> does.
>>>>>>>>
>>>>>>>> As said, the only short-term solution I see is to add an
>>>>>>>> "optimus"
>>>>>>>> module_param to nouveau to allow users to select which
>>>>>>>> method to
>>>>>>>> use.
>>>>>>>> So in Kilian's case an additional command line parameter
>>>>>>>> would be
>>>>>>>> necessary to fix the issue.
>>>>>>>>
>>>>>>>> Does anyone see a better solution or can we agree on this
>>>>>>>> one?  If
>>>>>>>> so
>>>>>>>> I can come up with a patch.  This could go in via Dave
>>>>>>>> Airlie's
>>>>>>>> tree.
>>>>>>>
>>>>>>> As pcie_port_pm=off already reverts to DSM, I do not think
>>>>>>> that an
>>>>>>> additional (temporary) nouveau module parameter is going to
>>>>>>> help. I
>>>>>>> instead propose a (hopefully temporary) quirk in pci core
>>>>>>> that
>>>>>>> disables
>>>>>>> D3cold RPM for just Kilians Lenovo laptop (basically
>>>>>>> defaulting to
>>>>>>> pcie_port_pm=off). Then the option pcie_port_pm=force can
>>>>>>> still be
>>>>>>> used
>>>>>>> to test possible solutions in the future.
>>>>>>
>>>>>> I would rather add a quirk to the ACPI core to prevent the
>>>>>> power
>>>>>> resources in
>>>>>> question from being enumerated.  Or even to prevent ACPI PM
>>>>>> from being
>>>>>> used for the port in question.
>>>>>
>>>>> I do have a W541 in a cupboard in the office somewhere, but I
>>>>> won't be
>>>>> close to
>>>>> it for a couple of weeks. The W541 was the first place I tested
>>>>> the pm
>>>>> patches
>>>>> so I'm kinda wondering whether it's all W541's or just some
>>>>> specific
>>>>> model/bios
>>>>> combo.
>>>
>>> They seem to all ship with the 1.10 firmware, and 2.80 is current
>>> (there
>>> are a bunch of intermediate 2.xx versions).  Somewhere along the
>>> line
>>> they introduced some bugs in the UEFI stuff, so it wouldn't be
>>> surprising if there's bugs introduced elsewhere as well.
>>>
>>>>> However I'm pretty much unavailable to do anything much until
>>>>> late Jan on
>>>>> this.
>>>>
>>>> Is there anyone else at Red Hat who might be able to look into
>>>> this?
>>>>
>>>> ISTR that Hans de Goede is working on improving laptop support in
>>>> Fedora,
>>>> and Peter Jones recently got a patch merged for the W541 with the
>>>> exact
>>>> same firmware Kilian is using to work around a botched EFI memory
>>>> map.
>>>> Adding them to cc: in the hope that they may be able to help.
>>>>
>>>> @Peter, have you noticed issues with the discrete Nvidia GPU on
>>>> your W541
>>>> related to runtime suspend and system sleep?
>>>
>>> I was using a borrowed one (I can certainly find it again, but I'm
>>> not
>>> working on graphics/pm really), but yeah - shutdown and lspci both
>>> broke
>>> sometime after pci_pm_runtime_resume().  Here's the traceback from
>>> SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
>>>
>>> Dave, if you know who in Westford should have a look at this, I can
>>> see
>>> about getting them hardware.  I am more or less surrounded by that
>>> team.
>>>
>>> --
>>>         Peter
>>>


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-09 15:21                                                             ` Hans de Goede
@ 2017-01-09 18:48                                                               ` Kilian Singer
  2017-01-10  0:33                                                                 ` David Airlie
  2017-01-11 20:40                                                               ` Lyude Paul
  1 sibling, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2017-01-09 18:48 UTC (permalink / raw)
  To: Hans de Goede, Lyude Paul, David Airlie, Peter Jones
  Cc: Lukas Wunner, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, linux-pci, Alex Deucher, Lyude

Hi Lyude Paul,

Normal suspend/resume does not work on my machine either.

Best regards

Kilian


On 09-Jan-17 16:21, Hans de Goede wrote:
> Hi Lyude,
>
> On 09-01-17 16:11, Lyude Paul wrote:
>> FWIW, I just tried this on my W541, which is running
>> 4.8.15-300.fc25.x86_64, and so far it seems to suspend/resume just fine
>> with firmware version 2.19.
>
> Note this is not about normal suspend resume, but runtime
> suspend/resume of the nvidia discrete GPU...
>
> Try running glxgears like this:
>
> DRI_PRIME=1 glxgears -info | grep REND
>
> (the grep is to check you're really running on the nvidia GPU).
>
> You should then see messages in dmesg about nouveau resuming the GPU.
> Then kill glxgears and wait for 5 seconds; the nouveau driver should
> now report that the GPU is suspending, etc.
>
> If it never runtime suspends, then make sure you are not using
> any external screens, only the built-in laptop screen.
>
> Regards,
>
> Hans
>
>
>>
>> On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
>>> (cc'ing Lyude, who has the hw also I think).
>>>
>>> ----- Original Message -----
>>>> From: "Peter Jones" <pjones@redhat.com>
>>>> To: "Lukas Wunner" <lukas@wunner.de>
>>>> Cc: "David Airlie" <airlied@redhat.com>,
>>>> "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
>>>> "Bjorn Helgaas" <helgaas@kernel.org>,
>>>> "Mika Westerberg" <mika.westerberg@linux.intel.com>,
>>>> "Kilian Singer" <kilian.singer@quantumtechnology.info>,
>>>> "linux-pci" <linux-pci@vger.kernel.org>,
>>>> "Alex Deucher" <alexander.deucher@amd.com>, "Hans de Goede" <hdegoede@redhat.com>
>>>> Sent: Friday, 6 January, 2017 4:13:23 AM
>>>> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
>>>>
>>>> On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
>>>>> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
>>>>>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
>>>>>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner
>>>>>>>> wrote:
>>>>>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas
>>>>>>>>> wrote:
>>>>>>>>>> I don't *want* to apply the revert.  It's on my for-
>>>>>>>>>> linus branch
>>>>>>>>>> as a
>>>>>>>>>> worst-case scenario change if we can't figure out a
>>>>>>>>>> better fix.
>>>>>>>>>>
>>>>>>>>>> The patch below is preferable, but I'd rather not take
>>>>>>>>>> even it,
>>>>>>>>>> because it takes away functionality and forces people
>>>>>>>>>> to use a
>>>>>>>>>> boot
>>>>>>>>>> parameter to restore it.  I expect that somebody will
>>>>>>>>>> figure out
>>>>>>>>>> how
>>>>>>>>>> to fix the regression Kilian found and also keep the
>>>>>>>>>> new
>>>>>>>>>> functionality
>>>>>>>>>> (without requiring boot parameters) before v4.10.
>>>>>>>>>
>>>>>>>>> The issue is constrained to hybrid graphics laptops with
>>>>>>>>> Nvidia
>>>>>>>>> discrete
>>>>>>>>> GPU using nouveau.  Hence it needs to be fixed in
>>>>>>>>> nouveau, not in
>>>>>>>>> the
>>>>>>>>> PCI core.
>>>>>>>>
>>>>>>>> The problem is not necessarily in the nouveau driver, the
>>>>>>>> same
>>>>>>>> problem
>>>>>>>> occurs when you enable RPM without loading nouveau. The
>>>>>>>> issue is
>>>>>>>> limited
>>>>>>>> though to some newer hybrid graphics laptops with Nvidia
>>>>>>>> GPUs. While
>>>>>>>> a
>>>>>>>> quirk can be added to nouveau, I think that a (temporary)
>>>>>>>> quirk in
>>>>>>>> core
>>>>>>>> would also be reasonable (since it also occurs without
>>>>>>>> nouveau).
>>>>>>>>
>>>>>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected as it
>>>>>>>>> is known when and how to call an ACPI method versus using
>>>>>>>>> PR3.)
>>>>>>>>>
>>>>>>>>> (Neither are laptops using the Nvidia proprietary driver as it
>>>>>>>>> doesn't runtime suspend the card.  But battery life will be
>>>>>>>>> terrible then.)
>>>>>>>>>
>>>>>>>>> We're at rc2 so the time frame for coming up with a fix is
>>>>>>>>> probably 4 weeks.  Peter and others have tried for months to
>>>>>>>>> reverse-engineer how to handle runtime PM on newer Nvidia
>>>>>>>>> cards.  It seems likely that we'll not find the ultimate
>>>>>>>>> solution to the problem within 4 weeks.
>>>>>>>>
>>>>>>>> Yep, a quick proper fix seems unlikely.
>>>>>>>> [ Help/ideas are welcome, I suspect that these failures to
>>>>>>>> restore power on laptops designed for Win8+ all have the same
>>>>>>>> cause, related to some unknown interaction between ACPI and
>>>>>>>> PCI. Some links:
>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
>>>>>>>>
>>>>>>>>> The way it is now, i.e. defaulting to PR3 when available,
>>>>>>>>> regresses certain laptops such as Kilian's.  If on the other
>>>>>>>>> hand we default to DSM when available, we'll regress certain
>>>>>>>>> other laptops, as Peter has pointed out.  Whitelisting or
>>>>>>>>> blacklisting laptops doesn't seem a good approach either,
>>>>>>>>> ideally we'd want to use PR3 as Windows does.
>>>>>>>>>
>>>>>>>>> As said, the only short-term solution I see is to add an
>>>>>>>>> "optimus" module_param to nouveau to allow users to select
>>>>>>>>> which method to use.
>>>>>>>>> So in Kilian's case an additional command line parameter would
>>>>>>>>> be necessary to fix the issue.
>>>>>>>>>
>>>>>>>>> Does anyone see a better solution or can we agree on this one?
>>>>>>>>> If so I can come up with a patch.  This could go in via Dave
>>>>>>>>> Airlie's tree.
>>>>>>>>
>>>>>>>> As pcie_port_pm=off already reverts to DSM, I do not think
>>>>>>>> that an additional (temporary) nouveau module parameter is
>>>>>>>> going to help. I instead propose a (hopefully temporary) quirk
>>>>>>>> in pci core that disables D3cold RPM for just Kilians Lenovo
>>>>>>>> laptop (basically defaulting to pcie_port_pm=off). Then the
>>>>>>>> option pcie_port_pm=force can still be used to test possible
>>>>>>>> solutions in the future.
>>>>>>>
>>>>>>> I would rather add a quirk to the ACPI core to prevent the
>>>>>>> power resources in question from being enumerated.  Or even to
>>>>>>> prevent ACPI PM from being used for the port in question.
>>>>>>
>>>>>> I do have a W541 in a cupboard in the office somewhere, but I
>>>>>> won't be close to it for a couple of weeks. The W541 was the
>>>>>> first place I tested the pm patches so I'm kinda wondering
>>>>>> whether it's all W541's or just some specific model/bios
>>>>>> combo.
>>>>
>>>> They seem to all ship with the 1.10 firmware, and 2.80 is
>>>> current (there are a bunch of intermediate 2.xx versions).
>>>> Somewhere along the line they introduced some bugs in the UEFI
>>>> stuff, so it wouldn't be surprising if there's bugs introduced
>>>> elsewhere as well.
>>>>
>>>>>> However I'm pretty much unavailable to do anything much until
>>>>>> late Jan on this.
>>>>>
>>>>> Is there anyone else at Red Hat who might be able to look into
>>>>> this?
>>>>>
>>>>> ISTR that Hans de Goede is working on improving laptop support
>>>>> in Fedora, and Peter Jones recently got a patch merged for the
>>>>> W541 with the exact same firmware Kilian is using to work
>>>>> around a botched EFI memory map.
>>>>> Adding them to cc: in the hope that they may be able to help.
>>>>>
>>>>> @Peter, have you noticed issues with the discrete Nvidia GPU
>>>>> on your W541 related to runtime suspend and system sleep?
>>>>
>>>> I was using a borrowed one (I can certainly find it again, but
>>>> I'm not working on graphics/pm really), but yeah - shutdown
>>>> and lspci both broke sometime after pci_pm_runtime_resume().
>>>> Here's the traceback from SYS_reboot():
>>>> https://goo.gl/photos/T1fr1bksHQb9RSU67
>>>>
>>>> Dave, if you know who in Westford should have a look at this,
>>>> I can see about getting them hardware.  I am more or less
>>>> surrounded by that team.
>>>>
>>>> -- 
>>>>         Peter
>>>>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-07 11:45                                                       ` Hans de Goede
  2017-01-07 12:16                                                         ` Lukas Wunner
@ 2017-01-09 23:00                                                         ` Peter Jones
  2017-01-10  0:17                                                           ` David Airlie
  1 sibling, 1 reply; 115+ messages in thread
From: Peter Jones @ 2017-01-09 23:00 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Lukas Wunner, David Airlie, Rafael J. Wysocki, Peter Wu,
	Bjorn Helgaas, Mika Westerberg, Kilian Singer, linux-pci,
	Alex Deucher

On Sat, Jan 07, 2017 at 12:45:35PM +0100, Hans de Goede wrote:

> I've a W541 sitting in my home office as well. I will take it through
> some gpu runtime suspend/resume testing. Which kernel introduces the
> problem I'm looking for ?
> 
> I believe mine has the old BIOS / EFI which is less troublesome so I
> will first see if I can reproduce the problem with that and then upgrade
> to see if that introduces the problem.
> 
> Peter IIRC you said that after upgrading the firmware I need a new enough
> kernel to be able to even boot, from which kernel onwards will the machine
> boot with the new firmware ?

That fix is currently commit b2a91a35 in the "next" branch of the repo
at git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi .  I'm not sure
what the timeframe for it landing in Linus' tree is, but it doesn't look
like it has landed yet.

> Also is it possible to downgrade the EFI again ? ...

IIRC that model has a switch in the firmware to enable downgrading.  I
have not tried it.  Also, there's some chance the firmware you're
starting from isn't available.

-- 
        Peter


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-09 23:00                                                         ` Peter Jones
@ 2017-01-10  0:17                                                           ` David Airlie
  2017-01-10  1:24                                                             ` Lukas Wunner
  0 siblings, 1 reply; 115+ messages in thread
From: David Airlie @ 2017-01-10  0:17 UTC (permalink / raw)
  To: Peter Jones
  Cc: Hans de Goede, Lukas Wunner, Rafael J. Wysocki, Peter Wu,
	Bjorn Helgaas, Mika Westerberg, Kilian Singer, linux-pci,
	Alex Deucher

> 
> On Sat, Jan 07, 2017 at 12:45:35PM +0100, Hans de Goede wrote:
> 
> > I've a W541 sitting in my home office as well. I will take it through
> > some gpu runtime suspend/resume testing. Which kernel introduces the
> > problem I'm looking for ?
> > 
> > I believe mine has the old BIOS / EFI which is less troublesome so I
> > will first see if I can reproduce the problem with that and then upgrade
> > to see if that introduces the problem.
> > 
> > Peter IIRC you said that after upgrading the firmware I need a new enough
> > kernel to be able to even boot, from which kernel onwards will the machine
> > boot with the new firmware ?
> 
> That fix is currently commit b2a91a35 in the "next" branch of the repo
> at git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi .  I'm not sure
> what the timeframe for it landing in linus' tree is, but it doesn't look
> like it has yet.
> 
> > Also is it possible to downgrade the EFI again ? ...
> 
> IIRC that model has a switch in the firmware to enable downgrading.  I
> have not tried it.  Also, there's some chance the firmware you're
> starting from isn't available.
> 
> --
>         Peter
> 

Just FYI, a W541 with Fedora 25 and Linux 4.10-rc3 + drm-next and the EFI
fix (you might want to motivate that fix a bit harder) seems to be working
well.

I can suspend/resume, and the nvidia seems to go off.

[  411.799035] nouveau 0000:01:00.0: DRM: suspending console...
[  411.799059] nouveau 0000:01:00.0: DRM: suspending display...
[  411.799119] nouveau 0000:01:00.0: DRM: evicting buffers...
[  411.799125] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
[  411.799176] nouveau 0000:01:00.0: DRM: suspending client object trees...
[  411.805616] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
[  413.217090] device_pm-0235 device_set_power      : Device [VID1] transitioned to D3hot
[  413.217099] device_pm-0124 device_get_power      : Device [VID1] power state is (unknown)
[  413.230201] thinkpad_acpi: EC reports that Thermal Table has changed
[  413.351497]     power-0275 __acpi_power_off      : Power resource [NVP3] turned off
[  413.351507] device_pm-0235 device_set_power      : Device [PEG] transitioned to D3hot
[  413.351526]     power-0189 power_get_state       : Resource [NVP3] is off
[  413.351530]     power-0219 power_get_list_state  : Resource list is off
[  413.351542]     power-0189 power_get_state       : Resource [NVP2] is on
[  413.351545]     power-0219 power_get_list_state  : Resource list is on
[  413.351548] device_pm-0124 device_get_power      : Device [PEG] power state is D2

That is with some ACPI debugging enabled (though I'm not quite sure why D2 is where it ends up).

Dave.
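
A minimal sketch of how the discrepancy in the log above (a device set to
D3hot but read back as something else) could be spotted automatically.  The
function name and the regular expressions are assumptions made for
illustration; they only match the two ACPI debug message formats that appear
in the excerpt and are not taken from any existing tool.

```python
import re

# Subset of the log lines quoted above, with the timestamps dropped.
LOG = """\
device_pm-0235 device_set_power      : Device [VID1] transitioned to D3hot
device_pm-0124 device_get_power      : Device [VID1] power state is (unknown)
    power-0275 __acpi_power_off      : Power resource [NVP3] turned off
device_pm-0235 device_set_power      : Device [PEG] transitioned to D3hot
    power-0189 power_get_state       : Resource [NVP3] is off
    power-0189 power_get_state       : Resource [NVP2] is on
device_pm-0124 device_get_power      : Device [PEG] power state is D2
"""

# Hypothetical patterns, written against the message formats shown above.
SET_RE = re.compile(r"device_set_power\s*: Device \[(\w+)\] transitioned to (\S+)")
GET_RE = re.compile(r"device_get_power\s*: Device \[(\w+)\] power state is (.+)$")


def power_state_mismatches(log_text):
    """Return {device: (state_set, state_read_back)} where the two differ."""
    last_set = {}
    mismatches = {}
    for line in log_text.splitlines():
        m = SET_RE.search(line)
        if m:
            # Remember the most recent state the device was put into.
            last_set[m.group(1)] = m.group(2)
            continue
        m = GET_RE.search(line)
        if m:
            dev, read_back = m.group(1), m.group(2).strip()
            if dev in last_set and last_set[dev] != read_back:
                mismatches[dev] = (last_set[dev], read_back)
    return mismatches


if __name__ == "__main__":
    for dev, (was_set, read_back) in power_state_mismatches(LOG).items():
        print(f"{dev}: set to {was_set}, read back as {read_back}")
```

Because the patterns are applied with search(), the same function should also
accept lines that still carry the dmesg timestamp prefix.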


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-09 18:48                                                               ` Kilian Singer
@ 2017-01-10  0:33                                                                 ` David Airlie
  2017-01-10  9:17                                                                   ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: David Airlie @ 2017-01-10  0:33 UTC (permalink / raw)
  To: Kilian Singer
  Cc: Hans de Goede, Lyude Paul, Peter Jones, Lukas Wunner,
	Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	linux-pci, Alex Deucher, Lyude

Hi Kilian,

Do you use powertop, or have you ever used it? I'm guessing some port is getting into suspend on your machine that isn't on ours due to differing userspace or powertop settings.

Dave.
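
One way to compare machines on this point is to dump the runtime PM settings
of every PCI device from sysfs.  The sketch below is only an illustration:
the function names are made up here, and it assumes the standard per-device
power/control and power/runtime_status attributes are present under
/sys/bus/pci/devices.

```python
from pathlib import Path


def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def pci_runtime_pm(sysfs_root="/sys/bus/pci/devices"):
    """Map each PCI address to its (power/control, power/runtime_status) pair."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for dev in sorted(root.iterdir()):
        result[dev.name] = (
            read_attr(dev / "power" / "control"),
            read_attr(dev / "power" / "runtime_status"),
        )
    return result


if __name__ == "__main__":
    for addr, (control, status) in pci_runtime_pm().items():
        print(f"{addr}  control={control}  runtime_status={status}")
```

A control value of "auto" means runtime PM is allowed for that device, "on"
means it is kept active; diffing this output between a working and a failing
machine should show which ports differ.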

----- Original Message -----
> From: "Kilian Singer" <kilian.singer@quantumtechnology.info>
> To: "Hans de Goede" <hdegoede@redhat.com>, "Lyude Paul" <lyude@redhat.com>, "David Airlie" <airlied@redhat.com>,
> "Peter Jones" <pjones@redhat.com>
> Cc: "Lukas Wunner" <lukas@wunner.de>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
> "Bjorn Helgaas" <helgaas@kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "linux-pci"
> <linux-pci@vger.kernel.org>, "Alex Deucher" <alexander.deucher@amd.com>, "Lyude" <cpaul@redhat.com>
> Sent: Tuesday, 10 January, 2017 4:48:22 AM
> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> 
> Hi Lyude Paul,
> 
> Normal suspend/resume does not work on my machine either.
> 
> Best regards
> 
> Kilian
> 
> 
> On 09-Jan-17 16:21, Hans de Goede wrote:
> > Hi Lyude,
> >
> > On 09-01-17 16:11, Lyude Paul wrote:
> >> fwiw, I just tried on the W541 I have 4.8.15-300.fc25.x86_64 running on
> >> here and so far it seems to suspend/resume just fine using firmware
> >> version 2.19
> >
> > Note this is not about normal suspend resume, but runtime
> > suspend/resume of the nvidia discrete GPU...
> >
> > Try running glxgears like this:
> >
> > DRI_PRIME=1 glxgears -info | grep REND
> >
> > (the grep is to check you're really running on the nvidia GPU).
> >
> > Then you should see msgs in dmesg about nouveau resuming the gpu,
> > then kill glxgears and wait for 5 seconds, now the nouveau drv
> > should say the gpu is suspending, etc.
> >
> > If it never runtime suspends, then make sure you are not using
> > any external screens, only the built-in laptop screen.
> >
> > Regards,
> >
> > Hans
> >
> >
> >>
> >> On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
> >>> (cc'ing Lyude, who has the hw also I think).
> >>>
> >>> ----- Original Message -----
> >>>> From: "Peter Jones" <pjones@redhat.com>
> >>>> To: "Lukas Wunner" <lukas@wunner.de>
> >>>> Cc: "David Airlie" <airlied@redhat.com>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
> >>>> "Bjorn Helgaas" <helgaas@kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "Kilian Singer"
> >>>> <kilian.singer@quantumtechnology.info>, "linux-pci" <linux-pci@vger.kernel.org>, "Alex Deucher"
> >>>> <alexander.deucher@amd.com>, "Hans de Goede" <hdegoede@redhat.com>
> >>>> Sent: Friday, 6 January, 2017 4:13:23 AM
> >>>> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> >>>>
> >>>> On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
> >>>>> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> >>>>>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> >>>>>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner
> >>>>>>>> wrote:
> >>>>>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas
> >>>>>>>>> wrote:
> >>>>>>>>>> I don't *want* to apply the revert.  It's on my for-linus
> >>>>>>>>>> branch as a worst-case scenario change if we can't figure out
> >>>>>>>>>> a better fix.
> >>>>>>>>>>
> >>>>>>>>>> The patch below is preferable, but I'd rather not take even
> >>>>>>>>>> it, because it takes away functionality and forces people to
> >>>>>>>>>> use a boot parameter to restore it.  I expect that somebody
> >>>>>>>>>> will figure out how to fix the regression Kilian found and
> >>>>>>>>>> also keep the new functionality (without requiring boot
> >>>>>>>>>> parameters) before v4.10.
> >>>>>>>>>
> >>>>>>>>> The issue is constrained to hybrid graphics laptops with
> >>>>>>>>> Nvidia discrete GPU using nouveau.  Hence it needs to be fixed
> >>>>>>>>> in nouveau, not in the PCI core.
> >>>>>>>>
> >>>>>>>> The problem is not necessarily in the nouveau driver, the same
> >>>>>>>> problem occurs when you enable RPM without loading nouveau.
> >>>>>>>> The issue is limited though to some newer hybrid graphics
> >>>>>>>> laptops with Nvidia GPUs. While a quirk can be added to
> >>>>>>>> nouveau, I think that a (temporary) quirk in core would also
> >>>>>>>> be reasonable (since it also occurs without nouveau).
> >>>>>>>>
> >>>>>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected as it
> >>>>>>>>> is known when and how to call an ACPI method versus using
> >>>>>>>>> PR3.)
> >>>>>>>>>
> >>>>>>>>> (Neither are laptops using the Nvidia proprietary driver as it
> >>>>>>>>> doesn't runtime suspend the card.  But battery life will be
> >>>>>>>>> terrible then.)
> >>>>>>>>>
> >>>>>>>>> We're at rc2 so the time frame for coming up with a fix is
> >>>>>>>>> probably 4 weeks.  Peter and others have tried for months to
> >>>>>>>>> reverse-engineer how to handle runtime PM on newer Nvidia
> >>>>>>>>> cards.  It seems likely that we'll not find the ultimate
> >>>>>>>>> solution to the problem within 4 weeks.
> >>>>>>>>
> >>>>>>>> Yep, a quick proper fix seems unlikely.
> >>>>>>>> [ Help/ideas are welcome, I suspect that these failures to
> >>>>>>>> restore power on laptops designed for Win8+ all have the same
> >>>>>>>> cause, related to some unknown interaction between ACPI and
> >>>>>>>> PCI. Some links:
> >>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
> >>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> >>>>>>>>
> >>>>>>>>> The way it is now, i.e. defaulting to PR3 when available,
> >>>>>>>>> regresses certain laptops such as Kilian's.  If on the other
> >>>>>>>>> hand we default to DSM when available, we'll regress certain
> >>>>>>>>> other laptops, as Peter has pointed out.  Whitelisting or
> >>>>>>>>> blacklisting laptops doesn't seem a good approach either,
> >>>>>>>>> ideally we'd want to use PR3 as Windows does.
> >>>>>>>>>
> >>>>>>>>> As said, the only short-term solution I see is to add an
> >>>>>>>>> "optimus" module_param to nouveau to allow users to select
> >>>>>>>>> which method to use.
> >>>>>>>>> So in Kilian's case an additional command line parameter would
> >>>>>>>>> be necessary to fix the issue.
> >>>>>>>>>
> >>>>>>>>> Does anyone see a better solution or can we agree on this one?
> >>>>>>>>> If so I can come up with a patch.  This could go in via Dave
> >>>>>>>>> Airlie's tree.
> >>>>>>>>
> >>>>>>>> As pcie_port_pm=off already reverts to DSM, I do not think
> >>>>>>>> that an additional (temporary) nouveau module parameter is
> >>>>>>>> going to help. I instead propose a (hopefully temporary) quirk
> >>>>>>>> in pci core that disables D3cold RPM for just Kilians Lenovo
> >>>>>>>> laptop (basically defaulting to pcie_port_pm=off). Then the
> >>>>>>>> option pcie_port_pm=force can still be used to test possible
> >>>>>>>> solutions in the future.
> >>>>>>>
> >>>>>>> I would rather add a quirk to the ACPI core to prevent the
> >>>>>>> power resources in question from being enumerated.  Or even to
> >>>>>>> prevent ACPI PM from being used for the port in question.
> >>>>>>
> >>>>>> I do have a W541 in a cupboard in the office somewhere, but I
> >>>>>> won't be close to it for a couple of weeks. The W541 was the
> >>>>>> first place I tested the pm patches so I'm kinda wondering
> >>>>>> whether it's all W541's or just some specific model/bios
> >>>>>> combo.
> >>>>
> >>>> They seem to all ship with the 1.10 firmware, and 2.80 is
> >>>> current (there are a bunch of intermediate 2.xx versions).
> >>>> Somewhere along the line they introduced some bugs in the UEFI
> >>>> stuff, so it wouldn't be surprising if there's bugs introduced
> >>>> elsewhere as well.
> >>>>
> >>>>>> However I'm pretty much unavailable to do anything much until
> >>>>>> late Jan on this.
> >>>>>
> >>>>> Is there anyone else at Red Hat who might be able to look into
> >>>>> this?
> >>>>>
> >>>>> ISTR that Hans de Goede is working on improving laptop support
> >>>>> in Fedora, and Peter Jones recently got a patch merged for the
> >>>>> W541 with the exact same firmware Kilian is using to work
> >>>>> around a botched EFI memory map.
> >>>>> Adding them to cc: in the hope that they may be able to help.
> >>>>>
> >>>>> @Peter, have you noticed issues with the discrete Nvidia GPU
> >>>>> on your W541 related to runtime suspend and system sleep?
> >>>>
> >>>> I was using a borrowed one (I can certainly find it again, but
> >>>> I'm not working on graphics/pm really), but yeah - shutdown
> >>>> and lspci both broke sometime after pci_pm_runtime_resume().
> >>>> Here's the traceback from SYS_reboot():
> >>>> https://goo.gl/photos/T1fr1bksHQb9RSU67
> >>>>
> >>>> Dave, if you know who in Westford should have a look at this,
> >>>> I can see about getting them hardware.  I am more or less
> >>>> surrounded by that team.
> >>>>
> >>>> --
> >>>>         Peter
> >>>>
> 
> 


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-10  0:17                                                           ` David Airlie
@ 2017-01-10  1:24                                                             ` Lukas Wunner
  2017-01-10  2:15                                                               ` David Airlie
  0 siblings, 1 reply; 115+ messages in thread
From: Lukas Wunner @ 2017-01-10  1:24 UTC (permalink / raw)
  To: David Airlie
  Cc: Peter Jones, Hans de Goede, Rafael J. Wysocki, Peter Wu,
	Bjorn Helgaas, Mika Westerberg, Kilian Singer, linux-pci,
	Alex Deucher

On Mon, Jan 09, 2017 at 07:17:17PM -0500, David Airlie wrote:
> just FYI, but W541 with Fedora 25 and Linux 4.10-rc3 + drm-next and the efi
> fix (you might want to motivate that fix a bit harder), seems to be working
> well.

The efi fix is on the efi.git next branch without stable designation,
i.e. slated for 4.11 not 4.10, and Matt Fleming usually sends his pull
to Ingo between rc4 and rc5.


> I can suspend/resume, and the nvidia seems to go off.
> 
> [  411.799035] nouveau 0000:01:00.0: DRM: suspending console...
> [  411.799059] nouveau 0000:01:00.0: DRM: suspending display...
> [  411.799119] nouveau 0000:01:00.0: DRM: evicting buffers...
> [  411.799125] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
> [  411.799176] nouveau 0000:01:00.0: DRM: suspending client object trees...
> [  411.805616] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
> [  413.217090] device_pm-0235 device_set_power      : Device [VID1] transitioned to D3hot
> [  413.217099] device_pm-0124 device_get_power      : Device [VID1] power state is (unknown)
> [  413.230201] thinkpad_acpi: EC reports that Thermal Table has changed
> [  413.351497]     power-0275 __acpi_power_off      : Power resource [NVP3] turned off
> [  413.351507] device_pm-0235 device_set_power      : Device [PEG] transitioned to D3hot
> [  413.351526]     power-0189 power_get_state       : Resource [NVP3] is off
> [  413.351530]     power-0219 power_get_list_state  : Resource list is off
> [  413.351542]     power-0189 power_get_state       : Resource [NVP2] is on
> [  413.351545]     power-0219 power_get_list_state  : Resource list is on
> [  413.351548] device_pm-0124 device_get_power      : Device [PEG] power state is D2
> 
> That is with some acpi debugging enabled (though I'm not quite sure why D2 is where it ends up).

The ACPI D2 state seems fishy indeed, could you post the log for a
runtime resume as well?  It would be interesting to see how it gets
out of this incorrect power state.  Also, what is the firmware version
that you're using?  If it's not GNET80WW (2.28), could you attach an
acpidump to the bugzilla entry?

https://bugzilla.kernel.org/show_bug.cgi?id=190861

Thanks,

Lukas
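
For anyone else checking whether their machine runs the firmware named above,
the following is a rough sketch.  It assumes the firmware string is exposed in
/sys/class/dmi/id/bios_version and has the "BUILDID (x.yy)" shape used in this
message; the helper names are invented for illustration, and the second build
ID used in the comments is purely hypothetical.

```python
import re
from pathlib import Path

# Build ID named in the message above; anything else is of interest.
KNOWN_FIRMWARE = "GNET80WW"


def parse_bios_version(text):
    """Split a string like 'GNET80WW (2.28)' into ('GNET80WW', '2.28')."""
    m = re.match(r"\s*(\w+)(?:\s*\(\s*([\d.]+)\s*\))?", text)
    if not m:
        return (None, None)
    return (m.group(1), m.group(2))


def needs_acpidump(text):
    """True when the firmware build differs from the one already on file."""
    build_id, _version = parse_bios_version(text)
    return build_id != KNOWN_FIRMWARE


if __name__ == "__main__":
    dmi = Path("/sys/class/dmi/id/bios_version")
    if dmi.is_file():
        raw = dmi.read_text().strip()
        print(raw, "-> acpidump wanted" if needs_acpidump(raw) else "-> already known")
    else:
        print("bios_version attribute not available on this system")
```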


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-10  1:24                                                             ` Lukas Wunner
@ 2017-01-10  2:15                                                               ` David Airlie
  0 siblings, 0 replies; 115+ messages in thread
From: David Airlie @ 2017-01-10  2:15 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Peter Jones, Hans de Goede, Rafael J. Wysocki, Peter Wu,
	Bjorn Helgaas, Mika Westerberg, Kilian Singer, linux-pci,
	Alex Deucher


> On Mon, Jan 09, 2017 at 07:17:17PM -0500, David Airlie wrote:
> > just FYI, but W541 with Fedora 25 and Linux 4.10-rc3 + drm-next and the efi
> > fix (you might want to motivate that fix a bit harder), seems to be working
> > well.
> 
> The efi fix is on the efi.git next branch without stable designation,
> i.e. slated for 4.11 not 4.10, and Matt Fleming usually sends his pull
> to Ingo between rc4 and rc5.
> 
> 
> > I can suspend/resume, and the nvidia seems to go off.
> > 
> > [  411.799035] nouveau 0000:01:00.0: DRM: suspending console...
> > [  411.799059] nouveau 0000:01:00.0: DRM: suspending display...
> > [  411.799119] nouveau 0000:01:00.0: DRM: evicting buffers...
> > [  411.799125] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
> > [  411.799176] nouveau 0000:01:00.0: DRM: suspending client object trees...
> > [  411.805616] nouveau 0000:01:00.0: DRM: suspending kernel object tree...
> > [  413.217090] device_pm-0235 device_set_power      : Device [VID1] transitioned to D3hot
> > [  413.217099] device_pm-0124 device_get_power      : Device [VID1] power state is (unknown)
> > [  413.230201] thinkpad_acpi: EC reports that Thermal Table has changed
> > [  413.351497]     power-0275 __acpi_power_off      : Power resource [NVP3] turned off
> > [  413.351507] device_pm-0235 device_set_power      : Device [PEG] transitioned to D3hot
> > [  413.351526]     power-0189 power_get_state       : Resource [NVP3] is off
> > [  413.351530]     power-0219 power_get_list_state  : Resource list is off
> > [  413.351542]     power-0189 power_get_state       : Resource [NVP2] is on
> > [  413.351545]     power-0219 power_get_list_state  : Resource list is on
> > [  413.351548] device_pm-0124 device_get_power      : Device [PEG] power state is D2
> > 
> > That is with some acpi debugging enabled (though I'm not quite sure why D2
> > is where it ends up).
> 
> The ACPI D2 state seems fishy indeed, could you post the log for a
> runtime resume as well?  It would be interesting to see how it gets
> out of this incorrect power state.  Also, what is the firmware version
> that you're using?  If it's not GNET80WW (2.28), could you attach an
> acpidump to the bugzilla entry?
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=190861
> 

Okay, I found a possible race and sent patches to fix it to dri-devel:

https://patchwork.freedesktop.org/series/17731/

They are also at https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next-wip-fix-runtime-race
together with the efi patch.

It might just be that enabling runtime PM makes things actually suspend/resume and
we can hit this, or else I've just found something else in the area.

Dave.


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-10  0:33                                                                 ` David Airlie
@ 2017-01-10  9:17                                                                   ` Kilian Singer
  2017-01-12 18:10                                                                     ` Lyude Paul
  0 siblings, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2017-01-10  9:17 UTC (permalink / raw)
  To: David Airlie
  Cc: Hans de Goede, Lyude Paul, Peter Jones, Lukas Wunner,
	Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	linux-pci, Alex Deucher, Lyude

It is a standard Debian installation.

I have not installed powertop.


On 10-Jan-17 01:33, David Airlie wrote:
> Hi Kilian,
>
> Do you use powertop, or have you ever used it? I'm guessing some port is getting into suspend on your machine that isn't on ours due to differing userspace or powertop settings.
>
> Dave.
>
> ----- Original Message -----
>> From: "Kilian Singer" <kilian.singer@quantumtechnology.info>
>> To: "Hans de Goede" <hdegoede@redhat.com>, "Lyude Paul" <lyude@redhat.com>, "David Airlie" <airlied@redhat.com>,
>> "Peter Jones" <pjones@redhat.com>
>> Cc: "Lukas Wunner" <lukas@wunner.de>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
>> "Bjorn Helgaas" <helgaas@kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "linux-pci"
>> <linux-pci@vger.kernel.org>, "Alex Deucher" <alexander.deucher@amd.com>, "Lyude" <cpaul@redhat.com>
>> Sent: Tuesday, 10 January, 2017 4:48:22 AM
>> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
>>
>> Hi Lyude Paul,
>>
>> Normal suspend/resume does not work on my machine either.
>>
>> Best regards
>>
>> Kilian
>>
>>
>> On 09-Jan-17 16:21, Hans de Goede wrote:
>>> Hi Lyude,
>>>
>>> On 09-01-17 16:11, Lyude Paul wrote:
>>>> fwiw, I just tried on the W541 I have 4.8.15-300.fc25.x86_64 running on
>>>> here and so far it seems to suspend/resume just fine using firmware
>>>> version 2.19
>>> Note this is not about normal suspend resume, but runtime
>>> suspend/resume of the nvidia discrete GPU...
>>>
>>> Try running glxgears like this:
>>>
>>> DRI_PRIME=1 glxgears -info | grep REND
>>>
>>> (the grep is to check you're really running on the nvidia GPU).
>>>
>>> Then you should see msgs in dmesg about nouveau resuming the gpu,
>>> then kill glxgears and wait for 5 seconds, now the nouveau drv
>>> should say the gpu is suspending, etc.
>>>
>>> If it never runtime suspends, then make sure you are not using
>>> any external screens, only the built-in laptop screen.
>>>
>>> Regards,
>>>
>>> Hans
>>>
>>>
>>>> On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
>>>>> (cc'ing Lyude, who has the hw also I think).
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: "Peter Jones" <pjones@redhat.com>
>>>>>> To: "Lukas Wunner" <lukas@wunner.de>
>>>>>> Cc: "David Airlie" <airlied@redhat.com>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
>>>>>> "Bjorn Helgaas" <helgaas@kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "Kilian Singer"
>>>>>> <kilian.singer@quantumtechnology.info>, "linux-pci" <linux-pci@vger.kernel.org>, "Alex Deucher"
>>>>>> <alexander.deucher@amd.com>, "Hans de Goede" <hdegoede@redhat.com>
>>>>>> Sent: Friday, 6 January, 2017 4:13:23 AM
>>>>>> Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
>>>>>>
>>>>>> On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
>>>>>>> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
>>>>>>>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
>>>>>>>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner
>>>>>>>>>> wrote:
>>>>>>>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas
>>>>>>>>>>> wrote:
>>>>>>>>>>>> I don't *want* to apply the revert.  It's on my for-linus
>>>>>>>>>>>> branch as a worst-case scenario change if we can't figure out
>>>>>>>>>>>> a better fix.
>>>>>>>>>>>>
>>>>>>>>>>>> The patch below is preferable, but I'd rather not take even
>>>>>>>>>>>> it, because it takes away functionality and forces people to
>>>>>>>>>>>> use a boot parameter to restore it.  I expect that somebody
>>>>>>>>>>>> will figure out how to fix the regression Kilian found and
>>>>>>>>>>>> also keep the new functionality (without requiring boot
>>>>>>>>>>>> parameters) before v4.10.
>>>>>>>>>>> The issue is constrained to hybrid graphics laptops with
>>>>>>>>>>> Nvidia discrete GPU using nouveau.  Hence it needs to be fixed
>>>>>>>>>>> in nouveau, not in the PCI core.
>>>>>>>>>> The problem is not necessarily in the nouveau driver, the same
>>>>>>>>>> problem occurs when you enable RPM without loading nouveau.
>>>>>>>>>> The issue is limited though to some newer hybrid graphics
>>>>>>>>>> laptops with Nvidia GPUs. While a quirk can be added to
>>>>>>>>>> nouveau, I think that a (temporary) quirk in core would also
>>>>>>>>>> be reasonable (since it also occurs without nouveau).
>>>>>>>>>>
>>>>>>>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected as it
>>>>>>>>>>> is known when and how to call an ACPI method versus using
>>>>>>>>>>> PR3.)
>>>>>>>>>>>
>>>>>>>>>>> (Neither are laptops using the Nvidia proprietary driver as it
>>>>>>>>>>> doesn't runtime suspend the card.  But battery life will be
>>>>>>>>>>> terrible then.)
>>>>>>>>>>>
>>>>>>>>>>> We're at rc2 so the time frame for coming up with a fix is
>>>>>>>>>>> probably 4 weeks.  Peter and others have tried for months to
>>>>>>>>>>> reverse-engineer how to handle runtime PM on newer Nvidia
>>>>>>>>>>> cards.  It seems likely that we'll not find the ultimate
>>>>>>>>>>> solution to the problem within 4 weeks.
>>>>>>>>>> Yep, a quick proper fix seems unlikely.
>>>>>>>>>> [ Help/ideas are welcome, I suspect that these failures to
>>>>>>>>>> restore power on laptops designed for Win8+ all have the same
>>>>>>>>>> cause, related to some unknown interaction between ACPI and
>>>>>>>>>> PCI. Some links:
>>>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
>>>>>>>>>>
>>>>>>>>>>> The way it is now, i.e. defaulting to PR3 when available,
>>>>>>>>>>> regresses certain laptops such as Kilian's.  If on the other
>>>>>>>>>>> hand we default to DSM when available, we'll regress certain
>>>>>>>>>>> other laptops, as Peter has pointed out.  Whitelisting or
>>>>>>>>>>> blacklisting laptops doesn't seem a good approach either,
>>>>>>>>>>> ideally we'd want to use PR3 as Windows does.
>>>>>>>>>>>
>>>>>>>>>>> As said, the only short-term solution I see is to add an
>>>>>>>>>>> "optimus" module_param to nouveau to allow users to select
>>>>>>>>>>> which method to use.
>>>>>>>>>>> So in Kilian's case an additional command line parameter would
>>>>>>>>>>> be necessary to fix the issue.
>>>>>>>>>>>
>>>>>>>>>>> Does anyone see a better solution or can we agree on this one?
>>>>>>>>>>> If so I can come up with a patch.  This could go in via Dave
>>>>>>>>>>> Airlie's tree.
>>>>>>>>>> As pcie_port_pm=off already reverts to DSM, I do not think
>>>>>>>>>> that an additional (temporary) nouveau module parameter is
>>>>>>>>>> going to help. I instead propose a (hopefully temporary) quirk
>>>>>>>>>> in pci core that disables D3cold RPM for just Kilians Lenovo
>>>>>>>>>> laptop (basically defaulting to pcie_port_pm=off). Then the
>>>>>>>>>> option pcie_port_pm=force can still be used to test possible
>>>>>>>>>> solutions in the future.
>>>>>>>>> I would rather add a quirk to the ACPI core to prevent the
>>>>>>>>> power resources in question from being enumerated.  Or even to
>>>>>>>>> prevent ACPI PM from being used for the port in question.
>>>>>>>> I do have a W541 in a cupboard in the office somewhere, but I
>>>>>>>> won't be close to it for a couple of weeks. The W541 was the
>>>>>>>> first place I tested the pm patches so I'm kinda wondering
>>>>>>>> whether it's all W541's or just some specific model/bios
>>>>>>>> combo.
>>>>>> They seem to all ship with the 1.10 firmware, and 2.80 is
>>>>>> current (there are a bunch of intermediate 2.xx versions).
>>>>>> Somewhere along the line they introduced some bugs in the UEFI
>>>>>> stuff, so it wouldn't be surprising if there's bugs introduced
>>>>>> elsewhere as well.
>>>>>>
>>>>>>>> However I'm pretty much unavailable to do anything much until
>>>>>>>> late Jan on this.
>>>>>>> Is there anyone else at Red Hat who might be able to look into
>>>>>>> this?
>>>>>>>
>>>>>>> ISTR that Hans de Goede is working on improving laptop support in
>>>>>>> Fedora,
>>>>>>> and Peter Jones recently got a patch merged for the W541 with the
>>>>>>> exact
>>>>>>> same firmware Kilian is using to work around a botched EFI memory
>>>>>>> map.
>>>>>>> Adding them to cc: in the hope that they may be able to help.
>>>>>>>
>>>>>>> @Peter, have you noticed issues with the discrete Nvidia GPU on
>>>>>>> your W541
>>>>>>> related to runtime suspend and system sleep?
>>>>>> I was using a borrowed one (I can certainly find it again, but I'm
>>>>>> not
>>>>>> working on graphics/pm really), but yeah - shutdown and lspci both
>>>>>> broke
>>>>>> sometime after pci_pm_runtime_resume().  Here's the traceback from
>>>>>> SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
>>>>>>
>>>>>> Dave, if you know who in Westford should have a look at this, I can
>>>>>> see
>>>>>> about getting them hardware.  I am more or less surrounded by that
>>>>>> team.
>>>>>>
>>>>>> --
>>>>>>         Peter
>>>>>>
>>

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-05 15:06                                                     ` Lukas Wunner
  2017-01-05 18:13                                                       ` Peter Jones
  2017-01-07 11:45                                                       ` Hans de Goede
@ 2017-01-11 11:04                                                       ` Hans de Goede
  2017-01-11 13:24                                                         ` Kilian Singer
  2 siblings, 1 reply; 115+ messages in thread
From: Hans de Goede @ 2017-01-11 11:04 UTC (permalink / raw)
  To: Lukas Wunner, David Airlie
  Cc: Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	Kilian Singer, linux-pci, Alex Deucher, Peter Jones

Hi,

On 05-01-17 16:06, Lukas Wunner wrote:
> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
>>>>>> I don't *want* to apply the revert.  It's on my for-linus branch as a
>>>>>> worst-case scenario change if we can't figure out a better fix.
>>>>>>
>>>>>> The patch below is preferable, but I'd rather not take even it,
>>>>>> because it takes away functionality and forces people to use a boot
>>>>>> parameter to restore it.  I expect that somebody will figure out how
>>>>>> to fix the regression Kilian found and also keep the new functionality
>>>>>> (without requiring boot parameters) before v4.10.
>>>>>
>>>>> The issue is constrained to hybrid graphics laptops with Nvidia discrete
>>>>> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in the
>>>>> PCI core.
>>>>
>>>> The problem is not necessarily in the nouveau driver, the same problem
>>>> occurs when you enable RPM without loading nouveau. The issue is limited
>>>> though to some newer hybrid graphics laptops with Nvidia GPUs. While a
>>>> quirk can be added to nouveau, I think that a (temporary) quirk in core
>>>> would also be reasonable (since it also occurs without nouveau).
>>>>
>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected as it is known
>>>>> when and how to call an ACPI method versus using PR3.)
>>>>>
>>>>> (Neither are laptops using the Nvidia proprietary driver as it doesn't
>>>>> runtime suspend the card.  But battery life will be terrible then.)
>>>>>
>>>>> We're at rc2 so the time frame for coming up with a fix is probably
>>>>> 4 weeks.  Peter and others have tried for months to reverse-engineer
>>>>> how to handle runtime PM on newer Nvidia cards.  It seems likely that
>>>>> we'll not find the ultimate solution to the problem within 4 weeks.
>>>>
>>>> Yep, a quick proper fix seems unlikely.
>>>> [ Help/ideas are welcome, I suspect that these failures to restore power
>>>> on laptops designed for Win8+ all have the same cause, related to some
>>>> unknown interaction between ACPI and PCI. Some links:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
>>>>
>>>>> The way it is now, i.e. defaulting to PR3 when available, regresses
>>>>> certain laptops such as Kilian's.  If on the other hand we default to
>>>>> DSM when available, we'll regress certain other laptops, as Peter has
>>>>> pointed out.  Whitelisting or blacklisting laptops doesn't seem a good
>>>>> approach either, ideally we'd want to use PR3 as Windows does.
>>>>>
>>>>> As said, the only short-term solution I see is to add an "optimus"
>>>>> module_param to nouveau to allow users to select which method to use.
>>>>> So in Kilian's case an additional command line parameter would be
>>>>> necessary to fix the issue.
>>>>>
>>>>> Does anyone see a better solution or can we agree on this one?  If so
>>>>> I can come up with a patch.  This could go in via Dave Airlie's tree.
>>>>
>>>> As pcie_port_pm=off already reverts to DSM, I do not think that an
>>>> additional (temporary) nouveau module parameter is going to help. I
>>>> instead propose a (hopefully temporary) quirk in pci core that disables
>>>> D3cold RPM for just Kilians Lenovo laptop (basically defaulting to
>>>> pcie_port_pm=off). Then the option pcie_port_pm=force can still be used
>>>> to test possible solutions in the future.
>>>
>>> I would rather add a quirk to the ACPI core to prevent the power resources in
>>> question from being enumerated.  Or even to prevent ACPI PM from being
>>> used for the port in question.
>>
>> I do have a W541 in a cupboard in the office somewhere, but I won't be close to
>> it for a couple of weeks. The W541 was the first place I tested the pm patches
>> so I'm kinda wondering whether it's all W541's or just some specific model/bios
>> combo.
>>
>> However I'm pretty much unavailable to do anything much until late Jan on this.
>
> Is there anyone else at Red Hat who might be able to look into this?
>
> ISTR that Hans de Goede is working on improving laptop support in Fedora,
> and Peter Jones recently got a patch merged for the W541 with the exact
> same firmware Kilian is using to work around a botched EFI memory map.
> Adding them to cc: in the hope that they may be able to help.
>
> @Peter, have you noticed issues with the discrete Nvidia GPU on your W541
> related to runtime suspend and system sleep?

I've tried to reproduce this problem on my W541, which has the exact
same CPU + GPU combo as the reporter of:

https://bugzilla.kernel.org/show_bug.cgi?id=190861

But no luck. I started out with BIOS 2.27 and, when I could not reproduce
the problem, updated to 2.29 (in retrospect I should have tried 2.28 first,
which is what the reporter has), and I still cannot reproduce this.

I'll attach acpidumps of the two BIOS versions I've tried to the bug.

Regards,

Hans

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-11 11:04                                                       ` Hans de Goede
@ 2017-01-11 13:24                                                         ` Kilian Singer
  2017-01-11 13:26                                                           ` Hans de Goede
  0 siblings, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2017-01-11 13:24 UTC (permalink / raw)
  To: Hans de Goede, Lukas Wunner, David Airlie
  Cc: Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	linux-pci, Alex Deucher, Peter Jones

Dear all,

Sounds interesting. I could try updating to 2.29.

Shall I do so?

Best regards

Kilian



On 11-Jan-17 12:04, Hans de Goede wrote:
> HI,
>
> On 05-01-17 16:06, Lukas Wunner wrote:
>> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
>>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
>>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
>>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
>>>>>>> I don't *want* to apply the revert.  It's on my for-linus branch
>>>>>>> as a
>>>>>>> worst-case scenario change if we can't figure out a better fix.
>>>>>>>
>>>>>>> The patch below is preferable, but I'd rather not take even it,
>>>>>>> because it takes away functionality and forces people to use a boot
>>>>>>> parameter to restore it.  I expect that somebody will figure out
>>>>>>> how
>>>>>>> to fix the regression Kilian found and also keep the new
>>>>>>> functionality
>>>>>>> (without requiring boot parameters) before v4.10.
>>>>>>
>>>>>> The issue is constrained to hybrid graphics laptops with Nvidia
>>>>>> discrete
>>>>>> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in
>>>>>> the
>>>>>> PCI core.
>>>>>
>>>>> The problem is not necessarily in the nouveau driver, the same
>>>>> problem
>>>>> occurs when you enable RPM without loading nouveau. The issue is
>>>>> limited
>>>>> though to some newer hybrid graphics laptops with Nvidia GPUs.
>>>>> While a
>>>>> quirk can be added to nouveau, I think that a (temporary) quirk in
>>>>> core
>>>>> would also be reasonable (since it also occurs without nouveau).
>>>>>
>>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected as it is
>>>>>> known
>>>>>> when and how to call an ACPI method versus using PR3.)
>>>>>>
>>>>>> (Neither are laptops using the Nvidia proprietary driver as it
>>>>>> doesn't
>>>>>> runtime suspend the card.  But battery life will be terrible then.)
>>>>>>
>>>>>> We're at rc2 so the time frame for coming up with a fix is probably
>>>>>> 4 weeks.  Peter and others have tried for months to reverse-engineer
>>>>>> how to handle runtime PM on newer Nvidia cards.  It seems likely
>>>>>> that
>>>>>> we'll not find the ultimate solution to the problem within 4 weeks.
>>>>>
>>>>> Yep, a quick proper fix seems unlikely.
>>>>> [ Help/ideas are welcome, I suspect that these failures to restore
>>>>> power
>>>>> on laptops designed for Win8+ all have the same cause, related to
>>>>> some
>>>>> unknown interaction between ACPI and PCI. Some links:
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
>>>>>
>>>>>> The way it is now, i.e. defaulting to PR3 when available, regresses
>>>>>> certain laptops such as Kilian's.  If on the other hand we
>>>>>> default to
>>>>>> DSM when available, we'll regress certain other laptops, as Peter
>>>>>> has
>>>>>> pointed out.  Whitelisting or blacklisting laptops doesn't seem a
>>>>>> good
>>>>>> approach either, ideally we'd want to use PR3 as Windows does.
>>>>>>
>>>>>> As said, the only short-term solution I see is to add an "optimus"
>>>>>> module_param to nouveau to allow users to select which method to
>>>>>> use.
>>>>>> So in Kilian's case an additional command line parameter would be
>>>>>> necessary to fix the issue.
>>>>>>
>>>>>> Does anyone see a better solution or can we agree on this one? 
>>>>>> If so
>>>>>> I can come up with a patch.  This could go in via Dave Airlie's
>>>>>> tree.
>>>>>
>>>>> As pcie_port_pm=off already reverts to DSM, I do not think that an
>>>>> additional (temporary) nouveau module parameter is going to help. I
>>>>> instead propose a (hopefully temporary) quirk in pci core that
>>>>> disables
>>>>> D3cold RPM for just Kilians Lenovo laptop (basically defaulting to
>>>>> pcie_port_pm=off). Then the option pcie_port_pm=force can still be
>>>>> used
>>>>> to test possible solutions in the future.
>>>>
>>>> I would rather add a quirk to the ACPI core to prevent the power
>>>> resources in
>>>> question from being enumerated.  Or even to prevent ACPI PM from being
>>>> used for the port in question.
>>>
>>> I do have a W541 in a cupboard in the office somewhere, but I won't
>>> be close to
>>> it for a couple of weeks. The W541 was the first place I tested the
>>> pm patches
>>> so I'm kinda wondering whether it's all W541's or just some specific
>>> model/bios
>>> combo.
>>>
>>> However I'm pretty much unavailable to do anything much until late
>>> Jan on this.
>>
>> Is there anyone else at Red Hat who might be able to look into this?
>>
>> ISTR that Hans de Goede is working on improving laptop support in
>> Fedora,
>> and Peter Jones recently got a patch merged for the W541 with the exact
>> same firmware Kilian is using to work around a botched EFI memory map.
>> Adding them to cc: in the hope that they may be able to help.
>>
>> @Peter, have you noticed issues with the discrete Nvidia GPU on your
>> W541
>> related to runtime suspend and system sleep?
>
> I've tried to reproduce this problem on my W541, which has the exact
> same CPU + GPU combo as the reporter of:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>
> But no luck, I started out with BIOS-2.27 and when I could not reproduce
> I updated to 2.29 (should have tried 2.28 which is what the reporter
> has first in retrospect) and still no luck in reproducing this.
>
> I'll attach acpidumps of the 2 Bios versions I've tried to the bug.
>
> Regards,
>
> Hans
>

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-11 13:24                                                         ` Kilian Singer
@ 2017-01-11 13:26                                                           ` Hans de Goede
  2017-01-11 16:24                                                             ` Peter Jones
  0 siblings, 1 reply; 115+ messages in thread
From: Hans de Goede @ 2017-01-11 13:26 UTC (permalink / raw)
  To: Kilian Singer, Lukas Wunner, David Airlie
  Cc: Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	linux-pci, Alex Deucher, Peter Jones

Hi,

On 11-01-17 14:24, Kilian Singer wrote:
> Dear all,
>
> sounds interesting I could try to update to 2.29.
>
> Shall I do so?

According to the BIOS changelog, 2.29 has some
fixes for bugs introduced in 2.28, so trying
2.29 is probably a good idea.

Regards,

Hans



>
> Best regards
>
> Kilian
>
>
>
> On 11-Jan-17 12:04, Hans de Goede wrote:
>> HI,
>>
>> On 05-01-17 16:06, Lukas Wunner wrote:
>>> On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
>>>>> On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
>>>>>> On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
>>>>>>> On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
>>>>>>>> I don't *want* to apply the revert.  It's on my for-linus branch
>>>>>>>> as a
>>>>>>>> worst-case scenario change if we can't figure out a better fix.
>>>>>>>>
>>>>>>>> The patch below is preferable, but I'd rather not take even it,
>>>>>>>> because it takes away functionality and forces people to use a boot
>>>>>>>> parameter to restore it.  I expect that somebody will figure out
>>>>>>>> how
>>>>>>>> to fix the regression Kilian found and also keep the new
>>>>>>>> functionality
>>>>>>>> (without requiring boot parameters) before v4.10.
>>>>>>>
>>>>>>> The issue is constrained to hybrid graphics laptops with Nvidia
>>>>>>> discrete
>>>>>>> GPU using nouveau.  Hence it needs to be fixed in nouveau, not in
>>>>>>> the
>>>>>>> PCI core.
>>>>>>
>>>>>> The problem is not necessarily in the nouveau driver, the same
>>>>>> problem
>>>>>> occurs when you enable RPM without loading nouveau. The issue is
>>>>>> limited
>>>>>> though to some newer hybrid graphics laptops with Nvidia GPUs.
>>>>>> While a
>>>>>> quirk can be added to nouveau, I think that a (temporary) quirk in
>>>>>> core
>>>>>> would also be reasonable (since it also occurs without nouveau).
>>>>>>
>>>>>>> (AFAIUI, laptops with AMD discrete GPU are not affected as it is
>>>>>>> known
>>>>>>> when and how to call an ACPI method versus using PR3.)
>>>>>>>
>>>>>>> (Neither are laptops using the Nvidia proprietary driver as it
>>>>>>> doesn't
>>>>>>> runtime suspend the card.  But battery life will be terrible then.)
>>>>>>>
>>>>>>> We're at rc2 so the time frame for coming up with a fix is probably
>>>>>>> 4 weeks.  Peter and others have tried for months to reverse-engineer
>>>>>>> how to handle runtime PM on newer Nvidia cards.  It seems likely
>>>>>>> that
>>>>>>> we'll not find the ultimate solution to the problem within 4 weeks.
>>>>>>
>>>>>> Yep, a quick proper fix seems unlikely.
>>>>>> [ Help/ideas are welcome, I suspect that these failures to restore
>>>>>> power
>>>>>> on laptops designed for Win8+ all have the same cause, related to
>>>>>> some
>>>>>> unknown interaction between ACPI and PCI. Some links:
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
>>>>>>
>>>>>>> The way it is now, i.e. defaulting to PR3 when available, regresses
>>>>>>> certain laptops such as Kilian's.  If on the other hand we
>>>>>>> default to
>>>>>>> DSM when available, we'll regress certain other laptops, as Peter
>>>>>>> has
>>>>>>> pointed out.  Whitelisting or blacklisting laptops doesn't seem a
>>>>>>> good
>>>>>>> approach either, ideally we'd want to use PR3 as Windows does.
>>>>>>>
>>>>>>> As said, the only short-term solution I see is to add an "optimus"
>>>>>>> module_param to nouveau to allow users to select which method to
>>>>>>> use.
>>>>>>> So in Kilian's case an additional command line parameter would be
>>>>>>> necessary to fix the issue.
>>>>>>>
>>>>>>> Does anyone see a better solution or can we agree on this one?
>>>>>>> If so
>>>>>>> I can come up with a patch.  This could go in via Dave Airlie's
>>>>>>> tree.
>>>>>>
>>>>>> As pcie_port_pm=off already reverts to DSM, I do not think that an
>>>>>> additional (temporary) nouveau module parameter is going to help. I
>>>>>> instead propose a (hopefully temporary) quirk in pci core that
>>>>>> disables
>>>>>> D3cold RPM for just Kilians Lenovo laptop (basically defaulting to
>>>>>> pcie_port_pm=off). Then the option pcie_port_pm=force can still be
>>>>>> used
>>>>>> to test possible solutions in the future.
>>>>>
>>>>> I would rather add a quirk to the ACPI core to prevent the power
>>>>> resources in
>>>>> question from being enumerated.  Or even to prevent ACPI PM from being
>>>>> used for the port in question.
>>>>
>>>> I do have a W541 in a cupboard in the office somewhere, but I won't
>>>> be close to
>>>> it for a couple of weeks. The W541 was the first place I tested the
>>>> pm patches
>>>> so I'm kinda wondering whether it's all W541's or just some specific
>>>> model/bios
>>>> combo.
>>>>
>>>> However I'm pretty much unavailable to do anything much until late
>>>> Jan on this.
>>>
>>> Is there anyone else at Red Hat who might be able to look into this?
>>>
>>> ISTR that Hans de Goede is working on improving laptop support in
>>> Fedora,
>>> and Peter Jones recently got a patch merged for the W541 with the exact
>>> same firmware Kilian is using to work around a botched EFI memory map.
>>> Adding them to cc: in the hope that they may be able to help.
>>>
>>> @Peter, have you noticed issues with the discrete Nvidia GPU on your
>>> W541
>>> related to runtime suspend and system sleep?
>>
>> I've tried to reproduce this problem on my W541, which has the exact
>> same CPU + GPU combo as the reporter of:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>
>> But no luck, I started out with BIOS-2.27 and when I could not reproduce
>> I updated to 2.29 (should have tried 2.28 which is what the reporter
>> has first in retrospect) and still no luck in reproducing this.
>>
>> I'll attach acpidumps of the 2 Bios versions I've tried to the bug.
>>
>> Regards,
>>
>> Hans
>>
>

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-11 13:26                                                           ` Hans de Goede
@ 2017-01-11 16:24                                                             ` Peter Jones
  2017-01-11 19:20                                                               ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: Peter Jones @ 2017-01-11 16:24 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Kilian Singer, Lukas Wunner, David Airlie, Rafael J. Wysocki,
	Peter Wu, Bjorn Helgaas, Mika Westerberg, linux-pci,
	Alex Deucher

On Wed, Jan 11, 2017 at 02:26:09PM +0100, Hans de Goede wrote:
> Hi,
> 
> On 11-01-17 14:24, Kilian Singer wrote:
> > Dear all,
> > 
> > sounds interesting I could try to update to 2.29.
> > 
> > Shall I do so?
> 
> According to the BIOS changelog 2.29 has some
> fixes for bugs introduces in 2.28, so trying
> 2.29 is probably a good idea.

I'd also be interested in seeing dmesg from that with efi=debug on the
kernel command line, if you don't mind.

-- 
        Peter

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-11 16:24                                                             ` Peter Jones
@ 2017-01-11 19:20                                                               ` Kilian Singer
  0 siblings, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-11 19:20 UTC (permalink / raw)
  To: Peter Jones, Hans de Goede
  Cc: Lukas Wunner, David Airlie, Rafael J. Wysocki, Peter Wu,
	Bjorn Helgaas, Mika Westerberg, linux-pci, Alex Deucher

I updated to 2.29 but the issue is not resolved.


On 11-Jan-17 17:24, Peter Jones wrote:
> On Wed, Jan 11, 2017 at 02:26:09PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> On 11-01-17 14:24, Kilian Singer wrote:
>>> Dear all,
>>>
>>> sounds interesting I could try to update to 2.29.
>>>
>>> Shall I do so?
>> According to the BIOS changelog 2.29 has some
>> fixes for bugs introduces in 2.28, so trying
>> 2.29 is probably a good idea.
> I'd also be interested in seeing dmesg from that with efi=debug on the
> kernel command line, if you don't mind.
>

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-09 15:21                                                             ` Hans de Goede
  2017-01-09 18:48                                                               ` Kilian Singer
@ 2017-01-11 20:40                                                               ` Lyude Paul
  2017-01-12  1:13                                                                 ` Lyude Paul
  1 sibling, 1 reply; 115+ messages in thread
From: Lyude Paul @ 2017-01-11 20:40 UTC (permalink / raw)
  To: Hans de Goede, David Airlie, Peter Jones
  Cc: Lukas Wunner, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher

Alright yeah, runtime suspend definitely doesn't seem to work on this
one either. I thought I had sent a patch for this a while back, but
trying my patch now it doesn't seem to fix the issue…

On Mon, 2017-01-09 at 16:21 +0100, Hans de Goede wrote:
> Hi Lyude,
> 
> On 09-01-17 16:11, Lyude Paul wrote:
> > fwiw, I just tried on the W541 I have 4.8.15-300.fc25.x86_64
> > running on
> > here and so far it seems to suspend/resume just fine using firmware
> > version 2.19
> 
> Note this is not about normal suspend resume, but runtime
> suspend/resume of the nvidia discrete GPU...
> 
> Try running glxgears like this:
> 
> DRI_PRIME=1 glxgears -info | grep REND
> 
> (the grep is to check you're really running on the nvidia GPU).
> 
> Then you should see msgs in dmesg about nouveau resuming the gpu,
> then kill glxgears and wait for 5 seconds, now the nouveau drv
> should say the gpu is suspending, etc.
> 
> If it never runtime suspends, then make sure you are not using
> any external screens, only the built-in laptop screen.
> 
> Regards,
> 
> Hans
> 
> 
> > 
> > On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
> > > (cc'ing Lyude, who has the hw also I think).
> > > 
> > > ----- Original Message -----
> > > > > From: "Peter Jones" <pjones@redhat.com>
> > > > > To: "Lukas Wunner" <lukas@wunner.de>
> > > > > Cc: "David Airlie" <airlied@redhat.com>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Peter Wu" <peter@lekensteyn.nl>,
> > > > > "Bjorn Helgaas" <helgaas@kernel.org>, "Mika Westerberg" <mika.westerberg@linux.intel.com>, "Kilian Singer"
> > > > > <kilian.singer@quantumtechnology.info>, "linux-pci" <linux-pci@vger.kernel.org>, "Alex Deucher"
> > > > > <alexander.deucher@amd.com>, "Hans de Goede" <hdegoede@redhat.com>
> > > > > Sent: Friday, 6 January, 2017 4:13:23 AM
> > > > > Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> > > > 
> > > > On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
> > > > > On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> > > > > > > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu
> > > > > > > wrote:
> > > > > > > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner
> > > > > > > > wrote:
> > > > > > > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn
> > > > > > > > > Helgaas
> > > > > > > > > wrote:
> > > > > > > > > > I don't *want* to apply the revert.  It's on my
> > > > > > > > > > for-
> > > > > > > > > > linus branch
> > > > > > > > > > as a
> > > > > > > > > > worst-case scenario change if we can't figure out a
> > > > > > > > > > better fix.
> > > > > > > > > > 
> > > > > > > > > > The patch below is preferable, but I'd rather not
> > > > > > > > > > take
> > > > > > > > > > even it,
> > > > > > > > > > because it takes away functionality and forces
> > > > > > > > > > people
> > > > > > > > > > to use a
> > > > > > > > > > boot
> > > > > > > > > > parameter to restore it.  I expect that somebody
> > > > > > > > > > will
> > > > > > > > > > figure out
> > > > > > > > > > how
> > > > > > > > > > to fix the regression Kilian found and also keep
> > > > > > > > > > the
> > > > > > > > > > new
> > > > > > > > > > functionality
> > > > > > > > > > (without requiring boot parameters) before v4.10.
> > > > > > > > > 
> > > > > > > > > The issue is constrained to hybrid graphics laptops
> > > > > > > > > with
> > > > > > > > > Nvidia
> > > > > > > > > discrete
> > > > > > > > > GPU using nouveau.  Hence it needs to be fixed in
> > > > > > > > > nouveau, not in
> > > > > > > > > the
> > > > > > > > > PCI core.
> > > > > > > > 
> > > > > > > > The problem is not necessarily in the nouveau driver,
> > > > > > > > the
> > > > > > > > same
> > > > > > > > problem
> > > > > > > > occurs when you enable RPM without loading nouveau. The
> > > > > > > > issue is
> > > > > > > > limited
> > > > > > > > though to some newer hybrid graphics laptops with
> > > > > > > > Nvidia
> > > > > > > > GPUs. While
> > > > > > > > a
> > > > > > > > quirk can be added to nouveau, I think that a
> > > > > > > > (temporary)
> > > > > > > > quirk in
> > > > > > > > core
> > > > > > > > would also be reasonable (since it also occurs without
> > > > > > > > nouveau).
> > > > > > > > 
> > > > > > > > > (AFAIUI, laptops with AMD discrete GPU are not
> > > > > > > > > affected
> > > > > > > > > as it is
> > > > > > > > > known
> > > > > > > > > when and how to call an ACPI method versus using
> > > > > > > > > PR3.)
> > > > > > > > > 
> > > > > > > > > (Neither are laptops using the Nvidia proprietary
> > > > > > > > > driver
> > > > > > > > > as it
> > > > > > > > > doesn't
> > > > > > > > > runtime suspend the card.  But battery life will be
> > > > > > > > > terrible then.)
> > > > > > > > > 
> > > > > > > > > We're at rc2 so the time frame for coming up with a
> > > > > > > > > fix
> > > > > > > > > is probably
> > > > > > > > > 4 weeks.  Peter and others have tried for months to
> > > > > > > > > reverse-engineer
> > > > > > > > > how to handle runtime PM on newer Nvidia cards.  It
> > > > > > > > > seems
> > > > > > > > > likely
> > > > > > > > > that
> > > > > > > > > we'll not find the ultimate solution to the problem
> > > > > > > > > within 4 weeks.
> > > > > > > > 
> > > > > > > > Yep, a quick proper fix seems unlikely.
> > > > > > > > [ Help/ideas are welcome, I suspect that these failures
> > > > > > > > to
> > > > > > > > restore
> > > > > > > > power
> > > > > > > > on laptops designed for Win8+ all have the same cause,
> > > > > > > > related to
> > > > > > > > some
> > > > > > > > unknown interaction between ACPI and PCI. Some links:
> > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > > > > > 
> > > > > > > > > The way it is now, i.e. defaulting to PR3 when available,
> > > > > > > > > regresses certain laptops such as Kilian's.  If on the
> > > > > > > > > other hand we default to DSM when available, we'll regress
> > > > > > > > > certain other laptops, as Peter has pointed out.
> > > > > > > > > Whitelisting or blacklisting laptops doesn't seem a good
> > > > > > > > > approach either, ideally we'd want to use PR3 as Windows
> > > > > > > > > does.
> > > > > > > > >
> > > > > > > > > As said, the only short-term solution I see is to add an
> > > > > > > > > "optimus" module_param to nouveau to allow users to select
> > > > > > > > > which method to use. So in Kilian's case an additional
> > > > > > > > > command line parameter would be necessary to fix the
> > > > > > > > > issue.
> > > > > > > > >
> > > > > > > > > Does anyone see a better solution or can we agree on this
> > > > > > > > > one?  If so I can come up with a patch.  This could go in
> > > > > > > > > via Dave Airlie's tree.
> > > > > > > >
> > > > > > > > As pcie_port_pm=off already reverts to DSM, I do not think
> > > > > > > > that an additional (temporary) nouveau module parameter is
> > > > > > > > going to help. I instead propose a (hopefully temporary)
> > > > > > > > quirk in pci core that disables D3cold RPM for just
> > > > > > > > Kilians Lenovo laptop (basically defaulting to
> > > > > > > > pcie_port_pm=off). Then the option pcie_port_pm=force can
> > > > > > > > still be used to test possible solutions in the future.
> > > > > > >
> > > > > > > I would rather add a quirk to the ACPI core to prevent the
> > > > > > > power resources in question from being enumerated.  Or
> > > > > > > even to prevent ACPI PM from being used for the port in
> > > > > > > question.
> > > > > >
> > > > > > I do have a W541 in a cupboard in the office somewhere,
> > > > > > but I won't be close to it for a couple of weeks. The W541
> > > > > > was the first place I tested the pm patches so I'm kinda
> > > > > > wondering whether it's all W541's or just some specific
> > > > > > model/bios combo.
> > > >
> > > > They seem to all ship with the 1.10 firmware, and 2.80 is
> > > > current (there are a bunch of intermediate 2.xx versions).
> > > > Somewhere along the line they introduced some bugs in the
> > > > UEFI stuff, so it wouldn't be surprising if there's bugs
> > > > introduced elsewhere as well.
> > > >
> > > > > > However I'm pretty much unavailable to do anything much
> > > > > > until late Jan on this.
> > > > >
> > > > > Is there anyone else at Red Hat who might be able to look
> > > > > into this?
> > > > >
> > > > > ISTR that Hans de Goede is working on improving laptop
> > > > > support in Fedora, and Peter Jones recently got a patch
> > > > > merged for the W541 with the exact same firmware Kilian is
> > > > > using to work around a botched EFI memory map.
> > > > > Adding them to cc: in the hope that they may be able to help.
> > > > >
> > > > > @Peter, have you noticed issues with the discrete Nvidia
> > > > > GPU on your W541 related to runtime suspend and system
> > > > > sleep?
> > > >
> > > > I was using a borrowed one (I can certainly find it again,
> > > > but I'm not working on graphics/pm really), but yeah -
> > > > shutdown and lspci both broke sometime after
> > > > pci_pm_runtime_resume().  Here's the traceback from
> > > > SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
> > > >
> > > > Dave, if you know who in Westford should have a look at
> > > > this, I can see about getting them hardware.  I am more or
> > > > less surrounded by that team.
> > > >
> > > > --
> > > >         Peter
> > > > 
-- 
Cheers,
	Lyude

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-11 20:40                                                               ` Lyude Paul
@ 2017-01-12  1:13                                                                 ` Lyude Paul
  2017-01-12  2:04                                                                   ` Lyude Paul
  0 siblings, 1 reply; 115+ messages in thread
From: Lyude Paul @ 2017-01-12  1:13 UTC (permalink / raw)
  To: Hans de Goede, David Airlie, Peter Jones
  Cc: Lukas Wunner, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher

Neat, went through one of my old kernel git repos and indeed I did
write a patch for fixing this exact issue. I must have forgotten to
follow up with it after sending it out to the mailing list.

Anyway, the original patch I had doesn't fix the problem entirely on
new kernels, so I added some more deadlock fixes to it; now runtime PM
seems to work fine. I will respond with a koji build of the F25 kernel
with the patches applied once I finish wrestling with getting Kerberos
to work with FAS…

On Wed, 2017-01-11 at 15:40 -0500, Lyude Paul wrote:
> Alright yeah, runtime suspend definitely doesn't seem to work on this
> one either. I thought I had sent a patch for this a while back, but
> trying my patch now it doesn't seem to fix the issue…
> 
> On Mon, 2017-01-09 at 16:21 +0100, Hans de Goede wrote:
> > Hi Lyude,
> > 
> > On 09-01-17 16:11, Lyude Paul wrote:
> > > fwiw, I just tried on the W541 I have 4.8.15-300.fc25.x86_64
> > > running on
> > > here and so far it seems to suspend/resume just fine using
> > > firmware
> > > version 2.19
> > 
> > Note this is not about normal suspend resume, but runtime
> > suspend/resume of the nvidia discrete GPU...
> > 
> > Try running glxgears like this:
> > 
> > DRI_PRIME=1 glxgears -info | grep REND
> > 
> > (the grep is to check you're really running on the nvidia GPU).
> > 
> > Then you should see msgs in dmesg about nouveau resuming the gpu,
> > then kill glxgears and wait for 5 seconds, now the nouveau drv
> > should say the gpu is suspending, etc.
> > 
> > If it never runtime suspends, then make sure you are not using
> > any external screens, only the built-in laptop screen.
> > 
> > Regards,
> > 
> > Hans
> > 
> > 
> > > 
> > > On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
> > > > (cc'ing Lyude, who has the hw also I think).
> > > > 
> > > > ----- Original Message -----
> > > > > From: "Peter Jones" <pjones@redhat.com>
> > > > > To: "Lukas Wunner" <lukas@wunner.de>
> > > > > Cc: "David Airlie" <airlied@redhat.com>,
> > > > > "Rafael J. Wysocki" <rjw@rjwysocki.net>,
> > > > > "Peter Wu" <peter@lekensteyn.nl>,
> > > > > "Bjorn Helgaas" <helgaas@kernel.org>,
> > > > > "Mika Westerberg" <mika.westerberg@linux.intel.com>,
> > > > > "Kilian Singer" <kilian.singer@quantumtechnology.info>,
> > > > > "linux-pci" <linux-pci@vger.kernel.org>,
> > > > > "Alex Deucher" <alexander.deucher@amd.com>,
> > > > > "Hans de Goede" <hdegoede@redhat.com>
> > > > > Sent: Friday, 6 January, 2017 4:13:23 AM
> > > > > Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> > > > > 
> > > > > On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
> > > > > > On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> > > > > > > > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> > > > > > > > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> > > > > > > > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > > > > > > > > > > I don't *want* to apply the revert.  It's on my for-linus
> > > > > > > > > > > branch as a worst-case scenario change if we can't figure
> > > > > > > > > > > out a better fix.
> > > > > > > > > > >
> > > > > > > > > > > The patch below is preferable, but I'd rather not take
> > > > > > > > > > > even it, because it takes away functionality and forces
> > > > > > > > > > > people to use a boot parameter to restore it.  I expect
> > > > > > > > > > > that somebody will figure out how to fix the regression
> > > > > > > > > > > Kilian found and also keep the new functionality (without
> > > > > > > > > > > requiring boot parameters) before v4.10.
> > > > > > > > > >
> > > > > > > > > > The issue is constrained to hybrid graphics laptops with
> > > > > > > > > > Nvidia discrete GPU using nouveau.  Hence it needs to be
> > > > > > > > > > fixed in nouveau, not in the PCI core.
> > > > > > > > >
> > > > > > > > > The problem is not necessarily in the nouveau driver, the
> > > > > > > > > same problem occurs when you enable RPM without loading
> > > > > > > > > nouveau. The issue is limited though to some newer hybrid
> > > > > > > > > graphics laptops with Nvidia GPUs. While a quirk can be
> > > > > > > > > added to nouveau, I think that a (temporary) quirk in core
> > > > > > > > > would also be reasonable (since it also occurs without
> > > > > > > > > nouveau).
> > > > > > > > >
> > > > > > > > > > (AFAIUI, laptops with AMD discrete GPU are not affected as
> > > > > > > > > > it is known when and how to call an ACPI method versus
> > > > > > > > > > using PR3.)
> > > > > > > > > >
> > > > > > > > > > (Neither are laptops using the Nvidia proprietary driver
> > > > > > > > > > as it doesn't runtime suspend the card.  But battery life
> > > > > > > > > > will be terrible then.)
> > > > > > > > > >
> > > > > > > > > > We're at rc2 so the time frame for coming up with a fix is
> > > > > > > > > > probably 4 weeks.  Peter and others have tried for months
> > > > > > > > > > to reverse-engineer how to handle runtime PM on newer
> > > > > > > > > > Nvidia cards.  It seems likely that we'll not find the
> > > > > > > > > > ultimate solution to the problem within 4 weeks.
> > > > > > > > >
> > > > > > > > > Yep, a quick proper fix seems unlikely.
> > > > > > > > > [ Help/ideas are welcome, I suspect that these failures to
> > > > > > > > > restore power on laptops designed for Win8+ all have the
> > > > > > > > > same cause, related to some unknown interaction between
> > > > > > > > > ACPI and PCI. Some links:
> > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > > > > > >
> > > > > > > > > > The way it is now, i.e. defaulting to PR3 when available,
> > > > > > > > > > regresses certain laptops such as Kilian's.  If on the
> > > > > > > > > > other hand we default to DSM when available, we'll regress
> > > > > > > > > > certain other laptops, as Peter has pointed out.
> > > > > > > > > > Whitelisting or blacklisting laptops doesn't seem a good
> > > > > > > > > > approach either, ideally we'd want to use PR3 as Windows
> > > > > > > > > > does.
> > > > > > > > > >
> > > > > > > > > > As said, the only short-term solution I see is to add an
> > > > > > > > > > "optimus" module_param to nouveau to allow users to select
> > > > > > > > > > which method to use. So in Kilian's case an additional
> > > > > > > > > > command line parameter would be necessary to fix the
> > > > > > > > > > issue.
> > > > > > > > > >
> > > > > > > > > > Does anyone see a better solution or can we agree on this
> > > > > > > > > > one?  If so I can come up with a patch.  This could go in
> > > > > > > > > > via Dave Airlie's tree.
> > > > > > > > >
> > > > > > > > > As pcie_port_pm=off already reverts to DSM, I do not think
> > > > > > > > > that an additional (temporary) nouveau module parameter is
> > > > > > > > > going to help. I instead propose a (hopefully temporary)
> > > > > > > > > quirk in pci core that disables D3cold RPM for just
> > > > > > > > > Kilians Lenovo laptop (basically defaulting to
> > > > > > > > > pcie_port_pm=off). Then the option pcie_port_pm=force can
> > > > > > > > > still be used to test possible solutions in the future.
> > > > > > > >
> > > > > > > > I would rather add a quirk to the ACPI core to prevent the
> > > > > > > > power resources in question from being enumerated.  Or
> > > > > > > > even to prevent ACPI PM from being used for the port in
> > > > > > > > question.
> > > > > > >
> > > > > > > I do have a W541 in a cupboard in the office somewhere,
> > > > > > > but I won't be close to it for a couple of weeks. The W541
> > > > > > > was the first place I tested the pm patches so I'm kinda
> > > > > > > wondering whether it's all W541's or just some specific
> > > > > > > model/bios combo.
> > > > >
> > > > > They seem to all ship with the 1.10 firmware, and 2.80 is
> > > > > current (there are a bunch of intermediate 2.xx versions).
> > > > > Somewhere along the line they introduced some bugs in the
> > > > > UEFI stuff, so it wouldn't be surprising if there's bugs
> > > > > introduced elsewhere as well.
> > > > >
> > > > > > > However I'm pretty much unavailable to do anything much
> > > > > > > until late Jan on this.
> > > > > >
> > > > > > Is there anyone else at Red Hat who might be able to look
> > > > > > into this?
> > > > > >
> > > > > > ISTR that Hans de Goede is working on improving laptop
> > > > > > support in Fedora, and Peter Jones recently got a patch
> > > > > > merged for the W541 with the exact same firmware Kilian is
> > > > > > using to work around a botched EFI memory map.
> > > > > > Adding them to cc: in the hope that they may be able to help.
> > > > > >
> > > > > > @Peter, have you noticed issues with the discrete Nvidia
> > > > > > GPU on your W541 related to runtime suspend and system
> > > > > > sleep?
> > > > >
> > > > > I was using a borrowed one (I can certainly find it again,
> > > > > but I'm not working on graphics/pm really), but yeah -
> > > > > shutdown and lspci both broke sometime after
> > > > > pci_pm_runtime_resume().  Here's the traceback from
> > > > > SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
> > > > >
> > > > > Dave, if you know who in Westford should have a look at
> > > > > this, I can see about getting them hardware.  I am more or
> > > > > less surrounded by that team.
> > > > >
> > > > > --
> > > > >         Peter
> > > > > 
-- 
Cheers,
	Lyude

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-12  1:13                                                                 ` Lyude Paul
@ 2017-01-12  2:04                                                                   ` Lyude Paul
  2017-01-12  2:12                                                                     ` Lukas Wunner
  0 siblings, 1 reply; 115+ messages in thread
From: Lyude Paul @ 2017-01-12  2:04 UTC (permalink / raw)
  To: Hans de Goede, David Airlie, Peter Jones
  Cc: Lukas Wunner, Rafael J. Wysocki, Peter Wu, Bjorn Helgaas,
	Mika Westerberg, Kilian Singer, linux-pci, Alex Deucher

Finally got koji to work :). People having the runtime resume problem,
can you give this kernel RPM a try and tell me if the issue still
persists?

https://koji.fedoraproject.org/koji/taskinfo?taskID=17250338
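
For anyone testing, here is a minimal sketch of how the runtime PM state
of the discrete GPU could be read from a shell, to go with the glxgears
procedure Hans described earlier in the thread. The PCI address
0000:01:00.0 and the SYSFS_ROOT override are assumptions made for
illustration (look up the real address with "lspci -D"); the
power/runtime_status file is the standard sysfs runtime PM attribute.

```shell
#!/bin/sh
# Sketch: report the runtime PM state of a PCI device via sysfs.
# SYSFS_ROOT is a hypothetical override so the function can be pointed
# at a fake tree; on a real system it defaults to /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

runtime_status() {
    # $1: PCI address in domain:bus:device.function form
    f="$SYSFS_ROOT/bus/pci/devices/$1/power/runtime_status"
    if [ -r "$f" ]; then
        cat "$f"          # e.g. "active" or "suspended"
    else
        echo "unknown"    # device absent or attribute not readable
    fi
}

# 0000:01:00.0 is an assumed address for the discrete GPU.
runtime_status "${1:-0000:01:00.0}"
```

Run it once while "DRI_PRIME=1 glxgears" is running and again several
seconds after killing it; on a kernel where runtime suspend works, the
second run would be expected to print "suspended".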

On Wed, 2017-01-11 at 20:13 -0500, Lyude Paul wrote:
> Neat, went through one of my old kernel git repos and indeed I did
> write a patch for fixing this exact issue. I must have forgotten to
> follow up with it after sending it out to the mailing list.
> 
> Anyway, the original patch I had doesn't fix the problem entirely on
> new kernels, so I added some more deadlock fixes to it; now runtime PM
> seems to work fine. I will respond with a koji build of the F25 kernel
> with the patches applied once I finish wrestling with getting Kerberos
> to work with FAS…
> 
> On Wed, 2017-01-11 at 15:40 -0500, Lyude Paul wrote:
> > Alright yeah, runtime suspend definitely doesn't seem to work on
> > this
> > one either. I thought I had sent a patch for this a while back, but
> > trying my patch now it doesn't seem to fix the issue…
> > 
> > On Mon, 2017-01-09 at 16:21 +0100, Hans de Goede wrote:
> > > Hi Lyude,
> > > 
> > > On 09-01-17 16:11, Lyude Paul wrote:
> > > > fwiw, I just tried on the W541 I have 4.8.15-300.fc25.x86_64
> > > > running on
> > > > here and so far it seems to suspend/resume just fine using
> > > > firmware
> > > > version 2.19
> > > 
> > > Note this is not about normal suspend resume, but runtime
> > > suspend/resume of the nvidia discrete GPU...
> > > 
> > > Try running glxgears like this:
> > > 
> > > DRI_PRIME=1 glxgears -info | grep REND
> > > 
> > > (the grep is to check you're really running on the nvidia GPU).
> > > 
> > > Then you should see msgs in dmesg about nouveau resuming the gpu,
> > > then kill glxgears and wait for 5 seconds, now the nouveau drv
> > > should say the gpu is suspending, etc.
> > > 
> > > If it never runtime suspends, then make sure you are not using
> > > any external screens, only the built-in laptop screen.
> > > 
> > > Regards,
> > > 
> > > Hans
> > > 
> > > 
> > > > 
> > > > On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
> > > > > (cc'ing Lyude, who has the hw also I think).
> > > > > 
> > > > > ----- Original Message -----
> > > > > > From: "Peter Jones" <pjones@redhat.com>
> > > > > > To: "Lukas Wunner" <lukas@wunner.de>
> > > > > > Cc: "David Airlie" <airlied@redhat.com>,
> > > > > > "Rafael J. Wysocki" <rjw@rjwysocki.net>,
> > > > > > "Peter Wu" <peter@lekensteyn.nl>,
> > > > > > "Bjorn Helgaas" <helgaas@kernel.org>,
> > > > > > "Mika Westerberg" <mika.westerberg@linux.intel.com>,
> > > > > > "Kilian Singer" <kilian.singer@quantumtechnology.info>,
> > > > > > "linux-pci" <linux-pci@vger.kernel.org>,
> > > > > > "Alex Deucher" <alexander.deucher@amd.com>,
> > > > > > "Hans de Goede" <hdegoede@redhat.com>
> > > > > > Sent: Friday, 6 January, 2017 4:13:23 AM
> > > > > > Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> > > > > > 
> > > > > > On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner wrote:
> > > > > > > On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie wrote:
> > > > > > > > > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu wrote:
> > > > > > > > > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas Wunner wrote:
> > > > > > > > > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn Helgaas wrote:
> > > > > > > > > > > > I don't *want* to apply the revert.  It's on my for-linus
> > > > > > > > > > > > branch as a worst-case scenario change if we can't figure
> > > > > > > > > > > > out a better fix.
> > > > > > > > > > > >
> > > > > > > > > > > > The patch below is preferable, but I'd rather not take
> > > > > > > > > > > > even it, because it takes away functionality and forces
> > > > > > > > > > > > people to use a boot parameter to restore it.  I expect
> > > > > > > > > > > > that somebody will figure out how to fix the regression
> > > > > > > > > > > > Kilian found and also keep the new functionality (without
> > > > > > > > > > > > requiring boot parameters) before v4.10.
> > > > > > > > > > >
> > > > > > > > > > > The issue is constrained to hybrid graphics laptops with
> > > > > > > > > > > Nvidia discrete GPU using nouveau.  Hence it needs to be
> > > > > > > > > > > fixed in nouveau, not in the PCI core.
> > > > > > > > > >
> > > > > > > > > > The problem is not necessarily in the nouveau driver, the
> > > > > > > > > > same problem occurs when you enable RPM without loading
> > > > > > > > > > nouveau. The issue is limited though to some newer hybrid
> > > > > > > > > > graphics laptops with Nvidia GPUs. While a quirk can be
> > > > > > > > > > added to nouveau, I think that a (temporary) quirk in core
> > > > > > > > > > would also be reasonable (since it also occurs without
> > > > > > > > > > nouveau).
> > > > > > > > > >
> > > > > > > > > > > (AFAIUI, laptops with AMD discrete GPU are not affected as
> > > > > > > > > > > it is known when and how to call an ACPI method versus
> > > > > > > > > > > using PR3.)
> > > > > > > > > > >
> > > > > > > > > > > (Neither are laptops using the Nvidia proprietary driver
> > > > > > > > > > > as it doesn't runtime suspend the card.  But battery life
> > > > > > > > > > > will be terrible then.)
> > > > > > > > > > >
> > > > > > > > > > > We're at rc2 so the time frame for coming up with a fix is
> > > > > > > > > > > probably 4 weeks.  Peter and others have tried for months
> > > > > > > > > > > to reverse-engineer how to handle runtime PM on newer
> > > > > > > > > > > Nvidia cards.  It seems likely that we'll not find the
> > > > > > > > > > > ultimate solution to the problem within 4 weeks.
> > > > > > > > > >
> > > > > > > > > > Yep, a quick proper fix seems unlikely.
> > > > > > > > > > [ Help/ideas are welcome, I suspect that these failures to
> > > > > > > > > > restore power on laptops designed for Win8+ all have the
> > > > > > > > > > same cause, related to some unknown interaction between
> > > > > > > > > > ACPI and PCI. Some links:
> > > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > > > > > > >
> > > > > > > > > > > The way it is now, i.e. defaulting to PR3 when available,
> > > > > > > > > > > regresses certain laptops such as Kilian's.  If on the
> > > > > > > > > > > other hand we default to DSM when available, we'll regress
> > > > > > > > > > > certain other laptops, as Peter has pointed out.
> > > > > > > > > > > Whitelisting or blacklisting laptops doesn't seem a good
> > > > > > > > > > > approach either, ideally we'd want to use PR3 as Windows
> > > > > > > > > > > does.
> > > > > > > > > > >
> > > > > > > > > > > As said, the only short-term solution I see is to add an
> > > > > > > > > > > "optimus" module_param to nouveau to allow users to select
> > > > > > > > > > > which method to use. So in Kilian's case an additional
> > > > > > > > > > > command line parameter would be necessary to fix the
> > > > > > > > > > > issue.
> > > > > > > > > > >
> > > > > > > > > > > Does anyone see a better solution or can we agree on this
> > > > > > > > > > > one?  If so I can come up with a patch.  This could go in
> > > > > > > > > > > via Dave Airlie's tree.
> > > > > > > > > >
> > > > > > > > > > As pcie_port_pm=off already reverts to DSM, I do not think
> > > > > > > > > > that an additional (temporary) nouveau module parameter is
> > > > > > > > > > going to help. I instead propose a (hopefully temporary)
> > > > > > > > > > quirk in pci core that disables D3cold RPM for just
> > > > > > > > > > Kilians Lenovo laptop (basically defaulting to
> > > > > > > > > > pcie_port_pm=off). Then the option pcie_port_pm=force can
> > > > > > > > > > still be used to test possible solutions in the future.
> > > > > > > > >
> > > > > > > > > I would rather add a quirk to the ACPI core to prevent the
> > > > > > > > > power resources in question from being enumerated.  Or
> > > > > > > > > even to prevent ACPI PM from being used for the port in
> > > > > > > > > question.
> > > > > > > >
> > > > > > > > I do have a W541 in a cupboard in the office somewhere,
> > > > > > > > but I won't be close to it for a couple of weeks. The W541
> > > > > > > > was the first place I tested the pm patches so I'm kinda
> > > > > > > > wondering whether it's all W541's or just some specific
> > > > > > > > model/bios combo.
> > > > > >
> > > > > > They seem to all ship with the 1.10 firmware, and 2.80 is
> > > > > > current (there are a bunch of intermediate 2.xx versions).
> > > > > > Somewhere along the line they introduced some bugs in the
> > > > > > UEFI stuff, so it wouldn't be surprising if there's bugs
> > > > > > introduced elsewhere as well.
> > > > > >
> > > > > > > > However I'm pretty much unavailable to do anything much
> > > > > > > > until late Jan on this.
> > > > > > >
> > > > > > > Is there anyone else at Red Hat who might be able to look
> > > > > > > into this?
> > > > > > >
> > > > > > > ISTR that Hans de Goede is working on improving laptop
> > > > > > > support in Fedora, and Peter Jones recently got a patch
> > > > > > > merged for the W541 with the exact same firmware Kilian is
> > > > > > > using to work around a botched EFI memory map.
> > > > > > > Adding them to cc: in the hope that they may be able to help.
> > > > > > >
> > > > > > > @Peter, have you noticed issues with the discrete Nvidia
> > > > > > > GPU on your W541 related to runtime suspend and system
> > > > > > > sleep?
> > > > > >
> > > > > > I was using a borrowed one (I can certainly find it again,
> > > > > > but I'm not working on graphics/pm really), but yeah -
> > > > > > shutdown and lspci both broke sometime after
> > > > > > pci_pm_runtime_resume().  Here's the traceback from
> > > > > > SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
> > > > > >
> > > > > > Dave, if you know who in Westford should have a look at
> > > > > > this, I can see about getting them hardware.  I am more or
> > > > > > less surrounded by that team.
> > > > > >
> > > > > > --
> > > > > >         Peter
> > > > > > 
-- 
Cheers,
	Lyude

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-12  2:04                                                                   ` Lyude Paul
@ 2017-01-12  2:12                                                                     ` Lukas Wunner
  2017-01-17 15:55                                                                       ` Mika Westerberg
  0 siblings, 1 reply; 115+ messages in thread
From: Lukas Wunner @ 2017-01-12  2:12 UTC (permalink / raw)
  To: Lyude Paul
  Cc: Hans de Goede, David Airlie, Peter Jones, Rafael J. Wysocki,
	Peter Wu, Bjorn Helgaas, Mika Westerberg, Kilian Singer,
	linux-pci, Alex Deucher

On Wed, Jan 11, 2017 at 09:04:48PM -0500, Lyude Paul wrote:
> Finally got koji to work :). People having the runtime resume problem,
> can you give this kernel RPM a try and tell me if the issue still
> persists?
> 
> https://koji.fedoraproject.org/koji/taskinfo?taskID=17250338

Kilian uses Debian but has a git repo with Linus' tree.  Could you post
the relevant patch(es) based on either the tip of Linus' master branch
or one of his tags (such as v4.9, v4.10-rc3)?  This would also allow us
to review/comment on the patches.

Thanks!

Lukas

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-10  9:17                                                                   ` Kilian Singer
@ 2017-01-12 18:10                                                                     ` Lyude Paul
  2017-01-24  4:59                                                                       ` Lukas Wunner
  0 siblings, 1 reply; 115+ messages in thread
From: Lyude Paul @ 2017-01-12 18:10 UTC (permalink / raw)
  To: Kilian Singer, David Airlie
  Cc: Hans de Goede, Peter Jones, Lukas Wunner, Rafael J. Wysocki,
	Peter Wu, Bjorn Helgaas, Mika Westerberg, linux-pci,
	Alex Deucher, Lyude

Fwiw, danvet showed me a patch he had already submitted that actually
fixes this issue as well:

https://patchwork.freedesktop.org/patch/132477/

So we're going to go with that. This doesn't fix the race conditions
I've noticed in fbcon(), but danvet suggested that some of the code for
that in nouveau should be cleaned up anyway.

On Tue, 2017-01-10 at 10:17 +0100, Kilian Singer wrote:
> It is a standard Debian installation.
> 
> I have not installed powertop.
> 
> 
> On 10-Jan-17 01:33, David Airlie wrote:
> > Hi Kilian,
> > 
> > Do you use powertop, or have you ever used it? I'm guessing some
> > port is getting into suspend on your machine that isn't on ours due
> > to differing userspace or powertop settings.
> > 
> > Dave.
> > 
> > ----- Original Message -----
> > > From: "Kilian Singer" <kilian.singer@quantumtechnology.info>
> > > To: "Hans de Goede" <hdegoede@redhat.com>,
> > > "Lyude Paul" <lyude@redhat.com>,
> > > "David Airlie" <airlied@redhat.com>,
> > > "Peter Jones" <pjones@redhat.com>
> > > Cc: "Lukas Wunner" <lukas@wunner.de>,
> > > "Rafael J. Wysocki" <rjw@rjwysocki.net>,
> > > "Peter Wu" <peter@lekensteyn.nl>,
> > > "Bjorn Helgaas" <helgaas@kernel.org>,
> > > "Mika Westerberg" <mika.westerberg@linux.intel.com>,
> > > "linux-pci" <linux-pci@vger.kernel.org>,
> > > "Alex Deucher" <alexander.deucher@amd.com>,
> > > "Lyude" <cpaul@redhat.com>
> > > Sent: Tuesday, 10 January, 2017 4:48:22 AM
> > > Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> > > 
> > > Hi Lyude Paul,
> > > 
> > > > normal suspend/resume does not work on my machine either.
> > > 
> > > Best regards
> > > 
> > > Kilian
> > > 
> > > 
> > > On 09-Jan-17 16:21, Hans de Goede wrote:
> > > > Hi Lyude,
> > > > 
> > > > On 09-01-17 16:11, Lyude Paul wrote:
> > > > > fwiw, I just tried on the W541 I have 4.8.15-300.fc25.x86_64
> > > > > running on
> > > > > here and so far it seems to suspend/resume just fine using
> > > > > firmware
> > > > > version 2.19
> > > > 
> > > > Note this is not about normal suspend resume, but runtime
> > > > suspend/resume of the nvidia discrete GPU...
> > > > 
> > > > Try running glxgears like this:
> > > > 
> > > > DRI_PRIME=1 glxgears -info | grep REND
> > > > 
> > > > (the grep is to check you're really running on the nvidia GPU).
> > > > 
> > > > Then you should see msgs in dmesg about nouveau resuming the
> > > > gpu,
> > > > then kill glxgears and wait for 5 seconds, now the nouveau drv
> > > > should say the gpu is suspending, etc.
> > > > 
> > > > If it never runtime suspends, then make sure you are not using
> > > > any external screens, only the built-in laptop screen.
> > > > 
> > > > Regards,
> > > > 
> > > > Hans
> > > > 
> > > > 
> > > > > On Thu, 2017-01-05 at 14:36 -0500, David Airlie wrote:
> > > > > > (cc'ing Lyude, who has the hw also I think).
> > > > > > 
> > > > > > ----- Original Message -----
> > > > > > > From: "Peter Jones" <pjones@redhat.com>
> > > > > > > To: "Lukas Wunner" <lukas@wunner.de>
> > > > > > > > Cc: "David Airlie" <airlied@redhat.com>,
> > > > > > > > "Rafael J. Wysocki" <rjw@rjwysocki.net>,
> > > > > > > > "Peter Wu" <peter@lekensteyn.nl>,
> > > > > > > > "Bjorn Helgaas" <helgaas@kernel.org>,
> > > > > > > > "Mika Westerberg" <mika.westerberg@linux.intel.com>,
> > > > > > > > "Kilian Singer" <kilian.singer@quantumtechnology.info>,
> > > > > > > > "linux-pci" <linux-pci@vger.kernel.org>,
> > > > > > > > "Alex Deucher" <alexander.deucher@amd.com>,
> > > > > > > > "Hans de Goede" <hdegoede@redhat.com>
> > > > > > > > Sent: Friday, 6 January, 2017 4:13:23 AM
> > > > > > > > Subject: Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
> > > > > > > 
> > > > > > > On Thu, Jan 05, 2017 at 04:06:46PM +0100, Lukas Wunner
> > > > > > > wrote:
> > > > > > > > On Wed, Jan 04, 2017 at 06:21:14PM -0500, David Airlie
> > > > > > > > wrote:
> > > > > > > > > > On Wednesday, January 04, 2017 10:09:54 PM Peter Wu
> > > > > > > > > > wrote:
> > > > > > > > > > > On Wed, Jan 04, 2017 at 09:16:39AM +0100, Lukas
> > > > > > > > > > > Wunner
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Tue, Jan 03, 2017 at 06:05:57PM -0600, Bjorn
> > > > > > > > > > > > Helgaas
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > I don't *want* to apply the revert.  It's on
> > > > > > > > > > > > > my for-
> > > > > > > > > > > > > linus branch
> > > > > > > > > > > > > as a
> > > > > > > > > > > > > worst-case scenario change if we can't figure
> > > > > > > > > > > > > out a
> > > > > > > > > > > > > better fix.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > The patch below is preferable, but I'd rather
> > > > > > > > > > > > > not take
> > > > > > > > > > > > > even it,
> > > > > > > > > > > > > because it takes away functionality and
> > > > > > > > > > > > > forces people
> > > > > > > > > > > > > to use a
> > > > > > > > > > > > > boot
> > > > > > > > > > > > > parameter to restore it.  I expect that
> > > > > > > > > > > > > somebody will
> > > > > > > > > > > > > figure out
> > > > > > > > > > > > > how
> > > > > > > > > > > > > to fix the regression Kilian found and also
> > > > > > > > > > > > > keep the
> > > > > > > > > > > > > new
> > > > > > > > > > > > > functionality
> > > > > > > > > > > > > (without requiring boot parameters) before
> > > > > > > > > > > > > v4.10.
> > > > > > > > > > > > 
> > > > > > > > > > > > The issue is constrained to hybrid graphics
> > > > > > > > > > > > laptops with
> > > > > > > > > > > > Nvidia
> > > > > > > > > > > > discrete
> > > > > > > > > > > > GPU using nouveau.  Hence it needs to be fixed
> > > > > > > > > > > > in
> > > > > > > > > > > > nouveau, not in
> > > > > > > > > > > > the
> > > > > > > > > > > > PCI core.
> > > > > > > > > > > 
> > > > > > > > > > > The problem is not necessarily in the nouveau
> > > > > > > > > > > driver, the
> > > > > > > > > > > same
> > > > > > > > > > > problem
> > > > > > > > > > > occurs when you enable RPM without loading
> > > > > > > > > > > nouveau. The
> > > > > > > > > > > issue is
> > > > > > > > > > > limited
> > > > > > > > > > > though to some newer hybrid graphics laptops with
> > > > > > > > > > > Nvidia
> > > > > > > > > > > GPUs. While
> > > > > > > > > > > a
> > > > > > > > > > > quirk can be added to nouveau, I think that a
> > > > > > > > > > > (temporary)
> > > > > > > > > > > quirk in
> > > > > > > > > > > core
> > > > > > > > > > > would also be reasonable (since it also occurs
> > > > > > > > > > > without
> > > > > > > > > > > nouveau).
> > > > > > > > > > > 
> > > > > > > > > > > > (AFAIUI, laptops with AMD discrete GPU are not
> > > > > > > > > > > > affected
> > > > > > > > > > > > as it is
> > > > > > > > > > > > known
> > > > > > > > > > > > when and how to call an ACPI method versus
> > > > > > > > > > > > using PR3.)
> > > > > > > > > > > > 
> > > > > > > > > > > > (Neither are laptops using the Nvidia
> > > > > > > > > > > > proprietary driver
> > > > > > > > > > > > as it
> > > > > > > > > > > > doesn't
> > > > > > > > > > > > runtime suspend the card.  But battery life
> > > > > > > > > > > > will be
> > > > > > > > > > > > terrible then.)
> > > > > > > > > > > > 
> > > > > > > > > > > > We're at rc2 so the time frame for coming up
> > > > > > > > > > > > with a fix
> > > > > > > > > > > > is probably
> > > > > > > > > > > > 4 weeks.  Peter and others have tried for
> > > > > > > > > > > > months to
> > > > > > > > > > > > reverse-engineer
> > > > > > > > > > > > how to handle runtime PM on newer Nvidia
> > > > > > > > > > > > cards.  It seems
> > > > > > > > > > > > likely
> > > > > > > > > > > > that
> > > > > > > > > > > > we'll not find the ultimate solution to the
> > > > > > > > > > > > problem
> > > > > > > > > > > > within 4 weeks.
> > > > > > > > > > > 
> > > > > > > > > > > Yep, a quick proper fix seems unlikely.
> > > > > > > > > > > [ Help/ideas are welcome, I suspect that these
> > > > > > > > > > > failures to
> > > > > > > > > > > restore
> > > > > > > > > > > power
> > > > > > > > > > > on laptops designed for Win8+ all have the same
> > > > > > > > > > > cause,
> > > > > > > > > > > related to
> > > > > > > > > > > some
> > > > > > > > > > > unknown interaction between ACPI and PCI. Some
> > > > > > > > > > > links:
> > > > > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=190861
> > > > > > > > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=156341 ]
> > > > > > > > > > > 
> > > > > > > > > > > > The way it is now, i.e. defaulting to PR3 when
> > > > > > > > > > > > available,
> > > > > > > > > > > > regresses
> > > > > > > > > > > > certain laptops such as Kilian's.  If on the
> > > > > > > > > > > > other hand
> > > > > > > > > > > > we default
> > > > > > > > > > > > to
> > > > > > > > > > > > DSM when available, we'll regress certain other
> > > > > > > > > > > > laptops,
> > > > > > > > > > > > as Peter
> > > > > > > > > > > > has
> > > > > > > > > > > > pointed out.  Whitelisting or blacklisting
> > > > > > > > > > > > laptops
> > > > > > > > > > > > doesn't seem a
> > > > > > > > > > > > good
> > > > > > > > > > > > approach either, ideally we'd want to use PR3
> > > > > > > > > > > > as Windows
> > > > > > > > > > > > does.
> > > > > > > > > > > > 
> > > > > > > > > > > > As said, the only short-term solution I see is
> > > > > > > > > > > > to add an
> > > > > > > > > > > > "optimus"
> > > > > > > > > > > > module_param to nouveau to allow users to
> > > > > > > > > > > > select which
> > > > > > > > > > > > method to
> > > > > > > > > > > > use.
> > > > > > > > > > > > So in Kilian's case an additional command line
> > > > > > > > > > > > parameter
> > > > > > > > > > > > would be
> > > > > > > > > > > > necessary to fix the issue.
> > > > > > > > > > > > 
> > > > > > > > > > > > Does anyone see a better solution or can we
> > > > > > > > > > > > agree on this
> > > > > > > > > > > > one?  If
> > > > > > > > > > > > so
> > > > > > > > > > > > I can come up with a patch.  This could go in
> > > > > > > > > > > > via Dave
> > > > > > > > > > > > Airlie's
> > > > > > > > > > > > tree.
> > > > > > > > > > > 
> > > > > > > > > > > As pcie_port_pm=off already reverts to DSM, I do
> > > > > > > > > > > not think
> > > > > > > > > > > that an
> > > > > > > > > > > additional (temporary) nouveau module parameter
> > > > > > > > > > > is going to
> > > > > > > > > > > help. I
> > > > > > > > > > > instead propose a (hopefully temporary) quirk in
> > > > > > > > > > > pci core
> > > > > > > > > > > that
> > > > > > > > > > > disables
> > > > > > > > > > > D3cold RPM for just Kilians Lenovo laptop
> > > > > > > > > > > (basically
> > > > > > > > > > > defaulting to
> > > > > > > > > > > pcie_port_pm=off). Then the option
> > > > > > > > > > > pcie_port_pm=force can
> > > > > > > > > > > still be
> > > > > > > > > > > used
> > > > > > > > > > > to test possible solutions in the future.
> > > > > > > > > > 
> > > > > > > > > > I would rather add a quirk to the ACPI core to
> > > > > > > > > > prevent the
> > > > > > > > > > power
> > > > > > > > > > resources in
> > > > > > > > > > question from being enumerated.  Or even to prevent
> > > > > > > > > > ACPI PM
> > > > > > > > > > from being
> > > > > > > > > > used for the port in question.
> > > > > > > > > 
> > > > > > > > > I do have a W541 in a cupboard in the office
> > > > > > > > > somewhere, but I
> > > > > > > > > won't be
> > > > > > > > > close to
> > > > > > > > > it for a couple of weeks. The W541 was the first
> > > > > > > > > place I tested
> > > > > > > > > the pm
> > > > > > > > > patches
> > > > > > > > > so I'm kinda wondering whether it's all W541's or
> > > > > > > > > just some
> > > > > > > > > specific
> > > > > > > > > model/bios
> > > > > > > > > combo.
> > > > > > > 
> > > > > > > They seem to all ship with the 1.10 firmware, and 2.80 is
> > > > > > > current
> > > > > > > (there
> > > > > > > are a bunch of intermediate 2.xx versions).  Somewhere
> > > > > > > along the
> > > > > > > line
> > > > > > > they introduced some bugs in the UEFI stuff, so it
> > > > > > > wouldn't be
> > > > > > > surprising if there's bugs introduced elsewhere as well.
> > > > > > > 
> > > > > > > > > However I'm pretty much unavailable to do anything
> > > > > > > > > much until
> > > > > > > > > late Jan on
> > > > > > > > > this.
> > > > > > > > 
> > > > > > > > Is there anyone else at Red Hat who might be able to
> > > > > > > > look into
> > > > > > > > this?
> > > > > > > > 
> > > > > > > > ISTR that Hans de Goede is working on improving laptop
> > > > > > > > support in
> > > > > > > > Fedora,
> > > > > > > > and Peter Jones recently got a patch merged for the
> > > > > > > > W541 with the
> > > > > > > > exact
> > > > > > > > same firmware Kilian is using to work around a botched
> > > > > > > > EFI memory
> > > > > > > > map.
> > > > > > > > Adding them to cc: in the hope that they may be able to
> > > > > > > > help.
> > > > > > > > 
> > > > > > > > @Peter, have you noticed issues with the discrete
> > > > > > > > Nvidia GPU on
> > > > > > > > your W541
> > > > > > > > related to runtime suspend and system sleep?
> > > > > > > 
> > > > > > > I was using a borrowed one (I can certainly find it
> > > > > > > again, but I'm
> > > > > > > not
> > > > > > > working on graphics/pm really), but yeah - shutdown and
> > > > > > > lspci both
> > > > > > > broke
> > > > > > > sometime after pci_pm_runtime_resume().  Here's the
> > > > > > > traceback from
> > > > > > > SYS_reboot(): https://goo.gl/photos/T1fr1bksHQb9RSU67
> > > > > > > 
> > > > > > > Dave, if you know who in Westford should have a look at
> > > > > > > this, I can
> > > > > > > see
> > > > > > > about getting them hardware.  I am more or less
> > > > > > > surrounded by that
> > > > > > > team.
> > > > > > > 
> > > > > > > --
> > > > > > >         Peter
> > > > > > > 
> 
> 

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-27 23:57 PCI: Revert "PCI: Add runtime PM support for PCIe ports" Bjorn Helgaas
  2016-12-28  9:17 ` Mika Westerberg
  2016-12-28 11:29 ` Lukas Wunner
@ 2017-01-17 14:56 ` Bjorn Helgaas
  2017-01-17 15:49   ` Kilian Singer
  2017-01-23 20:33   ` Bjorn Helgaas
  2017-01-25 17:58 ` Bjorn Helgaas
  3 siblings, 2 replies; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-17 14:56 UTC (permalink / raw)
  To: kilian.singer; +Cc: linux-pci, Mika Westerberg, Lukas Wunner, Rafael J. Wysocki

On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> Hi Killian,
> 
> Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> and all the debugging you've done.  Below is a revert of the troublesome
> commit.  Can you test it and verify that it also fixes the problem?
> 
> I assume Mika is looking at this and will have a better solution soon.
> But if not, I'll queue this up for v4.10.

Can somebody please summarize the current state of this issue?  I
assume somebody has already posted a better patch that should replace
this naive revert, but I haven't been following the whole thread.

> commit e648b1ca2b94d207289fedc2538d33c57cdbc4de
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Tue Dec 27 17:27:30 2016 -0600
> 
>     Revert "PCI: Add runtime PM support for PCIe ports"
>     
>     Revert 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports").
>     
>     Killian reported that on a Lenovo W54l with i7-4810MQ, Intel HD Graphics
>     4600, and NVIDIA Quadro® K1100M, locking the screen kills all keyboard and
>     mouse interaction.  Reverting 006d44e49a25 fixes the problem.
>     
>     Link: https://bugzilla.kernel.org/show_bug.cgi?id=190861
>     Reported-by: kilian.singer@quantumtechnology.info
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     CC: stable@vger.kernel.org	# v4.8+
>     CC: Mika Westerberg <mika.westerberg@linux.intel.com>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-17 14:56 ` Bjorn Helgaas
@ 2017-01-17 15:49   ` Kilian Singer
  2017-01-23 20:33   ` Bjorn Helgaas
  1 sibling, 0 replies; 115+ messages in thread
From: Kilian Singer @ 2017-01-17 15:49 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Mika Westerberg, Lukas Wunner, Rafael J. Wysocki

Dear Bjorn,

I got two patches to test but they fail because I need to know the git
version against which to patch.

Also a manual patch was not possible because the source code differed
too much.

So currently I am waiting for that info.

Best regards

Kilian


On 17-Jan-17 15:56, Bjorn Helgaas wrote:
> On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
>> Hi Killian,
>>
>> Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
>> and all the debugging you've done.  Below is a revert of the troublesome
>> commit.  Can you test it and verify that it also fixes the problem?
>>
>> I assume Mika is looking at this and will have a better solution soon.
>> But if not, I'll queue this up for v4.10.
> Can somebody please summarize the current state of this issue?  I
> assume somebody has already posted a better patch that should replace
> this naive revert, but I haven't been following the whole thread.
>
>> commit e648b1ca2b94d207289fedc2538d33c57cdbc4de
>> Author: Bjorn Helgaas <bhelgaas@google.com>
>> Date:   Tue Dec 27 17:27:30 2016 -0600
>>
>>     Revert "PCI: Add runtime PM support for PCIe ports"
>>
>>     Revert 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports").
>>
>>     Killian reported that on a Lenovo W54l with i7-4810MQ, Intel HD Graphics
>>     4600, and NVIDIA Quadro® K1100M, locking the screen kills all keyboard and
>>     mouse interaction.  Reverting 006d44e49a25 fixes the problem.
>>
>>     Link: https://bugzilla.kernel.org/show_bug.cgi?id=190861
>>     Reported-by: kilian.singer@quantumtechnology.info
>>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>     CC: stable@vger.kernel.org	# v4.8+
>>     CC: Mika Westerberg <mika.westerberg@linux.intel.com>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-12  2:12                                                                     ` Lukas Wunner
@ 2017-01-17 15:55                                                                       ` Mika Westerberg
  2017-01-17 18:06                                                                         ` Lyude Paul
  0 siblings, 1 reply; 115+ messages in thread
From: Mika Westerberg @ 2017-01-17 15:55 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Lyude Paul, Hans de Goede, David Airlie, Peter Jones,
	Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Kilian Singer,
	linux-pci, Alex Deucher

On Thu, Jan 12, 2017 at 03:12:35AM +0100, Lukas Wunner wrote:
> On Wed, Jan 11, 2017 at 09:04:48PM -0500, Lyude Paul wrote:
> > Finally got koji to work :). People having the runtime resume problem,
> > can you give this kernel RPM a try and tell me if the issue still
> > persists?
> > 
> > https://koji.fedoraproject.org/koji/taskinfo?taskID=17250338
> 
> Kilian uses Debian but has a git repo with Linus' tree.  Could you post
> the relevant patch(es) based on either the tip of Linus' master branch
> or one of his tags (such as v4.9, v4.10-rc3)?  This would also allow us
> to review/comment on the patches.

Hi Lyude,

Just a reminder. Can you post the patch(es) here so that people can try
them out and comment?

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-17 15:55                                                                       ` Mika Westerberg
@ 2017-01-17 18:06                                                                         ` Lyude Paul
  2017-01-17 19:10                                                                           ` Bjorn Helgaas
  0 siblings, 1 reply; 115+ messages in thread
From: Lyude Paul @ 2017-01-17 18:06 UTC (permalink / raw)
  To: Mika Westerberg, Lukas Wunner
  Cc: Hans de Goede, David Airlie, Peter Jones, Rafael J. Wysocki,
	Peter Wu, Bjorn Helgaas, Kilian Singer, linux-pci, Alex Deucher

Did you not get the e-mails I CC'd you in? I did originally send you
guys the patches, but I ended up finding out that Daniel Vetter had
already submitted a fix for this:

https://patchwork.freedesktop.org/patch/132478/

On Tue, 2017-01-17 at 17:55 +0200, Mika Westerberg wrote:
> On Thu, Jan 12, 2017 at 03:12:35AM +0100, Lukas Wunner wrote:
> > On Wed, Jan 11, 2017 at 09:04:48PM -0500, Lyude Paul wrote:
> > > Finally got koji to work :). People having the runtime resume
> > > problem,
> > > can you give this kernel RPM a try and tell me if the issue still
> > > persists?
> > > 
> > > https://koji.fedoraproject.org/koji/taskinfo?taskID=17250338
> > 
> > Kilian uses Debian but has a git repo with Linus' tree.  Could you
> > post
> > the relevant patch(es) based on either the tip of Linus' master
> > branch
> > or one of his tags (such as v4.9, v4.10-rc3)?  This would also
> > allow us
> > to review/comment on the patches.
> 
> Hi Lyude,
> 
> Just a reminder. Can you post the patch(es) here so that people can
> try
> them out and comment?

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-17 18:06                                                                         ` Lyude Paul
@ 2017-01-17 19:10                                                                           ` Bjorn Helgaas
  2017-01-17 19:49                                                                             ` Lyude Paul
  0 siblings, 1 reply; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-17 19:10 UTC (permalink / raw)
  To: Lyude Paul
  Cc: Mika Westerberg, Lukas Wunner, Hans de Goede, David Airlie,
	Peter Jones, Rafael J. Wysocki, Peter Wu, Kilian Singer,
	linux-pci, Alex Deucher, Daniel Vetter

[+cc Daniel]

On Tue, Jan 17, 2017 at 01:06:06PM -0500, Lyude Paul wrote:
> Did you not get the e-mails I CC'd you in? I did originally send you
> guys the patches, but I ended up finding out that Daniel Vetter had
> already submitted a fix for this:
> 
> https://patchwork.freedesktop.org/patch/132478/

Is this related to https://bugzilla.kernel.org/show_bug.cgi?id=190861 ?

If so, we need to connect the dots a little bit by mentioning it in the
changelog, CC'ing this thread on linux-pci, and figuring out how to connect
the fix with the regression, i.e., stable backports, etc.

I haven't followed the entire thread, so I apologize if the patch you
mention is unrelated to this bugzilla.

> On Tue, 2017-01-17 at 17:55 +0200, Mika Westerberg wrote:
> > On Thu, Jan 12, 2017 at 03:12:35AM +0100, Lukas Wunner wrote:
> > > On Wed, Jan 11, 2017 at 09:04:48PM -0500, Lyude Paul wrote:
> > > > Finally got koji to work :). People having the runtime resume
> > > > problem,
> > > > can you give this kernel RPM a try and tell me if the issue still
> > > > persists?
> > > > 
> > > > https://koji.fedoraproject.org/koji/taskinfo?taskID=17250338
> > > 
> > > Kilian uses Debian but has a git repo with Linus' tree.  Could you
> > > post
> > > the relevant patch(es) based on either the tip of Linus' master
> > > branch
> > > or one of his tags (such as v4.9, v4.10-rc3)?  This would also
> > > allow us
> > > to review/comment on the patches.
> > 
> > Hi Lyude,
> > 
> > Just a reminder. Can you post the patch(es) here so that people can
> > try
> > them out and comment?

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-17 19:10                                                                           ` Bjorn Helgaas
@ 2017-01-17 19:49                                                                             ` Lyude Paul
  0 siblings, 0 replies; 115+ messages in thread
From: Lyude Paul @ 2017-01-17 19:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Mika Westerberg, Lukas Wunner, Hans de Goede, David Airlie,
	Peter Jones, Rafael J. Wysocki, Peter Wu, Kilian Singer,
	linux-pci, Alex Deucher, Daniel Vetter

On Tue, 2017-01-17 at 13:10 -0600, Bjorn Helgaas wrote:
> [+cc Daniel]
> 
> On Tue, Jan 17, 2017 at 01:06:06PM -0500, Lyude Paul wrote:
> > Did you not get the e-mails I CC'd you in? I did originally send
> > you
> > guys the patches, but I ended up finding out that Daniel Vetter had
> > already submitted a fix for this:
> > 
> > https://patchwork.freedesktop.org/patch/132478/
> 
> > Is this related to https://bugzilla.kernel.org/show_bug.cgi?id=190861 ?
> 
> If so, we need to connect the dots a little bit by mentioning it in
> the
> changelog, CC'ing this thread on linux-pci, and figuring out how to
> connect
> the fix with the regression, i.e., stable backports, etc.
> 
> I haven't followed the entire thread, so I apologize if the patch you
> mention is unrelated to this bugzilla.

Nice catch. I'm not entirely sure if it's related either but judging
from the bz comments it's certainly plausible. I'll link to the patch
there and see what happens.
> 
> > On Tue, 2017-01-17 at 17:55 +0200, Mika Westerberg wrote:
> > > On Thu, Jan 12, 2017 at 03:12:35AM +0100, Lukas Wunner wrote:
> > > > On Wed, Jan 11, 2017 at 09:04:48PM -0500, Lyude Paul wrote:
> > > > > Finally got koji to work :). People having the runtime resume
> > > > > problem,
> > > > > can you give this kernel RPM a try and tell me if the issue
> > > > > still
> > > > > persists?
> > > > > 
> > > > > https://koji.fedoraproject.org/koji/taskinfo?taskID=17250338
> > > > 
> > > > Kilian uses Debian but has a git repo with Linus' tree.  Could
> > > > you
> > > > post
> > > > the relevant patch(es) based on either the tip of Linus' master
> > > > branch
> > > > or one of his tags (such as v4.9, v4.10-rc3)?  This would also
> > > > allow us
> > > > to review/comment on the patches.
> > > 
> > > Hi Lyude,
> > > 
> > > Just a reminder. Can you post the patch(es) here so that people
> > > can
> > > try
> > > them out and comment?
-- 
Cheers,
	Lyude

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-17 14:56 ` Bjorn Helgaas
  2017-01-17 15:49   ` Kilian Singer
@ 2017-01-23 20:33   ` Bjorn Helgaas
  2017-01-23 21:12     ` Mika Westerberg
  1 sibling, 1 reply; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-23 20:33 UTC (permalink / raw)
  To: kilian.singer; +Cc: linux-pci, Mika Westerberg, Lukas Wunner, Rafael J. Wysocki

On Tue, Jan 17, 2017 at 08:56:28AM -0600, Bjorn Helgaas wrote:
> On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> > Hi Killian,
> > 
> > Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> > and all the debugging you've done.  Below is a revert of the troublesome
> > commit.  Can you test it and verify that it also fixes the problem?
> > 
> > I assume Mika is looking at this and will have a better solution soon.
> > But if not, I'll queue this up for v4.10.
> 
> Can somebody please summarize the current state of this issue?  I
> assume somebody has already posted a better patch that should replace
> this naive revert, but I haven't been following the whole thread.

This is somewhat frustrating.  Is there a better patch than the revert
mentioned below?  There was a lot of hullabaloo when I first posted
it, but I haven't seen a good alternative yet.  I intended the revert
as a worst-case scenario fix, with the expectation that somebody would
fix the problem or at least avoid it without having to do the revert.
Maybe somebody posted that better fix and I just missed it?

From my perspective (and I have not followed the whole 100 message
thread), the bare bones of the situation are that 006d44e49a25 ("PCI:
Add runtime PM support for PCIe ports") probably reduced power
consumption on some machines.  But it also made Kilian's system
unresponsive when locking the screen.

Given only those assumptions, a revert seems like a reasonable
approach.  I understand and agree that we want to save power, but
not at the expense of making systems unresponsive.

Maybe 006d44e49a25 actually fixed a functional problem in addition to
saving power?  I don't think the changelog mentions anything like
that, but if that's the case, we should certainly consider that.

We're at -rc5 already, so if we want something other than a revert,
now is the time to propose it.

> > commit e648b1ca2b94d207289fedc2538d33c57cdbc4de
> > Author: Bjorn Helgaas <bhelgaas@google.com>
> > Date:   Tue Dec 27 17:27:30 2016 -0600
> > 
> >     Revert "PCI: Add runtime PM support for PCIe ports"
> >     
> >     Revert 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports").
> >     
> >     Killian reported that on a Lenovo W54l with i7-4810MQ, Intel HD Graphics
> >     4600, and NVIDIA Quadro® K1100M, locking the screen kills all keyboard and
> >     mouse interaction.  Reverting 006d44e49a25 fixes the problem.
> >     
> >     Link: https://bugzilla.kernel.org/show_bug.cgi?id=190861
> >     Reported-by: kilian.singer@quantumtechnology.info
> >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> >     CC: stable@vger.kernel.org	# v4.8+
> >     CC: Mika Westerberg <mika.westerberg@linux.intel.com>

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-23 20:33   ` Bjorn Helgaas
@ 2017-01-23 21:12     ` Mika Westerberg
  2017-01-24  4:53       ` Lukas Wunner
  2017-01-24 20:01       ` Bjorn Helgaas
  0 siblings, 2 replies; 115+ messages in thread
From: Mika Westerberg @ 2017-01-23 21:12 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: kilian.singer, linux-pci, Lukas Wunner, Rafael J. Wysocki

On Mon, Jan 23, 2017 at 02:33:35PM -0600, Bjorn Helgaas wrote:
> On Tue, Jan 17, 2017 at 08:56:28AM -0600, Bjorn Helgaas wrote:
> > On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> > > Hi Killian,
> > > 
> > > Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> > > and all the debugging you've done.  Below is a revert of the troublesome
> > > commit.  Can you test it and verify that it also fixes the problem?
> > > 
> > > I assume Mika is looking at this and will have a better solution soon.
> > > But if not, I'll queue this up for v4.10.
> > 
> > Can somebody please summarize the current state of this issue?  I
> > assume somebody has already posted a better patch that should replace
> > this naive revert, but I haven't been following the whole thread.
> 
> This is somewhat frustrating.  Is there a better patch than the revert
> mentioned below?  There was a lot of hullabaloo when I first posted
> it, but I haven't seen a good alternative yet.  I intended the revert
> as a worst-case scenario fix, with the expectation that somebody would
> fix the problem or at least avoid it without having to do the revert.
> Maybe somebody posted that better fix and I just missed it?

I understood that there is a patch here:

https://patchwork.freedesktop.org/patch/132478/

that is supposed to fix the issue. I'm waiting for Kilian to test it.

> From my perspective (and I have not followed the whole 100 message
> thread), the bare bones of the situation are that 006d44e49a25 ("PCI:
> Add runtime PM support for PCIe ports") probably reduced power
> consumption on some machines.  But it also made Kilian's system
> unresponsive when locking the screen.
> 
> Given only those assumptions, a revert seems like a reasonable
> approach.  I understand and agree that we want to save power, but
> not at the expense of making systems unresponsive.

But even if you revert the runtime PM commit, the same thing happens
when the system is suspended.

> Maybe 006d44e49a25 actually fixed a functional problem in addition to
> saving power?  I don't think the changelog mentions anything like
> that, but if that's the case, we should certainly consider that.
> 
> We're at -rc5 already, so if we want something other than a revert,
> now is the time to propose it.

Hmm, the runtime PM patches went in for 4.9 IIRC. It is not a regression
introduced in the v4.10 release cycle, so I'm not sure why we are in such
a hurry here?

I understand that the issue should be fixed but not why it should be
fixed for v4.10 as it is not a regression introduced by v4.10-rc1.

^ permalink raw reply	[flat|nested] 115+ messages in thread

* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-23 21:12     ` Mika Westerberg
@ 2017-01-24  4:53       ` Lukas Wunner
  2017-01-24 20:01       ` Bjorn Helgaas
  1 sibling, 0 replies; 115+ messages in thread
From: Lukas Wunner @ 2017-01-24  4:53 UTC
  To: Mika Westerberg
  Cc: Bjorn Helgaas, kilian.singer, linux-pci, Rafael J. Wysocki

On Mon, Jan 23, 2017 at 11:12:47PM +0200, Mika Westerberg wrote:
> On Mon, Jan 23, 2017 at 02:33:35PM -0600, Bjorn Helgaas wrote:
> > On Tue, Jan 17, 2017 at 08:56:28AM -0600, Bjorn Helgaas wrote:
> > > On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> > > > Hi Kilian,
> > > > 
> > > > Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> > > > and all the debugging you've done.  Below is a revert of the troublesome
> > > > commit.  Can you test it and verify that it also fixes the problem?
> > > > 
> > > > I assume Mika is looking at this and will have a better solution soon.
> > > > But if not, I'll queue this up for v4.10.
> > > 
> > > Can somebody please summarize the current state of this issue?  I
> > > assume somebody has already posted a better patch that should replace
> > > this naive revert, but I haven't been following the whole thread.
> > 
> > This is somewhat frustrating.  Is there a better patch than the revert
> > mentioned below?  There was a lot of hullabaloo when I first posted
> > it, but I haven't seen a good alternative yet.  I intended the revert
> > as a worst-case scenario fix, with the expectation that somebody would
> > fix the problem or at least avoid it without having to do the revert.
> > Maybe somebody posted that better fix and I just missed it?
> 
> I understood that there is a patch here:
> 
> https://patchwork.freedesktop.org/patch/132478/
> 
> that is supposed to fix the issue. I'm waiting for Kilian to test it.

That patch landed in Linus' tree tonight (commit 3846fd9b8600,
merge commit 3258943ddb90).

@Kilian: Could you retest with the tip of Linus' master branch?

Thanks,

Lukas


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-12 18:10                                                                     ` Lyude Paul
@ 2017-01-24  4:59                                                                       ` Lukas Wunner
  2017-01-24 19:09                                                                         ` Lyude Paul
  0 siblings, 1 reply; 115+ messages in thread
From: Lukas Wunner @ 2017-01-24  4:59 UTC
  To: Lyude Paul
  Cc: Kilian Singer, David Airlie, Hans de Goede, Peter Jones,
	Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	linux-pci, Alex Deucher, Lyude

On Thu, Jan 12, 2017 at 01:10:35PM -0500, Lyude Paul wrote:
> Fwiw, danvet showed me a patch he had already submitted that actually
> fixes this issue as well:
> 
> https://patchwork.freedesktop.org/patch/132477/
> 
> So we're going to go with that. This doesn't fix the race conditions
> I've noticed in fbcon(), but danvet suggested that some of the code for
> that in nouveau should be cleaned up anyway.

@Lyude:  Since Daniel's patch landed in Linus' tree tonight, what else
is needed to fix the race conditions you mention above that might be at
the root of Kilian's as well as Peter's issues with nouveau?

See:
https://bugzilla.kernel.org/show_bug.cgi?id=190861
https://bugzilla.kernel.org/show_bug.cgi?id=156341

Could you propose a patch on top of Daniel's fix?

Thanks,

Lukas


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-24  4:59                                                                       ` Lukas Wunner
@ 2017-01-24 19:09                                                                         ` Lyude Paul
  0 siblings, 0 replies; 115+ messages in thread
From: Lyude Paul @ 2017-01-24 19:09 UTC
  To: Lukas Wunner
  Cc: Kilian Singer, David Airlie, Hans de Goede, Peter Jones,
	Rafael J. Wysocki, Peter Wu, Bjorn Helgaas, Mika Westerberg,
	linux-pci, Alex Deucher

On Tue, 2017-01-24 at 05:59 +0100, Lukas Wunner wrote:
> On Thu, Jan 12, 2017 at 01:10:35PM -0500, Lyude Paul wrote:
> > Fwiw, danvet showed me a patch he had already submitted that actually
> > fixes this issue as well:
> > 
> > https://patchwork.freedesktop.org/patch/132477/
> > 
> > So we're going to go with that. This doesn't fix the race conditions
> > I've noticed in fbcon(), but danvet suggested that some of the code for
> > that in nouveau should be cleaned up anyway.

Fwiw, this should only ever come up if you are manually trying to turn
the GPU on or off using the debugfs interface with vga-switcheroo.

> 
> @Lyude:  Since Daniel's patch landed in Linus' tree tonight, what else
> is needed to fix the race conditions you mention above that might be at
> the root of Kilian's as well as Peter's issues with nouveau?

I'm not sure what you mean here. Are you still seeing these issues?
Daniel's patch should be all that's required to fix this.

> 
> See:
> https://bugzilla.kernel.org/show_bug.cgi?id=190861
> https://bugzilla.kernel.org/show_bug.cgi?id=156341
> 
> Could you propose a patch on top of Daniel's fix?
> 
> Thanks,
> 
> Lukas
-- 
Cheers,
	Lyude


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-23 21:12     ` Mika Westerberg
  2017-01-24  4:53       ` Lukas Wunner
@ 2017-01-24 20:01       ` Bjorn Helgaas
  2017-01-25  9:48         ` Mika Westerberg
  1 sibling, 1 reply; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-24 20:01 UTC
  To: Mika Westerberg; +Cc: kilian.singer, linux-pci, Lukas Wunner, Rafael J. Wysocki

On Mon, Jan 23, 2017 at 11:12:47PM +0200, Mika Westerberg wrote:
> On Mon, Jan 23, 2017 at 02:33:35PM -0600, Bjorn Helgaas wrote:
> > From my perspective (and I have not followed the whole 100 message
> > thread), the bare bones of the situation are that 006d44e49a25 ("PCI:
> > Add runtime PM support for PCIe ports") probably reduced power
> > consumption on some machines.  But it also made Kilian's system
> > unresponsive when locking the screen.
> > 
> > Given only those assumptions, a revert seems like a reasonable
> > approach.  I understand and agree that we want to save power, but
> > not at the expense of making systems unresponsive.
> 
> But even if you revert the runtime PM commit, the same thing happens
> when the system is suspended.

In other words, we always had bug A, and after adding 006d44e49a25, we
have bug A and bug B.  It is worthwhile to avoid B even if A still
exists.

Kilian tripped over B, and no doubt others have as well.  Most others
will be frustrated and unable to work around it.  We're lucky Kilian
was patient and sophisticated enough to track it down.

> > Maybe 006d44e49a25 actually fixed a functional problem in addition to
> > saving power?  I don't think the changelog mentions anything like
> > that, but if that's the case, we should certainly consider that.
> > 
> > We're at -rc5 already, so if we want something other than a revert,
> > now is the time to propose it.
> 
> Hmm, the runtime PM patches went in for 4.9 IIRC. It is not a regression
> introduced in the v4.10 release cycle, so I'm not sure why we are in such
> a hurry here?
> 
> I understand that the issue should be fixed but not why it should be
> fixed for v4.10 as it is not a regression introduced by v4.10-rc1.

As far as I can tell, the downside of a revert is only that we'll
consume a little more power.  I'm not sure why we would release v4.10
with a known issue that we can easily avoid.

Bjorn


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-24 20:01       ` Bjorn Helgaas
@ 2017-01-25  9:48         ` Mika Westerberg
  2017-01-25 16:05           ` Kilian Singer
  0 siblings, 1 reply; 115+ messages in thread
From: Mika Westerberg @ 2017-01-25  9:48 UTC
  To: Bjorn Helgaas; +Cc: kilian.singer, linux-pci, Lukas Wunner, Rafael J. Wysocki

On Tue, Jan 24, 2017 at 02:01:03PM -0600, Bjorn Helgaas wrote:
> On Mon, Jan 23, 2017 at 11:12:47PM +0200, Mika Westerberg wrote:
> > On Mon, Jan 23, 2017 at 02:33:35PM -0600, Bjorn Helgaas wrote:
> > > From my perspective (and I have not followed the whole 100 message
> > > thread), the bare bones of the situation are that 006d44e49a25 ("PCI:
> > > Add runtime PM support for PCIe ports") probably reduced power
> > > consumption on some machines.  But it also made Kilian's system
> > > unresponsive when locking the screen.
> > > 
> > > Given only those assumptions, a revert seems like a reasonable
> > > approach.  I understand and agree that we want to save power, but
> > > not at the expense of making systems unresponsive.
> > 
> > But even if you revert the runtime PM commit, the same thing happens
> > when the system is suspended.
> 
> In other words, we always had bug A, and after adding 006d44e49a25, we
> have bug A and bug B.  It is worthwhile to avoid B even if A still
> exists.

I meant the same PCI PM series also added support for powering down PCI
bridges when the system is suspended. So the same issue happens when the
system is suspended even if the runtime PM patch is reverted.

> Kilian tripped over B, and no doubt others have as well.  Most others
> will be frustrated and unable to work around it.  We're lucky Kilian
> was patient and sophisticated enough to track it down.

I agree. Thanks Kilian for the patience :)

> > > Maybe 006d44e49a25 actually fixed a functional problem in addition to
> > > saving power?  I don't think the changelog mentions anything like
> > > that, but if that's the case, we should certainly consider that.
> > > 
> > > We're at -rc5 already, so if we want something other than a revert,
> > > now is the time to propose it.
> > 
> > Hmm, the runtime PM patches went in for 4.9 IIRC. It is not a regression
> > introduced in the v4.10 release cycle, so I'm not sure why we are in such
> > a hurry here?
> > 
> > I understand that the issue should be fixed but not why it should be
> > fixed for v4.10 as it is not a regression introduced by v4.10-rc1.
> 
> As far as I can tell, the downside of a revert is only that we'll
> consume a little more power.  I'm not sure why we would release v4.10
> with a known issue that we can easily avoid.

I've been told that reverting the nouveau driver back to using the ACPI
_DSM method causes other issues.

If 3846fd9b8600 ("drm/probe-helpers: Drop locking from poll_enable")
does not solve the issue, I would rather try to understand what is
actually going on and why this happens in the first place, even if that
takes until after v4.10 is released.


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-25  9:48         ` Mika Westerberg
@ 2017-01-25 16:05           ` Kilian Singer
  2017-01-25 16:31             ` Mika Westerberg
  0 siblings, 1 reply; 115+ messages in thread
From: Kilian Singer @ 2017-01-25 16:05 UTC
  To: Mika Westerberg; +Cc: Bjorn Helgaas, linux-pci, Lukas Wunner, Rafael J. Wysocki

Dear Mika,
I just came back from my lecture. I booted into the new kernel and runtime suspend works!
Thanks for all the support. I really enjoyed working with you all.
It was very interesting to see how bugs are fixed :)

Greetings
Kilian



* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2017-01-25 16:05           ` Kilian Singer
@ 2017-01-25 16:31             ` Mika Westerberg
  0 siblings, 0 replies; 115+ messages in thread
From: Mika Westerberg @ 2017-01-25 16:31 UTC
  To: Kilian Singer; +Cc: Bjorn Helgaas, linux-pci, Lukas Wunner, Rafael J. Wysocki

On Wed, Jan 25, 2017 at 05:05:29PM +0100, Kilian Singer wrote:
> Dear Mika,
> I just came back from my lecture. I booted into the new kernel and runtime suspend works!
> Thanks for all the support. I really enjoyed working with you all.
> It was very interesting to see how bugs are fixed :)

OK, thanks a lot for your patience :)

So it was 3846fd9b8600 ("drm/probe-helpers: Drop locking from
poll_enable") that actually fixed the issue for Kilian. I hope we do not
need to revert the runtime PM patch anymore ;-)

Thanks to everyone who participated in this.


* Re: PCI: Revert "PCI: Add runtime PM support for PCIe ports"
  2016-12-27 23:57 PCI: Revert "PCI: Add runtime PM support for PCIe ports" Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2017-01-17 14:56 ` Bjorn Helgaas
@ 2017-01-25 17:58 ` Bjorn Helgaas
  3 siblings, 0 replies; 115+ messages in thread
From: Bjorn Helgaas @ 2017-01-25 17:58 UTC
  To: kilian.singer
  Cc: linux-pci, Mika Westerberg, Lukas Wunner, Rafael J. Wysocki,
	Ben Hutchings, Daniel Vetter, Dave Airlie, Chris Wilson, Lyude

[+cc Ben, since this bug was reported against a Debian stretch kernel,
Daniel, Dave, Chris, Lyude]

On Tue, Dec 27, 2016 at 05:57:37PM -0600, Bjorn Helgaas wrote:
> Hi Kilian,
> 
> Thanks for the report (https://bugzilla.kernel.org/show_bug.cgi?id=190861)
> and all the debugging you've done.  Below is a revert of the troublesome
> commit.  Can you test it and verify that it also fixes the problem?
> 
> I assume Mika is looking at this and will have a better solution soon.
> But if not, I'll queue this up for v4.10.
> 
> 
> commit e648b1ca2b94d207289fedc2538d33c57cdbc4de
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Tue Dec 27 17:27:30 2016 -0600
> 
>     Revert "PCI: Add runtime PM support for PCIe ports"
>     
>     Revert 006d44e49a25 ("PCI: Add runtime PM support for PCIe ports").
>     
>     Kilian reported that on a Lenovo W541 with i7-4810MQ, Intel HD Graphics
>     4600, and NVIDIA Quadro® K1100M, locking the screen kills all keyboard and
>     mouse interaction.  Reverting 006d44e49a25 fixes the problem.
>     
>     Link: https://bugzilla.kernel.org/show_bug.cgi?id=190861
>     Reported-by: kilian.singer@quantumtechnology.info
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     CC: stable@vger.kernel.org	# v4.8+
>     CC: Mika Westerberg <mika.westerberg@linux.intel.com>

I dropped this revert, since Kilian has confirmed that 3846fd9b8600
("drm/probe-helpers: Drop locking from poll_enable"), which is already
in Linus' tree, fixes the problem.

Unfortunately the 3846fd9b8600 changelog does not mention this problem and
I don't know what's required to backport the fix to v4.9.

> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> index 9698289..dcb185c 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -11,7 +11,6 @@
>  #include <linux/kernel.h>
>  #include <linux/errno.h>
>  #include <linux/pm.h>
> -#include <linux/pm_runtime.h>
>  #include <linux/string.h>
>  #include <linux/slab.h>
>  #include <linux/pcieport_if.h>
> @@ -343,8 +342,6 @@ static int pcie_device_init(struct pci_dev *pdev, int service, int irq)
>  		return retval;
>  	}
>  
> -	pm_runtime_no_callbacks(device);
> -
>  	return 0;
>  }
>  
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index 8aa3f14..d3af264 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -85,26 +85,6 @@ static int pcie_port_resume_noirq(struct device *dev)
>  	return 0;
>  }
>  
> -static int pcie_port_runtime_suspend(struct device *dev)
> -{
> -	return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
> -}
> -
> -static int pcie_port_runtime_resume(struct device *dev)
> -{
> -	return 0;
> -}
> -
> -static int pcie_port_runtime_idle(struct device *dev)
> -{
> -	/*
> -	 * Assume the PCI core has set bridge_d3 whenever it thinks the port
> -	 * should be good to go to D3.  Everything else, including moving
> -	 * the port to D3, is handled by the PCI core.
> -	 */
> -	return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
> -}
> -
>  static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>  	.suspend	= pcie_port_device_suspend,
>  	.resume		= pcie_port_device_resume,
> @@ -113,9 +93,6 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>  	.poweroff	= pcie_port_device_suspend,
>  	.restore	= pcie_port_device_resume,
>  	.resume_noirq	= pcie_port_resume_noirq,
> -	.runtime_suspend = pcie_port_runtime_suspend,
> -	.runtime_resume	= pcie_port_runtime_resume,
> -	.runtime_idle	= pcie_port_runtime_idle,
>  };
>  
>  #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)
> @@ -149,31 +126,11 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
>  		return status;
>  
>  	pci_save_state(dev);
> -
> -	if (pci_bridge_d3_possible(dev)) {
> -		/*
> -		 * Keep the port resumed 100ms to make sure things like
> -		 * config space accesses from userspace (lspci) will not
> -		 * cause the port to repeatedly suspend and resume.
> -		 */
> -		pm_runtime_set_autosuspend_delay(&dev->dev, 100);
> -		pm_runtime_use_autosuspend(&dev->dev);
> -		pm_runtime_mark_last_busy(&dev->dev);
> -		pm_runtime_put_autosuspend(&dev->dev);
> -		pm_runtime_allow(&dev->dev);
> -	}
> -
>  	return 0;
>  }
>  
>  static void pcie_portdrv_remove(struct pci_dev *dev)
>  {
> -	if (pci_bridge_d3_possible(dev)) {
> -		pm_runtime_forbid(&dev->dev);
> -		pm_runtime_get_noresume(&dev->dev);
> -		pm_runtime_dont_use_autosuspend(&dev->dev);
> -	}
> -
>  	pcie_port_device_remove(dev);
>  }
>  


Thread overview: 115+ messages
2016-12-27 23:57 PCI: Revert "PCI: Add runtime PM support for PCIe ports" Bjorn Helgaas
2016-12-28  9:17 ` Mika Westerberg
2016-12-28 11:29 ` Lukas Wunner
2016-12-28 16:18   ` Bjorn Helgaas
2016-12-29  9:58     ` Kilian Singer
2016-12-29 16:02       ` Kilian Singer
2016-12-29 16:20         ` Kilian Singer
2016-12-29 17:50           ` Lukas Wunner
2016-12-29 22:52             ` Kilian Singer
2016-12-29 23:02               ` Kilian Singer
2016-12-29 23:05                 ` Kilian Singer
2016-12-29 23:48               ` Lukas Wunner
2016-12-29 23:20             ` Kilian Singer
2016-12-30  0:07               ` Lukas Wunner
2016-12-30  0:16                 ` Kilian Singer
2016-12-30  0:24                   ` Kilian Singer
2016-12-30  0:22                     ` Rafael J. Wysocki
2016-12-30  0:39                       ` Kilian Singer
2016-12-30  0:41                         ` Rafael J. Wysocki
2016-12-30  0:45                       ` Kilian Singer
2016-12-30  1:40                         ` Rafael J. Wysocki
2016-12-30  1:50                           ` Rafael J. Wysocki
2016-12-30  1:52                             ` Rafael J. Wysocki
2016-12-30 13:37                               ` Kilian Singer
2016-12-30 13:59                                 ` Kilian Singer
2016-12-30 14:44                                   ` Rafael J. Wysocki
2016-12-30 14:47                                 ` Rafael J. Wysocki
2017-01-02 12:22                                   ` Mika Westerberg
2017-01-03 17:12                                     ` Kilian Singer
2017-01-02 11:40                   ` Lukas Wunner
2017-01-02 12:10                     ` Mika Westerberg
2017-01-02 13:53                       ` Mika Westerberg
2017-01-02 14:48                       ` Mika Westerberg
2017-01-02 21:31                         ` Rafael J. Wysocki
2017-01-03  9:51                           ` Mika Westerberg
2017-01-03 15:15                             ` Peter Wu
2017-01-03 16:11                               ` Lukas Wunner
2017-01-03 16:31                                 ` Peter Wu
2017-01-03 16:44                                   ` Deucher, Alexander
2017-01-03 18:09                                   ` Lukas Wunner
2017-01-03 18:12                                   ` Bjorn Helgaas
2017-01-03 21:38                                     ` Rafael J. Wysocki
2017-01-03 21:52                                       ` Kilian Singer
2017-01-03 22:07                                         ` Rafael J. Wysocki
2017-01-03 22:25                                           ` Kilian Singer
2017-01-03 22:25                                       ` Bjorn Helgaas
2017-01-03 23:13                                         ` Rafael J. Wysocki
2017-01-04  0:05                                           ` Bjorn Helgaas
2017-01-04  1:09                                             ` Rafael J. Wysocki
2017-01-04  8:16                                             ` Lukas Wunner
2017-01-04 10:33                                               ` Kilian Singer
2017-01-04 12:29                                                 ` Mika Westerberg
2017-01-04 15:50                                               ` Deucher, Alexander
2017-01-04 21:09                                               ` Peter Wu
2017-01-04 21:58                                                 ` Rafael J. Wysocki
2017-01-04 23:21                                                   ` David Airlie
2017-01-05 15:06                                                     ` Lukas Wunner
2017-01-05 18:13                                                       ` Peter Jones
2017-01-05 19:36                                                         ` David Airlie
2017-01-09 15:11                                                           ` Lyude Paul
2017-01-09 15:21                                                             ` Hans de Goede
2017-01-09 18:48                                                               ` Kilian Singer
2017-01-10  0:33                                                                 ` David Airlie
2017-01-10  9:17                                                                   ` Kilian Singer
2017-01-12 18:10                                                                     ` Lyude Paul
2017-01-24  4:59                                                                       ` Lukas Wunner
2017-01-24 19:09                                                                         ` Lyude Paul
2017-01-11 20:40                                                               ` Lyude Paul
2017-01-12  1:13                                                                 ` Lyude Paul
2017-01-12  2:04                                                                   ` Lyude Paul
2017-01-12  2:12                                                                     ` Lukas Wunner
2017-01-17 15:55                                                                       ` Mika Westerberg
2017-01-17 18:06                                                                         ` Lyude Paul
2017-01-17 19:10                                                                           ` Bjorn Helgaas
2017-01-17 19:49                                                                             ` Lyude Paul
2017-01-07 11:45                                                       ` Hans de Goede
2017-01-07 12:16                                                         ` Lukas Wunner
2017-01-09 23:00                                                         ` Peter Jones
2017-01-10  0:17                                                           ` David Airlie
2017-01-10  1:24                                                             ` Lukas Wunner
2017-01-10  2:15                                                               ` David Airlie
2017-01-11 11:04                                                       ` Hans de Goede
2017-01-11 13:24                                                         ` Kilian Singer
2017-01-11 13:26                                                           ` Hans de Goede
2017-01-11 16:24                                                             ` Peter Jones
2017-01-11 19:20                                                               ` Kilian Singer
2017-01-05 10:49                                                   ` Mika Westerberg
2017-01-05 14:19                                                     ` Rafael J. Wysocki
2017-01-05 14:20                                                     ` Mika Westerberg
2017-01-05 14:23                                                       ` Rafael J. Wysocki
2017-01-05 14:42                                                 ` Lukas Wunner
2017-01-06  1:21                                                   ` Rafael J. Wysocki
2017-01-07  6:50                                                     ` Mika Westerberg
2017-01-07 11:35                                                   ` Peter Wu
2017-01-07 12:19                                                     ` Lukas Wunner
2017-01-07 12:36                                                       ` Peter Wu
2017-01-08 14:05                                                         ` Lukas Wunner
2017-01-04 21:55                                               ` Rafael J. Wysocki
2017-01-03 21:26                                 ` Rafael J. Wysocki
2017-01-03 17:37                               ` Kilian Singer
2017-01-03 17:10                       ` Kilian Singer
2017-01-03 16:59                     ` Kilian Singer
2017-01-03 17:08                     ` Kilian Singer
2016-12-30  0:19     ` Rafael J. Wysocki
2016-12-30 14:48       ` Rafael J. Wysocki
2017-01-17 14:56 ` Bjorn Helgaas
2017-01-17 15:49   ` Kilian Singer
2017-01-23 20:33   ` Bjorn Helgaas
2017-01-23 21:12     ` Mika Westerberg
2017-01-24  4:53       ` Lukas Wunner
2017-01-24 20:01       ` Bjorn Helgaas
2017-01-25  9:48         ` Mika Westerberg
2017-01-25 16:05           ` Kilian Singer
2017-01-25 16:31             ` Mika Westerberg
2017-01-25 17:58 ` Bjorn Helgaas
