linux-kernel.vger.kernel.org archive mirror
* [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
@ 2020-08-21 12:32 Kai-Heng Feng
  2020-08-24 13:04 ` Mika Westerberg
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Kai-Heng Feng @ 2020-08-21 12:32 UTC (permalink / raw)
  To: bhelgaas
  Cc: jonathan.derrick, Mario.Limonciello, Kai-Heng Feng,
	Heiner Kallweit, Mika Westerberg, Rafael J. Wysocki,
	Xiongfeng Wang, Krzysztof Wilczynski, open list:PCI SUBSYSTEM,
	open list

New Intel laptops with VMD cannot reach deeper power saving states, which
results in very short battery life.

As BIOS may not be able to program the config space for devices under
VMD domain, ASPM needs to be programmed manually by software. This is
also the case under Windows.

The VMD controller itself is a root complex integrated endpoint that
doesn't have ASPM capability, so we can't propagate the ASPM settings to
devices under it. Hence, simply apply ASPM_STATE_ALL to the links under the
VMD domain; unsupported states will be cleared out anyway.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/pci/pcie/aspm.c |  3 ++-
 drivers/pci/quirks.c    | 11 +++++++++++
 include/linux/pci.h     |  2 ++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 253c30cc1967..dcc002dbca19 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 		aspm_calc_l1ss_info(link, &upreg, &dwreg);
 
 	/* Save default state */
-	link->aspm_default = link->aspm_enabled;
+	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
+			     ASPM_STATE_ALL : link->aspm_enabled;
 
 	/* Setup initial capable state. Will be updated later */
 	link->aspm_capable = link->aspm_support;
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index bdf9b52567e0..2e2f525bd892 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5632,3 +5632,14 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
 			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
+
+/*
+ * Device [8086:9a09]
+ * BIOS may not be able to access config space of devices under VMD domain, so
+ * it relies on software to enable ASPM for links under VMD.
+ */
+static void pci_fixup_enable_aspm(struct pci_dev *pdev)
+{
+	pdev->dev_flags |= PCI_DEV_FLAGS_ENABLE_ASPM;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a09, pci_fixup_enable_aspm);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 835530605c0d..66a45916c7c6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -227,6 +227,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
 	/* Don't use Relaxed Ordering for TLPs directed at this device */
 	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
+	/* Enable ASPM regardless of how LnkCtl is programmed */
+	PCI_DEV_FLAGS_ENABLE_ASPM = (__force pci_dev_flags_t) (1 << 12),
 };
 
 enum pci_irq_reroute_variant {
-- 
2.17.1



* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-21 12:32 [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain Kai-Heng Feng
@ 2020-08-24 13:04 ` Mika Westerberg
  2020-08-25  6:23 ` Christoph Hellwig
  2020-09-10  1:55 ` Bjorn Helgaas
  2 siblings, 0 replies; 25+ messages in thread
From: Mika Westerberg @ 2020-08-24 13:04 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: bhelgaas, jonathan.derrick, Mario.Limonciello, Heiner Kallweit,
	Rafael J. Wysocki, Xiongfeng Wang, Krzysztof Wilczynski,
	open list:PCI SUBSYSTEM, open list

Hi,

On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> New Intel laptops with VMD cannot reach deeper power saving state,
> renders very short battery time.
> 
> As BIOS may not be able to program the config space for devices under
> VMD domain, ASPM needs to be programmed manually by software. This is
> also the case under Windows.
> 
> The VMD controller itself is a root complex integrated endpoint that
> doesn't have ASPM capability, so we can't propagate the ASPM settings to
> devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
> VMD domain, unsupported states will be cleared out anyway.
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/pci/pcie/aspm.c |  3 ++-
>  drivers/pci/quirks.c    | 11 +++++++++++
>  include/linux/pci.h     |  2 ++
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 253c30cc1967..dcc002dbca19 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>  		aspm_calc_l1ss_info(link, &upreg, &dwreg);
>  
>  	/* Save default state */
> -	link->aspm_default = link->aspm_enabled;
> +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
> +			     ASPM_STATE_ALL : link->aspm_enabled;
>  
>  	/* Setup initial capable state. Will be updated later */
>  	link->aspm_capable = link->aspm_support;
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index bdf9b52567e0..2e2f525bd892 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5632,3 +5632,14 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> +
> +/*
> + * Device [8086:9a09]
> + * BIOS may not be able to access config space of devices under VMD domain, so
> + * it relies on software to enable ASPM for links under VMD.
> + */
> +static void pci_fixup_enable_aspm(struct pci_dev *pdev)
> +{
> +	pdev->dev_flags |= PCI_DEV_FLAGS_ENABLE_ASPM;
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a09, pci_fixup_enable_aspm);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 835530605c0d..66a45916c7c6 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -227,6 +227,8 @@ enum pci_dev_flags {
>  	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
>  	/* Don't use Relaxed Ordering for TLPs directed at this device */
>  	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> +	/* Enable ASPM regardless of how LnkCtl is programmed */
> +	PCI_DEV_FLAGS_ENABLE_ASPM = (__force pci_dev_flags_t) (1 << 12),

I wonder if, instead of dev_flags, this should be a bit field in struct
pci_dev? Not sure which one is preferred actually, both seem to include
quirks as well ;-)
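
For illustration only, that alternative might look roughly like the fragment
below; the field name is made up and this is a sketch, not a concrete
proposal:

/*
 * Hypothetical fragment: a dedicated bit in struct pci_dev instead of a
 * pci_dev_flags_t flag.  The field name is invented for illustration.
 */
struct pci_dev {
	/* ... existing fields ... */
	unsigned int	aspm_default_all:1;	/* quirk: default links below to ASPM_STATE_ALL */
	/* ... */
};

/* The quirk in quirks.c would then just set the bit: */
static void pci_fixup_enable_aspm(struct pci_dev *pdev)
{
	pdev->aspm_default_all = 1;
}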

>  };
>  
>  enum pci_irq_reroute_variant {
> -- 
> 2.17.1


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-21 12:32 [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain Kai-Heng Feng
  2020-08-24 13:04 ` Mika Westerberg
@ 2020-08-25  6:23 ` Christoph Hellwig
  2020-08-25  6:39   ` Kai Heng Feng
  2020-08-26 21:43   ` Derrick, Jonathan
  2020-09-10  1:55 ` Bjorn Helgaas
  2 siblings, 2 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-08-25  6:23 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: bhelgaas, jonathan.derrick, Mario.Limonciello, Heiner Kallweit,
	Mika Westerberg, Rafael J. Wysocki, Xiongfeng Wang,
	Krzysztof Wilczynski, open list:PCI SUBSYSTEM, open list,
	Dan Williams, Huffman, Amber

On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> New Intel laptops with VMD cannot reach deeper power saving state,
> renders very short battery time.

So what about just disabling VMD given how bloody pointless it is?
Hasn't anyone learned from the AHCI remapping debacle?

I'm really pissed at all this pointless crap Intel comes up with just
to make life hard for absolutely no gain.  Is it so hard to just leave
an NVMe device as a standard NVMe device instead of f*^&ing everything
up in the chipset to make OS support a pain and I/O slower than by
doing nothing?


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-25  6:23 ` Christoph Hellwig
@ 2020-08-25  6:39   ` Kai Heng Feng
  2020-08-25  6:56     ` Christoph Hellwig
  2020-08-26 21:43   ` Derrick, Jonathan
  1 sibling, 1 reply; 25+ messages in thread
From: Kai Heng Feng @ 2020-08-25  6:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Bjorn Helgaas, jonathan.derrick, Mario.Limonciello,
	Heiner Kallweit, Mika Westerberg, Rafael J. Wysocki,
	Xiongfeng Wang, Krzysztof Wilczynski, open list:PCI SUBSYSTEM,
	open list, Dan Williams, Huffman, Amber

Hi Christoph,

> On Aug 25, 2020, at 2:23 PM, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
>> New Intel laptops with VMD cannot reach deeper power saving state,
>> renders very short battery time.
> 
> So what about just disabling VMD given how bloody pointless it is?
> Hasn't anyone learned from the AHCI remapping debacle?
> 
> I'm really pissed at all this pointless crap intel comes up with just
> to make life hard for absolutely no gain.  Is it so hard to just leave
> a NVMe device as a standard NVMe device instead of f*^&ing everything
> up in the chipset to make OS support a pain and I/O slower than by
> doing nothing?

From what I can see on the hardware at hand, VMD only enables a PCI domain and the PCI bridges behind it.

NVMe works as a regular NVMe under those bridges. No magic remapping happens here.

Kai-Heng


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-25  6:39   ` Kai Heng Feng
@ 2020-08-25  6:56     ` Christoph Hellwig
  2020-08-26  5:53       ` Kai-Heng Feng
  2020-09-02 19:48       ` David Fugate
  0 siblings, 2 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-08-25  6:56 UTC (permalink / raw)
  To: Kai Heng Feng
  Cc: Christoph Hellwig, Bjorn Helgaas, jonathan.derrick,
	Mario.Limonciello, Heiner Kallweit, Mika Westerberg,
	Rafael J. Wysocki, Xiongfeng Wang, Krzysztof Wilczynski,
	open list:PCI SUBSYSTEM, open list, Dan Williams, Huffman, Amber

On Tue, Aug 25, 2020 at 02:39:55PM +0800, Kai Heng Feng wrote:
> Hi Christoph,
> 
> > On Aug 25, 2020, at 2:23 PM, Christoph Hellwig <hch@infradead.org> wrote:
> > 
> > On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> >> New Intel laptops with VMD cannot reach deeper power saving state,
> >> renders very short battery time.
> > 
> > So what about just disabling VMD given how bloody pointless it is?
> > Hasn't anyone learned from the AHCI remapping debacle?
> > 
> > I'm really pissed at all this pointless crap intel comes up with just
> > to make life hard for absolutely no gain.  Is it so hard to just leave
> > a NVMe device as a standard NVMe device instead of f*^&ing everything
> > up in the chipset to make OS support a pain and I/O slower than by
> > doing nothing?
> 
> From what I can see from the hardwares at my hand, VMD only enables a PCI domain and PCI bridges behind it.
> 
> NVMe works as a regular NVMe under those bridges. No magic remapping happens here.

It definitely is less bad than the AHCI remapping, that is for sure.

But it still requires:

 - a new OS driver just to make the PCIe devices show up
 - indirections in the irq handling
 - indirections in the DMA handling
 - hacks for ASPM
 - hacks for X (there were a few more)

while adding absolutely no value.  Basically we have to add a large
chunk of kernel code just to undo silicon/firmware Intel added to their
platform to make things complicated.  I mean it is their platform and if
they want a "make things complicated" option that is fine, but it should
not be on by default.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-25  6:56     ` Christoph Hellwig
@ 2020-08-26  5:53       ` Kai-Heng Feng
  2020-09-02 19:48       ` David Fugate
  1 sibling, 0 replies; 25+ messages in thread
From: Kai-Heng Feng @ 2020-08-26  5:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Bjorn Helgaas, Derrick, Jonathan, Mario.Limonciello,
	Heiner Kallweit, Mika Westerberg, Rafael J. Wysocki,
	Xiongfeng Wang, Krzysztof Wilczynski, open list:PCI SUBSYSTEM,
	open list, Dan Williams, Huffman, Amber



> On Aug 25, 2020, at 14:56, Christoph Hellwig <hch@infradead.org> wrote:
> 
> On Tue, Aug 25, 2020 at 02:39:55PM +0800, Kai Heng Feng wrote:
>> Hi Christoph,
>> 
>>> On Aug 25, 2020, at 2:23 PM, Christoph Hellwig <hch@infradead.org> wrote:
>>> 
>>> On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
>>>> New Intel laptops with VMD cannot reach deeper power saving state,
>>>> renders very short battery time.
>>> 
>>> So what about just disabling VMD given how bloody pointless it is?
>>> Hasn't anyone learned from the AHCI remapping debacle?
>>> 
>>> I'm really pissed at all this pointless crap intel comes up with just
>>> to make life hard for absolutely no gain.  Is it so hard to just leave
>>> a NVMe device as a standard NVMe device instead of f*^&ing everything
>>> up in the chipset to make OS support a pain and I/O slower than by
>>> doing nothing?
>> 
>> From what I can see from the hardwares at my hand, VMD only enables a PCI domain and PCI bridges behind it.
>> 
>> NVMe works as a regular NVMe under those bridges. No magic remapping happens here.
> 
> It definitively is less bad than the AHCI remapping, that is for sure.
> 
> But it still requires:
> 
> - a new OS driver just to mak the PCIe device show up
> - indirections in the irq handling
> - indirections in the DMA handling
> - hacks for ASPSM
> - hacks for X (there were a few more)
> 
> while adding absolutely no value.  Basically we have to add a large
> chunk of kernel code just to undo silicone/firmware Intel added to their
> platform to make things complicated.  I mean it is their platform and if
> they want a "make things complicated" option that is fine, but it should
> not be on by default.

Yes, I do want it to be a regular PCIe bridge... but it's not the reality here.
Almost all next-gen Intel laptops will have VMD enabled, so users are forced to have it.
I would really like to have this patch upstream instead of carrying it as a downstream distro-only patch.

Kai-Heng



* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-25  6:23 ` Christoph Hellwig
  2020-08-25  6:39   ` Kai Heng Feng
@ 2020-08-26 21:43   ` Derrick, Jonathan
  2020-08-27  6:34     ` hch
  1 sibling, 1 reply; 25+ messages in thread
From: Derrick, Jonathan @ 2020-08-26 21:43 UTC (permalink / raw)
  To: kai.heng.feng, hch
  Cc: wangxiongfeng2, kw, hkallweit1, linux-kernel, mika.westerberg,
	Mario.Limonciello, linux-pci, Williams, Dan J, bhelgaas, Huffman,
	Amber, Wysocki, Rafael J

On Tue, 2020-08-25 at 06:23 +0000, Christoph Hellwig wrote:
> On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> > New Intel laptops with VMD cannot reach deeper power saving state,
> > renders very short battery time.
> 
> So what about just disabling VMD given how bloody pointless it is?
> Hasn't anyone learned from the AHCI remapping debacle?
> 
> I'm really pissed at all this pointless crap intel comes up with just
> to make life hard for absolutely no gain.  Is it so hard to just leave
> a NVMe device as a standard NVMe device instead of f*^&ing everything
> up in the chipset to make OS support a pain and I/O slower than by
> doing nothing?


Feel free to review my set to disable the MSI remapping which will make
it perform as well as direct-attached:

https://patchwork.kernel.org/project/linux-pci/list/?series=325681


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-26 21:43   ` Derrick, Jonathan
@ 2020-08-27  6:34     ` hch
  2020-08-27 16:13       ` Derrick, Jonathan
  0 siblings, 1 reply; 25+ messages in thread
From: hch @ 2020-08-27  6:34 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: kai.heng.feng, hch, wangxiongfeng2, kw, hkallweit1, linux-kernel,
	mika.westerberg, Mario.Limonciello, linux-pci, Williams, Dan J,
	bhelgaas, Huffman, Amber, Wysocki, Rafael J

On Wed, Aug 26, 2020 at 09:43:27PM +0000, Derrick, Jonathan wrote:
> Feel free to review my set to disable the MSI remapping which will make
> it perform as well as direct-attached:
> 
> https://patchwork.kernel.org/project/linux-pci/list/?series=325681

So that then we have to deal with your schemes to make individual
device direct assignment work in a convoluted way?  Please just give us
a disable knob for VMD, which solves _all_ these problems without adding
any.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27  6:34     ` hch
@ 2020-08-27 16:13       ` Derrick, Jonathan
  2020-08-27 16:23         ` hch
  0 siblings, 1 reply; 25+ messages in thread
From: Derrick, Jonathan @ 2020-08-27 16:13 UTC (permalink / raw)
  To: hch
  Cc: wangxiongfeng2, kw, hkallweit1, kai.heng.feng, linux-kernel,
	mika.westerberg, Mario.Limonciello, linux-pci, Williams, Dan J,
	bhelgaas, Huffman, Amber, Wysocki, Rafael J

On Thu, 2020-08-27 at 06:34 +0000, hch@infradead.org wrote:
> On Wed, Aug 26, 2020 at 09:43:27PM +0000, Derrick, Jonathan wrote:
> > Feel free to review my set to disable the MSI remapping which will
> > make
> > it perform as well as direct-attached:
> > 
> > https://patchwork.kernel.org/project/linux-pci/list/?series=325681
> 
> So that then we have to deal with your schemes to make individual
> device direct assignment work in a convoluted way?

That's not the intent of that patchset -at all-. It was to address the
performance bottlenecks with VMD that you constantly complain about. 


> Please just give us
> a disable nob for VMD, which solves _all_ these problems without
> adding
> any.

I don't see the purpose of this line of discussion. VMD has been in the
kernel for 5 years. We are constantly working on better support.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27 16:13       ` Derrick, Jonathan
@ 2020-08-27 16:23         ` hch
  2020-08-27 16:45           ` Derrick, Jonathan
  2020-08-27 17:49           ` Limonciello, Mario
  0 siblings, 2 replies; 25+ messages in thread
From: hch @ 2020-08-27 16:23 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: hch, wangxiongfeng2, kw, hkallweit1, kai.heng.feng, linux-kernel,
	mika.westerberg, Mario.Limonciello, linux-pci, Williams, Dan J,
	bhelgaas, Huffman, Amber, Wysocki, Rafael J

On Thu, Aug 27, 2020 at 04:13:44PM +0000, Derrick, Jonathan wrote:
> On Thu, 2020-08-27 at 06:34 +0000, hch@infradead.org wrote:
> > On Wed, Aug 26, 2020 at 09:43:27PM +0000, Derrick, Jonathan wrote:
> > > Feel free to review my set to disable the MSI remapping which will
> > > make
> > > it perform as well as direct-attached:
> > > 
> > > https://patchwork.kernel.org/project/linux-pci/list/?series=325681
> > 
> > So that then we have to deal with your schemes to make individual
> > device direct assignment work in a convoluted way?
> 
> That's not the intent of that patchset -at all-. It was to address the
> performance bottlenecks with VMD that you constantly complain about. 

I know.  But once we fix that bottleneck we will have to fix the next issue,
and then tackle the one after that, while at the same time VMD brings zero
actual benefits.

> > Please just give us
> > a disable nob for VMD, which solves _all_ these problems without
> > adding
> > any.
> 
> I don't see the purpose of this line of discussion. VMD has been in the
> kernel for 5 years. We are constantly working on better support.

Please just work with the platform people to allow the host to disable
VMD.  That is the only really useful value add here.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27 16:23         ` hch
@ 2020-08-27 16:45           ` Derrick, Jonathan
  2020-08-27 16:50             ` hch
  2020-08-27 21:33             ` Dan Williams
  2020-08-27 17:49           ` Limonciello, Mario
  1 sibling, 2 replies; 25+ messages in thread
From: Derrick, Jonathan @ 2020-08-27 16:45 UTC (permalink / raw)
  To: hch
  Cc: wangxiongfeng2, kw, hkallweit1, kai.heng.feng, linux-kernel,
	mika.westerberg, Mario.Limonciello, linux-pci, Williams, Dan J,
	bhelgaas, Huffman, Amber, Wysocki, Rafael J

On Thu, 2020-08-27 at 17:23 +0100, hch@infradead.org wrote:
> On Thu, Aug 27, 2020 at 04:13:44PM +0000, Derrick, Jonathan wrote:
> > On Thu, 2020-08-27 at 06:34 +0000, hch@infradead.org wrote:
> > > On Wed, Aug 26, 2020 at 09:43:27PM +0000, Derrick, Jonathan wrote:
> > > > Feel free to review my set to disable the MSI remapping which will
> > > > make
> > > > it perform as well as direct-attached:
> > > > 
> > > > https://patchwork.kernel.org/project/linux-pci/list/?series=325681
> > > 
> > > So that then we have to deal with your schemes to make individual
> > > device direct assignment work in a convoluted way?
> > 
> > That's not the intent of that patchset -at all-. It was to address the
> > performance bottlenecks with VMD that you constantly complain about. 
> 
> I know.  But once we fix that bottleneck we fix the next issue,
> then to tackle the next.  While at the same time VMD brings zero
> actual benefits.
> 

Just a few benefits and there are other users with unique use cases:
1. Passthrough of the endpoint to OSes which don't natively support
hotplug can enable hotplug for that OS using the guest VMD driver
2. Some hypervisors have a limit on the number of devices that can be
passed through. VMD endpoint is a single device that expands to many.
3. Expansion of possible bus numbers beyond 256 by using other
segments.
4. Custom RAID LED patterns driven by ledctl

I'm not trying to market this. Just pointing out that this isn't
"bringing zero actual benefits" to many users.


> > > Please just give us
> > > a disable nob for VMD, which solves _all_ these problems without
> > > adding
> > > any.
> > 
> > I don't see the purpose of this line of discussion. VMD has been in the
> > kernel for 5 years. We are constantly working on better support.
> 
> Please just work with the platform people to allow the host to disable
> VMD.  That is the only really useful value add here.

Cheers


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27 16:45           ` Derrick, Jonathan
@ 2020-08-27 16:50             ` hch
  2020-08-27 21:33             ` Dan Williams
  1 sibling, 0 replies; 25+ messages in thread
From: hch @ 2020-08-27 16:50 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: hch, wangxiongfeng2, kw, hkallweit1, kai.heng.feng, linux-kernel,
	mika.westerberg, Mario.Limonciello, linux-pci, Williams, Dan J,
	bhelgaas, Huffman, Amber, Wysocki, Rafael J

On Thu, Aug 27, 2020 at 04:45:53PM +0000, Derrick, Jonathan wrote:
> Just a few benefits and there are other users with unique use cases:
> 1. Passthrough of the endpoint to OSes which don't natively support
> hotplug can enable hotplug for that OS using the guest VMD driver

Or they could just write a hotplug driver, which would be more useful
than writing a VMD driver.

> 2. Some hypervisors have a limit on the number of devices that can be
> passed through. VMD endpoint is a single device that expands to many.

Or you just fix the hypervisor.  Never mind that this is so much
less likely than wanting to pass an individual device or VF to a guest,
which VMD makes impossible (at least without tons of hacks specifically
for it).

> 3. Expansion of possible bus numbers beyond 256 by using other
> segments.

Which we can trivially do with PCI domains.

> 4. Custom RAID LED patterns driven by ledctl

Which you can also do by any other vendor specific way.

> 
> I'm not trying to market this. Just pointing out that this isn't
> "bringing zero actual benefits" to many users.

Which of those are a benefit to a Linux user?  Seriously, I really don't
care if Intel wants to sell VMD as a value add to those that have
a perceived or in rare cases even real need.  Just let Linux opt out
of it instead of needing special quirks all over.


* RE: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27 16:23         ` hch
  2020-08-27 16:45           ` Derrick, Jonathan
@ 2020-08-27 17:49           ` Limonciello, Mario
  2020-08-29  7:24             ` hch
  1 sibling, 1 reply; 25+ messages in thread
From: Limonciello, Mario @ 2020-08-27 17:49 UTC (permalink / raw)
  To: hch, Derrick, Jonathan
  Cc: wangxiongfeng2, kw, hkallweit1, kai.heng.feng, linux-kernel,
	mika.westerberg, linux-pci, Williams, Dan J, bhelgaas, Huffman,
	Amber, Wysocki, Rafael J

> > I don't see the purpose of this line of discussion. VMD has been in the
> > kernel for 5 years. We are constantly working on better support.
> 
> Please just work with the platform people to allow the host to disable
> VMD.  That is the only really useful value add here.

Can you further elaborate what exactly you're wanting here?  VMD enable/disable
is something that is configured in firmware setup as the firmware does the early
configuration for the silicon related to it.  So it's up to the OEM whether to
offer the knob to an end user.

At least for Dell this setting is also exported to sysfs and can be turned on/off
around a reboot cycle via this: https://patchwork.kernel.org/patch/11693231/.

As was mentioned earlier in this thread, VMD is likely to default to "on"
for many machines with the upcoming silicon.  Making it work well on Linux is
preferable to again having to change firmware settings between operating systems,
like the NVMe remapping thing from earlier silicon required.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27 16:45           ` Derrick, Jonathan
  2020-08-27 16:50             ` hch
@ 2020-08-27 21:33             ` Dan Williams
  2020-08-29  7:23               ` hch
  1 sibling, 1 reply; 25+ messages in thread
From: Dan Williams @ 2020-08-27 21:33 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: hch, wangxiongfeng2, kw, hkallweit1, kai.heng.feng, linux-kernel,
	mika.westerberg, Mario.Limonciello, linux-pci, bhelgaas, Huffman,
	Amber, Wysocki, Rafael J

On Thu, Aug 27, 2020 at 9:46 AM Derrick, Jonathan
<jonathan.derrick@intel.com> wrote:
>
> On Thu, 2020-08-27 at 17:23 +0100, hch@infradead.org wrote:
> > On Thu, Aug 27, 2020 at 04:13:44PM +0000, Derrick, Jonathan wrote:
> > > On Thu, 2020-08-27 at 06:34 +0000, hch@infradead.org wrote:
> > > > On Wed, Aug 26, 2020 at 09:43:27PM +0000, Derrick, Jonathan wrote:
> > > > > Feel free to review my set to disable the MSI remapping which will
> > > > > make
> > > > > it perform as well as direct-attached:
> > > > >
> > > > > https://patchwork.kernel.org/project/linux-pci/list/?series=325681
> > > >
> > > > So that then we have to deal with your schemes to make individual
> > > > device direct assignment work in a convoluted way?
> > >
> > > That's not the intent of that patchset -at all-. It was to address the
> > > performance bottlenecks with VMD that you constantly complain about.
> >
> > I know.  But once we fix that bottleneck we fix the next issue,
> > then to tackle the next.  While at the same time VMD brings zero
> > actual benefits.
> >
>
> Just a few benefits and there are other users with unique use cases:
> 1. Passthrough of the endpoint to OSes which don't natively support
> hotplug can enable hotplug for that OS using the guest VMD driver
> 2. Some hypervisors have a limit on the number of devices that can be
> passed through. VMD endpoint is a single device that expands to many.
> 3. Expansion of possible bus numbers beyond 256 by using other
> segments.
> 4. Custom RAID LED patterns driven by ledctl
>
> I'm not trying to market this. Just pointing out that this isn't
> "bringing zero actual benefits" to many users.
>

The initial intent of the VMD driver was to allow Linux to find and
initialize devices behind a VMD configuration where VMD was required
for a non-Linux OS. For Linux, if full native PCI-E is an available
configuration option I think it makes sense to recommend Linux users
to flip that knob rather than continue to wrestle with the caveats of
the VMD driver. Where that knob isn't possible / available VMD can be
a fallback, but full native PCI-E is what Linux wants in the end.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27 21:33             ` Dan Williams
@ 2020-08-29  7:23               ` hch
  0 siblings, 0 replies; 25+ messages in thread
From: hch @ 2020-08-29  7:23 UTC (permalink / raw)
  To: Dan Williams
  Cc: Derrick, Jonathan, hch, wangxiongfeng2, kw, hkallweit1,
	kai.heng.feng, linux-kernel, mika.westerberg, Mario.Limonciello,
	linux-pci, bhelgaas, Huffman, Amber, Wysocki, Rafael J

On Thu, Aug 27, 2020 at 02:33:56PM -0700, Dan Williams wrote:
> > Just a few benefits and there are other users with unique use cases:
> > 1. Passthrough of the endpoint to OSes which don't natively support
> > hotplug can enable hotplug for that OS using the guest VMD driver
> > 2. Some hypervisors have a limit on the number of devices that can be
> > passed through. VMD endpoint is a single device that expands to many.
> > 3. Expansion of possible bus numbers beyond 256 by using other
> > segments.
> > 4. Custom RAID LED patterns driven by ledctl
> >
> > I'm not trying to market this. Just pointing out that this isn't
> > "bringing zero actual benefits" to many users.
> >
> 
> The initial intent of the VMD driver was to allow Linux to find and
> initialize devices behind a VMD configuration where VMD was required
> for a non-Linux OS. For Linux, if full native PCI-E is an available
> configuration option I think it makes sense to recommend Linux users
> to flip that knob rather than continue to wrestle with the caveats of
> the VMD driver. Where that knob isn't possible / available VMD can be
> a fallback, but full native PCI-E is what Linux wants in the end.

Agreed.  And the thing we need for this to really work is to turn VMD
off without a reboot when we find it.  That would solve all the problems
we have, and also allow an amazing kernel hacker like Derrick to do more
productive work than coming up with one VMD workaround after another.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-27 17:49           ` Limonciello, Mario
@ 2020-08-29  7:24             ` hch
  0 siblings, 0 replies; 25+ messages in thread
From: hch @ 2020-08-29  7:24 UTC (permalink / raw)
  To: Limonciello, Mario
  Cc: hch, Derrick, Jonathan, wangxiongfeng2, kw, hkallweit1,
	kai.heng.feng, linux-kernel, mika.westerberg, linux-pci,
	Williams, Dan J, bhelgaas, Huffman, Amber, Wysocki, Rafael J

On Thu, Aug 27, 2020 at 05:49:40PM +0000, Limonciello, Mario wrote:
> Can you further elaborate what exactly you're wanting here?  VMD enable/disable
> is something that is configured in firmware setup as the firmware does the early
> configuration for the silicon related to it.  So it's up to the OEM whether to
> offer the knob to an end user.
> 
> At least for Dell this setting also does export to sysfs and can be turned on/off
> around a reboot cycle via this: https://patchwork.kernel.org/patch/11693231/.
> 
> As was mentioned earlier in this thread VMD is likely to be defaulting to "on"
> for many machines with the upcoming silicon.  Making it work well on Linux is
> preferable to again having to change firmware settings between operating systems
> like the NVME remapping thing from earlier silicon required.

And the right answer is to turn it off, but we really need to do that
at runtime, and not over a reboot cycle.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-25  6:56     ` Christoph Hellwig
  2020-08-26  5:53       ` Kai-Heng Feng
@ 2020-09-02 19:48       ` David Fugate
  2020-09-02 22:54         ` Keith Busch
  1 sibling, 1 reply; 25+ messages in thread
From: David Fugate @ 2020-09-02 19:48 UTC (permalink / raw)
  To: Christoph Hellwig, Kai Heng Feng
  Cc: Bjorn Helgaas, jonathan.derrick, Mario.Limonciello,
	Heiner Kallweit, Mika Westerberg, Rafael J. Wysocki,
	Xiongfeng Wang, Krzysztof Wilczynski, open list:PCI SUBSYSTEM,
	open list, Dan Williams, Huffman, Amber, david.fugate

On Tue, 2020-08-25 at 07:56 +0100, Christoph Hellwig wrote:
> while adding absolutely no value.  Basically we have to add a large
> chunk of kernel code just to undo silicone/firmware Intel added to
> their
> platform to make things complicated.  I mean it is their platform and
> if
> they want a "make things complicated" option that is fine, but it
> should
> not be on by default.


Thanks for your feedback.

Over the years, I've been forwarded numerous emails from VMD customers 
praising its ability to prevent Linux kernel panics upon hot-removals
and inserts of U.2 NVMe drives. Many were migrating from SATA drives,
which didn't have this issue, and considered it a showstopper to NVMe
adoption. I hope we can all agree reliable and robust hot-plug support
adds value.



* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-09-02 19:48       ` David Fugate
@ 2020-09-02 22:54         ` Keith Busch
  0 siblings, 0 replies; 25+ messages in thread
From: Keith Busch @ 2020-09-02 22:54 UTC (permalink / raw)
  To: David Fugate
  Cc: Christoph Hellwig, Kai Heng Feng, Bjorn Helgaas,
	jonathan.derrick, Mario.Limonciello, Heiner Kallweit,
	Mika Westerberg, Rafael J. Wysocki, Xiongfeng Wang,
	Krzysztof Wilczynski, open list:PCI SUBSYSTEM, open list,
	Dan Williams, Huffman, Amber, david.fugate

On Wed, Sep 02, 2020 at 01:48:19PM -0600, David Fugate wrote:
> Over the years, I've been forwarded numerous emails from VMD customers 
> praising it's ability to prevent Linux kernel panics upon hot-removals
> and inserts of U.2 NVMe drives.

The same nvme and pcie hotplug drivers are used with or without VMD
enabled. Where are the bug reports for linux kernel panics? We have a
very real interest in fixing such issues.


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-08-21 12:32 [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain Kai-Heng Feng
  2020-08-24 13:04 ` Mika Westerberg
  2020-08-25  6:23 ` Christoph Hellwig
@ 2020-09-10  1:55 ` Bjorn Helgaas
  2020-09-10 16:33   ` Derrick, Jonathan
  2 siblings, 1 reply; 25+ messages in thread
From: Bjorn Helgaas @ 2020-09-10  1:55 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: bhelgaas, jonathan.derrick, Mario.Limonciello, Heiner Kallweit,
	Mika Westerberg, Rafael J. Wysocki, Xiongfeng Wang,
	Krzysztof Wilczynski, linux-pci, linux-kernel,
	Saheed O. Bolarinwa

[+cc Saheed]

On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> New Intel laptops with VMD cannot reach deeper power saving state,
> renders very short battery time.
> 
> As BIOS may not be able to program the config space for devices under
> VMD domain, ASPM needs to be programmed manually by software. This is
> also the case under Windows.
> 
> The VMD controller itself is a root complex integrated endpoint that
> doesn't have ASPM capability, so we can't propagate the ASPM settings to
> devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
> VMD domain, unsupported states will be cleared out anyway.
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/pci/pcie/aspm.c |  3 ++-
>  drivers/pci/quirks.c    | 11 +++++++++++
>  include/linux/pci.h     |  2 ++
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 253c30cc1967..dcc002dbca19 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>  		aspm_calc_l1ss_info(link, &upreg, &dwreg);
>  
>  	/* Save default state */
> -	link->aspm_default = link->aspm_enabled;
> +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
> +			     ASPM_STATE_ALL : link->aspm_enabled;

This function is ridiculously complicated already, and I really don't
want to make it worse.

What exactly is the PCIe topology here?  Apparently the VMD controller
is a Root Complex Integrated Endpoint, so it's a Type 0 (non-bridge)
device.  And it has no Link, hence no Link Capabilities or Control and
hence no ASPM-related bits.  Right?

And the devices under the VMD controller?  I guess they are regular
PCIe Endpoints, Switch Ports, etc?  Obviously there's a Link involved
somewhere.  Does the VMD controller have some magic, non-architected
Port on the downstream side?

Does this patch enable ASPM on this magic Link between VMD and the
next device?  Configuring ASPM correctly requires knowledge and knobs
from both ends of the Link, and apparently we don't have those for the
VMD end.

Or is it for Links deeper in the hierarchy?  I assume those should
just work already, although there might be issues with latency
computation, etc., because we may not be able to account for the part
of the path above VMD.

I want aspm.c to eventually get out of the business of managing struct
pcie_link_state.  I think it should manage *device* state for each end
of the link.  Maybe that's a path forward, e.g., if we cache the Link
Capabilities during enumeration, quirks could modify that directly,
and aspm.c could just consume that cached information.  I think Saheed
(cc'd) is already working on patches in this direction.

I'm still not sure how this works if VMD is the upstream end of a
Link, though.
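
As a rough illustration of that direction (not an actual patch): enumeration
could read Link Capabilities once and stash it in struct pci_dev, a quirk
could patch the cached value, and aspm.c would consume it.  The "lnkcap"
field below is an assumption, not existing code:

/*
 * Rough sketch only.  Assumes struct pci_dev grows a cached Link
 * Capabilities field ("lnkcap" here, purely for illustration): read once
 * at enumeration time, optionally adjusted by quirks, consumed by aspm.c.
 */
static void pci_cache_link_caps(struct pci_dev *dev)
{
	u32 lnkcap = 0;

	if (!pci_is_pcie(dev))
		return;

	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
	dev->lnkcap = lnkcap;			/* hypothetical cached copy */
}

/* A quirk could then adjust the cached value instead of special-casing aspm.c: */
static void quirk_vmd_root_port_aspm(struct pci_dev *dev)
{
	dev->lnkcap |= PCI_EXP_LNKCAP_ASPMS;	/* claim L0s/L1 support */
}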

>  	/* Setup initial capable state. Will be updated later */
>  	link->aspm_capable = link->aspm_support;
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index bdf9b52567e0..2e2f525bd892 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5632,3 +5632,14 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
>  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> +
> +/*
> + * Device [8086:9a09]
> + * BIOS may not be able to access config space of devices under VMD domain, so
> + * it relies on software to enable ASPM for links under VMD.
> + */
> +static void pci_fixup_enable_aspm(struct pci_dev *pdev)
> +{
> +	pdev->dev_flags |= PCI_DEV_FLAGS_ENABLE_ASPM;
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a09, pci_fixup_enable_aspm);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 835530605c0d..66a45916c7c6 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -227,6 +227,8 @@ enum pci_dev_flags {
>  	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
>  	/* Don't use Relaxed Ordering for TLPs directed at this device */
>  	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> +	/* Enable ASPM regardless of how LnkCtl is programmed */
> +	PCI_DEV_FLAGS_ENABLE_ASPM = (__force pci_dev_flags_t) (1 << 12),
>  };
>  
>  enum pci_irq_reroute_variant {
> -- 
> 2.17.1
> 


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-09-10  1:55 ` Bjorn Helgaas
@ 2020-09-10 16:33   ` Derrick, Jonathan
  2020-09-10 17:38     ` Bjorn Helgaas
  0 siblings, 1 reply; 25+ messages in thread
From: Derrick, Jonathan @ 2020-09-10 16:33 UTC (permalink / raw)
  To: kai.heng.feng, helgaas
  Cc: wangxiongfeng2, kw, hkallweit1, refactormyself, linux-kernel,
	mika.westerberg, Mario.Limonciello, linux-pci, bhelgaas, Wysocki,
	Rafael J

Hi Bjorn

On Wed, 2020-09-09 at 20:55 -0500, Bjorn Helgaas wrote:
> [+cc Saheed]
> 
> On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> > New Intel laptops with VMD cannot reach deeper power saving state,
> > renders very short battery time.
> > 
> > As BIOS may not be able to program the config space for devices under
> > VMD domain, ASPM needs to be programmed manually by software. This is
> > also the case under Windows.
> > 
> > The VMD controller itself is a root complex integrated endpoint that
> > doesn't have ASPM capability, so we can't propagate the ASPM settings to
> > devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
> > VMD domain, unsupported states will be cleared out anyway.
> > 
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> >  drivers/pci/pcie/aspm.c |  3 ++-
> >  drivers/pci/quirks.c    | 11 +++++++++++
> >  include/linux/pci.h     |  2 ++
> >  3 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 253c30cc1967..dcc002dbca19 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> >  		aspm_calc_l1ss_info(link, &upreg, &dwreg);
> >  
> >  	/* Save default state */
> > -	link->aspm_default = link->aspm_enabled;
> > +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
> > +			     ASPM_STATE_ALL : link->aspm_enabled;
> 
> This function is ridiculously complicated already, and I really don't
> want to make it worse.
> 
> What exactly is the PCIe topology here?  Apparently the VMD controller
> is a Root Complex Integrated Endpoint, so it's a Type 0 (non-bridge)
> device.  And it has no Link, hence no Link Capabilities or Control and
> hence no ASPM-related bits.  Right?
That's correct. VMD is the Type 0 device providing config/mmio
apertures to another segment and MSI/X remapping. No link and no ASPM
related bits.

Hierarchy is usually something like:

Segment 0           | VMD segment
Root Complex -> VMD | Type 0 (RP/Bridge; physical slot) - Type 1
                    | Type 0 (RP/Bridge; physical slot) - Type 1

> 
> And the devices under the VMD controller?  I guess they are regular
> PCIe Endpoints, Switch Ports, etc?  Obviously there's a Link involved
> somewhere.  Does the VMD controller have some magic, non-architected
> Port on the downstream side?
Correct: Type 0 and Type 1 devices, and any number of Switch ports as
it's usually pinned out to physical slot.


> 
> Does this patch enable ASPM on this magic Link between VMD and the
> next device?  Configuring ASPM correctly requires knowledge and knobs
> from both ends of the Link, and apparently we don't have those for the
> VMD end.
VMD itself doesn't have the link to its domain. It's really just the
config/mmio aperture and MSI/X remapper. The PCIe link is between the
Type 0 and Type 1 devices on the VMD domain. So fortunately the VMD
itself is not the upstream part of the link.

> 
> Or is it for Links deeper in the hierarchy?  I assume those should
> just work already, although there might be issues with latency
> computation, etc., because we may not be able to account for the part
> of the path above VMD.
That's correct. This is for the links within the domain itself, such as
between a type 0 and NVMe device.

> 
> I want aspm.c to eventually get out of the business of managing struct
> pcie_link_state.  I think it should manage *device* state for each end
> of the link.  Maybe that's a path forward, e.g., if we cache the Link
> Capabilities during enumeration, quirks could modify that directly,
> and aspm.c could just consume that cached information.  I think Saheed
> (cc'd) is already working on patches in this direction.
> 
> I'm still not sure how this works if VMD is the upstream end of a
> Link, though.
> 
> >  	/* Setup initial capable state. Will be updated later */
> >  	link->aspm_capable = link->aspm_support;
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index bdf9b52567e0..2e2f525bd892 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -5632,3 +5632,14 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
> >  }
> >  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
> >  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> > +
> > +/*
> > + * Device [8086:9a09]
> > + * BIOS may not be able to access config space of devices under VMD domain, so
> > + * it relies on software to enable ASPM for links under VMD.
> > + */
> > +static void pci_fixup_enable_aspm(struct pci_dev *pdev)
> > +{
> > +	pdev->dev_flags |= PCI_DEV_FLAGS_ENABLE_ASPM;
> > +}
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a09, pci_fixup_enable_aspm);
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 835530605c0d..66a45916c7c6 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -227,6 +227,8 @@ enum pci_dev_flags {
> >  	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> >  	/* Don't use Relaxed Ordering for TLPs directed at this device */
> >  	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> > +	/* Enable ASPM regardless of how LnkCtl is programmed */
> > +	PCI_DEV_FLAGS_ENABLE_ASPM = (__force pci_dev_flags_t) (1 << 12),
> >  };
> >  
> >  enum pci_irq_reroute_variant {
> > -- 
> > 2.17.1
> > 


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-09-10 16:33   ` Derrick, Jonathan
@ 2020-09-10 17:38     ` Bjorn Helgaas
  0 siblings, 0 replies; 25+ messages in thread
From: Bjorn Helgaas @ 2020-09-10 17:38 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: kai.heng.feng, wangxiongfeng2, kw, hkallweit1, refactormyself,
	linux-kernel, mika.westerberg, Mario.Limonciello, linux-pci,
	bhelgaas, Wysocki, Rafael J

On Thu, Sep 10, 2020 at 04:33:39PM +0000, Derrick, Jonathan wrote:
> On Wed, 2020-09-09 at 20:55 -0500, Bjorn Helgaas wrote:
> > On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> > > New Intel laptops with VMD cannot reach deeper power saving state,
> > > renders very short battery time.
> > > 
> > > As BIOS may not be able to program the config space for devices under
> > > VMD domain, ASPM needs to be programmed manually by software. This is
> > > also the case under Windows.
> > > 
> > > The VMD controller itself is a root complex integrated endpoint that
> > > doesn't have ASPM capability, so we can't propagate the ASPM settings to
> > > devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
> > > VMD domain, unsupported states will be cleared out anyway.
> > > 
> > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > ---
> > >  drivers/pci/pcie/aspm.c |  3 ++-
> > >  drivers/pci/quirks.c    | 11 +++++++++++
> > >  include/linux/pci.h     |  2 ++
> > >  3 files changed, 15 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index 253c30cc1967..dcc002dbca19 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > >  		aspm_calc_l1ss_info(link, &upreg, &dwreg);
> > >  
> > >  	/* Save default state */
> > > -	link->aspm_default = link->aspm_enabled;
> > > +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
> > > +			     ASPM_STATE_ALL : link->aspm_enabled;
> > 
> > This function is ridiculously complicated already, and I really don't
> > want to make it worse.
> > 
> > What exactly is the PCIe topology here?  Apparently the VMD controller
> > is a Root Complex Integrated Endpoint, so it's a Type 0 (non-bridge)
> > device.  And it has no Link, hence no Link Capabilities or Control and
> > hence no ASPM-related bits.  Right?
>
> That's correct. VMD is the Type 0 device providing config/mmio
> apertures to another segment and MSI/X remapping. No link and no ASPM
> related bits.
> 
> Hierarchy is usually something like:
> 
> Segment 0           | VMD segment
> Root Complex -> VMD | Type 0 (RP/Bridge; physical slot) - Type 1
>                     | Type 0 (RP/Bridge; physical slot) - Type 1
> 
> > 
> > And the devices under the VMD controller?  I guess they are regular
> > PCIe Endpoints, Switch Ports, etc?  Obviously there's a Link involved
> > somewhere.  Does the VMD controller have some magic, non-architected
> > Port on the downstream side?
>
> Correct: Type 0 and Type 1 devices, and any number of Switch ports as
> it's usually pinned out to physical slot.
> 
> > Does this patch enable ASPM on this magic Link between VMD and the
> > next device?  Configuring ASPM correctly requires knowledge and knobs
> > from both ends of the Link, and apparently we don't have those for the
> > VMD end.
>
> VMD itself doesn't have the link to it's domain. It's really just the
> config/mmio aperture and MSI/X remapper. The PCIe link is between the
> Type 0 and Type 1 devices on the VMD domain. So fortunately the VMD
> itself is not the upstream part of the link.
> 
> > Or is it for Links deeper in the hierarchy?  I assume those should
> > just work already, although there might be issues with latency
> > computation, etc., because we may not be able to account for the part
> > of the path above VMD.
>
> That's correct. This is for the links within the domain itself, such as
> between a type 0 and NVMe device.

OK, great.  So IIUC, below the VMD, there is a Root Port, and the Root
Port has a link to some Endpoint or Switch, e.g., an NVMe device.  And
we just want to enable ASPM on that link.

That should not be a special case; we should be able to make this so
it Just Works.  Based on this patch, I guess the reason it doesn't
work is because link->aspm_enabled for that link isn't set correctly.

So is this just a consequence of us depending on the initial Link
Control value from BIOS?  That seems like something we shouldn't
really depend on.
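
For reference, the distinction in play is between Link Capabilities (what the
link supports) and the firmware-programmed Link Control (what happens to be
enabled at boot).  A minimal illustration, not part of the patch:

/*
 * ASPM *support* is advertised in Link Capabilities, while the
 * BIOS-programmed *enable* bits live in Link Control.  Relying only on
 * the latter means trusting whatever firmware left behind.
 */
static void read_aspm_bits(struct pci_dev *dev, u32 *supported, u16 *enabled)
{
	u32 lnkcap = 0;
	u16 lnkctl = 0;

	pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
	pcie_capability_read_word(dev, PCI_EXP_LNKCTL, &lnkctl);

	*supported = lnkcap & PCI_EXP_LNKCAP_ASPMS;	/* L0s/L1 capability */
	*enabled   = lnkctl & PCI_EXP_LNKCTL_ASPMC;	/* what BIOS enabled */
}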

> > I want aspm.c to eventually get out of the business of managing struct
> > pcie_link_state.  I think it should manage *device* state for each end
> > of the link.  Maybe that's a path forward, e.g., if we cache the Link
> > Capabilities during enumeration, quirks could modify that directly,
> > and aspm.c could just consume that cached information.  I think Saheed
> > (cc'd) is already working on patches in this direction.
> > 
> > I'm still not sure how this works if VMD is the upstream end of a
> > Link, though.
> > 
> > >  	/* Setup initial capable state. Will be updated later */
> > >  	link->aspm_capable = link->aspm_support;
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index bdf9b52567e0..2e2f525bd892 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -5632,3 +5632,14 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
> > >  }
> > >  DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
> > >  			       PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
> > > +
> > > +/*
> > > + * Device [8086:9a09]
> > > + * BIOS may not be able to access config space of devices under VMD domain, so
> > > + * it relies on software to enable ASPM for links under VMD.
> > > + */
> > > +static void pci_fixup_enable_aspm(struct pci_dev *pdev)
> > > +{
> > > +	pdev->dev_flags |= PCI_DEV_FLAGS_ENABLE_ASPM;
> > > +}
> > > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a09, pci_fixup_enable_aspm);
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 835530605c0d..66a45916c7c6 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -227,6 +227,8 @@ enum pci_dev_flags {
> > >  	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> > >  	/* Don't use Relaxed Ordering for TLPs directed at this device */
> > >  	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
> > > +	/* Enable ASPM regardless of how LnkCtl is programmed */
> > > +	PCI_DEV_FLAGS_ENABLE_ASPM = (__force pci_dev_flags_t) (1 << 12),
> > >  };
> > >  
> > >  enum pci_irq_reroute_variant {
> > > -- 
> > > 2.17.1
> > > 


* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-09-17 17:20     ` Bjorn Helgaas
@ 2020-09-23 14:29       ` Kai-Heng Feng
  0 siblings, 0 replies; 25+ messages in thread
From: Kai-Heng Feng @ 2020-09-23 14:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Derrick, Jonathan, wangxiongfeng2, kw, hkallweit1,
	refactormyself, linux-kernel, mika.westerberg, Mario.Limonciello,
	linux-pci, bhelgaas, Wysocki, Rafael J



> On Sep 18, 2020, at 01:20, Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> On Thu, Sep 10, 2020 at 07:51:05PM +0000, Derrick, Jonathan wrote:
>> On Thu, 2020-09-10 at 14:17 -0500, Bjorn Helgaas wrote:
>>> On Thu, Sep 10, 2020 at 06:52:48PM +0000, Derrick, Jonathan wrote:
>>>> On Thu, 2020-09-10 at 12:38 -0500, Bjorn Helgaas wrote:
>>>>> On Thu, Sep 10, 2020 at 04:33:39PM +0000, Derrick, Jonathan wrote:
>>>>>> On Wed, 2020-09-09 at 20:55 -0500, Bjorn Helgaas wrote:
>>>>>>> On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
>>>>>>>> New Intel laptops with VMD cannot reach deeper power saving state,
>>>>>>>> renders very short battery time.
>>>>>>>> 
>>>>>>>> As BIOS may not be able to program the config space for devices under
>>>>>>>> VMD domain, ASPM needs to be programmed manually by software. This is
>>>>>>>> also the case under Windows.
>>>>>>>> 
>>>>>>>> The VMD controller itself is a root complex integrated endpoint that
>>>>>>>> doesn't have ASPM capability, so we can't propagate the ASPM settings to
>>>>>>>> devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
>>>>>>>> VMD domain, unsupported states will be cleared out anyway.
>>>>>>>> 
>>>>>>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>>>>>>> ---
>>>>>>>> drivers/pci/pcie/aspm.c |  3 ++-
>>>>>>>> drivers/pci/quirks.c    | 11 +++++++++++
>>>>>>>> include/linux/pci.h     |  2 ++
>>>>>>>> 3 files changed, 15 insertions(+), 1 deletion(-)
>>>>>>>> 
>>>>>>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>>>>>>> index 253c30cc1967..dcc002dbca19 100644
>>>>>>>> --- a/drivers/pci/pcie/aspm.c
>>>>>>>> +++ b/drivers/pci/pcie/aspm.c
>>>>>>>> @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
>>>>>>>> 		aspm_calc_l1ss_info(link, &upreg, &dwreg);
>>>>>>>> 
>>>>>>>> 	/* Save default state */
>>>>>>>> -	link->aspm_default = link->aspm_enabled;
>>>>>>>> +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
>>>>>>>> +			     ASPM_STATE_ALL : link->aspm_enabled;
>>>>>>> 
>>>>>>> This function is ridiculously complicated already, and I really don't
>>>>>>> want to make it worse.
>>>>>>> 
>>>>>>> What exactly is the PCIe topology here?  Apparently the VMD controller
>>>>>>> is a Root Complex Integrated Endpoint, so it's a Type 0 (non-bridge)
>>>>>>> device.  And it has no Link, hence no Link Capabilities or Control and
>>>>>>> hence no ASPM-related bits.  Right?
>>>>>> 
>>>>>> That's correct. VMD is the Type 0 device providing config/mmio
>>>>>> apertures to another segment and MSI/X remapping. No link and no ASPM
>>>>>> related bits.
>>>>>> 
>>>>>> Hierarchy is usually something like:
>>>>>> 
>>>>>> Segment 0           | VMD segment
>>>>>> Root Complex -> VMD | Type 0 (RP/Bridge; physical slot) - Type 1
>>>>>>                    | Type 0 (RP/Bridge; physical slot) - Type 1
>>>>>> 
>>>>>>> And the devices under the VMD controller?  I guess they are regular
>>>>>>> PCIe Endpoints, Switch Ports, etc?  Obviously there's a Link involved
>>>>>>> somewhere.  Does the VMD controller have some magic, non-architected
>>>>>>> Port on the downstream side?
>>>>>> 
>>>>>> Correct: Type 0 and Type 1 devices, and any number of Switch ports as
>>>>>> it's usually pinned out to physical slot.
>>>>>> 
>>>>>>> Does this patch enable ASPM on this magic Link between VMD and the
>>>>>>> next device?  Configuring ASPM correctly requires knowledge and knobs
>>>>>>> from both ends of the Link, and apparently we don't have those for the
>>>>>>> VMD end.
>>>>>> 
>>>>>> VMD itself doesn't have the link to it's domain. It's really just the
>>>>>> config/mmio aperture and MSI/X remapper. The PCIe link is between the
>>>>>> Type 0 and Type 1 devices on the VMD domain. So fortunately the VMD
>>>>>> itself is not the upstream part of the link.
>>>>>> 
>>>>>>> Or is it for Links deeper in the hierarchy?  I assume those should
>>>>>>> just work already, although there might be issues with latency
>>>>>>> computation, etc., because we may not be able to account for the part
>>>>>>> of the path above VMD.
>>>>>> 
>>>>>> That's correct. This is for the links within the domain itself, such as
>>>>>> between a type 0 and NVMe device.
>>>>> 
>>>>> OK, great.  So IIUC, below the VMD, there is a Root Port, and the Root
>>>>> Port has a link to some Endpoint or Switch, e.g., an NVMe device.  And
>>>>> we just want to enable ASPM on that link.
>>>>> 
>>>>> That should not be a special case; we should be able to make this so
>>>>> it Just Works.  Based on this patch, I guess the reason it doesn't
>>>>> work is because link->aspm_enabled for that link isn't set correctly.
>>>>> 
>>>>> So is this just a consequence of us depending on the initial Link
>>>>> Control value from BIOS?  That seems like something we shouldn't
>>>>> really depend on.
>> Seems like a good idea: it should instead be quirked if ASPM is found
>> to be unusable on a link. Though I'm not aware of how many platforms
>> would require such a quirk.
>> 
>>>>> 
>>>> That's the crux. There's always pcie_aspm=force.
>>>> Something I've wondered is if there is a way we could 'discover' if the
>>>> link is ASPM safe?
>>> 
>>> Sure.  Link Capabilities is supposed to tell us that.  If aspm.c
>>> depends on the BIOS settings, I think that's a design mistake.
>>> 
>>> But what CONFIG_PCIEASPM_* setting are you using?  The default
>>> is CONFIG_PCIEASPM_DEFAULT, which literally means "Use the BIOS
>>> defaults".  If you're using that, and BIOS doesn't enable ASPM below
>>> VMD, I guess aspm.c will leave it disabled, and that seems like it
>>> would be the expected behavior.
>>> 
>>> Does "pcie_aspm=force" really help you?  I don't see any uses of it
>>> that should apply to your situation.
>>> 
>>> Bjorn
>> 
>> No you're right. I don't think we need pcie_aspm=force, just the policy
>> configuration.
> 
> I'm not sure where we're at here.
> 
> If the kernel is built with CONFIG_PCIEASPM_DEFAULT=y (which means
> "use the BIOS defaults"), and the BIOS doesn't enable ASPM on these
> links below VMD, then Linux will leave things alone.  I think that's
> working as intended.

Yes, and that's the problem here. The BIOS doesn't enable ASPM for links behind VMD.
ASPM is enabled by the VMD driver under Windows.

> 
> If desired, we should be able to enable ASPM using sysfs in that case.

I hope to keep everything inside the kernel. Of course we can use udev rules to change the sysfs value, if anyone prefers that approach.

> 
> We have a pci_disable_link_state() kernel interface that drivers can
> use to *disable* ASPM for their device.  But I don't think there's any
> corresponding interface for drivers to *enable* ASPM.  Maybe that's an
> avenue to explore?

Okay, I will work on a pci_enable_link_state() helper and make the VMD driver its first user.
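
Roughly along these lines, as a first sketch (the helper name mirrors
pci_disable_link_state(); everything inside the function body is an
assumption for illustration and will likely look different in the
actual patch):

/*
 * Sketch only: let a driver (e.g. vmd) request ASPM on the link leading
 * to @pdev, symmetric to pci_disable_link_state().
 */
int pci_enable_link_state(struct pci_dev *pdev, int state)
{
	struct pci_dev *bridge = pci_upstream_bridge(pdev);
	struct pcie_link_state *link;

	if (!bridge || !bridge->link_state)
		return -EINVAL;
	link = bridge->link_state;

	down_read(&pci_bus_sem);
	mutex_lock(&aspm_lock);

	/* Treat the requested states as the link's new default */
	link->aspm_default = 0;
	if (state & PCIE_LINK_STATE_L0S)
		link->aspm_default |= ASPM_STATE_L0S;
	if (state & PCIE_LINK_STATE_L1)
		link->aspm_default |= ASPM_STATE_L1;

	/* pcie_config_aspm_link() still masks off unsupported states */
	pcie_config_aspm_link(link, policy_to_aspm_state(link));

	mutex_unlock(&aspm_lock);
	up_read(&pci_bus_sem);
	return 0;
}
EXPORT_SYMBOL(pci_enable_link_state);

The vmd driver could then walk its child root ports and call the helper
with PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 instead of relying on the
quirk in this patch.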

Kai-Heng

> 
> Bjorn


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-09-10 19:51   ` Derrick, Jonathan
@ 2020-09-17 17:20     ` Bjorn Helgaas
  2020-09-23 14:29       ` Kai-Heng Feng
  0 siblings, 1 reply; 25+ messages in thread
From: Bjorn Helgaas @ 2020-09-17 17:20 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: wangxiongfeng2, kw, hkallweit1, kai.heng.feng, refactormyself,
	linux-kernel, mika.westerberg, Mario.Limonciello, linux-pci,
	bhelgaas, Wysocki, Rafael J

On Thu, Sep 10, 2020 at 07:51:05PM +0000, Derrick, Jonathan wrote:
> On Thu, 2020-09-10 at 14:17 -0500, Bjorn Helgaas wrote:
> > On Thu, Sep 10, 2020 at 06:52:48PM +0000, Derrick, Jonathan wrote:
> > > On Thu, 2020-09-10 at 12:38 -0500, Bjorn Helgaas wrote:
> > > > On Thu, Sep 10, 2020 at 04:33:39PM +0000, Derrick, Jonathan wrote:
> > > > > On Wed, 2020-09-09 at 20:55 -0500, Bjorn Helgaas wrote:
> > > > > > On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> > > > > > > New Intel laptops with VMD cannot reach deeper power saving state,
> > > > > > > renders very short battery time.
> > > > > > > 
> > > > > > > As BIOS may not be able to program the config space for devices under
> > > > > > > VMD domain, ASPM needs to be programmed manually by software. This is
> > > > > > > also the case under Windows.
> > > > > > > 
> > > > > > > The VMD controller itself is a root complex integrated endpoint that
> > > > > > > doesn't have ASPM capability, so we can't propagate the ASPM settings to
> > > > > > > devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
> > > > > > > VMD domain, unsupported states will be cleared out anyway.
> > > > > > > 
> > > > > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > > > > ---
> > > > > > >  drivers/pci/pcie/aspm.c |  3 ++-
> > > > > > >  drivers/pci/quirks.c    | 11 +++++++++++
> > > > > > >  include/linux/pci.h     |  2 ++
> > > > > > >  3 files changed, 15 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > > > index 253c30cc1967..dcc002dbca19 100644
> > > > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > > > @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > > > > > >  		aspm_calc_l1ss_info(link, &upreg, &dwreg);
> > > > > > >  
> > > > > > >  	/* Save default state */
> > > > > > > -	link->aspm_default = link->aspm_enabled;
> > > > > > > +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
> > > > > > > +			     ASPM_STATE_ALL : link->aspm_enabled;
> > > > > > 
> > > > > > This function is ridiculously complicated already, and I really don't
> > > > > > want to make it worse.
> > > > > > 
> > > > > > What exactly is the PCIe topology here?  Apparently the VMD controller
> > > > > > is a Root Complex Integrated Endpoint, so it's a Type 0 (non-bridge)
> > > > > > device.  And it has no Link, hence no Link Capabilities or Control and
> > > > > > hence no ASPM-related bits.  Right?
> > > > > 
> > > > > That's correct. VMD is the Type 0 device providing config/mmio
> > > > > apertures to another segment and MSI/X remapping. No link and no ASPM
> > > > > related bits.
> > > > > 
> > > > > Hierarchy is usually something like:
> > > > > 
> > > > > Segment 0           | VMD segment
> > > > > Root Complex -> VMD | Type 0 (RP/Bridge; physical slot) - Type 1
> > > > >                     | Type 0 (RP/Bridge; physical slot) - Type 1
> > > > > 
> > > > > > And the devices under the VMD controller?  I guess they are regular
> > > > > > PCIe Endpoints, Switch Ports, etc?  Obviously there's a Link involved
> > > > > > somewhere.  Does the VMD controller have some magic, non-architected
> > > > > > Port on the downstream side?
> > > > > 
> > > > > Correct: Type 0 and Type 1 devices, and any number of Switch ports as
> > > > > it's usually pinned out to physical slot.
> > > > > 
> > > > > > Does this patch enable ASPM on this magic Link between VMD and the
> > > > > > next device?  Configuring ASPM correctly requires knowledge and knobs
> > > > > > from both ends of the Link, and apparently we don't have those for the
> > > > > > VMD end.
> > > > > 
> > > > > VMD itself doesn't have the link to it's domain. It's really just the
> > > > > config/mmio aperture and MSI/X remapper. The PCIe link is between the
> > > > > Type 0 and Type 1 devices on the VMD domain. So fortunately the VMD
> > > > > itself is not the upstream part of the link.
> > > > > 
> > > > > > Or is it for Links deeper in the hierarchy?  I assume those should
> > > > > > just work already, although there might be issues with latency
> > > > > > computation, etc., because we may not be able to account for the part
> > > > > > of the path above VMD.
> > > > > 
> > > > > That's correct. This is for the links within the domain itself, such as
> > > > > between a type 0 and NVMe device.
> > > > 
> > > > OK, great.  So IIUC, below the VMD, there is a Root Port, and the Root
> > > > Port has a link to some Endpoint or Switch, e.g., an NVMe device.  And
> > > > we just want to enable ASPM on that link.
> > > > 
> > > > That should not be a special case; we should be able to make this so
> > > > it Just Works.  Based on this patch, I guess the reason it doesn't
> > > > work is because link->aspm_enabled for that link isn't set correctly.
> > > > 
> > > > So is this just a consequence of us depending on the initial Link
> > > > Control value from BIOS?  That seems like something we shouldn't
> > > > really depend on.
> Seems like a good idea: it should instead be quirked if ASPM is found
> to be unusable on a link. Though I'm not aware of how many platforms
> would require such a quirk.
> 
> > > > 
> > > That's the crux. There's always pcie_aspm=force.
> > > Something I've wondered is if there is a way we could 'discover' if the
> > > link is ASPM safe?
> > 
> > Sure.  Link Capabilities is supposed to tell us that.  If aspm.c
> > depends on the BIOS settings, I think that's a design mistake.
> > 
> > But what CONFIG_PCIEASPM_* setting are you using?  The default
> > is CONFIG_PCIEASPM_DEFAULT, which literally means "Use the BIOS
> > defaults".  If you're using that, and BIOS doesn't enable ASPM below
> > VMD, I guess aspm.c will leave it disabled, and that seems like it
> > would be the expected behavior.
> > 
> > Does "pcie_aspm=force" really help you?  I don't see any uses of it
> > that should apply to your situation.
> > 
> > Bjorn
> 
> No you're right. I don't think we need pcie_aspm=force, just the policy
> configuration.

I'm not sure where we're at here.

If the kernel is built with CONFIG_PCIEASPM_DEFAULT=y (which means
"use the BIOS defaults"), and the BIOS doesn't enable ASPM on these
links below VMD, then Linux will leave things alone.  I think that's
working as intended.

If desired, we should be able to enable ASPM using sysfs in that case.

We have a pci_disable_link_state() kernel interface that drivers can
use to *disable* ASPM for their device.  But I don't think there's any
corresponding interface for drivers to *enable* ASPM.  Maybe that's an
avenue to explore?
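
For reference, the disable side that drivers already use looks roughly
like this (the function name below is only illustrative):

/*
 * Illustrative only: a driver opting its device's upstream link out of
 * ASPM with the existing interface.  An enable-side helper would
 * presumably take the same PCIE_LINK_STATE_* flags.
 */
static void example_disable_aspm(struct pci_dev *pdev)
{
	pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
				     PCIE_LINK_STATE_L1 |
				     PCIE_LINK_STATE_CLKPM);
}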

Bjorn

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
  2020-09-10 19:17 ` Bjorn Helgaas
@ 2020-09-10 19:51   ` Derrick, Jonathan
  2020-09-17 17:20     ` Bjorn Helgaas
  0 siblings, 1 reply; 25+ messages in thread
From: Derrick, Jonathan @ 2020-09-10 19:51 UTC (permalink / raw)
  To: helgaas
  Cc: wangxiongfeng2, kw, hkallweit1, kai.heng.feng, refactormyself,
	linux-kernel, mika.westerberg, Mario.Limonciello, linux-pci,
	bhelgaas, Wysocki, Rafael J

On Thu, 2020-09-10 at 14:17 -0500, Bjorn Helgaas wrote:
> On Thu, Sep 10, 2020 at 06:52:48PM +0000, Derrick, Jonathan wrote:
> > On Thu, 2020-09-10 at 12:38 -0500, Bjorn Helgaas wrote:
> > > On Thu, Sep 10, 2020 at 04:33:39PM +0000, Derrick, Jonathan wrote:
> > > > On Wed, 2020-09-09 at 20:55 -0500, Bjorn Helgaas wrote:
> > > > > On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> > > > > > New Intel laptops with VMD cannot reach deeper power saving state,
> > > > > > renders very short battery time.
> > > > > > 
> > > > > > As BIOS may not be able to program the config space for devices under
> > > > > > VMD domain, ASPM needs to be programmed manually by software. This is
> > > > > > also the case under Windows.
> > > > > > 
> > > > > > The VMD controller itself is a root complex integrated endpoint that
> > > > > > doesn't have ASPM capability, so we can't propagate the ASPM settings to
> > > > > > devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
> > > > > > VMD domain, unsupported states will be cleared out anyway.
> > > > > > 
> > > > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > > > ---
> > > > > >  drivers/pci/pcie/aspm.c |  3 ++-
> > > > > >  drivers/pci/quirks.c    | 11 +++++++++++
> > > > > >  include/linux/pci.h     |  2 ++
> > > > > >  3 files changed, 15 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > > index 253c30cc1967..dcc002dbca19 100644
> > > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > > @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > > > > >  		aspm_calc_l1ss_info(link, &upreg, &dwreg);
> > > > > >  
> > > > > >  	/* Save default state */
> > > > > > -	link->aspm_default = link->aspm_enabled;
> > > > > > +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
> > > > > > +			     ASPM_STATE_ALL : link->aspm_enabled;
> > > > > 
> > > > > This function is ridiculously complicated already, and I really don't
> > > > > want to make it worse.
> > > > > 
> > > > > What exactly is the PCIe topology here?  Apparently the VMD controller
> > > > > is a Root Complex Integrated Endpoint, so it's a Type 0 (non-bridge)
> > > > > device.  And it has no Link, hence no Link Capabilities or Control and
> > > > > hence no ASPM-related bits.  Right?
> > > > 
> > > > That's correct. VMD is the Type 0 device providing config/mmio
> > > > apertures to another segment and MSI/X remapping. No link and no ASPM
> > > > related bits.
> > > > 
> > > > Hierarchy is usually something like:
> > > > 
> > > > Segment 0           | VMD segment
> > > > Root Complex -> VMD | Type 0 (RP/Bridge; physical slot) - Type 1
> > > >                     | Type 0 (RP/Bridge; physical slot) - Type 1
> > > > 
> > > > > And the devices under the VMD controller?  I guess they are regular
> > > > > PCIe Endpoints, Switch Ports, etc?  Obviously there's a Link involved
> > > > > somewhere.  Does the VMD controller have some magic, non-architected
> > > > > Port on the downstream side?
> > > > 
> > > > Correct: Type 0 and Type 1 devices, and any number of Switch ports as
> > > > it's usually pinned out to physical slot.
> > > > 
> > > > > Does this patch enable ASPM on this magic Link between VMD and the
> > > > > next device?  Configuring ASPM correctly requires knowledge and knobs
> > > > > from both ends of the Link, and apparently we don't have those for the
> > > > > VMD end.
> > > > 
> > > > VMD itself doesn't have the link to it's domain. It's really just the
> > > > config/mmio aperture and MSI/X remapper. The PCIe link is between the
> > > > Type 0 and Type 1 devices on the VMD domain. So fortunately the VMD
> > > > itself is not the upstream part of the link.
> > > > 
> > > > > Or is it for Links deeper in the hierarchy?  I assume those should
> > > > > just work already, although there might be issues with latency
> > > > > computation, etc., because we may not be able to account for the part
> > > > > of the path above VMD.
> > > > 
> > > > That's correct. This is for the links within the domain itself, such as
> > > > between a type 0 and NVMe device.
> > > 
> > > OK, great.  So IIUC, below the VMD, there is a Root Port, and the Root
> > > Port has a link to some Endpoint or Switch, e.g., an NVMe device.  And
> > > we just want to enable ASPM on that link.
> > > 
> > > That should not be a special case; we should be able to make this so
> > > it Just Works.  Based on this patch, I guess the reason it doesn't
> > > work is because link->aspm_enabled for that link isn't set correctly.
> > > 
> > > So is this just a consequence of us depending on the initial Link
> > > Control value from BIOS?  That seems like something we shouldn't
> > > really depend on.
Seems like a good idea: it should instead be quirked if ASPM is found
to be unusable on a link. Though I'm not aware of how many platforms
would require such a quirk.
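
I.e., something modelled on the existing ASPM quirks in
drivers/pci/quirks.c (the device ID below is only a placeholder):

/* Placeholder sketch based on the quirk_disable_aspm_l0s() pattern:
 * keep a link with known-broken ASPM out of L1.  0xabcd is not a real
 * device ID. */
static void quirk_disable_aspm_l1(struct pci_dev *dev)
{
	pci_info(dev, "Disabling ASPM L1\n");
	pci_disable_link_state(dev, PCIE_LINK_STATE_L1);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0xabcd, quirk_disable_aspm_l1);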

> > > 
> > That's the crux. There's always pcie_aspm=force.
> > Something I've wondered is if there is a way we could 'discover' if the
> > link is ASPM safe?
> 
> Sure.  Link Capabilities is supposed to tell us that.  If aspm.c
> depends on the BIOS settings, I think that's a design mistake.
> 
> But what CONFIG_PCIEASPM_* setting are you using?  The default
> is CONFIG_PCIEASPM_DEFAULT, which literally means "Use the BIOS
> defaults".  If you're using that, and BIOS doesn't enable ASPM below
> VMD, I guess aspm.c will leave it disabled, and that seems like it
> would be the expected behavior.
> 
> Does "pcie_aspm=force" really help you?  I don't see any uses of it
> that should apply to your situation.
> 
> Bjorn

No you're right. I don't think we need pcie_aspm=force, just the policy
configuration.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain
       [not found] <0f902d555deb423ef1c79835b23c917be2633162.camel@intel.com>
@ 2020-09-10 19:17 ` Bjorn Helgaas
  2020-09-10 19:51   ` Derrick, Jonathan
  0 siblings, 1 reply; 25+ messages in thread
From: Bjorn Helgaas @ 2020-09-10 19:17 UTC (permalink / raw)
  To: Derrick, Jonathan
  Cc: wangxiongfeng2, kw, hkallweit1, kai.heng.feng, refactormyself,
	linux-kernel, mika.westerberg, Mario.Limonciello, linux-pci,
	bhelgaas, Wysocki, Rafael J

On Thu, Sep 10, 2020 at 06:52:48PM +0000, Derrick, Jonathan wrote:
> On Thu, 2020-09-10 at 12:38 -0500, Bjorn Helgaas wrote:
> > On Thu, Sep 10, 2020 at 04:33:39PM +0000, Derrick, Jonathan wrote:
> > > On Wed, 2020-09-09 at 20:55 -0500, Bjorn Helgaas wrote:
> > > > On Fri, Aug 21, 2020 at 08:32:20PM +0800, Kai-Heng Feng wrote:
> > > > > New Intel laptops with VMD cannot reach deeper power saving state,
> > > > > renders very short battery time.
> > > > > 
> > > > > As BIOS may not be able to program the config space for devices under
> > > > > VMD domain, ASPM needs to be programmed manually by software. This is
> > > > > also the case under Windows.
> > > > > 
> > > > > The VMD controller itself is a root complex integrated endpoint that
> > > > > doesn't have ASPM capability, so we can't propagate the ASPM settings to
> > > > > devices under it. Hence, simply apply ASPM_STATE_ALL to the links under
> > > > > VMD domain, unsupported states will be cleared out anyway.
> > > > > 
> > > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > > ---
> > > > >  drivers/pci/pcie/aspm.c |  3 ++-
> > > > >  drivers/pci/quirks.c    | 11 +++++++++++
> > > > >  include/linux/pci.h     |  2 ++
> > > > >  3 files changed, 15 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > index 253c30cc1967..dcc002dbca19 100644
> > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > @@ -624,7 +624,8 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > > > >  		aspm_calc_l1ss_info(link, &upreg, &dwreg);
> > > > >  
> > > > >  	/* Save default state */
> > > > > -	link->aspm_default = link->aspm_enabled;
> > > > > +	link->aspm_default = parent->dev_flags & PCI_DEV_FLAGS_ENABLE_ASPM ?
> > > > > +			     ASPM_STATE_ALL : link->aspm_enabled;
> > > > 
> > > > This function is ridiculously complicated already, and I really don't
> > > > want to make it worse.
> > > > 
> > > > What exactly is the PCIe topology here?  Apparently the VMD controller
> > > > is a Root Complex Integrated Endpoint, so it's a Type 0 (non-bridge)
> > > > device.  And it has no Link, hence no Link Capabilities or Control and
> > > > hence no ASPM-related bits.  Right?
> > > 
> > > That's correct. VMD is the Type 0 device providing config/mmio
> > > apertures to another segment and MSI/X remapping. No link and no ASPM
> > > related bits.
> > > 
> > > Hierarchy is usually something like:
> > > 
> > > Segment 0           | VMD segment
> > > Root Complex -> VMD | Type 0 (RP/Bridge; physical slot) - Type 1
> > >                     | Type 0 (RP/Bridge; physical slot) - Type 1
> > > 
> > > > And the devices under the VMD controller?  I guess they are regular
> > > > PCIe Endpoints, Switch Ports, etc?  Obviously there's a Link involved
> > > > somewhere.  Does the VMD controller have some magic, non-architected
> > > > Port on the downstream side?
> > > 
> > > Correct: Type 0 and Type 1 devices, and any number of Switch ports as
> > > it's usually pinned out to physical slot.
> > > 
> > > > Does this patch enable ASPM on this magic Link between VMD and the
> > > > next device?  Configuring ASPM correctly requires knowledge and knobs
> > > > from both ends of the Link, and apparently we don't have those for the
> > > > VMD end.
> > > 
> > > VMD itself doesn't have the link to it's domain. It's really just the
> > > config/mmio aperture and MSI/X remapper. The PCIe link is between the
> > > Type 0 and Type 1 devices on the VMD domain. So fortunately the VMD
> > > itself is not the upstream part of the link.
> > > 
> > > > Or is it for Links deeper in the hierarchy?  I assume those should
> > > > just work already, although there might be issues with latency
> > > > computation, etc., because we may not be able to account for the part
> > > > of the path above VMD.
> > > 
> > > That's correct. This is for the links within the domain itself, such as
> > > between a type 0 and NVMe device.
> > 
> > OK, great.  So IIUC, below the VMD, there is a Root Port, and the Root
> > Port has a link to some Endpoint or Switch, e.g., an NVMe device.  And
> > we just want to enable ASPM on that link.
> > 
> > That should not be a special case; we should be able to make this so
> > it Just Works.  Based on this patch, I guess the reason it doesn't
> > work is because link->aspm_enabled for that link isn't set correctly.
> > 
> > So is this just a consequence of us depending on the initial Link
> > Control value from BIOS?  That seems like something we shouldn't
> > really depend on.
> > 
> That's the crux. There's always pcie_aspm=force.
> Something I've wondered is if there is a way we could 'discover' if the
> link is ASPM safe?

Sure.  Link Capabilities is supposed to tell us that.  If aspm.c
depends on the BIOS settings, I think that's a design mistake.
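
That is, read what the hardware advertises in Link Capabilities rather
than trusting whatever is left in Link Control; roughly like the
following (the helper name is only for illustration, and the real code
has to check both ends of the link):

/* Illustrative only: does this component's Link Capabilities register
 * advertise ASPM L1 support, independent of BIOS programming? */
static bool example_lnkcap_supports_l1(struct pci_dev *pdev)
{
	u32 lnkcap;

	if (pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &lnkcap))
		return false;

	/* ASPM Support field, bits 11:10 (PCI_EXP_LNKCAP_ASPMS):
	 * bit 10 = L0s supported, bit 11 = L1 supported */
	return lnkcap & PCI_EXP_LNKCAP_ASPMS & (1 << 11);
}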

But what CONFIG_PCIEASPM_* setting are you using?  The default
is CONFIG_PCIEASPM_DEFAULT, which literally means "Use the BIOS
defaults".  If you're using that, and BIOS doesn't enable ASPM below
VMD, I guess aspm.c will leave it disabled, and that seems like it
would be the expected behavior.

Does "pcie_aspm=force" really help you?  I don't see any uses of it
that should apply to your situation.

Bjorn

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-09-23 14:30 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-21 12:32 [PATCH] PCI/ASPM: Enable ASPM for links under VMD domain Kai-Heng Feng
2020-08-24 13:04 ` Mika Westerberg
2020-08-25  6:23 ` Christoph Hellwig
2020-08-25  6:39   ` Kai Heng Feng
2020-08-25  6:56     ` Christoph Hellwig
2020-08-26  5:53       ` Kai-Heng Feng
2020-09-02 19:48       ` David Fugate
2020-09-02 22:54         ` Keith Busch
2020-08-26 21:43   ` Derrick, Jonathan
2020-08-27  6:34     ` hch
2020-08-27 16:13       ` Derrick, Jonathan
2020-08-27 16:23         ` hch
2020-08-27 16:45           ` Derrick, Jonathan
2020-08-27 16:50             ` hch
2020-08-27 21:33             ` Dan Williams
2020-08-29  7:23               ` hch
2020-08-27 17:49           ` Limonciello, Mario
2020-08-29  7:24             ` hch
2020-09-10  1:55 ` Bjorn Helgaas
2020-09-10 16:33   ` Derrick, Jonathan
2020-09-10 17:38     ` Bjorn Helgaas
     [not found] <0f902d555deb423ef1c79835b23c917be2633162.camel@intel.com>
2020-09-10 19:17 ` Bjorn Helgaas
2020-09-10 19:51   ` Derrick, Jonathan
2020-09-17 17:20     ` Bjorn Helgaas
2020-09-23 14:29       ` Kai-Heng Feng
