linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function
@ 2021-12-16  5:55 David E. Box
  2021-12-16  5:56 ` [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices David E. Box
  2021-12-16 17:06 ` [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function Bjorn Helgaas
  0 siblings, 2 replies; 12+ messages in thread
From: David E. Box @ 2021-12-16  5:55 UTC (permalink / raw)
  To: nirmal.patel, jonathan.derrick, lorenzo.pieralisi, hch, kw, robh,
	bhelgaas, david.e.box, michael.a.bottini, rafael, me
  Cc: linux-pci, linux-kernel

From: Michael Bottini <michael.a.bottini@linux.intel.com>

Devices that appear under the Intel VMD host bridge are not visible to BIOS
and therefore not programmed by BIOS with ASPM settings. For these devices,
it is necessary for the driver to configure ASPM. Since ASPM settings are
adjustable at runtime by module parameter, use the same mechanism to allow
drivers to override the default (in this case never configured) BIOS policy
to ASPM_STATE_ALL. Then, reconfigure ASPM on the link. Do not override if
ASPM control is disabled.

Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
---
V4
 - No changes.
V3
 - Fix missing semicolon in static inline function.

V2
 - Change return type to int so caller can determine if override was
   successful.
 - Return immediately if link is not found so that lock it not
   unecessarily taken, suggested by kw@linux.com.
 - Don't override if aspm_disabled is true.

 drivers/pci/pci.h       |  2 ++
 drivers/pci/pcie/aspm.c | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 3d60cabde1a1..c9c55d43cd8a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -562,11 +562,13 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev);
 void pcie_aspm_exit_link_state(struct pci_dev *pdev);
 void pcie_aspm_pm_state_change(struct pci_dev *pdev);
 void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
+int pcie_aspm_policy_override(struct pci_dev *dev);
 #else
 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
 static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
+static inline int pcie_aspm_policy_override(struct pci_dev *dev) { return -EINVAL; }
 #endif
 
 #ifdef CONFIG_PCIE_ECRC
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 52c74682601a..e2c61e14e724 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1140,6 +1140,25 @@ int pci_disable_link_state(struct pci_dev *pdev, int state)
 }
 EXPORT_SYMBOL(pci_disable_link_state);
 
+int pcie_aspm_policy_override(struct pci_dev *pdev)
+{
+	struct pcie_link_state *link = pcie_aspm_get_link(pdev);
+
+	if (!link || aspm_disabled)
+		return -EINVAL;
+
+	down_read(&pci_bus_sem);
+	mutex_lock(&aspm_lock);
+	link->aspm_default = ASPM_STATE_ALL;
+	pcie_config_aspm_link(link, policy_to_aspm_state(link));
+	pcie_set_clkpm(link, policy_to_clkpm_state(link));
+	mutex_unlock(&aspm_lock);
+	up_read(&pci_bus_sem);
+
+	return 0;
+}
+EXPORT_SYMBOL(pcie_aspm_policy_override);
+
 static int pcie_aspm_set_policy(const char *val,
 				const struct kernel_param *kp)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-16  5:55 [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function David E. Box
@ 2021-12-16  5:56 ` David E. Box
  2021-12-16 17:26   ` Bjorn Helgaas
  2021-12-16 17:06 ` [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function Bjorn Helgaas
  1 sibling, 1 reply; 12+ messages in thread
From: David E. Box @ 2021-12-16  5:56 UTC (permalink / raw)
  To: nirmal.patel, jonathan.derrick, lorenzo.pieralisi, hch, kw, robh,
	bhelgaas, david.e.box, michael.a.bottini, rafael, me
  Cc: linux-pci, linux-kernel

From: Michael Bottini <michael.a.bottini@linux.intel.com>

On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
enabled nor LTR values set by BIOS. This leads high power consumption on
these platforms when VMD is enabled as reported in bugzilla [1].  Enable
these features in the VMD driver using pcie_aspm_policy_override() to set
the ASPM policy for the root ports.

To do this, add an additional flag in VMD features to specify devices that
must have their respective policies overridden.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=213717

Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Tested-by: Adhitya Mohan <me@adhityamohan.in>
---
V4
 - Refactor vmd_enable_apsm() to exit early, making the lines shorter
   and more readable. Suggested by Christoph.
V3
 - No changes
V2
 - Use return status to print pci_info message if ASPM cannot be enabled.
 - Add missing static declaration, caught by lkp@intel.com

 drivers/pci/controller/vmd.c | 43 +++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index a45e8e59d3d4..880afd450a14 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -20,6 +20,8 @@
 
 #include <asm/irqdomain.h>
 
+#include "../pci.h"
+
 #define VMD_CFGBAR	0
 #define VMD_MEMBAR1	2
 #define VMD_MEMBAR2	4
@@ -67,6 +69,12 @@ enum vmd_features {
 	 * interrupt handling.
 	 */
 	VMD_FEAT_CAN_BYPASS_MSI_REMAP		= (1 << 4),
+
+	/*
+	 * Device must have ASPM policy overridden, as its default policy is
+	 * incorrect.
+	 */
+	VMD_FEAT_QUIRK_OVERRIDE_ASPM		= (1 << 5),
 };
 
 static DEFINE_IDA(vmd_instance_ida);
@@ -661,6 +669,30 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
 	return 0;
 }
 
+/*
+ * Override the BIOS ASPM policy and set the LTR value for PCI storage
+ * devices on the VMD bride.
+ */
+static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
+{
+	int features = *(int *)userdata, pos;
+
+	if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
+	    pdev->class != PCI_CLASS_STORAGE_EXPRESS)
+		return 0;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
+	if (!pos)
+		return 0;
+
+	pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
+	pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);
+	if (pcie_aspm_policy_override(pdev))
+		pci_info(pdev, "Unable of override ASPM policy\n");
+
+	return 0;
+}
+
 static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 {
 	struct pci_sysdata *sd = &vmd->sysdata;
@@ -807,6 +839,8 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 	pci_scan_child_bus(vmd->bus);
 	pci_assign_unassigned_bus_resources(vmd->bus);
 
+	pci_walk_bus(vmd->bus, vmd_enable_aspm, &features);
+
 	/*
 	 * VMD root buses are virtual and don't return true on pci_is_pcie()
 	 * and will fail pcie_bus_configure_settings() early. It can instead be
@@ -948,15 +982,18 @@ static const struct pci_device_id vmd_ids[] = {
 	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x467f),
 		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
 				VMD_FEAT_HAS_BUS_RESTRICTIONS |
-				VMD_FEAT_OFFSET_FIRST_VECTOR,},
+				VMD_FEAT_OFFSET_FIRST_VECTOR |
+				VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
 	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4c3d),
 		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
 				VMD_FEAT_HAS_BUS_RESTRICTIONS |
-				VMD_FEAT_OFFSET_FIRST_VECTOR,},
+				VMD_FEAT_OFFSET_FIRST_VECTOR |
+				VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
 	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B),
 		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
 				VMD_FEAT_HAS_BUS_RESTRICTIONS |
-				VMD_FEAT_OFFSET_FIRST_VECTOR,},
+				VMD_FEAT_OFFSET_FIRST_VECTOR |
+				VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
 	{0,}
 };
 MODULE_DEVICE_TABLE(pci, vmd_ids);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function
  2021-12-16  5:55 [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function David E. Box
  2021-12-16  5:56 ` [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices David E. Box
@ 2021-12-16 17:06 ` Bjorn Helgaas
  2021-12-16 18:33   ` David E. Box
  1 sibling, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2021-12-16 17:06 UTC (permalink / raw)
  To: David E. Box
  Cc: nirmal.patel, jonathan.derrick, lorenzo.pieralisi, hch, kw, robh,
	bhelgaas, michael.a.bottini, rafael, me, linux-pci, linux-kernel

On Wed, Dec 15, 2021 at 09:55:59PM -0800, David E. Box wrote:
> From: Michael Bottini <michael.a.bottini@linux.intel.com>
> 
> Devices that appear under the Intel VMD host bridge are not visible to BIOS
> and therefore not programmed by BIOS with ASPM settings. For these devices,
> it is necessary for the driver to configure ASPM. 

The VMD-related parts of this commit log belong in the next patch,
because this patch has nothing in particular to do with VMD.

> Since ASPM settings are adjustable at runtime by module parameter,
> use the same mechanism to allow drivers to override the default (in
> this case never configured) BIOS policy to ASPM_STATE_ALL. Then,
> reconfigure ASPM on the link. Do not override if ASPM control is
> disabled.

The module parameter ("policy") has global effect: it runs
pcie_aspm_set_policy(), which assigns the global "aspm_policy" and
then reconfigures all links in the system.

This is not that; it's a link-based thing that doesn't change
"aspm_policy" and only affects a single link.  This is more like
aspm_attr_store_common() for the sysfs "l0s_aspm" and similar
attributes, or the pci_disable_link_state() interface for drivers.

> Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
> Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> ---
> V4
>  - No changes.
> V3
>  - Fix missing semicolon in static inline function.
> 
> V2
>  - Change return type to int so caller can determine if override was
>    successful.
>  - Return immediately if link is not found so that lock it not
>    unecessarily taken, suggested by kw@linux.com.
>  - Don't override if aspm_disabled is true.
> 
>  drivers/pci/pci.h       |  2 ++
>  drivers/pci/pcie/aspm.c | 19 +++++++++++++++++++
>  2 files changed, 21 insertions(+)
> 
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 3d60cabde1a1..c9c55d43cd8a 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -562,11 +562,13 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev);
>  void pcie_aspm_exit_link_state(struct pci_dev *pdev);
>  void pcie_aspm_pm_state_change(struct pci_dev *pdev);
>  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> +int pcie_aspm_policy_override(struct pci_dev *dev);
>  #else
>  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> +static inline int pcie_aspm_policy_override(struct pci_dev *dev) { return -EINVAL; }
>  #endif
>  
>  #ifdef CONFIG_PCIE_ECRC
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 52c74682601a..e2c61e14e724 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -1140,6 +1140,25 @@ int pci_disable_link_state(struct pci_dev *pdev, int state)
>  }
>  EXPORT_SYMBOL(pci_disable_link_state);
>  
> +int pcie_aspm_policy_override(struct pci_dev *pdev)
> +{
> +	struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> +
> +	if (!link || aspm_disabled)
> +		return -EINVAL;
> +
> +	down_read(&pci_bus_sem);
> +	mutex_lock(&aspm_lock);
> +	link->aspm_default = ASPM_STATE_ALL;
> +	pcie_config_aspm_link(link, policy_to_aspm_state(link));
> +	pcie_set_clkpm(link, policy_to_clkpm_state(link));
> +	mutex_unlock(&aspm_lock);
> +	up_read(&pci_bus_sem);

This is essentially the inverse of pci_disable_link_state().  Why not
name it so the connection is obvious?  Probably also make the
signature ("int state") similar.

> +	return 0;
> +}
> +EXPORT_SYMBOL(pcie_aspm_policy_override);
> +
>  static int pcie_aspm_set_policy(const char *val,
>  				const struct kernel_param *kp)
>  {
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-16  5:56 ` [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices David E. Box
@ 2021-12-16 17:26   ` Bjorn Helgaas
  2021-12-16 21:24     ` David E. Box
  0 siblings, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2021-12-16 17:26 UTC (permalink / raw)
  To: David E. Box
  Cc: nirmal.patel, jonathan.derrick, lorenzo.pieralisi, hch, kw, robh,
	bhelgaas, michael.a.bottini, rafael, me, linux-pci, linux-kernel,
	Rajat Jain

[+cc Rajat for LTR max latency write]

On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> From: Michael Bottini <michael.a.bottini@linux.intel.com>
> 
> On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
> enabled nor LTR values set by BIOS. This leads high power consumption on
> these platforms when VMD is enabled as reported in bugzilla [1].  Enable
> these features in the VMD driver using pcie_aspm_policy_override() to set
> the ASPM policy for the root ports.

s/leads high/leads to high/

Does this depend on "Tiger Lake and Alder Lake platforms"
specifically, or does it depend on a BIOS design choice, i.e., "don't
configure ASPM or LTR for devices below a VMD"?

The subject says "override ASPM on VMD devices," but it looks like
this affects the ASPM configuration of devices *below* the VMD, not of
the VMD itself.

It looks like this only affects *NVMe* devices, since
vmd_enable_aspm() checks for PCI_CLASS_STORAGE_EXPRESS.  Why is that?
Is there something special about NVMe?  I'd think you would want to do
this for *all* devices below a VMD.

Since it only affects PCI_CLASS_STORAGE_EXPRESS devices, I don't think
it actually "sets ASPM policy for the root ports".  vmd_enable_aspm()
calls pcie_aspm_policy_override() on endpoints.  It's true that the
link ASPM state happens to be attached to the upstream end of the
link, but that's an ASPM implementation detail.

This all needs to be clear in the subject and commit log.

> To do this, add an additional flag in VMD features to specify devices that
> must have their respective policies overridden.

I'm not clear on why you want this to apply to only certain VMDs and
not others.  Do some BIOSes configure ASPM for devices below some
VMDs?

> [1] https://bugzilla.kernel.org/show_bug.cgi?id=213717
> 
> Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
> Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> Tested-by: Adhitya Mohan <me@adhityamohan.in>
> ---
> V4
>  - Refactor vmd_enable_apsm() to exit early, making the lines shorter
>    and more readable. Suggested by Christoph.
> V3
>  - No changes
> V2
>  - Use return status to print pci_info message if ASPM cannot be enabled.
>  - Add missing static declaration, caught by lkp@intel.com
> 
>  drivers/pci/controller/vmd.c | 43 +++++++++++++++++++++++++++++++++---
>  1 file changed, 40 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index a45e8e59d3d4..880afd450a14 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -20,6 +20,8 @@
>  
>  #include <asm/irqdomain.h>
>  
> +#include "../pci.h"
> +
>  #define VMD_CFGBAR	0
>  #define VMD_MEMBAR1	2
>  #define VMD_MEMBAR2	4
> @@ -67,6 +69,12 @@ enum vmd_features {
>  	 * interrupt handling.
>  	 */
>  	VMD_FEAT_CAN_BYPASS_MSI_REMAP		= (1 << 4),
> +
> +	/*
> +	 * Device must have ASPM policy overridden, as its default policy is
> +	 * incorrect.
> +	 */
> +	VMD_FEAT_QUIRK_OVERRIDE_ASPM		= (1 << 5),

I think you specifically want to *enable* some ASPM link states, not
just "override the default policy."  "Override" tells us nothing about
whether you are enabling or disabling ASPM.  Applies to subject line
as well.

>  };
>  
>  static DEFINE_IDA(vmd_instance_ida);
> @@ -661,6 +669,30 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
>  	return 0;
>  }
>  
> +/*
> + * Override the BIOS ASPM policy and set the LTR value for PCI storage
> + * devices on the VMD bride.

I don't think there's any BIOS "policy" here.  At this point BIOS is
no longer involved at all, so all that's left is whatever ASPM config
the BIOS did or did not do.

Why only storage?

s/bride/bridge/

> + */
> +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> +{
> +	int features = *(int *)userdata, pos;
> +
> +	if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> +	    pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> +		return 0;
> +
> +	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
> +	if (!pos)
> +		return 0;
> +
> +	pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
> +	pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);

1) Where did this magic 0x1003 value come from?  Does that depend on
the VMD device?  The endpoint?  The circuit design?  The path between
endpoint and VMD?  What if there are switches in the path?

2) There exist broken devices where WORD config accesses don't work:
https://lore.kernel.org/all/20211208000948.487820-1-rajatja@google.com/

We might need a way to quirk config accesses to those devices, but we
don't have one yet.  So for now this needs to be a single DWORD write.

> +	if (pcie_aspm_policy_override(pdev))
> +		pci_info(pdev, "Unable of override ASPM policy\n");

s/Unable of/Unable to/

I think we might need a message about when we *do* override the
policy.  A note in dmesg might be useful for debugging.  I'm worried
about the LTR programming because I really don't understand how we
should be doing that.

> +	return 0;
> +}
> +
>  static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  {
>  	struct pci_sysdata *sd = &vmd->sysdata;
> @@ -807,6 +839,8 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  	pci_scan_child_bus(vmd->bus);
>  	pci_assign_unassigned_bus_resources(vmd->bus);
>  
> +	pci_walk_bus(vmd->bus, vmd_enable_aspm, &features);

Do you support hotplug under VMD?  This will not happen for hot-added
devices.

>  	/*
>  	 * VMD root buses are virtual and don't return true on pci_is_pcie()
>  	 * and will fail pcie_bus_configure_settings() early. It can instead be
> @@ -948,15 +982,18 @@ static const struct pci_device_id vmd_ids[] = {
>  	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x467f),
>  		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
>  				VMD_FEAT_HAS_BUS_RESTRICTIONS |
> -				VMD_FEAT_OFFSET_FIRST_VECTOR,},
> +				VMD_FEAT_OFFSET_FIRST_VECTOR |
> +				VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
>  	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4c3d),
>  		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
>  				VMD_FEAT_HAS_BUS_RESTRICTIONS |
> -				VMD_FEAT_OFFSET_FIRST_VECTOR,},
> +				VMD_FEAT_OFFSET_FIRST_VECTOR |
> +				VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
>  	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B),
>  		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
>  				VMD_FEAT_HAS_BUS_RESTRICTIONS |
> -				VMD_FEAT_OFFSET_FIRST_VECTOR,},
> +				VMD_FEAT_OFFSET_FIRST_VECTOR |
> +				VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
>  	{0,}
>  };
>  MODULE_DEVICE_TABLE(pci, vmd_ids);
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function
  2021-12-16 17:06 ` [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function Bjorn Helgaas
@ 2021-12-16 18:33   ` David E. Box
  0 siblings, 0 replies; 12+ messages in thread
From: David E. Box @ 2021-12-16 18:33 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: nirmal.patel, jonathan.derrick, lorenzo.pieralisi, hch, kw, robh,
	bhelgaas, michael.a.bottini, rafael, me, linux-pci, linux-kernel

On Thu, 2021-12-16 at 11:06 -0600, Bjorn Helgaas wrote:
> On Wed, Dec 15, 2021 at 09:55:59PM -0800, David E. Box wrote:
> > From: Michael Bottini <michael.a.bottini@linux.intel.com>
> > 
> > Devices that appear under the Intel VMD host bridge are not visible to BIOS
> > and therefore not programmed by BIOS with ASPM settings. For these devices,
> > it is necessary for the driver to configure ASPM. 
> 
> The VMD-related parts of this commit log belong in the next patch,
> because this patch has nothing in particular to do with VMD.

Okay, but shouldn't something be said here as to why it's needed? Maybe a more general description?

> 
> > Since ASPM settings are adjustable at runtime by module parameter,
> > use the same mechanism to allow drivers to override the default (in
> > this case never configured) BIOS policy to ASPM_STATE_ALL. Then,
> > reconfigure ASPM on the link. Do not override if ASPM control is
> > disabled.
> 
> The module parameter ("policy") has global effect: it runs
> pcie_aspm_set_policy(), which assigns the global "aspm_policy" and
> then reconfigures all links in the system.
> 
> This is not that; it's a link-based thing that doesn't change
> "aspm_policy" and only affects a single link.  This is more like
> aspm_attr_store_common() for the sysfs "l0s_aspm" and similar
> attributes, or the pci_disable_link_state() interface for drivers.

Yes. I'll clean up the wording.

> 
> > Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
> > Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> > ---
> > V4
> >  - No changes.
> > V3
> >  - Fix missing semicolon in static inline function.
> > 
> > V2
> >  - Change return type to int so caller can determine if override was
> >    successful.
> >  - Return immediately if link is not found so that lock it not
> >    unecessarily taken, suggested by kw@linux.com.
> >  - Don't override if aspm_disabled is true.
> > 
> >  drivers/pci/pci.h       |  2 ++
> >  drivers/pci/pcie/aspm.c | 19 +++++++++++++++++++
> >  2 files changed, 21 insertions(+)
> > 
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index 3d60cabde1a1..c9c55d43cd8a 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -562,11 +562,13 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev);
> >  void pcie_aspm_exit_link_state(struct pci_dev *pdev);
> >  void pcie_aspm_pm_state_change(struct pci_dev *pdev);
> >  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> > +int pcie_aspm_policy_override(struct pci_dev *dev);
> >  #else
> >  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
> >  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
> >  static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
> >  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> > +static inline int pcie_aspm_policy_override(struct pci_dev *dev) { return -EINVAL; }
> >  #endif
> >  
> >  #ifdef CONFIG_PCIE_ECRC
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 52c74682601a..e2c61e14e724 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -1140,6 +1140,25 @@ int pci_disable_link_state(struct pci_dev *pdev, int state)
> >  }
> >  EXPORT_SYMBOL(pci_disable_link_state);
> >  
> > +int pcie_aspm_policy_override(struct pci_dev *pdev)
> > +{
> > +       struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > +
> > +       if (!link || aspm_disabled)
> > +               return -EINVAL;
> > +
> > +       down_read(&pci_bus_sem);
> > +       mutex_lock(&aspm_lock);
> > +       link->aspm_default = ASPM_STATE_ALL;
> > +       pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > +       pcie_set_clkpm(link, policy_to_clkpm_state(link));
> > +       mutex_unlock(&aspm_lock);
> > +       up_read(&pci_bus_sem);
> 
> This is essentially the inverse of pci_disable_link_state().  Why not
> name it so the connection is obvious?  Probably also make the
> signature ("int state") similar.

Indeed it is the inverse. Will implement as pci_enable_link_state().

Thanks.

> 
> > +       return 0;
> > +}
> > +EXPORT_SYMBOL(pcie_aspm_policy_override);
> > +
> >  static int pcie_aspm_set_policy(const char *val,
> >                                 const struct kernel_param *kp)
> >  {
> > -- 
> > 2.25.1
> > 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-16 17:26   ` Bjorn Helgaas
@ 2021-12-16 21:24     ` David E. Box
  2021-12-20 17:28       ` Bjorn Helgaas
  2021-12-21  1:52       ` Rajat Jain
  0 siblings, 2 replies; 12+ messages in thread
From: David E. Box @ 2021-12-16 21:24 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: nirmal.patel, jonathan.derrick, lorenzo.pieralisi, hch, kw, robh,
	bhelgaas, michael.a.bottini, rafael, me, linux-pci, linux-kernel,
	Rajat Jain

On Thu, 2021-12-16 at 11:26 -0600, Bjorn Helgaas wrote:
> [+cc Rajat for LTR max latency write]
> 
> On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> > From: Michael Bottini <michael.a.bottini@linux.intel.com>
> > 
> > On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
> > enabled nor LTR values set by BIOS. This leads high power consumption on
> > these platforms when VMD is enabled as reported in bugzilla [1].  Enable
> > these features in the VMD driver using pcie_aspm_policy_override() to set
> > the ASPM policy for the root ports.
> 
> s/leads high/leads to high/
> 
> Does this depend on "Tiger Lake and Alder Lake platforms"
> specifically, or does it depend on a BIOS design choice, i.e., "don't
> configure ASPM or LTR for devices below a VMD"?

It was a BIOS design choice to have this done by the OS driver for both Tiger Lake and Alder Lake.

> 
> The subject says "override ASPM on VMD devices," but it looks like
> this affects the ASPM configuration of devices *below* the VMD, not of
> the VMD itself.

Yes.

> 
> It looks like this only affects *NVMe* devices, since
> vmd_enable_aspm() checks for PCI_CLASS_STORAGE_EXPRESS.  Why is that?
> Is there something special about NVMe?  I'd think you would want to do
> this for *all* devices below a VMD.

We need this for all devices under the PCIe root ports below VMD. Those will only be NVMe device
AFAICS.

> 
> Since it only affects PCI_CLASS_STORAGE_EXPRESS devices, I don't think
> it actually "sets ASPM policy for the root ports".  vmd_enable_aspm()
> calls pcie_aspm_policy_override() on endpoints.  It's true that the
> link ASPM state happens to be attached to the upstream end of the
> link, but that's an ASPM implementation detail.
> 
> This all needs to be clear in the subject and commit log.

Ack

> 
> > To do this, add an additional flag in VMD features to specify devices that
> > must have their respective policies overridden.
> 
> I'm not clear on why you want this to apply to only certain VMDs and
> not others.  Do some BIOSes configure ASPM for devices below some
> VMDs?

Not currently. But the plan is for future devices to move back to having BIOS do the programming.

> 
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=213717
> > 
> > Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
> > Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> > Tested-by: Adhitya Mohan <me@adhityamohan.in>
> > ---
> > V4
> >  - Refactor vmd_enable_apsm() to exit early, making the lines shorter
> >    and more readable. Suggested by Christoph.
> > V3
> >  - No changes
> > V2
> >  - Use return status to print pci_info message if ASPM cannot be enabled.
> >  - Add missing static declaration, caught by lkp@intel.com
> > 
> >  drivers/pci/controller/vmd.c | 43 +++++++++++++++++++++++++++++++++---
> >  1 file changed, 40 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > index a45e8e59d3d4..880afd450a14 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -20,6 +20,8 @@
> >  
> >  #include <asm/irqdomain.h>
> >  
> > +#include "../pci.h"
> > +
> >  #define VMD_CFGBAR     0
> >  #define VMD_MEMBAR1    2
> >  #define VMD_MEMBAR2    4
> > @@ -67,6 +69,12 @@ enum vmd_features {
> >          * interrupt handling.
> >          */
> >         VMD_FEAT_CAN_BYPASS_MSI_REMAP           = (1 << 4),
> > +
> > +       /*
> > +        * Device must have ASPM policy overridden, as its default policy is
> > +        * incorrect.
> > +        */
> > +       VMD_FEAT_QUIRK_OVERRIDE_ASPM            = (1 << 5),
> 
> I think you specifically want to *enable* some ASPM link states, not
> just "override the default policy."  "Override" tells us nothing about
> whether you are enabling or disabling ASPM.  Applies to subject line
> as well.

Ack

> 
> >  };
> >  
> >  static DEFINE_IDA(vmd_instance_ida);
> > @@ -661,6 +669,30 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
> >         return 0;
> >  }
> >  
> > +/*
> > + * Override the BIOS ASPM policy and set the LTR value for PCI storage
> > + * devices on the VMD bride.
> 
> I don't think there's any BIOS "policy" here.  At this point BIOS is
> no longer involved at all, so all that's left is whatever ASPM config
> the BIOS did or did not do.
> 
> Why only storage?

Only storage devices will be on these root ports.

> 
> s/bride/bridge/
> 
> > + */
> > +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> > +{
> > +       int features = *(int *)userdata, pos;
> > +
> > +       if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> > +           pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> > +               return 0;
> > +
> > +       pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
> > +       if (!pos)
> > +               return 0;
> > +
> > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
> > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);
> 
> 1) Where did this magic 0x1003 value come from?  Does that depend on
> the VMD device?  The endpoint?  The circuit design?  The path between
> endpoint and VMD?  What if there are switches in the path?

The number comes from the BIOS team. They are tied to the SoC. I don't believe there can be switches
in the path but Nirmal and Jonathan should know for sure. From what I've seen these root ports are
wired directly to M.2 slots on boards that are intended for storage devices.

> 
> 2) There exist broken devices where WORD config accesses don't work:
> https://lore.kernel.org/all/20211208000948.487820-1-rajatja@google.com/
> 
> We might need a way to quirk config accesses to those devices, but we
> don't have one yet.  So for now this needs to be a single DWORD write.

Ack

> 
> > +       if (pcie_aspm_policy_override(pdev))
> > +               pci_info(pdev, "Unable of override ASPM policy\n");
> 
> s/Unable of/Unable to/
> 
> I think we might need a message about when we *do* override the
> policy.  A note in dmesg might be useful for debugging.  I'm worried
> about the LTR programming because I really don't understand how we
> should be doing that.

Sure.

> 
> > +       return 0;
> > +}
> > +
> >  static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> >  {
> >         struct pci_sysdata *sd = &vmd->sysdata;
> > @@ -807,6 +839,8 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> >         pci_scan_child_bus(vmd->bus);
> >         pci_assign_unassigned_bus_resources(vmd->bus);
> >  
> > +       pci_walk_bus(vmd->bus, vmd_enable_aspm, &features);
> 
> Do you support hotplug under VMD?  This will not happen for hot-added
> devices.

That I don't know. Nirmal, Jonathan?

David

> 
> >         /*
> >          * VMD root buses are virtual and don't return true on pci_is_pcie()
> >          * and will fail pcie_bus_configure_settings() early. It can instead be
> > @@ -948,15 +982,18 @@ static const struct pci_device_id vmd_ids[] = {
> >         {PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x467f),
> >                 .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
> >                                 VMD_FEAT_HAS_BUS_RESTRICTIONS |
> > -                               VMD_FEAT_OFFSET_FIRST_VECTOR,},
> > +                               VMD_FEAT_OFFSET_FIRST_VECTOR |
> > +                               VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
> >         {PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4c3d),
> >                 .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
> >                                 VMD_FEAT_HAS_BUS_RESTRICTIONS |
> > -                               VMD_FEAT_OFFSET_FIRST_VECTOR,},
> > +                               VMD_FEAT_OFFSET_FIRST_VECTOR |
> > +                               VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
> >         {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B),
> >                 .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
> >                                 VMD_FEAT_HAS_BUS_RESTRICTIONS |
> > -                               VMD_FEAT_OFFSET_FIRST_VECTOR,},
> > +                               VMD_FEAT_OFFSET_FIRST_VECTOR |
> > +                               VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
> >         {0,}
> >  };
> >  MODULE_DEVICE_TABLE(pci, vmd_ids);
> > -- 
> > 2.25.1
> > 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-16 21:24     ` David E. Box
@ 2021-12-20 17:28       ` Bjorn Helgaas
  2021-12-20 23:06         ` David E. Box
  2021-12-21  1:52       ` Rajat Jain
  1 sibling, 1 reply; 12+ messages in thread
From: Bjorn Helgaas @ 2021-12-20 17:28 UTC (permalink / raw)
  To: David E. Box
  Cc: nirmal.patel, jonathan.derrick, lorenzo.pieralisi, hch, kw, robh,
	bhelgaas, michael.a.bottini, rafael, me, linux-pci, linux-kernel,
	Rajat Jain

On Thu, Dec 16, 2021 at 01:24:00PM -0800, David E. Box wrote:
> On Thu, 2021-12-16 at 11:26 -0600, Bjorn Helgaas wrote:
> > On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> > > From: Michael Bottini <michael.a.bottini@linux.intel.com>
> > > 
> > > On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
> > > enabled nor LTR values set by BIOS. This leads high power consumption on
> > > these platforms when VMD is enabled as reported in bugzilla [1].  Enable
> > > these features in the VMD driver using pcie_aspm_policy_override() to set
> > > the ASPM policy for the root ports.
> > > ...

> > > To do this, add an additional flag in VMD features to specify
> > > devices that must have their respective policies overridden.
> > 
> > I'm not clear on why you want this to apply to only certain VMDs
> > and not others.  Do some BIOSes configure ASPM for devices below
> > some VMDs?
> 
> Not currently. But the plan is for future devices to move back to
> having BIOS do the programming.

Since this is apparently a BIOS design choice, it seems wrong to base
the functionality on the Device ID instead of some signal that tells
us what the BIOS is doing.

> > > + * Override the BIOS ASPM policy and set the LTR value for PCI storage
> > > + * devices on the VMD bride.
> > 
> > I don't think there's any BIOS "policy" here.  At this point BIOS
> > is no longer involved at all, so all that's left is whatever ASPM
> > config the BIOS did or did not do.
> > 
> > Why only storage?
> 
> Only storage devices will be on these root ports.

How do you know this?  You say below that there's an M.2 slot, so
surely the slot could contain a non-storage device?  Couldn't somebody
build a platform with a VMD root port connected to a regular PCIe x4
slot?  Couldn't such a slot support hotplug?

It would be very unusual to hard-code topology knowledge like this
into the kernel, since plug-and-play has always been a major goal of
PCI.

> > > +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> > > +{
> > > +       int features = *(int *)userdata, pos;
> > > +
> > > +       if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> > > +           pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> > > +               return 0;

> > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
> > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);
> > 
> > 1) Where did this magic 0x1003 value come from?  Does that depend
> > on the VMD device?  The endpoint?  The circuit design?  The path
> > between endpoint and VMD?  What if there are switches in the path?
> 
> The number comes from the BIOS team. They are tied to the SoC. I
> don't believe there can be switches in the path but Nirmal and
> Jonathan should know for sure. From what I've seen these root ports
> are wired directly to M.2 slots on boards that are intended for
> storage devices.

I guess you're saying that 0x1003 is determined by the SoC.  If so, I
think this value should be in your .driver_data (which would mean
converting it from a scalar to a struct, as many other drivers do).
The current code suggests that 0x1003 is the correct value for *all*
VMDs and all configurations.

I don't understand LTR well enough to be convinced that this static
value would be the correct value for all possible hierarchies and
devices that could appear below a VMD root port.  I would love to be
educated about this.

Bjorn

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-20 17:28       ` Bjorn Helgaas
@ 2021-12-20 23:06         ` David E. Box
  2021-12-21  2:14           ` Rajat Jain
  0 siblings, 1 reply; 12+ messages in thread
From: David E. Box @ 2021-12-20 23:06 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: francisco.munoz.ruiz, nirmal.patel, jonathan.derrick,
	lorenzo.pieralisi, hch, kw, robh, bhelgaas, michael.a.bottini,
	rafael, me, linux-pci, linux-kernel, Rajat Jain

Hi Born,

So you know this patch comes from our platform power management team to address several VMD bugs
affecting power. The VMD driver developers are already cc'd but also adding Francisco from the VMD
team who can maybe help clarify.

On Mon, 2021-12-20 at 11:28 -0600, Bjorn Helgaas wrote:
> On Thu, Dec 16, 2021 at 01:24:00PM -0800, David E. Box wrote:
> > On Thu, 2021-12-16 at 11:26 -0600, Bjorn Helgaas wrote:
> > > On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> > > > From: Michael Bottini <michael.a.bottini@linux.intel.com>
> > > > 
> > > > On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
> > > > enabled nor LTR values set by BIOS. This leads high power consumption on
> > > > these platforms when VMD is enabled as reported in bugzilla [1].  Enable
> > > > these features in the VMD driver using pcie_aspm_policy_override() to set
> > > > the ASPM policy for the root ports.
> > > > ...
> 
> > > > To do this, add an additional flag in VMD features to specify
> > > > devices that must have their respective policies overridden.
> > > 
> > > I'm not clear on why you want this to apply to only certain VMDs
> > > and not others.  Do some BIOSes configure ASPM for devices below
> > > some VMDs?
> > 
> > Not currently. But the plan is for future devices to move back to
> > having BIOS do the programming.
> 
> Since this is apparently a BIOS design choice, it seems wrong to base
> the functionality on the Device ID instead of some signal that tells
> us what the BIOS is doing.

I don't know if there is another way to tell what the BIOS is _not_ doing. DID is tied to the SoC so
it's our best method of inferring this. We know the firmware team has said they will not support
programming these values for the mentioned platforms so there's no BIOS update coming to change
this. But to be sure, we can add a check first to confirm that these values have not been programmed
(ASPM is disabled and LTRs are 0).

> 
> > > > + * Override the BIOS ASPM policy and set the LTR value for PCI storage
> > > > + * devices on the VMD bride.
> > > 
> > > I don't think there's any BIOS "policy" here.  At this point BIOS
> > > is no longer involved at all, so all that's left is whatever ASPM
> > > config the BIOS did or did not do.
> > > 
> > > Why only storage?
> > 
> > Only storage devices will be on these root ports.
> 
> How do you know this?  You say below that there's an M.2 slot, so
> surely the slot could contain a non-storage device?  Couldn't somebody
> build a platform with a VMD root port connected to a regular PCIe x4
> slot?

In VMD mode the root ports under the VMD bridge are used specifically to manage storage devices. So
only storage devices should be attached to these ports.

>   Couldn't such a slot support hotplug?

The answer I just learned is yes. It may be possible to have a switch too. So both of those would
need to be addressed though I'm note quite sure how yet.

> 
> It would be very unusual to hard-code topology knowledge like this
> into the kernel, since plug-and-play has always been a major goal of
> PCI.
> 
> > > > +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> > > > +{
> > > > +       int features = *(int *)userdata, pos;
> > > > +
> > > > +       if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> > > > +           pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> > > > +               return 0;
> 
> > > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
> > > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);
> > > 
> > > 1) Where did this magic 0x1003 value come from?  Does that depend
> > > on the VMD device?  The endpoint?  The circuit design?  The path
> > > between endpoint and VMD?  What if there are switches in the path?
> > 
> > The number comes from the BIOS team. They are tied to the SoC. I
> > don't believe there can be switches in the path but Nirmal and
> > Jonathan should know for sure. From what I've seen these root ports
> > are wired directly to M.2 slots on boards that are intended for
> > storage devices.
> 
> I guess you're saying that 0x1003 is determined by the SoC.  If so, I
> think this value should be in your .driver_data (which would mean
> converting it from a scalar to a struct, as many other drivers do).
> The current code suggests that 0x1003 is the correct value for *all*
> VMDs and all configurations.

Okay

> 
> I don't understand LTR well enough to be convinced that this static
> value would be the correct value for all possible hierarchies and
> devices that could appear below a VMD root port.  I would love to be
> educated about this.

I am not an LTR expert either, but LTRs are "conglomerated" per spec. Switches subtract their own
latency from the minimum LTR of the downstream ports and report this conglomerated value upstream.
So the value will decrease as needed to account for latency in the hierarchy.

PCI 5.0 Version 1.0, Section 6.18 (pg 611)

Thanks

David

> 
> Bjorn



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-16 21:24     ` David E. Box
  2021-12-20 17:28       ` Bjorn Helgaas
@ 2021-12-21  1:52       ` Rajat Jain
  1 sibling, 0 replies; 12+ messages in thread
From: Rajat Jain @ 2021-12-21  1:52 UTC (permalink / raw)
  To: david.e.box
  Cc: Bjorn Helgaas, nirmal.patel, jonathan.derrick, lorenzo.pieralisi,
	hch, kw, robh, bhelgaas, michael.a.bottini, rafael, me,
	linux-pci, linux-kernel

Hello,

On Thu, Dec 16, 2021 at 1:24 PM David E. Box
<david.e.box@linux.intel.com> wrote:
>
> On Thu, 2021-12-16 at 11:26 -0600, Bjorn Helgaas wrote:
> > [+cc Rajat for LTR max latency write]

Thanks!

> >
> > On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> > > From: Michael Bottini <michael.a.bottini@linux.intel.com>
> > >
> > > On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
> > > enabled nor LTR values set by BIOS. This leads high power consumption on
> > > these platforms when VMD is enabled as reported in bugzilla [1].  Enable
> > > these features in the VMD driver using pcie_aspm_policy_override() to set
> > > the ASPM policy for the root ports.
> >
> > s/leads high/leads to high/
> >
> > Does this depend on "Tiger Lake and Alder Lake platforms"
> > specifically, or does it depend on a BIOS design choice, i.e., "don't
> > configure ASPM or LTR for devices below a VMD"?
>
> It was a BIOS design choice to have this done by the OS driver for both Tiger Lake and Alder Lake.
>
> >
> > The subject says "override ASPM on VMD devices," but it looks like
> > this affects the ASPM configuration of devices *below* the VMD, not of
> > the VMD itself.
>
> Yes.
>
> >
> > It looks like this only affects *NVMe* devices, since
> > vmd_enable_aspm() checks for PCI_CLASS_STORAGE_EXPRESS.  Why is that?
> > Is there something special about NVMe?  I'd think you would want to do
> > this for *all* devices below a VMD.
>
> We need this for all devices under the PCIe root ports below VMD. Those will only be NVMe device
> AFAICS.
>
> >
> > Since it only affects PCI_CLASS_STORAGE_EXPRESS devices, I don't think
> > it actually "sets ASPM policy for the root ports".  vmd_enable_aspm()
> > calls pcie_aspm_policy_override() on endpoints.  It's true that the
> > link ASPM state happens to be attached to the upstream end of the
> > link, but that's an ASPM implementation detail.
> >
> > This all needs to be clear in the subject and commit log.
>
> Ack
>
> >
> > > To do this, add an additional flag in VMD features to specify devices that
> > > must have their respective policies overridden.
> >
> > I'm not clear on why you want this to apply to only certain VMDs and
> > not others.  Do some BIOSes configure ASPM for devices below some
> > VMDs?
>
> Not currently. But the plan is for future devices to move back to having BIOS do the programming.
>
> >
> > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=213717
> > >
> > > Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
> > > Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> > > Tested-by: Adhitya Mohan <me@adhityamohan.in>
> > > ---
> > > V4
> > >  - Refactor vmd_enable_apsm() to exit early, making the lines shorter
> > >    and more readable. Suggested by Christoph.
> > > V3
> > >  - No changes
> > > V2
> > >  - Use return status to print pci_info message if ASPM cannot be enabled.
> > >  - Add missing static declaration, caught by lkp@intel.com
> > >
> > >  drivers/pci/controller/vmd.c | 43 +++++++++++++++++++++++++++++++++---
> > >  1 file changed, 40 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > index a45e8e59d3d4..880afd450a14 100644
> > > --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -20,6 +20,8 @@
> > >
> > >  #include <asm/irqdomain.h>
> > >
> > > +#include "../pci.h"
> > > +
> > >  #define VMD_CFGBAR     0
> > >  #define VMD_MEMBAR1    2
> > >  #define VMD_MEMBAR2    4
> > > @@ -67,6 +69,12 @@ enum vmd_features {
> > >          * interrupt handling.
> > >          */
> > >         VMD_FEAT_CAN_BYPASS_MSI_REMAP           = (1 << 4),
> > > +
> > > +       /*
> > > +        * Device must have ASPM policy overridden, as its default policy is
> > > +        * incorrect.
> > > +        */
> > > +       VMD_FEAT_QUIRK_OVERRIDE_ASPM            = (1 << 5),
> >
> > I think you specifically want to *enable* some ASPM link states, not
> > just "override the default policy."  "Override" tells us nothing about
> > whether you are enabling or disabling ASPM.  Applies to subject line
> > as well.
>
> Ack
>
> >
> > >  };
> > >
> > >  static DEFINE_IDA(vmd_instance_ida);
> > > @@ -661,6 +669,30 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
> > >         return 0;
> > >  }
> > >
> > > +/*
> > > + * Override the BIOS ASPM policy and set the LTR value for PCI storage
> > > + * devices on the VMD bride.
> >
> > I don't think there's any BIOS "policy" here.  At this point BIOS is
> > no longer involved at all, so all that's left is whatever ASPM config
> > the BIOS did or did not do.
> >
> > Why only storage?
>
> Only storage devices will be on these root ports.
>
> >
> > s/bride/bridge/
> >
> > > + */
> > > +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> > > +{
> > > +       int features = *(int *)userdata, pos;
> > > +
> > > +       if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> > > +           pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> > > +               return 0;
> > > +
> > > +       pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
> > > +       if (!pos)
> > > +               return 0;
> > > +
> > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
> > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);
> >
> > 1) Where did this magic 0x1003 value come from?  Does that depend on
> > the VMD device?  The endpoint?  The circuit design?  The path between
> > endpoint and VMD?  What if there are switches in the path?
>
> The number comes from the BIOS team. They are tied to the SoC. I don't believe there can be switches
> in the path but Nirmal and Jonathan should know for sure. From what I've seen these root ports are
> wired directly to M.2 slots on boards that are intended for storage devices.
>
> >
> > 2) There exist broken devices where WORD config accesses don't work:
> > https://lore.kernel.org/all/20211208000948.487820-1-rajatja@google.com/
> >
> > We might need a way to quirk config accesses to those devices, but we
> > don't have one yet.  So for now this needs to be a single DWORD write.
>
> Ack
>

Sideband: Yes, I'm trying to work out such a quirk mechanism, but for
now, yes, please do that as a single DWORD write access.

Thanks,

Rajat

> >
> > > +       if (pcie_aspm_policy_override(pdev))
> > > +               pci_info(pdev, "Unable of override ASPM policy\n");
> >
> > s/Unable of/Unable to/
> >
> > I think we might need a message about when we *do* override the
> > policy.  A note in dmesg might be useful for debugging.  I'm worried
> > about the LTR programming because I really don't understand how we
> > should be doing that.
>
> Sure.
>
> >
> > > +       return 0;
> > > +}
> > > +
> > >  static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > >  {
> > >         struct pci_sysdata *sd = &vmd->sysdata;
> > > @@ -807,6 +839,8 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > >         pci_scan_child_bus(vmd->bus);
> > >         pci_assign_unassigned_bus_resources(vmd->bus);
> > >
> > > +       pci_walk_bus(vmd->bus, vmd_enable_aspm, &features);
> >
> > Do you support hotplug under VMD?  This will not happen for hot-added
> > devices.
>
> That I don't know. Nirmal, Jonathan?
>
> David
>
> >
> > >         /*
> > >          * VMD root buses are virtual and don't return true on pci_is_pcie()
> > >          * and will fail pcie_bus_configure_settings() early. It can instead be
> > > @@ -948,15 +982,18 @@ static const struct pci_device_id vmd_ids[] = {
> > >         {PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x467f),
> > >                 .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
> > >                                 VMD_FEAT_HAS_BUS_RESTRICTIONS |
> > > -                               VMD_FEAT_OFFSET_FIRST_VECTOR,},
> > > +                               VMD_FEAT_OFFSET_FIRST_VECTOR |
> > > +                               VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
> > >         {PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4c3d),
> > >                 .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
> > >                                 VMD_FEAT_HAS_BUS_RESTRICTIONS |
> > > -                               VMD_FEAT_OFFSET_FIRST_VECTOR,},
> > > +                               VMD_FEAT_OFFSET_FIRST_VECTOR |
> > > +                               VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
> > >         {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_9A0B),
> > >                 .driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
> > >                                 VMD_FEAT_HAS_BUS_RESTRICTIONS |
> > > -                               VMD_FEAT_OFFSET_FIRST_VECTOR,},
> > > +                               VMD_FEAT_OFFSET_FIRST_VECTOR |
> > > +                               VMD_FEAT_QUIRK_OVERRIDE_ASPM,},
> > >         {0,}
> > >  };
> > >  MODULE_DEVICE_TABLE(pci, vmd_ids);
> > > --
> > > 2.25.1
> > >
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-20 23:06         ` David E. Box
@ 2021-12-21  2:14           ` Rajat Jain
  2021-12-21 21:11             ` David E. Box
  0 siblings, 1 reply; 12+ messages in thread
From: Rajat Jain @ 2021-12-21  2:14 UTC (permalink / raw)
  To: david.e.box
  Cc: Bjorn Helgaas, francisco.munoz.ruiz, nirmal.patel,
	jonathan.derrick, lorenzo.pieralisi, hch, kw, robh, bhelgaas,
	michael.a.bottini, rafael, me, linux-pci, linux-kernel

Hello,

On Mon, Dec 20, 2021 at 3:06 PM David E. Box
<david.e.box@linux.intel.com> wrote:
>
> Hi Born,
>
> So you know this patch comes from our platform power management team to address several VMD bugs
> affecting power. The VMD driver developers are already cc'd but also adding Francisco from the VMD
> team who can maybe help clarify.
>
> On Mon, 2021-12-20 at 11:28 -0600, Bjorn Helgaas wrote:
> > On Thu, Dec 16, 2021 at 01:24:00PM -0800, David E. Box wrote:
> > > On Thu, 2021-12-16 at 11:26 -0600, Bjorn Helgaas wrote:
> > > > On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> > > > > From: Michael Bottini <michael.a.bottini@linux.intel.com>
> > > > >
> > > > > On Tiger Lake and Alder Lake platforms, VMD controllers do not have ASPM
> > > > > enabled nor LTR values set by BIOS. This leads high power consumption on
> > > > > these platforms when VMD is enabled as reported in bugzilla [1].  Enable
> > > > > these features in the VMD driver using pcie_aspm_policy_override() to set
> > > > > the ASPM policy for the root ports.
> > > > > ...
> >
> > > > > To do this, add an additional flag in VMD features to specify
> > > > > devices that must have their respective policies overridden.
> > > >
> > > > I'm not clear on why you want this to apply to only certain VMDs
> > > > and not others.  Do some BIOSes configure ASPM for devices below
> > > > some VMDs?
> > >
> > > Not currently. But the plan is for future devices to move back to
> > > having BIOS do the programming.
> >
> > Since this is apparently a BIOS design choice, it seems wrong to base
> > the functionality on the Device ID instead of some signal that tells
> > us what the BIOS is doing.
>
> I don't know if there is another way to tell what the BIOS is _not_ doing. DID is tied to the SoC so
> it's our best method of inferring this. We know the firmware team has said they will not support
> programming these values for the mentioned platforms so there's no BIOS update coming to change
> this.

I'd like to mention here that ChromeOS also runs linux kernel on TGL
and ADL platforms, and we may run a different BIOS altogether, so yes,
we'd greatly appreciate if you do not assume BIOS setting from device
ID / SOC ID, if the setting is *not tied to device ID / SOC ID*.
Perhaps you use ACPI or device trees to convey what BIOS has or hasn't
configured?

> But to be sure, we can add a check first to confirm that these values have not been programmed
> (ASPM is disabled and LTRs are 0).

Yes, reading the ASPM configuration from hardware sounds OK to me.

>
> >
> > > > > + * Override the BIOS ASPM policy and set the LTR value for PCI storage
> > > > > + * devices on the VMD bride.
> > > >
> > > > I don't think there's any BIOS "policy" here.  At this point BIOS
> > > > is no longer involved at all, so all that's left is whatever ASPM
> > > > config the BIOS did or did not do.

Should the Policy override really take place if aspm_policy =
POLICY_DEFAULT which seems to suggest we leave it to whatever BIOS
left it at?

Perhaps for this override to take place, aspm_policy needs to be the
policy needs to be POLICY_POWERSAVE?

#define POLICY_DEFAULT 0        /* BIOS default setting */
#define POLICY_PERFORMANCE 1    /* high performance */
#define POLICY_POWERSAVE 2      /* high power saving */
#define POLICY_POWER_SUPERSAVE 3 /* possibly even more power saving */


> > > >
> > > > Why only storage?
> > >
> > > Only storage devices will be on these root ports.
> >
> > How do you know this?  You say below that there's an M.2 slot, so
> > surely the slot could contain a non-storage device?  Couldn't somebody
> > build a platform with a VMD root port connected to a regular PCIe x4
> > slot?
>
> In VMD mode the root ports under the VMD bridge are used specifically to manage storage devices. So
> only storage devices should be attached to these ports.

Bjorn: from https://www.intel.com/content/www/us/en/architecture-and-technology/intel-volume-management-device-overview.html
"Intel® Volume Management Device (Intel® VMD) is specifically designed
for enterprise-grade management of NVMe SSDs connected to Intel® Xeon®
CPUs."

I think what David means is that any other device other than an NVME
SSD showing up under VMD might be an error / bad device.


>
> >   Couldn't such a slot support hotplug?
>
> The answer I just learned is yes. It may be possible to have a switch too. So both of those would
> need to be addressed though I'm note quite sure how yet.
>
> >
> > It would be very unusual to hard-code topology knowledge like this
> > into the kernel, since plug-and-play has always been a major goal of
> > PCI.
> >
> > > > > +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> > > > > +{
> > > > > +       int features = *(int *)userdata, pos;
> > > > > +
> > > > > +       if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> > > > > +           pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> > > > > +               return 0;
> >
> > > > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, 0x1003);
> > > > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT, 0x1003);
> > > >
> > > > 1) Where did this magic 0x1003 value come from?  Does that depend
> > > > on the VMD device?  The endpoint?  The circuit design?  The path
> > > > between endpoint and VMD?  What if there are switches in the path?
> > >
> > > The number comes from the BIOS team. They are tied to the SoC. I
> > > don't believe there can be switches in the path but Nirmal and
> > > Jonathan should know for sure. From what I've seen these root ports
> > > are wired directly to M.2 slots on boards that are intended for
> > > storage devices.
> >
> > I guess you're saying that 0x1003 is determined by the SoC.  If so, I
> > think this value should be in your .driver_data (which would mean
> > converting it from a scalar to a struct, as many other drivers do).
> > The current code suggests that 0x1003 is the correct value for *all*
> > VMDs and all configurations.
>
> Okay
>
> >
> > I don't understand LTR well enough to be convinced that this static
> > value would be the correct value for all possible hierarchies and
> > devices that could appear below a VMD root port.  I would love to be
> > educated about this.
>
> I am not an LTR expert either, but LTRs are "conglomerated" per spec. Switches subtract their own
> latency from the minimum LTR of the downstream ports and report this conglomerated value upstream.
> So the value will decrease as needed to account for latency in the hierarchy.
>
> PCI 5.0 Version 1.0, Section 6.18 (pg 611)

As described here and as much as I understand, I think these LTR
values (0x1003 here) depend on the platform topology. So what works
for David's platform may not work for another one that may have
Switched for e.g., or for perhaps for another NVME that is hotplugged
later (and has different characteristics than what is on board).
Technically tying the LTR values to a SOC or VID/DID sounds incorrect
to me, and instead should be tied to the platform. Ideally I *thiink*
they should come from ACPI or device tree. However, I wouldn't protest
putting them in driver_data for now given that there aren't any other
users. We may have to find a solution once there are more platforms
that begin to use this driver. I think this is a bigger problem that
needs to be addressed w.r.t. ASPM in general.

Thanks & Best Regards,

Rajat


>
> Thanks
>
> David
>
> >
> > Bjorn
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-21  2:14           ` Rajat Jain
@ 2021-12-21 21:11             ` David E. Box
  2021-12-22  7:50               ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: David E. Box @ 2021-12-21 21:11 UTC (permalink / raw)
  To: Rajat Jain
  Cc: Bjorn Helgaas, francisco.munoz.ruiz, nirmal.patel,
	jonathan.derrick, lorenzo.pieralisi, hch, kw, robh, bhelgaas,
	michael.a.bottini, rafael, me, linux-pci, linux-kernel

Hi,

On Mon, 2021-12-20 at 18:14 -0800, Rajat Jain wrote:
> Hello,
> 
> On Mon, Dec 20, 2021 at 3:06 PM David E. Box
> <david.e.box@linux.intel.com> wrote:
> > Hi Born,
> > 
> > So you know this patch comes from our platform power management team to
> > address several VMD bugs
> > affecting power. The VMD driver developers are already cc'd but also adding
> > Francisco from the VMD
> > team who can maybe help clarify.
> > 
> > On Mon, 2021-12-20 at 11:28 -0600, Bjorn Helgaas wrote:
> > > On Thu, Dec 16, 2021 at 01:24:00PM -0800, David E. Box wrote:
> > > > On Thu, 2021-12-16 at 11:26 -0600, Bjorn Helgaas wrote:
> > > > > On Wed, Dec 15, 2021 at 09:56:00PM -0800, David E. Box wrote:
> > > > > > From: Michael Bottini <michael.a.bottini@linux.intel.com>
> > > > > > 
> > > > > > On Tiger Lake and Alder Lake platforms, VMD controllers do not have
> > > > > > ASPM
> > > > > > enabled nor LTR values set by BIOS. This leads high power
> > > > > > consumption on
> > > > > > these platforms when VMD is enabled as reported in bugzilla
> > > > > > [1].  Enable
> > > > > > these features in the VMD driver using pcie_aspm_policy_override()
> > > > > > to set
> > > > > > the ASPM policy for the root ports.
> > > > > > ...
> > > > > > To do this, add an additional flag in VMD features to specify
> > > > > > devices that must have their respective policies overridden.
> > > > > 
> > > > > I'm not clear on why you want this to apply to only certain VMDs
> > > > > and not others.  Do some BIOSes configure ASPM for devices below
> > > > > some VMDs?
> > > > 
> > > > Not currently. But the plan is for future devices to move back to
> > > > having BIOS do the programming.
> > > 
> > > Since this is apparently a BIOS design choice, it seems wrong to base
> > > the functionality on the Device ID instead of some signal that tells
> > > us what the BIOS is doing.
> > 
> > I don't know if there is another way to tell what the BIOS is _not_ doing.
> > DID is tied to the SoC so
> > it's our best method of inferring this. We know the firmware team has said
> > they will not support
> > programming these values for the mentioned platforms so there's no BIOS
> > update coming to change
> > this.
> 
> I'd like to mention here that ChromeOS also runs linux kernel on TGL
> and ADL platforms, and we may run a different BIOS altogether, so yes,
> we'd greatly appreciate if you do not assume BIOS setting from device
> ID / SOC ID, if the setting is *not tied to device ID / SOC ID*.
> Perhaps you use ACPI or device trees to convey what BIOS has or hasn't
> configured?

The reason why BIOS is not programming these values is because when VMD is
enabled the ports are not visible to BIOS. This would apply to any BIOS
including those used by ChromeOS.


And because BIOS doesn't see these ports ...

> 
> > But to be sure, we can add a check first to confirm that these values have
> > not been programmed
> > (ASPM is disabled and LTRs are 0).
> 
> Yes, reading the ASPM configuration from hardware sounds OK to me.
> 
> > > > > > + * Override the BIOS ASPM policy and set the LTR value for PCI
> > > > > > storage
> > > > > > + * devices on the VMD bride.
> > > > > 
> > > > > I don't think there's any BIOS "policy" here.  At this point BIOS
> > > > > is no longer involved at all, so all that's left is whatever ASPM
> > > > > config the BIOS did or did not do.
> 
> Should the Policy override really take place if aspm_policy =
> POLICY_DEFAULT which seems to suggest we leave it to whatever BIOS
> left it at?

... we cannot rely on POLICY_DEFAULT for their ASPM setting. This leaves the
devices with ASPM disabled.

That said, we do honor the PCI aspm_disabled flag which is tied to both the FADT
bit and module parameter. So we only enable it if OS control is allowed.

> 
> Perhaps for this override to take place, aspm_policy needs to be the
> policy needs to be POLICY_POWERSAVE?
> 
> #define POLICY_DEFAULT 0        /* BIOS default setting */
> #define POLICY_PERFORMANCE 1    /* high performance */
> #define POLICY_POWERSAVE 2      /* high power saving */
> #define POLICY_POWER_SUPERSAVE 3 /* possibly even more power saving */

We do have to set a default in place of the missing BIOS configuration. We chose
to do ASPM_STATE_ALL, allowing all ASPM settings supported by the endpoint.

> 
> 
> > > > > Why only storage?
> > > > 
> > > > Only storage devices will be on these root ports.
> > > 
> > > How do you know this?  You say below that there's an M.2 slot, so
> > > surely the slot could contain a non-storage device?  Couldn't somebody
> > > build a platform with a VMD root port connected to a regular PCIe x4
> > > slot?
> > 
> > In VMD mode the root ports under the VMD bridge are used specifically to
> > manage storage devices. So
> > only storage devices should be attached to these ports.
> 
> Bjorn: from 
> https://www.intel.com/content/www/us/en/architecture-and-technology/intel-volume-management-device-overview.html
> "Intel® Volume Management Device (Intel® VMD) is specifically designed
> for enterprise-grade management of NVMe SSDs connected to Intel® Xeon®
> CPUs."
> 
> I think what David means is that any other device other than an NVME
> SSD showing up under VMD might be an error / bad device.

I don't know if that would happen. It certainly would not be used as intended.

> 
> 
> > >   Couldn't such a slot support hotplug?
> > 
> > The answer I just learned is yes. It may be possible to have a switch too.
> > So both of those would
> > need to be addressed though I'm note quite sure how yet.
> > 
> > > It would be very unusual to hard-code topology knowledge like this
> > > into the kernel, since plug-and-play has always been a major goal of
> > > PCI.
> > > 
> > > > > > +static int vmd_enable_aspm(struct pci_dev *pdev, void *userdata)
> > > > > > +{
> > > > > > +       int features = *(int *)userdata, pos;
> > > > > > +
> > > > > > +       if (!(features & VMD_FEAT_QUIRK_OVERRIDE_ASPM) ||
> > > > > > +           pdev->class != PCI_CLASS_STORAGE_EXPRESS)
> > > > > > +               return 0;
> > > > > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_SNOOP_LAT,
> > > > > > 0x1003);
> > > > > > +       pci_write_config_word(pdev, pos + PCI_LTR_MAX_NOSNOOP_LAT,
> > > > > > 0x1003);
> > > > > 
> > > > > 1) Where did this magic 0x1003 value come from?  Does that depend
> > > > > on the VMD device?  The endpoint?  The circuit design?  The path
> > > > > between endpoint and VMD?  What if there are switches in the path?
> > > > 
> > > > The number comes from the BIOS team. They are tied to the SoC. I
> > > > don't believe there can be switches in the path but Nirmal and
> > > > Jonathan should know for sure. From what I've seen these root ports
> > > > are wired directly to M.2 slots on boards that are intended for
> > > > storage devices.
> > > 
> > > I guess you're saying that 0x1003 is determined by the SoC.  If so, I
> > > think this value should be in your .driver_data (which would mean
> > > converting it from a scalar to a struct, as many other drivers do).
> > > The current code suggests that 0x1003 is the correct value for *all*
> > > VMDs and all configurations.
> > 
> > Okay
> > 
> > > I don't understand LTR well enough to be convinced that this static
> > > value would be the correct value for all possible hierarchies and
> > > devices that could appear below a VMD root port.  I would love to be
> > > educated about this.
> > 
> > I am not an LTR expert either, but LTRs are "conglomerated" per spec.
> > Switches subtract their own
> > latency from the minimum LTR of the downstream ports and report this
> > conglomerated value upstream.
> > So the value will decrease as needed to account for latency in the
> > hierarchy.
> > 
> > PCI 5.0 Version 1.0, Section 6.18 (pg 611)
> 
> As described here and as much as I understand, I think these LTR
> values (0x1003 here) depend on the platform topology. So what works
> for David's platform may not work for another one that may have
> Switched for e.g., or for perhaps for another NVME that is hotplugged
> later (and has different characteristics than what is on board).
> Technically tying the LTR values to a SOC or VID/DID sounds incorrect
> to me, and instead should be tied to the platform. Ideally I *thiink*
> they should come from ACPI or device tree. However, I wouldn't protest
> putting them in driver_data for now given that there aren't any other
> users. We may have to find a solution once there are more platforms
> that begin to use this driver. I think this is a bigger problem that
> needs to be addressed w.r.t. ASPM in general.

I'll seek more clarification from firmware team on the LTR. But again the fact
that switches can reduce the reported LTR seems to allow for at least some
decoupling of the endpoint LTR value from the topology.

David

> 
> Thanks & Best Regards,
> 
> Rajat
> 
> 
> > Thanks
> > 
> > David
> > 
> > > Bjorn


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices
  2021-12-21 21:11             ` David E. Box
@ 2021-12-22  7:50               ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2021-12-22  7:50 UTC (permalink / raw)
  To: David E. Box
  Cc: Rajat Jain, Bjorn Helgaas, francisco.munoz.ruiz, nirmal.patel,
	jonathan.derrick, lorenzo.pieralisi, hch, kw, robh, bhelgaas,
	michael.a.bottini, rafael, me, linux-pci, linux-kernel

On Tue, Dec 21, 2021 at 01:11:26PM -0800, David E. Box wrote:
> The reason why BIOS is not programming these values is because when VMD is
> enabled the ports are not visible to BIOS. This would apply to any BIOS
> including those used by ChromeOS.
> And because BIOS doesn't see these ports ...

Isn't VMD enabled by the BIOS?  The bios should be able see them the
same way as Linux does.  But given that the whole point of VMD is to hide 
these ports from Windows it obviously doesn't.  For Linux and thus for
ChromeOS VMD is completely pointless, so give that the ChromeOS people
aren't as dumb as some people at Intel I'm pretty sure they won't enable
it and just use the ports as normal one and avoid this whole
self-inflicted pain.

If only Intel could give the OS a way to disable VMD at runtime and
discover the actual ports, then we would not have this whole nightmare
of having to support it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-12-22  7:52 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-16  5:55 [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function David E. Box
2021-12-16  5:56 ` [PATCH V4 2/2] PCI: vmd: Override ASPM on TGL/ADL VMD devices David E. Box
2021-12-16 17:26   ` Bjorn Helgaas
2021-12-16 21:24     ` David E. Box
2021-12-20 17:28       ` Bjorn Helgaas
2021-12-20 23:06         ` David E. Box
2021-12-21  2:14           ` Rajat Jain
2021-12-21 21:11             ` David E. Box
2021-12-22  7:50               ` Christoph Hellwig
2021-12-21  1:52       ` Rajat Jain
2021-12-16 17:06 ` [PATCH V4 1/2] PCI/ASPM: Add ASPM BIOS override function Bjorn Helgaas
2021-12-16 18:33   ` David E. Box

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).