All of lore.kernel.org
* [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
@ 2023-10-30 20:16 Nirmal Patel
  2023-10-31 15:31 ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Nirmal Patel @ 2023-10-30 20:16 UTC (permalink / raw)
  To: nirmal.patel, linux-pci

VMD Hotplug should be enabled or disabled based on the VMD rootports'
Hotplug configuration in BIOS. is_hotplug_bridge is set on each
VMD rootport based on the Hotplug Capable bit in SltCap in probe.c.
Check is_hotplug_bridge and enable or disable native_pcie_hotplug
based on that value.

Currently the VMD driver copies ACPI settings or platform configurations
for Hotplug, AER, DPC, PM, etc. and enables or disables these features
on the VMD bridge, which is not correct in the case of Hotplug.

Also, during Guest boot-up, the ACPI settings along with the VMD UEFI
driver are not present in the Guest BIOS, which results in assigning
default values to Hotplug, AER, DPC, etc. As a result, Hotplug is
disabled on VMD in the Guest OS.

This patch makes sure that Hotplug is enabled properly in the Host
as well as in a VM.

Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
---
---
 drivers/pci/controller/vmd.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 769eedeb8802..e39eaef5549a 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -720,6 +720,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 	resource_size_t membar2_offset = 0x2000;
 	struct pci_bus *child;
 	struct pci_dev *dev;
+	struct pci_host_bridge *vmd_bridge;
 	int ret;
 
 	/*
@@ -886,8 +887,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 	 * and will fail pcie_bus_configure_settings() early. It can instead be
 	 * run on each of the real root ports.
 	 */
-	list_for_each_entry(child, &vmd->bus->children, node)
+	vmd_bridge = to_pci_host_bridge(vmd->bus->bridge);
+	list_for_each_entry(child, &vmd->bus->children, node) {
 		pcie_bus_configure_settings(child);
+		/*
+		 * When Hotplug is enabled on vmd root-port, enable it on vmd
+		 * bridge.
+		 */
+		if (child->self->is_hotplug_bridge)
+			vmd_bridge->native_pcie_hotplug = 1;
+	}
 
 	pci_bus_add_devices(vmd->bus);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-10-30 20:16 [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports Nirmal Patel
@ 2023-10-31 15:31 ` Bjorn Helgaas
  2023-10-31 19:59   ` Nirmal Patel
  2023-10-31 20:11   ` Nirmal Patel
  0 siblings, 2 replies; 19+ messages in thread
From: Bjorn Helgaas @ 2023-10-31 15:31 UTC (permalink / raw)
  To: Nirmal Patel; +Cc: linux-pci

On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> VMD Hotplug should be enabled or disabled based on VMD rootports'
> Hotplug configuration in BIOS. is_hotplug_bridge is set on each
> VMD rootport based on Hotplug capable bit in SltCap in probe.c.
> Check is_hotplug_bridge and enable or disable native_pcie_hotplug
> based on that value.
> 
> Currently VMD driver copies ACPI settings or platform configurations
> for Hotplug, AER, DPC, PM, etc and enables or disables these features
> on VMD bridge which is not correct in case of Hotplug.

This needs some background about why it's correct to copy the ACPI
settings in the case of AER, DPC, PM, etc, but incorrect for hotplug.

> Also during the Guest boot up, ACPI settings along with VMD UEFI
> driver are not present in Guest BIOS which results in assigning
> default values to Hotplug, AER, DPC, etc. As a result Hotplug is
> disabled on VMD in the Guest OS.
> 
> This patch will make sure that Hotplug is enabled properly in Host
> as well as in VM.

Did we come to some consensus about how or whether _OSC for the host
bridge above the VMD device should apply to devices in the separate
domain below the VMD?

I think this warrants some clarification and possibly discussion in
the PCI firmware SIG.

At the very least, the commit log should mention _OSC and say
something about the fact that this is assuming PCIe hotplug ownership
for devices below VMD, regardless of what the upstream _OSC said.

> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> ---
> ---
>  drivers/pci/controller/vmd.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 769eedeb8802..e39eaef5549a 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -720,6 +720,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  	resource_size_t membar2_offset = 0x2000;
>  	struct pci_bus *child;
>  	struct pci_dev *dev;
> +	struct pci_host_bridge *vmd_bridge;
>  	int ret;
>  
>  	/*
> @@ -886,8 +887,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  	 * and will fail pcie_bus_configure_settings() early. It can instead be
>  	 * run on each of the real root ports.
>  	 */
> -	list_for_each_entry(child, &vmd->bus->children, node)
> +	vmd_bridge = to_pci_host_bridge(vmd->bus->bridge);
> +	list_for_each_entry(child, &vmd->bus->children, node) {
>  		pcie_bus_configure_settings(child);
> +		/*
> +		 * When Hotplug is enabled on vmd root-port, enable it on vmd
> +		 * bridge.
> +		 */
> +		if (child->self->is_hotplug_bridge)
> +			vmd_bridge->native_pcie_hotplug = 1;
> +	}
>  
>  	pci_bus_add_devices(vmd->bus);
>  
> -- 
> 2.31.1
> 


* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-10-31 15:31 ` Bjorn Helgaas
@ 2023-10-31 19:59   ` Nirmal Patel
  2023-10-31 23:26     ` Nirmal Patel
  2023-11-01 22:20     ` Bjorn Helgaas
  2023-10-31 20:11   ` Nirmal Patel
  1 sibling, 2 replies; 19+ messages in thread
From: Nirmal Patel @ 2023-10-31 19:59 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci

On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> > VMD Hotplug should be enabled or disabled based on VMD rootports'
> > Hotplug configuration in BIOS. is_hotplug_bridge is set on each
> > VMD rootport based on Hotplug capable bit in SltCap in probe.c.
> > Check is_hotplug_bridge and enable or disable native_pcie_hotplug
> > based on that value.
> > 
> > Currently VMD driver copies ACPI settings or platform
> > configurations
> > for Hotplug, AER, DPC, PM, etc and enables or disables these
> > features
> > on VMD bridge which is not correct in case of Hotplug.
> 
> This needs some background about why it's correct to copy the ACPI
> settings in the case of AER, DPC, PM, etc, but incorrect for hotplug.
> 
> > Also during the Guest boot up, ACPI settings along with VMD UEFI
> > driver are not present in Guest BIOS which results in assigning
> > default values to Hotplug, AER, DPC, etc. As a result Hotplug is
> > disabled on VMD in the Guest OS.
> > 
> > This patch will make sure that Hotplug is enabled properly in Host
> > as well as in VM.
> 
> Did we come to some consensus about how or whether _OSC for the host
> bridge above the VMD device should apply to devices in the separate
> domain below the VMD?
We have not been able to come to any consensus. Someone suggested copying
either all _OSC flags or none, but the logic behind that suggestion is
that VMD is a bridge device, which is not completely true. VMD is an
endpoint device and it owns its domain.

Also, please keep in mind that since the Guest BIOS doesn't have an
_OSC implementation, all of the flags (Hotplug, AER, DPC) are set to
power-on default values and VMD's very important hotplug functionality
is broken.

The patch 04b12ef163d1 assumes VMD is a bridge device and borrows/
*imposes system settings* for AER, DPC, Hotplug, PM, etc. on VMD.
VMD is a *type 0 PCI endpoint* device, and all the PCI devices under VMD
are *privately* owned by VMD, not by the OS.

Also, VMD has its own Hotplug settings for its rootports in BIOS, under
VMD settings, that are different from the global BIOS system settings.
It is these settings that give VMD its own unique functionality.

That is why I suggested three solutions but never got any confirmation.

#1: Revert the patch 04b12ef163d1, which was added under a wrong
assumption. This patch wouldn't have been needed in the VMD code if AER
had been disabled in the BIOS platform settings.

#2: The VMD driver disables AER by copying the AER BIOS system settings,
as the patch 04b12ef163d1 does, but does not change Hotplug.
I proposed this patch and didn't get approval.

#3: If hotplug is enabled on the VMD root ports, make sure hotplug is
enabled on the bridge the rootports are connected to. The proposed patch
does that.

Can we please come to some decision? VM Hotplug is an important part of
VMD and customers are reporting VM Hotplug issues. I would like to see
some progress on this one.

Thanks
nirmal
> I think this warrants some clarification and possibly discussion in
> the PCI firmware SIG.
> 
> At the very least, the commit log should mention _OSC and say
> something about the fact that this is assuming PCIe hotplug ownership
> for devices below VMD, regardless of what the upstream _OSC said.
> 
> > Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> > ---
> > ---
> >  drivers/pci/controller/vmd.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/controller/vmd.c
> > b/drivers/pci/controller/vmd.c
> > index 769eedeb8802..e39eaef5549a 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -720,6 +720,7 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features)
> >  	resource_size_t membar2_offset = 0x2000;
> >  	struct pci_bus *child;
> >  	struct pci_dev *dev;
> > +	struct pci_host_bridge *vmd_bridge;
> >  	int ret;
> >  
> >  	/*
> > @@ -886,8 +887,16 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features)
> >  	 * and will fail pcie_bus_configure_settings() early. It can
> > instead be
> >  	 * run on each of the real root ports.
> >  	 */
> > -	list_for_each_entry(child, &vmd->bus->children, node)
> > +	vmd_bridge = to_pci_host_bridge(vmd->bus->bridge);
> > +	list_for_each_entry(child, &vmd->bus->children, node) {
> >  		pcie_bus_configure_settings(child);
> > +		/*
> > +		 * When Hotplug is enabled on vmd root-port, enable it
> > on vmd
> > +		 * bridge.
> > +		 */
> > +		if (child->self->is_hotplug_bridge)
> > +			vmd_bridge->native_pcie_hotplug = 1;
> > +	}
> >  
> >  	pci_bus_add_devices(vmd->bus);
> >  
> > -- 
> > 2.31.1
> > 



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-10-31 15:31 ` Bjorn Helgaas
  2023-10-31 19:59   ` Nirmal Patel
@ 2023-10-31 20:11   ` Nirmal Patel
  1 sibling, 0 replies; 19+ messages in thread
From: Nirmal Patel @ 2023-10-31 20:11 UTC (permalink / raw)
  To: nirmal.patel@linux.intel.com; Bjorn Helgaas; +Cc: linux-pci

On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> > VMD Hotplug should be enabled or disabled based on VMD rootports'
> > Hotplug configuration in BIOS. is_hotplug_bridge is set on each
> > VMD rootport based on Hotplug capable bit in SltCap in probe.c.
> > Check is_hotplug_bridge and enable or disable native_pcie_hotplug
> > based on that value.
> > 
> > Currently VMD driver copies ACPI settings or platform
> > configurations
> > for Hotplug, AER, DPC, PM, etc and enables or disables these
> > features
> > on VMD bridge which is not correct in case of Hotplug.
> 
> This needs some background about why it's correct to copy the ACPI
> settings in the case of AER, DPC, PM, etc, but incorrect for hotplug.
> 
> > Also during the Guest boot up, ACPI settings along with VMD UEFI
> > driver are not present in Guest BIOS which results in assigning
> > default values to Hotplug, AER, DPC, etc. As a result Hotplug is
> > disabled on VMD in the Guest OS.
> > 
> > This patch will make sure that Hotplug is enabled properly in Host
> > as well as in VM.
> 
> Did we come to some consensus about how or whether _OSC for the host
> bridge above the VMD device should apply to devices in the separate
> domain below the VMD?
> 
> I think this warrants some clarification and possibly discussion in
> the PCI firmware SIG.
> 
> At the very least, the commit log should mention _OSC and say
> something about the fact that this is assuming PCIe hotplug ownership
> for devices below VMD, regardless of what the upstream _OSC said.
I will make adjustments to the commit log once we have some agreement.
> 
> > Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> > ---
> > ---
> >  drivers/pci/controller/vmd.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/controller/vmd.c
> > b/drivers/pci/controller/vmd.c
> > index 769eedeb8802..e39eaef5549a 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -720,6 +720,7 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features)
> >  	resource_size_t membar2_offset = 0x2000;
> >  	struct pci_bus *child;
> >  	struct pci_dev *dev;
> > +	struct pci_host_bridge *vmd_bridge;
> >  	int ret;
> >  
> >  	/*
> > @@ -886,8 +887,16 @@ static int vmd_enable_domain(struct vmd_dev
> > *vmd, unsigned long features)
> >  	 * and will fail pcie_bus_configure_settings() early. It can
> > instead be
> >  	 * run on each of the real root ports.
> >  	 */
> > -	list_for_each_entry(child, &vmd->bus->children, node)
> > +	vmd_bridge = to_pci_host_bridge(vmd->bus->bridge);
> > +	list_for_each_entry(child, &vmd->bus->children, node) {
> >  		pcie_bus_configure_settings(child);
> > +		/*
> > +		 * When Hotplug is enabled on vmd root-port, enable it
> > on vmd
> > +		 * bridge.
> > +		 */
> > +		if (child->self->is_hotplug_bridge)
> > +			vmd_bridge->native_pcie_hotplug = 1;
> > +	}
> >  
> >  	pci_bus_add_devices(vmd->bus);
> >  
> > -- 
> > 2.31.1
> > 



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-10-31 19:59   ` Nirmal Patel
@ 2023-10-31 23:26     ` Nirmal Patel
  2023-11-01 22:20     ` Bjorn Helgaas
  1 sibling, 0 replies; 19+ messages in thread
From: Nirmal Patel @ 2023-10-31 23:26 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, orden.e.smith

On Tue, 2023-10-31 at 12:59 -0700, Nirmal Patel wrote:
> On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> > > VMD Hotplug should be enabled or disabled based on VMD rootports'
> > > Hotplug configuration in BIOS. is_hotplug_bridge is set on each
> > > VMD rootport based on Hotplug capable bit in SltCap in probe.c.
> > > Check is_hotplug_bridge and enable or disable native_pcie_hotplug
> > > based on that value.
> > > 
> > > Currently VMD driver copies ACPI settings or platform
> > > configurations
> > > for Hotplug, AER, DPC, PM, etc and enables or disables these
> > > features
> > > on VMD bridge which is not correct in case of Hotplug.
> > 
> > This needs some background about why it's correct to copy the ACPI
> > settings in the case of AER, DPC, PM, etc, but incorrect for
> > hotplug.
> > 
> > > Also during the Guest boot up, ACPI settings along with VMD UEFI
> > > driver are not present in Guest BIOS which results in assigning
> > > default values to Hotplug, AER, DPC, etc. As a result Hotplug is
> > > disabled on VMD in the Guest OS.
> > > 
> > > This patch will make sure that Hotplug is enabled properly in
> > > Host
> > > as well as in VM.
> > 
> > Did we come to some consensus about how or whether _OSC for the
> > host
> > bridge above the VMD device should apply to devices in the separate
> > domain below the VMD?
> We are not able to come to any consensus. Someone suggested to copy
> either all _OSC flags or none. But logic behind that assumption is
> that the VMD is a bridge device which is not completely true. VMD is
> an
> endpoint device and it owns its domain.
> 
> Also please keep this in your consideration, since Guest BIOS
> doesn't have _OSC implementation, all of the flags Hotplug, AER, DPC
> are set to power state default value and VMD's very important hotplug
> functionality is broken.

In the case of the Host OS, when VMD copies all the _OSC flags (Hotplug,
AER, DPC, etc.), they reflect the Host BIOS settings.

But in the case of a Guest OS or VM, the _OSC flags do not reflect the
Host BIOS settings. Instead, what we have are power-on default values in
the VM, which do not reflect any Host BIOS settings; for example, Hotplug
is disabled in the VM even though it is enabled in the Host BIOS.

The patch 04b12ef163d1 broke the settings in the Guest kernel by applying
non-host default _OSC values. This long discussion is about restoring
some of these settings (i.e. Hotplug) to the correct Host BIOS settings.

> 
> The patch 04b12ef163d1 assumes VMD is a bridge device and borrows/
> *imposes system settings* for AER, DPC, Hotplug, PM, etc on VMD.
> VMD is*type 0 PCI endpoint* device and all the PCI devices under VMD
> are *privately *owned by VMD not by the OS.
> 
> Also VMD has its own Hotplug setting for its rootports in BIOS under
> VMD settings that are different from global BIOS system settings. It
> is
> these settings that give VMD its own unique functionality.
> 
> That is why I suggested three solutions but never got any
> confirmation.
> 
> #1: Revert the patch 04b12ef163d1 which was added under wrong
> assumption. This patch didn't need to be added to VMD code if AER
> was disabled from BIOS platform settings.
> 
> #2: VMD driver disables AER by copying AER BIOS system settings
> which the patch 04b12ef163d1 does but do not change Hotplug.
> I proposed this patch and didn't get approval.
> 
> #3: If hotplug is enabled on VMD root ports, make sure hotplug is
> enabled on the bridge rootports are connected to. The proposed patch
> does that.
> 
> Can we please come to some decision? VM Hotplug is an important part
> of
> VMD and customers are reporting VM Hotplug issues. I would like to
> get
> some progress on this one.
> 
> Thanks
> nirmal
> > I think this warrants some clarification and possibly discussion in
> > the PCI firmware SIG.
> > 
> > At the very least, the commit log should mention _OSC and say
> > something about the fact that this is assuming PCIe hotplug
> > ownership
> > for devices below VMD, regardless of what the upstream _OSC said.
> > 
> > > Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> > > ---
> > > ---
> > >  drivers/pci/controller/vmd.c | 11 ++++++++++-
> > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/pci/controller/vmd.c
> > > b/drivers/pci/controller/vmd.c
> > > index 769eedeb8802..e39eaef5549a 100644
> > > --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -720,6 +720,7 @@ static int vmd_enable_domain(struct vmd_dev
> > > *vmd, unsigned long features)
> > >  	resource_size_t membar2_offset = 0x2000;
> > >  	struct pci_bus *child;
> > >  	struct pci_dev *dev;
> > > +	struct pci_host_bridge *vmd_bridge;
> > >  	int ret;
> > >  
> > >  	/*
> > > @@ -886,8 +887,16 @@ static int vmd_enable_domain(struct vmd_dev
> > > *vmd, unsigned long features)
> > >  	 * and will fail pcie_bus_configure_settings() early. It can
> > > instead be
> > >  	 * run on each of the real root ports.
> > >  	 */
> > > -	list_for_each_entry(child, &vmd->bus->children, node)
> > > +	vmd_bridge = to_pci_host_bridge(vmd->bus->bridge);
> > > +	list_for_each_entry(child, &vmd->bus->children, node) {
> > >  		pcie_bus_configure_settings(child);
> > > +		/*
> > > +		 * When Hotplug is enabled on vmd root-port, enable it
> > > on vmd
> > > +		 * bridge.
> > > +		 */
> > > +		if (child->self->is_hotplug_bridge)
> > > +			vmd_bridge->native_pcie_hotplug = 1;
> > > +	}
> > >  
> > >  	pci_bus_add_devices(vmd->bus);
> > >  
> > > -- 
> > > 2.31.1
> > > 



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-10-31 19:59   ` Nirmal Patel
  2023-10-31 23:26     ` Nirmal Patel
@ 2023-11-01 22:20     ` Bjorn Helgaas
  2023-11-02 20:07       ` Nirmal Patel
  1 sibling, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2023-11-01 22:20 UTC (permalink / raw)
  To: Nirmal Patel; +Cc: linux-pci

On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> > > VMD Hotplug should be enabled or disabled based on VMD rootports'
> > > Hotplug configuration in BIOS. is_hotplug_bridge is set on each
> > > VMD rootport based on Hotplug capable bit in SltCap in probe.c.
> > > Check is_hotplug_bridge and enable or disable native_pcie_hotplug
> > > based on that value.
> > > 
> > > Currently VMD driver copies ACPI settings or platform
> > > configurations
> > > for Hotplug, AER, DPC, PM, etc and enables or disables these
> > > features
> > > on VMD bridge which is not correct in case of Hotplug.
> > 
> > This needs some background about why it's correct to copy the ACPI
> > settings in the case of AER, DPC, PM, etc, but incorrect for hotplug.
> > 
> > > Also during the Guest boot up, ACPI settings along with VMD UEFI
> > > driver are not present in Guest BIOS which results in assigning
> > > default values to Hotplug, AER, DPC, etc. As a result Hotplug is
> > > disabled on VMD in the Guest OS.
> > > 
> > > This patch will make sure that Hotplug is enabled properly in Host
> > > as well as in VM.
> > 
> > Did we come to some consensus about how or whether _OSC for the host
> > bridge above the VMD device should apply to devices in the separate
> > domain below the VMD?
>
> We are not able to come to any consensus. Someone suggested to copy
> either all _OSC flags or none. But logic behind that assumption is
> that the VMD is a bridge device which is not completely true. VMD is an
> endpoint device and it owns its domain.

Do you want to facilitate a discussion in the PCI firmware SIG about
this?  It seems like we may want a little text in the spec about how
to handle this situation so platforms and OSes have the same
expectations.

Bjorn


* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-01 22:20     ` Bjorn Helgaas
@ 2023-11-02 20:07       ` Nirmal Patel
  2023-11-02 20:41         ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Nirmal Patel @ 2023-11-02 20:07 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci

On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> > > > VMD Hotplug should be enabled or disabled based on VMD
> > > > rootports'
> > > > Hotplug configuration in BIOS. is_hotplug_bridge is set on each
> > > > VMD rootport based on Hotplug capable bit in SltCap in probe.c.
> > > > Check is_hotplug_bridge and enable or disable
> > > > native_pcie_hotplug
> > > > based on that value.
> > > > 
> > > > Currently VMD driver copies ACPI settings or platform
> > > > configurations
> > > > for Hotplug, AER, DPC, PM, etc and enables or disables these
> > > > features
> > > > on VMD bridge which is not correct in case of Hotplug.
> > > 
> > > This needs some background about why it's correct to copy the
> > > ACPI
> > > settings in the case of AER, DPC, PM, etc, but incorrect for
> > > hotplug.
> > > 
> > > > Also during the Guest boot up, ACPI settings along with VMD
> > > > UEFI
> > > > driver are not present in Guest BIOS which results in assigning
> > > > default values to Hotplug, AER, DPC, etc. As a result Hotplug
> > > > is
> > > > disabled on VMD in the Guest OS.
> > > > 
> > > > This patch will make sure that Hotplug is enabled properly in
> > > > Host
> > > > as well as in VM.
> > > 
> > > Did we come to some consensus about how or whether _OSC for the
> > > host
> > > bridge above the VMD device should apply to devices in the
> > > separate
> > > domain below the VMD?
> > 
> > We are not able to come to any consensus. Someone suggested to copy
> > either all _OSC flags or none. But logic behind that assumption is
> > that the VMD is a bridge device which is not completely true. VMD
> > is an
> > endpoint device and it owns its domain.
> 
> Do you want to facilitate a discussion in the PCI firmware SIG about
> this?  It seems like we may want a little text in the spec about how
> to handle this situation so platforms and OSes have the same
> expectations.
The patch 04b12ef163d1 broke Intel VMD's hotplug capabilities, and the
author did not test its impact in a VM environment.
We can resolve the issue easily by:
#1 Reverting the patch, which means restoring VMD's original
functionality, and having the author provide a better fix.

or

#2 Allowing the current change to re-enable VMD hotplug inside the VMD
driver.

There is a significant impact on our customers' hotplug use cases, which
forces us to apply the fix in out-of-box drivers for different OSes.

nirmal
> 
> Bjorn



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-02 20:07       ` Nirmal Patel
@ 2023-11-02 20:41         ` Bjorn Helgaas
  2023-11-02 23:49           ` Nirmal Patel
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2023-11-02 20:41 UTC (permalink / raw)
  To: Nirmal Patel; +Cc: linux-pci, Kai-Heng Feng

[+cc Kai-Heng]

On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> > > > > VMD Hotplug should be enabled or disabled based on VMD
> > > > > rootports' Hotplug configuration in BIOS. is_hotplug_bridge
> > > > > is set on each VMD rootport based on Hotplug capable bit in
> > > > > SltCap in probe.c.  Check is_hotplug_bridge and enable or
> > > > > disable native_pcie_hotplug based on that value.
> > > > > 
> > > > > Currently VMD driver copies ACPI settings or platform
> > > > > configurations for Hotplug, AER, DPC, PM, etc and enables or
> > > > > disables these features on VMD bridge which is not correct
> > > > > in case of Hotplug.
> > > > 
> > > > This needs some background about why it's correct to copy the
> > > > ACPI settings in the case of AER, DPC, PM, etc, but incorrect
> > > > for hotplug.
> > > > 
> > > > > Also during the Guest boot up, ACPI settings along with VMD
> > > > > UEFI driver are not present in Guest BIOS which results in
> > > > > assigning default values to Hotplug, AER, DPC, etc. As a
> > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > 
> > > > > This patch will make sure that Hotplug is enabled properly
> > > > > in Host as well as in VM.
> > > > 
> > > > Did we come to some consensus about how or whether _OSC for
> > > > the host bridge above the VMD device should apply to devices
> > > > in the separate domain below the VMD?
> > > 
> > > We are not able to come to any consensus. Someone suggested to
> > > copy either all _OSC flags or none. But logic behind that
> > > assumption is that the VMD is a bridge device which is not
> > > completely true. VMD is an endpoint device and it owns its
> > > domain.
> > 
> > Do you want to facilitate a discussion in the PCI firmware SIG
> > about this?  It seems like we may want a little text in the spec
> > about how to handle this situation so platforms and OSes have the
> > same expectations.
>
> The patch 04b12ef163d1 broke intel VMD's hotplug capabilities and
> author did not test in VM environment impact.
> We can resolve the issue easily by 
>
> #1 Revert the patch which means restoring VMD's original functionality
> and author provide better fix.
> 
> or
> 
> #2 Allow the current change to re-enable VMD hotplug inside VMD driver.
> 
> There is a significant impact for our customers hotplug use cases which
> forces us to apply the fix in out-of-box drivers for different OSs.

I agree 100% that there's a serious problem here and we need to fix
it, there's no argument there.

I guess you're saying it's obvious that an _OSC above VMD does not
apply to devices below VMD, and therefore, no PCI Firmware SIG
discussion or spec clarification is needed?

I'm more interested in a long-term maintainable solution than a quick
fix.  A maintainable solution requires an explanation for (a) why _OSC
above VMD doesn't apply below VMD, and (b) consistency across
everything negotiated by _OSC, not just hotplug.

Bjorn


* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-02 20:41         ` Bjorn Helgaas
@ 2023-11-02 23:49           ` Nirmal Patel
  2023-11-07 21:50             ` Nirmal Patel
  0 siblings, 1 reply; 19+ messages in thread
From: Nirmal Patel @ 2023-11-02 23:49 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Kai-Heng Feng

On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> [+cc Kai-Heng]
> 
> On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel wrote:
> > > > > > VMD Hotplug should be enabled or disabled based on VMD
> > > > > > rootports' Hotplug configuration in BIOS. is_hotplug_bridge
> > > > > > is set on each VMD rootport based on Hotplug capable bit in
> > > > > > SltCap in probe.c.  Check is_hotplug_bridge and enable or
> > > > > > disable native_pcie_hotplug based on that value.
> > > > > > 
> > > > > > Currently VMD driver copies ACPI settings or platform
> > > > > > configurations for Hotplug, AER, DPC, PM, etc and enables
> > > > > > or
> > > > > > disables these features on VMD bridge which is not correct
> > > > > > in case of Hotplug.
> > > > > 
> > > > > This needs some background about why it's correct to copy the
> > > > > ACPI settings in the case of AER, DPC, PM, etc, but incorrect
> > > > > for hotplug.
> > > > > 
> > > > > > Also during the Guest boot up, ACPI settings along with VMD
> > > > > > UEFI driver are not present in Guest BIOS which results in
> > > > > > assigning default values to Hotplug, AER, DPC, etc. As a
> > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > 
> > > > > > This patch will make sure that Hotplug is enabled properly
> > > > > > in Host as well as in VM.
> > > > > 
> > > > > Did we come to some consensus about how or whether _OSC for
> > > > > the host bridge above the VMD device should apply to devices
> > > > > in the separate domain below the VMD?
> > > > 
> > > > We are not able to come to any consensus. Someone suggested to
> > > > copy either all _OSC flags or none. But logic behind that
> > > > assumption is that the VMD is a bridge device which is not
> > > > completely true. VMD is an endpoint device and it owns its
> > > > domain.
> > > 
> > > Do you want to facilitate a discussion in the PCI firmware SIG
> > > about this?  It seems like we may want a little text in the spec
> > > about how to handle this situation so platforms and OSes have the
> > > same expectations.
> > 
> > The patch 04b12ef163d1 broke intel VMD's hotplug capabilities and
> > author did not test in VM environment impact.
> > We can resolve the issue easily by 
> > 
> > #1 Revert the patch which means restoring VMD's original
> > functionality
> > and author provide better fix.
> > 
> > or
> > 
> > #2 Allow the current change to re-enable VMD hotplug inside VMD
> > driver.
> > 
> > There is a significant impact for our customers hotplug use cases
> > which
> > forces us to apply the fix in out-of-box drivers for different OSs.
> 
> I agree 100% that there's a serious problem here and we need to fix
> it, there's no argument there.
> 
> I guess you're saying it's obvious that an _OSC above VMD does not
> apply to devices below VMD, and therefore, no PCI Firmware SIG
> discussion or spec clarification is needed?
Yes. By design, VMD is an endpoint device to the OS, and its domain is
privately owned by VMD only. I believe we should revert to the
original design and not impose _OSC settings on the VMD domain, which
is also a maintainable solution.
> 
> I'm more interested in a long-term maintainable solution than a quick
> fix.  A maintainable solution requires an explanation for (a) why
> _OSC
> above VMD doesn't apply below VMD, and (b) consistency across
> everything negotiated by _OSC, not just hotplug.
The only reason I suggested altering Hotplug is that VMD has its own
independent Hotplug platform setting, distinct from _OSC, which is not
the case for AER and DPC. So we can honor VMD's Hotplug BIOS settings as
well as the _OSC settings for the other flags while maintaining
functionality across Host and Guest OSes.
> 
> Bjorn


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-02 23:49           ` Nirmal Patel
@ 2023-11-07 21:50             ` Nirmal Patel
  2023-11-07 22:30               ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Nirmal Patel @ 2023-11-07 21:50 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, Kai-Heng Feng, orden.e.smith, samruddh.dhope

On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > [+cc Kai-Heng]
> > 
> > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel
> > > > > > wrote:
> > > > > > > VMD Hotplug should be enabled or disabled based on VMD
> > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > is_hotplug_bridge
> > > > > > > is set on each VMD rootport based on Hotplug capable bit
> > > > > > > in
> > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and enable or
> > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > 
> > > > > > > Currently VMD driver copies ACPI settings or platform
> > > > > > > configurations for Hotplug, AER, DPC, PM, etc and enables
> > > > > > > or
> > > > > > > disables these features on VMD bridge which is not
> > > > > > > correct
> > > > > > > in case of Hotplug.
> > > > > > 
> > > > > > This needs some background about why it's correct to copy
> > > > > > the
> > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > incorrect
> > > > > > for hotplug.
> > > > > > 
> > > > > > > Also during the Guest boot up, ACPI settings along with
> > > > > > > VMD
> > > > > > > UEFI driver are not present in Guest BIOS which results
> > > > > > > in
> > > > > > > assigning default values to Hotplug, AER, DPC, etc. As a
> > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > 
> > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > properly
> > > > > > > in Host as well as in VM.
> > > > > > 
> > > > > > Did we come to some consensus about how or whether _OSC for
> > > > > > the host bridge above the VMD device should apply to
> > > > > > devices
> > > > > > in the separate domain below the VMD?
> > > > > 
> > > > > We are not able to come to any consensus. Someone suggested
> > > > > to
> > > > > copy either all _OSC flags or none. But logic behind that
> > > > > assumption is that the VMD is a bridge device which is not
> > > > > completely true. VMD is an endpoint device and it owns its
> > > > > domain.
> > > > 
> > > > Do you want to facilitate a discussion in the PCI firmware SIG
> > > > about this?  It seems like we may want a little text in the
> > > > spec
> > > > about how to handle this situation so platforms and OSes have
> > > > the
> > > > same expectations.
> > > 
> > > The patch 04b12ef163d1 broke intel VMD's hotplug capabilities and
> > > author did not test in VM environment impact.
> > > We can resolve the issue easily by 
> > > 
> > > #1 Revert the patch which means restoring VMD's original
> > > functionality
> > > and author provide better fix.
> > > 
> > > or
> > > 
> > > #2 Allow the current change to re-enable VMD hotplug inside VMD
> > > driver.
> > > 
> > > There is a significant impact for our customers hotplug use cases
> > > which
> > > forces us to apply the fix in out-of-box drivers for different
> > > OSs.
> > 
> > I agree 100% that there's a serious problem here and we need to fix
> > it, there's no argument there.
> > 
> > I guess you're saying it's obvious that an _OSC above VMD does not
> > apply to devices below VMD, and therefore, no PCI Firmware SIG
> > discussion or spec clarification is needed?
> Yes. By design VMD is an endpoint device to OS and its domain is
> privately owned by VMD only. I believe we should revert back to
> original design and not impose _OSC settings on VMD domain which is
> also a maintainable solution.
I will send out a revert patch. The _OSC settings shouldn't apply
to the private VMD domain.

Even the patch 04b12ef163d1 needs more changes to make sure _OSC
settings are passed on from the Host BIOS to the Guest BIOS, which
means involvement of ESXi, Windows Hyper-V, and KVM.
> > I'm more interested in a long-term maintainable solution than a
> > quick
> > fix.  A maintainable solution requires an explanation for (a) why
> > _OSC
> > above VMD doesn't apply below VMD, and (b) consistency across
> > everything negotiated by _OSC, not just hotplug.
> The only reason I suggested to alter Hotplug because VMD has it's own
> independent Hotplug platform setting different from _OSC which is not
> the case for AER, DPC. So we can honor VMD's Hotplug BIOS settings as
> well as _OSC settings for other flags while maintaining
> functionalities
> across Host and Guest OSs.
> > Bjorn



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-07 21:50             ` Nirmal Patel
@ 2023-11-07 22:30               ` Bjorn Helgaas
  2023-11-08 14:49                 ` Kai-Heng Feng
  2023-11-14 23:29                 ` Nirmal Patel
  0 siblings, 2 replies; 19+ messages in thread
From: Bjorn Helgaas @ 2023-11-07 22:30 UTC (permalink / raw)
  To: Nirmal Patel
  Cc: linux-pci, Kai-Heng Feng, orden.e.smith, samruddh.dhope,
	Rafael J. Wysocki

[+cc Rafael, just FYI re 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features")]

On Tue, Nov 07, 2023 at 02:50:57PM -0700, Nirmal Patel wrote:
> On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> > On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> > > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel
> > > > > > > wrote:
> > > > > > > > VMD Hotplug should be enabled or disabled based on VMD
> > > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > > is_hotplug_bridge
> > > > > > > > is set on each VMD rootport based on Hotplug capable bit
> > > > > > > > in
> > > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and enable or
> > > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > > 
> > > > > > > > Currently VMD driver copies ACPI settings or platform
> > > > > > > > configurations for Hotplug, AER, DPC, PM, etc and enables
> > > > > > > > or
> > > > > > > > disables these features on VMD bridge which is not
> > > > > > > > correct
> > > > > > > > in case of Hotplug.
> > > > > > > 
> > > > > > > This needs some background about why it's correct to copy
> > > > > > > the
> > > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > > incorrect
> > > > > > > for hotplug.
> > > > > > > 
> > > > > > > > Also during the Guest boot up, ACPI settings along with
> > > > > > > > VMD
> > > > > > > > UEFI driver are not present in Guest BIOS which results
> > > > > > > > in
> > > > > > > > assigning default values to Hotplug, AER, DPC, etc. As a
> > > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > > 
> > > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > > properly
> > > > > > > > in Host as well as in VM.
> > > > > > > 
> > > > > > > Did we come to some consensus about how or whether _OSC for
> > > > > > > the host bridge above the VMD device should apply to
> > > > > > > devices
> > > > > > > in the separate domain below the VMD?
> > > > > > 
> > > > > > We are not able to come to any consensus. Someone suggested
> > > > > > to
> > > > > > copy either all _OSC flags or none. But logic behind that
> > > > > > assumption is that the VMD is a bridge device which is not
> > > > > > completely true. VMD is an endpoint device and it owns its
> > > > > > domain.
> > > > > 
> > > > > Do you want to facilitate a discussion in the PCI firmware SIG
> > > > > about this?  It seems like we may want a little text in the
> > > > > spec
> > > > > about how to handle this situation so platforms and OSes have
> > > > > the
> > > > > same expectations.
> > > > 
> > > > The patch 04b12ef163d1 broke intel VMD's hotplug capabilities and
> > > > author did not test in VM environment impact.
> > > > We can resolve the issue easily by 
> > > > 
> > > > #1 Revert the patch which means restoring VMD's original
> > > > functionality
> > > > and author provide better fix.
> > > > 
> > > > or
> > > > 
> > > > #2 Allow the current change to re-enable VMD hotplug inside VMD
> > > > driver.
> > > > 
> > > > There is a significant impact for our customers hotplug use cases
> > > > which
> > > > forces us to apply the fix in out-of-box drivers for different
> > > > OSs.
> > > 
> > > I agree 100% that there's a serious problem here and we need to fix
> > > it, there's no argument there.
> > > 
> > > I guess you're saying it's obvious that an _OSC above VMD does not
> > > apply to devices below VMD, and therefore, no PCI Firmware SIG
> > > discussion or spec clarification is needed?
> >
> > Yes. By design VMD is an endpoint device to OS and its domain is
> > privately owned by VMD only. I believe we should revert back to
> > original design and not impose _OSC settings on VMD domain which is
> > also a maintainable solution.
>
> I will send out revert patch. The _OSC settings shouldn't apply
> to private VMD domain. 

I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
on PCIe features").  That appeared in v5.17, and it fixed (or at least
prevented) an AER message flood.  We can't simply revert 04b12ef163d1
unless we first prevent that AER message flood in another way.

Bjorn

> Even the patch 04b12ef163d1 needs more changes to make sure _OSC
> settings are passed on from Host BIOS to Guest BIOS which means
> involvement of ESXi, Windows HyperV, KVM.


* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-07 22:30               ` Bjorn Helgaas
@ 2023-11-08 14:49                 ` Kai-Heng Feng
  2023-11-08 19:44                   ` Nirmal Patel
  2023-11-14 21:07                   ` Nirmal Patel
  2023-11-14 23:29                 ` Nirmal Patel
  1 sibling, 2 replies; 19+ messages in thread
From: Kai-Heng Feng @ 2023-11-08 14:49 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Nirmal Patel, linux-pci, orden.e.smith, samruddh.dhope,
	Rafael J. Wysocki

On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Rafael, just FYI re 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features")]
>
> On Tue, Nov 07, 2023 at 02:50:57PM -0700, Nirmal Patel wrote:
> > On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> > > On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > > > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> > > > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel
> > > > > > > > wrote:
> > > > > > > > > VMD Hotplug should be enabled or disabled based on VMD
> > > > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > > > is_hotplug_bridge
> > > > > > > > > is set on each VMD rootport based on Hotplug capable bit
> > > > > > > > > in
> > > > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and enable or
> > > > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > > >
> > > > > > > > > Currently VMD driver copies ACPI settings or platform
> > > > > > > > > configurations for Hotplug, AER, DPC, PM, etc and enables
> > > > > > > > > or
> > > > > > > > > disables these features on VMD bridge which is not
> > > > > > > > > correct
> > > > > > > > > in case of Hotplug.
> > > > > > > >
> > > > > > > > This needs some background about why it's correct to copy
> > > > > > > > the
> > > > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > > > incorrect
> > > > > > > > for hotplug.
> > > > > > > >
> > > > > > > > > Also during the Guest boot up, ACPI settings along with
> > > > > > > > > VMD
> > > > > > > > > UEFI driver are not present in Guest BIOS which results
> > > > > > > > > in
> > > > > > > > > assigning default values to Hotplug, AER, DPC, etc. As a
> > > > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > > >
> > > > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > > > properly
> > > > > > > > > in Host as well as in VM.
> > > > > > > >
> > > > > > > > Did we come to some consensus about how or whether _OSC for
> > > > > > > > the host bridge above the VMD device should apply to
> > > > > > > > devices
> > > > > > > > in the separate domain below the VMD?
> > > > > > >
> > > > > > > We are not able to come to any consensus. Someone suggested
> > > > > > > to
> > > > > > > copy either all _OSC flags or none. But logic behind that
> > > > > > > assumption is that the VMD is a bridge device which is not
> > > > > > > completely true. VMD is an endpoint device and it owns its
> > > > > > > domain.
> > > > > >
> > > > > > Do you want to facilitate a discussion in the PCI firmware SIG
> > > > > > about this?  It seems like we may want a little text in the
> > > > > > spec
> > > > > > about how to handle this situation so platforms and OSes have
> > > > > > the
> > > > > > same expectations.
> > > > >
> > > > > The patch 04b12ef163d1 broke intel VMD's hotplug capabilities and
> > > > > author did not test in VM environment impact.
> > > > > We can resolve the issue easily by
> > > > >
> > > > > #1 Revert the patch which means restoring VMD's original
> > > > > functionality
> > > > > and author provide better fix.
> > > > >
> > > > > or
> > > > >
> > > > > #2 Allow the current change to re-enable VMD hotplug inside VMD
> > > > > driver.
> > > > >
> > > > > There is a significant impact for our customers hotplug use cases
> > > > > which
> > > > > forces us to apply the fix in out-of-box drivers for different
> > > > > OSs.
> > > >
> > > > I agree 100% that there's a serious problem here and we need to fix
> > > > it, there's no argument there.
> > > >
> > > > I guess you're saying it's obvious that an _OSC above VMD does not
> > > > apply to devices below VMD, and therefore, no PCI Firmware SIG
> > > > discussion or spec clarification is needed?
> > >
> > > Yes. By design VMD is an endpoint device to OS and its domain is
> > > privately owned by VMD only. I believe we should revert back to
> > > original design and not impose _OSC settings on VMD domain which is
> > > also a maintainable solution.
> >
> > I will send out revert patch. The _OSC settings shouldn't apply
> > to private VMD domain.
>
> I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
> on PCIe features").  That appeared in v5.17, and it fixed (or at least
> prevented) an AER message flood.  We can't simply revert 04b12ef163d1
> unless we first prevent that AER message flood in another way.

The error is "correctable".
Does masking all correctable AER errors by default make any sense? And
add a sysfs knob to make it optional.

Kai-Heng

>
> Bjorn
>
> > Even the patch 04b12ef163d1 needs more changes to make sure _OSC
> > settings are passed on from Host BIOS to Guest BIOS which means
> > involvement of ESXi, Windows HyperV, KVM.


* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-08 14:49                 ` Kai-Heng Feng
@ 2023-11-08 19:44                   ` Nirmal Patel
  2023-11-14 21:07                   ` Nirmal Patel
  1 sibling, 0 replies; 19+ messages in thread
From: Nirmal Patel @ 2023-11-08 19:44 UTC (permalink / raw)
  To: Kai-Heng Feng, Bjorn Helgaas
  Cc: linux-pci, orden.e.smith, samruddh.dhope, Rafael J. Wysocki

On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org>
> wrote:
> > [+cc Rafael, just FYI re 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
> > on PCIe features")]
> > 
> > On Tue, Nov 07, 2023 at 02:50:57PM -0700, Nirmal Patel wrote:
> > > On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> > > > On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > > > > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > > > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel
> > > > > > > wrote:
> > > > > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal
> > > > > > > > > Patel
> > > > > > > > > wrote:
> > > > > > > > > > VMD Hotplug should be enabled or disabled based on
> > > > > > > > > > VMD
> > > > > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > > > > is_hotplug_bridge
> > > > > > > > > > is set on each VMD rootport based on Hotplug
> > > > > > > > > > capable bit
> > > > > > > > > > in
> > > > > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and
> > > > > > > > > > enable or
> > > > > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > > > > 
> > > > > > > > > > Currently VMD driver copies ACPI settings or
> > > > > > > > > > platform
> > > > > > > > > > configurations for Hotplug, AER, DPC, PM, etc and
> > > > > > > > > > enables
> > > > > > > > > > or
> > > > > > > > > > disables these features on VMD bridge which is not
> > > > > > > > > > correct
> > > > > > > > > > in case of Hotplug.
> > > > > > > > > 
> > > > > > > > > This needs some background about why it's correct to
> > > > > > > > > copy
> > > > > > > > > the
> > > > > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > > > > incorrect
> > > > > > > > > for hotplug.
> > > > > > > > > 
> > > > > > > > > > Also during the Guest boot up, ACPI settings along
> > > > > > > > > > with
> > > > > > > > > > VMD
> > > > > > > > > > UEFI driver are not present in Guest BIOS which
> > > > > > > > > > results
> > > > > > > > > > in
> > > > > > > > > > assigning default values to Hotplug, AER, DPC, etc.
> > > > > > > > > > As a
> > > > > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > > > > 
> > > > > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > > > > properly
> > > > > > > > > > in Host as well as in VM.
> > > > > > > > > 
> > > > > > > > > Did we come to some consensus about how or whether
> > > > > > > > > _OSC for
> > > > > > > > > the host bridge above the VMD device should apply to
> > > > > > > > > devices
> > > > > > > > > in the separate domain below the VMD?
> > > > > > > > 
> > > > > > > > We are not able to come to any consensus. Someone
> > > > > > > > suggested
> > > > > > > > to
> > > > > > > > copy either all _OSC flags or none. But logic behind
> > > > > > > > that
> > > > > > > > assumption is that the VMD is a bridge device which is
> > > > > > > > not
> > > > > > > > completely true. VMD is an endpoint device and it owns
> > > > > > > > its
> > > > > > > > domain.
> > > > > > > 
> > > > > > > Do you want to facilitate a discussion in the PCI
> > > > > > > firmware SIG
> > > > > > > about this?  It seems like we may want a little text in
> > > > > > > the
> > > > > > > spec
> > > > > > > about how to handle this situation so platforms and OSes
> > > > > > > have
> > > > > > > the
> > > > > > > same expectations.
> > > > > > 
> > > > > > The patch 04b12ef163d1 broke intel VMD's hotplug
> > > > > > capabilities and
> > > > > > author did not test in VM environment impact.
> > > > > > We can resolve the issue easily by
> > > > > > 
> > > > > > #1 Revert the patch which means restoring VMD's original
> > > > > > functionality
> > > > > > and author provide better fix.
> > > > > > 
> > > > > > or
> > > > > > 
> > > > > > #2 Allow the current change to re-enable VMD hotplug inside
> > > > > > VMD
> > > > > > driver.
> > > > > > 
> > > > > > There is a significant impact for our customers hotplug use
> > > > > > cases
> > > > > > which
> > > > > > forces us to apply the fix in out-of-box drivers for
> > > > > > different
> > > > > > OSs.
> > > > > 
> > > > > I agree 100% that there's a serious problem here and we need
> > > > > to fix
> > > > > it, there's no argument there.
> > > > > 
> > > > > I guess you're saying it's obvious that an _OSC above VMD
> > > > > does not
> > > > > apply to devices below VMD, and therefore, no PCI Firmware
> > > > > SIG
> > > > > discussion or spec clarification is needed?
> > > > 
> > > > Yes. By design VMD is an endpoint device to OS and its domain
> > > > is
> > > > privately owned by VMD only. I believe we should revert back to
> > > > original design and not impose _OSC settings on VMD domain
> > > > which is
> > > > also a maintainable solution.
> > > 
> > > I will send out revert patch. The _OSC settings shouldn't apply
> > > to private VMD domain.
> > 
> > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor ACPI
> > _OSC
> > on PCIe features").  That appeared in v5.17, and it fixed (or at
> > least
> > prevented) an AER message flood.  We can't simply revert
> > 04b12ef163d1
> > unless we first prevent that AER message flood in another way.
> 
> The error is "correctable".
> Does masking all correctable AER error by default make any sense? And
> add a sysfs knob to make it optional.
> 
> Kai-Heng
Where does the sysfs knob go? Masking AER errors on the VMD domain via
the VMD driver?

Another way is to disable AER from the VMD driver on the VMD domain
using a sysfs knob.

nirmal
> 
> > Bjorn
> > 
> > > Even the patch 04b12ef163d1 needs more changes to make sure _OSC
> > > settings are passed on from Host BIOS to Guest BIOS which means
> > > involvement of ESXi, Windows HyperV, KVM.



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-08 14:49                 ` Kai-Heng Feng
  2023-11-08 19:44                   ` Nirmal Patel
@ 2023-11-14 21:07                   ` Nirmal Patel
  2023-12-06  2:18                     ` Kai-Heng Feng
  1 sibling, 1 reply; 19+ messages in thread
From: Nirmal Patel @ 2023-11-14 21:07 UTC (permalink / raw)
  To: Kai-Heng Feng, Bjorn Helgaas
  Cc: linux-pci, orden.e.smith, samruddh.dhope, Rafael J. Wysocki

On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org>
> wrote:
> > [+cc Rafael, just FYI re 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
> > on PCIe features")]
> > 
> > On Tue, Nov 07, 2023 at 02:50:57PM -0700, Nirmal Patel wrote:
> > > On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> > > > On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > > > > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > > > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel
> > > > > > > wrote:
> > > > > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal
> > > > > > > > > Patel
> > > > > > > > > wrote:
> > > > > > > > > > VMD Hotplug should be enabled or disabled based on
> > > > > > > > > > VMD
> > > > > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > > > > is_hotplug_bridge
> > > > > > > > > > is set on each VMD rootport based on Hotplug
> > > > > > > > > > capable bit
> > > > > > > > > > in
> > > > > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and
> > > > > > > > > > enable or
> > > > > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > > > > 
> > > > > > > > > > Currently VMD driver copies ACPI settings or
> > > > > > > > > > platform
> > > > > > > > > > configurations for Hotplug, AER, DPC, PM, etc and
> > > > > > > > > > enables
> > > > > > > > > > or
> > > > > > > > > > disables these features on VMD bridge which is not
> > > > > > > > > > correct
> > > > > > > > > > in case of Hotplug.
> > > > > > > > > 
> > > > > > > > > This needs some background about why it's correct to
> > > > > > > > > copy
> > > > > > > > > the
> > > > > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > > > > incorrect
> > > > > > > > > for hotplug.
> > > > > > > > > 
> > > > > > > > > > Also during the Guest boot up, ACPI settings along
> > > > > > > > > > with
> > > > > > > > > > VMD
> > > > > > > > > > UEFI driver are not present in Guest BIOS which
> > > > > > > > > > results
> > > > > > > > > > in
> > > > > > > > > > assigning default values to Hotplug, AER, DPC, etc.
> > > > > > > > > > As a
> > > > > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > > > > 
> > > > > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > > > > properly
> > > > > > > > > > in Host as well as in VM.
> > > > > > > > > 
> > > > > > > > > Did we come to some consensus about how or whether
> > > > > > > > > _OSC for
> > > > > > > > > the host bridge above the VMD device should apply to
> > > > > > > > > devices
> > > > > > > > > in the separate domain below the VMD?
> > > > > > > > 
> > > > > > > > We are not able to come to any consensus. Someone
> > > > > > > > suggested
> > > > > > > > to
> > > > > > > > copy either all _OSC flags or none. But logic behind
> > > > > > > > that
> > > > > > > > assumption is that the VMD is a bridge device which is
> > > > > > > > not
> > > > > > > > completely true. VMD is an endpoint device and it owns
> > > > > > > > its
> > > > > > > > domain.
> > > > > > > 
> > > > > > > Do you want to facilitate a discussion in the PCI
> > > > > > > firmware SIG
> > > > > > > about this?  It seems like we may want a little text in
> > > > > > > the
> > > > > > > spec
> > > > > > > about how to handle this situation so platforms and OSes
> > > > > > > have
> > > > > > > the
> > > > > > > same expectations.
> > > > > > 
> > > > > > The patch 04b12ef163d1 broke intel VMD's hotplug
> > > > > > capabilities and
> > > > > > author did not test in VM environment impact.
> > > > > > We can resolve the issue easily by
> > > > > > 
> > > > > > #1 Revert the patch which means restoring VMD's original
> > > > > > functionality
> > > > > > and author provide better fix.
> > > > > > 
> > > > > > or
> > > > > > 
> > > > > > #2 Allow the current change to re-enable VMD hotplug inside
> > > > > > VMD
> > > > > > driver.
> > > > > > 
> > > > > > There is a significant impact for our customers hotplug use
> > > > > > cases
> > > > > > which
> > > > > > forces us to apply the fix in out-of-box drivers for
> > > > > > different
> > > > > > OSs.
> > > > > 
> > > > > I agree 100% that there's a serious problem here and we need
> > > > > to fix
> > > > > it, there's no argument there.
> > > > > 
> > > > > I guess you're saying it's obvious that an _OSC above VMD
> > > > > does not
> > > > > apply to devices below VMD, and therefore, no PCI Firmware
> > > > > SIG
> > > > > discussion or spec clarification is needed?
> > > > 
> > > > Yes. By design VMD is an endpoint device to OS and its domain
> > > > is
> > > > privately owned by VMD only. I believe we should revert back to
> > > > original design and not impose _OSC settings on VMD domain
> > > > which is
> > > > also a maintainable solution.
> > > 
> > > I will send out revert patch. The _OSC settings shouldn't apply
> > > to private VMD domain.
> > 
> > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor ACPI
> > _OSC
> > on PCIe features").  That appeared in v5.17, and it fixed (or at
> > least
> > prevented) an AER message flood.  We can't simply revert
> > 04b12ef163d1
> > unless we first prevent that AER message flood in another way.
> 
> The error is "correctable".
> Does masking all correctable AER error by default make any sense? And
> add a sysfs knob to make it optional.
I assume the sysfs knob requires a driver reload, right? Can you send
a patch?

nirmal
> 
> Kai-Heng
> 
> > Bjorn
> > 
> > > Even the patch 04b12ef163d1 needs more changes to make sure _OSC
> > > settings are passed on from Host BIOS to Guest BIOS which means
> > > involvement of ESXi, Windows HyperV, KVM.



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-07 22:30               ` Bjorn Helgaas
  2023-11-08 14:49                 ` Kai-Heng Feng
@ 2023-11-14 23:29                 ` Nirmal Patel
  1 sibling, 0 replies; 19+ messages in thread
From: Nirmal Patel @ 2023-11-14 23:29 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Kai-Heng Feng, orden.e.smith, samruddh.dhope,
	Rafael J. Wysocki

On Tue, 2023-11-07 at 16:30 -0600, Bjorn Helgaas wrote:
> [+cc Rafael, just FYI re 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on
> PCIe features")]
> 
> On Tue, Nov 07, 2023 at 02:50:57PM -0700, Nirmal Patel wrote:
> > On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> > > On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > > > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel
> > > > > > wrote:
> > > > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel
> > > > > > > > wrote:
> > > > > > > > > VMD Hotplug should be enabled or disabled based on
> > > > > > > > > VMD
> > > > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > > > is_hotplug_bridge
> > > > > > > > > is set on each VMD rootport based on Hotplug capable
> > > > > > > > > bit
> > > > > > > > > in
> > > > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and
> > > > > > > > > enable or
> > > > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > > > 
> > > > > > > > > Currently VMD driver copies ACPI settings or platform
> > > > > > > > > configurations for Hotplug, AER, DPC, PM, etc and
> > > > > > > > > enables
> > > > > > > > > or
> > > > > > > > > disables these features on VMD bridge which is not
> > > > > > > > > correct
> > > > > > > > > in case of Hotplug.
> > > > > > > > 
> > > > > > > > This needs some background about why it's correct to
> > > > > > > > copy
> > > > > > > > the
> > > > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > > > incorrect
> > > > > > > > for hotplug.
> > > > > > > > 
> > > > > > > > > Also during the Guest boot up, ACPI settings along
> > > > > > > > > with
> > > > > > > > > VMD
> > > > > > > > > UEFI driver are not present in Guest BIOS which
> > > > > > > > > results
> > > > > > > > > in
> > > > > > > > > assigning default values to Hotplug, AER, DPC, etc.
> > > > > > > > > As a
> > > > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > > > 
> > > > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > > > properly
> > > > > > > > > in Host as well as in VM.
> > > > > > > > 
> > > > > > > > Did we come to some consensus about how or whether _OSC
> > > > > > > > for
> > > > > > > > the host bridge above the VMD device should apply to
> > > > > > > > devices
> > > > > > > > in the separate domain below the VMD?
> > > > > > > 
> > > > > > > We are not able to come to any consensus. Someone
> > > > > > > suggested
> > > > > > > to
> > > > > > > copy either all _OSC flags or none. But logic behind that
> > > > > > > assumption is that the VMD is a bridge device which is
> > > > > > > not
> > > > > > > completely true. VMD is an endpoint device and it owns
> > > > > > > its
> > > > > > > domain.
> > > > > > 
> > > > > > Do you want to facilitate a discussion in the PCI firmware
> > > > > > SIG
> > > > > > about this?  It seems like we may want a little text in the
> > > > > > spec
> > > > > > about how to handle this situation so platforms and OSes
> > > > > > have
> > > > > > the
> > > > > > same expectations.
> > > > > 
> > > > > The patch 04b12ef163d1 broke intel VMD's hotplug capabilities
> > > > > and
> > > > > author did not test in VM environment impact.
> > > > > We can resolve the issue easily by 
> > > > > 
> > > > > #1 Revert the patch which means restoring VMD's original
> > > > > functionality
> > > > > and author provide better fix.
> > > > > 
> > > > > or
> > > > > 
> > > > > #2 Allow the current change to re-enable VMD hotplug inside
> > > > > VMD
> > > > > driver.
> > > > > 
> > > > > There is a significant impact for our customers hotplug use
> > > > > cases
> > > > > which
> > > > > forces us to apply the fix in out-of-box drivers for
> > > > > different
> > > > > OSs.
> > > > 
> > > > I agree 100% that there's a serious problem here and we need to
> > > > fix
> > > > it, there's no argument there.
> > > > 
> > > > I guess you're saying it's obvious that an _OSC above VMD does
> > > > not
> > > > apply to devices below VMD, and therefore, no PCI Firmware SIG
> > > > discussion or spec clarification is needed?
> > > 
> > > Yes. By design VMD is an endpoint device to OS and its domain is
> > > privately owned by VMD only. I believe we should revert back to
> > > original design and not impose _OSC settings on VMD domain which
> > > is
> > > also a maintainable solution.
> > 
> > I will send out revert patch. The _OSC settings shouldn't apply
> > to private VMD domain. 
> 
> I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
> on PCIe features").  That appeared in v5.17, and it fixed (or at
> least
> prevented) an AER message flood.  We can't simply revert 04b12ef163d1
> unless we first prevent that AER message flood in another way.
Hi Bjorn,

The patch 04b12ef163d1 has broken the VMD driver for hypervisor
guests. However, we do not need to revert the AER fix that went into
v5.17 in order to fix the currently broken VMD driver in VMs.

The proposed solution is the cleanest and the easiest to maintain, for
the following reasons.

There is a single, separate BIOS setting for Hotplug on VMD, and this
patch honors it while keeping the AER and DPC settings the same as the
global BIOS settings or _OSC settings.

Also, SltCap is exposed in both the host and the guest OS, which means
Hotplug is enabled or disabled correctly without hypervisor
involvement, and Hotplug works in both the VM and the host. Keep in
mind that _OSC flags are not passed into the VM by the hypervisor.

The AER issue is seen only on client systems, because the client BIOS
does not expose a global AER setting. The patch 04b12ef163d1 works
around this limitation by copying the _OSC flags.

I have also tried other solutions since your last response, but none
of them is as clean as the proposed patch.

#1 The VMD UEFI driver writes an NVRAM EFI variable and the VMD Linux
driver reads it and configures all the flags. But this solution
requires the VMD driver to call efivar_get_variable, which makes the
VMD driver dependent on the EFI module; that is not good, and it also
requires VMD UEFI driver changes. Reading this variable in a VM adds
another challenge. The solution is not backward compatible either,
since efivar_get_variable was added only a couple of years ago.

#2 I also looked at the sysfs knob solution suggested by Kai-Heng, but
it requires user access to the driver and a driver reload, and the
sysfs knob is not persistent across reboots.

#3 Boot parameters - adding boot parameters for all the flags would
require at least 5 new parameters, so this is not a good solution.

#4 Write values for AER, DPC, and Hotplug into the VMD scratchpad
registers and read them during boot to configure all the features.
This adds more complexity to the VMD UEFI driver, since it has to read
the global BIOS settings and write them into the scratchpad registers.
The VMD Linux driver also needs to know whether it is running in a VM
and then read the scratchpad register to configure Hotplug, AER, DPC,
etc.

Thanks
nirmal

> 
> Bjorn
> 
> > Even the patch 04b12ef163d1 needs more changes to make sure _OSC
> > settings are passed on from Host BIOS to Guest BIOS which means
> > involvement of ESXi, Windows HyperV, KVM.



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-11-14 21:07                   ` Nirmal Patel
@ 2023-12-06  2:18                     ` Kai-Heng Feng
  2023-12-06 16:30                       ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Kai-Heng Feng @ 2023-12-06  2:18 UTC (permalink / raw)
  To: Nirmal Patel
  Cc: Bjorn Helgaas, linux-pci, orden.e.smith, samruddh.dhope,
	Rafael J. Wysocki

Hi Nirmal,

On Wed, Nov 15, 2023 at 5:00 AM Nirmal Patel
<nirmal.patel@linux.intel.com> wrote:
>
> On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> > On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org>
> > wrote:
> > > [+cc Rafael, just FYI re 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
> > > on PCIe features")]
> > >
> > > On Tue, Nov 07, 2023 at 02:50:57PM -0700, Nirmal Patel wrote:
> > > > On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> > > > > On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > > > > > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > > > > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > > > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel
> > > > > > > > wrote:
> > > > > > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal
> > > > > > > > > > Patel
> > > > > > > > > > wrote:
> > > > > > > > > > > VMD Hotplug should be enabled or disabled based on
> > > > > > > > > > > VMD
> > > > > > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > > > > > is_hotplug_bridge
> > > > > > > > > > > is set on each VMD rootport based on Hotplug
> > > > > > > > > > > capable bit
> > > > > > > > > > > in
> > > > > > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and
> > > > > > > > > > > enable or
> > > > > > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > > > > >
> > > > > > > > > > > Currently VMD driver copies ACPI settings or
> > > > > > > > > > > platform
> > > > > > > > > > > configurations for Hotplug, AER, DPC, PM, etc and
> > > > > > > > > > > enables
> > > > > > > > > > > or
> > > > > > > > > > > disables these features on VMD bridge which is not
> > > > > > > > > > > correct
> > > > > > > > > > > in case of Hotplug.
> > > > > > > > > >
> > > > > > > > > > This needs some background about why it's correct to
> > > > > > > > > > copy
> > > > > > > > > > the
> > > > > > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > > > > > incorrect
> > > > > > > > > > for hotplug.
> > > > > > > > > >
> > > > > > > > > > > Also during the Guest boot up, ACPI settings along
> > > > > > > > > > > with
> > > > > > > > > > > VMD
> > > > > > > > > > > UEFI driver are not present in Guest BIOS which
> > > > > > > > > > > results
> > > > > > > > > > > in
> > > > > > > > > > > assigning default values to Hotplug, AER, DPC, etc.
> > > > > > > > > > > As a
> > > > > > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > > > > >
> > > > > > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > > > > > properly
> > > > > > > > > > > in Host as well as in VM.
> > > > > > > > > >
> > > > > > > > > > Did we come to some consensus about how or whether
> > > > > > > > > > _OSC for
> > > > > > > > > > the host bridge above the VMD device should apply to
> > > > > > > > > > devices
> > > > > > > > > > in the separate domain below the VMD?
> > > > > > > > >
> > > > > > > > > We are not able to come to any consensus. Someone
> > > > > > > > > suggested
> > > > > > > > > to
> > > > > > > > > copy either all _OSC flags or none. But logic behind
> > > > > > > > > that
> > > > > > > > > assumption is that the VMD is a bridge device which is
> > > > > > > > > not
> > > > > > > > > completely true. VMD is an endpoint device and it owns
> > > > > > > > > its
> > > > > > > > > domain.
> > > > > > > >
> > > > > > > > Do you want to facilitate a discussion in the PCI
> > > > > > > > firmware SIG
> > > > > > > > about this?  It seems like we may want a little text in
> > > > > > > > the
> > > > > > > > spec
> > > > > > > > about how to handle this situation so platforms and OSes
> > > > > > > > have
> > > > > > > > the
> > > > > > > > same expectations.
> > > > > > >
> > > > > > > The patch 04b12ef163d1 broke intel VMD's hotplug
> > > > > > > capabilities and
> > > > > > > author did not test in VM environment impact.
> > > > > > > We can resolve the issue easily by
> > > > > > >
> > > > > > > #1 Revert the patch which means restoring VMD's original
> > > > > > > functionality
> > > > > > > and author provide better fix.
> > > > > > >
> > > > > > > or
> > > > > > >
> > > > > > > #2 Allow the current change to re-enable VMD hotplug inside
> > > > > > > VMD
> > > > > > > driver.
> > > > > > >
> > > > > > > There is a significant impact for our customers hotplug use
> > > > > > > cases
> > > > > > > which
> > > > > > > forces us to apply the fix in out-of-box drivers for
> > > > > > > different
> > > > > > > OSs.
> > > > > >
> > > > > > I agree 100% that there's a serious problem here and we need
> > > > > > to fix
> > > > > > it, there's no argument there.
> > > > > >
> > > > > > I guess you're saying it's obvious that an _OSC above VMD
> > > > > > does not
> > > > > > apply to devices below VMD, and therefore, no PCI Firmware
> > > > > > SIG
> > > > > > discussion or spec clarification is needed?
> > > > >
> > > > > Yes. By design VMD is an endpoint device to OS and its domain
> > > > > is
> > > > > privately owned by VMD only. I believe we should revert back to
> > > > > original design and not impose _OSC settings on VMD domain
> > > > > which is
> > > > > also a maintainable solution.
> > > >
> > > > I will send out revert patch. The _OSC settings shouldn't apply
> > > > to private VMD domain.
> > >
> > > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor ACPI
> > > _OSC
> > > on PCIe features").  That appeared in v5.17, and it fixed (or at
> > > least
> > > prevented) an AER message flood.  We can't simply revert
> > > 04b12ef163d1
> > > unless we first prevent that AER message flood in another way.
> >
> > The error is "correctable".
> > Does masking all correctable AER error by default make any sense? And
> > add a sysfs knob to make it optional.
> I assume sysfs knob requires driver reload. right? Can you send a
> patch?

What I mean is to mask Correctable Errors by default on *all*
rootports, and create a new sysfs knob to let the user decide whether
Correctable Errors should be unmasked.

I can send a patch, but of course I'd like to hear what Bjorn thinks
about this approach.

Kai-Heng

>
> nirmal
> >
> > Kai-Heng
> >
> > > Bjorn
> > >
> > > > Even the patch 04b12ef163d1 needs more changes to make sure _OSC
> > > > settings are passed on from Host BIOS to Guest BIOS which means
> > > > involvement of ESXi, Windows HyperV, KVM.
>


* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-12-06  2:18                     ` Kai-Heng Feng
@ 2023-12-06 16:30                       ` Bjorn Helgaas
  2023-12-11 23:19                         ` Nirmal Patel
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2023-12-06 16:30 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Nirmal Patel, linux-pci, orden.e.smith, samruddh.dhope,
	Rafael J. Wysocki, Grant Grundler, Rajat Khandelwal, Rajat Jain

[+cc Grant, Rajat, Rajat]

On Wed, Dec 06, 2023 at 10:18:56AM +0800, Kai-Heng Feng wrote:
> On Wed, Nov 15, 2023 at 5:00 AM Nirmal Patel <nirmal.patel@linux.intel.com> wrote:
> > On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> > > On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> ...

> > > > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor
> > > > ACPI _OSC on PCIe features").  That appeared in v5.17, and it
> > > > fixed (or at least prevented) an AER message flood.  We can't
> > > > simply revert 04b12ef163d1 unless we first prevent that AER
> > > > message flood in another way.
> > >
> > > The error is "correctable".  Does masking all correctable AER
> > > error by default make any sense? And add a sysfs knob to make it
> > > optional.
> >
> > I assume sysfs knob requires driver reload. right? Can you send a
> > patch?
> 
> What I mean is to mask Correctable Errors by default on *all*
> rootports, and create a new sysfs knob to let user decide if
> Correctable Errors should be unmasked.

I don't think we should mask Correctable Errors by default.  Even
though they've been corrected by hardware and no software action is
required, I think these errors are valuable signals about Link
integrity.

I think rate-limiting and/or reporting on the *frequency* of
Correctable Errors would make a lot of sense.  We had some work toward
this recently, but it hasn't quite gotten finished yet.

The most recent work I'm aware of is this:
https://lore.kernel.org/r/20230606035442.2886343-1-grundler@chromium.org

Bjorn


* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-12-06 16:30                       ` Bjorn Helgaas
@ 2023-12-11 23:19                         ` Nirmal Patel
  2023-12-12  3:20                           ` Kai-Heng Feng
  0 siblings, 1 reply; 19+ messages in thread
From: Nirmal Patel @ 2023-12-11 23:19 UTC (permalink / raw)
  To: Bjorn Helgaas, Kai-Heng Feng
  Cc: linux-pci, orden.e.smith, samruddh.dhope, Rafael J. Wysocki,
	Grant Grundler, Rajat Khandelwal, Rajat Jain

On Wed, 2023-12-06 at 10:30 -0600, Bjorn Helgaas wrote:
> [+cc Grant, Rajat, Rajat]
> 
> On Wed, Dec 06, 2023 at 10:18:56AM +0800, Kai-Heng Feng wrote:
> > On Wed, Nov 15, 2023 at 5:00 AM Nirmal Patel <
> > nirmal.patel@linux.intel.com> wrote:
> > > On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> > > > On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <
> > > > helgaas@kernel.org> wrote:
> > ...
> > > > > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor
> > > > > ACPI _OSC on PCIe features").  That appeared in v5.17, and it
> > > > > fixed (or at least prevented) an AER message flood.  We can't
> > > > > simply revert 04b12ef163d1 unless we first prevent that AER
> > > > > message flood in another way.
> > > > 
> > > > The error is "correctable".  Does masking all correctable AER
> > > > error by default make any sense? And add a sysfs knob to make
> > > > it
> > > > optional.
> > > 
> > > I assume sysfs knob requires driver reload. right? Can you send a
> > > patch?
> > 
> > What I mean is to mask Correctable Errors by default on *all*
> > rootports, and create a new sysfs knob to let user decide if
> > Correctable Errors should be unmasked.
> 
> I don't think we should mask Correctable Errors by default.  Even
> though they've been corrected by hardware and no software action is
> required, I think these errors are valuable signals about Link
> integrity.
> 
> I think rate-limiting and/or reporting on the *frequency* of
> Correctable Errors would make a lot of sense.  We had some work
> toward
> this recently, but it hasn't quite gotten finished yet.
> 
> The most recent work I'm aware of is this:
> https://lore.kernel.org/r/20230606035442.2886343-1-grundler@chromium.org

Hi Kai-Heng, Bjorn,

I believe rate limiting alone will not fix the issue; it would only
act as a workaround. Without 04b12ef163d1, the VMD driver is not aware
of the OS native AER support setting, so we would see the AER flooding
issue, which is a bug in the VMD driver since it always enables AER.

There is a setting in BIOS that allows us to enable OS native AER
support on the platform. This setting is located in EDK Menu ->
Platform configuration -> system event log -> IIO error enabling -> OS
native AER support. I have verified that the above BIOS setting alters
the native AER flag of _OSC. We can also verify it on Kai-Heng's
system.

I believe that, instead of going in the rate-limit direction, the VMD
driver should honor the OS native AER support setting.

Do you have any suggestion on this?

nirmal
> 
> Bjorn



* Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports
  2023-12-11 23:19                         ` Nirmal Patel
@ 2023-12-12  3:20                           ` Kai-Heng Feng
  0 siblings, 0 replies; 19+ messages in thread
From: Kai-Heng Feng @ 2023-12-12  3:20 UTC (permalink / raw)
  To: Nirmal Patel
  Cc: Bjorn Helgaas, linux-pci, orden.e.smith, samruddh.dhope,
	Rafael J. Wysocki, Grant Grundler, Rajat Khandelwal, Rajat Jain

Hi Nirmal,

On Tue, Dec 12, 2023 at 7:13 AM Nirmal Patel
<nirmal.patel@linux.intel.com> wrote:
>
> On Wed, 2023-12-06 at 10:30 -0600, Bjorn Helgaas wrote:
> > [+cc Grant, Rajat, Rajat]
> >
> > On Wed, Dec 06, 2023 at 10:18:56AM +0800, Kai-Heng Feng wrote:
> > > On Wed, Nov 15, 2023 at 5:00 AM Nirmal Patel <
> > > nirmal.patel@linux.intel.com> wrote:
> > > > On Wed, 2023-11-08 at 16:49 +0200, Kai-Heng Feng wrote:
> > > > > On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <
> > > > > helgaas@kernel.org> wrote:
> > > ...
> > > > > > I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor
> > > > > > ACPI _OSC on PCIe features").  That appeared in v5.17, and it
> > > > > > fixed (or at least prevented) an AER message flood.  We can't
> > > > > > simply revert 04b12ef163d1 unless we first prevent that AER
> > > > > > message flood in another way.
> > > > >
> > > > > The error is "correctable".  Does masking all correctable AER
> > > > > error by default make any sense? And add a sysfs knob to make
> > > > > it
> > > > > optional.
> > > >
> > > > I assume sysfs knob requires driver reload. right? Can you send a
> > > > patch?
> > >
> > > What I mean is to mask Correctable Errors by default on *all*
> > > rootports, and create a new sysfs knob to let user decide if
> > > Correctable Errors should be unmasked.
> >
> > I don't think we should mask Correctable Errors by default.  Even
> > though they've been corrected by hardware and no software action is
> > required, I think these errors are valuable signals about Link
> > integrity.
> >
> > I think rate-limiting and/or reporting on the *frequency* of
> > Correctable Errors would make a lot of sense.  We had some work
> > toward
> > this recently, but it hasn't quite gotten finished yet.
> >
> > The most recent work I'm aware of is this:
> > https://lore.kernel.org/r/20230606035442.2886343-1-grundler@chromium.org
>
> Hi Kai-Heng, Bjorn,
>
> I believe the rate limit will not alone fix the issue rather will act
> as a work around. Without 04b12ef163d1, the VMD driver is not aware of
> OS native AER support setting, then we will see AER flooding issue
> which is a bug in VMD driver since it will always enable the AER.

Agree. Rate limiting doesn't stop the AER interrupts, so it won't
flood the kernel log, but it will still hog CPU time.

Kai-Heng

> There is a setting in BIOS that allows us to enable OS native AER
> support on the platform. This setting is located in EDK Menu ->
> Platform configuration -> system event log -> IIO error enabling -> OS
> native AER support. I have verified that the above BIOS setting alters
> the native AER flag of _OSC. We can also verify it on Kai-Heng's
> system.
>
> I believe instead of going in the direction of rate limit, vmd driver
> should honor OS native AER support setting.
>
> Do you have any suggestion on this?
>
> nirmal
> >
> > Bjorn
>


end of thread, other threads:[~2023-12-12  3:21 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-30 20:16 [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports Nirmal Patel
2023-10-31 15:31 ` Bjorn Helgaas
2023-10-31 19:59   ` Nirmal Patel
2023-10-31 23:26     ` Nirmal Patel
2023-11-01 22:20     ` Bjorn Helgaas
2023-11-02 20:07       ` Nirmal Patel
2023-11-02 20:41         ` Bjorn Helgaas
2023-11-02 23:49           ` Nirmal Patel
2023-11-07 21:50             ` Nirmal Patel
2023-11-07 22:30               ` Bjorn Helgaas
2023-11-08 14:49                 ` Kai-Heng Feng
2023-11-08 19:44                   ` Nirmal Patel
2023-11-14 21:07                   ` Nirmal Patel
2023-12-06  2:18                     ` Kai-Heng Feng
2023-12-06 16:30                       ` Bjorn Helgaas
2023-12-11 23:19                         ` Nirmal Patel
2023-12-12  3:20                           ` Kai-Heng Feng
2023-11-14 23:29                 ` Nirmal Patel
2023-10-31 20:11   ` Nirmal Patel
