linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5] PCI: vmd: Clean up domain before enumeration
@ 2021-11-16 22:11 Nirmal Patel
  2021-11-19 20:48 ` Patel, Nirmal
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Nirmal Patel @ 2021-11-16 22:11 UTC (permalink / raw)
  To: Nirmal Patel, linux-pci, jonathan.derrick

During VT-d pass-through, the VMD driver occasionally fails to
enumerate underlying NVMe devices when repetitive reboots are
performed in the guest OS. The issue can be resolved by resetting
VMD root ports for proper enumeration and triggering secondary bus
reset which will also propagate reset through downstream bridges.

Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
---
---
v4->v5: Fixing small nitpick fix.
v3->v4: Using pci_reset_bus function for secondary bus reset instead of
        manually triggering secondary bus reset, addressing review
        comments of v3.
v2->v3: Combining two functions into one, Remove redundant definations
        and Formatting fixes

 drivers/pci/controller/vmd.c | 37 ++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index a5987e52700e..a905fce6232f 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -498,6 +498,40 @@ static inline void vmd_acpi_begin(void) { }
 static inline void vmd_acpi_end(void) { }
 #endif /* CONFIG_ACPI */
 
+static void vmd_domain_reset(struct vmd_dev *vmd)
+{
+	u16 bus, max_buses = resource_size(&vmd->resources[0]);
+	u8 dev, functions, fn, hdr_type;
+	char __iomem *base;
+
+	for (bus = 0; bus < max_buses; bus++) {
+		for (dev = 0; dev < 32; dev++) {
+			base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
+						PCI_DEVFN(dev, 0), 0);
+
+			hdr_type = readb(base + PCI_HEADER_TYPE) &
+					 PCI_HEADER_TYPE_MASK;
+
+			functions = (hdr_type & 0x80) ? 8 : 1;
+			for (fn = 0; fn < functions; fn++) {
+				base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
+						PCI_DEVFN(dev, fn), 0);
+
+				hdr_type = readb(base + PCI_HEADER_TYPE) &
+						PCI_HEADER_TYPE_MASK;
+
+				if (hdr_type != PCI_HEADER_TYPE_BRIDGE ||
+				    (readw(base + PCI_CLASS_DEVICE) !=
+				     PCI_CLASS_BRIDGE_PCI))
+					continue;
+
+				memset_io(base + PCI_IO_BASE, 0,
+					  PCI_ROM_ADDRESS1 - PCI_IO_BASE);
+			}
+		}
+	}
+}
+
 static void vmd_attach_resources(struct vmd_dev *vmd)
 {
 	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
@@ -801,6 +835,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 	vmd_acpi_begin();
 
 	pci_scan_child_bus(vmd->bus);
+	vmd_domain_reset(vmd);
+	list_for_each_entry(child, &vmd->bus->children, node)
+		pci_reset_bus(child->self);
 	pci_assign_unassigned_bus_resources(vmd->bus);
 
 	/*
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v5] PCI: vmd: Clean up domain before enumeration
  2021-11-16 22:11 [PATCH v5] PCI: vmd: Clean up domain before enumeration Nirmal Patel
@ 2021-11-19 20:48 ` Patel, Nirmal
  2021-11-19 20:55   ` Bjorn Helgaas
  2021-12-01 12:01 ` Lorenzo Pieralisi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Patel, Nirmal @ 2021-11-19 20:48 UTC (permalink / raw)
  To: lorenzo.pieralisi, Bjorn Helgaas
  Cc: jonathan.derrick, linux-pci, Krzysztof Wilczyński

On 11/16/2021 3:11 PM, Nirmal Patel wrote:
> During VT-d pass-through, the VMD driver occasionally fails to
> enumerate underlying NVMe devices when repetitive reboots are
> performed in the guest OS. The issue can be resolved by resetting
> VMD root ports for proper enumeration and triggering secondary bus
> reset which will also propagate reset through downstream bridges.
>
> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
> ---
> ---
> v4->v5: Fixing small nitpick fix.
> v3->v4: Using pci_reset_bus function for secondary bus reset instead of
>         manually triggering secondary bus reset, addressing review
>         comments of v3.
> v2->v3: Combining two functions into one, Remove redundant definations
>         and Formatting fixes
>
>  drivers/pci/controller/vmd.c | 37 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index a5987e52700e..a905fce6232f 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -498,6 +498,40 @@ static inline void vmd_acpi_begin(void) { }
>  static inline void vmd_acpi_end(void) { }
>  #endif /* CONFIG_ACPI */
>  
> +static void vmd_domain_reset(struct vmd_dev *vmd)
> +{
> +	u16 bus, max_buses = resource_size(&vmd->resources[0]);
> +	u8 dev, functions, fn, hdr_type;
> +	char __iomem *base;
> +
> +	for (bus = 0; bus < max_buses; bus++) {
> +		for (dev = 0; dev < 32; dev++) {
> +			base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> +						PCI_DEVFN(dev, 0), 0);
> +
> +			hdr_type = readb(base + PCI_HEADER_TYPE) &
> +					 PCI_HEADER_TYPE_MASK;
> +
> +			functions = (hdr_type & 0x80) ? 8 : 1;
> +			for (fn = 0; fn < functions; fn++) {
> +				base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> +						PCI_DEVFN(dev, fn), 0);
> +
> +				hdr_type = readb(base + PCI_HEADER_TYPE) &
> +						PCI_HEADER_TYPE_MASK;
> +
> +				if (hdr_type != PCI_HEADER_TYPE_BRIDGE ||
> +				    (readw(base + PCI_CLASS_DEVICE) !=
> +				     PCI_CLASS_BRIDGE_PCI))
> +					continue;
> +
> +				memset_io(base + PCI_IO_BASE, 0,
> +					  PCI_ROM_ADDRESS1 - PCI_IO_BASE);
> +			}
> +		}
> +	}
> +}
> +
>  static void vmd_attach_resources(struct vmd_dev *vmd)
>  {
>  	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
> @@ -801,6 +835,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  	vmd_acpi_begin();
>  
>  	pci_scan_child_bus(vmd->bus);
> +	vmd_domain_reset(vmd);
> +	list_for_each_entry(child, &vmd->bus->children, node)
> +		pci_reset_bus(child->self);
>  	pci_assign_unassigned_bus_resources(vmd->bus);
>  
>  	/*

Hi

Gentle ping. Please let me know if you are okay with these changes (with Jon's Reviewed-by). Thanks.

-nirmal


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5] PCI: vmd: Clean up domain before enumeration
  2021-11-19 20:48 ` Patel, Nirmal
@ 2021-11-19 20:55   ` Bjorn Helgaas
  0 siblings, 0 replies; 8+ messages in thread
From: Bjorn Helgaas @ 2021-11-19 20:55 UTC (permalink / raw)
  To: Patel, Nirmal
  Cc: lorenzo.pieralisi, jonathan.derrick, linux-pci,
	Krzysztof Wilczyński

On Fri, Nov 19, 2021 at 01:48:59PM -0700, Patel, Nirmal wrote:
> On 11/16/2021 3:11 PM, Nirmal Patel wrote:
> > During VT-d pass-through, the VMD driver occasionally fails to
> > enumerate underlying NVMe devices when repetitive reboots are
> > performed in the guest OS. The issue can be resolved by resetting
> > VMD root ports for proper enumeration and triggering secondary bus
> > reset which will also propagate reset through downstream bridges.
> >
> > Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> > Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
> > ---
> > ---
> > v4->v5: Fixing small nitpick fix.
> > v3->v4: Using pci_reset_bus function for secondary bus reset instead of
> >         manually triggering secondary bus reset, addressing review
> >         comments of v3.
> > v2->v3: Combining two functions into one, Remove redundant definations
> >         and Formatting fixes
> >
> >  drivers/pci/controller/vmd.c | 37 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > index a5987e52700e..a905fce6232f 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -498,6 +498,40 @@ static inline void vmd_acpi_begin(void) { }
> >  static inline void vmd_acpi_end(void) { }
> >  #endif /* CONFIG_ACPI */
> >  
> > +static void vmd_domain_reset(struct vmd_dev *vmd)
> > +{
> > +	u16 bus, max_buses = resource_size(&vmd->resources[0]);
> > +	u8 dev, functions, fn, hdr_type;
> > +	char __iomem *base;
> > +
> > +	for (bus = 0; bus < max_buses; bus++) {
> > +		for (dev = 0; dev < 32; dev++) {
> > +			base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> > +						PCI_DEVFN(dev, 0), 0);
> > +
> > +			hdr_type = readb(base + PCI_HEADER_TYPE) &
> > +					 PCI_HEADER_TYPE_MASK;
> > +
> > +			functions = (hdr_type & 0x80) ? 8 : 1;
> > +			for (fn = 0; fn < functions; fn++) {
> > +				base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> > +						PCI_DEVFN(dev, fn), 0);
> > +
> > +				hdr_type = readb(base + PCI_HEADER_TYPE) &
> > +						PCI_HEADER_TYPE_MASK;
> > +
> > +				if (hdr_type != PCI_HEADER_TYPE_BRIDGE ||
> > +				    (readw(base + PCI_CLASS_DEVICE) !=
> > +				     PCI_CLASS_BRIDGE_PCI))
> > +					continue;
> > +
> > +				memset_io(base + PCI_IO_BASE, 0,
> > +					  PCI_ROM_ADDRESS1 - PCI_IO_BASE);
> > +			}
> > +		}
> > +	}
> > +}
> > +
> >  static void vmd_attach_resources(struct vmd_dev *vmd)
> >  {
> >  	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
> > @@ -801,6 +835,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> >  	vmd_acpi_begin();
> >  
> >  	pci_scan_child_bus(vmd->bus);
> > +	vmd_domain_reset(vmd);
> > +	list_for_each_entry(child, &vmd->bus->children, node)
> > +		pci_reset_bus(child->self);
> >  	pci_assign_unassigned_bus_resources(vmd->bus);
> >  
> >  	/*
> 
> Hi
> 
> Gentle ping. Please let me know if you are okay with these changes
> (with Jon's Reviewed-by). Thanks.

This got assigned to me by mistake, so I just reassigned it to Lorenzo
in patchwork.  There's no hurry, we're only at v5.16-rc1.  It doesn't
appear to be a candidate for v5.16, and there's always "cc: stable" if
appropriate.

Bjorn

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5] PCI: vmd: Clean up domain before enumeration
  2021-11-16 22:11 [PATCH v5] PCI: vmd: Clean up domain before enumeration Nirmal Patel
  2021-11-19 20:48 ` Patel, Nirmal
@ 2021-12-01 12:01 ` Lorenzo Pieralisi
  2021-12-02 16:38 ` Patel, Nirmal
  2021-12-03 18:18 ` Bjorn Helgaas
  3 siblings, 0 replies; 8+ messages in thread
From: Lorenzo Pieralisi @ 2021-12-01 12:01 UTC (permalink / raw)
  To: Nirmal Patel, linux-pci, jonathan.derrick; +Cc: Lorenzo Pieralisi

On Tue, 16 Nov 2021 15:11:36 -0700, Nirmal Patel wrote:
> During VT-d pass-through, the VMD driver occasionally fails to
> enumerate underlying NVMe devices when repetitive reboots are
> performed in the guest OS. The issue can be resolved by resetting
> VMD root ports for proper enumeration and triggering secondary bus
> reset which will also propagate reset through downstream bridges.
> 
> 
> [...]

Applied to pci/vmd, thanks!

[1/1] PCI: vmd: Clean up domain before enumeration
      https://git.kernel.org/lpieralisi/pci/c/6aab562229

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5] PCI: vmd: Clean up domain before enumeration
  2021-11-16 22:11 [PATCH v5] PCI: vmd: Clean up domain before enumeration Nirmal Patel
  2021-11-19 20:48 ` Patel, Nirmal
  2021-12-01 12:01 ` Lorenzo Pieralisi
@ 2021-12-02 16:38 ` Patel, Nirmal
  2021-12-02 17:32   ` Keith Busch
  2021-12-03 18:18 ` Bjorn Helgaas
  3 siblings, 1 reply; 8+ messages in thread
From: Patel, Nirmal @ 2021-12-02 16:38 UTC (permalink / raw)
  To: lorenzo.pieralisi, Krzysztof Wilczyński; +Cc: jonathan.derrick, linux-pci

On 11/16/2021 3:11 PM, Nirmal Patel wrote:
> During VT-d pass-through, the VMD driver occasionally fails to
> enumerate underlying NVMe devices when repetitive reboots are
> performed in the guest OS. The issue can be resolved by resetting
> VMD root ports for proper enumeration and triggering secondary bus
> reset which will also propagate reset through downstream bridges.
>
> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
> ---
> ---
> v4->v5: Fixing small nitpick fix.
> v3->v4: Using pci_reset_bus function for secondary bus reset instead of
>         manually triggering secondary bus reset, addressing review
>         comments of v3.
> v2->v3: Combining two functions into one, Remove redundant definations
>         and Formatting fixes
>
>  drivers/pci/controller/vmd.c | 37 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index a5987e52700e..a905fce6232f 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -498,6 +498,40 @@ static inline void vmd_acpi_begin(void) { }
>  static inline void vmd_acpi_end(void) { }
>  #endif /* CONFIG_ACPI */
>  
> +static void vmd_domain_reset(struct vmd_dev *vmd)
> +{
> +	u16 bus, max_buses = resource_size(&vmd->resources[0]);
> +	u8 dev, functions, fn, hdr_type;
> +	char __iomem *base;
> +
> +	for (bus = 0; bus < max_buses; bus++) {
> +		for (dev = 0; dev < 32; dev++) {
> +			base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> +						PCI_DEVFN(dev, 0), 0);
> +
> +			hdr_type = readb(base + PCI_HEADER_TYPE) &
> +					 PCI_HEADER_TYPE_MASK;
> +
> +			functions = (hdr_type & 0x80) ? 8 : 1;
> +			for (fn = 0; fn < functions; fn++) {
> +				base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> +						PCI_DEVFN(dev, fn), 0);
> +
> +				hdr_type = readb(base + PCI_HEADER_TYPE) &
> +						PCI_HEADER_TYPE_MASK;
> +
> +				if (hdr_type != PCI_HEADER_TYPE_BRIDGE ||
> +				    (readw(base + PCI_CLASS_DEVICE) !=
> +				     PCI_CLASS_BRIDGE_PCI))
> +					continue;
> +
> +				memset_io(base + PCI_IO_BASE, 0,
> +					  PCI_ROM_ADDRESS1 - PCI_IO_BASE);
> +			}
> +		}
> +	}
> +}
> +
>  static void vmd_attach_resources(struct vmd_dev *vmd)
>  {
>  	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
> @@ -801,6 +835,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  	vmd_acpi_begin();
>  
>  	pci_scan_child_bus(vmd->bus);
> +	vmd_domain_reset(vmd);
> +	list_for_each_entry(child, &vmd->bus->children, node)
> +		pci_reset_bus(child->self);
>  	pci_assign_unassigned_bus_resources(vmd->bus);
>  
>  	/*

Hi

Gentle ping. Please let me know if you are okay with these changes (with Jon's Reviewed-by). Thanks.

-nirmal


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5] PCI: vmd: Clean up domain before enumeration
  2021-12-02 16:38 ` Patel, Nirmal
@ 2021-12-02 17:32   ` Keith Busch
  0 siblings, 0 replies; 8+ messages in thread
From: Keith Busch @ 2021-12-02 17:32 UTC (permalink / raw)
  To: Patel, Nirmal
  Cc: lorenzo.pieralisi, Krzysztof Wilczyński, jonathan.derrick,
	linux-pci

On Thu, Dec 02, 2021 at 09:38:26AM -0700, Patel, Nirmal wrote:
> Gentle ping. Please let me know if you are okay with these changes (with Jon's Reviewed-by). Thanks.

Your patch is already staged for the next merge window here:

  https://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/pci.git/commit/?h=pci/vmd&id=6aab5622296b990024ee67dd7efa7d143e7558d0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5] PCI: vmd: Clean up domain before enumeration
  2021-11-16 22:11 [PATCH v5] PCI: vmd: Clean up domain before enumeration Nirmal Patel
                   ` (2 preceding siblings ...)
  2021-12-02 16:38 ` Patel, Nirmal
@ 2021-12-03 18:18 ` Bjorn Helgaas
  2021-12-04  2:34   ` Patel, Nirmal
  3 siblings, 1 reply; 8+ messages in thread
From: Bjorn Helgaas @ 2021-12-03 18:18 UTC (permalink / raw)
  To: Nirmal Patel; +Cc: linux-pci, jonathan.derrick, Lorenzo Pieralisi

[+cc Lorenzo]

On Tue, Nov 16, 2021 at 03:11:36PM -0700, Nirmal Patel wrote:
> During VT-d pass-through, the VMD driver occasionally fails to
> enumerate underlying NVMe devices when repetitive reboots are
> performed in the guest OS. The issue can be resolved by resetting
> VMD root ports for proper enumeration and triggering secondary bus
> reset which will also propagate reset through downstream bridges.

Hmm.  Does not say what the root cause is, just that it can be "fixed"
by a reset, but OK.

> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
> Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
> ---
> ---
> v4->v5: Fixing small nitpick fix.
> v3->v4: Using pci_reset_bus function for secondary bus reset instead of
>         manually triggering secondary bus reset, addressing review
>         comments of v3.
> v2->v3: Combining two functions into one, Remove redundant definations
>         and Formatting fixes
> 
>  drivers/pci/controller/vmd.c | 37 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index a5987e52700e..a905fce6232f 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -498,6 +498,40 @@ static inline void vmd_acpi_begin(void) { }
>  static inline void vmd_acpi_end(void) { }
>  #endif /* CONFIG_ACPI */
>  
> +static void vmd_domain_reset(struct vmd_dev *vmd)
> +{
> +	u16 bus, max_buses = resource_size(&vmd->resources[0]);
> +	u8 dev, functions, fn, hdr_type;
> +	char __iomem *base;

I don't understand what's going on here.

> +	for (bus = 0; bus < max_buses; bus++) {
> +		for (dev = 0; dev < 32; dev++) {
> +			base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> +						PCI_DEVFN(dev, 0), 0);
> +
> +			hdr_type = readb(base + PCI_HEADER_TYPE) &
> +					 PCI_HEADER_TYPE_MASK;
> +			functions = (hdr_type & 0x80) ? 8 : 1;

This look like an open-coded version of pci_hdr_type() for every
possible device on every possible bus below the VMD, regardless of
whether that device actually exists.  So most of these reads will
result in Unsupported Request errors and 0xff data returns, which you
interpret as a multi-function device.

> +			for (fn = 0; fn < functions; fn++) {
> +				base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
> +						PCI_DEVFN(dev, fn), 0);
> +
> +				hdr_type = readb(base + PCI_HEADER_TYPE) &
> +						PCI_HEADER_TYPE_MASK;

So you open-code pci_hdr_type() again, for lots of functions that
don't exist.

> +				if (hdr_type != PCI_HEADER_TYPE_BRIDGE ||
> +				    (readw(base + PCI_CLASS_DEVICE) !=
> +				     PCI_CLASS_BRIDGE_PCI))
> +					continue;

This at least will skip the rest for functions that don't exist, since
hdr_type will be 0x7f (not 0x01, PCI_HEADER_TYPE_BRIDGE).

> +
> +				memset_io(base + PCI_IO_BASE, 0,
> +					  PCI_ROM_ADDRESS1 - PCI_IO_BASE);

And here you clear the base & limit registers of the IO and MEM
windows of each bridge, again with basically a hand-written special
case of pci_write_config_*().

This looks like you're trying to disable the windows.  AFAICT, the
spec doesn't say what happens to the window base/limit registers after
reset, but it does say that reset clears the I/O Space Enable and
Memory Space Enable in the Command Register.  For bridges, that will
disable the windows.

Writing zero to both base & limit does not disable the windows; it
just sets them to [io 0x0000-0x0fff], [mem 0x00000000-0x000fffff],
etc.

So I'm not really convinced that this function is necessary at all,
given that you do a secondary bus reset on every VMD root port right
afterwards.

> +			}
> +		}
> +	}
> +}
> +
>  static void vmd_attach_resources(struct vmd_dev *vmd)
>  {
>  	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
> @@ -801,6 +835,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  	vmd_acpi_begin();
>  
>  	pci_scan_child_bus(vmd->bus);
> +	vmd_domain_reset(vmd);
> +	list_for_each_entry(child, &vmd->bus->children, node)
> +		pci_reset_bus(child->self);
>  	pci_assign_unassigned_bus_resources(vmd->bus);
>  
>  	/*
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5] PCI: vmd: Clean up domain before enumeration
  2021-12-03 18:18 ` Bjorn Helgaas
@ 2021-12-04  2:34   ` Patel, Nirmal
  0 siblings, 0 replies; 8+ messages in thread
From: Patel, Nirmal @ 2021-12-04  2:34 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: linux-pci, jonathan.derrick, Lorenzo Pieralisi

On 12/3/2021 11:18 AM, Bjorn Helgaas wrote:
> [+cc Lorenzo]
>
> On Tue, Nov 16, 2021 at 03:11:36PM -0700, Nirmal Patel wrote:
>> During VT-d pass-through, the VMD driver occasionally fails to
>> enumerate underlying NVMe devices when repetitive reboots are
>> performed in the guest OS. The issue can be resolved by resetting
>> VMD root ports for proper enumeration and triggering secondary bus
>> reset which will also propagate reset through downstream bridges.
> Hmm.  Does not say what the root cause is, just that it can be "fixed"
> by a reset, but OK.
>
>> Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
>> Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
>> ---
>> ---
>> v4->v5: Fixing small nitpick fix.
>> v3->v4: Using pci_reset_bus function for secondary bus reset instead of
>>         manually triggering secondary bus reset, addressing review
>>         comments of v3.
>> v2->v3: Combining two functions into one, Remove redundant definations
>>         and Formatting fixes
>>
>>  drivers/pci/controller/vmd.c | 37 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 37 insertions(+)
>>
>> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
>> index a5987e52700e..a905fce6232f 100644
>> --- a/drivers/pci/controller/vmd.c
>> +++ b/drivers/pci/controller/vmd.c
>> @@ -498,6 +498,40 @@ static inline void vmd_acpi_begin(void) { }
>>  static inline void vmd_acpi_end(void) { }
>>  #endif /* CONFIG_ACPI */
>>  
>> +static void vmd_domain_reset(struct vmd_dev *vmd)
>> +{
>> +	u16 bus, max_buses = resource_size(&vmd->resources[0]);
>> +	u8 dev, functions, fn, hdr_type;
>> +	char __iomem *base;
> I don't understand what's going on here.
>
>> +	for (bus = 0; bus < max_buses; bus++) {
>> +		for (dev = 0; dev < 32; dev++) {
>> +			base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
>> +						PCI_DEVFN(dev, 0), 0);
>> +
>> +			hdr_type = readb(base + PCI_HEADER_TYPE) &
>> +					 PCI_HEADER_TYPE_MASK;
>> +			functions = (hdr_type & 0x80) ? 8 : 1;
> This look like an open-coded version of pci_hdr_type() for every
> possible device on every possible bus below the VMD, regardless of
> whether that device actually exists.  So most of these reads will
> result in Unsupported Request errors and 0xff data returns, which you
> interpret as a multi-function device.
>
>> +			for (fn = 0; fn < functions; fn++) {
>> +				base = vmd->cfgbar + PCIE_ECAM_OFFSET(bus,
>> +						PCI_DEVFN(dev, fn), 0);
>> +
>> +				hdr_type = readb(base + PCI_HEADER_TYPE) &
>> +						PCI_HEADER_TYPE_MASK;
> So you open-code pci_hdr_type() again, for lots of functions that
> don't exist.
>
>> +				if (hdr_type != PCI_HEADER_TYPE_BRIDGE ||
>> +				    (readw(base + PCI_CLASS_DEVICE) !=
>> +				     PCI_CLASS_BRIDGE_PCI))
>> +					continue;
> This at least will skip the rest for functions that don't exist, since
> hdr_type will be 0x7f (not 0x01, PCI_HEADER_TYPE_BRIDGE).
>
>> +
>> +				memset_io(base + PCI_IO_BASE, 0,
>> +					  PCI_ROM_ADDRESS1 - PCI_IO_BASE);
> And here you clear the base & limit registers of the IO and MEM
> windows of each bridge, again with basically a hand-written special
> case of pci_write_config_*().
>
> This looks like you're trying to disable the windows.  AFAICT, the
> spec doesn't say what happens to the window base/limit registers after
> reset, but it does say that reset clears the I/O Space Enable and
> Memory Space Enable in the Command Register.  For bridges, that will
> disable the windows.
>
> Writing zero to both base & limit does not disable the windows; it
> just sets them to [io 0x0000-0x0fff], [mem 0x00000000-0x000fffff],
> etc.
>
> So I'm not really convinced that this function is necessary at all,
> given that you do a secondary bus reset on every VMD root port right
> afterwards.

I am trying to clean up bridge devices under each vmd domain and then run
secondary bus reset. Is propagation of hot reset on downstream ports via
secondary bus reset enough between multiple reboot? I am testing if it
solves our issue. I will let you know the results. Please let me know if you
have more suggestions.

Thanks.

>
>> +			}
>> +		}
>> +	}
>> +}
>> +
>>  static void vmd_attach_resources(struct vmd_dev *vmd)
>>  {
>>  	vmd->dev->resource[VMD_MEMBAR1].child = &vmd->resources[1];
>> @@ -801,6 +835,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>>  	vmd_acpi_begin();
>>  
>>  	pci_scan_child_bus(vmd->bus);
>> +	vmd_domain_reset(vmd);
>> +	list_for_each_entry(child, &vmd->bus->children, node)
>> +		pci_reset_bus(child->self);
>>  	pci_assign_unassigned_bus_resources(vmd->bus);
>>  
>>  	/*
>> -- 
>> 2.27.0
>>


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-12-04  2:35 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-16 22:11 [PATCH v5] PCI: vmd: Clean up domain before enumeration Nirmal Patel
2021-11-19 20:48 ` Patel, Nirmal
2021-11-19 20:55   ` Bjorn Helgaas
2021-12-01 12:01 ` Lorenzo Pieralisi
2021-12-02 16:38 ` Patel, Nirmal
2021-12-02 17:32   ` Keith Busch
2021-12-03 18:18 ` Bjorn Helgaas
2021-12-04  2:34   ` Patel, Nirmal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).