linux-pci.vger.kernel.org archive mirror
* [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset
@ 2022-05-16 17:30 windy.bi.enflame
  2022-05-16 20:28 ` Bjorn Helgaas
  2022-05-17  5:34 ` [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, " kernel test robot
  0 siblings, 2 replies; 17+ messages in thread
From: windy.bi.enflame @ 2022-05-16 17:30 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, windy.bi.enflame

While I do reset test of a PCIe endpoint device on a server, I find that
the EP device always been removed and re-inserted again by hotplug module,
 after secondary bus reset.

After checking I find:
1> "pciehp_reset_slot()" always disable slot's DLLSC interrupt before
   doing reset and restore after reset, to try to filter the hotplug
   event happened during reset.
2> "pci_bridge_secondary_bus_reset()" sleep 1 seconad and "pci_dev_wait()"
   until device ready with "PCIE_RESET_READY_POLL_MS" timeout.
3> There is a PCIe switch between CPU and the EP devicem the topology as:
   CPU <-> Switch <-> EP.
4> While trigger sbr reset at the switch's downstream port, it needs 1.5
   seconds for internal enumeration.

About why 1.5 seconds ready time is not filtered by "pci_dev_wait()" with
"PCIE_RESET_READY_POLL_MS" timeout, I find it is because in
"pci_bridge_secondary_bus_reset()", the function is operating slot's
config space to trigger sbr and also wait slot itself ready by input same
"dev" parameter. Different from other resets like FLR which is triggered
by operating the config space of EP device itself, sbr is triggered by
up slot but need to wait downstream devices' ready, so I think function
"pci_dev_wait()" works for resets like FLR but not for sbr.

In this proposed patch, I'm changing the waiting function used in sbr to
"pci_bridge_secondary_bus_wait()" which will wait all the downstream
hierarchy ready with the same timeout setting "PCIE_RESET_READY_POLL_MS".
In "pci_bridge_secondary_bus_wait()" the "subordinate" and
"subordinate->devices" will be checked firstly, and then downstream
devices' present state.
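
To illustrate the difference between waiting on the bridge and waiting
on the downstream device, here is a minimal userspace model. It is not
kernel code: struct model_dev, model_present() and model_wait() are
names made up for this sketch, and "present" is reduced to a ready
time in milliseconds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a device answers config reads from ready_at_ms on. */
struct model_dev {
	const char *name;
	long ready_at_ms;
};

static bool model_present(const struct model_dev *d, long now_ms)
{
	return now_ms >= d->ready_at_ms;
}

/*
 * Poll "target" with a doubling back-off, starting at start_ms.
 * Returns the time at which the target was seen present, or -1 if the
 * next delay would exceed timeout_ms.
 */
static long model_wait(const struct model_dev *target, long start_ms,
		       long timeout_ms)
{
	long now = start_ms, delay = 1;

	assert(timeout_ms >= 0);
	while (!model_present(target, now)) {
		if (delay > timeout_ms)
			return -1;
		now += delay;		/* stands in for msleep(delay) */
		delay *= 2;
	}
	return now;
}
```

With a bridge that is ready at 0 ms and an EP that is ready at 1500 ms,
polling from 1000 ms (after the fixed 1 second sleep) returns 1000 for
the bridge, i.e. no waiting at all, and 1511 for the EP in this model.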

Signed-off-by: windy.bi.enflame <windy.bi.enflame@gmail.com>
---
 drivers/pci/pci.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 9ecce435fb3f..d7ec3859268b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5002,6 +5002,29 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 	}
 }
 
+int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
+{
+	struct pci_dev *dev;
+	int delay = 1;
+
+	if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
+		return 0;
+
+	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
+		while (!pci_device_is_present(dev)) {
+			if (delay > timeout) {
+				pci_warn(dev, "secondary bus not ready after %dms\n", delay);
+				return -ENOTTY;
+			}
+
+			msleep(delay);
+			delay *= 2;
+		}
+	}
+
+	return 0;
+}
+
 void pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	u16 ctrl;
@@ -5045,7 +5068,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
 {
 	pcibios_reset_secondary_bus(dev);
 
-	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
+	return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
 }
 EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
 
-- 
2.36.1



* Re: [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset
  2022-05-16 17:30 [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset windy.bi.enflame
@ 2022-05-16 20:28 ` Bjorn Helgaas
  2022-05-16 22:57   ` Alex Williamson
  2022-05-17  5:34 ` [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, " kernel test robot
  1 sibling, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2022-05-16 20:28 UTC (permalink / raw)
  To: windy.bi.enflame
  Cc: bhelgaas, linux-pci, linux-kernel, Lukas Wunner, Alex Williamson

[+cc Lukas, pciehp expert; Alex, reset person]

Thanks for the testing, analysis, and patch!

Run "git log --oneline drivers/pci/pci.c" and make your subject line
similar.

On Tue, May 17, 2022 at 01:30:47AM +0800, windy.bi.enflame wrote:
> While I do reset test of a PCIe endpoint device on a server, I find that
> the EP device always been removed and re-inserted again by hotplug module,
>  after secondary bus reset.
> 
> After checking I find:
> 1> "pciehp_reset_slot()" always disable slot's DLLSC interrupt before
>    doing reset and restore after reset, to try to filter the hotplug
>    event happened during reset.
> 2> "pci_bridge_secondary_bus_reset()" sleep 1 seconad and "pci_dev_wait()"
>    until device ready with "PCIE_RESET_READY_POLL_MS" timeout.
> 3> There is a PCIe switch between CPU and the EP devicem the topology as:
>    CPU <-> Switch <-> EP.
> 4> While trigger sbr reset at the switch's downstream port, it needs 1.5
>    seconds for internal enumeration.

s/seconad/second/
s/devicem/device/
s/sbr/SBR/
s/"pciehp_reset_slot()"/pciehp_reset_slot()/ also for other functions

> About why 1.5 seconds ready time is not filtered by "pci_dev_wait()" with
> "PCIE_RESET_READY_POLL_MS" timeout, I find it is because in
> "pci_bridge_secondary_bus_reset()", the function is operating slot's
> config space to trigger sbr and also wait slot itself ready by input same
> "dev" parameter. Different from other resets like FLR which is triggered
> by operating the config space of EP device itself, sbr is triggered by
> up slot but need to wait downstream devices' ready, so I think function
> "pci_dev_wait()" works for resets like FLR but not for sbr.
> 
> In this proposed patch, I'm changing the waiting function used in sbr to
> "pci_bridge_secondary_bus_wait()" which will wait all the downstream
> hierarchy ready with the same timeout setting "PCIE_RESET_READY_POLL_MS".
> In "pci_bridge_secondary_bus_wait()" the "subordinate" and
> "subordinate->devices" will be checked firstly, and then downstream
> devices' present state.
> 
> Signed-off-by: windy.bi.enflame <windy.bi.enflame@gmail.com>

See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=v5.17#n407
regarding pseudonyms.

> ---
>  drivers/pci/pci.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9ecce435fb3f..d7ec3859268b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5002,6 +5002,29 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
>  	}
>  }
>  
> +int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> +{
> +	struct pci_dev *dev;
> +	int delay = 1;
> +
> +	if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
> +		return 0;
> +
> +	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> +		while (!pci_device_is_present(dev)) {
> +			if (delay > timeout) {
> +				pci_warn(dev, "secondary bus not ready after %dms\n", delay);
> +				return -ENOTTY;
> +			}
> +
> +			msleep(delay);
> +			delay *= 2;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  void pci_reset_secondary_bus(struct pci_dev *dev)
>  {
>  	u16 ctrl;
> @@ -5045,7 +5068,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
>  {
>  	pcibios_reset_secondary_bus(dev);
>  
> -	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
> +	return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
>  }
>  EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
>  
> -- 
> 2.36.1
> 


* Re: [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset
  2022-05-16 20:28 ` Bjorn Helgaas
@ 2022-05-16 22:57   ` Alex Williamson
  2022-05-17 14:56     ` windy Bi
  2022-05-18 11:54     ` [PATCH v2] PCI: Fix no-op wait " Sheng Bi
  0 siblings, 2 replies; 17+ messages in thread
From: Alex Williamson @ 2022-05-16 22:57 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: windy.bi.enflame, bhelgaas, linux-pci, linux-kernel, Lukas Wunner

On Mon, 16 May 2022 15:28:25 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> [+cc Lukas, pciehp expert; Alex, reset person]
> 
> Thanks for the testing, analysis, and patch!
> 
> Run "git log --oneline drivers/pci/pci.c" and make your subject line
> similar.
> 
> On Tue, May 17, 2022 at 01:30:47AM +0800, windy.bi.enflame wrote:
> > While I do reset test of a PCIe endpoint device on a server, I find that
> > the EP device always been removed and re-inserted again by hotplug module,
> >  after secondary bus reset.
> > 
> > After checking I find:  
> > 1> "pciehp_reset_slot()" always disable slot's DLLSC interrupt before  
> >    doing reset and restore after reset, to try to filter the hotplug
> >    event happened during reset.  
> > 2> "pci_bridge_secondary_bus_reset()" sleep 1 seconad and "pci_dev_wait()"  
> >    until device ready with "PCIE_RESET_READY_POLL_MS" timeout.  
> > 3> There is a PCIe switch between CPU and the EP devicem the topology as:  
> >    CPU <-> Switch <-> EP.  
> > 4> While trigger sbr reset at the switch's downstream port, it needs 1.5  
> >    seconds for internal enumeration.  
> 
> s/seconad/second/
> s/devicem/device/
> s/sbr/SBR/
> s/"pciehp_reset_slot()"/pciehp_reset_slot()/ also for other functions
> 
> > About why 1.5 seconds ready time is not filtered by "pci_dev_wait()" with
> > "PCIE_RESET_READY_POLL_MS" timeout, I find it is because in
> > "pci_bridge_secondary_bus_reset()", the function is operating slot's
> > config space to trigger sbr and also wait slot itself ready by input same
> > "dev" parameter. Different from other resets like FLR which is triggered
> > by operating the config space of EP device itself, sbr is triggered by
> > up slot but need to wait downstream devices' ready, so I think function
> > "pci_dev_wait()" works for resets like FLR but not for sbr.

Is the unexpected hotplug occurring then because the device is not
ready after the 1s sleep after the sbr and we re-trigger the hotplug
controller which then triggers because the link status is still down?

> > In this proposed patch, I'm changing the waiting function used in sbr to
> > "pci_bridge_secondary_bus_wait()" which will wait all the downstream
> > hierarchy ready with the same timeout setting "PCIE_RESET_READY_POLL_MS".
> > In "pci_bridge_secondary_bus_wait()" the "subordinate" and
> > "subordinate->devices" will be checked firstly, and then downstream
> > devices' present state.
> > 
> > Signed-off-by: windy.bi.enflame <windy.bi.enflame@gmail.com>  
> 
> See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=v5.17#n407
> regarding pseudonyms.
> 
> > ---
> >  drivers/pci/pci.c | 25 ++++++++++++++++++++++++-
> >  1 file changed, 24 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 9ecce435fb3f..d7ec3859268b 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5002,6 +5002,29 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
> >  	}
> >  }
> >  
> > +int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> > +{
> > +	struct pci_dev *dev;
> > +	int delay = 1;
> > +
> > +	if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
> > +		return 0;
> > +
> > +	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> > +		while (!pci_device_is_present(dev)) {
> > +			if (delay > timeout) {
> > +				pci_warn(dev, "secondary bus not ready after %dms\n", delay);
> > +				return -ENOTTY;
> > +			}
> > +
> > +			msleep(delay);
> > +			delay *= 2;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  void pci_reset_secondary_bus(struct pci_dev *dev)
> >  {
> >  	u16 ctrl;
> > @@ -5045,7 +5068,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
> >  {
> >  	pcibios_reset_secondary_bus(dev);
> >  
> > -	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);

I assume pci_dev_wait here was always a no-op because we're testing the
wrong device, maybe this should be marked as:

Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")

> > +	return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);

The theory looks reasonable to me, but I'd hope we could get a better
commit log and improve the dev_warn message.  It seems to make sense to
use pci_device_is_present() since we shouldn't be dealing with VFs
after a bus reset, but I wonder if we want to enumerate all the missing
devices.  Since the timeout has passed, we shouldn't incur any extra
delays beyond the first device that doesn't re-appear.  Thanks,

Alex

> >  }
> >  EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
> >  
> > -- 
> > 2.36.1
> >   
> 



* Re: [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset
  2022-05-16 17:30 [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset windy.bi.enflame
  2022-05-16 20:28 ` Bjorn Helgaas
@ 2022-05-17  5:34 ` kernel test robot
  1 sibling, 0 replies; 17+ messages in thread
From: kernel test robot @ 2022-05-17  5:34 UTC (permalink / raw)
  To: windy.bi.enflame, bhelgaas
  Cc: kbuild-all, linux-pci, linux-kernel, windy.bi.enflame

Hi "windy.bi.enflame",

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on helgaas-pci/next]
[also build test WARNING on v5.18-rc7 next-20220516]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/windy-bi-enflame/drivers-pci-wait-downstream-hierarchy-ready-instead-of-slot-itself-ready-after-secondary-bus-reset/20220517-013158
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: alpha-defconfig (https://download.01.org/0day-ci/archive/20220517/202205171330.ye71SisD-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 11.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/99d829ca818d01cbd8bd4f95353f58a01723fe21
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review windy-bi-enflame/drivers-pci-wait-downstream-hierarchy-ready-instead-of-slot-itself-ready-after-secondary-bus-reset/20220517-013158
        git checkout 99d829ca818d01cbd8bd4f95353f58a01723fe21
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=alpha SHELL=/bin/bash drivers/pci/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/pci/pci.c:5052:5: warning: no previous prototype for 'pci_bridge_secondary_bus_wait' [-Wmissing-prototypes]
    5052 | int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
         |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
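
For context, -Wmissing-prototypes fires when a function with external
linkage is defined without a declaration in scope. A minimal sketch of
the two usual ways to avoid it follows; helper_exported() and
helper_local() are hypothetical names, not functions from pci.c.

```c
#include <assert.h>

/*
 * Option 1: the helper is called from other files, so it gets a prior
 * declaration (normally placed in a shared header).
 */
int helper_exported(int timeout_ms);

int helper_exported(int timeout_ms)
{
	assert(timeout_ms >= 0);
	return timeout_ms / 2;
}

/*
 * Option 2: the helper is only used in this file, so it gets internal
 * linkage and needs no separate prototype.
 */
static int helper_local(int timeout_ms)
{
	return timeout_ms / 2;
}
```

The v2 patch later in this thread takes the second route and marks
pci_bridge_secondary_bus_wait() as static.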


vim +/pci_bridge_secondary_bus_wait +5052 drivers/pci/pci.c

  5051	
> 5052	int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
  5053	{
  5054		struct pci_dev *dev;
  5055		int delay = 1;
  5056	
  5057		if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
  5058			return 0;
  5059	
  5060		list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
  5061			while (!pci_device_is_present(dev)) {
  5062				if (delay > timeout) {
  5063					pci_warn(dev, "secondary bus not ready after %dms\n", delay);
  5064					return -ENOTTY;
  5065				}
  5066	
  5067				msleep(delay);
  5068				delay *= 2;
  5069			}
  5070		}
  5071	
  5072		return 0;
  5073	}
  5074	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp


* Re: [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset
  2022-05-16 22:57   ` Alex Williamson
@ 2022-05-17 14:56     ` windy Bi
  2022-05-18 11:54     ` [PATCH v2] PCI: Fix no-op wait " Sheng Bi
  1 sibling, 0 replies; 17+ messages in thread
From: windy Bi @ 2022-05-17 14:56 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Bjorn Helgaas, bhelgaas, linux-pci, linux-kernel, Lukas Wunner

Hi Bjorn, Alex

Thank you for reviewing the patch and for the comments below. I will fix
the submission-rule violations in patch v2.

Thanks

On Tue, May 17, 2022 at 6:57 AM Alex Williamson
<alex.williamson@redhat.com> wrote:
>
> On Mon, 16 May 2022 15:28:25 -0500
> Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> > [+cc Lukas, pciehp expert; Alex, reset person]
> >
> > Thanks for the testing, analysis, and patch!
> >
> > Run "git log --oneline drivers/pci/pci.c" and make your subject line
> > similar.
> >
> > On Tue, May 17, 2022 at 01:30:47AM +0800, windy.bi.enflame wrote:
> > > While I do reset test of a PCIe endpoint device on a server, I find that
> > > the EP device always been removed and re-inserted again by hotplug module,
> > >  after secondary bus reset.
> > >
> > > After checking I find:
> > > 1> "pciehp_reset_slot()" always disable slot's DLLSC interrupt before
> > >    doing reset and restore after reset, to try to filter the hotplug
> > >    event happened during reset.
> > > 2> "pci_bridge_secondary_bus_reset()" sleep 1 seconad and "pci_dev_wait()"
> > >    until device ready with "PCIE_RESET_READY_POLL_MS" timeout.
> > > 3> There is a PCIe switch between CPU and the EP devicem the topology as:
> > >    CPU <-> Switch <-> EP.
> > > 4> While trigger sbr reset at the switch's downstream port, it needs 1.5
> > >    seconds for internal enumeration.
> >
> > s/seconad/second/
> > s/devicem/device/
> > s/sbr/SBR/
> > s/"pciehp_reset_slot()"/pciehp_reset_slot()/ also for other functions
> >
> > > About why 1.5 seconds ready time is not filtered by "pci_dev_wait()" with
> > > "PCIE_RESET_READY_POLL_MS" timeout, I find it is because in
> > > "pci_bridge_secondary_bus_reset()", the function is operating slot's
> > > config space to trigger sbr and also wait slot itself ready by input same
> > > "dev" parameter. Different from other resets like FLR which is triggered
> > > by operating the config space of EP device itself, sbr is triggered by
> > > up slot but need to wait downstream devices' ready, so I think function
> > > "pci_dev_wait()" works for resets like FLR but not for sbr.
>
> Is the unexpected hotplug occurring then because the device is not
> ready after the 1s sleep after the sbr and we re-trigger the hotplug
> controller which then triggers because the link status is still down?

Yes, the device becomes accessible about 1.5 s after SBR, while the
hotplug interrupt is re-enabled after the 1 s sleep. The hotplug event
at 1.5 s is then treated as a real hotplug.

>
> > > In this proposed patch, I'm changing the waiting function used in sbr to
> > > "pci_bridge_secondary_bus_wait()" which will wait all the downstream
> > > hierarchy ready with the same timeout setting "PCIE_RESET_READY_POLL_MS".
> > > In "pci_bridge_secondary_bus_wait()" the "subordinate" and
> > > "subordinate->devices" will be checked firstly, and then downstream
> > > devices' present state.
> > >
> > > Signed-off-by: windy.bi.enflame <windy.bi.enflame@gmail.com>
> >
> > See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=v5.17#n407
> > regarding pseudonyms.
> >
> > > ---
> > >  drivers/pci/pci.c | 25 ++++++++++++++++++++++++-
> > >  1 file changed, 24 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 9ecce435fb3f..d7ec3859268b 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -5002,6 +5002,29 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
> > >     }
> > >  }
> > >
> > > +int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> > > +{
> > > +   struct pci_dev *dev;
> > > +   int delay = 1;
> > > +
> > > +   if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
> > > +           return 0;
> > > +
> > > +   list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> > > +           while (!pci_device_is_present(dev)) {
> > > +                   if (delay > timeout) {
> > > +                           pci_warn(dev, "secondary bus not ready after %dms\n", delay);
> > > +                           return -ENOTTY;
> > > +                   }
> > > +
> > > +                   msleep(delay);
> > > +                   delay *= 2;
> > > +           }
> > > +   }
> > > +
> > > +   return 0;
> > > +}
> > > +
> > >  void pci_reset_secondary_bus(struct pci_dev *dev)
> > >  {
> > >     u16 ctrl;
> > > @@ -5045,7 +5068,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
> > >  {
> > >     pcibios_reset_secondary_bus(dev);
> > >
> > > -   return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
>
> I assume pci_dev_wait here was always a no-op because we're testing the
> wrong device, maybe this should be marked as:
>
> Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")

I think so too; I will add the tag if we all agree.

>
> > > +   return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
>
> The theory looks reasonable to me, but I'd hope we could get a better
> commit log and improve the dev_warn message.  It seems to make sense to
> use pci_device_is_present() since we shouldn't be dealing with VFs
> after a bus reset, but I wonder if we want to enumerate all the missing
> devices.  Since the timeout has passed, we shouldn't incur any extra
> delays beyond the first device that doesn't re-appear.  Thanks,
>
> Alex

Thanks for your suggestion. My thinking was to check all the missing
devices, because SBR affects the whole downstream hierarchy and as many
devices as possible need to be re-enumerated.
I agree that we shouldn't incur any extra delay once the timeout has
already passed, since SBR has failed as soon as one device fails.
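
One way to combine both points (report every missing device, but never
wait beyond the timeout) is sketched below as a userspace model. It is
only an illustration under assumptions: struct model_dev and
model_wait_all() are invented names, the devices are an array rather
than a bus list, and "present" is reduced to a ready time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one downstream device. */
struct model_dev {
	long ready_at_ms;	/* time from which it answers config reads */
	bool missing;		/* set by model_wait_all() */
};

/*
 * Poll every device against one shared budget of timeout_ms. Once the
 * budget is spent, the remaining devices are still probed once each,
 * without sleeping, so every absent device gets flagged.
 * Returns the number of missing devices; *waited_ms is the time slept.
 */
static size_t model_wait_all(struct model_dev *devs, size_t n, long start_ms,
			     long timeout_ms, long step_ms, long *waited_ms)
{
	long now = start_ms, waited = 0;
	size_t i, missing = 0;

	assert(step_ms > 0);
	for (i = 0; i < n; i++) {
		while (now < devs[i].ready_at_ms && waited <= timeout_ms) {
			now += step_ms;		/* stands in for msleep() */
			waited += step_ms;
		}
		devs[i].missing = now < devs[i].ready_at_ms;
		if (devs[i].missing)
			missing++;
	}
	*waited_ms = waited;
	return missing;
}
```

In this model a second absent device costs no additional sleep: the
total time slept stays the same whether one device or several fail to
reappear.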

>
> > >  }
> > >  EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
> > >
> > > --
> > > 2.36.1
> > >
> >
>


* [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-16 22:57   ` Alex Williamson
  2022-05-17 14:56     ` windy Bi
@ 2022-05-18 11:54     ` Sheng Bi
  2022-05-19 17:06       ` Alex Williamson
  2022-05-20  6:41       ` Lukas Wunner
  1 sibling, 2 replies; 17+ messages in thread
From: Sheng Bi @ 2022-05-18 11:54 UTC (permalink / raw)
  To: helgaas, bhelgaas
  Cc: alex.williamson, lukas, linux-pci, linux-kernel, Sheng Bi

pci_bridge_secondary_bus_reset() triggers SBR followed by 1 second sleep,
and then uses pci_dev_wait() for waiting device ready. The dev parameter
passes to the wait function is currently the bridge itself, but not the
device been reset.

If we call pci_bridge_secondary_bus_reset() to trigger SBR to a device,
there is 1 second sleep but not waiting device ready, since the bridge
is always ready while resetting downstream devices. pci_dev_wait() here
is a no-op actually. This would be risky in the case which the device
becomes ready after more than 1 second, especially while hotplug enabled.
The late coming hotplug event after 1 second will trigger hotplug module
to remove/re-insert the device.

Instead of waiting ready of bridge itself, changing to wait all the
downstream devices become ready with timeout PCIE_RESET_READY_POLL_MS
after SBR, considering all downstream devices are affected during SBR.
Once one of the devices doesn't reappear within the timeout, return
-ENOTTY to indicate SBR doesn't complete successfully.

Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
Signed-off-by: Sheng Bi <windy.bi.enflame@gmail.com>
---
 drivers/pci/pci.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index eb7c0a08ff57..32b7a5c1fa3a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5049,6 +5049,34 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 	}
 }
 
+static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
+{
+	struct pci_dev *dev;
+	int delay = 0;
+
+	if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
+		return 0;
+
+	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
+		while (!pci_device_is_present(dev)) {
+			if (delay > timeout) {
+				pci_warn(dev, "not ready %dms after secondary bus reset; giving up\n",
+					delay);
+				return -ENOTTY;
+			}
+
+			msleep(20);
+			delay += 20;
+		}
+
+		if (delay > 1000)
+			pci_info(dev, "ready %dms after secondary bus reset\n",
+				delay);
+	}
+
+	return 0;
+}
+
 void pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	u16 ctrl;
@@ -5092,7 +5120,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
 {
 	pcibios_reset_secondary_bus(dev);
 
-	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
+	return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
 }
 EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
 

base-commit: 617c8a1e527fadaaec3ba5bafceae7a922ebef7e
-- 
2.36.1



* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-18 11:54     ` [PATCH v2] PCI: Fix no-op wait " Sheng Bi
@ 2022-05-19 17:06       ` Alex Williamson
  2022-05-20  3:00         ` windy Bi
  2022-05-20  6:41       ` Lukas Wunner
  1 sibling, 1 reply; 17+ messages in thread
From: Alex Williamson @ 2022-05-19 17:06 UTC (permalink / raw)
  To: Sheng Bi; +Cc: helgaas, bhelgaas, lukas, linux-pci, linux-kernel

On Wed, 18 May 2022 19:54:32 +0800
Sheng Bi <windy.bi.enflame@gmail.com> wrote:

> pci_bridge_secondary_bus_reset() triggers SBR followed by 1 second sleep,
> and then uses pci_dev_wait() for waiting device ready. The dev parameter
> passes to the wait function is currently the bridge itself, but not the
> device been reset.
> 
> If we call pci_bridge_secondary_bus_reset() to trigger SBR to a device,
> there is 1 second sleep but not waiting device ready, since the bridge
> is always ready while resetting downstream devices. pci_dev_wait() here
> is a no-op actually. This would be risky in the case which the device
> becomes ready after more than 1 second, especially while hotplug enabled.
> The late coming hotplug event after 1 second will trigger hotplug module
> to remove/re-insert the device.
> 
> Instead of waiting ready of bridge itself, changing to wait all the
> downstream devices become ready with timeout PCIE_RESET_READY_POLL_MS
> after SBR, considering all downstream devices are affected during SBR.
> Once one of the devices doesn't reappear within the timeout, return
> -ENOTTY to indicate SBR doesn't complete successfully.
> 
> Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
> Signed-off-by: Sheng Bi <windy.bi.enflame@gmail.com>
> ---
>  drivers/pci/pci.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index eb7c0a08ff57..32b7a5c1fa3a 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5049,6 +5049,34 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
>  	}
>  }
>  
> +static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> +{
> +	struct pci_dev *dev;
> +	int delay = 0;
> +
> +	if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
> +		return 0;
> +
> +	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> +		while (!pci_device_is_present(dev)) {
> +			if (delay > timeout) {
> +				pci_warn(dev, "not ready %dms after secondary bus reset; giving up\n",
> +					delay);
> +				return -ENOTTY;
> +			}
> +
> +			msleep(20);
> +			delay += 20;

Your previous version used the same exponential back-off as used in
pci_dev_wait(), why the change here to poll at 20ms intervals?  Thanks,

Alex

> +		}
> +
> +		if (delay > 1000)
> +			pci_info(dev, "ready %dms after secondary bus reset\n",
> +				delay);
> +	}
> +
> +	return 0;
> +}
> +
>  void pci_reset_secondary_bus(struct pci_dev *dev)
>  {
>  	u16 ctrl;
> @@ -5092,7 +5120,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
>  {
>  	pcibios_reset_secondary_bus(dev);
>  
> -	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
> +	return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
>  }
>  EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
>  
> 
> base-commit: 617c8a1e527fadaaec3ba5bafceae7a922ebef7e



* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-19 17:06       ` Alex Williamson
@ 2022-05-20  3:00         ` windy Bi
  0 siblings, 0 replies; 17+ messages in thread
From: windy Bi @ 2022-05-20  3:00 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Bjorn Helgaas, Bjorn Helgaas, Lukas Wunner, linux-pci, linux-kernel

On Fri, May 20, 2022 at 1:06 AM Alex Williamson
<alex.williamson@redhat.com> wrote:
>
> On Wed, 18 May 2022 19:54:32 +0800
> Sheng Bi <windy.bi.enflame@gmail.com> wrote:
>
> > pci_bridge_secondary_bus_reset() triggers SBR followed by 1 second sleep,
> > and then uses pci_dev_wait() for waiting device ready. The dev parameter
> > passes to the wait function is currently the bridge itself, but not the
> > device been reset.
> >
> > If we call pci_bridge_secondary_bus_reset() to trigger SBR to a device,
> > there is 1 second sleep but not waiting device ready, since the bridge
> > is always ready while resetting downstream devices. pci_dev_wait() here
> > is a no-op actually. This would be risky in the case which the device
> > becomes ready after more than 1 second, especially while hotplug enabled.
> > The late coming hotplug event after 1 second will trigger hotplug module
> > to remove/re-insert the device.
> >
> > Instead of waiting ready of bridge itself, changing to wait all the
> > downstream devices become ready with timeout PCIE_RESET_READY_POLL_MS
> > after SBR, considering all downstream devices are affected during SBR.
> > Once one of the devices doesn't reappear within the timeout, return
> > -ENOTTY to indicate SBR doesn't complete successfully.
> >
> > Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
> > Signed-off-by: Sheng Bi <windy.bi.enflame@gmail.com>
> > ---
> >  drivers/pci/pci.c | 30 +++++++++++++++++++++++++++++-
> >  1 file changed, 29 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index eb7c0a08ff57..32b7a5c1fa3a 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5049,6 +5049,34 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
> >       }
> >  }
> >
> > +static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> > +{
> > +     struct pci_dev *dev;
> > +     int delay = 0;
> > +
> > +     if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
> > +             return 0;
> > +
> > +     list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> > +             while (!pci_device_is_present(dev)) {
> > +                     if (delay > timeout) {
> > +                             pci_warn(dev, "not ready %dms after secondary bus reset; giving up\n",
> > +                                     delay);
> > +                             return -ENOTTY;
> > +                     }
> > +
> > +                     msleep(20);
> > +                     delay += 20;
>
> Your previous version used the same exponential back-off as used in
> pci_dev_wait(), why the change here to poll at 20ms intervals?  Thanks,
>
> Alex

Many thanks for your time. The change is to get a more accurate
timeout, in line with the earlier statement that "we shouldn't incur
any extra delay once timeout has passed". The previous binary
exponential back-off could overshoot the timeout: with a 60,000 ms
timeout the loop actually waits 65,535 ms, and the gap could grow if
the timeout setting changes. Thanks,

windy
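
The figures above can be reproduced with a small user-space sketch. It
is not kernel code: backoff_total_ms() and fixed_total_ms() are
hypothetical helpers that only add up the sleeps each polling scheme
would perform, assuming the device never becomes ready and the loop
gives up once its counter exceeds the timeout.

```c
#include <assert.h>

/*
 * Total time slept by a doubling poll loop (1, 2, 4, ... ms) that gives
 * up once the next interval would exceed timeout_ms.
 */
static int backoff_total_ms(int timeout_ms)
{
	int delay = 1, total = 0;

	assert(timeout_ms >= 0);
	while (delay <= timeout_ms) {
		total += delay;		/* stands in for msleep(delay) */
		delay *= 2;
	}
	return total;
}

/*
 * Total time slept by a fixed-interval poll loop that gives up once the
 * accumulated delay exceeds timeout_ms.
 */
static int fixed_total_ms(int timeout_ms, int step_ms)
{
	int delay = 0;

	assert(timeout_ms >= 0 && step_ms > 0);
	while (delay <= timeout_ms)
		delay += step_ms;	/* stands in for msleep(step_ms) */
	return delay;
}
```

With a 60,000 ms timeout, backoff_total_ms(60000) gives 65535 and
fixed_total_ms(60000, 20) gives 60020, so the fixed interval overshoots
by at most one step.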

>
> > +             }
> > +
> > +             if (delay > 1000)
> > +                     pci_info(dev, "ready %dms after secondary bus reset\n",
> > +                             delay);
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> >  void pci_reset_secondary_bus(struct pci_dev *dev)
> >  {
> >       u16 ctrl;
> > @@ -5092,7 +5120,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
> >  {
> >       pcibios_reset_secondary_bus(dev);
> >
> > -     return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
> > +     return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
> >  }
> >  EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
> >
> >
> > base-commit: 617c8a1e527fadaaec3ba5bafceae7a922ebef7e
>


* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-18 11:54     ` [PATCH v2] PCI: Fix no-op wait " Sheng Bi
  2022-05-19 17:06       ` Alex Williamson
@ 2022-05-20  6:41       ` Lukas Wunner
  2022-05-21  8:36         ` Sheng Bi
  1 sibling, 1 reply; 17+ messages in thread
From: Lukas Wunner @ 2022-05-20  6:41 UTC (permalink / raw)
  To: Sheng Bi; +Cc: helgaas, alex.williamson, linux-pci, linux-kernel

On Wed, May 18, 2022 at 07:54:32PM +0800, Sheng Bi wrote:
> +static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> +{
> +	struct pci_dev *dev;
> +	int delay = 0;
> +
> +	if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
> +		return 0;
> +
> +	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> +		while (!pci_device_is_present(dev)) {
> +			if (delay > timeout) {
> +				pci_warn(dev, "not ready %dms after secondary bus reset; giving up\n",
> +					delay);
> +				return -ENOTTY;
> +			}
> +
> +			msleep(20);
> +			delay += 20;
> +		}
> +
> +		if (delay > 1000)
> +			pci_info(dev, "ready %dms after secondary bus reset\n",
> +				delay);
> +	}
> +
> +	return 0;
> +}

An alternative approach you may want to consider is to call
pci_dev_wait() in the list_for_each_entry loop, but instead of
passing it a constant timeout you'd pass the remaining time.

Get the current time before and after each pci_dev_wait() call
from "jiffies", calculate the difference, convert to msecs with
jiffies_to_msecs() and subtract from the "timeout" parameter
passed in by the caller, then simply pass "timeout" to each
pci_dev_wait() call.
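
The shared-budget idea above can be sketched in user space as follows.
Everything here is a stand-in chosen for illustration: struct fake_dev,
fake_now_ms (playing the role of "jiffies" converted to msecs) and
fake_dev_wait() (playing the role of pci_dev_wait()) are hypothetical
and only model the bookkeeping, not real PCI behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device: becomes ready ready_at_ms after the reset. */
struct fake_dev {
	int ready_at_ms;
};

/* Simulated clock in ms since the reset; stands in for jiffies. */
static int fake_now_ms;

/*
 * Stand-in for a per-device wait: advances the simulated clock until
 * the device is ready or timeout_ms has elapsed. Returns 0 if the
 * device became ready in time, -1 otherwise.
 */
static int fake_dev_wait(const struct fake_dev *dev, int timeout_ms)
{
	if (dev->ready_at_ms > fake_now_ms + timeout_ms) {
		fake_now_ms += timeout_ms;
		return -1;
	}
	if (dev->ready_at_ms > fake_now_ms)
		fake_now_ms = dev->ready_at_ms;
	return 0;
}

/*
 * Wait for every device, charging the time spent on each one against a
 * single overall budget instead of granting each a full timeout.
 */
static int wait_all(const struct fake_dev *devs, size_t n, int timeout_ms)
{
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		int start = fake_now_ms;

		if (timeout_ms < 0 || fake_dev_wait(&devs[i], timeout_ms))
			return -1;

		/* Pass only the remaining time to the next device. */
		timeout_ms -= fake_now_ms - start;
	}
	return 0;
}
```

With this bookkeeping the total wait is bounded by the caller's timeout
no matter how many devices sit on the bus.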

As a side note, traversing the bus list normally requires
holding the pci_bus_sem for reading.  But it's probably unlikely
that devices are added/removed concurrently to a bus reset
and we're doing it wrong pretty much everywhere in the
PCI reset code, so...

(I fixed up one of the reset functions with 10791141a6cf,
but plenty of others remain...)

Thanks,

Lukas

* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-20  6:41       ` Lukas Wunner
@ 2022-05-21  8:36         ` Sheng Bi
  2022-05-21 12:49           ` Lukas Wunner
  0 siblings, 1 reply; 17+ messages in thread
From: Sheng Bi @ 2022-05-21  8:36 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, Alex Williamson, linux-pci, linux-kernel

On Fri, May 20, 2022 at 2:41 PM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Wed, May 18, 2022 at 07:54:32PM +0800, Sheng Bi wrote:
> > +static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> > +{
> > +     struct pci_dev *dev;
> > +     int delay = 0;
> > +
> > +     if (!bridge->subordinate || list_empty(&bridge->subordinate->devices))
> > +             return 0;
> > +
> > +     list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> > +             while (!pci_device_is_present(dev)) {
> > +                     if (delay > timeout) {
> > +                             pci_warn(dev, "not ready %dms after secondary bus reset; giving up\n",
> > +                                     delay);
> > +                             return -ENOTTY;
> > +                     }
> > +
> > +                     msleep(20);
> > +                     delay += 20;
> > +             }
> > +
> > +             if (delay > 1000)
> > +                     pci_info(dev, "ready %dms after secondary bus reset\n",
> > +                             delay);
> > +     }
> > +
> > +     return 0;
> > +}
>
> An alternative approach you may want to consider is to call
> pci_dev_wait() in the list_for_each_entry loop, but instead of
> passing it a constant timeout you'd pass the remaining time.
>
> Get the current time before and after each pci_dev_wait() call
> from "jiffies", calculate the difference, convert to msecs with
> jiffies_to_msecs() and subtract from the "timeout" parameter
> passed in by the caller, then simply pass "timeout" to each
> pci_dev_wait() call.

Thanks for your proposal, which avoids duplicating what pci_dev_wait()
already does.

If so, I also want to settle the polling question Alex raised, since
pci_dev_wait() is also used for reset functions other than SBR. Bjorn,
Alex, Lukas: what do you think, should we change the polling in
pci_dev_wait() to 20ms intervals, or keep the binary exponential
back-off with its possible extra delay beyond the timeout?

>
> As a side note, traversing the bus list normally requires
> holding the pci_bus_sem for reading.  But it's probably unlikely
> that devices are added/removed concurrently to a bus reset
> and we're doing it wrong pretty much everywhere in the
> PCI reset code, so...

Yeah... I think that is why I saw it coded differently in different
places. I would prefer a separate thread to assess which of those
cases are real risks.

>
> (I fixed up one of the reset functions with 10791141a6cf,
> but plenty of others remain...)
>
> Thanks,
>
> Lukas

* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-21  8:36         ` Sheng Bi
@ 2022-05-21 12:49           ` Lukas Wunner
  2022-05-21 17:37             ` Sheng Bi
  0 siblings, 1 reply; 17+ messages in thread
From: Lukas Wunner @ 2022-05-21 12:49 UTC (permalink / raw)
  To: Sheng Bi; +Cc: Bjorn Helgaas, Alex Williamson, linux-pci, linux-kernel

On Sat, May 21, 2022 at 04:36:10PM +0800, Sheng Bi wrote:
> If so, I also want to align the polling things mentioned in the
> question from Alex, since pci_dev_wait() is also used for reset
> functions other than SBR. To Bjorn, Alex, Lucas, how do you think if
> we need to change the polling in pci_dev_wait() to 20ms intervals, or
> keep binary exponential back-off with probable unexpected extra
> timeout delay.

The exponential backoff should probably be capped at some point
to avoid excessive wait delays.  I guess the rationale for
exponential backoff is to not poll too frequently.
Capping at 20 msec or 100 msec may be reasonable, i.e.:

-		delay *= 2;
+		delay = min(delay * 2, 100);
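
A user-space sketch of how such a cap bounds the overshoot. It is not
kernel code: capped_backoff_total_ms() is a hypothetical helper, and it
assumes elapsed time is tracked in its own variable, separate from the
poll interval, with the loop giving up once elapsed exceeds the timeout.

```c
#include <assert.h>

/*
 * Total time slept by a doubling poll loop whose interval is capped at
 * cap_ms, assuming the device never becomes ready.
 */
static int capped_backoff_total_ms(int timeout_ms, int cap_ms)
{
	int delay = 1, elapsed = 0;

	assert(timeout_ms >= 0 && cap_ms > 0);
	while (elapsed <= timeout_ms) {
		elapsed += delay;	/* stands in for msleep(delay) */
		delay = (delay * 2 < cap_ms) ? delay * 2 : cap_ms;
	}
	return elapsed;
}
```

For a 60,000 ms timeout and a 100 ms cap this returns 60027: the loop
sleeps 1 + 2 + ... + 64 = 127 ms while doubling, then in 100 ms steps,
so the overshoot can never exceed the cap.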

Thanks,

Lukas

* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-21 12:49           ` Lukas Wunner
@ 2022-05-21 17:37             ` Sheng Bi
  2022-05-23 14:20               ` Lukas Wunner
  0 siblings, 1 reply; 17+ messages in thread
From: Sheng Bi @ 2022-05-21 17:37 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, Alex Williamson, linux-pci, linux-kernel

On Sat, May 21, 2022 at 8:49 PM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Sat, May 21, 2022 at 04:36:10PM +0800, Sheng Bi wrote:
> > If so, I also want to align the polling things mentioned in the
> > question from Alex, since pci_dev_wait() is also used for reset
> > functions other than SBR. To Bjorn, Alex, Lucas, how do you think if
> > we need to change the polling in pci_dev_wait() to 20ms intervals, or
> > keep binary exponential back-off with probable unexpected extra
> > timeout delay.
>
> The exponential backoff should probably be capped at some point
> to avoid excessive wait delays.  I guess the rationale for
> exponential backoff is to not poll too frequently.
> Capping at 20 msec or 100 msec may be reasonable, i.e.:
>
> -               delay *= 2;
> +               delay = min(delay * 2, 100);
>
> Thanks,
>
> Lukas

Capping at 20 or 100 msec seems reasonable to me. Btw, since 20 msec
is not a long time in these scenarios, how about changing to a fixed
20 msec interval? Thanks,

windy

* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-21 17:37             ` Sheng Bi
@ 2022-05-23 14:20               ` Lukas Wunner
  2022-05-23 15:59                 ` Sheng Bi
  0 siblings, 1 reply; 17+ messages in thread
From: Lukas Wunner @ 2022-05-23 14:20 UTC (permalink / raw)
  To: Sheng Bi; +Cc: Bjorn Helgaas, Alex Williamson, linux-pci, linux-kernel

On Sun, May 22, 2022 at 01:37:50AM +0800, Sheng Bi wrote:
> On Sat, May 21, 2022 at 8:49 PM Lukas Wunner <lukas@wunner.de> wrote:
> > On Sat, May 21, 2022 at 04:36:10PM +0800, Sheng Bi wrote:
> > > If so, I also want to align the polling things mentioned in the
> > > question from Alex, since pci_dev_wait() is also used for reset
> > > functions other than SBR. To Bjorn, Alex, Lucas, how do you think if
> > > we need to change the polling in pci_dev_wait() to 20ms intervals, or
> > > keep binary exponential back-off with probable unexpected extra
> > > timeout delay.
> >
> > The exponential backoff should probably be capped at some point
> > to avoid excessive wait delays.  I guess the rationale for
> > exponential backoff is to not poll too frequently.
> > Capping at 20 msec or 100 msec may be reasonable, i.e.:
> >
> > -               delay *= 2;
> > +               delay = min(delay * 2, 100);
> 
> Capping at 20 or 100 msec seems reasonable to me. Btw, since 20 msec
> is not a long time in these scenarios, how about changing to a fixed
> 20 msec interval?

The callers of pci_dev_wait() seem to wait for the spec-defined
delay and only call pci_dev_wait() to allow for an additional period
that non-compliant devices may need.  That extra delay can be expected
to be low, which is why it makes sense to start with a short poll interval
and gradually extend it.  So the algorithm seems to be reasonable and
I wouldn't recommend changing it to a constant interval unless that
fixes something which is currently broken.
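
The trade-off described above can be illustrated with a user-space
sketch (hypothetical helpers, not kernel code). It assumes polling
starts at time 0, right after the spec-defined delay, and that the
device becomes ready ready_at_ms later; each function returns the time
of the first poll that sees the device ready.

```c
#include <assert.h>

/* First poll time >= ready_at_ms when sleeping 1, 2, 4, ... ms. */
static int detect_backoff_ms(int ready_at_ms)
{
	int now = 0, delay = 1;

	assert(ready_at_ms >= 0);
	while (now < ready_at_ms) {
		now += delay;
		delay *= 2;
	}
	return now;
}

/* First poll time >= ready_at_ms when sleeping step_ms every round. */
static int detect_fixed_ms(int ready_at_ms, int step_ms)
{
	int now = 0;

	assert(ready_at_ms >= 0 && step_ms > 0);
	while (now < ready_at_ms)
		now += step_ms;
	return now;
}
```

For a device that needs only 5 ms extra, the doubling loop notices at
7 ms and a fixed 20 ms loop at 20 ms. For a device that needs 500 ms,
the doubling loop notices at 511 ms and the fixed loop at 500 ms, so the
doubling scheme favours the short extra delays expected here.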

Thanks,

Lukas

* Re: [PATCH v2] PCI: Fix no-op wait after secondary bus reset
  2022-05-23 14:20               ` Lukas Wunner
@ 2022-05-23 15:59                 ` Sheng Bi
  2022-05-23 17:15                   ` [PATCH v3] " Sheng Bi
  0 siblings, 1 reply; 17+ messages in thread
From: Sheng Bi @ 2022-05-23 15:59 UTC (permalink / raw)
  To: Lukas Wunner; +Cc: Bjorn Helgaas, Alex Williamson, linux-pci, linux-kernel

On Mon, May 23, 2022 at 10:20 PM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Sun, May 22, 2022 at 01:37:50AM +0800, Sheng Bi wrote:
> > On Sat, May 21, 2022 at 8:49 PM Lukas Wunner <lukas@wunner.de> wrote:
> > > On Sat, May 21, 2022 at 04:36:10PM +0800, Sheng Bi wrote:
> > > > If so, I also want to align the polling things mentioned in the
> > > > question from Alex, since pci_dev_wait() is also used for reset
> > > > functions other than SBR. To Bjorn, Alex, Lucas, how do you think if
> > > > we need to change the polling in pci_dev_wait() to 20ms intervals, or
> > > > keep binary exponential back-off with probable unexpected extra
> > > > timeout delay.
> > >
> > > The exponential backoff should probably be capped at some point
> > > to avoid excessive wait delays.  I guess the rationale for
> > > exponential backoff is to not poll too frequently.
> > > Capping at 20 msec or 100 msec may be reasonable, i.e.:
> > >
> > > -               delay *= 2;
> > > +               delay = min(delay * 2, 100);
> >
> > Capping at 20 or 100 msec seems reasonable to me. Btw, since 20 msec
> > is not a long time in these scenarios, how about changing to a fixed
> > 20 msec interval?
>
> The callers of pci_dev_wait() seem to wait for the spec-defined
> delay and only call pci_dev_wait() to allow for an additional period
> that non-compliant devices may need.  That extra delay can be expected
> to be low, which is why it makes sense to start with a short poll interval
> and gradually extend it.  So the algorithm seems to be reasonable and
> I wouldn't recommend changing it to a constant interval unless that
> fixes something which is currently broken.
>
> Thanks,
>
> Lukas

Thanks Lukas!

From my perspective, nothing is broken so far; there is only a
theoretical extra delay after the timeout has passed. So in this patch
I will keep pci_dev_wait() as it is, with exponential backoff, and
rework pci_bridge_secondary_bus_wait() to use "jiffies" and
pci_dev_wait().

Thanks,

windy

* [PATCH v3] PCI: Fix no-op wait after secondary bus reset
  2022-05-23 15:59                 ` Sheng Bi
@ 2022-05-23 17:15                   ` Sheng Bi
  2022-06-08 13:16                     ` Sheng Bi
  2022-06-08 15:23                     ` Lukas Wunner
  0 siblings, 2 replies; 17+ messages in thread
From: Sheng Bi @ 2022-05-23 17:15 UTC (permalink / raw)
  To: helgaas, bhelgaas, alex.williamson, lukas
  Cc: linux-pci, linux-kernel, Sheng Bi

pci_bridge_secondary_bus_reset() triggers SBR, sleeps for 1 second, and
then calls pci_dev_wait() to wait for the device to become ready. The dev
parameter passed to the wait function is currently the bridge itself, not
the device that was reset.

If we call pci_bridge_secondary_bus_reset() to trigger SBR on a device,
there is a 1 second sleep but no wait for device readiness, since the
bridge is always ready while its downstream devices are being reset, so
pci_dev_wait() is effectively a no-op here. This is risky when the device
takes more than 1 second to become ready, especially with hotplug
enabled: the hotplug event arriving after that 1 second causes the
hotplug module to remove and re-insert the device.

Instead of waiting for the bridge itself, wait for all downstream devices
to become ready after SBR, with timeout PCIE_RESET_READY_POLL_MS, since
all downstream devices are affected by SBR. If any device does not
reappear within the timeout, return -ENOTTY to indicate that SBR did not
complete successfully.

Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
Signed-off-by: Sheng Bi <windy.bi.enflame@gmail.com>
---
 drivers/pci/pci.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index eb7c0a08ff57..4653a9ae6e5b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5049,6 +5049,34 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
 	}
 }
 
+static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
+{
+	struct pci_dev *dev;
+	unsigned long start_jiffies;
+
+	down_read(&pci_bus_sem);
+
+	if (!bridge->subordinate || list_empty(&bridge->subordinate->devices)) {
+		up_read(&pci_bus_sem);
+		return 0;
+	}
+
+	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
+		start_jiffies = jiffies;
+
+		if (timeout < 0 || pci_dev_wait(dev, "bus reset", timeout)) {
+			up_read(&pci_bus_sem);
+			return -ENOTTY;
+		}
+
+		timeout -= jiffies_to_msecs(jiffies - start_jiffies);
+	}
+
+	up_read(&pci_bus_sem);
+
+	return 0;
+}
+
 void pci_reset_secondary_bus(struct pci_dev *dev)
 {
 	u16 ctrl;
@@ -5092,7 +5120,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
 {
 	pcibios_reset_secondary_bus(dev);
 
-	return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
+	return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
 }
 EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
 

base-commit: 617c8a1e527fadaaec3ba5bafceae7a922ebef7e
-- 
2.36.1


* Re: [PATCH v3] PCI: Fix no-op wait after secondary bus reset
  2022-05-23 17:15                   ` [PATCH v3] " Sheng Bi
@ 2022-06-08 13:16                     ` Sheng Bi
  2022-06-08 15:23                     ` Lukas Wunner
  1 sibling, 0 replies; 17+ messages in thread
From: Sheng Bi @ 2022-06-08 13:16 UTC (permalink / raw)
  To: Bjorn Helgaas, Bjorn Helgaas, Alex Williamson, Lukas Wunner
  Cc: linux-pci, linux-kernel

Hi Bjorn, Alex, Lukas,

Is this acceptable, or does anything need to be improved?

Thanks
windy

On Tue, May 24, 2022 at 1:15 AM Sheng Bi <windy.bi.enflame@gmail.com> wrote:
>
> pci_bridge_secondary_bus_reset() triggers SBR followed by 1 second sleep,
> and then uses pci_dev_wait() for waiting device ready. The dev parameter
> passes to the wait function is currently the bridge itself, but not the
> device been reset.
>
> If we call pci_bridge_secondary_bus_reset() to trigger SBR to a device,
> there is 1 second sleep but not waiting device ready, since the bridge
> is always ready while resetting downstream devices. pci_dev_wait() here
> is a no-op actually. This would be risky in the case which the device
> becomes ready after more than 1 second, especially while hotplug enabled.
> The late coming hotplug event after 1 second will trigger hotplug module
> to remove/re-insert the device.
>
> Instead of waiting ready of bridge itself, changing to wait all the
> downstream devices become ready with timeout PCIE_RESET_READY_POLL_MS
> after SBR, considering all downstream devices are affected during SBR.
> Once one of the devices doesn't reappear within the timeout, return
> -ENOTTY to indicate SBR doesn't complete successfully.
>
> Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
> Signed-off-by: Sheng Bi <windy.bi.enflame@gmail.com>
> ---
>  drivers/pci/pci.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index eb7c0a08ff57..4653a9ae6e5b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5049,6 +5049,34 @@ void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev)
>         }
>  }
>
> +static int pci_bridge_secondary_bus_wait(struct pci_dev *bridge, int timeout)
> +{
> +       struct pci_dev *dev;
> +       unsigned long start_jiffies;
> +
> +       down_read(&pci_bus_sem);
> +
> +       if (!bridge->subordinate || list_empty(&bridge->subordinate->devices)) {
> +               up_read(&pci_bus_sem);
> +               return 0;
> +       }
> +
> +       list_for_each_entry(dev, &bridge->subordinate->devices, bus_list) {
> +               start_jiffies = jiffies;
> +
> +               if (timeout < 0 || pci_dev_wait(dev, "bus reset", timeout)) {
> +                       up_read(&pci_bus_sem);
> +                       return -ENOTTY;
> +               }
> +
> +               timeout -= jiffies_to_msecs(jiffies - start_jiffies);
> +       }
> +
> +       up_read(&pci_bus_sem);
> +
> +       return 0;
> +}
> +
>  void pci_reset_secondary_bus(struct pci_dev *dev)
>  {
>         u16 ctrl;
> @@ -5092,7 +5120,7 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
>  {
>         pcibios_reset_secondary_bus(dev);
>
> -       return pci_dev_wait(dev, "bus reset", PCIE_RESET_READY_POLL_MS);
> +       return pci_bridge_secondary_bus_wait(dev, PCIE_RESET_READY_POLL_MS);
>  }
>  EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);
>
>
> base-commit: 617c8a1e527fadaaec3ba5bafceae7a922ebef7e
> --
> 2.36.1
>

* Re: [PATCH v3] PCI: Fix no-op wait after secondary bus reset
  2022-05-23 17:15                   ` [PATCH v3] " Sheng Bi
  2022-06-08 13:16                     ` Sheng Bi
@ 2022-06-08 15:23                     ` Lukas Wunner
  1 sibling, 0 replies; 17+ messages in thread
From: Lukas Wunner @ 2022-06-08 15:23 UTC (permalink / raw)
  To: Sheng Bi; +Cc: helgaas, bhelgaas, alex.williamson, linux-pci, linux-kernel

On Tue, May 24, 2022 at 01:15:17AM +0800, Sheng Bi wrote:
> pci_bridge_secondary_bus_reset() triggers SBR followed by 1 second sleep,
> and then uses pci_dev_wait() for waiting device ready. The dev parameter
> passes to the wait function is currently the bridge itself, but not the
> device been reset.
> 
> If we call pci_bridge_secondary_bus_reset() to trigger SBR to a device,
> there is 1 second sleep but not waiting device ready, since the bridge
> is always ready while resetting downstream devices. pci_dev_wait() here
> is a no-op actually. This would be risky in the case which the device
> becomes ready after more than 1 second, especially while hotplug enabled.
> The late coming hotplug event after 1 second will trigger hotplug module
> to remove/re-insert the device.
> 
> Instead of waiting ready of bridge itself, changing to wait all the
> downstream devices become ready with timeout PCIE_RESET_READY_POLL_MS
> after SBR, considering all downstream devices are affected during SBR.
> Once one of the devices doesn't reappear within the timeout, return
> -ENOTTY to indicate SBR doesn't complete successfully.
> 
> Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
> Signed-off-by: Sheng Bi <windy.bi.enflame@gmail.com>

Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.17+

Code-wise, this LGTM.  There are a few things that could be
improved in the commit message, e.g. in the last paragraph,
"changing" (gerund form) is not proper English and the
imperative form "change" would be correct here.  However,
these details are difficult to get right for anyone who is
not an English native speaker and often Bjorn will wordsmith
the commit message to perfect it.

See here for some of the things Bjorn looks for:
https://lore.kernel.org/linux-pci/20171026223701.GA25649@bhelgaas-glaptop.roam.corp.google.com/

Bjorn may not find the time to look over your patch immediately,
so please be patient.

Thanks!

Lukas

Thread overview: 17+ messages
2022-05-16 17:30 [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, after secondary bus reset windy.bi.enflame
2022-05-16 20:28 ` Bjorn Helgaas
2022-05-16 22:57   ` Alex Williamson
2022-05-17 14:56     ` windy Bi
2022-05-18 11:54     ` [PATCH v2] PCI: Fix no-op wait " Sheng Bi
2022-05-19 17:06       ` Alex Williamson
2022-05-20  3:00         ` windy Bi
2022-05-20  6:41       ` Lukas Wunner
2022-05-21  8:36         ` Sheng Bi
2022-05-21 12:49           ` Lukas Wunner
2022-05-21 17:37             ` Sheng Bi
2022-05-23 14:20               ` Lukas Wunner
2022-05-23 15:59                 ` Sheng Bi
2022-05-23 17:15                   ` [PATCH v3] " Sheng Bi
2022-06-08 13:16                     ` Sheng Bi
2022-06-08 15:23                     ` Lukas Wunner
2022-05-17  5:34 ` [PATCH] drivers/pci: wait downstream hierarchy ready instead of slot itself ready, " kernel test robot
