linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock
@ 2019-09-01 14:23 Remi Pommarel
  2019-09-02 10:06 ` Lorenzo Pieralisi
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Remi Pommarel @ 2019-09-01 14:23 UTC (permalink / raw)
  To: Thomas Petazzoni, Lorenzo Pieralisi, Bjorn Helgaas
  Cc: linux-pci, linux-arm-kernel, linux-kernel, Remi Pommarel

advk_pcie_wait_pio() can be called while holding a spinlock (from
pci_bus_read_config_dword()), then depends on jiffies in order to
timeout while polling on PIO state registers. In the case the PIO
transaction failed, the timeout will never happen and will also cause
the cpu to stall.

This decrements a variable and wait instead of using jiffies.

Signed-off-by: Remi Pommarel <repk@triplefau.lt>
---
 drivers/pci/controller/pci-aardvark.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index fc0fe4d4de49..1fa6d04ad7aa 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -175,7 +175,8 @@
 	(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn))	| \
 	 PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
 
-#define PIO_TIMEOUT_MS			1
+#define PIO_RETRY_CNT			10
+#define PIO_RETRY_DELAY			100 /* 100 us*/
 
 #define LINK_WAIT_MAX_RETRIES		10
 #define LINK_WAIT_USLEEP_MIN		90000
@@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
 static int advk_pcie_wait_pio(struct advk_pcie *pcie)
 {
 	struct device *dev = &pcie->pdev->dev;
-	unsigned long timeout;
+	size_t i;
 
-	timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS);
-
-	while (time_before(jiffies, timeout)) {
+	for (i = 0; i < PIO_RETRY_CNT; ++i) {
 		u32 start, isr;
 
 		start = advk_readl(pcie, PIO_START);
 		isr = advk_readl(pcie, PIO_ISR);
 		if (!start && isr)
 			return 0;
+		udelay(PIO_RETRY_DELAY);
 	}
 
 	dev_err(dev, "config read/write timed out\n");
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock
  2019-09-01 14:23 [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock Remi Pommarel
@ 2019-09-02 10:06 ` Lorenzo Pieralisi
  2019-09-02 10:50 ` Andrew Murray
  2019-09-25  9:33 ` Thomas Petazzoni
  2 siblings, 0 replies; 5+ messages in thread
From: Lorenzo Pieralisi @ 2019-09-02 10:06 UTC (permalink / raw)
  To: Remi Pommarel, Thomas Petazzoni
  Cc: Bjorn Helgaas, linux-pci, linux-arm-kernel, linux-kernel

On Sun, Sep 01, 2019 at 04:23:03PM +0200, Remi Pommarel wrote:
> advk_pcie_wait_pio() can be called while holding a spinlock (from
> pci_bus_read_config_dword()), then depends on jiffies in order to
> timeout while polling on PIO state registers. In the case the PIO
> transaction failed, the timeout will never happen and will also cause
> the cpu to stall.
> 
> This decrements a variable and wait instead of using jiffies.
> 
> Signed-off-by: Remi Pommarel <repk@triplefau.lt>
> ---
>  drivers/pci/controller/pci-aardvark.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)

Thomas, I would like to merge this patch and a couple of
others from Remi, may I ask you please to review them ?

https://patchwork.ozlabs.org/user/todo/linux-pci/?series=&submitter=67495&state=&q=&archive=

Thanks,
Lorenzo

> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index fc0fe4d4de49..1fa6d04ad7aa 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -175,7 +175,8 @@
>  	(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn))	| \
>  	 PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
>  
> -#define PIO_TIMEOUT_MS			1
> +#define PIO_RETRY_CNT			10
> +#define PIO_RETRY_DELAY			100 /* 100 us*/
>  
>  #define LINK_WAIT_MAX_RETRIES		10
>  #define LINK_WAIT_USLEEP_MIN		90000
> @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
>  static int advk_pcie_wait_pio(struct advk_pcie *pcie)
>  {
>  	struct device *dev = &pcie->pdev->dev;
> -	unsigned long timeout;
> +	size_t i;
>  
> -	timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS);
> -
> -	while (time_before(jiffies, timeout)) {
> +	for (i = 0; i < PIO_RETRY_CNT; ++i) {
>  		u32 start, isr;
>  
>  		start = advk_readl(pcie, PIO_START);
>  		isr = advk_readl(pcie, PIO_ISR);
>  		if (!start && isr)
>  			return 0;
> +		udelay(PIO_RETRY_DELAY);
>  	}
>  
>  	dev_err(dev, "config read/write timed out\n");
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock
  2019-09-01 14:23 [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock Remi Pommarel
  2019-09-02 10:06 ` Lorenzo Pieralisi
@ 2019-09-02 10:50 ` Andrew Murray
  2019-09-25  9:33 ` Thomas Petazzoni
  2 siblings, 0 replies; 5+ messages in thread
From: Andrew Murray @ 2019-09-02 10:50 UTC (permalink / raw)
  To: Remi Pommarel
  Cc: Thomas Petazzoni, Lorenzo Pieralisi, Bjorn Helgaas, linux-pci,
	linux-arm-kernel, linux-kernel

On Sun, Sep 01, 2019 at 04:23:03PM +0200, Remi Pommarel wrote:
> advk_pcie_wait_pio() can be called while holding a spinlock (from
> pci_bus_read_config_dword()), then depends on jiffies in order to
> timeout while polling on PIO state registers. In the case the PIO
> transaction failed, the timeout will never happen and will also cause
> the cpu to stall.
> 
> This decrements a variable and wait instead of using jiffies.
> 
> Signed-off-by: Remi Pommarel <repk@triplefau.lt>

Reviewed-by: Andrew Murray <andrew.murray@arm.com>

> ---
>  drivers/pci/controller/pci-aardvark.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index fc0fe4d4de49..1fa6d04ad7aa 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -175,7 +175,8 @@
>  	(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn))	| \
>  	 PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
>  
> -#define PIO_TIMEOUT_MS			1
> +#define PIO_RETRY_CNT			10
> +#define PIO_RETRY_DELAY			100 /* 100 us*/
>  
>  #define LINK_WAIT_MAX_RETRIES		10
>  #define LINK_WAIT_USLEEP_MIN		90000
> @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
>  static int advk_pcie_wait_pio(struct advk_pcie *pcie)
>  {
>  	struct device *dev = &pcie->pdev->dev;
> -	unsigned long timeout;
> +	size_t i;
>  
> -	timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS);
> -
> -	while (time_before(jiffies, timeout)) {
> +	for (i = 0; i < PIO_RETRY_CNT; ++i) {
>  		u32 start, isr;
>  
>  		start = advk_readl(pcie, PIO_START);
>  		isr = advk_readl(pcie, PIO_ISR);
>  		if (!start && isr)
>  			return 0;
> +		udelay(PIO_RETRY_DELAY);
>  	}
>  
>  	dev_err(dev, "config read/write timed out\n");
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock
  2019-09-01 14:23 [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock Remi Pommarel
  2019-09-02 10:06 ` Lorenzo Pieralisi
  2019-09-02 10:50 ` Andrew Murray
@ 2019-09-25  9:33 ` Thomas Petazzoni
  2019-09-27  8:25   ` Remi Pommarel
  2 siblings, 1 reply; 5+ messages in thread
From: Thomas Petazzoni @ 2019-09-25  9:33 UTC (permalink / raw)
  To: Remi Pommarel
  Cc: Lorenzo Pieralisi, Bjorn Helgaas, linux-pci, linux-arm-kernel,
	linux-kernel

Hello Remi,

Thanks for the patch, I have a few comments/questions below.

On Sun,  1 Sep 2019 16:23:03 +0200
Remi Pommarel <repk@triplefau.lt> wrote:

> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index fc0fe4d4de49..1fa6d04ad7aa 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -175,7 +175,8 @@
>  	(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn))	| \
>  	 PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
>  
> -#define PIO_TIMEOUT_MS			1
> +#define PIO_RETRY_CNT			10
> +#define PIO_RETRY_DELAY			100 /* 100 us*/
>  
>  #define LINK_WAIT_MAX_RETRIES		10
>  #define LINK_WAIT_USLEEP_MIN		90000
> @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
>  static int advk_pcie_wait_pio(struct advk_pcie *pcie)
>  {
>  	struct device *dev = &pcie->pdev->dev;
> -	unsigned long timeout;
> +	size_t i;

Is it common to use a size_t for a loop counter ?

>  
> -	timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS);
> -
> -	while (time_before(jiffies, timeout)) {
> +	for (i = 0; i < PIO_RETRY_CNT; ++i) {

I find it more common to use post-increment for loop counters rather
than pre-increment, but that's a really nitpick and I don't care much.

>  		u32 start, isr;
>  
>  		start = advk_readl(pcie, PIO_START);
>  		isr = advk_readl(pcie, PIO_ISR);
>  		if (!start && isr)
>  			return 0;
> +		udelay(PIO_RETRY_DELAY);

But the bigger issue is that this change causes a 100us delay at
*every* single PIO read or write operation.

Indeed, at the first iteration of the loop, the PIO operation has not
completed, so you will always hit the udelay(100) a first time, and
it's only at the second iteration of the loop that the PIO operation
has completed (for successful PIO operations of course, which don't hit
the timeout).

I took a measurement around wait_pio() with sched_clock before and
after the patch. Before the patch, I have measurements like this (in
nanoseconds):

[    1.562801] time = 6000
[    1.565310] time = 6000
[    1.567809] time = 6080
[    1.570327] time = 6080
[    1.572836] time = 6080
[    1.575339] time = 6080
[    1.577858] time = 2720
[    1.580366] time = 2720
[    1.582862] time = 6000
[    1.585377] time = 2720
[    1.587890] time = 2720
[    1.590393] time = 2720

So it takes a few microseconds for each PIO operation.

With your patch applied:

[    2.267291] time = 101680
[    2.270002] time = 100880
[    2.272852] time = 100800
[    2.275573] time = 100880
[    2.278285] time = 100800
[    2.281005] time = 100880
[    2.283722] time = 100800
[    2.286444] time = 100880
[    2.289264] time = 100880
[    2.291981] time = 100800
[    2.294690] time = 100800
[    2.297405] time = 100800

We're jumping to 100us for every PIO read/write operation. To be
honest, I don't know if this is very important, there are not that many
PIO operations, and they are not used in any performance hot path. But
I thought it was worth pointing out the additional delay caused by this
implementation change.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock
  2019-09-25  9:33 ` Thomas Petazzoni
@ 2019-09-27  8:25   ` Remi Pommarel
  0 siblings, 0 replies; 5+ messages in thread
From: Remi Pommarel @ 2019-09-27  8:25 UTC (permalink / raw)
  To: Thomas Petazzoni
  Cc: Lorenzo Pieralisi, Bjorn Helgaas, linux-pci, linux-arm-kernel,
	linux-kernel

Hi Thomas,

Thanks for the review.

On Wed, Sep 25, 2019 at 11:33:51AM +0200, Thomas Petazzoni wrote:
> Hello Remi,
> 
> Thanks for the patch, I have a few comments/questions below.
> 
> On Sun,  1 Sep 2019 16:23:03 +0200
> Remi Pommarel <repk@triplefau.lt> wrote:
> 
> > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> > index fc0fe4d4de49..1fa6d04ad7aa 100644
> > --- a/drivers/pci/controller/pci-aardvark.c
> > +++ b/drivers/pci/controller/pci-aardvark.c
> > @@ -175,7 +175,8 @@
> >  	(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn))	| \
> >  	 PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
> >  
> > -#define PIO_TIMEOUT_MS			1
> > +#define PIO_RETRY_CNT			10
> > +#define PIO_RETRY_DELAY			100 /* 100 us*/
> >  
> >  #define LINK_WAIT_MAX_RETRIES		10
> >  #define LINK_WAIT_USLEEP_MIN		90000
> > @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie)
> >  static int advk_pcie_wait_pio(struct advk_pcie *pcie)
> >  {
> >  	struct device *dev = &pcie->pdev->dev;
> > -	unsigned long timeout;
> > +	size_t i;
> 
> Is it common to use a size_t for a loop counter ?

It was for me but seem not to be used that much. I can change that to an
int.

> >  
> > -	timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS);
> > -
> > -	while (time_before(jiffies, timeout)) {
> > +	for (i = 0; i < PIO_RETRY_CNT; ++i) {
> 
> I find it more common to use post-increment for loop counters rather
> than pre-increment, but that's a really nitpick and I don't care much.
> 

Will change that to post-increment.

> >  		u32 start, isr;
> >  
> >  		start = advk_readl(pcie, PIO_START);
> >  		isr = advk_readl(pcie, PIO_ISR);
> >  		if (!start && isr)
> >  			return 0;
> > +		udelay(PIO_RETRY_DELAY);
> 
> But the bigger issue is that this change causes a 100us delay at
> *every* single PIO read or write operation.
> 
> Indeed, at the first iteration of the loop, the PIO operation has not
> completed, so you will always hit the udelay(100) a first time, and
> it's only at the second iteration of the loop that the PIO operation
> has completed (for successful PIO operations of course, which don't hit
> the timeout).
> 
> I took a measurement around wait_pio() with sched_clock before and
> after the patch. Before the patch, I have measurements like this (in
> nanoseconds):
> 
> [    1.562801] time = 6000
> [    1.565310] time = 6000
> [    1.567809] time = 6080
> [    1.570327] time = 6080
> [    1.572836] time = 6080
> [    1.575339] time = 6080
> [    1.577858] time = 2720
> [    1.580366] time = 2720
> [    1.582862] time = 6000
> [    1.585377] time = 2720
> [    1.587890] time = 2720
> [    1.590393] time = 2720
> 
> So it takes a few microseconds for each PIO operation.
> 
> With your patch applied:
> 
> [    2.267291] time = 101680
> [    2.270002] time = 100880
> [    2.272852] time = 100800
> [    2.275573] time = 100880
> [    2.278285] time = 100800
> [    2.281005] time = 100880
> [    2.283722] time = 100800
> [    2.286444] time = 100880
> [    2.289264] time = 100880
> [    2.291981] time = 100800
> [    2.294690] time = 100800
> [    2.297405] time = 100800
> 
> We're jumping to 100us for every PIO read/write operation. To be
> honest, I don't know if this is very important, there are not that many
> PIO operations, and they are not used in any performance hot path. But
> I thought it was worth pointing out the additional delay caused by this
> implementation change.

Good catch thanks for the measurements, will move to a 2us delay.

-- 
Remi

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-09-27  8:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-01 14:23 [PATCH] PCI: aardvark: Don't rely on jiffies while holding spinlock Remi Pommarel
2019-09-02 10:06 ` Lorenzo Pieralisi
2019-09-02 10:50 ` Andrew Murray
2019-09-25  9:33 ` Thomas Petazzoni
2019-09-27  8:25   ` Remi Pommarel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).