netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Spurious timeouts in mvmdio
@ 2013-12-02 15:15 Nicolas Schichan
  2013-12-03 12:23 ` Jason Cooper
  0 siblings, 1 reply; 19+ messages in thread
From: Nicolas Schichan @ 2013-12-02 15:15 UTC (permalink / raw)
  To: LKML, netdev, linux-arm-kernel
  Cc: David S. Miller, Florian Fainelli, Leigh Brown, Sebastian Hesselbarth


Hi,

During 3.13-rc1 testing, I have found out that the mvmdio driver would report 
timeouts on the kernel console:

[   11.011334] orion-mdio orion-mdio: Timeout: SMI busy for too long

The hardware is a MV88F6281 Kirkwood CPU. The mvmdio driver is using the irq 
line 46 (ge00_err).

I am inclined to believe that it is due to the fact that wait_event_timeout() 
is called with a timeout parameter of 1 jiffy in orion_mdio_wait_ready(). If 
the timer interrupt ticks right after calling wait_event_timeout(), we may end 
up spending much less time than MVMDIO_SMI_TIMEOUT (1 msec) in 
wait_event_timeout(), and as a result report a timeout as the MDIO access did 
not complete in such a short time.

As to how to fix this, I see two options (I don't know which one would be 
prefered):

- Option 1: always pass a timeout of at least 2 jiffy to wait_event_timeout().
- Option 2: switch to wait_event_hrtimeout().

I can provide patches for both options.

Regards,

-- 
Nicolas Schichan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-02 15:15 Spurious timeouts in mvmdio Nicolas Schichan
@ 2013-12-03 12:23 ` Jason Cooper
  2013-12-03 12:40   ` Russell King - ARM Linux
  2013-12-03 13:16   ` [PATCH] net: mvmdio: fix wait_event_timeout() being called with a 1 jiffy timeout Nicolas Schichan
  0 siblings, 2 replies; 19+ messages in thread
From: Jason Cooper @ 2013-12-03 12:23 UTC (permalink / raw)
  To: Nicolas Schichan
  Cc: LKML, netdev, linux-arm-kernel, Sebastian Hesselbarth,
	Leigh Brown, David S. Miller, Florian Fainelli

Nicolas,

Sorry for the delay, we spoke about this yesterday on irc, and
apparently we all thought the other person was going to respond.  oops
:(

On Mon, Dec 02, 2013 at 04:15:54PM +0100, Nicolas Schichan wrote:
> During 3.13-rc1 testing, I have found out that the mvmdio driver
> would report timeouts on the kernel console:
> 
> [   11.011334] orion-mdio orion-mdio: Timeout: SMI busy for too long
> 
> The hardware is a MV88F6281 Kirkwood CPU. The mvmdio driver is using
> the irq line 46 (ge00_err).
> 
> I am inclined to believe that it is due to the fact that
> wait_event_timeout() is called with a timeout parameter of 1 jiffy
> in orion_mdio_wait_ready(). If the timer interrupt ticks right after
> calling wait_event_timeout(), we may end up spending much less time
> than MVMDIO_SMI_TIMEOUT (1 msec) in wait_event_timeout(), and as a
> result report a timeout as the MDIO access did not complete in such
> a short time.
> 
> As to how to fix this, I see two options (I don't know which one
> would be prefered):
> 
> - Option 1: always pass a timeout of at least 2 jiffy to wait_event_timeout().
> - Option 2: switch to wait_event_hrtimeout().
> 
> I can provide patches for both options.

Based on yesterday's irc chat, option 1 sounds good.  Here's the dump
from yesterday where Sebastian provided a thorough explanation:

11:29 < shesselba> increasing max timeout to 2 ticks at least sounds reasonable
11:29 < shesselba> 10ms should be enough for every CONFIG_HZ there is

11:30 < kos_tom> why make the timeout tied to the ticks? there are functions/macros to convert real time numbers into ticks.
11:30 < kos_tom> msecs_to_jiffies() or something

11:31 < shesselba> kos_tom: it is already using usecs_to_jiffies()
11:31 < shesselba> the thing is: 1ms is less than a jiffy

11:33 < kos_tom> so it will wait one jiffy or a little bit more, no?

11:38 < shesselba> no, the spurious timeouts he is seeing come from (1) mvmdio gets jiffies close before the next tick, (2) wait_event_timeout is called with jiffies + timeout
11:39 < shesselba> with timeout << 1 jiffy
11:39 < shesselba> then (3) the next timer tick occurs
11:39 < shesselba> it will end up waiting less then a jiffy
11:40 < shesselba> IOW, increase timeout to be at least two jiffies (or 20ms for CONFIG_HZ=100)
11:41 < shesselba> originally, it was 100ms anyway

Looking forward to the patch!

thx,

Jason.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 12:23 ` Jason Cooper
@ 2013-12-03 12:40   ` Russell King - ARM Linux
  2013-12-03 13:43     ` Jason Cooper
  2013-12-03 20:57     ` Leigh Brown
  2013-12-03 13:16   ` [PATCH] net: mvmdio: fix wait_event_timeout() being called with a 1 jiffy timeout Nicolas Schichan
  1 sibling, 2 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2013-12-03 12:40 UTC (permalink / raw)
  To: Jason Cooper
  Cc: Nicolas Schichan, Leigh Brown, netdev, LKML, Florian Fainelli,
	David S. Miller, linux-arm-kernel, Sebastian Hesselbarth

On Tue, Dec 03, 2013 at 07:23:46AM -0500, Jason Cooper wrote:
> On Mon, Dec 02, 2013 at 04:15:54PM +0100, Nicolas Schichan wrote:
> > During 3.13-rc1 testing, I have found out that the mvmdio driver
> > would report timeouts on the kernel console:
> > 
> > [   11.011334] orion-mdio orion-mdio: Timeout: SMI busy for too long
> > 
> > The hardware is a MV88F6281 Kirkwood CPU. The mvmdio driver is using
> > the irq line 46 (ge00_err).
> > 
> > I am inclined to believe that it is due to the fact that
> > wait_event_timeout() is called with a timeout parameter of 1 jiffy
> > in orion_mdio_wait_ready(). If the timer interrupt ticks right after
> > calling wait_event_timeout(), we may end up spending much less time
> > than MVMDIO_SMI_TIMEOUT (1 msec) in wait_event_timeout(), and as a
> > result report a timeout as the MDIO access did not complete in such
> > a short time.
> > 
> > As to how to fix this, I see two options (I don't know which one
> > would be prefered):
> > 
> > - Option 1: always pass a timeout of at least 2 jiffy to wait_event_timeout().
> > - Option 2: switch to wait_event_hrtimeout().
> > 
> > I can provide patches for both options.
> 
> Based on yesterday's irc chat, option 1 sounds good.  Here's the dump
> from yesterday where Sebastian provided a thorough explanation:
> 
> 11:29 < shesselba> increasing max timeout to 2 ticks at least sounds reasonable
> 11:29 < shesselba> 10ms should be enough for every CONFIG_HZ there is
> 
> 11:30 < kos_tom> why make the timeout tied to the ticks? there are functions/macros to convert real time numbers into ticks.
> 11:30 < kos_tom> msecs_to_jiffies() or something
> 
> 11:31 < shesselba> kos_tom: it is already using usecs_to_jiffies()
> 11:31 < shesselba> the thing is: 1ms is less than a jiffy

Yes, and the kernels time conversion functions aren't stupid.  Let's
look at this function's implementation:

unsigned long usecs_to_jiffies(const unsigned int u)
{
        if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
                return MAX_JIFFY_OFFSET;
#if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ)
        return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
#elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC)
        return u * (HZ / USEC_PER_SEC);
#else
        return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
                >> USEC_TO_HZ_SHR32;
#endif
}

Now, assuming HZ=100 and USEC_PER_SEC=1000000, we will use:

	return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);

If you ask for 1us, this comes out as:

	return (1 + (1000000 / 100) - 1) / (1000000 / 100);

which is one jiffy.  So, for a requested 1us period, you're given a
1 jiffy interval, or 10ms.  For other (sensible) values:

        return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
                >> USEC_TO_HZ_SHR32;

gets used, which has a similar behaviour.

Now, depending on how you use this one jiffy interval, the thing to realise
is that with this kind of loop:

	timeout = jiffies + usecs_to_jiffies(1);
	do {
		something;
	} while (time_is_before_jiffies(timeout));

what this equates to is:

	} while (jiffies - timeout < 0);

What this means is that the loop breaks at jiffies = timeout, so it can
indeed timeout before one tick - within 0 to 10ms for HZ=100.  The problem
is not the usecs_to_jiffies(), it's with the implementation.

If you use time_is_before_eq_jiffies() instead, it will also loop if
jiffies == timeout, which will give you the additional safety margin -
meaning it will timeout after 10 to 20ms instead.

You may wish to consider coding this differently as well - if you have
the error interrupt, there's no need for this loop.  You only need the
loop if you're using usleep_range().  Note the return value of
wait_event_timeout() will tell you positively and correctly if the waited
condition succeeded or you timed out.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] net: mvmdio: fix wait_event_timeout() being called with a 1 jiffy timeout.
  2013-12-03 12:23 ` Jason Cooper
  2013-12-03 12:40   ` Russell King - ARM Linux
@ 2013-12-03 13:16   ` Nicolas Schichan
  2013-12-03 13:34     ` Russell King - ARM Linux
  1 sibling, 1 reply; 19+ messages in thread
From: Nicolas Schichan @ 2013-12-03 13:16 UTC (permalink / raw)
  To: linux-arm-kernel, netdev, linux-kernel, Jason Cooper, David S. Miller
  Cc: Florian Fainelli, Sebastian Hesselbarth, Leigh Brown, Nicolas Schichan

If a timer interrupt kicks right after the wait_event_timeout() call,
we may endup a reporting timeout before the timeout really occured.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
---
 drivers/net/ethernet/marvell/mvmdio.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvmdio.c b/drivers/net/ethernet/marvell/mvmdio.c
index 7354960..2814bbc 100644
--- a/drivers/net/ethernet/marvell/mvmdio.c
+++ b/drivers/net/ethernet/marvell/mvmdio.c
@@ -76,9 +76,19 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
 {
 	struct orion_mdio_dev *dev = bus->priv;
 	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
-	unsigned long end = jiffies + timeout;
+	unsigned long end;
 	int timedout = 0;
 
+	/*
+	 * make sure that we wait at least more than one
+	 * jiffy. wait_event_timeout() with a timeout parameter of 1
+	 * jiffy may return before the SMI access is done if a timer
+	 * interrupt kicks immediately after.
+	 */
+	if (timeout < 2)
+		timeout = 2;
+	end = jiffies + timeout;
+
 	while (1) {
 	        if (orion_mdio_smi_is_done(dev))
 			return 0;
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] net: mvmdio: fix wait_event_timeout() being called with a 1 jiffy timeout.
  2013-12-03 13:16   ` [PATCH] net: mvmdio: fix wait_event_timeout() being called with a 1 jiffy timeout Nicolas Schichan
@ 2013-12-03 13:34     ` Russell King - ARM Linux
  0 siblings, 0 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2013-12-03 13:34 UTC (permalink / raw)
  To: Nicolas Schichan
  Cc: linux-arm-kernel, netdev, linux-kernel, Jason Cooper,
	David S. Miller, Leigh Brown, Florian Fainelli,
	Sebastian Hesselbarth

On Tue, Dec 03, 2013 at 02:16:00PM +0100, Nicolas Schichan wrote:
> If a timer interrupt kicks right after the wait_event_timeout() call,
> we may endup a reporting timeout before the timeout really occured.

Please see my reply to Jason Cooper about this in the thread
"Spurious timeouts in mvmdio".

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 12:40   ` Russell King - ARM Linux
@ 2013-12-03 13:43     ` Jason Cooper
  2013-12-03 18:48       ` Nicolas Schichan
  2013-12-03 20:57     ` Leigh Brown
  1 sibling, 1 reply; 19+ messages in thread
From: Jason Cooper @ 2013-12-03 13:43 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Leigh Brown, Nicolas Schichan, netdev, LKML, Florian Fainelli,
	David S. Miller, linux-arm-kernel, Sebastian Hesselbarth

On Tue, Dec 03, 2013 at 12:40:34PM +0000, Russell King - ARM Linux wrote:
> On Tue, Dec 03, 2013 at 07:23:46AM -0500, Jason Cooper wrote:
> > On Mon, Dec 02, 2013 at 04:15:54PM +0100, Nicolas Schichan wrote:
> > > During 3.13-rc1 testing, I have found out that the mvmdio driver
> > > would report timeouts on the kernel console:
> > > 
> > > [   11.011334] orion-mdio orion-mdio: Timeout: SMI busy for too long
> > > 
> > > The hardware is a MV88F6281 Kirkwood CPU. The mvmdio driver is using
> > > the irq line 46 (ge00_err).
> > > 
> > > I am inclined to believe that it is due to the fact that
> > > wait_event_timeout() is called with a timeout parameter of 1 jiffy
> > > in orion_mdio_wait_ready(). If the timer interrupt ticks right after
> > > calling wait_event_timeout(), we may end up spending much less time
> > > than MVMDIO_SMI_TIMEOUT (1 msec) in wait_event_timeout(), and as a
> > > result report a timeout as the MDIO access did not complete in such
> > > a short time.
> > > 
> > > As to how to fix this, I see two options (I don't know which one
> > > would be prefered):
> > > 
> > > - Option 1: always pass a timeout of at least 2 jiffy to wait_event_timeout().
> > > - Option 2: switch to wait_event_hrtimeout().
> > > 
> > > I can provide patches for both options.
> > 
> > Based on yesterday's irc chat, option 1 sounds good.  Here's the dump
> > from yesterday where Sebastian provided a thorough explanation:
> > 
> > 11:29 < shesselba> increasing max timeout to 2 ticks at least sounds reasonable
> > 11:29 < shesselba> 10ms should be enough for every CONFIG_HZ there is
> > 
> > 11:30 < kos_tom> why make the timeout tied to the ticks? there are functions/macros to convert real time numbers into ticks.
> > 11:30 < kos_tom> msecs_to_jiffies() or something
> > 
> > 11:31 < shesselba> kos_tom: it is already using usecs_to_jiffies()
> > 11:31 < shesselba> the thing is: 1ms is less than a jiffy
> 
> Yes, and the kernels time conversion functions aren't stupid.  Let's
> look at this function's implementation:
> 
> unsigned long usecs_to_jiffies(const unsigned int u)
> {
>         if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
>                 return MAX_JIFFY_OFFSET;
> #if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ)
>         return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
> #elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC)
>         return u * (HZ / USEC_PER_SEC);
> #else
>         return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>                 >> USEC_TO_HZ_SHR32;
> #endif
> }
> 
> Now, assuming HZ=100 and USEC_PER_SEC=1000000, we will use:
> 
> 	return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
> 
> If you ask for 1us, this comes out as:
> 
> 	return (1 + (1000000 / 100) - 1) / (1000000 / 100);
> 
> which is one jiffy.  So, for a requested 1us period, you're given a
> 1 jiffy interval, or 10ms.  For other (sensible) values:
> 
>         return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>                 >> USEC_TO_HZ_SHR32;
> 
> gets used, which has a similar behaviour.
> 
> Now, depending on how you use this one jiffy interval, the thing to realise
> is that with this kind of loop:
> 
> 	timeout = jiffies + usecs_to_jiffies(1);
> 	do {
> 		something;
> 	} while (time_is_before_jiffies(timeout));
> 
> what this equates to is:
> 
> 	} while (jiffies - timeout < 0);
> 
> What this means is that the loop breaks at jiffies = timeout, so it can
> indeed timeout before one tick - within 0 to 10ms for HZ=100.  The problem
> is not the usecs_to_jiffies(), it's with the implementation.

Ack.

> If you use time_is_before_eq_jiffies() instead, it will also loop if
> jiffies == timeout, which will give you the additional safety margin -
> meaning it will timeout after 10 to 20ms instead.
> 
> You may wish to consider coding this differently as well - if you have
> the error interrupt, there's no need for this loop.  You only need the
> loop if you're using usleep_range().  Note the return value of
> wait_event_timeout() will tell you positively and correctly if the waited
> condition succeeded or you timed out.

Nicolas, sorry for the confusion.  Mind spinning a v2?

thx,

Jason.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 13:43     ` Jason Cooper
@ 2013-12-03 18:48       ` Nicolas Schichan
  0 siblings, 0 replies; 19+ messages in thread
From: Nicolas Schichan @ 2013-12-03 18:48 UTC (permalink / raw)
  To: Jason Cooper, Russell King - ARM Linux
  Cc: Leigh Brown, netdev, LKML, Florian Fainelli, David S. Miller,
	linux-arm-kernel, Sebastian Hesselbarth

On 12/03/2013 02:43 PM, Jason Cooper wrote:
> On Tue, Dec 03, 2013 at 12:40:34PM +0000, Russell King - ARM Linux wrote:
>> On Tue, Dec 03, 2013 at 07:23:46AM -0500, Jason Cooper wrote:
>>> On Mon, Dec 02, 2013 at 04:15:54PM +0100, Nicolas Schichan wrote:
>>>> During 3.13-rc1 testing, I have found out that the mvmdio driver
>>>> would report timeouts on the kernel console:
>>>>
>>>> [   11.011334] orion-mdio orion-mdio: Timeout: SMI busy for too long
>>>>
>>>> The hardware is a MV88F6281 Kirkwood CPU. The mvmdio driver is using
>>>> the irq line 46 (ge00_err).
>>>>
>>>> I am inclined to believe that it is due to the fact that
>>>> wait_event_timeout() is called with a timeout parameter of 1 jiffy
>>>> in orion_mdio_wait_ready(). If the timer interrupt ticks right after
>>>> calling wait_event_timeout(), we may end up spending much less time
>>>> than MVMDIO_SMI_TIMEOUT (1 msec) in wait_event_timeout(), and as a
>>>> result report a timeout as the MDIO access did not complete in such
>>>> a short time.
>>>>
>>>> As to how to fix this, I see two options (I don't know which one
>>>> would be prefered):
>>>>
>>>> - Option 1: always pass a timeout of at least 2 jiffy to wait_event_timeout().
>>>> - Option 2: switch to wait_event_hrtimeout().
>>>>
>>>> I can provide patches for both options.
>>>
>>> Based on yesterday's irc chat, option 1 sounds good.  Here's the dump
>>> from yesterday where Sebastian provided a thorough explanation:
>>>
>>> 11:29 < shesselba> increasing max timeout to 2 ticks at least sounds reasonable
>>> 11:29 < shesselba> 10ms should be enough for every CONFIG_HZ there is
>>>
>>> 11:30 < kos_tom> why make the timeout tied to the ticks? there are functions/macros to convert real time numbers into ticks.
>>> 11:30 < kos_tom> msecs_to_jiffies() or something
>>>
>>> 11:31 < shesselba> kos_tom: it is already using usecs_to_jiffies()
>>> 11:31 < shesselba> the thing is: 1ms is less than a jiffy
>>
>> Yes, and the kernels time conversion functions aren't stupid.  Let's
>> look at this function's implementation:
>>
>> unsigned long usecs_to_jiffies(const unsigned int u)
>> {
>>          if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
>>                  return MAX_JIFFY_OFFSET;
>> #if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ)
>>          return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
>> #elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC)
>>          return u * (HZ / USEC_PER_SEC);
>> #else
>>          return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>>                  >> USEC_TO_HZ_SHR32;
>> #endif
>> }
>>
>> Now, assuming HZ=100 and USEC_PER_SEC=1000000, we will use:
>>
>> 	return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
>>
>> If you ask for 1us, this comes out as:
>>
>> 	return (1 + (1000000 / 100) - 1) / (1000000 / 100);
>>
>> which is one jiffy.  So, for a requested 1us period, you're given a
>> 1 jiffy interval, or 10ms.  For other (sensible) values:
>>
>>          return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>>                  >> USEC_TO_HZ_SHR32;
>>
>> gets used, which has a similar behaviour.
>>
>> Now, depending on how you use this one jiffy interval, the thing to realise
>> is that with this kind of loop:
>>
>> 	timeout = jiffies + usecs_to_jiffies(1);
>> 	do {
>> 		something;
>> 	} while (time_is_before_jiffies(timeout));
>>
>> what this equates to is:
>>
>> 	} while (jiffies - timeout < 0);
>>
>> What this means is that the loop breaks at jiffies = timeout, so it can
>> indeed timeout before one tick - within 0 to 10ms for HZ=100.  The problem
>> is not the usecs_to_jiffies(), it's with the implementation.
>
> Ack.
>
>> If you use time_is_before_eq_jiffies() instead, it will also loop if
>> jiffies == timeout, which will give you the additional safety margin -
>> meaning it will timeout after 10 to 20ms instead.
>>
>> You may wish to consider coding this differently as well - if you have
>> the error interrupt, there's no need for this loop.  You only need the
>> loop if you're using usleep_range().  Note the return value of
>> wait_event_timeout() will tell you positively and correctly if the waited
>> condition succeeded or you timed out.
>
> Nicolas, sorry for the confusion.  Mind spinning a v2?

Sure, I'll respin a V2 of the patch with the following:

- loop only when using polling mode.
- set timeout given to wait_event_timeout() to at least 2
- use the return value of wait_event_timeout to check if condition was met or not.

As for the time_is_before_jiffies() use, when end == jiffies, (end - jiffies < 
0) is false, so we'll stay in the loop for one more jiffy so I guess the code 
is Ok in that regard (and as expected I get SMI timeouts in poll mode when I 
replace time_is_before_jiffies() with time_is_before_eq_jiffies()).

By the way time_is_before_jiffies(timeout) does not expand to (jiffies - 
timeout < 0). I have the following:

time_is_before_jiffies(timeout) -> time_after(jiffies, timeout)
time_after(jiffies, timeout) ->  (timeout - jiffies < 0)

Regards,

-- 
Nicolas Schichan
Freebox SAS

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 12:40   ` Russell King - ARM Linux
  2013-12-03 13:43     ` Jason Cooper
@ 2013-12-03 20:57     ` Leigh Brown
  2013-12-03 22:45       ` Sebastian Hesselbarth
  1 sibling, 1 reply; 19+ messages in thread
From: Leigh Brown @ 2013-12-03 20:57 UTC (permalink / raw)
  To: Russell King - ARM Linux, Nicolas Schichan
  Cc: Jason Cooper, netdev, LKML, Florian Fainelli, David S. Miller,
	linux-arm-kernel, Sebastian Hesselbarth

Hi Russell and Nicolas,

Apologies for taking so long to respond to this thread.

On 2013-12-03 12:40, Russell King - ARM Linux wrote:
> On Tue, Dec 03, 2013 at 07:23:46AM -0500, Jason Cooper wrote:
>> On Mon, Dec 02, 2013 at 04:15:54PM +0100, Nicolas Schichan wrote:
[...]
>> 11:31 < shesselba> kos_tom: it is already using usecs_to_jiffies()
>> 11:31 < shesselba> the thing is: 1ms is less than a jiffy
> 
> Yes, and the kernels time conversion functions aren't stupid.  Let's
> look at this function's implementation:
> 
> unsigned long usecs_to_jiffies(const unsigned int u)
> {
>         if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
>                 return MAX_JIFFY_OFFSET;
> #if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ)
>         return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
> #elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC)
>         return u * (HZ / USEC_PER_SEC);
> #else
>         return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>                 >> USEC_TO_HZ_SHR32;
> #endif
> }
> 
> Now, assuming HZ=100 and USEC_PER_SEC=1000000, we will use:
> 
> 	return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
> 
> If you ask for 1us, this comes out as:
> 
> 	return (1 + (1000000 / 100) - 1) / (1000000 / 100);
> 
> which is one jiffy.  So, for a requested 1us period, you're given a
> 1 jiffy interval, or 10ms.  For other (sensible) values:
> 
>         return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>                 >> USEC_TO_HZ_SHR32;
> 
> gets used, which has a similar behaviour.
> 
> Now, depending on how you use this one jiffy interval, the thing to 
> realise
> is that with this kind of loop:
> 
> 	timeout = jiffies + usecs_to_jiffies(1);
> 	do {
> 		something;
> 	} while (time_is_before_jiffies(timeout));
> 
> what this equates to is:
> 
> 	} while (jiffies - timeout < 0);

You've got that last line the wrong way round, but I understand what you 
are
getting at.

> What this means is that the loop breaks at jiffies = timeout, so it can
> indeed timeout before one tick - within 0 to 10ms for HZ=100.  The 
> problem
> is not the usecs_to_jiffies(), it's with the implementation.

> If you use time_is_before_eq_jiffies() instead, it will also loop if
> jiffies == timeout, which will give you the additional safety margin -
> meaning it will timeout after 10 to 20ms instead.

The code in question uses the logic in reverse so it *exits* if
time_is_before_jiffies(end) is true.  In other words it exits if 
"jiffies" is
greater than "end".  Since, as you say, usecs_to_jiffies(somevalue) will 
be a
minimum of 1, that means it will always loop at least twice.  So, I 
think it's
doing what you suggest, but in a different way, when polling.

> You may wish to consider coding this differently as well - if you have
> the error interrupt, there's no need for this loop.  You only need the
> loop if you're using usleep_range().  Note the return value of
> wait_event_timeout() will tell you positively and correctly if the 
> waited
> condition succeeded or you timed out.

Although the the wait_event_timeout is in the loop, it always exits due 
to
setting timedout to 1.  This was done to make the code smaller but I was
probably trying to be cleverer than I should have been.

So, given the above I believe the polling code is correct.  My mistake 
was to
assume that wait_event_timeout() guarantees a minimum of 1 jiffie wait.

Nicolas' patch should fix the issue, but I prefer the following as it is 
more
correct, as it only adjusts the timeout when calling 
wait_event_timeout().  As
I said above,I believe the polling code is correct.

diff --git a/drivers/net/ethernet/marvell/mvmdio.c 
b/drivers/net/ethernet/marvell/mvmdio.c
index 7354960..b187c08 100644
--- a/drivers/net/ethernet/marvell/mvmdio.c
+++ b/drivers/net/ethernet/marvell/mvmdio.c
@@ -92,6 +92,14 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
  			if (time_is_before_jiffies(end))
  				++timedout;
  	        } else {
+			/*
+			 * wait_event_timeout does not guarantee a delay of at
+			 * least one whole jiffie, so timeout must be no less
+			 * than two.
+			 */
+			if (timeout < 2)
+				timeout = 2;
+
  			wait_event_timeout(dev->smi_busy_wait,
  				           orion_mdio_smi_is_done(dev),
  				           timeout);

Nicolas - does the above also fix your issue?  I will also test it on my 
Dreamplug.

Apologies for causing this regression.

Regards,

Leigh.

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 20:57     ` Leigh Brown
@ 2013-12-03 22:45       ` Sebastian Hesselbarth
  2013-12-03 23:20         ` Leigh Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Hesselbarth @ 2013-12-03 22:45 UTC (permalink / raw)
  To: Leigh Brown, Russell King - ARM Linux, Nicolas Schichan
  Cc: Jason Cooper, netdev, LKML, Florian Fainelli, David S. Miller,
	linux-arm-kernel

On 12/03/2013 09:57 PM, Leigh Brown wrote:
> Hi Russell and Nicolas,
>
> Apologies for taking so long to respond to this thread.
>
> On 2013-12-03 12:40, Russell King - ARM Linux wrote:
>> On Tue, Dec 03, 2013 at 07:23:46AM -0500, Jason Cooper wrote:
>>> On Mon, Dec 02, 2013 at 04:15:54PM +0100, Nicolas Schichan wrote:
> [...]
>>> 11:31 < shesselba> kos_tom: it is already using usecs_to_jiffies()
>>> 11:31 < shesselba> the thing is: 1ms is less than a jiffy
>>
>> Yes, and the kernels time conversion functions aren't stupid.  Let's
>> look at this function's implementation:
>>
>> unsigned long usecs_to_jiffies(const unsigned int u)
>> {
>>         if (u > jiffies_to_usecs(MAX_JIFFY_OFFSET))
>>                 return MAX_JIFFY_OFFSET;
>> #if HZ <= USEC_PER_SEC && !(USEC_PER_SEC % HZ)
>>         return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
>> #elif HZ > USEC_PER_SEC && !(HZ % USEC_PER_SEC)
>>         return u * (HZ / USEC_PER_SEC);
>> #else
>>         return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>>                 >> USEC_TO_HZ_SHR32;
>> #endif
>> }
>>
>> Now, assuming HZ=100 and USEC_PER_SEC=1000000, we will use:
>>
>>     return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
>>
>> If you ask for 1us, this comes out as:
>>
>>     return (1 + (1000000 / 100) - 1) / (1000000 / 100);
>>
>> which is one jiffy.  So, for a requested 1us period, you're given a
>> 1 jiffy interval, or 10ms.  For other (sensible) values:
>>
>>         return (USEC_TO_HZ_MUL32 * u + USEC_TO_HZ_ADJ32)
>>                 >> USEC_TO_HZ_SHR32;
>>
>> gets used, which has a similar behaviour.
>>
>> Now, depending on how you use this one jiffy interval, the thing to
>> realise
>> is that with this kind of loop:
>>
>>     timeout = jiffies + usecs_to_jiffies(1);
>>     do {
>>         something;
>>     } while (time_is_before_jiffies(timeout));
>>
>> what this equates to is:
>>
>>     } while (jiffies - timeout < 0);
>
> You've got that last line the wrong way round, but I understand what you
> are
> getting at.
>
>> What this means is that the loop breaks at jiffies = timeout, so it can
>> indeed timeout before one tick - within 0 to 10ms for HZ=100.  The
>> problem
>> is not the usecs_to_jiffies(), it's with the implementation.
>
>> If you use time_is_before_eq_jiffies() instead, it will also loop if
>> jiffies == timeout, which will give you the additional safety margin -
>> meaning it will timeout after 10 to 20ms instead.
>
> The code in question uses the logic in reverse so it *exits* if
> time_is_before_jiffies(end) is true.  In other words it exits if
> "jiffies" is
> greater than "end".  Since, as you say, usecs_to_jiffies(somevalue) will
> be a
> minimum of 1, that means it will always loop at least twice.  So, I
> think it's
> doing what you suggest, but in a different way, when polling.
>
>> You may wish to consider coding this differently as well - if you have
>> the error interrupt, there's no need for this loop.  You only need the
>> loop if you're using usleep_range().  Note the return value of
>> wait_event_timeout() will tell you positively and correctly if the waited
>> condition succeeded or you timed out.
>
> Although the the wait_event_timeout is in the loop, it always exits due to
> setting timedout to 1.  This was done to make the code smaller but I was
> probably trying to be cleverer than I should have been.
>
> So, given the above I believe the polling code is correct.  My mistake
> was to
> assume that wait_event_timeout() guarantees a minimum of 1 jiffie wait.
>
> Nicolas' patch should fix the issue, but I prefer the following as it is
> more
> correct, as it only adjusts the timeout when calling
> wait_event_timeout().  As
> I said above,I believe the polling code is correct.
>
> diff --git a/drivers/net/ethernet/marvell/mvmdio.c
> b/drivers/net/ethernet/marvell/mvmdio.c
> index 7354960..b187c08 100644
> --- a/drivers/net/ethernet/marvell/mvmdio.c
> +++ b/drivers/net/ethernet/marvell/mvmdio.c
> @@ -92,6 +92,14 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
>               if (time_is_before_jiffies(end))
>                   ++timedout;
>               } else {
> +            /*
> +             * wait_event_timeout does not guarantee a delay of at
> +             * least one whole jiffie, so timeout must be no less
> +             * than two.
> +             */
> +            if (timeout < 2)
> +                timeout = 2;

If you always want to wait at least two jiffies, why not just increase
TIMEOUT makro to 20ms instead of messing here with it again?
As said on IRC log above, originally timeout was 100ms.

Sebastian

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 23:20         ` Leigh Brown
@ 2013-12-03 23:17           ` Sebastian Hesselbarth
  2013-12-03 23:38             ` Leigh Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Hesselbarth @ 2013-12-03 23:17 UTC (permalink / raw)
  To: Leigh Brown
  Cc: Russell King - ARM Linux, Nicolas Schichan, Jason Cooper, netdev,
	LKML, Florian Fainelli, David S. Miller, linux-arm-kernel

On 12/04/2013 12:20 AM, Leigh Brown wrote:
> On 2013-12-03 22:45, Sebastian Hesselbarth wrote:
>> On 12/03/2013 09:57 PM, Leigh Brown wrote:
> [...]
>>> Nicolas' patch should fix the issue, but I prefer the following as it is
>>> more
>>> correct, as it only adjusts the timeout when calling
>>> wait_event_timeout().  As
>>> I said above,I believe the polling code is correct.
>>>
>>> diff --git a/drivers/net/ethernet/marvell/mvmdio.c
>>> b/drivers/net/ethernet/marvell/mvmdio.c
>>> index 7354960..b187c08 100644
>>> --- a/drivers/net/ethernet/marvell/mvmdio.c
>>> +++ b/drivers/net/ethernet/marvell/mvmdio.c
>>> @@ -92,6 +92,14 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
>>>               if (time_is_before_jiffies(end))
>>>                   ++timedout;
>>>               } else {
>>> +            /*
>>> +             * wait_event_timeout does not guarantee a delay of at
>>> +             * least one whole jiffie, so timeout must be no less
>>> +             * than two.
>>> +             */
>>> +            if (timeout < 2)
>>> +                timeout = 2;
>>
>> If you always want to wait at least two jiffies, why not just increase
>> TIMEOUT makro to 20ms instead of messing here with it again?
>> As said on IRC log above, originally timeout was 100ms.
>
> You could do that, but would you not feel bad leaving a latent bug in
> the code?
> I know it's unlikely that someone would set HZ to 50, but if they did,
> the same
> bug would appear again.

If you want to ensure timeout > 2, why not then just use:

- 	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
+ 	unsigned long timeout = 1 + usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 22:45       ` Sebastian Hesselbarth
@ 2013-12-03 23:20         ` Leigh Brown
  2013-12-03 23:17           ` Sebastian Hesselbarth
  0 siblings, 1 reply; 19+ messages in thread
From: Leigh Brown @ 2013-12-03 23:20 UTC (permalink / raw)
  To: Sebastian Hesselbarth
  Cc: Russell King - ARM Linux, Nicolas Schichan, Jason Cooper, netdev,
	LKML, Florian Fainelli, David S. Miller, linux-arm-kernel

On 2013-12-03 22:45, Sebastian Hesselbarth wrote:
> On 12/03/2013 09:57 PM, Leigh Brown wrote:
[...]
>> Nicolas' patch should fix the issue, but I prefer the following as it 
>> is
>> more
>> correct, as it only adjusts the timeout when calling
>> wait_event_timeout().  As
>> I said above,I believe the polling code is correct.
>> 
>> diff --git a/drivers/net/ethernet/marvell/mvmdio.c
>> b/drivers/net/ethernet/marvell/mvmdio.c
>> index 7354960..b187c08 100644
>> --- a/drivers/net/ethernet/marvell/mvmdio.c
>> +++ b/drivers/net/ethernet/marvell/mvmdio.c
>> @@ -92,6 +92,14 @@ static int orion_mdio_wait_ready(struct mii_bus 
>> *bus)
>>               if (time_is_before_jiffies(end))
>>                   ++timedout;
>>               } else {
>> +            /*
>> +             * wait_event_timeout does not guarantee a delay of at
>> +             * least one whole jiffie, so timeout must be no less
>> +             * than two.
>> +             */
>> +            if (timeout < 2)
>> +                timeout = 2;
> 
> If you always want to wait at least two jiffies, why not just increase
> TIMEOUT makro to 20ms instead of messing here with it again?
> As said on IRC log above, originally timeout was 100ms.
> 
> Sebastian

You could do that, but would you not feel bad leaving a latent bug in 
the code?
I know it's unlikely that someone would set HZ to 50, but if they did, 
the same
bug would appear again.

Regards,

Leigh.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 23:17           ` Sebastian Hesselbarth
@ 2013-12-03 23:38             ` Leigh Brown
  2013-12-03 23:42               ` Russell King - ARM Linux
  2013-12-03 23:45               ` Sebastian Hesselbarth
  0 siblings, 2 replies; 19+ messages in thread
From: Leigh Brown @ 2013-12-03 23:38 UTC (permalink / raw)
  To: Sebastian Hesselbarth
  Cc: Russell King - ARM Linux, Nicolas Schichan, Jason Cooper, netdev,
	LKML, Florian Fainelli, David S. Miller, linux-arm-kernel

On 2013-12-03 23:17, Sebastian Hesselbarth wrote:
> On 12/04/2013 12:20 AM, Leigh Brown wrote:
>> On 2013-12-03 22:45, Sebastian Hesselbarth wrote:
>>> On 12/03/2013 09:57 PM, Leigh Brown wrote:
>> [...]
>>>> Nicolas' patch should fix the issue, but I prefer the following as 
>>>> it is
>>>> more
>>>> correct, as it only adjusts the timeout when calling
>>>> wait_event_timeout().  As
>>>> I said above,I believe the polling code is correct.
>>>> 
>>>> diff --git a/drivers/net/ethernet/marvell/mvmdio.c
>>>> b/drivers/net/ethernet/marvell/mvmdio.c
>>>> index 7354960..b187c08 100644
>>>> --- a/drivers/net/ethernet/marvell/mvmdio.c
>>>> +++ b/drivers/net/ethernet/marvell/mvmdio.c
>>>> @@ -92,6 +92,14 @@ static int orion_mdio_wait_ready(struct mii_bus 
>>>> *bus)
>>>>               if (time_is_before_jiffies(end))
>>>>                   ++timedout;
>>>>               } else {
>>>> +            /*
>>>> +             * wait_event_timeout does not guarantee a delay of at
>>>> +             * least one whole jiffie, so timeout must be no less
>>>> +             * than two.
>>>> +             */
>>>> +            if (timeout < 2)
>>>> +                timeout = 2;
>>> 
>>> If you always want to wait at least two jiffies, why not just 
>>> increase
>>> TIMEOUT makro to 20ms instead of messing here with it again?
>>> As said on IRC log above, originally timeout was 100ms.
>> 
>> You could do that, but would you not feel bad leaving a latent bug in
>> the code?
>> I know it's unlikely that someone would set HZ to 50, but if they did,
>> the same
>> bug would appear again.
> 
> If you want to ensure timeout > 2, why not then just use:
> 
> - 	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
> + 	unsigned long timeout = 1 + usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);

This will make it correct when using interrupts but it will make the 
loop wait one
jiffie longer than it should when polling.

I think I might be being a little to anal about this, but I guess what 
you could do
is make the above change, then replace the call to 
time_is_before_jiffies() with
time_is_before_eq_jiffies() so that the behaviour when polling and when 
using
interrupts is the same.

diff --git a/drivers/net/ethernet/marvell/mvmdio.c 
b/drivers/net/ethernet/marvell/mvmdio.c
index 7354960..48b8ded 100644
--- a/drivers/net/ethernet/marvell/mvmdio.c
+++ b/drivers/net/ethernet/marvell/mvmdio.c
@@ -75,7 +75,7 @@ static int orion_mdio_smi_is_done(struct 
orion_mdio_dev *dev)
  static int orion_mdio_wait_ready(struct mii_bus *bus)
  {
  	struct orion_mdio_dev *dev = bus->priv;
-	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
+	unsigned long timeout = 1 + usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
  	unsigned long end = jiffies + timeout;
  	int timedout = 0;

@@ -89,7 +89,7 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
  			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
  				     MVMDIO_SMI_POLL_INTERVAL_MAX);

-			if (time_is_before_jiffies(end))
+			if (time_is_before_eq_jiffies(end))
  				++timedout;
  	        } else {
  			wait_event_timeout(dev->smi_busy_wait,

Regards,

Leigh.

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 23:38             ` Leigh Brown
@ 2013-12-03 23:42               ` Russell King - ARM Linux
  2013-12-04 11:40                 ` Nicolas Schichan
  2013-12-16 18:07                 ` Nicolas Schichan
  2013-12-03 23:45               ` Sebastian Hesselbarth
  1 sibling, 2 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2013-12-03 23:42 UTC (permalink / raw)
  To: Leigh Brown
  Cc: Sebastian Hesselbarth, Nicolas Schichan, Jason Cooper, netdev,
	LKML, Florian Fainelli, David S. Miller, linux-arm-kernel

On Tue, Dec 03, 2013 at 11:38:23PM +0000, Leigh Brown wrote:
> On 2013-12-03 23:17, Sebastian Hesselbarth wrote:
>> If you want to ensure timeout > 2, why not then just use:
>>
>> - 	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
>> + 	unsigned long timeout = 1 + usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
>
> This will make it correct when using interrupts but it will make the  
> loop wait one jiffie longer than it should when polling.

Alternatively, code it like this instead.

 drivers/net/ethernet/marvell/mvmdio.c |   32 +++++++++++++++-----------------
 1 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvmdio.c b/drivers/net/ethernet/marvell/mvmdio.c
index 7354960b583b..a6f59831fc5b 100644
--- a/drivers/net/ethernet/marvell/mvmdio.c
+++ b/drivers/net/ethernet/marvell/mvmdio.c
@@ -76,31 +76,29 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
 {
 	struct orion_mdio_dev *dev = bus->priv;
 	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
-	unsigned long end = jiffies + timeout;
-	int timedout = 0;
 
-	while (1) {
-	        if (orion_mdio_smi_is_done(dev))
+	if (dev->err_interrupt > 0) {
+		if (wait_event_timeout(dev->smi_busy_wait,
+				       orion_mdio_smi_is_done(dev),
+				       1 + timeout))
 			return 0;
-	        else if (timedout)
-			break;
+	} else {
+		unsigned long end = jiffies + timeout;
 
-	        if (dev->err_interrupt <= 0) {
-			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
-				     MVMDIO_SMI_POLL_INTERVAL_MAX);
+		while (1) {
+	        	if (orion_mdio_smi_is_done(dev))
+				return 0;
 
 			if (time_is_before_jiffies(end))
-				++timedout;
-	        } else {
-			wait_event_timeout(dev->smi_busy_wait,
-				           orion_mdio_smi_is_done(dev),
-				           timeout);
-
-			++timedout;
-	        }
+				break;
+
+			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
+				     MVMDIO_SMI_POLL_INTERVAL_MAX);
+		}
 	}
 
 	dev_err(bus->parent, "Timeout: SMI busy for too long\n");
+
 	return  -ETIMEDOUT;
 }
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 23:38             ` Leigh Brown
  2013-12-03 23:42               ` Russell King - ARM Linux
@ 2013-12-03 23:45               ` Sebastian Hesselbarth
  1 sibling, 0 replies; 19+ messages in thread
From: Sebastian Hesselbarth @ 2013-12-03 23:45 UTC (permalink / raw)
  To: Leigh Brown
  Cc: Russell King - ARM Linux, Jason Cooper, Nicolas Schichan, netdev,
	LKML, David S. Miller, Florian Fainelli, linux-arm-kernel

On 12/04/2013 12:38 AM, Leigh Brown wrote:
> On 2013-12-03 23:17, Sebastian Hesselbarth wrote:
>> On 12/04/2013 12:20 AM, Leigh Brown wrote:
>>> On 2013-12-03 22:45, Sebastian Hesselbarth wrote:
>>>> On 12/03/2013 09:57 PM, Leigh Brown wrote:
>>> [...]
>>>>> Nicolas' patch should fix the issue, but I prefer the following as
>>>>> it is
>>>>> more
>>>>> correct, as it only adjusts the timeout when calling
>>>>> wait_event_timeout().  As
>>>>> I said above,I believe the polling code is correct.
>>>>>
>>>>> diff --git a/drivers/net/ethernet/marvell/mvmdio.c
>>>>> b/drivers/net/ethernet/marvell/mvmdio.c
>>>>> index 7354960..b187c08 100644
>>>>> --- a/drivers/net/ethernet/marvell/mvmdio.c
>>>>> +++ b/drivers/net/ethernet/marvell/mvmdio.c
>>>>> @@ -92,6 +92,14 @@ static int orion_mdio_wait_ready(struct mii_bus
>>>>> *bus)
>>>>>               if (time_is_before_jiffies(end))
>>>>>                   ++timedout;
>>>>>               } else {
>>>>> +            /*
>>>>> +             * wait_event_timeout does not guarantee a delay of at
>>>>> +             * least one whole jiffie, so timeout must be no less
>>>>> +             * than two.
>>>>> +             */
>>>>> +            if (timeout < 2)
>>>>> +                timeout = 2;
>>>>
>>>> If you always want to wait at least two jiffies, why not just increase
>>>> TIMEOUT makro to 20ms instead of messing here with it again?
>>>> As said on IRC log above, originally timeout was 100ms.
>>>
>>> You could do that, but would you not feel bad leaving a latent bug in
>>> the code?
>>> I know it's unlikely that someone would set HZ to 50, but if they did,
>>> the same
>>> bug would appear again.
>>
>> If you want to ensure timeout > 2, why not then just use:
>>
>> -     unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
>> +     unsigned long timeout = 1 + usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
>
> This will make it correct when using interrupts but it will make the
> loop wait one
> jiffie longer than it should when polling.

Leigh,

it is not about "waiting longer than it should" but increasing the
_timeout_ to more than one jiffy. If you poll and everything is fine
with mdio, you exit way before timeout. If mdio hangs, it does not
really matter if you timeout after 10, 20, or even 100ms.

Anyway, I am fine with every patch that increases timeout to more than
one jiffy. Also, Russell just sent a patch to separate irq/polling
completely. Maybe give it a try.

Sebastian

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 23:42               ` Russell King - ARM Linux
@ 2013-12-04 11:40                 ` Nicolas Schichan
  2013-12-16 18:07                 ` Nicolas Schichan
  1 sibling, 0 replies; 19+ messages in thread
From: Nicolas Schichan @ 2013-12-04 11:40 UTC (permalink / raw)
  To: Russell King - ARM Linux, Leigh Brown
  Cc: Sebastian Hesselbarth, Jason Cooper, netdev, LKML,
	Florian Fainelli, David S. Miller, linux-arm-kernel

On 12/04/2013 12:42 AM, Russell King - ARM Linux wrote:
>> This will make it correct when using interrupts but it will make the
>> loop wait one jiffie longer than it should when polling.
>
> Alternatively, code it like this instead.
>
>   drivers/net/ethernet/marvell/mvmdio.c |   32 +++++++++++++++-----------------
>   1 files changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvmdio.c b/drivers/net/ethernet/marvell/mvmdio.c
> index 7354960b583b..a6f59831fc5b 100644
> --- a/drivers/net/ethernet/marvell/mvmdio.c
> +++ b/drivers/net/ethernet/marvell/mvmdio.c
> @@ -76,31 +76,29 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
>   {
>   	struct orion_mdio_dev *dev = bus->priv;
>   	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
> -	unsigned long end = jiffies + timeout;
> -	int timedout = 0;
>
> -	while (1) {
> -	        if (orion_mdio_smi_is_done(dev))
> +	if (dev->err_interrupt > 0) {
> +		if (wait_event_timeout(dev->smi_busy_wait,
> +				       orion_mdio_smi_is_done(dev),
> +				       1 + timeout))
>   			return 0;
> -	        else if (timedout)
> -			break;
> +	} else {
> +		unsigned long end = jiffies + timeout;
>
> -	        if (dev->err_interrupt <= 0) {
> -			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
> -				     MVMDIO_SMI_POLL_INTERVAL_MAX);
> +		while (1) {
> +	        	if (orion_mdio_smi_is_done(dev))
> +				return 0;
>
>   			if (time_is_before_jiffies(end))
> -				++timedout;
> -	        } else {
> -			wait_event_timeout(dev->smi_busy_wait,
> -				           orion_mdio_smi_is_done(dev),
> -				           timeout);
> -
> -			++timedout;
> -	        }
> +				break;
> +
> +			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
> +				     MVMDIO_SMI_POLL_INTERVAL_MAX);
> +		}
>   	}
>
>   	dev_err(bus->parent, "Timeout: SMI busy for too long\n");
> +
>   	return  -ETIMEDOUT;
>   }

Hi Russell,

I have just tested your patch on MV88F6281 and MV88F6282 both in polling and 
irq mode and it works just fine.

I had a similar patch almost ready to submit but you were faster than me.

Feel free to add my:

Tested-by: Nicolas Schichan <nschichan@freebox.fr>

Thanks,

-- 
Nicolas Schichan
Freebox SAS

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-03 23:42               ` Russell King - ARM Linux
  2013-12-04 11:40                 ` Nicolas Schichan
@ 2013-12-16 18:07                 ` Nicolas Schichan
  2013-12-16 18:28                   ` Leigh Brown
  2013-12-16 18:48                   ` Russell King - ARM Linux
  1 sibling, 2 replies; 19+ messages in thread
From: Nicolas Schichan @ 2013-12-16 18:07 UTC (permalink / raw)
  To: Russell King - ARM Linux, Leigh Brown
  Cc: Jason Cooper, netdev, LKML, David S. Miller, Florian Fainelli,
	linux-arm-kernel, Sebastian Hesselbarth

On 12/04/2013 12:42 AM, Russell King - ARM Linux wrote:
> Alternatively, code it like this instead.
>
>   drivers/net/ethernet/marvell/mvmdio.c |   32 +++++++++++++++-----------------
>   1 files changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvmdio.c b/drivers/net/ethernet/marvell/mvmdio.c
> index 7354960b583b..a6f59831fc5b 100644
> --- a/drivers/net/ethernet/marvell/mvmdio.c
> +++ b/drivers/net/ethernet/marvell/mvmdio.c
> @@ -76,31 +76,29 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
>   {
>   	struct orion_mdio_dev *dev = bus->priv;
>   	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
> -	unsigned long end = jiffies + timeout;
> -	int timedout = 0;
>
> -	while (1) {
> -	        if (orion_mdio_smi_is_done(dev))
> +	if (dev->err_interrupt > 0) {
> +		if (wait_event_timeout(dev->smi_busy_wait,
> +				       orion_mdio_smi_is_done(dev),
> +				       1 + timeout))
>   			return 0;
> -	        else if (timedout)
> -			break;
> +	} else {
> +		unsigned long end = jiffies + timeout;
>
> -	        if (dev->err_interrupt <= 0) {
> -			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
> -				     MVMDIO_SMI_POLL_INTERVAL_MAX);
> +		while (1) {
> +	        	if (orion_mdio_smi_is_done(dev))
> +				return 0;
>
>   			if (time_is_before_jiffies(end))
> -				++timedout;
> -	        } else {
> -			wait_event_timeout(dev->smi_busy_wait,
> -				           orion_mdio_smi_is_done(dev),
> -				           timeout);
> -
> -			++timedout;
> -	        }
> +				break;
> +
> +			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
> +				     MVMDIO_SMI_POLL_INTERVAL_MAX);
> +		}
>   	}
>
>   	dev_err(bus->parent, "Timeout: SMI busy for too long\n");
> +
>   	return  -ETIMEDOUT;
>   }

Hi Russell,

I did not find any commit for this in 3.13-rc4.

Would you prefer I submit my attempt at refactoring the 
orion_mdio_wait_ready() code ?

Regards,

-- 
Nicolas Schichan
Freebox SAS

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-16 18:07                 ` Nicolas Schichan
@ 2013-12-16 18:28                   ` Leigh Brown
  2013-12-17 13:49                     ` Nicolas Schichan
  2013-12-16 18:48                   ` Russell King - ARM Linux
  1 sibling, 1 reply; 19+ messages in thread
From: Leigh Brown @ 2013-12-16 18:28 UTC (permalink / raw)
  To: Nicolas Schichan
  Cc: Russell King - ARM Linux, Jason Cooper, netdev, LKML,
	Florian Fainelli, David S. Miller, linux-arm-kernel,
	Sebastian Hesselbarth

On 2013-12-16 18:07, Nicolas Schichan wrote:
> On 12/04/2013 12:42 AM, Russell King - ARM Linux wrote:
>> Alternatively, code it like this instead.
>> 
>>   drivers/net/ethernet/marvell/mvmdio.c |   32 
>> +++++++++++++++-----------------
>>   1 files changed, 15 insertions(+), 17 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/marvell/mvmdio.c 
>> b/drivers/net/ethernet/marvell/mvmdio.c
>> index 7354960b583b..a6f59831fc5b 100644
>> --- a/drivers/net/ethernet/marvell/mvmdio.c
>> +++ b/drivers/net/ethernet/marvell/mvmdio.c
>> @@ -76,31 +76,29 @@ static int orion_mdio_wait_ready(struct mii_bus 
>> *bus)
>>   {
>>   	struct orion_mdio_dev *dev = bus->priv;
>>   	unsigned long timeout = usecs_to_jiffies(MVMDIO_SMI_TIMEOUT);
>> -	unsigned long end = jiffies + timeout;
>> -	int timedout = 0;
>> 
>> -	while (1) {
>> -	        if (orion_mdio_smi_is_done(dev))
>> +	if (dev->err_interrupt > 0) {
>> +		if (wait_event_timeout(dev->smi_busy_wait,
>> +				       orion_mdio_smi_is_done(dev),
>> +				       1 + timeout))
>>   			return 0;
>> -	        else if (timedout)
>> -			break;
>> +	} else {
>> +		unsigned long end = jiffies + timeout;
>> 
>> -	        if (dev->err_interrupt <= 0) {
>> -			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
>> -				     MVMDIO_SMI_POLL_INTERVAL_MAX);
>> +		while (1) {
>> +	        	if (orion_mdio_smi_is_done(dev))
>> +				return 0;
>> 
>>   			if (time_is_before_jiffies(end))
>> -				++timedout;
>> -	        } else {
>> -			wait_event_timeout(dev->smi_busy_wait,
>> -				           orion_mdio_smi_is_done(dev),
>> -				           timeout);
>> -
>> -			++timedout;
>> -	        }
>> +				break;
>> +
>> +			usleep_range(MVMDIO_SMI_POLL_INTERVAL_MIN,
>> +				     MVMDIO_SMI_POLL_INTERVAL_MAX);
>> +		}
>>   	}
>> 
>>   	dev_err(bus->parent, "Timeout: SMI busy for too long\n");
>> +
>>   	return  -ETIMEDOUT;
>>   }
> 
> Hi Russell,
> 
> I did not find any commit for this in 3.13-rc4.
> 
> Would you prefer I submit my attempt at refactoring the
> orion_mdio_wait_ready() code ?

I prefer this patch (which I think you originally proposed) because it 
is just as
correct and the code size is a bit smaller in arm mode, and the same 
size in thumb
mode (on my compiler gcc 4.7.2, at least).  I coded the loop in that way 
to make it
small.

I don't mind submitting it to David as I made the mistake in the first 
place.

Regards,

Leigh.

diff --git a/drivers/net/ethernet/marvell/mvmdio.c 
b/drivers/net/ethernet/marvell/mvmdio.c
index 7354960..a42a750 100644
--- a/drivers/net/ethernet/marvell/mvmdio.c
+++ b/drivers/net/ethernet/marvell/mvmdio.c
@@ -92,6 +92,12 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
  			if (time_is_before_jiffies(end))
  				++timedout;
  	        } else {
+			/* wait_event_timeout does not guarantee a delay of at
+			 * least one whole jiffie, so timeout must be no less
+			 * than two.
+			 */
+			 if (timeout < 2)
+			 	timeout = 2;
  			wait_event_timeout(dev->smi_busy_wait,
  				           orion_mdio_smi_is_done(dev),
  				           timeout);

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-16 18:07                 ` Nicolas Schichan
  2013-12-16 18:28                   ` Leigh Brown
@ 2013-12-16 18:48                   ` Russell King - ARM Linux
  1 sibling, 0 replies; 19+ messages in thread
From: Russell King - ARM Linux @ 2013-12-16 18:48 UTC (permalink / raw)
  To: Nicolas Schichan
  Cc: Leigh Brown, Jason Cooper, netdev, LKML, David S. Miller,
	Florian Fainelli, linux-arm-kernel, Sebastian Hesselbarth

On Mon, Dec 16, 2013 at 07:07:28PM +0100, Nicolas Schichan wrote:
> I did not find any commit for this in 3.13-rc4.
>
> Would you prefer I submit my attempt at refactoring the  
> orion_mdio_wait_ready() code ?

It's up to Jason.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spurious timeouts in mvmdio
  2013-12-16 18:28                   ` Leigh Brown
@ 2013-12-17 13:49                     ` Nicolas Schichan
  0 siblings, 0 replies; 19+ messages in thread
From: Nicolas Schichan @ 2013-12-17 13:49 UTC (permalink / raw)
  To: Leigh Brown
  Cc: Russell King - ARM Linux, Jason Cooper, netdev, LKML,
	Florian Fainelli, David S. Miller, linux-arm-kernel,
	Sebastian Hesselbarth

On 12/16/2013 07:28 PM, Leigh Brown wrote:

> I prefer this patch (which I think you originally proposed) because it is just as
> correct and the code size is a bit smaller in arm mode, and the same size in
> thumb
> mode (on my compiler gcc 4.7.2, at least).  I coded the loop in that way to
> make it
> small.
>
> I don't mind submitting it to David as I made the mistake in the first place.
>
> Regards,
>
> Leigh.
>
> diff --git a/drivers/net/ethernet/marvell/mvmdio.c
> b/drivers/net/ethernet/marvell/mvmdio.c
> index 7354960..a42a750 100644
> --- a/drivers/net/ethernet/marvell/mvmdio.c
> +++ b/drivers/net/ethernet/marvell/mvmdio.c
> @@ -92,6 +92,12 @@ static int orion_mdio_wait_ready(struct mii_bus *bus)
>               if (time_is_before_jiffies(end))
>                   ++timedout;
>               } else {
> +            /* wait_event_timeout does not guarantee a delay of at
> +             * least one whole jiffie, so timeout must be no less
> +             * than two.
> +             */
> +             if (timeout < 2)
> +                 timeout = 2;
>               wait_event_timeout(dev->smi_busy_wait,
>                              orion_mdio_smi_is_done(dev),
>                              timeout);

Hi Leigh,

My original patch changed "timeout" before the entering the while loop, but 
I'm fine with your patch too.

Feel free to add my:

Tested-by: Nicolas Schichan <nschichan@freebox.fr>

Regards,

-- 
Nicolas Schichan
Freebox SAS

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2013-12-17 13:49 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-02 15:15 Spurious timeouts in mvmdio Nicolas Schichan
2013-12-03 12:23 ` Jason Cooper
2013-12-03 12:40   ` Russell King - ARM Linux
2013-12-03 13:43     ` Jason Cooper
2013-12-03 18:48       ` Nicolas Schichan
2013-12-03 20:57     ` Leigh Brown
2013-12-03 22:45       ` Sebastian Hesselbarth
2013-12-03 23:20         ` Leigh Brown
2013-12-03 23:17           ` Sebastian Hesselbarth
2013-12-03 23:38             ` Leigh Brown
2013-12-03 23:42               ` Russell King - ARM Linux
2013-12-04 11:40                 ` Nicolas Schichan
2013-12-16 18:07                 ` Nicolas Schichan
2013-12-16 18:28                   ` Leigh Brown
2013-12-17 13:49                     ` Nicolas Schichan
2013-12-16 18:48                   ` Russell King - ARM Linux
2013-12-03 23:45               ` Sebastian Hesselbarth
2013-12-03 13:16   ` [PATCH] net: mvmdio: fix wait_event_timeout() being called with a 1 jiffy timeout Nicolas Schichan
2013-12-03 13:34     ` Russell King - ARM Linux

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).