linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: stmmac: fix completely hung TX when using TSO
@ 2017-06-06  7:25 Niklas Cassel
  2017-06-06  8:00 ` Giuseppe CAVALLARO
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Niklas Cassel @ 2017-06-06  7:25 UTC (permalink / raw)
  To: Giuseppe Cavallaro, Alexandre Torgue; +Cc: Niklas Cassel, netdev, linux-kernel

stmmac_tso_allocator can fail to set the Last Descriptor bit
on a descriptor that actually was the last descriptor.

This happens when the buffer of the last descriptor ends
up having a size of exactly TSO_MAX_BUFF_SIZE.

When the IP eventually reaches the next last descriptor,
which actually has the bit set, the DMA will hang.

When the DMA hangs, we get a tx timeout, however,
since stmmac does not do a complete reset of the IP
in stmmac_tx_timeout, we end up in a state with
completely hung TX.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 68a188e74c54..440bea049a7f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,
 
 		priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
 			0, 1,
-			(last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
+			(last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
 			0, 0);
 
 		tmp_len -= TSO_MAX_BUFF_SIZE;
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO
  2017-06-06  7:25 [PATCH net] net: stmmac: fix completely hung TX when using TSO Niklas Cassel
@ 2017-06-06  8:00 ` Giuseppe CAVALLARO
  2017-06-06  9:10   ` Alexandre Torgue
  2017-06-06 10:01   ` Niklas Cassel
  2017-06-06 14:31 ` Florian Fainelli
  2017-06-06 20:25 ` David Miller
  2 siblings, 2 replies; 6+ messages in thread
From: Giuseppe CAVALLARO @ 2017-06-06  8:00 UTC (permalink / raw)
  To: Niklas Cassel, Alexandre Torgue; +Cc: Niklas Cassel, netdev, linux-kernel

Hi Niklas

I get the point and I acked the patch but Alex, please, can you confirm
that this issue has never seen on your boxes where the TSO has been
fully tested? The initial development (commit f748be531) introduces
the following:
     (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
...

On 6/6/2017 9:25 AM, Niklas Cassel wrote:
> stmmac_tso_allocator can fail to set the Last Descriptor bit
> on a descriptor that actually was the last descriptor.
>
> This happens when the buffer of the last descriptor ends
> up having a size of exactly TSO_MAX_BUFF_SIZE.
>
> When the IP eventually reaches the next last descriptor,
> which actually has the bit set, the DMA will hang.
>
> When the DMA hangs, we get a tx timeout, however,
> since stmmac does not do a complete reset of the IP
> in stmmac_tx_timeout, we end up in a state with
> completely hung TX.
>
> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>

Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>

> ---
>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 68a188e74c54..440bea049a7f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,
>   
>   		priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
>   			0, 1,
> -			(last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
> +			(last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
>   			0, 0);
>   
>   		tmp_len -= TSO_MAX_BUFF_SIZE;

Regards
Peppe

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO
  2017-06-06  8:00 ` Giuseppe CAVALLARO
@ 2017-06-06  9:10   ` Alexandre Torgue
  2017-06-06 10:01   ` Niklas Cassel
  1 sibling, 0 replies; 6+ messages in thread
From: Alexandre Torgue @ 2017-06-06  9:10 UTC (permalink / raw)
  To: Giuseppe CAVALLARO, Niklas Cassel; +Cc: Niklas Cassel, netdev, linux-kernel

Hi Guys,

On 06/06/2017 10:00 AM, Giuseppe CAVALLARO wrote:
> Hi Niklas
>
> I get the point and I acked the patch but Alex, please, can you confirm
> that this issue has never seen on your boxes where the TSO has been
> fully tested? The initial development (commit f748be531) introduces
> the following:
>     (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),

I don't remember to have seen this kind of issue in the past but for 
sure I agree with this patch.

Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>

> ...
>
> On 6/6/2017 9:25 AM, Niklas Cassel wrote:
>> stmmac_tso_allocator can fail to set the Last Descriptor bit
>> on a descriptor that actually was the last descriptor.
>>
>> This happens when the buffer of the last descriptor ends
>> up having a size of exactly TSO_MAX_BUFF_SIZE.
>>
>> When the IP eventually reaches the next last descriptor,
>> which actually has the bit set, the DMA will hang.
>>
>> When the DMA hangs, we get a tx timeout, however,
>> since stmmac does not do a complete reset of the IP
>> in stmmac_tx_timeout, we end up in a state with
>> completely hung TX.
>>
>> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
>
> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
>
>> ---
>>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 68a188e74c54..440bea049a7f 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct
>> stmmac_priv *priv, unsigned int des,
>>             priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
>>               0, 1,
>> -            (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
>> +            (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
>>               0, 0);
>>             tmp_len -= TSO_MAX_BUFF_SIZE;
>
> Regards
> Peppe
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO
  2017-06-06  8:00 ` Giuseppe CAVALLARO
  2017-06-06  9:10   ` Alexandre Torgue
@ 2017-06-06 10:01   ` Niklas Cassel
  1 sibling, 0 replies; 6+ messages in thread
From: Niklas Cassel @ 2017-06-06 10:01 UTC (permalink / raw)
  To: Giuseppe CAVALLARO, Alexandre Torgue; +Cc: netdev, linux-kernel

On 06/06/2017 10:00 AM, Giuseppe CAVALLARO wrote:
> Hi Niklas

Hello Peppe, Alex

> 
> I get the point and I acked the patch but Alex, please, can you confirm
> that this issue has never seen on your boxes where the TSO has been
> fully tested? The initial development (commit f748be531) introduces
> the following:
>     (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
> ...

Since you need a packet where the txbuff size of last descriptor
has the exact size of 16K-1, depending on your workload,
it is possible to run your system for a very long time without
triggering the bug.

We noticed the problem since a test in our test suite fetches
the syslog. The syslog usually has a size that is not constant,
and the size usually increases with time.
During some of the test runs, far from all of them, we would
eventually end up with a packet of the exact size that triggers
the bug.

It should be possible to trigger the bug more easily if you
know what buffer size to write to your TCP socket, together
with some manual TCP corking.


Regards,
Niklas

> 
> On 6/6/2017 9:25 AM, Niklas Cassel wrote:
>> stmmac_tso_allocator can fail to set the Last Descriptor bit
>> on a descriptor that actually was the last descriptor.
>>
>> This happens when the buffer of the last descriptor ends
>> up having a size of exactly TSO_MAX_BUFF_SIZE.
>>
>> When the IP eventually reaches the next last descriptor,
>> which actually has the bit set, the DMA will hang.
>>
>> When the DMA hangs, we get a tx timeout, however,
>> since stmmac does not do a complete reset of the IP
>> in stmmac_tx_timeout, we end up in a state with
>> completely hung TX.
>>
>> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
> 
> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> 
>> ---
>>   drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 68a188e74c54..440bea049a7f 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,
>>             priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
>>               0, 1,
>> -            (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
>> +            (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
>>               0, 0);
>>             tmp_len -= TSO_MAX_BUFF_SIZE;
> 
> Regards
> Peppe
> 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO
  2017-06-06  7:25 [PATCH net] net: stmmac: fix completely hung TX when using TSO Niklas Cassel
  2017-06-06  8:00 ` Giuseppe CAVALLARO
@ 2017-06-06 14:31 ` Florian Fainelli
  2017-06-06 20:25 ` David Miller
  2 siblings, 0 replies; 6+ messages in thread
From: Florian Fainelli @ 2017-06-06 14:31 UTC (permalink / raw)
  To: Niklas Cassel, Giuseppe Cavallaro, Alexandre Torgue
  Cc: Niklas Cassel, netdev, linux-kernel



On 06/06/2017 12:25 AM, Niklas Cassel wrote:
> stmmac_tso_allocator can fail to set the Last Descriptor bit
> on a descriptor that actually was the last descriptor.
> 
> This happens when the buffer of the last descriptor ends
> up having a size of exactly TSO_MAX_BUFF_SIZE.
> 
> When the IP eventually reaches the next last descriptor,
> which actually has the bit set, the DMA will hang.
> 
> When the DMA hangs, we get a tx timeout, however,
> since stmmac does not do a complete reset of the IP
> in stmmac_tx_timeout, we end up in a state with
> completely hung TX.
> 
> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>

This should have:

Fixes: f748be531d70 ("stmmac: support new GMAC4")

right?

> ---
>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 68a188e74c54..440bea049a7f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,
>  
>  		priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
>  			0, 1,
> -			(last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
> +			(last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
>  			0, 0);
>  
>  		tmp_len -= TSO_MAX_BUFF_SIZE;
> 

-- 
Florian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO
  2017-06-06  7:25 [PATCH net] net: stmmac: fix completely hung TX when using TSO Niklas Cassel
  2017-06-06  8:00 ` Giuseppe CAVALLARO
  2017-06-06 14:31 ` Florian Fainelli
@ 2017-06-06 20:25 ` David Miller
  2 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2017-06-06 20:25 UTC (permalink / raw)
  To: niklas.cassel
  Cc: peppe.cavallaro, alexandre.torgue, niklass, netdev, linux-kernel

From: Niklas Cassel <niklas.cassel@axis.com>
Date: Tue, 6 Jun 2017 09:25:00 +0200

> stmmac_tso_allocator can fail to set the Last Descriptor bit
> on a descriptor that actually was the last descriptor.
> 
> This happens when the buffer of the last descriptor ends
> up having a size of exactly TSO_MAX_BUFF_SIZE.
> 
> When the IP eventually reaches the next last descriptor,
> which actually has the bit set, the DMA will hang.
> 
> When the DMA hangs, we get a tx timeout, however,
> since stmmac does not do a complete reset of the IP
> in stmmac_tx_timeout, we end up in a state with
> completely hung TX.
> 
> Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>

Applied and queued up for -stable, thank you.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-06-06 20:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-06  7:25 [PATCH net] net: stmmac: fix completely hung TX when using TSO Niklas Cassel
2017-06-06  8:00 ` Giuseppe CAVALLARO
2017-06-06  9:10   ` Alexandre Torgue
2017-06-06 10:01   ` Niklas Cassel
2017-06-06 14:31 ` Florian Fainelli
2017-06-06 20:25 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).