netdev.vger.kernel.org archive mirror
* stmmac: Race in coalesce timer and NAPI
@ 2018-09-21  9:19 Jose Abreu
  2018-09-21 13:54 ` Eric Dumazet
  0 siblings, 1 reply; 2+ messages in thread
From: Jose Abreu @ 2018-09-21  9:19 UTC
  To: netdev, Joao Pinto

Hello,

I'm seeing a race between the stmmac coalesce timer and the
interrupt path that calls napi_schedule(), and I'm asking for
advice. Currently, we schedule NAPI from the coalesce timer, but
this leads to a deadlock in stmmac_tx_clean() because that function
tries to acquire the queue lock.

I did not expect this, because only one instance of NAPI should
run at a time, so I was wondering whether the xmit() callback
could be causing the deadlock?

BTW, this is solved by:
    - Directly calling stmmac_tx_clean() in the timer function, AND
    - Using netif_tx_trylock() in stmmac_tx_clean(). Then, if the
      queue is already locked, we re-arm the coalesce timer or
      reschedule NAPI.
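
The two steps above can be sketched in userspace C. This is only a
model under stated assumptions, not the stmmac code: struct txq_sketch
and every function below are hypothetical names, and a C11 atomic_flag
stands in for the queue lock that netif_tx_trylock() would try to take.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-in for one TX queue; not the real stmmac structures. */
struct txq_sketch {
    atomic_flag lock;       /* plays the role of the TX queue lock */
    int pending;            /* descriptors waiting to be reclaimed */
};

static void txq_init(struct txq_sketch *q, int pending)
{
    atomic_flag_clear(&q->lock);
    q->pending = pending;
}

/* Stand-in for netif_tx_trylock(): true if the lock was taken. */
static bool txq_trylock(struct txq_sketch *q)
{
    return !atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire);
}

static void txq_unlock(struct txq_sketch *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/*
 * Called directly from the (modelled) coalesce timer. It never waits:
 * if the queue is already locked, for example by the transmit path,
 * it returns -1 so the caller can re-arm the timer or reschedule NAPI.
 * Otherwise it returns the number of descriptors reclaimed.
 */
static int tx_clean_sketch(struct txq_sketch *q)
{
    int cleaned;

    assert(q != NULL);
    if (!txq_trylock(q))
        return -1;          /* busy: defer instead of deadlocking */

    cleaned = q->pending;
    q->pending = 0;
    txq_unlock(q);
    return cleaned;
}
```

A timer callback built on this would treat -1 as "try again later"
(in a driver, presumably by re-arming the timer with something like
mod_timer()), which is what keeps the timer context from waiting on
the lock.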

This is easily reproducible on an ARM board with 8 cores running
at 100MHz each.

Thanks and Best Regards,
Jose Miguel Abreu


* Re: stmmac: Race in coalesce timer and NAPI
  2018-09-21  9:19 stmmac: Race in coalesce timer and NAPI Jose Abreu
@ 2018-09-21 13:54 ` Eric Dumazet
  0 siblings, 0 replies; 2+ messages in thread
From: Eric Dumazet @ 2018-09-21 13:54 UTC
  To: Jose Abreu, netdev, Joao Pinto



On 09/21/2018 02:19 AM, Jose Abreu wrote:
> Hello,
> 
> I'm getting a race in stmmac coalesce timer and the
> napi_schedule() interrupt and I'm asking for advice. Currently,
> we are scheduling NAPI in coalesce timer but this leads to
> stmmac_tx_clean() deadlock because this function tries to acquire
> queue lock.

This is strange. Which lock are you talking about ?

The napi_schedule() stuff should be enough to protect your use case.
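
As a rough illustration of why scheduling alone can serialize the
poll: the kernel tracks a "scheduled" state per NAPI instance, and
only the caller that sets it gets to queue the poll. The userspace
sketch below models that idea with a C11 atomic; struct napi_sketch
and the *_sketch functions are hypothetical names, and the real
napi_schedule_prep() logic has more states than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one NAPI instance. */
struct napi_sketch {
    atomic_bool scheduled;  /* models the "already scheduled" state bit */
    int polls_queued;       /* how many times a poll was actually queued */
};

static void napi_sketch_init(struct napi_sketch *n)
{
    atomic_init(&n->scheduled, false);
    n->polls_queued = 0;
}

/* True only for the caller that flips the state from clear to set. */
static bool sched_prep_sketch(struct napi_sketch *n)
{
    bool expected = false;

    return atomic_compare_exchange_strong(&n->scheduled, &expected, true);
}

/* Queue a poll unless one is already scheduled or running. */
static bool schedule_sketch(struct napi_sketch *n)
{
    if (!sched_prep_sketch(n))
        return false;       /* someone else owns the poll */
    n->polls_queued++;
    return true;
}

/* End of a poll: allow the next schedule request to succeed. */
static void complete_sketch(struct napi_sketch *n)
{
    assert(atomic_load(&n->scheduled));   /* completing an idle poll is a bug */
    atomic_store(&n->scheduled, false);
}
```

In this model a timer and an interrupt can both request a poll, but
only the first request queues one until the poll completes.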


> 
> I find that this is not expected because only one instance of
> NAPI should run at same time so I was wondering if it is possible
> that xmit() callback is causing the deadlock ?
> 
> BTW, this is solved by:
>     - Directly call stmmac_tx_clean() in timer function AND
>     - Use netif_tx_trylock() in stmmac_tx_clean(). Then, if queue
> is already locked we re-arm coalesce timer or reschedule NAPI.
> 
> This is easily reproducible in an ARM board with 8 core running
> at 100MHz each.
> 
> Thanks and Best Regards,
> Jose Miguel Abreu
> 

It looks to me like stmmac_napi_poll() should not apply/consume any budget for TX completion.

The budget for a NAPI poll shared by RX and TX is really only for the RX side.

netpoll will specifically call poll() with budget==0 to drain only TX.
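
A minimal userspace sketch of that budget rule, assuming a single
ring with simple counters: TX completions are always reclaimed, only
RX work is counted against the budget, and budget==0 means "TX only".
struct ring_sketch and the function names are hypothetical, and
reclaiming all TX work in one pass is a simplification made for the
example.

```c
#include <assert.h>

/* Hypothetical counters standing in for the driver's rings. */
struct ring_sketch {
    int tx_done;    /* completed TX descriptors waiting to be reclaimed */
    int rx_ready;   /* received packets waiting to be processed */
};

/* Reclaim completed TX descriptors; not limited by the budget. */
static int tx_clean_all(struct ring_sketch *r)
{
    int n = r->tx_done;

    r->tx_done = 0;
    return n;
}

/* Process at most `limit` received packets. */
static int rx_process(struct ring_sketch *r, int limit)
{
    int n = r->rx_ready < limit ? r->rx_ready : limit;

    r->rx_ready -= n;
    return n;
}

/*
 * Poll callback sketch. The return value is the RX work done, which is
 * the only thing compared against the budget; TX work is never counted.
 */
static int poll_sketch(struct ring_sketch *r, int budget)
{
    assert(budget >= 0);

    tx_clean_all(r);        /* always drain TX, even when budget is 0 */
    if (budget == 0)
        return 0;           /* netpoll-style call: TX only, no RX */

    return rx_process(r, budget);
}
```

With this shape, a return value below the budget is the usual signal
that the poll is finished (in a driver, the point where
napi_complete_done() would be called).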

