linux-kernel.vger.kernel.org archive mirror
* [PATCH resend] can: rcar_canfd: fix possible IRQ storm on high load
@ 2019-06-26 13:08 Nikita Yushchenko
  2019-06-26 13:12 ` Wolfram Sang
  0 siblings, 1 reply; 3+ messages in thread
From: Nikita Yushchenko @ 2019-06-26 13:08 UTC
  To: Wolfgang Grandegger, Marc Kleine-Budde, David S. Miller,
	Simon Horman, Wolfram Sang
  Cc: linux-can, netdev, linux-kernel, Artemi Ivanov, Sergei Shtylyov,
	Nikita Yushchenko

We have observed the rcar_canfd driver entering an IRQ storm under high
load, with the following scenario:
- rcar_canfd_global_interrupt() is entered due to Rx available,
- napi_schedule_prep() is called, and sets NAPIF_STATE_SCHED in state,
- Rx FIFO interrupts are masked,
- rcar_canfd_global_interrupt() is entered again, this time due to an
  error interrupt (e.g. due to overflow),
- since the scheduled napi poller has not yet executed, the condition for
  calling napi_schedule_prep() from rcar_canfd_global_interrupt() remains
  true, thus napi_schedule_prep() gets called and sets the
  NAPIF_STATE_MISSED flag in state,
- later, the napi poller function rcar_canfd_rx_poll() gets executed, and
  calls napi_complete_done(),
- due to the NAPIF_STATE_MISSED flag in state, this call does not clear
  the NAPIF_STATE_SCHED flag from state,
- on return from napi_complete_done(), rcar_canfd_rx_poll() unmasks Rx
  interrupts,
- an Rx interrupt happens, rcar_canfd_global_interrupt() gets called
  and calls napi_schedule_prep(),
- since NAPIF_STATE_SCHED is set in state at this time, this call
  returns false,
- due to that false return, rcar_canfd_global_interrupt() returns
  without masking the Rx interrupt,
- and this results in an IRQ storm: the unmasked Rx interrupt happens
  again and is again misprocessed in the same way.

This patch fixes that scenario by unmasking Rx interrupts only when
napi_complete_done() returns true, which means it has cleared
NAPIF_STATE_SCHED in state.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
---
 drivers/net/can/rcar/rcar_canfd.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index 05410008aa6b..de34a4b82d4a 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1508,10 +1508,11 @@ static int rcar_canfd_rx_poll(struct napi_struct *napi, int quota)
 
 	/* All packets processed */
 	if (num_pkts < quota) {
-		napi_complete_done(napi, num_pkts);
-		/* Enable Rx FIFO interrupts */
-		rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
-				   RCANFD_RFCC_RFIE);
+		if (napi_complete_done(napi, num_pkts)) {
+			/* Enable Rx FIFO interrupts */
+			rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
+					   RCANFD_RFCC_RFIE);
+		}
 	}
 	return num_pkts;
 }
-- 
2.11.0
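
The scenario from the commit message can be replayed in a small user-space
model. This is only a sketch, not kernel code: struct napi_model, the
model_*() helpers and the MODEL_STATE_* flags are hypothetical names that
imitate the SCHED/MISSED handling described above in a single thread. The
model also assumes that napi_complete_done() reschedules the poller when it
finds the MISSED flag set, which is my understanding of the kernel's
behaviour and is not stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names mirror the kernel's NAPIF_STATE_*; the values are arbitrary. */
enum {
	MODEL_STATE_SCHED  = 1u << 0,
	MODEL_STATE_MISSED = 1u << 1,
};

/* Hypothetical, single-threaded stand-in for the napi state and RFIE bit. */
struct napi_model {
	unsigned int state;
	bool rx_irq_enabled;	/* models RCANFD_RFCC_RFIE */
	bool poll_pending;	/* poller is queued to run */
};

/* Models napi_schedule_prep(): true only if SCHED was not already set. */
static bool model_schedule_prep(struct napi_model *m)
{
	bool was_sched = (m->state & MODEL_STATE_SCHED) != 0;

	if (was_sched)
		m->state |= MODEL_STATE_MISSED;
	m->state |= MODEL_STATE_SCHED;
	return !was_sched;
}

/* Models napi_complete_done(): with MISSED set, SCHED stays set and the
 * poller is queued again (assumption, see the lead-in). */
static bool model_complete_done(struct napi_model *m)
{
	bool missed = (m->state & MODEL_STATE_MISSED) != 0;

	m->state &= ~(MODEL_STATE_SCHED | MODEL_STATE_MISSED);
	if (missed) {
		m->state |= MODEL_STATE_SCHED;
		m->poll_pending = true;
		return false;
	}
	return true;
}

/* Models the handler: mask Rx and queue the poller only if prep succeeds. */
static void model_irq(struct napi_model *m)
{
	if (model_schedule_prep(m)) {
		m->rx_irq_enabled = false;
		m->poll_pending = true;
	}
}

/* Models the end of rcar_canfd_rx_poll() with all packets processed. */
static void model_poll(struct napi_model *m, bool fixed)
{
	assert(m->state & MODEL_STATE_SCHED);	/* poller runs only when scheduled */
	m->poll_pending = false;
	if (fixed) {
		if (model_complete_done(m))
			m->rx_irq_enabled = true;
	} else {
		model_complete_done(m);
		m->rx_irq_enabled = true;	/* old code: unmask unconditionally */
	}
}

/* Replays the scenario; true means the handler leaves an Rx IRQ unmasked. */
bool storm_after_scenario(bool fixed)
{
	struct napi_model m = { .state = 0, .rx_irq_enabled = true };

	model_irq(&m);		/* Rx available: SCHED set, Rx masked */
	model_irq(&m);		/* error interrupt: MISSED set */
	model_poll(&m, fixed);	/* completion refused because of MISSED */
	if (!m.rx_irq_enabled)
		return false;	/* Rx still masked, it cannot fire */
	model_irq(&m);		/* Rx fires before the re-queued poll runs */
	return m.rx_irq_enabled;
}

/* With the fix, the re-queued poll completes and only then unmasks Rx. */
bool fixed_path_reenables_irq(void)
{
	struct napi_model m = { .state = 0, .rx_irq_enabled = true };

	model_irq(&m);
	model_irq(&m);
	model_poll(&m, true);
	if (!m.poll_pending)
		return false;
	model_poll(&m, true);
	return m.rx_irq_enabled && m.state == 0;
}
```

In this model storm_after_scenario(false) reports the storm condition (Rx
unmasked while SCHED is still set), storm_after_scenario(true) does not, and
fixed_path_reenables_irq() shows Rx being unmasked once the re-queued poll
has cleared SCHED.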



* Re: [PATCH resend] can: rcar_canfd: fix possible IRQ storm on high load
  2019-06-26 13:08 [PATCH resend] can: rcar_canfd: fix possible IRQ storm on high load Nikita Yushchenko
@ 2019-06-26 13:12 ` Wolfram Sang
  2019-06-26 13:32   ` Wolfram Sang
  0 siblings, 1 reply; 3+ messages in thread
From: Wolfram Sang @ 2019-06-26 13:12 UTC
  To: Nikita Yushchenko, Ramesh Shanmugasundaram
  Cc: Wolfgang Grandegger, Marc Kleine-Budde, David S. Miller,
	Simon Horman, Wolfram Sang, linux-can, netdev, linux-kernel,
	Artemi Ivanov, Sergei Shtylyov


On Wed, Jun 26, 2019 at 04:08:48PM +0300, Nikita Yushchenko wrote:
> We have observed the rcar_canfd driver entering an IRQ storm under high
> load, with the following scenario:
> - rcar_canfd_global_interrupt() is entered due to Rx available,
> - napi_schedule_prep() is called, and sets NAPIF_STATE_SCHED in state,
> - Rx FIFO interrupts are masked,
> - rcar_canfd_global_interrupt() is entered again, this time due to an
>   error interrupt (e.g. due to overflow),
> - since the scheduled napi poller has not yet executed, the condition for
>   calling napi_schedule_prep() from rcar_canfd_global_interrupt() remains
>   true, thus napi_schedule_prep() gets called and sets the
>   NAPIF_STATE_MISSED flag in state,
> - later, the napi poller function rcar_canfd_rx_poll() gets executed, and
>   calls napi_complete_done(),
> - due to the NAPIF_STATE_MISSED flag in state, this call does not clear
>   the NAPIF_STATE_SCHED flag from state,
> - on return from napi_complete_done(), rcar_canfd_rx_poll() unmasks Rx
>   interrupts,
> - an Rx interrupt happens, rcar_canfd_global_interrupt() gets called
>   and calls napi_schedule_prep(),
> - since NAPIF_STATE_SCHED is set in state at this time, this call
>   returns false,
> - due to that false return, rcar_canfd_global_interrupt() returns
>   without masking the Rx interrupt,
> - and this results in an IRQ storm: the unmasked Rx interrupt happens
>   again and is again misprocessed in the same way.
> 
> This patch fixes that scenario by unmasking Rx interrupts only when
> napi_complete_done() returns true, which means it has cleared
> NAPIF_STATE_SCHED in state.
> 
> Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>

CCing the driver author...

> ---
>  drivers/net/can/rcar/rcar_canfd.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
> index 05410008aa6b..de34a4b82d4a 100644
> --- a/drivers/net/can/rcar/rcar_canfd.c
> +++ b/drivers/net/can/rcar/rcar_canfd.c
> @@ -1508,10 +1508,11 @@ static int rcar_canfd_rx_poll(struct napi_struct *napi, int quota)
>  
>  	/* All packets processed */
>  	if (num_pkts < quota) {
> -		napi_complete_done(napi, num_pkts);
> -		/* Enable Rx FIFO interrupts */
> -		rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
> -				   RCANFD_RFCC_RFIE);
> +		if (napi_complete_done(napi, num_pkts)) {
> +			/* Enable Rx FIFO interrupts */
> +			rcar_canfd_set_bit(priv->base, RCANFD_RFCC(ridx),
> +					   RCANFD_RFCC_RFIE);
> +		}
>  	}
>  	return num_pkts;
>  }
> -- 
> 2.11.0
> 



* Re: [PATCH resend] can: rcar_canfd: fix possible IRQ storm on high load
  2019-06-26 13:12 ` Wolfram Sang
@ 2019-06-26 13:32   ` Wolfram Sang
  0 siblings, 0 replies; 3+ messages in thread
From: Wolfram Sang @ 2019-06-26 13:32 UTC
  To: Nikita Yushchenko
  Cc: Wolfgang Grandegger, Marc Kleine-Budde, David S. Miller,
	Simon Horman, Wolfram Sang, linux-can, netdev, linux-kernel,
	Artemi Ivanov, Sergei Shtylyov


On Wed, Jun 26, 2019 at 03:12:51PM +0200, Wolfram Sang wrote:
> On Wed, Jun 26, 2019 at 04:08:48PM +0300, Nikita Yushchenko wrote:
> > We have observed the rcar_canfd driver entering an IRQ storm under high
> > load, with the following scenario:
> > - rcar_canfd_global_interrupt() is entered due to Rx available,
> > - napi_schedule_prep() is called, and sets NAPIF_STATE_SCHED in state,
> > - Rx FIFO interrupts are masked,
> > - rcar_canfd_global_interrupt() is entered again, this time due to an
> >   error interrupt (e.g. due to overflow),
> > - since the scheduled napi poller has not yet executed, the condition for
> >   calling napi_schedule_prep() from rcar_canfd_global_interrupt() remains
> >   true, thus napi_schedule_prep() gets called and sets the
> >   NAPIF_STATE_MISSED flag in state,
> > - later, the napi poller function rcar_canfd_rx_poll() gets executed, and
> >   calls napi_complete_done(),
> > - due to the NAPIF_STATE_MISSED flag in state, this call does not clear
> >   the NAPIF_STATE_SCHED flag from state,
> > - on return from napi_complete_done(), rcar_canfd_rx_poll() unmasks Rx
> >   interrupts,
> > - an Rx interrupt happens, rcar_canfd_global_interrupt() gets called
> >   and calls napi_schedule_prep(),
> > - since NAPIF_STATE_SCHED is set in state at this time, this call
> >   returns false,
> > - due to that false return, rcar_canfd_global_interrupt() returns
> >   without masking the Rx interrupt,
> > - and this results in an IRQ storm: the unmasked Rx interrupt happens
> >   again and is again misprocessed in the same way.
> > 
> > This patch fixes that scenario by unmasking Rx interrupts only when
> > napi_complete_done() returns true, which means it has cleared
> > NAPIF_STATE_SCHED in state.
> > 
> > Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> 
> CCing the driver author...

Bounced :(




