All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
@ 2018-09-25 21:31 Michael Chan
  2018-09-25 22:15 ` Eric Dumazet
  2018-09-25 23:35 ` Song Liu
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Chan @ 2018-09-25 21:31 UTC (permalink / raw)
  To: songliubraving, edumazet, davem; +Cc: netdev

The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events.  bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting.  In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.

We fix it by handling all TX completions and to always ARM the IRQ
when we exit ->poll() with 0 budget.

Reported-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 61957b0..c981b53 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1913,7 +1913,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 		}
 		raw_cons = NEXT_RAW_CMP(raw_cons);
 
-		if (rx_pkts == budget)
+		if (rx_pkts && rx_pkts == budget)
 			break;
 	}
 
@@ -2027,8 +2027,12 @@ static int bnxt_poll(struct napi_struct *napi, int budget)
 	while (1) {
 		work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
 
-		if (work_done >= budget)
+		if (work_done >= budget) {
+			if (!budget)
+				BNXT_CP_DB_REARM(cpr->cp_doorbell,
+						 cpr->cp_raw_cons);
 			break;
+		}
 
 		if (!bnxt_has_work(bp, cpr)) {
 			if (napi_complete_done(napi, work_done))
-- 
2.5.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
  2018-09-25 21:31 [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll Michael Chan
@ 2018-09-25 22:15 ` Eric Dumazet
  2018-09-25 23:11   ` Michael Chan
  2018-09-25 23:35 ` Song Liu
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2018-09-25 22:15 UTC (permalink / raw)
  To: Michael Chan, songliubraving, edumazet, davem; +Cc: netdev



On 09/25/2018 02:31 PM, Michael Chan wrote:
> The current netpoll implementation in the bnxt_en driver has problems
> that may miss TX completion events.  bnxt_poll_work() in effect is
> only handling at most 1 TX packet before exiting.  In addition,
> there may be in flight TX completions that ->poll() may miss even
> after we fix bnxt_poll_work() to handle all visible TX completions.
> netpoll may not call ->poll() again and HW may not generate IRQ
> because the driver does not ARM the IRQ when the budget (0 for netpoll)
> is reached.
> 
> We fix it by handling all TX completions and to always ARM the IRQ
> when we exit ->poll() with 0 budget.
> 
> Reported-by: Song Liu <songliubraving@fb.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 61957b0..c981b53 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -1913,7 +1913,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
>  		}
>  		raw_cons = NEXT_RAW_CMP(raw_cons);
>  
> -		if (rx_pkts == budget)
> +		if (rx_pkts && rx_pkts == budget)
>  			break;
>  	}
>  
> @@ -2027,8 +2027,12 @@ static int bnxt_poll(struct napi_struct *napi, int budget)
>  	while (1) {
>  		work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
>  
> -		if (work_done >= budget)
> +		if (work_done >= budget) {
> +			if (!budget)
> +				BNXT_CP_DB_REARM(cpr->cp_doorbell,
> +						 cpr->cp_raw_cons);
>  			break;
> +		}
>  
>  		if (!bnxt_has_work(bp, cpr)) {
>  			if (napi_complete_done(napi, work_done))
> 

Hi Michael, thanks for the patch.

It seems bnx2 should have a similar issue ?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
  2018-09-25 22:15 ` Eric Dumazet
@ 2018-09-25 23:11   ` Michael Chan
  2018-09-26  2:14     ` Michael Chan
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Chan @ 2018-09-25 23:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Song Liu, Eric Dumazet, David Miller, Netdev

On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:

>
> It seems bnx2 should have a similar issue ?
>

Yes, I think so.  The MSIX mode in bnx2 is also auto-masking, meaning
that MSIX will only assert once after it is ARMed.  If we return from
->poll() when budget of 0 is reached without ARMing, we may not get
another MSIX.

I can work on a similar patch but I don't have bnx2 cards to test with
anymore.  Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
  2018-09-25 21:31 [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll Michael Chan
  2018-09-25 22:15 ` Eric Dumazet
@ 2018-09-25 23:35 ` Song Liu
  1 sibling, 0 replies; 7+ messages in thread
From: Song Liu @ 2018-09-25 23:35 UTC (permalink / raw)
  To: Michael Chan; +Cc: edumazet, davem, netdev

Thanks Michael!

This works well in my tests. 


Reviewed-and-tested-by: Song Liu <songliubraving@fb.com>


> On Sep 25, 2018, at 2:31 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> 
> The current netpoll implementation in the bnxt_en driver has problems
> that may miss TX completion events.  bnxt_poll_work() in effect is
> only handling at most 1 TX packet before exiting.  In addition,
> there may be in flight TX completions that ->poll() may miss even
> after we fix bnxt_poll_work() to handle all visible TX completions.
> netpoll may not call ->poll() again and HW may not generate IRQ
> because the driver does not ARM the IRQ when the budget (0 for netpoll)
> is reached.
> 
> We fix it by handling all TX completions and to always ARM the IRQ
> when we exit ->poll() with 0 budget.
> 
> Reported-by: Song Liu <songliubraving@fb.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 61957b0..c981b53 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -1913,7 +1913,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
> 		}
> 		raw_cons = NEXT_RAW_CMP(raw_cons);
> 
> -		if (rx_pkts == budget)
> +		if (rx_pkts && rx_pkts == budget)
> 			break;
> 	}
> 
> @@ -2027,8 +2027,12 @@ static int bnxt_poll(struct napi_struct *napi, int budget)
> 	while (1) {
> 		work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
> 
> -		if (work_done >= budget)
> +		if (work_done >= budget) {
> +			if (!budget)
> +				BNXT_CP_DB_REARM(cpr->cp_doorbell,
> +						 cpr->cp_raw_cons);
> 			break;
> +		}
> 
> 		if (!bnxt_has_work(bp, cpr)) {
> 			if (napi_complete_done(napi, work_done))
> -- 
> 2.5.1
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
  2018-09-25 23:11   ` Michael Chan
@ 2018-09-26  2:14     ` Michael Chan
  2018-09-26  2:25       ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Chan @ 2018-09-26  2:14 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Song Liu, Eric Dumazet, David Miller, Netdev

On Tue, Sep 25, 2018 at 4:11 PM Michael Chan <michael.chan@broadcom.com> wrote:
>
> On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> >
> > It seems bnx2 should have a similar issue ?
> >
>
> Yes, I think so.  The MSIX mode in bnx2 is also auto-masking, meaning
> that MSIX will only assert once after it is ARMed.  If we return from
> ->poll() when budget of 0 is reached without ARMing, we may not get
> another MSIX.
>

On second thought, I think bnx2 is ok.  If netpoll is polling on the
TX packets and reaching budget of 0 and returning, the INT_ACK_CMD
register is untouched.  bnx2 uses the status block for events and the
producers/consumers are cumulative.  So there is no need to ACK the
status block unless ARMing for interrupts.  If there is an IRQ about
to be fired, it won't be affected by the polling done by netpoll.

In the case of bnxt, a completion ring is used for the events.  The
polling done by netpoll will cause the completion ring to be ACKed as
entries are processed.  ACKing the completion ring without ARMing may
cause future IRQs to be disabled for that ring.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
  2018-09-26  2:14     ` Michael Chan
@ 2018-09-26  2:25       ` Eric Dumazet
  2018-09-26  2:54         ` Michael Chan
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2018-09-26  2:25 UTC (permalink / raw)
  To: Michael Chan; +Cc: Eric Dumazet, Song Liu, David Miller, netdev

On Tue, Sep 25, 2018 at 7:15 PM Michael Chan <michael.chan@broadcom.com> wrote:
>
> On Tue, Sep 25, 2018 at 4:11 PM Michael Chan <michael.chan@broadcom.com> wrote:
> >
> > On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > >
> > > It seems bnx2 should have a similar issue ?
> > >
> >
> > Yes, I think so.  The MSIX mode in bnx2 is also auto-masking, meaning
> > that MSIX will only assert once after it is ARMed.  If we return from
> > ->poll() when budget of 0 is reached without ARMing, we may not get
> > another MSIX.
> >
>
> On second thought, I think bnx2 is ok.  If netpoll is polling on the
> TX packets and reaching budget of 0 and returning, the INT_ACK_CMD
> register is untouched.  bnx2 uses the status block for events and the
> producers/consumers are cumulative.  So there is no need to ACK the
> status block unless ARMing for interrupts.  If there is an IRQ about
> to be fired, it won't be affected by the polling done by netpoll.
>
> In the case of bnxt, a completion ring is used for the events.  The
> polling done by netpoll will cause the completion ring to be ACKed as
> entries are processed.  ACKing the completion ring without ARMing may
> cause future IRQs to be disabled for that ring.

About bnxt : Are you sure it is all about IRQ problems ?

What if the whole ring buffer is is filled, then all entries
are processed from netpoll.

If cp_raw_cons becomes too high without the NIC knowing its (updated)
value, maybe no IRQ can be generated anymore because
of some wrapping issue (based on ring size)

I guess that in order to test this, we would need something bursting
16000 messages while holding napi->poll_owner.
The (single) IRQ would set/grab the SCHED bit but the cpu responsible
to service this (soft)irq would spin for the whole test,
and no more IRQ should be fired really.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
  2018-09-26  2:25       ` Eric Dumazet
@ 2018-09-26  2:54         ` Michael Chan
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2018-09-26  2:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Eric Dumazet, Song Liu, David Miller, Netdev

On Tue, Sep 25, 2018 at 7:25 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 25, 2018 at 7:15 PM Michael Chan <michael.chan@broadcom.com> wrote:
> >
> > On Tue, Sep 25, 2018 at 4:11 PM Michael Chan <michael.chan@broadcom.com> wrote:
> > >
> > > On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > >
> > > >
> > > > It seems bnx2 should have a similar issue ?
> > > >
> > >
> > > Yes, I think so.  The MSIX mode in bnx2 is also auto-masking, meaning
> > > that MSIX will only assert once after it is ARMed.  If we return from
> > > ->poll() when budget of 0 is reached without ARMing, we may not get
> > > another MSIX.
> > >
> >
> > On second thought, I think bnx2 is ok.  If netpoll is polling on the
> > TX packets and reaching budget of 0 and returning, the INT_ACK_CMD
> > register is untouched.  bnx2 uses the status block for events and the
> > producers/consumers are cumulative.  So there is no need to ACK the
> > status block unless ARMing for interrupts.  If there is an IRQ about
> > to be fired, it won't be affected by the polling done by netpoll.
> >
> > In the case of bnxt, a completion ring is used for the events.  The
> > polling done by netpoll will cause the completion ring to be ACKed as
> > entries are processed.  ACKing the completion ring without ARMing may
> > cause future IRQs to be disabled for that ring.
>
> About bnxt : Are you sure it is all about IRQ problems ?

I'm pretty sure, because FB first reported TX timeouts followed by
ring reset failures when running netconsole.  These ring reset
failures are caused by IRQs no longer working on some rings.

>
> What if the whole ring buffer is is filled, then all entries
> are processed from netpoll.
>
> If cp_raw_cons becomes too high without the NIC knowing its (updated)
> value, maybe no IRQ can be generated anymore because
> of some wrapping issue (based on ring size)

Good point.  We have logic to handle that.  We will ACK the ring at
least once every tp->tx_wake_thresh TX packets.  But this logic fails
when the budget is 0, so I need to send a revised patch take care of
this one case.

>
> I guess that in order to test this, we would need something bursting
> 16000 messages while holding napi->poll_owner.
> The (single) IRQ would set/grab the SCHED bit but the cpu responsible
> to service this (soft)irq would spin for the whole test,
> and no more IRQ should be fired really.

Right, not easy to hit.  But it should be handled by my v2 patch.  Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-09-26  9:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-25 21:31 [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll Michael Chan
2018-09-25 22:15 ` Eric Dumazet
2018-09-25 23:11   ` Michael Chan
2018-09-26  2:14     ` Michael Chan
2018-09-26  2:25       ` Eric Dumazet
2018-09-26  2:54         ` Michael Chan
2018-09-25 23:35 ` Song Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.