* [PATCH net 0/4] bnxt: Tx NAPI disabling resiliency improvements
@ 2021-08-11 19:32 Jakub Kicinski
  2021-08-11 19:32 ` [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll Jakub Kicinski
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Jakub Kicinski @ 2021-08-11 19:32 UTC (permalink / raw)
  To: davem
  Cc: michael.chan, huangjw, eddie.wai, prashant, gospo, netdev,
	edwin.peer, Jakub Kicinski

A lockdep warning was triggered by netpoll because napi poll
was taking the xmit lock. Fix that and a couple more issues
noticed while reading the code.

Jakub Kicinski (4):
  bnxt: don't lock the tx queue from napi poll
  bnxt: disable napi before cancelling DIM
  bnxt: make sure xmit_more + errors does not miss doorbells
  bnxt: count Tx drops

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 59 ++++++++++++++---------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 36 insertions(+), 24 deletions(-)

-- 
2.31.1



* [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll
  2021-08-11 19:32 [PATCH net 0/4] bnxt: Tx NAPI disabling resiliency improvements Jakub Kicinski
@ 2021-08-11 19:32 ` Jakub Kicinski
  2021-08-11 20:13   ` Edwin Peer
  2021-08-11 20:42   ` Michael Chan
  2021-08-11 19:32 ` [PATCH net 2/4] bnxt: disable napi before cancelling DIM Jakub Kicinski
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 9+ messages in thread
From: Jakub Kicinski @ 2021-08-11 19:32 UTC (permalink / raw)
  To: davem
  Cc: michael.chan, huangjw, eddie.wai, prashant, gospo, netdev,
	edwin.peer, Jakub Kicinski

We can't take the tx lock from the napi poll routine, because
netpoll can poll napi at any moment, including with the tx lock
already held.

It seems that the tx lock is only protecting against the disable
path; appropriate barriers are already in place to make sure
cleanup can safely run concurrently with start_xmit. I don't see
any other reason why 'stopped && avail > thresh' needs to be
re-checked under the lock.

Remove the tx lock and use synchronize_net() to make sure
closing the device does not race with restarting the queues.
Annotate accesses to dev_state against data races.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 865fcb8cf29f..07827d6b0fec 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -730,15 +730,10 @@ static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
 	 */
 	smp_mb();
 
-	if (unlikely(netif_tx_queue_stopped(txq)) &&
-	    (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) {
-		__netif_tx_lock(txq, smp_processor_id());
-		if (netif_tx_queue_stopped(txq) &&
-		    bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
-		    txr->dev_state != BNXT_DEV_STATE_CLOSING)
-			netif_tx_wake_queue(txq);
-		__netif_tx_unlock(txq);
-	}
+	if (netif_tx_queue_stopped(txq) &&
+	    bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
+	    READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)
+		netif_tx_wake_queue(txq);
 }
 
 static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping,
@@ -9264,9 +9259,11 @@ void bnxt_tx_disable(struct bnxt *bp)
 	if (bp->tx_ring) {
 		for (i = 0; i < bp->tx_nr_rings; i++) {
 			txr = &bp->tx_ring[i];
-			txr->dev_state = BNXT_DEV_STATE_CLOSING;
+			WRITE_ONCE(txr->dev_state, BNXT_DEV_STATE_CLOSING);
 		}
 	}
+	/* Make sure napi polls see @dev_state change */
+	synchronize_net();
 	/* Drop carrier first to prevent TX timeout */
 	netif_carrier_off(bp->dev);
 	/* Stop all TX queues */
@@ -9280,8 +9277,10 @@ void bnxt_tx_enable(struct bnxt *bp)
 
 	for (i = 0; i < bp->tx_nr_rings; i++) {
 		txr = &bp->tx_ring[i];
-		txr->dev_state = 0;
+		WRITE_ONCE(txr->dev_state, 0);
 	}
+	/* Make sure napi polls see @dev_state change */
+	synchronize_net();
 	netif_tx_wake_all_queues(bp->dev);
 	if (bp->link_info.link_up)
 		netif_carrier_on(bp->dev);
-- 
2.31.1



* [PATCH net 2/4] bnxt: disable napi before cancelling DIM
  2021-08-11 19:32 [PATCH net 0/4] bnxt: Tx NAPI disabling resiliency improvements Jakub Kicinski
  2021-08-11 19:32 ` [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll Jakub Kicinski
@ 2021-08-11 19:32 ` Jakub Kicinski
  2021-08-11 19:32 ` [PATCH net 3/4] bnxt: make sure xmit_more + errors does not miss doorbells Jakub Kicinski
  2021-08-11 19:32 ` [PATCH net 4/4] bnxt: count Tx drops Jakub Kicinski
  3 siblings, 0 replies; 9+ messages in thread
From: Jakub Kicinski @ 2021-08-11 19:32 UTC (permalink / raw)
  To: davem
  Cc: michael.chan, huangjw, eddie.wai, prashant, gospo, netdev,
	edwin.peer, Jakub Kicinski

napi schedules the DIM work, so napi has to be disabled
first, and only then can the DIM work be cancelled.

Noticed while reading the code.

Fixes: 0bc0b97fca73 ("bnxt_en: cleanup DIM work on device shutdown")
Fixes: 6a8788f25625 ("bnxt_en: add support for software dynamic interrupt moderation")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 07827d6b0fec..2c0240ee2105 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9222,10 +9222,9 @@ static void bnxt_disable_napi(struct bnxt *bp)
 	for (i = 0; i < bp->cp_nr_rings; i++) {
 		struct bnxt_cp_ring_info *cpr = &bp->bnapi[i]->cp_ring;
 
+		napi_disable(&bp->bnapi[i]->napi);
 		if (bp->bnapi[i]->rx_ring)
 			cancel_work_sync(&cpr->dim.work);
-
-		napi_disable(&bp->bnapi[i]->napi);
 	}
 }
 
-- 
2.31.1



* [PATCH net 3/4] bnxt: make sure xmit_more + errors does not miss doorbells
  2021-08-11 19:32 [PATCH net 0/4] bnxt: Tx NAPI disabling resiliency improvements Jakub Kicinski
  2021-08-11 19:32 ` [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll Jakub Kicinski
  2021-08-11 19:32 ` [PATCH net 2/4] bnxt: disable napi before cancelling DIM Jakub Kicinski
@ 2021-08-11 19:32 ` Jakub Kicinski
  2021-08-11 21:12   ` Michael Chan
  2021-08-11 19:32 ` [PATCH net 4/4] bnxt: count Tx drops Jakub Kicinski
  3 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2021-08-11 19:32 UTC (permalink / raw)
  To: davem
  Cc: michael.chan, huangjw, eddie.wai, prashant, gospo, netdev,
	edwin.peer, Jakub Kicinski

skbs are freed on error and not put on the ring. We may, however,
be in a situation where we're freeing the last skb of a batch,
and there is a doorbell ring pending because xmit_more() was
true earlier. Make sure we ring the doorbell in such situations.

Since errors are rare, don't pay attention to xmit_more() and just
always flush the pending frames.

The ring should never be busy, given that the queue is stopped in
advance, so add a warning there and ignore the busy case.

Noticed while reading the code.

Fixes: 4d172f21cefe ("bnxt_en: Implement xmit_more.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 33 +++++++++++++++--------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2c0240ee2105..b80ed556c28b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -367,6 +367,13 @@ static u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb)
 	return md_dst->u.port_info.port_id;
 }
 
+static void bnxt_txr_db_kick(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
+			     u16 prod)
+{
+	bnxt_db_write(bp, &txr->tx_db, prod);
+	txr->kick_pending = 0;
+}
+
 static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct bnxt *bp = netdev_priv(dev);
@@ -396,6 +403,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	free_size = bnxt_tx_avail(bp, txr);
 	if (unlikely(free_size < skb_shinfo(skb)->nr_frags + 2)) {
 		netif_tx_stop_queue(txq);
+		if (net_ratelimit())
+			netdev_warn(dev, "bnxt: ring busy!\n");
 		return NETDEV_TX_BUSY;
 	}
 
@@ -516,21 +525,16 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 normal_tx:
 	if (length < BNXT_MIN_PKT_SIZE) {
 		pad = BNXT_MIN_PKT_SIZE - length;
-		if (skb_pad(skb, pad)) {
+		if (skb_pad(skb, pad))
 			/* SKB already freed. */
-			tx_buf->skb = NULL;
-			return NETDEV_TX_OK;
-		}
+			goto tx_kick_pending;
 		length = BNXT_MIN_PKT_SIZE;
 	}
 
 	mapping = dma_map_single(&pdev->dev, skb->data, len, DMA_TO_DEVICE);
 
-	if (unlikely(dma_mapping_error(&pdev->dev, mapping))) {
-		dev_kfree_skb_any(skb);
-		tx_buf->skb = NULL;
-		return NETDEV_TX_OK;
-	}
+	if (unlikely(dma_mapping_error(&pdev->dev, mapping)))
+		goto tx_free;
 
 	dma_unmap_addr_set(tx_buf, mapping, mapping);
 	flags = (len << TX_BD_LEN_SHIFT) | TX_BD_TYPE_LONG_TX_BD |
@@ -617,13 +621,15 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	txr->tx_prod = prod;
 
 	if (!netdev_xmit_more() || netif_xmit_stopped(txq))
-		bnxt_db_write(bp, &txr->tx_db, prod);
+		bnxt_txr_db_kick(bp, txr, prod);
+	else
+		txr->kick_pending = 1;
 
 tx_done:
 
 	if (unlikely(bnxt_tx_avail(bp, txr) <= MAX_SKB_FRAGS + 1)) {
 		if (netdev_xmit_more() && !tx_buf->is_push)
-			bnxt_db_write(bp, &txr->tx_db, prod);
+			bnxt_txr_db_kick(bp, txr, prod);
 
 		netif_tx_stop_queue(txq);
 
@@ -661,7 +667,12 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			       PCI_DMA_TODEVICE);
 	}
 
+tx_free:
 	dev_kfree_skb_any(skb);
+tx_kick_pending:
+	tx_buf->skb = NULL;
+	if (txr->kick_pending)
+		bnxt_txr_db_kick(bp, txr, prod);
 	return NETDEV_TX_OK;
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 9c3324e76ff7..7b989b6e4f6e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -799,6 +799,7 @@ struct bnxt_tx_ring_info {
 	u16			tx_prod;
 	u16			tx_cons;
 	u16			txq_index;
+	u8			kick_pending;
 	struct bnxt_db_info	tx_db;
 
 	struct tx_bd		*tx_desc_ring[MAX_TX_PAGES];
-- 
2.31.1



* [PATCH net 4/4] bnxt: count Tx drops
  2021-08-11 19:32 [PATCH net 0/4] bnxt: Tx NAPI disabling resiliency improvements Jakub Kicinski
                   ` (2 preceding siblings ...)
  2021-08-11 19:32 ` [PATCH net 3/4] bnxt: make sure xmit_more + errors does not miss doorbells Jakub Kicinski
@ 2021-08-11 19:32 ` Jakub Kicinski
  3 siblings, 0 replies; 9+ messages in thread
From: Jakub Kicinski @ 2021-08-11 19:32 UTC (permalink / raw)
  To: davem
  Cc: michael.chan, huangjw, eddie.wai, prashant, gospo, netdev,
	edwin.peer, Jakub Kicinski

Drivers should count packets they are dropping.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b80ed556c28b..d5049e714c94 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -393,6 +393,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	i = skb_get_queue_mapping(skb);
 	if (unlikely(i >= bp->tx_nr_rings)) {
 		dev_kfree_skb_any(skb);
+		atomic_long_inc(&dev->tx_dropped);
 		return NETDEV_TX_OK;
 	}
 
@@ -673,6 +674,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	tx_buf->skb = NULL;
 	if (txr->kick_pending)
 		bnxt_txr_db_kick(bp, txr, prod);
+	atomic_long_inc(&dev->tx_dropped);
 	return NETDEV_TX_OK;
 }
 
-- 
2.31.1



* Re: [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll
  2021-08-11 19:32 ` [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll Jakub Kicinski
@ 2021-08-11 20:13   ` Edwin Peer
  2021-08-11 20:42   ` Michael Chan
  1 sibling, 0 replies; 9+ messages in thread
From: Edwin Peer @ 2021-08-11 20:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, Michael Chan, Jeffrey Huang, Eddie Wai,
	Prashant Sreedharan, Andrew Gospodarek, netdev

On Wed, Aug 11, 2021 at 12:32 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> We can't take the tx lock from the napi poll routine, because
> netpoll can poll napi at any moment, including with the tx lock
> already held.
>
> It seems that the tx lock is only protecting against the disable
> path; appropriate barriers are already in place to make sure
> cleanup can safely run concurrently with start_xmit. I don't see
> any other reason why 'stopped && avail > thresh' needs to be
> re-checked under the lock.
>
> Remove the tx lock and use synchronize_net() to make sure
> closing the device does not race with restarting the queues.
> Annotate accesses to dev_state against data races.
>
> Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 865fcb8cf29f..07827d6b0fec 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -730,15 +730,10 @@ static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
>          */
>         smp_mb();
>
> -       if (unlikely(netif_tx_queue_stopped(txq)) &&
> -           (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) {
> -               __netif_tx_lock(txq, smp_processor_id());
> -               if (netif_tx_queue_stopped(txq) &&
> -                   bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> -                   txr->dev_state != BNXT_DEV_STATE_CLOSING)
> -                       netif_tx_wake_queue(txq);
> -               __netif_tx_unlock(txq);
> -       }
> +       if (netif_tx_queue_stopped(txq) &&

Probably worth retaining the unlikely() that the original outer check had.

> +           bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> +           READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)
> +               netif_tx_wake_queue(txq);
>  }
>
>  static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping,
> @@ -9264,9 +9259,11 @@ void bnxt_tx_disable(struct bnxt *bp)
>         if (bp->tx_ring) {
>                 for (i = 0; i < bp->tx_nr_rings; i++) {
>                         txr = &bp->tx_ring[i];
> -                       txr->dev_state = BNXT_DEV_STATE_CLOSING;
> +                       WRITE_ONCE(txr->dev_state, BNXT_DEV_STATE_CLOSING);
>                 }
>         }
> +       /* Make sure napi polls see @dev_state change */
> +       synchronize_net();
>         /* Drop carrier first to prevent TX timeout */
>         netif_carrier_off(bp->dev);
>         /* Stop all TX queues */
> @@ -9280,8 +9277,10 @@ void bnxt_tx_enable(struct bnxt *bp)
>
>         for (i = 0; i < bp->tx_nr_rings; i++) {
>                 txr = &bp->tx_ring[i];
> -               txr->dev_state = 0;
> +               WRITE_ONCE(txr->dev_state, 0);
>         }
> +       /* Make sure napi polls see @dev_state change */
> +       synchronize_net();
>         netif_tx_wake_all_queues(bp->dev);
>         if (bp->link_info.link_up)
>                 netif_carrier_on(bp->dev);
> --
> 2.31.1

Regards,
Edwin Peer


* Re: [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll
  2021-08-11 19:32 ` [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll Jakub Kicinski
  2021-08-11 20:13   ` Edwin Peer
@ 2021-08-11 20:42   ` Michael Chan
  2021-08-11 21:09     ` Jakub Kicinski
  1 sibling, 1 reply; 9+ messages in thread
From: Michael Chan @ 2021-08-11 20:42 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, Jeffrey Huang, Eddie Wai, Prashant Sreedharan,
	Andrew Gospodarek, Netdev, Edwin Peer


On Wed, Aug 11, 2021 at 12:32 PM Jakub Kicinski <kuba@kernel.org> wrote:

> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 865fcb8cf29f..07827d6b0fec 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -730,15 +730,10 @@ static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
>          */
>         smp_mb();
>
> -       if (unlikely(netif_tx_queue_stopped(txq)) &&
> -           (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) {
> -               __netif_tx_lock(txq, smp_processor_id());
> -               if (netif_tx_queue_stopped(txq) &&
> -                   bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> -                   txr->dev_state != BNXT_DEV_STATE_CLOSING)
> -                       netif_tx_wake_queue(txq);
> -               __netif_tx_unlock(txq);
> -       }
> +       if (netif_tx_queue_stopped(txq) &&
> +           bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> +           READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)

This can race with bnxt_start_xmit().  bnxt_start_xmit() can also wake
up the queue when it sees that descriptors are available.  I think
this is the reason we added tx locking here.  The race may be ok
because in the worst case, we will wake up the TX queue when it's not
supposed to wakeup.  If that happens, bnxt_start_xmit() will return
NETDEV_TX_BUSY and stop the queue again when there are not enough TX
descriptors.

> +               netif_tx_wake_queue(txq);
>  }
>
>  static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping,


* Re: [PATCH net 1/4] bnxt: don't lock the tx queue from napi poll
  2021-08-11 20:42   ` Michael Chan
@ 2021-08-11 21:09     ` Jakub Kicinski
  0 siblings, 0 replies; 9+ messages in thread
From: Jakub Kicinski @ 2021-08-11 21:09 UTC (permalink / raw)
  To: Michael Chan
  Cc: David Miller, Jeffrey Huang, Eddie Wai, Prashant Sreedharan,
	Andrew Gospodarek, Netdev, Edwin Peer

On Wed, 11 Aug 2021 13:42:40 -0700 Michael Chan wrote:
> > -       if (unlikely(netif_tx_queue_stopped(txq)) &&
> > -           (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) {
> > -               __netif_tx_lock(txq, smp_processor_id());
> > -               if (netif_tx_queue_stopped(txq) &&
> > -                   bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> > -                   txr->dev_state != BNXT_DEV_STATE_CLOSING)
> > -                       netif_tx_wake_queue(txq);
> > -               __netif_tx_unlock(txq);
> > -       }
> > +       if (netif_tx_queue_stopped(txq) &&
> > +           bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> > +           READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)  
> 
> This can race with bnxt_start_xmit().  bnxt_start_xmit() can also wake
> up the queue when it sees that descriptors are available.  I think
> this is the reason we added tx locking here.  The race may be ok
> because in the worst case, we will wake up the TX queue when it's not
> supposed to wakeup.  If that happens, bnxt_start_xmit() will return
> NETDEV_TX_BUSY and stop the queue again when there are not enough TX
> descriptors.

Good point, let me remove the warning from patch 3, then.


* Re: [PATCH net 3/4] bnxt: make sure xmit_more + errors does not miss doorbells
  2021-08-11 19:32 ` [PATCH net 3/4] bnxt: make sure xmit_more + errors does not miss doorbells Jakub Kicinski
@ 2021-08-11 21:12   ` Michael Chan
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Chan @ 2021-08-11 21:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, Jeffrey Huang, Eddie Wai, Prashant Sreedharan,
	Andrew Gospodarek, Netdev, Edwin Peer


On Wed, Aug 11, 2021 at 12:32 PM Jakub Kicinski <kuba@kernel.org> wrote:

> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 2c0240ee2105..b80ed556c28b 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -367,6 +367,13 @@ static u16 bnxt_xmit_get_cfa_action(struct sk_buff *skb)
>         return md_dst->u.port_info.port_id;
>  }
>
> +static void bnxt_txr_db_kick(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
> +                            u16 prod)
> +{
> +       bnxt_db_write(bp, &txr->tx_db, prod);
> +       txr->kick_pending = 0;
> +}
> +
>  static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>         struct bnxt *bp = netdev_priv(dev);
> @@ -396,6 +403,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
>         free_size = bnxt_tx_avail(bp, txr);
>         if (unlikely(free_size < skb_shinfo(skb)->nr_frags + 2)) {
>                 netif_tx_stop_queue(txq);
> +               if (net_ratelimit())
> +                       netdev_warn(dev, "bnxt: ring busy!\n");
>                 return NETDEV_TX_BUSY;
>         }
>
> @@ -516,21 +525,16 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  normal_tx:
>         if (length < BNXT_MIN_PKT_SIZE) {
>                 pad = BNXT_MIN_PKT_SIZE - length;
> -               if (skb_pad(skb, pad)) {
> +               if (skb_pad(skb, pad))
>                         /* SKB already freed. */
> -                       tx_buf->skb = NULL;
> -                       return NETDEV_TX_OK;
> -               }
> +                       goto tx_kick_pending;
>                 length = BNXT_MIN_PKT_SIZE;
>         }
>
>         mapping = dma_map_single(&pdev->dev, skb->data, len, DMA_TO_DEVICE);
>
> -       if (unlikely(dma_mapping_error(&pdev->dev, mapping))) {
> -               dev_kfree_skb_any(skb);
> -               tx_buf->skb = NULL;
> -               return NETDEV_TX_OK;
> -       }
> +       if (unlikely(dma_mapping_error(&pdev->dev, mapping)))
> +               goto tx_free;
>
>         dma_unmap_addr_set(tx_buf, mapping, mapping);
>         flags = (len << TX_BD_LEN_SHIFT) | TX_BD_TYPE_LONG_TX_BD |
> @@ -617,13 +621,15 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
>         txr->tx_prod = prod;
>
>         if (!netdev_xmit_more() || netif_xmit_stopped(txq))
> -               bnxt_db_write(bp, &txr->tx_db, prod);
> +               bnxt_txr_db_kick(bp, txr, prod);
> +       else
> +               txr->kick_pending = 1;
>
>  tx_done:
>
>         if (unlikely(bnxt_tx_avail(bp, txr) <= MAX_SKB_FRAGS + 1)) {
>                 if (netdev_xmit_more() && !tx_buf->is_push)
> -                       bnxt_db_write(bp, &txr->tx_db, prod);
> +                       bnxt_txr_db_kick(bp, txr, prod);
>
>                 netif_tx_stop_queue(txq);
>
> @@ -661,7 +667,12 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
>                                PCI_DMA_TODEVICE);
>         }
>
> +tx_free:
>         dev_kfree_skb_any(skb);
> +tx_kick_pending:
> +       tx_buf->skb = NULL;
> +       if (txr->kick_pending)
> +               bnxt_txr_db_kick(bp, txr, prod);

prod may not be the correct value if we have gone through the
tx_dma_error path.  I think we should always use txr->tx_prod here.

Thanks.

>         return NETDEV_TX_OK;
>  }
>
