linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-next 0/6] Perf and debug fixes for hfi
@ 2021-09-13 13:28 Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 1/6] IB/hfi1: Remove cache and embed txreq in ring Dennis Dalessandro
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Dennis Dalessandro @ 2021-09-13 13:28 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma

Here is a series of perf improvements and debug/trace fixes from Mike,
who has this to say about the patches...

The AIP SDMA interrupt handling is inefficient:

- A slab entry is allocated for each sent packet

  This is despite the fact that there is a ring for each possible send slot
  that could be occupied by a tx descriptor

- The interrupt handling/NAPI is lock happy has a mixed up notion of
  producer and consumer

  The ring should be a ring of tx descriptors vs. a ring of pointers

  The consumer of descriptors should be the xmit side of the TX

  The producer of the descriptors is the SDMA interrupt handling and NAPI
  tx completion

  There is certainly no locking required in the interrupt/TX napi tx queue

  There is no locking required in the xmit side since that is held off by NAPI
  code

Note that these patches are also staged publicly on our GitHub site for easy
browsing in context.

https://github.com/cornelisnetworks/linux

---

Mike Marciniszyn (6):
      IB/hfi1: Remove cache and embed txreq in ring
      IB/hfi1: Get rid of hot path divide
      IB/hfi1: Get rid of tx priv backpointer
      IB/hfi1: Tune netdev xmit cachelines
      IB/hfi1: Remove atomic completion count
      IB/hfi1: Add ring consumer and producers traces


 drivers/infiniband/hw/hfi1/ipoib.h    |   76 +++++---
 drivers/infiniband/hw/hfi1/ipoib_tx.c |  314 ++++++++++++++-------------------
 drivers/infiniband/hw/hfi1/trace_tx.h |   71 +++++++
 3 files changed, 246 insertions(+), 215 deletions(-)

--
-Denny

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH for-next 1/6] IB/hfi1: Remove cache and embed txreq in ring
  2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
@ 2021-09-13 13:28 ` Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 2/6] IB/hfi1: Get rid of hot path divide Dennis Dalessandro
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Dennis Dalessandro @ 2021-09-13 13:28 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

This patch removes kmem cache allocation and deallocation in
favor of having the ipoib_txreq in the ring.

The consumer is now the packet sending side allocating tx descriptors
from ring and the producer is the napi interrupt handling freeing
tx descriptors.

The locks are now eliminated because the napi tx lock insures a single
consumer and the napi handling insures a single producer.

The napi poll is converted to memory poll looking for items that have been
marked completed.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/ipoib.h    |   17 +--
 drivers/infiniband/hw/hfi1/ipoib_tx.c |  212 +++++++++++++++------------------
 2 files changed, 101 insertions(+), 128 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib.h b/drivers/infiniband/hw/hfi1/ipoib.h
index 2cff38b..4e91b70 100644
--- a/drivers/infiniband/hw/hfi1/ipoib.h
+++ b/drivers/infiniband/hw/hfi1/ipoib.h
@@ -45,21 +45,19 @@
 
 /**
  * struct hfi1_ipoib_circ_buf - List of items to be processed
- * @items: ring of items
+ * @items: ring of items each a power of two size
  * @head: ring head
  * @tail: ring tail
  * @max_items: max items + 1 that the ring can contain
- * @producer_lock: producer sync lock
- * @consumer_lock: consumer sync lock
+ * @shift: log2 of size for getting txreq
  */
 struct ipoib_txreq;
 struct hfi1_ipoib_circ_buf {
-	struct ipoib_txreq **items;
-	unsigned long head;
-	unsigned long tail;
-	unsigned long max_items;
-	spinlock_t producer_lock; /* head sync lock */
-	spinlock_t consumer_lock; /* tail sync lock */
+	void *items;
+	u32 head;
+	u32 tail;
+	u32 max_items;
+	u32 shift;
 };
 
 /**
@@ -102,7 +100,6 @@ struct hfi1_ipoib_dev_priv {
 	struct net_device   *netdev;
 	struct ib_device    *device;
 	struct hfi1_ipoib_txq *txqs;
-	struct kmem_cache *txreq_cache;
 	struct napi_struct *tx_napis;
 	u16 pkey;
 	u16 pkey_index;
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index e74ddbe..0a5d327 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -27,6 +27,7 @@
  * @txreq: sdma transmit request
  * @sdma_hdr: 9b ib headers
  * @sdma_status: status returned by sdma engine
+ * @complete: non-zero implies complete
  * @priv: ipoib netdev private data
  * @txq: txq on which skb was output
  * @skb: skb to send
@@ -35,6 +36,7 @@ struct ipoib_txreq {
 	struct sdma_txreq           txreq;
 	struct hfi1_sdma_header     sdma_hdr;
 	int                         sdma_status;
+	int                         complete;
 	struct hfi1_ipoib_dev_priv *priv;
 	struct hfi1_ipoib_txq      *txq;
 	struct sk_buff             *skb;
@@ -51,7 +53,13 @@ struct ipoib_txparms {
 	u8                          entropy;
 };
 
-static u64 hfi1_ipoib_txreqs(const u64 sent, const u64 completed)
+static struct ipoib_txreq *
+hfi1_txreq_from_idx(struct hfi1_ipoib_circ_buf *r, u32 idx)
+{
+	return (struct ipoib_txreq *)(r->items + (idx << r->shift));
+}
+
+static u32 hfi1_ipoib_txreqs(const u64 sent, const u64 completed)
 {
 	return sent - completed;
 }
@@ -139,51 +147,55 @@ static void hfi1_ipoib_free_tx(struct ipoib_txreq *tx, int budget)
 	}
 
 	napi_consume_skb(tx->skb, budget);
+	tx->skb = NULL;
 	sdma_txclean(priv->dd, &tx->txreq);
-	kmem_cache_free(priv->txreq_cache, tx);
 }
 
-static int hfi1_ipoib_drain_tx_ring(struct hfi1_ipoib_txq *txq, int budget)
+static void hfi1_ipoib_drain_tx_ring(struct hfi1_ipoib_txq *txq)
 {
 	struct hfi1_ipoib_circ_buf *tx_ring = &txq->tx_ring;
-	unsigned long head;
-	unsigned long tail;
-	unsigned int max_tx;
-	int work_done;
-	int tx_count;
-
-	spin_lock_bh(&tx_ring->consumer_lock);
-
-	/* Read index before reading contents at that index. */
-	head = smp_load_acquire(&tx_ring->head);
-	tail = tx_ring->tail;
-	max_tx = tx_ring->max_items;
-
-	work_done = min_t(int, CIRC_CNT(head, tail, max_tx), budget);
+	int i;
+	struct ipoib_txreq *tx;
 
-	for (tx_count = work_done; tx_count; tx_count--) {
-		hfi1_ipoib_free_tx(tx_ring->items[tail], budget);
-		tail = CIRC_NEXT(tail, max_tx);
+	for (i = 0; i < tx_ring->max_items; i++) {
+		tx = hfi1_txreq_from_idx(tx_ring, i);
+		tx->complete = 0;
+		dev_kfree_skb_any(tx->skb);
+		tx->skb = NULL;
+		sdma_txclean(txq->priv->dd, &tx->txreq);
 	}
-
-	atomic64_add(work_done, &txq->complete_txreqs);
-
-	/* Finished freeing tx items so store the tail value. */
-	smp_store_release(&tx_ring->tail, tail);
-
-	spin_unlock_bh(&tx_ring->consumer_lock);
-
-	hfi1_ipoib_check_queue_stopped(txq);
-
-	return work_done;
+	tx_ring->head = 0;
+	tx_ring->tail = 0;
+	atomic64_set(&txq->complete_txreqs, 0);
+	txq->sent_txreqs = 0;
 }
 
-static int hfi1_ipoib_process_tx_ring(struct napi_struct *napi, int budget)
+static int hfi1_ipoib_poll_tx_ring(struct napi_struct *napi, int budget)
 {
 	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(napi->dev);
 	struct hfi1_ipoib_txq *txq = &priv->txqs[napi - priv->tx_napis];
+	struct hfi1_ipoib_circ_buf *tx_ring = &txq->tx_ring;
+	u32 head = tx_ring->head;
+	u32 max_tx = tx_ring->max_items;
+	int work_done;
+	struct ipoib_txreq *tx =  hfi1_txreq_from_idx(tx_ring, head);
 
-	int work_done = hfi1_ipoib_drain_tx_ring(txq, budget);
+	trace_hfi1_txq_poll(txq);
+	for (work_done = 0; work_done < budget; work_done++) {
+		/* See hfi1_ipoib_sdma_complete() */
+		if (!smp_load_acquire(&tx->complete))
+			break;
+		tx->complete = 0;
+		hfi1_ipoib_free_tx(tx, budget);
+		head = CIRC_NEXT(head, max_tx);
+		tx =  hfi1_txreq_from_idx(tx_ring, head);
+	}
+	atomic64_add(work_done, &txq->complete_txreqs);
+
+	/* Finished freeing tx items so store the head value. */
+	smp_store_release(&tx_ring->head, head);
+
+	hfi1_ipoib_check_queue_stopped(txq);
 
 	if (work_done < budget)
 		napi_complete_done(napi, work_done);
@@ -191,45 +203,15 @@ static int hfi1_ipoib_process_tx_ring(struct napi_struct *napi, int budget)
 	return work_done;
 }
 
-static void hfi1_ipoib_add_tx(struct ipoib_txreq *tx)
-{
-	struct hfi1_ipoib_circ_buf *tx_ring = &tx->txq->tx_ring;
-	unsigned long head;
-	unsigned long tail;
-	size_t max_tx;
-
-	spin_lock(&tx_ring->producer_lock);
-
-	head = tx_ring->head;
-	tail = READ_ONCE(tx_ring->tail);
-	max_tx = tx_ring->max_items;
-
-	if (likely(CIRC_SPACE(head, tail, max_tx))) {
-		tx_ring->items[head] = tx;
-
-		/* Finish storing txreq before incrementing head. */
-		smp_store_release(&tx_ring->head, CIRC_ADD(head, 1, max_tx));
-		napi_schedule_irqoff(tx->txq->napi);
-	} else {
-		struct hfi1_ipoib_txq *txq = tx->txq;
-		struct hfi1_ipoib_dev_priv *priv = tx->priv;
-
-		/* Ring was full */
-		hfi1_ipoib_free_tx(tx, 0);
-		atomic64_inc(&txq->complete_txreqs);
-		dd_dev_dbg(priv->dd, "txq %d full.\n", txq->q_idx);
-	}
-
-	spin_unlock(&tx_ring->producer_lock);
-}
-
 static void hfi1_ipoib_sdma_complete(struct sdma_txreq *txreq, int status)
 {
 	struct ipoib_txreq *tx = container_of(txreq, struct ipoib_txreq, txreq);
 
+	trace_hfi1_txq_complete(tx->txq);
 	tx->sdma_status = status;
-
-	hfi1_ipoib_add_tx(tx);
+	/* see hfi1_ipoib_poll_tx_ring */
+	smp_store_release(&tx->complete, 1);
+	napi_schedule_irqoff(tx->txq->napi);
 }
 
 static int hfi1_ipoib_build_ulp_payload(struct ipoib_txreq *tx,
@@ -385,19 +367,24 @@ static struct ipoib_txreq *hfi1_ipoib_send_dma_common(struct net_device *dev,
 						      struct ipoib_txparms *txp)
 {
 	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
+	struct hfi1_ipoib_txq *txq = txp->txq;
 	struct ipoib_txreq *tx;
+	struct hfi1_ipoib_circ_buf *tx_ring;
+	u32 tail;
 	int ret;
 
-	tx = kmem_cache_alloc_node(priv->txreq_cache,
-				   GFP_ATOMIC,
-				   priv->dd->node);
-	if (unlikely(!tx))
+	if (unlikely(hfi1_ipoib_used(txq) >= hfi1_ipoib_ring_hwat(txq)))
+		/* This shouldn't happen with a stopped queue */
 		return ERR_PTR(-ENOMEM);
+	tx_ring = &txq->tx_ring;
+	tail = tx_ring->tail;
+	tx = hfi1_txreq_from_idx(tx_ring, tx_ring->tail);
+	trace_hfi1_txq_alloc_tx(txq);
 
 	/* so that we can test if the sdma descriptors are there */
 	tx->txreq.num_desc = 0;
 	tx->priv = priv;
-	tx->txq = txp->txq;
+	tx->txq = txq;
 	tx->skb = skb;
 	INIT_LIST_HEAD(&tx->txreq.list);
 
@@ -405,21 +392,20 @@ static struct ipoib_txreq *hfi1_ipoib_send_dma_common(struct net_device *dev,
 
 	ret = hfi1_ipoib_build_tx_desc(tx, txp);
 	if (likely(!ret)) {
-		if (txp->txq->flow.as_int != txp->flow.as_int) {
-			txp->txq->flow.tx_queue = txp->flow.tx_queue;
-			txp->txq->flow.sc5 = txp->flow.sc5;
-			txp->txq->sde =
+		if (txq->flow.as_int != txp->flow.as_int) {
+			txq->flow.tx_queue = txp->flow.tx_queue;
+			txq->flow.sc5 = txp->flow.sc5;
+			txq->sde =
 				sdma_select_engine_sc(priv->dd,
 						      txp->flow.tx_queue,
 						      txp->flow.sc5);
-			trace_hfi1_flow_switch(txp->txq);
+			trace_hfi1_flow_switch(txq);
 		}
 
 		return tx;
 	}
 
 	sdma_txclean(priv->dd, &tx->txreq);
-	kmem_cache_free(priv->txreq_cache, tx);
 
 	return ERR_PTR(ret);
 }
@@ -480,8 +466,8 @@ static int hfi1_ipoib_send_dma_single(struct net_device *dev,
 				      struct sk_buff *skb,
 				      struct ipoib_txparms *txp)
 {
-	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
 	struct hfi1_ipoib_txq *txq = txp->txq;
+	struct hfi1_ipoib_circ_buf *tx_ring;
 	struct ipoib_txreq *tx;
 	int ret;
 
@@ -499,6 +485,9 @@ static int hfi1_ipoib_send_dma_single(struct net_device *dev,
 		return NETDEV_TX_OK;
 	}
 
+	tx_ring = &txq->tx_ring;
+	/* consume tx */
+	smp_store_release(&tx_ring->tail, CIRC_NEXT(tx_ring->tail, tx_ring->max_items));
 	ret = hfi1_ipoib_submit_tx(txq, tx);
 	if (likely(!ret)) {
 tx_ok:
@@ -514,9 +503,10 @@ static int hfi1_ipoib_send_dma_single(struct net_device *dev,
 	if (ret == -EBUSY || ret == -ECOMM)
 		goto tx_ok;
 
-	sdma_txclean(priv->dd, &tx->txreq);
-	dev_kfree_skb_any(skb);
-	kmem_cache_free(priv->txreq_cache, tx);
+	/* mark complete and kick napi tx */
+	smp_store_release(&tx->complete, 1);
+	napi_schedule(tx->txq->napi);
+
 	++dev->stats.tx_carrier_errors;
 
 	return NETDEV_TX_OK;
@@ -527,6 +517,7 @@ static int hfi1_ipoib_send_dma_list(struct net_device *dev,
 				    struct ipoib_txparms *txp)
 {
 	struct hfi1_ipoib_txq *txq = txp->txq;
+	struct hfi1_ipoib_circ_buf *tx_ring;
 	struct ipoib_txreq *tx;
 
 	/* Has the flow change ? */
@@ -556,6 +547,9 @@ static int hfi1_ipoib_send_dma_list(struct net_device *dev,
 		return NETDEV_TX_OK;
 	}
 
+	tx_ring = &txq->tx_ring;
+	/* consume tx */
+	smp_store_release(&tx_ring->tail, CIRC_NEXT(tx_ring->tail, tx_ring->max_items));
 	list_add_tail(&tx->txreq.list, &txq->tx_list);
 
 	hfi1_ipoib_check_queue_depth(txq);
@@ -696,31 +690,22 @@ static void hfi1_ipoib_flush_txq(struct work_struct *work)
 int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 {
 	struct net_device *dev = priv->netdev;
-	char buf[HFI1_IPOIB_TXREQ_NAME_LEN];
-	unsigned long tx_ring_size;
+	u32 tx_ring_size, tx_item_size;
 	int i;
 
-	/*
-	 * Ring holds 1 less than tx_ring_size
-	 * Round up to next power of 2 in order to hold at least tx_queue_len
-	 */
-	tx_ring_size = roundup_pow_of_two((unsigned long)dev->tx_queue_len + 1);
-
-	snprintf(buf, sizeof(buf), "hfi1_%u_ipoib_txreq_cache", priv->dd->unit);
-	priv->txreq_cache = kmem_cache_create(buf,
-					      sizeof(struct ipoib_txreq),
-					      0,
-					      0,
-					      NULL);
-	if (!priv->txreq_cache)
-		return -ENOMEM;
-
 	priv->tx_napis = kcalloc_node(dev->num_tx_queues,
 				      sizeof(struct napi_struct),
 				      GFP_KERNEL,
 				      priv->dd->node);
 	if (!priv->tx_napis)
-		goto free_txreq_cache;
+		return -ENOMEM;
+
+	/*
+	 * Ring holds 1 less than tx_ring_size
+	 * Round up to next power of 2 in order to hold at least tx_queue_len
+	 */
+	tx_ring_size = roundup_pow_of_two(dev->tx_queue_len + 1);
+	tx_item_size = roundup_pow_of_two(sizeof(struct ipoib_txreq));
 
 	priv->txqs = kcalloc_node(dev->num_tx_queues,
 				  sizeof(struct hfi1_ipoib_txq),
@@ -756,19 +741,17 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 					     priv->dd->node);
 
 		txq->tx_ring.items =
-			kcalloc_node(tx_ring_size,
-				     sizeof(struct ipoib_txreq *),
+			kcalloc_node(tx_ring_size, tx_item_size,
 				     GFP_KERNEL, priv->dd->node);
 		if (!txq->tx_ring.items)
 			goto free_txqs;
 
-		spin_lock_init(&txq->tx_ring.producer_lock);
-		spin_lock_init(&txq->tx_ring.consumer_lock);
 		txq->tx_ring.max_items = tx_ring_size;
+		txq->tx_ring.shift = ilog2(tx_ring_size);
 
 		txq->napi = &priv->tx_napis[i];
 		netif_tx_napi_add(dev, txq->napi,
-				  hfi1_ipoib_process_tx_ring,
+				  hfi1_ipoib_poll_tx_ring,
 				  NAPI_POLL_WEIGHT);
 	}
 
@@ -788,10 +771,6 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 free_tx_napis:
 	kfree(priv->tx_napis);
 	priv->tx_napis = NULL;
-
-free_txreq_cache:
-	kmem_cache_destroy(priv->txreq_cache);
-	priv->txreq_cache = NULL;
 	return -ENOMEM;
 }
 
@@ -808,13 +787,13 @@ static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
 		list_del(&txreq->list);
 		sdma_txclean(txq->priv->dd, &tx->txreq);
 		dev_kfree_skb_any(tx->skb);
-		kmem_cache_free(txq->priv->txreq_cache, tx);
+		tx->skb = NULL;
 		atomic64_inc(complete_txreqs);
 	}
 
 	if (hfi1_ipoib_used(txq))
 		dd_dev_warn(txq->priv->dd,
-			    "txq %d not empty found %llu requests\n",
+			    "txq %d not empty found %u requests\n",
 			    txq->q_idx,
 			    hfi1_ipoib_txreqs(txq->sent_txreqs,
 					      atomic64_read(complete_txreqs)));
@@ -831,7 +810,7 @@ void hfi1_ipoib_txreq_deinit(struct hfi1_ipoib_dev_priv *priv)
 		iowait_sdma_drain(&txq->wait);
 		hfi1_ipoib_drain_tx_list(txq);
 		netif_napi_del(txq->napi);
-		(void)hfi1_ipoib_drain_tx_ring(txq, txq->tx_ring.max_items);
+		hfi1_ipoib_drain_tx_ring(txq);
 		kfree(txq->tx_ring.items);
 	}
 
@@ -840,9 +819,6 @@ void hfi1_ipoib_txreq_deinit(struct hfi1_ipoib_dev_priv *priv)
 
 	kfree(priv->tx_napis);
 	priv->tx_napis = NULL;
-
-	kmem_cache_destroy(priv->txreq_cache);
-	priv->txreq_cache = NULL;
 }
 
 void hfi1_ipoib_napi_tx_enable(struct net_device *dev)
@@ -866,7 +842,7 @@ void hfi1_ipoib_napi_tx_disable(struct net_device *dev)
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
 
 		napi_disable(txq->napi);
-		(void)hfi1_ipoib_drain_tx_ring(txq, txq->tx_ring.max_items);
+		hfi1_ipoib_drain_tx_ring(txq);
 	}
 }
 
@@ -888,9 +864,9 @@ void hfi1_ipoib_tx_timeout(struct net_device *dev, unsigned int q)
 	dd_dev_info(priv->dd, "flow %x\n", txq->flow.as_int);
 	dd_dev_info(priv->dd, "sent %llu completed %llu used %llu\n",
 		    txq->sent_txreqs, completed, hfi1_ipoib_used(txq));
-	dd_dev_info(priv->dd, "tx_queue_len %u max_items %lu\n",
+	dd_dev_info(priv->dd, "tx_queue_len %u max_items %u\n",
 		    dev->tx_queue_len, txq->tx_ring.max_items);
-	dd_dev_info(priv->dd, "head %lu tail %lu\n",
+	dd_dev_info(priv->dd, "head %u tail %u\n",
 		    txq->tx_ring.head, txq->tx_ring.tail);
 	dd_dev_info(priv->dd, "wait queued %u\n",
 		    !list_empty(&txq->wait.list));


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH for-next 2/6] IB/hfi1: Get rid of hot path divide
  2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 1/6] IB/hfi1: Remove cache and embed txreq in ring Dennis Dalessandro
@ 2021-09-13 13:28 ` Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 3/6] IB/hfi1: Get rid of tx priv backpointer Dennis Dalessandro
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Dennis Dalessandro @ 2021-09-13 13:28 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The pointer math in this statemet does a divide;
	struct hfi1_ipoib_txq *txq = &priv->txqs[napi - priv->tx_napis];

Elminate the divide by embedding the struct napi_strut in the txq and
getting the txq with a container_of() using the newly embedded napi.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/ipoib.h    |    3 +--
 drivers/infiniband/hw/hfi1/ipoib_tx.c |   35 +++++++++------------------------
 2 files changed, 11 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib.h b/drivers/infiniband/hw/hfi1/ipoib.h
index 4e91b70..71b102d 100644
--- a/drivers/infiniband/hw/hfi1/ipoib.h
+++ b/drivers/infiniband/hw/hfi1/ipoib.h
@@ -78,6 +78,7 @@ struct hfi1_ipoib_circ_buf {
  * @tx_ring: ring of ipoib txreqs to be reaped by napi callback
  */
 struct hfi1_ipoib_txq {
+	struct napi_struct napi;
 	struct hfi1_ipoib_dev_priv *priv;
 	struct sdma_engine *sde;
 	struct list_head tx_list;
@@ -91,7 +92,6 @@ struct hfi1_ipoib_txq {
 	struct iowait wait;
 
 	atomic64_t ____cacheline_aligned_in_smp complete_txreqs;
-	struct napi_struct *napi;
 	struct hfi1_ipoib_circ_buf tx_ring;
 };
 
@@ -100,7 +100,6 @@ struct hfi1_ipoib_dev_priv {
 	struct net_device   *netdev;
 	struct ib_device    *device;
 	struct hfi1_ipoib_txq *txqs;
-	struct napi_struct *tx_napis;
 	u16 pkey;
 	u16 pkey_index;
 	u32 qkey;
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index 0a5d327..053eb43 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -172,8 +172,8 @@ static void hfi1_ipoib_drain_tx_ring(struct hfi1_ipoib_txq *txq)
 
 static int hfi1_ipoib_poll_tx_ring(struct napi_struct *napi, int budget)
 {
-	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(napi->dev);
-	struct hfi1_ipoib_txq *txq = &priv->txqs[napi - priv->tx_napis];
+	struct hfi1_ipoib_txq *txq =
+		container_of(napi, struct hfi1_ipoib_txq, napi);
 	struct hfi1_ipoib_circ_buf *tx_ring = &txq->tx_ring;
 	u32 head = tx_ring->head;
 	u32 max_tx = tx_ring->max_items;
@@ -211,7 +211,7 @@ static void hfi1_ipoib_sdma_complete(struct sdma_txreq *txreq, int status)
 	tx->sdma_status = status;
 	/* see hfi1_ipoib_poll_tx_ring */
 	smp_store_release(&tx->complete, 1);
-	napi_schedule_irqoff(tx->txq->napi);
+	napi_schedule_irqoff(&tx->txq->napi);
 }
 
 static int hfi1_ipoib_build_ulp_payload(struct ipoib_txreq *tx,
@@ -505,7 +505,7 @@ static int hfi1_ipoib_send_dma_single(struct net_device *dev,
 
 	/* mark complete and kick napi tx */
 	smp_store_release(&tx->complete, 1);
-	napi_schedule(tx->txq->napi);
+	napi_schedule(&tx->txq->napi);
 
 	++dev->stats.tx_carrier_errors;
 
@@ -693,13 +693,6 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 	u32 tx_ring_size, tx_item_size;
 	int i;
 
-	priv->tx_napis = kcalloc_node(dev->num_tx_queues,
-				      sizeof(struct napi_struct),
-				      GFP_KERNEL,
-				      priv->dd->node);
-	if (!priv->tx_napis)
-		return -ENOMEM;
-
 	/*
 	 * Ring holds 1 less than tx_ring_size
 	 * Round up to next power of 2 in order to hold at least tx_queue_len
@@ -712,7 +705,7 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 				  GFP_KERNEL,
 				  priv->dd->node);
 	if (!priv->txqs)
-		goto free_tx_napis;
+		return -ENOMEM;
 
 	for (i = 0; i < dev->num_tx_queues; i++) {
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
@@ -749,8 +742,7 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 		txq->tx_ring.max_items = tx_ring_size;
 		txq->tx_ring.shift = ilog2(tx_ring_size);
 
-		txq->napi = &priv->tx_napis[i];
-		netif_tx_napi_add(dev, txq->napi,
+		netif_tx_napi_add(dev, &txq->napi,
 				  hfi1_ipoib_poll_tx_ring,
 				  NAPI_POLL_WEIGHT);
 	}
@@ -761,16 +753,12 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 	for (i--; i >= 0; i--) {
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
 
-		netif_napi_del(txq->napi);
+		netif_napi_del(&txq->napi);
 		kfree(txq->tx_ring.items);
 	}
 
 	kfree(priv->txqs);
 	priv->txqs = NULL;
-
-free_tx_napis:
-	kfree(priv->tx_napis);
-	priv->tx_napis = NULL;
 	return -ENOMEM;
 }
 
@@ -809,16 +797,13 @@ void hfi1_ipoib_txreq_deinit(struct hfi1_ipoib_dev_priv *priv)
 		iowait_cancel_work(&txq->wait);
 		iowait_sdma_drain(&txq->wait);
 		hfi1_ipoib_drain_tx_list(txq);
-		netif_napi_del(txq->napi);
+		netif_napi_del(&txq->napi);
 		hfi1_ipoib_drain_tx_ring(txq);
 		kfree(txq->tx_ring.items);
 	}
 
 	kfree(priv->txqs);
 	priv->txqs = NULL;
-
-	kfree(priv->tx_napis);
-	priv->tx_napis = NULL;
 }
 
 void hfi1_ipoib_napi_tx_enable(struct net_device *dev)
@@ -829,7 +814,7 @@ void hfi1_ipoib_napi_tx_enable(struct net_device *dev)
 	for (i = 0; i < dev->num_tx_queues; i++) {
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
 
-		napi_enable(txq->napi);
+		napi_enable(&txq->napi);
 	}
 }
 
@@ -841,7 +826,7 @@ void hfi1_ipoib_napi_tx_disable(struct net_device *dev)
 	for (i = 0; i < dev->num_tx_queues; i++) {
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
 
-		napi_disable(txq->napi);
+		napi_disable(&txq->napi);
 		hfi1_ipoib_drain_tx_ring(txq);
 	}
 }


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH for-next 3/6] IB/hfi1: Get rid of tx priv backpointer
  2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 1/6] IB/hfi1: Remove cache and embed txreq in ring Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 2/6] IB/hfi1: Get rid of hot path divide Dennis Dalessandro
@ 2021-09-13 13:28 ` Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 4/6] IB/hfi1: Tune netdev xmit cachelines Dennis Dalessandro
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Dennis Dalessandro @ 2021-09-13 13:28 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The txq has the backpointer, so this is a micro optimization
for the tx path.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/ipoib_tx.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index 053eb43..734b91d 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -133,7 +133,7 @@ static void hfi1_ipoib_check_queue_stopped(struct hfi1_ipoib_txq *txq)
 
 static void hfi1_ipoib_free_tx(struct ipoib_txreq *tx, int budget)
 {
-	struct hfi1_ipoib_dev_priv *priv = tx->priv;
+	struct hfi1_ipoib_dev_priv *priv = tx->txq->priv;
 
 	if (likely(!tx->sdma_status)) {
 		dev_sw_netstats_tx_add(priv->netdev, 1, tx->skb->len);
@@ -273,7 +273,7 @@ static int hfi1_ipoib_build_tx_desc(struct ipoib_txreq *tx,
 static void hfi1_ipoib_build_ib_tx_headers(struct ipoib_txreq *tx,
 					   struct ipoib_txparms *txp)
 {
-	struct hfi1_ipoib_dev_priv *priv = tx->priv;
+	struct hfi1_ipoib_dev_priv *priv = tx->txq->priv;
 	struct hfi1_sdma_header *sdma_hdr = &tx->sdma_hdr;
 	struct sk_buff *skb = tx->skb;
 	struct hfi1_pportdata *ppd = ppd_from_ibp(txp->ibp);
@@ -383,7 +383,6 @@ static struct ipoib_txreq *hfi1_ipoib_send_dma_common(struct net_device *dev,
 
 	/* so that we can test if the sdma descriptors are there */
 	tx->txreq.num_desc = 0;
-	tx->priv = priv;
 	tx->txq = txq;
 	tx->skb = skb;
 	INIT_LIST_HEAD(&tx->txreq.list);
@@ -491,7 +490,7 @@ static int hfi1_ipoib_send_dma_single(struct net_device *dev,
 	ret = hfi1_ipoib_submit_tx(txq, tx);
 	if (likely(!ret)) {
 tx_ok:
-		trace_sdma_output_ibhdr(tx->priv->dd,
+		trace_sdma_output_ibhdr(txq->priv->dd,
 					&tx->sdma_hdr.hdr,
 					ib_is_sc5(txp->flow.sc5));
 		hfi1_ipoib_check_queue_depth(txq);
@@ -554,7 +553,7 @@ static int hfi1_ipoib_send_dma_list(struct net_device *dev,
 
 	hfi1_ipoib_check_queue_depth(txq);
 
-	trace_sdma_output_ibhdr(tx->priv->dd,
+	trace_sdma_output_ibhdr(txq->priv->dd,
 				&tx->sdma_hdr.hdr,
 				ib_is_sc5(txp->flow.sc5));
 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH for-next 4/6] IB/hfi1: Tune netdev xmit cachelines
  2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
                   ` (2 preceding siblings ...)
  2021-09-13 13:28 ` [PATCH for-next 3/6] IB/hfi1: Get rid of tx priv backpointer Dennis Dalessandro
@ 2021-09-13 13:28 ` Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 5/6] IB/hfi1: Remove atomic completion count Dennis Dalessandro
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Dennis Dalessandro @ 2021-09-13 13:28 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

This patch moves fields in the ring and creates a line for
the producer and the consumer.

The adds a consumer side variable that tracks the
ring avail so that the code doesn't have the read the other
cacheline to get a count for every packet. A read now only
occurs when the avail is at 0.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/ipoib.h    |   39 ++++++++++--------
 drivers/infiniband/hw/hfi1/ipoib_tx.c |   71 +++++++++++++++++++--------------
 drivers/infiniband/hw/hfi1/trace_tx.h |    8 ++--
 3 files changed, 66 insertions(+), 52 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib.h b/drivers/infiniband/hw/hfi1/ipoib.h
index 71b102d..8d9a03a 100644
--- a/drivers/infiniband/hw/hfi1/ipoib.h
+++ b/drivers/infiniband/hw/hfi1/ipoib.h
@@ -46,18 +46,31 @@
 /**
  * struct hfi1_ipoib_circ_buf - List of items to be processed
  * @items: ring of items each a power of two size
- * @head: ring head
- * @tail: ring tail
  * @max_items: max items + 1 that the ring can contain
  * @shift: log2 of size for getting txreq
+ * @sent_txreqs: count of txreqs posted to sdma
+ * @tail: ring tail
+ * @stops: count of stops of queue
+ * @ring_full: ring has been filled
+ * @no_desc: descriptor shortage seen
+ * @complete_txreqs: count of txreqs completed by sdma
+ * @head: ring head
  */
 struct ipoib_txreq;
 struct hfi1_ipoib_circ_buf {
 	void *items;
-	u32 head;
-	u32 tail;
 	u32 max_items;
 	u32 shift;
+	/* consumer cache line */
+	u64 ____cacheline_aligned_in_smp sent_txreqs;
+	u32 avail;
+	u32 tail;
+	atomic_t stops;
+	atomic_t ring_full;
+	atomic_t no_desc;
+	/* producer cache line */
+	atomic64_t ____cacheline_aligned_in_smp complete_txreqs;
+	u32 head;
 };
 
 /**
@@ -66,14 +79,10 @@ struct hfi1_ipoib_circ_buf {
  * @sde: sdma engine
  * @tx_list: tx request list
  * @sent_txreqs: count of txreqs posted to sdma
- * @stops: count of stops of queue
- * @ring_full: ring has been filled
- * @no_desc: descriptor shortage seen
  * @flow: tracks when list needs to be flushed for a flow change
  * @q_idx: ipoib Tx queue index
  * @pkts_sent: indicator packets have been sent from this queue
  * @wait: iowait structure
- * @complete_txreqs: count of txreqs completed by sdma
  * @napi: pointer to tx napi interface
  * @tx_ring: ring of ipoib txreqs to be reaped by napi callback
  */
@@ -82,17 +91,12 @@ struct hfi1_ipoib_txq {
 	struct hfi1_ipoib_dev_priv *priv;
 	struct sdma_engine *sde;
 	struct list_head tx_list;
-	u64 sent_txreqs;
-	atomic_t stops;
-	atomic_t ring_full;
-	atomic_t no_desc;
 	union hfi1_ipoib_flow flow;
 	u8 q_idx;
 	bool pkts_sent;
 	struct iowait wait;
 
-	atomic64_t ____cacheline_aligned_in_smp complete_txreqs;
-	struct hfi1_ipoib_circ_buf tx_ring;
+	struct hfi1_ipoib_circ_buf ____cacheline_aligned_in_smp tx_ring;
 };
 
 struct hfi1_ipoib_dev_priv {
@@ -100,13 +104,12 @@ struct hfi1_ipoib_dev_priv {
 	struct net_device   *netdev;
 	struct ib_device    *device;
 	struct hfi1_ipoib_txq *txqs;
+	const struct net_device_ops *netdev_ops;
+	struct rvt_qp *qp;
+	u32 qkey;
 	u16 pkey;
 	u16 pkey_index;
-	u32 qkey;
 	u8 port_num;
-
-	const struct net_device_ops *netdev_ops;
-	struct rvt_qp *qp;
 };
 
 /* hfi1 ipoib rdma netdev's private data structure */
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index 734b91d..c3e43da 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -66,21 +66,21 @@ static u32 hfi1_ipoib_txreqs(const u64 sent, const u64 completed)
 
 static u64 hfi1_ipoib_used(struct hfi1_ipoib_txq *txq)
 {
-	return hfi1_ipoib_txreqs(txq->sent_txreqs,
-				 atomic64_read(&txq->complete_txreqs));
+	return hfi1_ipoib_txreqs(txq->tx_ring.sent_txreqs,
+				 atomic64_read(&txq->tx_ring.complete_txreqs));
 }
 
 static void hfi1_ipoib_stop_txq(struct hfi1_ipoib_txq *txq)
 {
 	trace_hfi1_txq_stop(txq);
-	if (atomic_inc_return(&txq->stops) == 1)
+	if (atomic_inc_return(&txq->tx_ring.stops) == 1)
 		netif_stop_subqueue(txq->priv->netdev, txq->q_idx);
 }
 
 static void hfi1_ipoib_wake_txq(struct hfi1_ipoib_txq *txq)
 {
 	trace_hfi1_txq_wake(txq);
-	if (atomic_dec_and_test(&txq->stops))
+	if (atomic_dec_and_test(&txq->tx_ring.stops))
 		netif_wake_subqueue(txq->priv->netdev, txq->q_idx);
 }
 
@@ -98,9 +98,9 @@ static uint hfi1_ipoib_ring_lwat(struct hfi1_ipoib_txq *txq)
 
 static void hfi1_ipoib_check_queue_depth(struct hfi1_ipoib_txq *txq)
 {
-	++txq->sent_txreqs;
+	++txq->tx_ring.sent_txreqs;
 	if (hfi1_ipoib_used(txq) >= hfi1_ipoib_ring_hwat(txq) &&
-	    !atomic_xchg(&txq->ring_full, 1)) {
+	    !atomic_xchg(&txq->tx_ring.ring_full, 1)) {
 		trace_hfi1_txq_full(txq);
 		hfi1_ipoib_stop_txq(txq);
 	}
@@ -125,7 +125,7 @@ static void hfi1_ipoib_check_queue_stopped(struct hfi1_ipoib_txq *txq)
 	 * to protect against ring overflow.
 	 */
 	if (hfi1_ipoib_used(txq) < hfi1_ipoib_ring_lwat(txq) &&
-	    atomic_xchg(&txq->ring_full, 0)) {
+	    atomic_xchg(&txq->tx_ring.ring_full, 0)) {
 		trace_hfi1_txq_xmit_unstopped(txq);
 		hfi1_ipoib_wake_txq(txq);
 	}
@@ -168,6 +168,7 @@ static void hfi1_ipoib_drain_tx_ring(struct hfi1_ipoib_txq *txq)
 	tx_ring->tail = 0;
 	atomic64_set(&txq->complete_txreqs, 0);
 	txq->sent_txreqs = 0;
+	tx_ring->avail = hfi1_ipoib_ring_hwat(txq);
 }
 
 static int hfi1_ipoib_poll_tx_ring(struct napi_struct *napi, int budget)
@@ -190,7 +191,7 @@ static int hfi1_ipoib_poll_tx_ring(struct napi_struct *napi, int budget)
 		head = CIRC_NEXT(head, max_tx);
 		tx =  hfi1_txreq_from_idx(tx_ring, head);
 	}
-	atomic64_add(work_done, &txq->complete_txreqs);
+	atomic64_add(work_done, &txq->tx_ring.complete_txreqs);
 
 	/* Finished freeing tx items so store the head value. */
 	smp_store_release(&tx_ring->head, head);
@@ -344,7 +345,7 @@ static void hfi1_ipoib_build_ib_tx_headers(struct ipoib_txreq *tx,
 
 	ohdr->bth[0] = cpu_to_be32(bth0);
 	ohdr->bth[1] = cpu_to_be32(txp->dqpn);
-	ohdr->bth[2] = cpu_to_be32(mask_psn((u32)txp->txq->sent_txreqs));
+	ohdr->bth[2] = cpu_to_be32(mask_psn((u32)txp->txq->tx_ring.sent_txreqs));
 
 	/* Build the deth */
 	ohdr->u.ud.deth[0] = cpu_to_be32(priv->qkey);
@@ -369,16 +370,25 @@ static struct ipoib_txreq *hfi1_ipoib_send_dma_common(struct net_device *dev,
 	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
 	struct hfi1_ipoib_txq *txq = txp->txq;
 	struct ipoib_txreq *tx;
-	struct hfi1_ipoib_circ_buf *tx_ring;
-	u32 tail;
+	struct hfi1_ipoib_circ_buf *tx_ring = &txq->tx_ring;
+	u32 tail = tx_ring->tail;
 	int ret;
 
-	if (unlikely(hfi1_ipoib_used(txq) >= hfi1_ipoib_ring_hwat(txq)))
-		/* This shouldn't happen with a stopped queue */
-		return ERR_PTR(-ENOMEM);
-	tx_ring = &txq->tx_ring;
-	tail = tx_ring->tail;
-	tx = hfi1_txreq_from_idx(tx_ring, tx_ring->tail);
+	if (unlikely(!tx_ring->avail)) {
+		u32 head;
+
+		if (hfi1_ipoib_used(txq) >= hfi1_ipoib_ring_hwat(txq))
+			/* This shouldn't happen with a stopped queue */
+			return ERR_PTR(-ENOMEM);
+		/* See hfi1_ipoib_poll_tx_ring() */
+		head = smp_load_acquire(&tx_ring->head);
+		tx_ring->avail =
+			min_t(u32, hfi1_ipoib_ring_hwat(txq),
+			      CIRC_CNT(head, tail, tx_ring->max_items));
+	} else {
+		tx_ring->avail--;
+	}
+	tx = hfi1_txreq_from_idx(tx_ring, tail);
 	trace_hfi1_txq_alloc_tx(txq);
 
 	/* so that we can test if the sdma descriptors are there */
@@ -639,7 +649,7 @@ static int hfi1_ipoib_sdma_sleep(struct sdma_engine *sde,
 		if (list_empty(&txq->wait.list)) {
 			struct hfi1_ibport *ibp = &sde->ppd->ibport_data;
 
-			if (!atomic_xchg(&txq->no_desc, 1)) {
+			if (!atomic_xchg(&txq->tx_ring.no_desc, 1)) {
 				trace_hfi1_txq_queued(txq);
 				hfi1_ipoib_stop_txq(txq);
 			}
@@ -682,7 +692,7 @@ static void hfi1_ipoib_flush_txq(struct work_struct *work)
 
 	if (likely(dev->reg_state == NETREG_REGISTERED) &&
 	    likely(!hfi1_ipoib_flush_tx_list(dev, txq)))
-		if (atomic_xchg(&txq->no_desc, 0))
+		if (atomic_xchg(&txq->tx_ring.no_desc, 0))
 			hfi1_ipoib_wake_txq(txq);
 }
 
@@ -720,10 +730,10 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 		txq->priv = priv;
 		txq->sde = NULL;
 		INIT_LIST_HEAD(&txq->tx_list);
-		atomic64_set(&txq->complete_txreqs, 0);
-		atomic_set(&txq->stops, 0);
-		atomic_set(&txq->ring_full, 0);
-		atomic_set(&txq->no_desc, 0);
+		atomic64_set(&txq->tx_ring.complete_txreqs, 0);
+		atomic_set(&txq->tx_ring.stops, 0);
+		atomic_set(&txq->tx_ring.ring_full, 0);
+		atomic_set(&txq->tx_ring.no_desc, 0);
 		txq->q_idx = i;
 		txq->flow.tx_queue = 0xff;
 		txq->flow.sc5 = 0xff;
@@ -740,6 +750,7 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 
 		txq->tx_ring.max_items = tx_ring_size;
 		txq->tx_ring.shift = ilog2(tx_ring_size);
+		txq->tx_ring.avail = hfi1_ipoib_ring_hwat(txq);
 
 		netif_tx_napi_add(dev, &txq->napi,
 				  hfi1_ipoib_poll_tx_ring,
@@ -765,7 +776,7 @@ static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
 {
 	struct sdma_txreq *txreq;
 	struct sdma_txreq *txreq_tmp;
-	atomic64_t *complete_txreqs = &txq->complete_txreqs;
+	atomic64_t *complete_txreqs = &txq->tx_ring.complete_txreqs;
 
 	list_for_each_entry_safe(txreq, txreq_tmp, &txq->tx_list, list) {
 		struct ipoib_txreq *tx =
@@ -782,7 +793,7 @@ static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
 		dd_dev_warn(txq->priv->dd,
 			    "txq %d not empty found %u requests\n",
 			    txq->q_idx,
-			    hfi1_ipoib_txreqs(txq->sent_txreqs,
+			    hfi1_ipoib_txreqs(txq->tx_ring.sent_txreqs,
 					      atomic64_read(complete_txreqs)));
 }
 
@@ -834,20 +845,20 @@ void hfi1_ipoib_tx_timeout(struct net_device *dev, unsigned int q)
 {
 	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
 	struct hfi1_ipoib_txq *txq = &priv->txqs[q];
-	u64 completed = atomic64_read(&txq->complete_txreqs);
+	u64 completed = atomic64_read(&txq->tx_ring.complete_txreqs);
 
 	dd_dev_info(priv->dd, "timeout txq %llx q %u stopped %u stops %d no_desc %d ring_full %d\n",
 		    (unsigned long long)txq, q,
 		    __netif_subqueue_stopped(dev, txq->q_idx),
-		    atomic_read(&txq->stops),
-		    atomic_read(&txq->no_desc),
-		    atomic_read(&txq->ring_full));
+		    atomic_read(&txq->tx_ring.stops),
+		    atomic_read(&txq->tx_ring.no_desc),
+		    atomic_read(&txq->tx_ring.ring_full));
 	dd_dev_info(priv->dd, "sde %llx engine %u\n",
 		    (unsigned long long)txq->sde,
 		    txq->sde ? txq->sde->this_idx : 0);
 	dd_dev_info(priv->dd, "flow %x\n", txq->flow.as_int);
 	dd_dev_info(priv->dd, "sent %llu completed %llu used %llu\n",
-		    txq->sent_txreqs, completed, hfi1_ipoib_used(txq));
+		    txq->tx_ring.sent_txreqs, completed, hfi1_ipoib_used(txq));
 	dd_dev_info(priv->dd, "tx_queue_len %u max_items %u\n",
 		    dev->tx_queue_len, txq->tx_ring.max_items);
 	dd_dev_info(priv->dd, "head %u tail %u\n",
diff --git a/drivers/infiniband/hw/hfi1/trace_tx.h b/drivers/infiniband/hw/hfi1/trace_tx.h
index 7318aa6..c9b1cd0 100644
--- a/drivers/infiniband/hw/hfi1/trace_tx.h
+++ b/drivers/infiniband/hw/hfi1/trace_tx.h
@@ -917,11 +917,11 @@
 		__entry->tail = txq->tx_ring.tail;
 		__entry->idx = txq->q_idx;
 		__entry->used =
-			txq->sent_txreqs -
-			atomic64_read(&txq->complete_txreqs);
+			txq->tx_ring.sent_txreqs -
+			atomic64_read(&txq->tx_ring.complete_txreqs);
 		__entry->flow = txq->flow.as_int;
-		__entry->stops = atomic_read(&txq->stops);
-		__entry->no_desc = atomic_read(&txq->no_desc);
+		__entry->stops = atomic_read(&txq->tx_ring.stops);
+		__entry->no_desc = atomic_read(&txq->tx_ring.no_desc);
 		__entry->stopped =
 		 __netif_subqueue_stopped(txq->priv->netdev, txq->q_idx);
 	),


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH for-next 5/6] IB/hfi1: Remove atomic completion count
  2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
                   ` (3 preceding siblings ...)
  2021-09-13 13:28 ` [PATCH for-next 4/6] IB/hfi1: Tune netdev xmit cachelines Dennis Dalessandro
@ 2021-09-13 13:28 ` Dennis Dalessandro
  2021-09-13 13:28 ` [PATCH for-next 6/6] IB/hfi1: Add ring consumer and producers traces Dennis Dalessandro
  2021-09-27 23:15 ` [PATCH for-next 0/6] Perf and debug fixes for hfi Jason Gunthorpe
  6 siblings, 0 replies; 8+ messages in thread
From: Dennis Dalessandro @ 2021-09-13 13:28 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The atomic is not needed.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/ipoib.h    |    2 +-
 drivers/infiniband/hw/hfi1/ipoib_tx.c |   18 ++++++++----------
 drivers/infiniband/hw/hfi1/trace_tx.h |    2 +-
 3 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib.h b/drivers/infiniband/hw/hfi1/ipoib.h
index 8d9a03a..eb5c251 100644
--- a/drivers/infiniband/hw/hfi1/ipoib.h
+++ b/drivers/infiniband/hw/hfi1/ipoib.h
@@ -69,7 +69,7 @@ struct hfi1_ipoib_circ_buf {
 	atomic_t ring_full;
 	atomic_t no_desc;
 	/* producer cache line */
-	atomic64_t ____cacheline_aligned_in_smp complete_txreqs;
+	u64 ____cacheline_aligned_in_smp complete_txreqs;
 	u32 head;
 };
 
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index c3e43da..1a7a837 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -67,7 +67,7 @@ static u32 hfi1_ipoib_txreqs(const u64 sent, const u64 completed)
 static u64 hfi1_ipoib_used(struct hfi1_ipoib_txq *txq)
 {
 	return hfi1_ipoib_txreqs(txq->tx_ring.sent_txreqs,
-				 atomic64_read(&txq->tx_ring.complete_txreqs));
+				 txq->tx_ring.complete_txreqs);
 }
 
 static void hfi1_ipoib_stop_txq(struct hfi1_ipoib_txq *txq)
@@ -166,8 +166,8 @@ static void hfi1_ipoib_drain_tx_ring(struct hfi1_ipoib_txq *txq)
 	}
 	tx_ring->head = 0;
 	tx_ring->tail = 0;
-	atomic64_set(&txq->complete_txreqs, 0);
-	txq->sent_txreqs = 0;
+	tx_ring->complete_txreqs = 0;
+	tx_ring->sent_txreqs = 0;
 	tx_ring->avail = hfi1_ipoib_ring_hwat(txq);
 }
 
@@ -191,7 +191,7 @@ static int hfi1_ipoib_poll_tx_ring(struct napi_struct *napi, int budget)
 		head = CIRC_NEXT(head, max_tx);
 		tx =  hfi1_txreq_from_idx(tx_ring, head);
 	}
-	atomic64_add(work_done, &txq->tx_ring.complete_txreqs);
+	tx_ring->complete_txreqs += work_done;
 
 	/* Finished freeing tx items so store the head value. */
 	smp_store_release(&tx_ring->head, head);
@@ -730,7 +730,6 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 		txq->priv = priv;
 		txq->sde = NULL;
 		INIT_LIST_HEAD(&txq->tx_list);
-		atomic64_set(&txq->tx_ring.complete_txreqs, 0);
 		atomic_set(&txq->tx_ring.stops, 0);
 		atomic_set(&txq->tx_ring.ring_full, 0);
 		atomic_set(&txq->tx_ring.no_desc, 0);
@@ -776,7 +775,6 @@ static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
 {
 	struct sdma_txreq *txreq;
 	struct sdma_txreq *txreq_tmp;
-	atomic64_t *complete_txreqs = &txq->tx_ring.complete_txreqs;
 
 	list_for_each_entry_safe(txreq, txreq_tmp, &txq->tx_list, list) {
 		struct ipoib_txreq *tx =
@@ -786,7 +784,7 @@ static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
 		sdma_txclean(txq->priv->dd, &tx->txreq);
 		dev_kfree_skb_any(tx->skb);
 		tx->skb = NULL;
-		atomic64_inc(complete_txreqs);
+		txq->tx_ring.complete_txreqs++;
 	}
 
 	if (hfi1_ipoib_used(txq))
@@ -794,7 +792,7 @@ static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
 			    "txq %d not empty found %u requests\n",
 			    txq->q_idx,
 			    hfi1_ipoib_txreqs(txq->tx_ring.sent_txreqs,
-					      atomic64_read(complete_txreqs)));
+					      txq->tx_ring.complete_txreqs));
 }
 
 void hfi1_ipoib_txreq_deinit(struct hfi1_ipoib_dev_priv *priv)
@@ -845,7 +843,6 @@ void hfi1_ipoib_tx_timeout(struct net_device *dev, unsigned int q)
 {
 	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
 	struct hfi1_ipoib_txq *txq = &priv->txqs[q];
-	u64 completed = atomic64_read(&txq->tx_ring.complete_txreqs);
 
 	dd_dev_info(priv->dd, "timeout txq %llx q %u stopped %u stops %d no_desc %d ring_full %d\n",
 		    (unsigned long long)txq, q,
@@ -858,7 +855,8 @@ void hfi1_ipoib_tx_timeout(struct net_device *dev, unsigned int q)
 		    txq->sde ? txq->sde->this_idx : 0);
 	dd_dev_info(priv->dd, "flow %x\n", txq->flow.as_int);
 	dd_dev_info(priv->dd, "sent %llu completed %llu used %llu\n",
-		    txq->tx_ring.sent_txreqs, completed, hfi1_ipoib_used(txq));
+		    txq->tx_ring.sent_txreqs, txq->tx_ring.complete_txreqs,
+		    hfi1_ipoib_used(txq));
 	dd_dev_info(priv->dd, "tx_queue_len %u max_items %u\n",
 		    dev->tx_queue_len, txq->tx_ring.max_items);
 	dd_dev_info(priv->dd, "head %u tail %u\n",
diff --git a/drivers/infiniband/hw/hfi1/trace_tx.h b/drivers/infiniband/hw/hfi1/trace_tx.h
index c9b1cd0..f00696f 100644
--- a/drivers/infiniband/hw/hfi1/trace_tx.h
+++ b/drivers/infiniband/hw/hfi1/trace_tx.h
@@ -918,7 +918,7 @@
 		__entry->idx = txq->q_idx;
 		__entry->used =
 			txq->tx_ring.sent_txreqs -
-			atomic64_read(&txq->tx_ring.complete_txreqs);
+			txq->tx_ring.complete_txreqs;
 		__entry->flow = txq->flow.as_int;
 		__entry->stops = atomic_read(&txq->tx_ring.stops);
 		__entry->no_desc = atomic_read(&txq->tx_ring.no_desc);


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH for-next 6/6] IB/hfi1: Add ring consumer and producers traces
  2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
                   ` (4 preceding siblings ...)
  2021-09-13 13:28 ` [PATCH for-next 5/6] IB/hfi1: Remove atomic completion count Dennis Dalessandro
@ 2021-09-13 13:28 ` Dennis Dalessandro
  2021-09-27 23:15 ` [PATCH for-next 0/6] Perf and debug fixes for hfi Jason Gunthorpe
  6 siblings, 0 replies; 8+ messages in thread
From: Dennis Dalessandro @ 2021-09-13 13:28 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

These traces are used to debugging ring issues.

The ipoib_txreq needed to be moved to a header file to
allow access from the trace header file.

The trace changes include:
- new producer/consumer traces
- new allocation deallocation traces
- additional fidelity for SDMA engine prints

Fixes: 4bd00b55c978 ("IB/hfi1: Add AIP tx traces")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/ipoib.h    |   21 ++++++++++-
 drivers/infiniband/hw/hfi1/ipoib_tx.c |   23 ++----------
 drivers/infiniband/hw/hfi1/trace_tx.h |   63 ++++++++++++++++++++++++++++++++-
 3 files changed, 85 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib.h b/drivers/infiniband/hw/hfi1/ipoib.h
index eb5c251..9091229 100644
--- a/drivers/infiniband/hw/hfi1/ipoib.h
+++ b/drivers/infiniband/hw/hfi1/ipoib.h
@@ -44,6 +44,26 @@
 };
 
 /**
+ * struct ipoib_txreq - IPOIB transmit descriptor
+ * @txreq: sdma transmit request
+ * @sdma_hdr: 9b ib headers
+ * @sdma_status: status returned by sdma engine
+ * @complete: non-zero implies complete
+ * @priv: ipoib netdev private data
+ * @txq: txq on which skb was output
+ * @skb: skb to send
+ */
+struct ipoib_txreq {
+	struct sdma_txreq           txreq;
+	struct hfi1_sdma_header     sdma_hdr;
+	int                         sdma_status;
+	int                         complete;
+	struct hfi1_ipoib_dev_priv *priv;
+	struct hfi1_ipoib_txq      *txq;
+	struct sk_buff             *skb;
+};
+
+/**
  * struct hfi1_ipoib_circ_buf - List of items to be processed
  * @items: ring of items each a power of two size
  * @max_items: max items + 1 that the ring can contain
@@ -56,7 +76,6 @@
  * @complete_txreqs: count of txreqs completed by sdma
  * @head: ring head
  */
-struct ipoib_txreq;
 struct hfi1_ipoib_circ_buf {
 	void *items;
 	u32 max_items;
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index 1a7a837..d1c2cf5 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -22,26 +22,6 @@
 #define CIRC_NEXT(val, size) CIRC_ADD(val, 1, size)
 #define CIRC_PREV(val, size) CIRC_ADD(val, -1, size)
 
-/**
- * struct ipoib_txreq - IPOIB transmit descriptor
- * @txreq: sdma transmit request
- * @sdma_hdr: 9b ib headers
- * @sdma_status: status returned by sdma engine
- * @complete: non-zero implies complete
- * @priv: ipoib netdev private data
- * @txq: txq on which skb was output
- * @skb: skb to send
- */
-struct ipoib_txreq {
-	struct sdma_txreq           txreq;
-	struct hfi1_sdma_header     sdma_hdr;
-	int                         sdma_status;
-	int                         complete;
-	struct hfi1_ipoib_dev_priv *priv;
-	struct hfi1_ipoib_txq      *txq;
-	struct sk_buff             *skb;
-};
-
 struct ipoib_txparms {
 	struct hfi1_devdata        *dd;
 	struct rdma_ah_attr        *ah_attr;
@@ -187,6 +167,7 @@ static int hfi1_ipoib_poll_tx_ring(struct napi_struct *napi, int budget)
 		if (!smp_load_acquire(&tx->complete))
 			break;
 		tx->complete = 0;
+		trace_hfi1_tx_produce(tx, head);
 		hfi1_ipoib_free_tx(tx, budget);
 		head = CIRC_NEXT(head, max_tx);
 		tx =  hfi1_txreq_from_idx(tx_ring, head);
@@ -495,6 +476,7 @@ static int hfi1_ipoib_send_dma_single(struct net_device *dev,
 	}
 
 	tx_ring = &txq->tx_ring;
+	trace_hfi1_tx_consume(tx, tx_ring->tail);
 	/* consume tx */
 	smp_store_release(&tx_ring->tail, CIRC_NEXT(tx_ring->tail, tx_ring->max_items));
 	ret = hfi1_ipoib_submit_tx(txq, tx);
@@ -557,6 +539,7 @@ static int hfi1_ipoib_send_dma_list(struct net_device *dev,
 	}
 
 	tx_ring = &txq->tx_ring;
+	trace_hfi1_tx_consume(tx, tx_ring->tail);
 	/* consume tx */
 	smp_store_release(&tx_ring->tail, CIRC_NEXT(tx_ring->tail, tx_ring->max_items));
 	list_add_tail(&tx->txreq.list, &txq->tx_list);
diff --git a/drivers/infiniband/hw/hfi1/trace_tx.h b/drivers/infiniband/hw/hfi1/trace_tx.h
index f00696f..ed1b9e1 100644
--- a/drivers/infiniband/hw/hfi1/trace_tx.h
+++ b/drivers/infiniband/hw/hfi1/trace_tx.h
@@ -926,11 +926,13 @@
 		 __netif_subqueue_stopped(txq->priv->netdev, txq->q_idx);
 	),
 	TP_printk(/* print  */
-		"[%s] txq %llx idx %u sde %llx head %lx tail %lx flow %x used %u stops %d no_desc %d stopped %u",
+		"[%s] txq %llx idx %u sde %llx:%u cpu %d head %lx tail %lx flow %x used %u stops %d no_desc %d stopped %u",
 		__get_str(dev),
 		(unsigned long long)__entry->txq,
 		__entry->idx,
 		(unsigned long long)__entry->sde,
+		__entry->sde ? __entry->sde->this_idx : 0,
+		__entry->sde ? __entry->sde->cpu : 0,
 		__entry->head,
 		__entry->tail,
 		__entry->flow,
@@ -995,6 +997,65 @@
 	TP_ARGS(txq)
 );
 
+DECLARE_EVENT_CLASS(/* AIP  */
+	hfi1_ipoib_tx_template,
+	TP_PROTO(struct ipoib_txreq *tx, u32 idx),
+	TP_ARGS(tx, idx),
+	TP_STRUCT__entry(/* entry */
+		DD_DEV_ENTRY(tx->txq->priv->dd)
+		__field(struct ipoib_txreq *, tx)
+		__field(struct hfi1_ipoib_txq *, txq)
+		__field(struct sk_buff *, skb)
+		__field(ulong, idx)
+	),
+	TP_fast_assign(/* assign */
+		DD_DEV_ASSIGN(tx->txq->priv->dd);
+		__entry->tx = tx;
+		__entry->skb = tx->skb;
+		__entry->txq = tx->txq;
+		__entry->idx = idx;
+	),
+	TP_printk(/* print  */
+		"[%s] tx %llx txq %llx,%u skb %llx idx %lu",
+		__get_str(dev),
+		(unsigned long long)__entry->tx,
+		(unsigned long long)__entry->txq,
+		__entry->txq ? __entry->txq->q_idx : 0,
+		(unsigned long long)__entry->skb,
+		__entry->idx
+	)
+);
+
+DEFINE_EVENT(/* produce */
+	hfi1_ipoib_tx_template, hfi1_tx_produce,
+	TP_PROTO(struct ipoib_txreq *tx, u32 idx),
+	TP_ARGS(tx, idx)
+);
+
+DEFINE_EVENT(/* consume */
+	hfi1_ipoib_tx_template, hfi1_tx_consume,
+	TP_PROTO(struct ipoib_txreq *tx, u32 idx),
+	TP_ARGS(tx, idx)
+);
+
+DEFINE_EVENT(/* alloc_tx */
+	hfi1_ipoib_txq_template, hfi1_txq_alloc_tx,
+	TP_PROTO(struct hfi1_ipoib_txq *txq),
+	TP_ARGS(txq)
+);
+
+DEFINE_EVENT(/* poll */
+	hfi1_ipoib_txq_template, hfi1_txq_poll,
+	TP_PROTO(struct hfi1_ipoib_txq *txq),
+	TP_ARGS(txq)
+);
+
+DEFINE_EVENT(/* complete */
+	hfi1_ipoib_txq_template, hfi1_txq_complete,
+	TP_PROTO(struct hfi1_ipoib_txq *txq),
+	TP_ARGS(txq)
+);
+
 #endif /* __HFI1_TRACE_TX_H */
 
 #undef TRACE_INCLUDE_PATH


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH for-next 0/6] Perf and debug fixes for hfi
  2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
                   ` (5 preceding siblings ...)
  2021-09-13 13:28 ` [PATCH for-next 6/6] IB/hfi1: Add ring consumer and producers traces Dennis Dalessandro
@ 2021-09-27 23:15 ` Jason Gunthorpe
  6 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2021-09-27 23:15 UTC (permalink / raw)
  To: Dennis Dalessandro; +Cc: dledford, linux-rdma

On Mon, Sep 13, 2021 at 09:28:20AM -0400, Dennis Dalessandro wrote:
> Here is a series of perf improvements and debug/trace fixes from Mike,
> who has this to say about the patches...
> 
> The AIP SDMA interrupt handling is inefficient:
> 
> - A slab entry is allocated for each sent packet
> 
>   This is despite the fact that there is a ring for each possible send slot
>   that could be occupied by a tx descriptor
> 
> - The interrupt handling/NAPI is lock happy has a mixed up notion of
>   producer and consumer
> 
>   The ring should be a ring of tx descriptors vs. a ring of pointers
> 
>   The consumer of descriptors should be the xmit side of the TX
> 
>   The producer of the descriptors is the SDMA interrupt handling and NAPI
>   tx completion
> 
>   There is certainly no locking required in the interrupt/TX napi tx queue
> 
>   There is no locking required in the xmit side since that is held off by NAPI
>   code
> 
> Note that these patches are also staged publicly on our GitHub site for easy
> browsing in context.
> 
> https://github.com/cornelisnetworks/linux
> 
> ---
> 
> Mike Marciniszyn (6):
>       IB/hfi1: Remove cache and embed txreq in ring
>       IB/hfi1: Get rid of hot path divide
>       IB/hfi1: Get rid of tx priv backpointer
>       IB/hfi1: Tune netdev xmit cachelines
>       IB/hfi1: Remove atomic completion count
>       IB/hfi1: Add ring consumer and producers traces

Applied to for-next, thanks

Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-09-27 23:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-13 13:28 [PATCH for-next 0/6] Perf and debug fixes for hfi Dennis Dalessandro
2021-09-13 13:28 ` [PATCH for-next 1/6] IB/hfi1: Remove cache and embed txreq in ring Dennis Dalessandro
2021-09-13 13:28 ` [PATCH for-next 2/6] IB/hfi1: Get rid of hot path divide Dennis Dalessandro
2021-09-13 13:28 ` [PATCH for-next 3/6] IB/hfi1: Get rid of tx priv backpointer Dennis Dalessandro
2021-09-13 13:28 ` [PATCH for-next 4/6] IB/hfi1: Tune netdev xmit cachelines Dennis Dalessandro
2021-09-13 13:28 ` [PATCH for-next 5/6] IB/hfi1: Remove atomic completion count Dennis Dalessandro
2021-09-13 13:28 ` [PATCH for-next 6/6] IB/hfi1: Add ring consumer and producers traces Dennis Dalessandro
2021-09-27 23:15 ` [PATCH for-next 0/6] Perf and debug fixes for hfi Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).