netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates
@ 2020-11-19  1:12 Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer Thomas Falcon
                   ` (9 more replies)
  0 siblings, 10 replies; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

The first three patches utilize a hypervisor call allowing multiple 
TX and RX buffer replenishment descriptors to be sent in one operation,
which significantly reduces hypervisor call overhead. The xmit_more
and Byte Queue Limit API's are leveraged to provide this support
for TX descriptors.

The subsequent two patches remove superfluous code and members in
TX completion handling function and TX buffer structure, respectively,
and remove unused routines.

Finally, four patches which ensure that device queue memory is
cache-line aligned, resolving slowdowns observed in PCI traces,
as well as optimize the driver's NAPI polling function and 
to RX buffer replenishment are provided by Dwip Banerjee.

This series provides significant performance improvements, allowing
the driver to fully utilize 100Gb NIC's.

v2 updates:

1) Removed three patches from the original series which
   were bug fixes and thus better suited for the net tree,
   suggested by Jakub Kicinski.
2) Fixed error handling when initializing device queues,
   suggested by Jakub Kicinski.
3) Fixed bug where queued entries were not flushed after a
   dropped frame, also suggested by Jakub. Two functions,
   ibmvnic_tx_scrq_flush and its helper ibmvnic_tx_scrq_clean_buffer,
   were introduced to ensure that queued frames are either submitted
   to firmware or, if that is not successful, freed as dropped and
   associated data structures are updated with the new device queue state.

Dwip N. Banerjee (4):
  ibmvnic: Ensure that device queue memory is cache-line aligned
  ibmvnic: Correctly re-enable interrupts in NAPI polling routine
  ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX
    buffers
  ibmvnic: Do not replenish RX buffers after every polling loop

Thomas Falcon (5):
  ibmvnic: Introduce indirect subordinate Command Response Queue buffer
  ibmvnic: Introduce batched RX buffer descriptor transmission
  ibmvnic: Introduce xmit_more support using batched subCRQ hcalls
  ibmvnic: Clean up TX code and TX buffer data structure
  ibmvnic: Remove send_subcrq function

 drivers/net/ethernet/ibm/ibmvnic.c | 398 ++++++++++++++++++-----------
 drivers/net/ethernet/ibm/ibmvnic.h |  27 +-
 2 files changed, 256 insertions(+), 169 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  9:34   ` ljp
  2020-11-19  1:12 ` [PATCH net-next v2 2/9] ibmvnic: Introduce batched RX buffer descriptor transmission Thomas Falcon
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

This patch introduces the infrastructure to send batched subordinate
Command Response Queue descriptors, which are used by the ibmvnic
driver to send TX frame and RX buffer descriptors.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 23 +++++++++++++++++++++++
 drivers/net/ethernet/ibm/ibmvnic.h |  9 +++++++++
 2 files changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index da15913879f8..3884f8a683a7 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2858,6 +2858,7 @@ static int reset_one_sub_crq_queue(struct ibmvnic_adapter *adapter,
 	memset(scrq->msgs, 0, 4 * PAGE_SIZE);
 	atomic_set(&scrq->used, 0);
 	scrq->cur = 0;
+	scrq->ind_buf.index = 0;
 
 	rc = h_reg_sub_crq(adapter->vdev->unit_address, scrq->msg_token,
 			   4 * PAGE_SIZE, &scrq->crq_num, &scrq->hw_irq);
@@ -2909,6 +2910,11 @@ static void release_sub_crq_queue(struct ibmvnic_adapter *adapter,
 		}
 	}
 
+	dma_free_coherent(dev,
+			  IBMVNIC_IND_ARR_SZ,
+			  scrq->ind_buf.indir_arr,
+			  scrq->ind_buf.indir_dma);
+
 	dma_unmap_single(dev, scrq->msg_token, 4 * PAGE_SIZE,
 			 DMA_BIDIRECTIONAL);
 	free_pages((unsigned long)scrq->msgs, 2);
@@ -2955,6 +2961,17 @@ static struct ibmvnic_sub_crq_queue *init_sub_crq_queue(struct ibmvnic_adapter
 
 	scrq->adapter = adapter;
 	scrq->size = 4 * PAGE_SIZE / sizeof(*scrq->msgs);
+	scrq->ind_buf.index = 0;
+
+	scrq->ind_buf.indir_arr =
+		dma_alloc_coherent(dev,
+				   IBMVNIC_IND_ARR_SZ,
+				   &scrq->ind_buf.indir_dma,
+				   GFP_KERNEL);
+
+	if (!scrq->ind_buf.indir_arr)
+		goto indir_failed;
+
 	spin_lock_init(&scrq->lock);
 
 	netdev_dbg(adapter->netdev,
@@ -2963,6 +2980,12 @@ static struct ibmvnic_sub_crq_queue *init_sub_crq_queue(struct ibmvnic_adapter
 
 	return scrq;
 
+indir_failed:
+	do {
+		rc = plpar_hcall_norets(H_FREE_SUB_CRQ,
+					adapter->vdev->unit_address,
+					scrq->crq_num);
+	} while (rc == H_BUSY || rc == H_IS_LONG_BUSY(rc));
 reg_failed:
 	dma_unmap_single(dev, scrq->msg_token, 4 * PAGE_SIZE,
 			 DMA_BIDIRECTIONAL);
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 217dcc7ded70..4a63e9886719 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -31,6 +31,8 @@
 #define IBMVNIC_BUFFS_PER_POOL	100
 #define IBMVNIC_MAX_QUEUES	16
 #define IBMVNIC_MAX_QUEUE_SZ   4096
+#define IBMVNIC_MAX_IND_DESCS  128
+#define IBMVNIC_IND_ARR_SZ	(IBMVNIC_MAX_IND_DESCS * 32)
 
 #define IBMVNIC_TSO_BUF_SZ	65536
 #define IBMVNIC_TSO_BUFS	64
@@ -861,6 +863,12 @@ union sub_crq {
 	struct ibmvnic_rx_buff_add_desc rx_add;
 };
 
+struct ibmvnic_ind_xmit_queue {
+	union sub_crq *indir_arr;
+	dma_addr_t indir_dma;
+	int index;
+};
+
 struct ibmvnic_sub_crq_queue {
 	union sub_crq *msgs;
 	int size, cur;
@@ -873,6 +881,7 @@ struct ibmvnic_sub_crq_queue {
 	spinlock_t lock;
 	struct sk_buff *rx_skb_top;
 	struct ibmvnic_adapter *adapter;
+	struct ibmvnic_ind_xmit_queue ind_buf;
 	atomic_t used;
 	char name[32];
 	u64 handle;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 2/9] ibmvnic: Introduce batched RX buffer descriptor transmission
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 3/9] ibmvnic: Introduce xmit_more support using batched subCRQ hcalls Thomas Falcon
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

Utilize the H_SEND_SUB_CRQ_INDIRECT hypervisor call to send
multiple RX buffer descriptors to the device in one hypervisor
call operation. This change will reduce the number of hypervisor
calls and thus hypervisor call overhead needed to transmit
RX buffer descriptors to the device.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 57 +++++++++++++++++++-----------
 1 file changed, 37 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 3884f8a683a7..17ba6db6f5f9 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -306,9 +306,11 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
 	int count = pool->size - atomic_read(&pool->available);
 	u64 handle = adapter->rx_scrq[pool->index]->handle;
 	struct device *dev = &adapter->vdev->dev;
+	struct ibmvnic_ind_xmit_queue *ind_bufp;
+	struct ibmvnic_sub_crq_queue *rx_scrq;
+	union sub_crq *sub_crq;
 	int buffers_added = 0;
 	unsigned long lpar_rc;
-	union sub_crq sub_crq;
 	struct sk_buff *skb;
 	unsigned int offset;
 	dma_addr_t dma_addr;
@@ -320,6 +322,8 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
 	if (!pool->active)
 		return;
 
+	rx_scrq = adapter->rx_scrq[pool->index];
+	ind_bufp = &rx_scrq->ind_buf;
 	for (i = 0; i < count; ++i) {
 		skb = alloc_skb(pool->buff_size, GFP_ATOMIC);
 		if (!skb) {
@@ -346,12 +350,13 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
 		pool->rx_buff[index].pool_index = pool->index;
 		pool->rx_buff[index].size = pool->buff_size;
 
-		memset(&sub_crq, 0, sizeof(sub_crq));
-		sub_crq.rx_add.first = IBMVNIC_CRQ_CMD;
-		sub_crq.rx_add.correlator =
+		sub_crq = &ind_bufp->indir_arr[ind_bufp->index++];
+		memset(sub_crq, 0, sizeof(*sub_crq));
+		sub_crq->rx_add.first = IBMVNIC_CRQ_CMD;
+		sub_crq->rx_add.correlator =
 		    cpu_to_be64((u64)&pool->rx_buff[index]);
-		sub_crq.rx_add.ioba = cpu_to_be32(dma_addr);
-		sub_crq.rx_add.map_id = pool->long_term_buff.map_id;
+		sub_crq->rx_add.ioba = cpu_to_be32(dma_addr);
+		sub_crq->rx_add.map_id = pool->long_term_buff.map_id;
 
 		/* The length field of the sCRQ is defined to be 24 bits so the
 		 * buffer size needs to be left shifted by a byte before it is
@@ -361,15 +366,20 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
 #ifdef __LITTLE_ENDIAN__
 		shift = 8;
 #endif
-		sub_crq.rx_add.len = cpu_to_be32(pool->buff_size << shift);
-
-		lpar_rc = send_subcrq(adapter, handle, &sub_crq);
-		if (lpar_rc != H_SUCCESS)
-			goto failure;
-
-		buffers_added++;
-		adapter->replenish_add_buff_success++;
+		sub_crq->rx_add.len = cpu_to_be32(pool->buff_size << shift);
 		pool->next_free = (pool->next_free + 1) % pool->size;
+		if (ind_bufp->index == IBMVNIC_MAX_IND_DESCS ||
+		    i == count - 1) {
+			lpar_rc =
+				send_subcrq_indirect(adapter, handle,
+						     (u64)ind_bufp->indir_dma,
+						     (u64)ind_bufp->index);
+			if (lpar_rc != H_SUCCESS)
+				goto failure;
+			buffers_added += ind_bufp->index;
+			adapter->replenish_add_buff_success += ind_bufp->index;
+			ind_bufp->index = 0;
+		}
 	}
 	atomic_add(buffers_added, &pool->available);
 	return;
@@ -377,13 +387,20 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
 failure:
 	if (lpar_rc != H_PARAMETER && lpar_rc != H_CLOSED)
 		dev_err_ratelimited(dev, "rx: replenish packet buffer failed\n");
-	pool->free_map[pool->next_free] = index;
-	pool->rx_buff[index].skb = NULL;
-
-	dev_kfree_skb_any(skb);
-	adapter->replenish_add_buff_failure++;
-	atomic_add(buffers_added, &pool->available);
+	for (i = ind_bufp->index - 1; i >= 0; --i) {
+		struct ibmvnic_rx_buff *rx_buff;
 
+		pool->next_free = pool->next_free == 0 ?
+				  pool->size - 1 : pool->next_free - 1;
+		sub_crq = &ind_bufp->indir_arr[i];
+		rx_buff = (struct ibmvnic_rx_buff *)
+				be64_to_cpu(sub_crq->rx_add.correlator);
+		index = (int)(rx_buff - pool->rx_buff);
+		pool->free_map[pool->next_free] = index;
+		dev_kfree_skb_any(pool->rx_buff[index].skb);
+		pool->rx_buff[index].skb = NULL;
+	}
+	ind_bufp->index = 0;
 	if (lpar_rc == H_CLOSED || adapter->failover_pending) {
 		/* Disable buffer pool replenishment and report carrier off if
 		 * queue is closed or pending failover.
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 3/9] ibmvnic: Introduce xmit_more support using batched subCRQ hcalls
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 2/9] ibmvnic: Introduce batched RX buffer descriptor transmission Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 4/9] ibmvnic: Clean up TX code and TX buffer data structure Thomas Falcon
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

Include support for the xmit_more feature utilizing the
H_SEND_SUB_CRQ_INDIRECT hypervisor call which allows the sending
of multiple subordinate Command Response Queue descriptors in one
hypervisor call via a DMA-mapped buffer. This update reduces hypervisor
calls and thus hypervisor call overhead per TX descriptor.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 204 ++++++++++++++++++++---------
 1 file changed, 139 insertions(+), 65 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 17ba6db6f5f9..650aaf100d65 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1165,6 +1165,7 @@ static int __ibmvnic_open(struct net_device *netdev)
 		if (prev_state == VNIC_CLOSED)
 			enable_irq(adapter->tx_scrq[i]->irq);
 		enable_scrq_irq(adapter, adapter->tx_scrq[i]);
+		netdev_tx_reset_queue(netdev_get_tx_queue(netdev, i));
 	}
 
 	rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_UP);
@@ -1523,16 +1524,93 @@ static int ibmvnic_xmit_workarounds(struct sk_buff *skb,
 	return 0;
 }
 
+static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
+					 struct ibmvnic_sub_crq_queue *tx_scrq)
+{
+	struct ibmvnic_ind_xmit_queue *ind_bufp;
+	struct ibmvnic_tx_buff *tx_buff;
+	struct ibmvnic_tx_pool *tx_pool;
+	union sub_crq tx_scrq_entry;
+	int queue_num;
+	int entries;
+	int index;
+	int i;
+
+	ind_bufp = &tx_scrq->ind_buf;
+	entries = (u64)ind_bufp->index;
+	queue_num = tx_scrq->pool_index;
+
+	for (i = entries - 1; i >= 0; --i) {
+		tx_scrq_entry = ind_bufp->indir_arr[i];
+		if (tx_scrq_entry.v1.type != IBMVNIC_TX_DESC)
+			continue;
+		index = be32_to_cpu(tx_scrq_entry.v1.correlator);
+		if (index & IBMVNIC_TSO_POOL_MASK) {
+			tx_pool = &adapter->tso_pool[queue_num];
+			index &= ~IBMVNIC_TSO_POOL_MASK;
+		} else {
+			tx_pool = &adapter->tx_pool[queue_num];
+		}
+		tx_pool->free_map[tx_pool->consumer_index] = index;
+		tx_pool->consumer_index = tx_pool->consumer_index == 0 ?
+					  tx_pool->num_buffers - 1 :
+					  tx_pool->consumer_index - 1;
+		tx_buff = &tx_pool->tx_buff[index];
+		adapter->netdev->stats.tx_packets--;
+		adapter->netdev->stats.tx_bytes -= tx_buff->skb->len;
+		adapter->tx_stats_buffers[queue_num].packets--;
+		adapter->tx_stats_buffers[queue_num].bytes -=
+						tx_buff->skb->len;
+		dev_kfree_skb_any(tx_buff->skb);
+		tx_buff->skb = NULL;
+		adapter->netdev->stats.tx_dropped++;
+	}
+	ind_bufp->index = 0;
+	if (atomic_sub_return(entries, &tx_scrq->used) <=
+	    (adapter->req_tx_entries_per_subcrq / 2) &&
+	    __netif_subqueue_stopped(adapter->netdev, queue_num)) {
+		netif_wake_subqueue(adapter->netdev, queue_num);
+		netdev_dbg(adapter->netdev, "Started queue %d\n",
+			   queue_num);
+	}
+}
+
+static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
+				 struct ibmvnic_sub_crq_queue *tx_scrq)
+{
+	struct ibmvnic_ind_xmit_queue *ind_bufp;
+	u64 dma_addr;
+	u64 entries;
+	u64 handle;
+	int rc;
+
+	ind_bufp = &tx_scrq->ind_buf;
+	dma_addr = (u64)ind_bufp->indir_dma;
+	entries = (u64)ind_bufp->index;
+	handle = tx_scrq->handle;
+
+	if (!entries)
+		return 0;
+	rc = send_subcrq_indirect(adapter, handle, dma_addr, entries);
+	if (rc)
+		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
+	else
+		ind_bufp->index = 0;
+	return 0;
+}
+
 static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 {
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
 	int queue_num = skb_get_queue_mapping(skb);
 	u8 *hdrs = (u8 *)&adapter->tx_rx_desc_req;
 	struct device *dev = &adapter->vdev->dev;
+	struct ibmvnic_ind_xmit_queue *ind_bufp;
 	struct ibmvnic_tx_buff *tx_buff = NULL;
 	struct ibmvnic_sub_crq_queue *tx_scrq;
 	struct ibmvnic_tx_pool *tx_pool;
 	unsigned int tx_send_failed = 0;
+	netdev_tx_t ret = NETDEV_TX_OK;
 	unsigned int tx_map_failed = 0;
 	unsigned int tx_dropped = 0;
 	unsigned int tx_packets = 0;
@@ -1546,8 +1624,10 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	unsigned char *dst;
 	int index = 0;
 	u8 proto = 0;
-	u64 handle;
-	netdev_tx_t ret = NETDEV_TX_OK;
+
+	tx_scrq = adapter->tx_scrq[queue_num];
+	txq = netdev_get_tx_queue(netdev, queue_num);
+	ind_bufp = &tx_scrq->ind_buf;
 
 	if (test_bit(0, &adapter->resetting)) {
 		if (!netif_subqueue_stopped(netdev, skb))
@@ -1557,6 +1637,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		tx_send_failed++;
 		tx_dropped++;
 		ret = NETDEV_TX_OK;
+		ibmvnic_tx_scrq_flush(adapter, tx_scrq);
 		goto out;
 	}
 
@@ -1564,6 +1645,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		tx_dropped++;
 		tx_send_failed++;
 		ret = NETDEV_TX_OK;
+		ibmvnic_tx_scrq_flush(adapter, tx_scrq);
 		goto out;
 	}
 	if (skb_is_gso(skb))
@@ -1571,10 +1653,6 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	else
 		tx_pool = &adapter->tx_pool[queue_num];
 
-	tx_scrq = adapter->tx_scrq[queue_num];
-	txq = netdev_get_tx_queue(netdev, skb_get_queue_mapping(skb));
-	handle = tx_scrq->handle;
-
 	index = tx_pool->free_map[tx_pool->consumer_index];
 
 	if (index == IBMVNIC_INVALID_MAP) {
@@ -1582,6 +1660,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		tx_send_failed++;
 		tx_dropped++;
 		ret = NETDEV_TX_OK;
+		ibmvnic_tx_scrq_flush(adapter, tx_scrq);
 		goto out;
 	}
 
@@ -1666,55 +1745,29 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 		tx_crq.v1.mss = cpu_to_be16(skb_shinfo(skb)->gso_size);
 		hdrs += 2;
 	}
-	/* determine if l2/3/4 headers are sent to firmware */
-	if ((*hdrs >> 7) & 1) {
+
+	if ((*hdrs >> 7) & 1)
 		build_hdr_descs_arr(tx_buff, &num_entries, *hdrs);
-		tx_crq.v1.n_crq_elem = num_entries;
-		tx_buff->num_entries = num_entries;
-		tx_buff->indir_arr[0] = tx_crq;
-		tx_buff->indir_dma = dma_map_single(dev, tx_buff->indir_arr,
-						    sizeof(tx_buff->indir_arr),
-						    DMA_TO_DEVICE);
-		if (dma_mapping_error(dev, tx_buff->indir_dma)) {
-			dev_kfree_skb_any(skb);
-			tx_buff->skb = NULL;
-			if (!firmware_has_feature(FW_FEATURE_CMO))
-				dev_err(dev, "tx: unable to map descriptor array\n");
-			tx_map_failed++;
-			tx_dropped++;
-			ret = NETDEV_TX_OK;
-			goto tx_err_out;
-		}
-		lpar_rc = send_subcrq_indirect(adapter, handle,
-					       (u64)tx_buff->indir_dma,
-					       (u64)num_entries);
-		dma_unmap_single(dev, tx_buff->indir_dma,
-				 sizeof(tx_buff->indir_arr), DMA_TO_DEVICE);
-	} else {
-		tx_buff->num_entries = num_entries;
-		lpar_rc = send_subcrq(adapter, handle,
-				      &tx_crq);
-	}
-	if (lpar_rc != H_SUCCESS) {
-		if (lpar_rc != H_CLOSED && lpar_rc != H_PARAMETER)
-			dev_err_ratelimited(dev, "tx: send failed\n");
-		dev_kfree_skb_any(skb);
-		tx_buff->skb = NULL;
 
-		if (lpar_rc == H_CLOSED || adapter->failover_pending) {
-			/* Disable TX and report carrier off if queue is closed
-			 * or pending failover.
-			 * Firmware guarantees that a signal will be sent to the
-			 * driver, triggering a reset or some other action.
-			 */
-			netif_tx_stop_all_queues(netdev);
-			netif_carrier_off(netdev);
-		}
+	tx_crq.v1.n_crq_elem = num_entries;
+	tx_buff->num_entries = num_entries;
+	/* flush buffer if current entry can not fit */
+	if (num_entries + ind_bufp->index > IBMVNIC_MAX_IND_DESCS) {
+		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq);
+		if (lpar_rc != H_SUCCESS)
+			goto tx_flush_err;
+	}
 
-		tx_send_failed++;
-		tx_dropped++;
-		ret = NETDEV_TX_OK;
-		goto tx_err_out;
+	tx_buff->indir_arr[0] = tx_crq;
+	memcpy(&ind_bufp->indir_arr[ind_bufp->index], tx_buff->indir_arr,
+	       num_entries * sizeof(struct ibmvnic_generic_scrq));
+	ind_bufp->index += num_entries;
+	if (__netdev_tx_sent_queue(txq, skb->len,
+				   netdev_xmit_more() &&
+				   ind_bufp->index < IBMVNIC_MAX_IND_DESCS)) {
+		lpar_rc = ibmvnic_tx_scrq_flush(adapter, tx_scrq);
+		if (lpar_rc != H_SUCCESS)
+			goto tx_err;
 	}
 
 	if (atomic_add_return(num_entries, &tx_scrq->used)
@@ -1729,14 +1782,26 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	ret = NETDEV_TX_OK;
 	goto out;
 
-tx_err_out:
-	/* roll back consumer index and map array*/
-	if (tx_pool->consumer_index == 0)
-		tx_pool->consumer_index =
-			tx_pool->num_buffers - 1;
-	else
-		tx_pool->consumer_index--;
-	tx_pool->free_map[tx_pool->consumer_index] = index;
+tx_flush_err:
+	dev_kfree_skb_any(skb);
+	tx_buff->skb = NULL;
+	tx_pool->consumer_index = tx_pool->consumer_index == 0 ?
+				  tx_pool->num_buffers - 1 :
+				  tx_pool->consumer_index - 1;
+	tx_dropped++;
+tx_err:
+	if (lpar_rc != H_CLOSED && lpar_rc != H_PARAMETER)
+		dev_err_ratelimited(dev, "tx: send failed\n");
+
+	if (lpar_rc == H_CLOSED || adapter->failover_pending) {
+		/* Disable TX and report carrier off if queue is closed
+		 * or pending failover.
+		 * Firmware guarantees that a signal will be sent to the
+		 * driver, triggering a reset or some other action.
+		 */
+		netif_tx_stop_all_queues(netdev);
+		netif_carrier_off(netdev);
+	}
 out:
 	netdev->stats.tx_dropped += tx_dropped;
 	netdev->stats.tx_bytes += tx_bytes;
@@ -3117,6 +3182,7 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 	struct device *dev = &adapter->vdev->dev;
 	struct ibmvnic_tx_pool *tx_pool;
 	struct ibmvnic_tx_buff *txbuff;
+	struct netdev_queue *txq;
 	union sub_crq *next;
 	int index;
 	int i, j;
@@ -3125,6 +3191,8 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 	while (pending_scrq(adapter, scrq)) {
 		unsigned int pool = scrq->pool_index;
 		int num_entries = 0;
+		int total_bytes = 0;
+		int num_packets = 0;
 
 		next = ibmvnic_next_scrq(adapter, scrq);
 		for (i = 0; i < next->tx_comp.num_comps; i++) {
@@ -3150,13 +3218,16 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 				txbuff->data_dma[j] = 0;
 			}
 
-			if (txbuff->last_frag) {
-				dev_kfree_skb_any(txbuff->skb);
+			num_packets++;
+			num_entries += txbuff->num_entries;
+			if (txbuff->skb) {
+				total_bytes += txbuff->skb->len;
+				dev_consume_skb_irq(txbuff->skb);
 				txbuff->skb = NULL;
+			} else {
+				netdev_warn(adapter->netdev,
+					    "TX completion received with NULL socket buffer\n");
 			}
-
-			num_entries += txbuff->num_entries;
-
 			tx_pool->free_map[tx_pool->producer_index] = index;
 			tx_pool->producer_index =
 				(tx_pool->producer_index + 1) %
@@ -3165,6 +3236,9 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 		/* remove tx_comp scrq*/
 		next->tx_comp.first = 0;
 
+		txq = netdev_get_tx_queue(adapter->netdev, scrq->pool_index);
+		netdev_tx_completed_queue(txq, num_packets, total_bytes);
+
 		if (atomic_sub_return(num_entries, &scrq->used) <=
 		    (adapter->req_tx_entries_per_subcrq / 2) &&
 		    __netif_subqueue_stopped(adapter->netdev,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 4/9] ibmvnic: Clean up TX code and TX buffer data structure
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
                   ` (2 preceding siblings ...)
  2020-11-19  1:12 ` [PATCH net-next v2 3/9] ibmvnic: Introduce xmit_more support using batched subCRQ hcalls Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 5/9] ibmvnic: Remove send_subcrq function Thomas Falcon
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

Remove unused and superfluous code and members in
existing TX implementation and data structures.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 31 +++++++++++-------------------
 drivers/net/ethernet/ibm/ibmvnic.h |  8 --------
 2 files changed, 11 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 650aaf100d65..2aace693559f 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1496,17 +1496,18 @@ static int create_hdr_descs(u8 hdr_field, u8 *hdr_data, int len, int *hdr_len,
  * L2/L3/L4 packet header descriptors to be sent by send_subcrq_indirect.
  */
 
-static void build_hdr_descs_arr(struct ibmvnic_tx_buff *txbuff,
+static void build_hdr_descs_arr(struct sk_buff *skb,
+				union sub_crq *indir_arr,
 				int *num_entries, u8 hdr_field)
 {
 	int hdr_len[3] = {0, 0, 0};
+	u8 hdr_data[140] = {0};
 	int tot_len;
-	u8 *hdr_data = txbuff->hdr_data;
 
-	tot_len = build_hdr_data(hdr_field, txbuff->skb, hdr_len,
-				 txbuff->hdr_data);
+	tot_len = build_hdr_data(hdr_field, skb, hdr_len,
+				 hdr_data);
 	*num_entries += create_hdr_descs(hdr_field, hdr_data, tot_len, hdr_len,
-			 txbuff->indir_arr + 1);
+					 indir_arr + 1);
 }
 
 static int ibmvnic_xmit_workarounds(struct sk_buff *skb,
@@ -1612,6 +1613,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	unsigned int tx_send_failed = 0;
 	netdev_tx_t ret = NETDEV_TX_OK;
 	unsigned int tx_map_failed = 0;
+	union sub_crq indir_arr[16];
 	unsigned int tx_dropped = 0;
 	unsigned int tx_packets = 0;
 	unsigned int tx_bytes = 0;
@@ -1696,11 +1698,8 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 
 	tx_buff = &tx_pool->tx_buff[index];
 	tx_buff->skb = skb;
-	tx_buff->data_dma[0] = data_dma_addr;
-	tx_buff->data_len[0] = skb->len;
 	tx_buff->index = index;
 	tx_buff->pool_index = queue_num;
-	tx_buff->last_frag = true;
 
 	memset(&tx_crq, 0, sizeof(tx_crq));
 	tx_crq.v1.first = IBMVNIC_CRQ_CMD;
@@ -1747,7 +1746,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	}
 
 	if ((*hdrs >> 7) & 1)
-		build_hdr_descs_arr(tx_buff, &num_entries, *hdrs);
+		build_hdr_descs_arr(skb, indir_arr, &num_entries, *hdrs);
 
 	tx_crq.v1.n_crq_elem = num_entries;
 	tx_buff->num_entries = num_entries;
@@ -1758,8 +1757,8 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 			goto tx_flush_err;
 	}
 
-	tx_buff->indir_arr[0] = tx_crq;
-	memcpy(&ind_bufp->indir_arr[ind_bufp->index], tx_buff->indir_arr,
+	indir_arr[0] = tx_crq;
+	memcpy(&ind_bufp->indir_arr[ind_bufp->index], &indir_arr[0],
 	       num_entries * sizeof(struct ibmvnic_generic_scrq));
 	ind_bufp->index += num_entries;
 	if (__netdev_tx_sent_queue(txq, skb->len,
@@ -3185,7 +3184,7 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 	struct netdev_queue *txq;
 	union sub_crq *next;
 	int index;
-	int i, j;
+	int i;
 
 restart_loop:
 	while (pending_scrq(adapter, scrq)) {
@@ -3210,14 +3209,6 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 			}
 
 			txbuff = &tx_pool->tx_buff[index];
-
-			for (j = 0; j < IBMVNIC_MAX_FRAGS_PER_CRQ; j++) {
-				if (!txbuff->data_dma[j])
-					continue;
-
-				txbuff->data_dma[j] = 0;
-			}
-
 			num_packets++;
 			num_entries += txbuff->num_entries;
 			if (txbuff->skb) {
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 4a63e9886719..16d892c3db0f 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -226,8 +226,6 @@ struct ibmvnic_tx_comp_desc {
 #define IBMVNIC_TCP_CHKSUM		0x20
 #define IBMVNIC_UDP_CHKSUM		0x08
 
-#define IBMVNIC_MAX_FRAGS_PER_CRQ 3
-
 struct ibmvnic_tx_desc {
 	u8 first;
 	u8 type;
@@ -896,14 +894,8 @@ struct ibmvnic_long_term_buff {
 
 struct ibmvnic_tx_buff {
 	struct sk_buff *skb;
-	dma_addr_t data_dma[IBMVNIC_MAX_FRAGS_PER_CRQ];
-	unsigned int data_len[IBMVNIC_MAX_FRAGS_PER_CRQ];
 	int index;
 	int pool_index;
-	bool last_frag;
-	union sub_crq indir_arr[6];
-	u8 hdr_data[140];
-	dma_addr_t indir_dma;
 	int num_entries;
 };
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 5/9] ibmvnic: Remove send_subcrq function
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
                   ` (3 preceding siblings ...)
  2020-11-19  1:12 ` [PATCH net-next v2 4/9] ibmvnic: Clean up TX code and TX buffer data structure Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  9:37   ` ljp
  2020-11-19  1:12 ` [PATCH net-next v2 6/9] ibmvnic: Ensure that device queue memory is cache-line aligned Thomas Falcon
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

It is not longer used, so remove it.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 34 ------------------------------
 1 file changed, 34 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 2aace693559f..e9b0cb6dfd9d 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -84,8 +84,6 @@ static int ibmvnic_reset_crq(struct ibmvnic_adapter *);
 static int ibmvnic_send_crq_init(struct ibmvnic_adapter *);
 static int ibmvnic_reenable_crq_queue(struct ibmvnic_adapter *);
 static int ibmvnic_send_crq(struct ibmvnic_adapter *, union ibmvnic_crq *);
-static int send_subcrq(struct ibmvnic_adapter *adapter, u64 remote_handle,
-		       union sub_crq *sub_crq);
 static int send_subcrq_indirect(struct ibmvnic_adapter *, u64, u64, u64);
 static irqreturn_t ibmvnic_interrupt_rx(int irq, void *instance);
 static int enable_scrq_irq(struct ibmvnic_adapter *,
@@ -3629,38 +3627,6 @@ static void print_subcrq_error(struct device *dev, int rc, const char *func)
 	}
 }
 
-static int send_subcrq(struct ibmvnic_adapter *adapter, u64 remote_handle,
-		       union sub_crq *sub_crq)
-{
-	unsigned int ua = adapter->vdev->unit_address;
-	struct device *dev = &adapter->vdev->dev;
-	u64 *u64_crq = (u64 *)sub_crq;
-	int rc;
-
-	netdev_dbg(adapter->netdev,
-		   "Sending sCRQ %016lx: %016lx %016lx %016lx %016lx\n",
-		   (unsigned long int)cpu_to_be64(remote_handle),
-		   (unsigned long int)cpu_to_be64(u64_crq[0]),
-		   (unsigned long int)cpu_to_be64(u64_crq[1]),
-		   (unsigned long int)cpu_to_be64(u64_crq[2]),
-		   (unsigned long int)cpu_to_be64(u64_crq[3]));
-
-	/* Make sure the hypervisor sees the complete request */
-	mb();
-
-	rc = plpar_hcall_norets(H_SEND_SUB_CRQ, ua,
-				cpu_to_be64(remote_handle),
-				cpu_to_be64(u64_crq[0]),
-				cpu_to_be64(u64_crq[1]),
-				cpu_to_be64(u64_crq[2]),
-				cpu_to_be64(u64_crq[3]));
-
-	if (rc)
-		print_subcrq_error(dev, rc, __func__);
-
-	return rc;
-}
-
 static int send_subcrq_indirect(struct ibmvnic_adapter *adapter,
 				u64 remote_handle, u64 ioba, u64 num_entries)
 {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 6/9] ibmvnic: Ensure that device queue memory is cache-line aligned
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
                   ` (4 preceding siblings ...)
  2020-11-19  1:12 ` [PATCH net-next v2 5/9] ibmvnic: Remove send_subcrq function Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 7/9] ibmvnic: Correctly re-enable interrupts in NAPI polling routine Thomas Falcon
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>

PCI bus slowdowns were observed on IBM VNIC devices as a result
of partial cache line writes and non-cache aligned full cache line writes.
Ensure that packet data buffers are cache-line aligned to avoid these
slowdowns.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c |  9 ++++++---
 drivers/net/ethernet/ibm/ibmvnic.h | 10 +++++-----
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index e9b0cb6dfd9d..85df91c9861b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -498,7 +498,7 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
 
 		if (rx_pool->buff_size != buff_size) {
 			free_long_term_buff(adapter, &rx_pool->long_term_buff);
-			rx_pool->buff_size = buff_size;
+			rx_pool->buff_size = ALIGN(buff_size, L1_CACHE_BYTES);
 			rc = alloc_long_term_buff(adapter,
 						  &rx_pool->long_term_buff,
 						  rx_pool->size *
@@ -592,7 +592,7 @@ static int init_rx_pools(struct net_device *netdev)
 
 		rx_pool->size = adapter->req_rx_add_entries_per_subcrq;
 		rx_pool->index = i;
-		rx_pool->buff_size = buff_size;
+		rx_pool->buff_size = ALIGN(buff_size, L1_CACHE_BYTES);
 		rx_pool->active = 1;
 
 		rx_pool->free_map = kcalloc(rx_pool->size, sizeof(int),
@@ -745,6 +745,7 @@ static int init_tx_pools(struct net_device *netdev)
 {
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
 	int tx_subcrqs;
+	u64 buff_size;
 	int i, rc;
 
 	tx_subcrqs = adapter->num_active_tx_scrqs;
@@ -761,9 +762,11 @@ static int init_tx_pools(struct net_device *netdev)
 	adapter->num_active_tx_pools = tx_subcrqs;
 
 	for (i = 0; i < tx_subcrqs; i++) {
+		buff_size = adapter->req_mtu + VLAN_HLEN;
+		buff_size = ALIGN(buff_size, L1_CACHE_BYTES);
 		rc = init_one_tx_pool(netdev, &adapter->tx_pool[i],
 				      adapter->req_tx_entries_per_subcrq,
-				      adapter->req_mtu + VLAN_HLEN);
+				      buff_size);
 		if (rc) {
 			release_tx_pools(adapter);
 			return rc;
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 16d892c3db0f..9911d926dd7f 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -883,7 +883,7 @@ struct ibmvnic_sub_crq_queue {
 	atomic_t used;
 	char name[32];
 	u64 handle;
-};
+} ____cacheline_aligned;
 
 struct ibmvnic_long_term_buff {
 	unsigned char *buff;
@@ -907,7 +907,7 @@ struct ibmvnic_tx_pool {
 	struct ibmvnic_long_term_buff long_term_buff;
 	int num_buffers;
 	int buf_size;
-};
+} ____cacheline_aligned;
 
 struct ibmvnic_rx_buff {
 	struct sk_buff *skb;
@@ -928,7 +928,7 @@ struct ibmvnic_rx_pool {
 	int next_alloc;
 	int active;
 	struct ibmvnic_long_term_buff long_term_buff;
-};
+} ____cacheline_aligned;
 
 struct ibmvnic_vpd {
 	unsigned char *buff;
@@ -1015,8 +1015,8 @@ struct ibmvnic_adapter {
 	atomic_t running_cap_crqs;
 	bool wait_capability;
 
-	struct ibmvnic_sub_crq_queue **tx_scrq;
-	struct ibmvnic_sub_crq_queue **rx_scrq;
+	struct ibmvnic_sub_crq_queue **tx_scrq ____cacheline_aligned;
+	struct ibmvnic_sub_crq_queue **rx_scrq ____cacheline_aligned;
 
 	/* rx structs */
 	struct napi_struct *napi;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 7/9] ibmvnic: Correctly re-enable interrupts in NAPI polling routine
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
                   ` (5 preceding siblings ...)
  2020-11-19  1:12 ` [PATCH net-next v2 6/9] ibmvnic: Ensure that device queue memory is cache-line aligned Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  1:12 ` [PATCH net-next v2 8/9] ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers Thomas Falcon
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>

If the current NAPI polling loop exits without completing it's
budget, only re-enable interrupts if there are no entries remaining
in the queue and napi_complete_done is successful. If there are entries
remaining on the queue that were missed, restart the polling loop.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 37 +++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 85df91c9861b..596546f0614d 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2450,10 +2450,17 @@ static void remove_buff_from_pool(struct ibmvnic_adapter *adapter,
 
 static int ibmvnic_poll(struct napi_struct *napi, int budget)
 {
-	struct net_device *netdev = napi->dev;
-	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
-	int scrq_num = (int)(napi - adapter->napi);
-	int frames_processed = 0;
+	struct ibmvnic_sub_crq_queue *rx_scrq;
+	struct ibmvnic_adapter *adapter;
+	struct net_device *netdev;
+	int frames_processed;
+	int scrq_num;
+
+	netdev = napi->dev;
+	adapter = netdev_priv(netdev);
+	scrq_num = (int)(napi - adapter->napi);
+	frames_processed = 0;
+	rx_scrq = adapter->rx_scrq[scrq_num];
 
 restart_poll:
 	while (frames_processed < budget) {
@@ -2466,14 +2473,14 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 
 		if (unlikely(test_bit(0, &adapter->resetting) &&
 			     adapter->reset_reason != VNIC_RESET_NON_FATAL)) {
-			enable_scrq_irq(adapter, adapter->rx_scrq[scrq_num]);
+			enable_scrq_irq(adapter, rx_scrq);
 			napi_complete_done(napi, frames_processed);
 			return frames_processed;
 		}
 
-		if (!pending_scrq(adapter, adapter->rx_scrq[scrq_num]))
+		if (!pending_scrq(adapter, rx_scrq))
 			break;
-		next = ibmvnic_next_scrq(adapter, adapter->rx_scrq[scrq_num]);
+		next = ibmvnic_next_scrq(adapter, rx_scrq);
 		rx_buff =
 		    (struct ibmvnic_rx_buff *)be64_to_cpu(next->
 							  rx_comp.correlator);
@@ -2532,14 +2539,16 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 
 	if (adapter->state != VNIC_CLOSING)
 		replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
-
 	if (frames_processed < budget) {
-		enable_scrq_irq(adapter, adapter->rx_scrq[scrq_num]);
-		napi_complete_done(napi, frames_processed);
-		if (pending_scrq(adapter, adapter->rx_scrq[scrq_num]) &&
-		    napi_reschedule(napi)) {
-			disable_scrq_irq(adapter, adapter->rx_scrq[scrq_num]);
-			goto restart_poll;
+		if (napi_complete_done(napi, frames_processed)) {
+			enable_scrq_irq(adapter, rx_scrq);
+			if (pending_scrq(adapter, rx_scrq)) {
+				rmb();
+				if (napi_reschedule(napi)) {
+					disable_scrq_irq(adapter, rx_scrq);
+					goto restart_poll;
+				}
+			}
 		}
 	}
 	return frames_processed;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 8/9] ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
                   ` (6 preceding siblings ...)
  2020-11-19  1:12 ` [PATCH net-next v2 7/9] ibmvnic: Correctly re-enable interrupts in NAPI polling routine Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  9:47   ` ljp
  2020-11-19  1:12 ` [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop Thomas Falcon
  2020-11-21  3:52 ` [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Jakub Kicinski
  9 siblings, 1 reply; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>

Take advantage of the additional optimizations in netdev_alloc_skb when
allocating socket buffers to be used for packet reception.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 596546f0614d..96df6d8fa277 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -323,7 +323,7 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter,
 	rx_scrq = adapter->rx_scrq[pool->index];
 	ind_bufp = &rx_scrq->ind_buf;
 	for (i = 0; i < count; ++i) {
-		skb = alloc_skb(pool->buff_size, GFP_ATOMIC);
+		skb = netdev_alloc_skb(adapter->netdev, pool->buff_size);
 		if (!skb) {
 			dev_err(dev, "Couldn't replenish rx buff\n");
 			adapter->replenish_no_mem++;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
                   ` (7 preceding siblings ...)
  2020-11-19  1:12 ` [PATCH net-next v2 8/9] ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers Thomas Falcon
@ 2020-11-19  1:12 ` Thomas Falcon
  2020-11-19  9:43   ` ljp
  2020-11-21  3:52 ` [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Jakub Kicinski
  9 siblings, 1 reply; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19  1:12 UTC (permalink / raw)
  To: kuba
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev, tlfalcon

From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>

Reduce the amount of time spent replenishing RX buffers by
only doing so once available buffers has fallen under a certain
threshold, in this case half of the total number of buffers, or
if the polling loop exits before the packets processed is less
than its budget.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 96df6d8fa277..9fe43ab0496d 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct *napi, int budget)
 		frames_processed++;
 	}
 
-	if (adapter->state != VNIC_CLOSING)
+	if (adapter->state != VNIC_CLOSING &&
+	    ((atomic_read(&adapter->rx_pool[scrq_num].available) <
+	      adapter->req_rx_add_entries_per_subcrq / 2) ||
+	      frames_processed < budget))
 		replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
 	if (frames_processed < budget) {
 		if (napi_complete_done(napi, frames_processed)) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer
  2020-11-19  1:12 ` [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer Thomas Falcon
@ 2020-11-19  9:34   ` ljp
  0 siblings, 0 replies; 17+ messages in thread
From: ljp @ 2020-11-19  9:34 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: kuba, netdev, linuxppc-dev, cforno12, ricklind, dnbanerg, drt,
	brking, sukadev

On 2020-11-18 19:12, Thomas Falcon wrote:
> This patch introduces the infrastructure to send batched subordinate
> Command Response Queue descriptors, which are used by the ibmvnic
> driver to send TX frame and RX buffer descriptors.
> 
> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>

Acked-by: Lijun Pan <ljp@linux.ibm.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2 5/9] ibmvnic: Remove send_subcrq function
  2020-11-19  1:12 ` [PATCH net-next v2 5/9] ibmvnic: Remove send_subcrq function Thomas Falcon
@ 2020-11-19  9:37   ` ljp
  0 siblings, 0 replies; 17+ messages in thread
From: ljp @ 2020-11-19  9:37 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: kuba, cforno12, netdev, ricklind, dnbanerg, drt, brking, sukadev,
	linuxppc-dev, Linuxppc-dev

On 2020-11-18 19:12, Thomas Falcon wrote:
> It is not longer used, so remove it.
> 
> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>

Acked-by: Lijun Pan <ljp@linux.ibm.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
  2020-11-19  1:12 ` [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop Thomas Falcon
@ 2020-11-19  9:43   ` ljp
       [not found]     ` <83ca37f3-07be-4179-8414-88c8c83bfe56@linux.ibm.com>
  0 siblings, 1 reply; 17+ messages in thread
From: ljp @ 2020-11-19  9:43 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: kuba, cforno12, netdev, ricklind, dnbanerg, drt, brking, sukadev,
	linuxppc-dev, Linuxppc-dev

On 2020-11-18 19:12, Thomas Falcon wrote:
> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
> 
> Reduce the amount of time spent replenishing RX buffers by
> only doing so once available buffers has fallen under a certain
> threshold, in this case half of the total number of buffers, or
> if the polling loop exits before the packets processed is less
> than its budget.
> 
> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index 96df6d8fa277..9fe43ab0496d 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct
> *napi, int budget)
>  		frames_processed++;
>  	}
> 
> -	if (adapter->state != VNIC_CLOSING)
> +	if (adapter->state != VNIC_CLOSING &&
> +	    ((atomic_read(&adapter->rx_pool[scrq_num].available) <
> +	      adapter->req_rx_add_entries_per_subcrq / 2) ||
> +	      frames_processed < budget))

1/2 seems a simple and good algorithm.
Explaining why "frames_process < budget" is necessary in the commit 
message
or source code also helps.


>  		replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>  	if (frames_processed < budget) {
>  		if (napi_complete_done(napi, frames_processed)) {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2 8/9] ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers
  2020-11-19  1:12 ` [PATCH net-next v2 8/9] ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers Thomas Falcon
@ 2020-11-19  9:47   ` ljp
  0 siblings, 0 replies; 17+ messages in thread
From: ljp @ 2020-11-19  9:47 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: kuba, cforno12, netdev, ricklind, dnbanerg, drt, brking, sukadev,
	linuxppc-dev, Linuxppc-dev

On 2020-11-18 19:12, Thomas Falcon wrote:
> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
> 
> Take advantage of the additional optimizations in netdev_alloc_skb when
> allocating socket buffers to be used for packet reception.
> 
> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>

Acked-by: Lijun Pan <ljp@linux.ibm.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
       [not found]     ` <83ca37f3-07be-4179-8414-88c8c83bfe56@linux.ibm.com>
@ 2020-11-19 20:38       ` ljp
  2020-11-19 20:58         ` Thomas Falcon
  0 siblings, 1 reply; 17+ messages in thread
From: ljp @ 2020-11-19 20:38 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: kuba, cforno12, netdev, ricklind, dnbanerg, drt, brking, sukadev,
	linuxppc-dev, Linuxppc-dev

On 2020-11-19 14:26, Thomas Falcon wrote:
> On 11/19/20 3:43 AM, ljp wrote:
> 
>> On 2020-11-18 19:12, Thomas Falcon wrote:
>> 
>>> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
>>> 
>>> Reduce the amount of time spent replenishing RX buffers by
>>> only doing so once available buffers has fallen under a certain
>>> threshold, in this case half of the total number of buffers, or
>>> if the polling loop exits before the packets processed is less
>>> than its budget.
>>> 
>>> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
>>> ---
>>> drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
>>> b/drivers/net/ethernet/ibm/ibmvnic.c
>>> index 96df6d8fa277..9fe43ab0496d 100644
>>> --- a/drivers/net/ethernet/ibm/ibmvnic.c
>>> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
>>> @@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct
>>> 
>>> *napi, int budget)
>>> frames_processed++;
>>> }
>>> 
>>> - if (adapter->state != VNIC_CLOSING)
>>> + if (adapter->state != VNIC_CLOSING &&
>>> + ((atomic_read(&adapter->rx_pool[scrq_num].available) <
>>> + adapter->req_rx_add_entries_per_subcrq / 2) ||
>>> + frames_processed < budget))
>> 
>> 1/2 seems a simple and good algorithm.
>> Explaining why "frames_process < budget" is necessary in the commit
>> message
>> or source code also helps.
> 
> Hello, Lijun. The patch author, Dwip Banerjee, suggested the modified
> commit message below:
> 
> Reduce the amount of time spent replenishing RX buffers by
>  only doing so once available buffers has fallen under a certain
>  threshold, in this case half of the total number of buffers, or
>  if the polling loop exits before the packets processed is less
>  than its budget. Non-exhaustion of NAPI budget implies lower
>  incoming packet pressure, allowing the leeway to refill the buffers
>  in preparation for any impending burst.

It looks good to me.

> 
> Would such an update require a v3?

I assume you ask Jakub, right?


>>> replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>>> if (frames_processed < budget) {
>>> if (napi_complete_done(napi, frames_processed)) {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop
  2020-11-19 20:38       ` ljp
@ 2020-11-19 20:58         ` Thomas Falcon
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Falcon @ 2020-11-19 20:58 UTC (permalink / raw)
  To: ljp
  Cc: kuba, cforno12, netdev, ricklind, dnbanerg, drt, brking, sukadev,
	linuxppc-dev, Linuxppc-dev

On 11/19/20 2:38 PM, ljp wrote:
> On 2020-11-19 14:26, Thomas Falcon wrote:
>> On 11/19/20 3:43 AM, ljp wrote:
>>
>>> On 2020-11-18 19:12, Thomas Falcon wrote:
>>>
>>>> From: "Dwip N. Banerjee" <dnbanerg@us.ibm.com>
>>>>
>>>> Reduce the amount of time spent replenishing RX buffers by
>>>> only doing so once available buffers has fallen under a certain
>>>> threshold, in this case half of the total number of buffers, or
>>>> if the polling loop exits before the packets processed is less
>>>> than its budget.
>>>>
>>>> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
>>>> ---
>>>> drivers/net/ethernet/ibm/ibmvnic.c | 5 ++++-
>>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
>>>> b/drivers/net/ethernet/ibm/ibmvnic.c
>>>> index 96df6d8fa277..9fe43ab0496d 100644
>>>> --- a/drivers/net/ethernet/ibm/ibmvnic.c
>>>> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
>>>> @@ -2537,7 +2537,10 @@ static int ibmvnic_poll(struct napi_struct
>>>>
>>>> *napi, int budget)
>>>> frames_processed++;
>>>> }
>>>>
>>>> - if (adapter->state != VNIC_CLOSING)
>>>> + if (adapter->state != VNIC_CLOSING &&
>>>> + ((atomic_read(&adapter->rx_pool[scrq_num].available) <
>>>> + adapter->req_rx_add_entries_per_subcrq / 2) ||
>>>> + frames_processed < budget))
>>>
>>> 1/2 seems a simple and good algorithm.
>>> Explaining why "frames_process < budget" is necessary in the commit
>>> message
>>> or source code also helps.
>>
>> Hello, Lijun. The patch author, Dwip Banerjee, suggested the modified
>> commit message below:
>>
>> Reduce the amount of time spent replenishing RX buffers by
>>  only doing so once available buffers has fallen under a certain
>>  threshold, in this case half of the total number of buffers, or
>>  if the polling loop exits before the packets processed is less
>>  than its budget. Non-exhaustion of NAPI budget implies lower
>>  incoming packet pressure, allowing the leeway to refill the buffers
>>  in preparation for any impending burst.
>
> It looks good to me.
>
>>
>> Would such an update require a v3?
>
> I assume you ask Jakub, right?
>
>
Yes. There was an issue with my mail client in my earlier response, so I 
am posting Dwip's modified commit message again below.

Reduce the amount of time spent replenishing RX buffers by only doing so 
once available buffers has fallen under a certain threshold, in this 
case half of the total number of buffers, or if the polling loop exits 
before the packets processed is less than its budget. Non-exhaustion of 
NAPI budget implies lower incoming packet pressure, allowing the leeway 
to refill the buffers in preparation for any impending burst.

>>>> replenish_rx_pool(adapter, &adapter->rx_pool[scrq_num]);
>>>> if (frames_processed < budget) {
>>>> if (napi_complete_done(napi, frames_processed)) {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates
  2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
                   ` (8 preceding siblings ...)
  2020-11-19  1:12 ` [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop Thomas Falcon
@ 2020-11-21  3:52 ` Jakub Kicinski
  9 siblings, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2020-11-21  3:52 UTC (permalink / raw)
  To: Thomas Falcon
  Cc: netdev, linuxppc-dev, cforno12, ljp, ricklind, dnbanerg, drt,
	brking, sukadev

On Wed, 18 Nov 2020 19:12:16 -0600 Thomas Falcon wrote:
> The first three patches utilize a hypervisor call allowing multiple 
> TX and RX buffer replenishment descriptors to be sent in one operation,
> which significantly reduces hypervisor call overhead. The xmit_more
> and Byte Queue Limit API's are leveraged to provide this support
> for TX descriptors.
> 
> The subsequent two patches remove superfluous code and members in
> TX completion handling function and TX buffer structure, respectively,
> and remove unused routines.
> 
> Finally, four patches which ensure that device queue memory is
> cache-line aligned, resolving slowdowns observed in PCI traces,
> as well as optimize the driver's NAPI polling function and 
> to RX buffer replenishment are provided by Dwip Banerjee.
> 
> This series provides significant performance improvements, allowing
> the driver to fully utilize 100Gb NIC's.

Applied, thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-11-21  3:53 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-19  1:12 [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Thomas Falcon
2020-11-19  1:12 ` [PATCH net-next v2 1/9] ibmvnic: Introduce indirect subordinate Command Response Queue buffer Thomas Falcon
2020-11-19  9:34   ` ljp
2020-11-19  1:12 ` [PATCH net-next v2 2/9] ibmvnic: Introduce batched RX buffer descriptor transmission Thomas Falcon
2020-11-19  1:12 ` [PATCH net-next v2 3/9] ibmvnic: Introduce xmit_more support using batched subCRQ hcalls Thomas Falcon
2020-11-19  1:12 ` [PATCH net-next v2 4/9] ibmvnic: Clean up TX code and TX buffer data structure Thomas Falcon
2020-11-19  1:12 ` [PATCH net-next v2 5/9] ibmvnic: Remove send_subcrq function Thomas Falcon
2020-11-19  9:37   ` ljp
2020-11-19  1:12 ` [PATCH net-next v2 6/9] ibmvnic: Ensure that device queue memory is cache-line aligned Thomas Falcon
2020-11-19  1:12 ` [PATCH net-next v2 7/9] ibmvnic: Correctly re-enable interrupts in NAPI polling routine Thomas Falcon
2020-11-19  1:12 ` [PATCH net-next v2 8/9] ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers Thomas Falcon
2020-11-19  9:47   ` ljp
2020-11-19  1:12 ` [PATCH net-next v2 9/9] ibmvnic: Do not replenish RX buffers after every polling loop Thomas Falcon
2020-11-19  9:43   ` ljp
     [not found]     ` <83ca37f3-07be-4179-8414-88c8c83bfe56@linux.ibm.com>
2020-11-19 20:38       ` ljp
2020-11-19 20:58         ` Thomas Falcon
2020-11-21  3:52 ` [PATCH net-next v2 0/9] ibmvnic: Performance improvements and other updates Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).