All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: netdev@vger.kernel.org
Cc: kafai@fb.com, daniel@iogearbox.net, tom@herbertland.com,
	bblanco@plumgrid.com, Jesper Dangaard Brouer <brouer@redhat.com>,
	john.fastabend@gmail.com, gerlitz.or@gmail.com,
	hannes@stressinduktion.org, rana.shahot@gmail.com, tgraf@suug.ch,
	"David S. Miller" <davem@davemloft.net>,
	as754m@att.com
Subject: [net-next PATCH RFC] mlx4: RX prefetch loop
Date: Fri, 08 Jul 2016 18:02:20 +0200	[thread overview]
Message-ID: <20160708160135.27507.77711.stgit@firesoul> (raw)
In-Reply-To: <20160708172024.4a849b7a@redhat.com>

This patch is about prefetching without being opportunistic.
The idea is only to start prefetching on packets that are marked as
ready/completed in the RX ring.

This is acheived by splitting the napi_poll call mlx4_en_process_rx_cq()
loop into two.  The first loop extract completed CQEs and start
prefetching on data and RX descriptors. The second loop process the
real packets.

Details: The batching of CQEs are limited to 8 in-order to avoid
stressing the LFB (Line Fill Buffer) and cache usage.

I've left some opportunities for prefetching CQE descriptors.


The performance improvements on my platform are huge, as I tested this
on a CPU without DDIO.  The performance for XDP is the same as with
Brendens prefetch hack.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c |   70 +++++++++++++++++++++++++---
 1 file changed, 62 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 41c76fe00a7f..c5efe03e31ce 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -782,7 +782,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	int doorbell_pending;
 	struct sk_buff *skb;
 	int tx_index;
-	int index;
+	int index, saved_index, i;
 	int nr;
 	unsigned int length;
 	int polled = 0;
@@ -790,6 +790,10 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	int factor = priv->cqe_factor;
 	u64 timestamp;
 	bool l2_tunnel;
+#define PREFETCH_BATCH 8
+	struct mlx4_cqe *cqe_array[PREFETCH_BATCH];
+	int cqe_idx;
+	bool cqe_more;
 
 	if (!priv->port_up)
 		return 0;
@@ -801,24 +805,75 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	doorbell_pending = 0;
 	tx_index = (priv->tx_ring_num - priv->rsv_tx_rings) + cq->ring;
 
+next_prefetch_batch:
+	cqe_idx = 0;
+	cqe_more = false;
+
 	/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
 	 * descriptor offset can be deduced from the CQE index instead of
 	 * reading 'cqe->index' */
 	index = cq->mcq.cons_index & ring->size_mask;
+	saved_index = index;
 	cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
 
-	/* Process all completed CQEs */
+	/* Extract and prefetch completed CQEs */
 	while (XNOR(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK,
 		    cq->mcq.cons_index & cq->size)) {
+		void *data;
 
 		frags = ring->rx_info + (index << priv->log_rx_info);
 		rx_desc = ring->buf + (index << ring->log_stride);
+		prefetch(rx_desc);
 
 		/*
 		 * make sure we read the CQE after we read the ownership bit
 		 */
 		dma_rmb();
 
+		cqe_array[cqe_idx++] = cqe;
+
+		/* Base error handling here, free handled in next loop */
+		if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
+			     MLX4_CQE_OPCODE_ERROR))
+			goto skip;
+
+		data = page_address(frags[0].page) + frags[0].page_offset;
+		prefetch(data);
+	skip:
+		++cq->mcq.cons_index;
+		index = (cq->mcq.cons_index) & ring->size_mask;
+		cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
+		/* likely too slow prefetching CQE here ... do look-a-head ? */
+		//prefetch(cqe + priv->cqe_size * 3);
+
+		if (++polled == budget) {
+			cqe_more = false;
+			break;
+		}
+		if (cqe_idx == PREFETCH_BATCH) {
+			cqe_more = true;
+			// IDEA: Opportunistic prefetch CQEs for next_prefetch_batch?
+			//for (i = 0; i < PREFETCH_BATCH; i++) {
+			//	prefetch(cqe + priv->cqe_size * i);
+			//}
+			break;
+		}
+	}
+	/* Hint: The cqe_idx will be number of packets, it can be used
+	 * for bulk allocating SKBs
+	 */
+
+	/* Now, index function as index for rx_desc */
+	index = saved_index;
+
+	/* Process completed CQEs in cqe_array */
+	for (i = 0; i < cqe_idx; i++) {
+
+		cqe = cqe_array[i];
+
+		frags = ring->rx_info + (index << priv->log_rx_info);
+		rx_desc = ring->buf + (index << ring->log_stride);
+
 		/* Drop packet on bad receive or bad checksum */
 		if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
 						MLX4_CQE_OPCODE_ERROR)) {
@@ -1065,14 +1120,13 @@ next:
 			mlx4_en_free_frag(priv, frags, nr);
 
 consumed:
-		++cq->mcq.cons_index;
-		index = (cq->mcq.cons_index) & ring->size_mask;
-		cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
-		if (++polled == budget)
-			goto out;
+		++index;
+		index = index & ring->size_mask;
 	}
+	/* Check for more completed CQEs */
+	if (cqe_more)
+		goto next_prefetch_batch;
 
-out:
 	if (doorbell_pending)
 		mlx4_en_xmit_doorbell(priv->tx_ring[tx_index]);
 

  reply	other threads:[~2016-07-08 16:02 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-08  2:15 [PATCH v6 00/12] Add driver bpf hook for early packet drop and forwarding Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 01/12] bpf: add XDP prog type for early driver filter Brenden Blanco
2016-07-09  8:14   ` Jesper Dangaard Brouer
2016-07-09 13:47     ` Tom Herbert
2016-07-10 13:37       ` Jesper Dangaard Brouer
2016-07-10 17:09         ` Brenden Blanco
2016-07-10 20:30           ` Tom Herbert
2016-07-11 10:15             ` Daniel Borkmann
2016-07-11 12:58               ` Jesper Dangaard Brouer
2016-07-10 20:27         ` Tom Herbert
2016-07-11 11:36           ` Jesper Dangaard Brouer
2016-07-10 20:56   ` Tom Herbert
2016-07-11 16:51     ` Brenden Blanco
2016-07-11 21:21       ` Daniel Borkmann
2016-07-10 21:04   ` Tom Herbert
2016-07-11 13:53     ` Jesper Dangaard Brouer
2016-07-08  2:15 ` [PATCH v6 02/12] net: add ndo to set xdp prog in adapter rx Brenden Blanco
2016-07-10 20:59   ` Tom Herbert
2016-07-11 10:35     ` Daniel Borkmann
2016-07-08  2:15 ` [PATCH v6 03/12] rtnl: add option for setting link xdp prog Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 04/12] net/mlx4_en: add support for fast rx drop bpf program Brenden Blanco
2016-07-09 14:07   ` Or Gerlitz
2016-07-10 15:40     ` Brenden Blanco
2016-07-10 16:38       ` Tariq Toukan
2016-07-09 19:58   ` Saeed Mahameed
2016-07-09 21:37     ` Or Gerlitz
2016-07-10 15:25     ` Tariq Toukan
2016-07-10 16:05       ` Brenden Blanco
2016-07-11 11:48         ` Saeed Mahameed
2016-07-11 21:49           ` Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 05/12] Add sample for adding simple drop program to link Brenden Blanco
2016-07-09 20:21   ` Saeed Mahameed
2016-07-11 11:09   ` Jamal Hadi Salim
2016-07-11 13:37     ` Jesper Dangaard Brouer
2016-07-16 14:55       ` Jamal Hadi Salim
2016-07-08  2:15 ` [PATCH v6 06/12] net/mlx4_en: add page recycle to prepare rx ring for tx support Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 07/12] bpf: add XDP_TX xdp_action for direct forwarding Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 08/12] net/mlx4_en: break out tx_desc write into separate function Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 09/12] net/mlx4_en: add xdp forwarding and data write support Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 10/12] bpf: enable direct packet data write for xdp progs Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 11/12] bpf: add sample for xdp forwarding and rewrite Brenden Blanco
2016-07-08  2:15 ` [PATCH v6 12/12] net/mlx4_en: add prefetch in xdp rx path Brenden Blanco
2016-07-08  3:56   ` Eric Dumazet
2016-07-08  4:16     ` Alexei Starovoitov
2016-07-08  6:56       ` Eric Dumazet
2016-07-08 16:49         ` Brenden Blanco
2016-07-10 20:48           ` Tom Herbert
2016-07-10 20:50           ` Tom Herbert
2016-07-11 14:54             ` Jesper Dangaard Brouer
2016-07-08 15:20     ` Jesper Dangaard Brouer
2016-07-08 16:02       ` Jesper Dangaard Brouer [this message]
2016-07-11 11:09         ` [net-next PATCH RFC] mlx4: RX prefetch loop Jesper Dangaard Brouer
2016-07-11 16:00           ` Brenden Blanco
2016-07-11 23:05           ` Alexei Starovoitov
2016-07-12 12:45             ` Jesper Dangaard Brouer
2016-07-12 16:46               ` Alexander Duyck
2016-07-12 19:52                 ` Jesper Dangaard Brouer
2016-07-13  1:37                   ` Alexei Starovoitov
2016-07-10 16:14 ` [PATCH v6 00/12] Add driver bpf hook for early packet drop and forwarding Tariq Toukan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160708160135.27507.77711.stgit@firesoul \
    --to=brouer@redhat.com \
    --cc=as754m@att.com \
    --cc=bblanco@plumgrid.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=gerlitz.or@gmail.com \
    --cc=hannes@stressinduktion.org \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=rana.shahot@gmail.com \
    --cc=tgraf@suug.ch \
    --cc=tom@herbertland.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.