bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/8] virtio_net: refactor xdp codes
@ 2023-03-22  3:03 Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately Xuan Zhuo
                   ` (8 more replies)
  0 siblings, 9 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

Due to historical reasons, the implementation of XDP in virtio-net is relatively
chaotic. For example, the processing of XDP actions has two copies of similar
code. Such as page, xdp_page processing, etc.

The purpose of this patch set is to refactor these code. Reduce the difficulty
of subsequent maintenance. Subsequent developers will not introduce new bugs
because of some complex logical relationships.

In addition, the supporting to AF_XDP that I want to submit later will also need
to reuse the logic of XDP, such as the processing of actions, I don't want to
introduce a new similar code. In this way, I can reuse these codes in the
future.

Please review.

Thanks.

Xuan Zhuo (8):
  virtio_net: mergeable xdp: put old page immediately
  virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
    run xdp
  virtio_net: separate the logic of freeing xdp shinfo
  virtio_net: separate the logic of freeing the rest mergeable buf
  virtio_net: auto release xdp shinfo
  virtio_net: introduce receive_mergeable_xdp()
  virtio_net: introduce receive_small_xdp()

 drivers/net/virtio_net.c | 615 +++++++++++++++++++++++----------------
 1 file changed, 357 insertions(+), 258 deletions(-)

--
2.32.0.3.g01195cf9f


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-22  8:22   ` Yunsheng Lin
  2023-03-22  3:03 ` [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare Xuan Zhuo
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

In the xdp implementation of virtio-net mergeable, it always checks
whether two page is used and a page is selected to release. This is
complicated for the processing of action, and be careful.

In the entire process, we have such principles:
* If xdp_page is used (PASS, TX, Redirect), then we release the old
  page.
* If it is a drop case, we will release two. The old page obtained from
  buf is release inside err_xdp, and xdp_page needs be relased by us.

But in fact, when we allocate a new page, we can release the old page
immediately. Then just one is using, we just need to release the new
page for drop case. On the drop path, err_xdp will release the variable
"page", so we only need to let "page" point to the new xdp_page in
advance.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e2560b6f7980..4d2bf1ce0730 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1245,6 +1245,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			if (!xdp_page)
 				goto err_xdp;
 			offset = VIRTIO_XDP_HEADROOM;
+
+			put_page(page);
+			page = xdp_page;
 		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
 			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
 						  sizeof(struct skb_shared_info));
@@ -1259,6 +1262,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			       page_address(page) + offset, len);
 			frame_sz = PAGE_SIZE;
 			offset = VIRTIO_XDP_HEADROOM;
+
+			put_page(page);
+			page = xdp_page;
 		} else {
 			xdp_page = page;
 		}
@@ -1278,8 +1284,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			if (unlikely(!head_skb))
 				goto err_xdp_frags;
 
-			if (unlikely(xdp_page != page))
-				put_page(page);
 			rcu_read_unlock();
 			return head_skb;
 		case XDP_TX:
@@ -1297,8 +1301,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 				goto err_xdp_frags;
 			}
 			*xdp_xmit |= VIRTIO_XDP_TX;
-			if (unlikely(xdp_page != page))
-				put_page(page);
 			rcu_read_unlock();
 			goto xdp_xmit;
 		case XDP_REDIRECT:
@@ -1307,8 +1309,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			if (err)
 				goto err_xdp_frags;
 			*xdp_xmit |= VIRTIO_XDP_REDIR;
-			if (unlikely(xdp_page != page))
-				put_page(page);
 			rcu_read_unlock();
 			goto xdp_xmit;
 		default:
@@ -1321,9 +1321,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			goto err_xdp_frags;
 		}
 err_xdp_frags:
-		if (unlikely(xdp_page != page))
-			__free_pages(xdp_page, 0);
-
 		if (xdp_buff_has_frags(&xdp)) {
 			shinfo = xdp_get_shared_info_from_buff(&xdp);
 			for (i = 0; i < shinfo->nr_frags; i++) {
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-22 11:52   ` Yunsheng Lin
  2023-03-22  3:03 ` [PATCH net-next 3/8] virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdp Xuan Zhuo
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

Separating the logic of preparation for xdp from receive_mergeable.

The purpose of this is to simplify the logic of execution of XDP.

The main logic here is that when headroom is insufficient, we need to
allocate a new page and calculate offset. It should be noted that if
there is new page, the variable page will refer to the new page.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
 1 file changed, 77 insertions(+), 58 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 4d2bf1ce0730..bb426958cdd4 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
 	return 0;
 }
 
+static void *mergeable_xdp_prepare(struct virtnet_info *vi,
+				   struct receive_queue *rq,
+				   struct bpf_prog *xdp_prog,
+				   void *ctx,
+				   unsigned int *frame_sz,
+				   int *num_buf,
+				   struct page **page,
+				   int offset,
+				   unsigned int *len,
+				   struct virtio_net_hdr_mrg_rxbuf *hdr)
+{
+	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
+	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
+	struct page *xdp_page;
+	unsigned int xdp_room;
+
+	/* Transient failure which in theory could occur if
+	 * in-flight packets from before XDP was enabled reach
+	 * the receive path after XDP is loaded.
+	 */
+	if (unlikely(hdr->hdr.gso_type))
+		return NULL;
+
+	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
+	 * with headroom may add hole in truesize, which
+	 * make their length exceed PAGE_SIZE. So we disabled the
+	 * hole mechanism for xdp. See add_recvbuf_mergeable().
+	 */
+	*frame_sz = truesize;
+
+	/* This happens when headroom is not enough because
+	 * of the buffer was prefilled before XDP is set.
+	 * This should only happen for the first several packets.
+	 * In fact, vq reset can be used here to help us clean up
+	 * the prefilled buffers, but many existing devices do not
+	 * support it, and we don't want to bother users who are
+	 * using xdp normally.
+	 */
+	if (!xdp_prog->aux->xdp_has_frags &&
+	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
+		/* linearize data for XDP */
+		xdp_page = xdp_linearize_page(rq, num_buf,
+					      *page, offset,
+					      VIRTIO_XDP_HEADROOM,
+					      len);
+
+		if (!xdp_page)
+			return NULL;
+	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
+		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
+					  sizeof(struct skb_shared_info));
+		if (*len + xdp_room > PAGE_SIZE)
+			return NULL;
+
+		xdp_page = alloc_page(GFP_ATOMIC);
+		if (!xdp_page)
+			return NULL;
+
+		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
+		       page_address(*page) + offset, *len);
+	} else {
+		return page_address(*page) + offset;
+	}
+
+	*frame_sz = PAGE_SIZE;
+
+	put_page(*page);
+
+	*page = xdp_page;
+
+	return page_address(xdp_page) + VIRTIO_XDP_HEADROOM;
+}
+
 static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 struct virtnet_info *vi,
 					 struct receive_queue *rq,
@@ -1181,7 +1254,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
 	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
 	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
-	unsigned int frame_sz, xdp_room;
+	unsigned int frame_sz;
 	int err;
 
 	head_skb = NULL;
@@ -1211,65 +1284,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		u32 act;
 		int i;
 
-		/* Transient failure which in theory could occur if
-		 * in-flight packets from before XDP was enabled reach
-		 * the receive path after XDP is loaded.
-		 */
-		if (unlikely(hdr->hdr.gso_type))
+		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
+					     offset, &len, hdr);
+		if (!data)
 			goto err_xdp;
 
-		/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
-		 * with headroom may add hole in truesize, which
-		 * make their length exceed PAGE_SIZE. So we disabled the
-		 * hole mechanism for xdp. See add_recvbuf_mergeable().
-		 */
-		frame_sz = truesize;
-
-		/* This happens when headroom is not enough because
-		 * of the buffer was prefilled before XDP is set.
-		 * This should only happen for the first several packets.
-		 * In fact, vq reset can be used here to help us clean up
-		 * the prefilled buffers, but many existing devices do not
-		 * support it, and we don't want to bother users who are
-		 * using xdp normally.
-		 */
-		if (!xdp_prog->aux->xdp_has_frags &&
-		    (num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
-			/* linearize data for XDP */
-			xdp_page = xdp_linearize_page(rq, &num_buf,
-						      page, offset,
-						      VIRTIO_XDP_HEADROOM,
-						      &len);
-			frame_sz = PAGE_SIZE;
-
-			if (!xdp_page)
-				goto err_xdp;
-			offset = VIRTIO_XDP_HEADROOM;
-
-			put_page(page);
-			page = xdp_page;
-		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
-			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
-						  sizeof(struct skb_shared_info));
-			if (len + xdp_room > PAGE_SIZE)
-				goto err_xdp;
-
-			xdp_page = alloc_page(GFP_ATOMIC);
-			if (!xdp_page)
-				goto err_xdp;
-
-			memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
-			       page_address(page) + offset, len);
-			frame_sz = PAGE_SIZE;
-			offset = VIRTIO_XDP_HEADROOM;
-
-			put_page(page);
-			page = xdp_page;
-		} else {
-			xdp_page = page;
-		}
-
-		data = page_address(xdp_page) + offset;
 		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
 						 &num_buf, &xdp_frags_truesz, stats);
 		if (unlikely(err))
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next 3/8] virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdp
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 4/8] virtio_net: separate the logic of freeing xdp shinfo Xuan Zhuo
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

At present, we have two similar logic to perform the XDP prog.

Therefore, this PATCH separates the code of executing XDP, which is
conducive to later maintenance.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 142 +++++++++++++++++++++------------------
 1 file changed, 75 insertions(+), 67 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index bb426958cdd4..72b9d6ee4024 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -301,6 +301,15 @@ struct padded_vnet_hdr {
 	char padding[12];
 };
 
+enum {
+	/* xdp pass */
+	VIRTNET_XDP_RES_PASS,
+	/* drop packet. the caller needs to release the page. */
+	VIRTNET_XDP_RES_DROP,
+	/* packet is consumed by xdp. the caller needs to do nothing. */
+	VIRTNET_XDP_RES_CONSUMED,
+};
+
 static void virtnet_rq_free_unused_buf(struct virtqueue *vq, void *buf);
 static void virtnet_sq_free_unused_buf(struct virtqueue *vq, void *buf);
 
@@ -789,6 +798,59 @@ static int virtnet_xdp_xmit(struct net_device *dev,
 	return ret;
 }
 
+static int virtnet_xdp_handler(struct bpf_prog *xdp_prog, struct xdp_buff *xdp,
+			       struct net_device *dev,
+			       unsigned int *xdp_xmit,
+			       struct virtnet_rq_stats *stats)
+{
+	struct xdp_frame *xdpf;
+	int err;
+	u32 act;
+
+	act = bpf_prog_run_xdp(xdp_prog, xdp);
+	stats->xdp_packets++;
+
+	switch (act) {
+	case XDP_PASS:
+		return VIRTNET_XDP_RES_PASS;
+
+	case XDP_TX:
+		stats->xdp_tx++;
+		xdpf = xdp_convert_buff_to_frame(xdp);
+		if (unlikely(!xdpf))
+			return VIRTNET_XDP_RES_DROP;
+
+		err = virtnet_xdp_xmit(dev, 1, &xdpf, 0);
+		if (unlikely(!err)) {
+			xdp_return_frame_rx_napi(xdpf);
+		} else if (unlikely(err < 0)) {
+			trace_xdp_exception(dev, xdp_prog, act);
+			return VIRTNET_XDP_RES_DROP;
+		}
+
+		*xdp_xmit |= VIRTIO_XDP_TX;
+		return VIRTNET_XDP_RES_CONSUMED;
+
+	case XDP_REDIRECT:
+		stats->xdp_redirects++;
+		err = xdp_do_redirect(dev, xdp, xdp_prog);
+		if (err)
+			return VIRTNET_XDP_RES_DROP;
+
+		*xdp_xmit |= VIRTIO_XDP_REDIR;
+		return VIRTNET_XDP_RES_CONSUMED;
+
+	default:
+		bpf_warn_invalid_xdp_action(dev, xdp_prog, act);
+		fallthrough;
+	case XDP_ABORTED:
+		trace_xdp_exception(dev, xdp_prog, act);
+		fallthrough;
+	case XDP_DROP:
+		return VIRTNET_XDP_RES_DROP;
+	}
+}
+
 static unsigned int virtnet_get_headroom(struct virtnet_info *vi)
 {
 	return vi->xdp_enabled ? VIRTIO_XDP_HEADROOM : 0;
@@ -876,7 +938,6 @@ static struct sk_buff *receive_small(struct net_device *dev,
 	struct page *page = virt_to_head_page(buf);
 	unsigned int delta = 0;
 	struct page *xdp_page;
-	int err;
 	unsigned int metasize = 0;
 
 	len -= vi->hdr_len;
@@ -898,7 +959,6 @@ static struct sk_buff *receive_small(struct net_device *dev,
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (xdp_prog) {
 		struct virtio_net_hdr_mrg_rxbuf *hdr = buf + header_offset;
-		struct xdp_frame *xdpf;
 		struct xdp_buff xdp;
 		void *orig_data;
 		u32 act;
@@ -931,46 +991,22 @@ static struct sk_buff *receive_small(struct net_device *dev,
 		xdp_prepare_buff(&xdp, buf + VIRTNET_RX_PAD + vi->hdr_len,
 				 xdp_headroom, len, true);
 		orig_data = xdp.data;
-		act = bpf_prog_run_xdp(xdp_prog, &xdp);
-		stats->xdp_packets++;
+
+		act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
 
 		switch (act) {
-		case XDP_PASS:
+		case VIRTNET_XDP_RES_PASS:
 			/* Recalculate length in case bpf program changed it */
 			delta = orig_data - xdp.data;
 			len = xdp.data_end - xdp.data;
 			metasize = xdp.data - xdp.data_meta;
 			break;
-		case XDP_TX:
-			stats->xdp_tx++;
-			xdpf = xdp_convert_buff_to_frame(&xdp);
-			if (unlikely(!xdpf))
-				goto err_xdp;
-			err = virtnet_xdp_xmit(dev, 1, &xdpf, 0);
-			if (unlikely(!err)) {
-				xdp_return_frame_rx_napi(xdpf);
-			} else if (unlikely(err < 0)) {
-				trace_xdp_exception(vi->dev, xdp_prog, act);
-				goto err_xdp;
-			}
-			*xdp_xmit |= VIRTIO_XDP_TX;
-			rcu_read_unlock();
-			goto xdp_xmit;
-		case XDP_REDIRECT:
-			stats->xdp_redirects++;
-			err = xdp_do_redirect(dev, &xdp, xdp_prog);
-			if (err)
-				goto err_xdp;
-			*xdp_xmit |= VIRTIO_XDP_REDIR;
+
+		case VIRTNET_XDP_RES_CONSUMED:
 			rcu_read_unlock();
 			goto xdp_xmit;
-		default:
-			bpf_warn_invalid_xdp_action(vi->dev, xdp_prog, act);
-			fallthrough;
-		case XDP_ABORTED:
-			trace_xdp_exception(vi->dev, xdp_prog, act);
-			goto err_xdp;
-		case XDP_DROP:
+
+		case VIRTNET_XDP_RES_DROP:
 			goto err_xdp;
 		}
 	}
@@ -1277,7 +1313,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	if (xdp_prog) {
 		unsigned int xdp_frags_truesz = 0;
 		struct skb_shared_info *shinfo;
-		struct xdp_frame *xdpf;
 		struct page *xdp_page;
 		struct xdp_buff xdp;
 		void *data;
@@ -1294,49 +1329,22 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		if (unlikely(err))
 			goto err_xdp_frags;
 
-		act = bpf_prog_run_xdp(xdp_prog, &xdp);
-		stats->xdp_packets++;
+		act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
 
 		switch (act) {
-		case XDP_PASS:
+		case VIRTNET_XDP_RES_PASS:
 			head_skb = build_skb_from_xdp_buff(dev, vi, &xdp, xdp_frags_truesz);
 			if (unlikely(!head_skb))
 				goto err_xdp_frags;
 
 			rcu_read_unlock();
 			return head_skb;
-		case XDP_TX:
-			stats->xdp_tx++;
-			xdpf = xdp_convert_buff_to_frame(&xdp);
-			if (unlikely(!xdpf)) {
-				netdev_dbg(dev, "convert buff to frame failed for xdp\n");
-				goto err_xdp_frags;
-			}
-			err = virtnet_xdp_xmit(dev, 1, &xdpf, 0);
-			if (unlikely(!err)) {
-				xdp_return_frame_rx_napi(xdpf);
-			} else if (unlikely(err < 0)) {
-				trace_xdp_exception(vi->dev, xdp_prog, act);
-				goto err_xdp_frags;
-			}
-			*xdp_xmit |= VIRTIO_XDP_TX;
-			rcu_read_unlock();
-			goto xdp_xmit;
-		case XDP_REDIRECT:
-			stats->xdp_redirects++;
-			err = xdp_do_redirect(dev, &xdp, xdp_prog);
-			if (err)
-				goto err_xdp_frags;
-			*xdp_xmit |= VIRTIO_XDP_REDIR;
+
+		case VIRTNET_XDP_RES_CONSUMED:
 			rcu_read_unlock();
 			goto xdp_xmit;
-		default:
-			bpf_warn_invalid_xdp_action(vi->dev, xdp_prog, act);
-			fallthrough;
-		case XDP_ABORTED:
-			trace_xdp_exception(vi->dev, xdp_prog, act);
-			fallthrough;
-		case XDP_DROP:
+
+		case VIRTNET_XDP_RES_DROP:
 			goto err_xdp_frags;
 		}
 err_xdp_frags:
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next 4/8] virtio_net: separate the logic of freeing xdp shinfo
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
                   ` (2 preceding siblings ...)
  2023-03-22  3:03 ` [PATCH net-next 3/8] virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdp Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 5/8] virtio_net: separate the logic of freeing the rest mergeable buf Xuan Zhuo
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

This patch introduce a new function that releases the
xdp shinfo. The subsequent patch will reuse this function.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 72b9d6ee4024..09aed60e2f51 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -798,6 +798,21 @@ static int virtnet_xdp_xmit(struct net_device *dev,
 	return ret;
 }
 
+static void put_xdp_frags(struct xdp_buff *xdp)
+{
+	struct skb_shared_info *shinfo;
+	struct page *xdp_page;
+	int i;
+
+	if (xdp_buff_has_frags(xdp)) {
+		shinfo = xdp_get_shared_info_from_buff(xdp);
+		for (i = 0; i < shinfo->nr_frags; i++) {
+			xdp_page = skb_frag_page(&shinfo->frags[i]);
+			put_page(xdp_page);
+		}
+	}
+}
+
 static int virtnet_xdp_handler(struct bpf_prog *xdp_prog, struct xdp_buff *xdp,
 			       struct net_device *dev,
 			       unsigned int *xdp_xmit,
@@ -1312,12 +1327,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (xdp_prog) {
 		unsigned int xdp_frags_truesz = 0;
-		struct skb_shared_info *shinfo;
-		struct page *xdp_page;
 		struct xdp_buff xdp;
 		void *data;
 		u32 act;
-		int i;
 
 		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
 					     offset, &len, hdr);
@@ -1348,14 +1360,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			goto err_xdp_frags;
 		}
 err_xdp_frags:
-		if (xdp_buff_has_frags(&xdp)) {
-			shinfo = xdp_get_shared_info_from_buff(&xdp);
-			for (i = 0; i < shinfo->nr_frags; i++) {
-				xdp_page = skb_frag_page(&shinfo->frags[i]);
-				put_page(xdp_page);
-			}
-		}
-
+		put_xdp_frags(&xdp);
 		goto err_xdp;
 	}
 	rcu_read_unlock();
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next 5/8] virtio_net: separate the logic of freeing the rest mergeable buf
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
                   ` (3 preceding siblings ...)
  2023-03-22  3:03 ` [PATCH net-next 4/8] virtio_net: separate the logic of freeing xdp shinfo Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 6/8] virtio_net: auto release xdp shinfo Xuan Zhuo
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

This patch introduce a new function that frees the rest mergeable buf.
The subsequent patch will reuse this function.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 09aed60e2f51..a3f2bcb3db27 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1076,6 +1076,28 @@ static struct sk_buff *receive_big(struct net_device *dev,
 	return NULL;
 }
 
+static void mergeable_buf_free(struct receive_queue *rq, int num_buf,
+			       struct net_device *dev,
+			       struct virtnet_rq_stats *stats)
+{
+	struct page *page;
+	void *buf;
+	int len;
+
+	while (num_buf-- > 1) {
+		buf = virtqueue_get_buf(rq->vq, &len);
+		if (unlikely(!buf)) {
+			pr_debug("%s: rx error: %d buffers missing\n",
+				 dev->name, num_buf);
+			dev->stats.rx_length_errors++;
+			break;
+		}
+		stats->bytes += len;
+		page = virt_to_head_page(buf);
+		put_page(page);
+	}
+}
+
 /* Why not use xdp_build_skb_from_frame() ?
  * XDP core assumes that xdp frags are PAGE_SIZE in length, while in
  * virtio-net there are 2 points that do not match its requirements:
@@ -1436,18 +1458,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	stats->xdp_drops++;
 err_skb:
 	put_page(page);
-	while (num_buf-- > 1) {
-		buf = virtqueue_get_buf(rq->vq, &len);
-		if (unlikely(!buf)) {
-			pr_debug("%s: rx error: %d buffers missing\n",
-				 dev->name, num_buf);
-			dev->stats.rx_length_errors++;
-			break;
-		}
-		stats->bytes += len;
-		page = virt_to_head_page(buf);
-		put_page(page);
-	}
+	mergeable_buf_free(rq, num_buf, dev, stats);
+
 err_buf:
 	stats->drops++;
 	dev_kfree_skb(head_skb);
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next 6/8] virtio_net: auto release xdp shinfo
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
                   ` (4 preceding siblings ...)
  2023-03-22  3:03 ` [PATCH net-next 5/8] virtio_net: separate the logic of freeing the rest mergeable buf Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-22  3:03 ` [PATCH net-next 7/8] virtio_net: introduce receive_mergeable_xdp() Xuan Zhuo
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

virtnet_build_xdp_buff_mrg() and virtnet_xdp_handler() auto
release xdp shinfo then the caller no need to careful the xdp shinfo.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a3f2bcb3db27..136131a7868a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -833,14 +833,14 @@ static int virtnet_xdp_handler(struct bpf_prog *xdp_prog, struct xdp_buff *xdp,
 		stats->xdp_tx++;
 		xdpf = xdp_convert_buff_to_frame(xdp);
 		if (unlikely(!xdpf))
-			return VIRTNET_XDP_RES_DROP;
+			goto drop;
 
 		err = virtnet_xdp_xmit(dev, 1, &xdpf, 0);
 		if (unlikely(!err)) {
 			xdp_return_frame_rx_napi(xdpf);
 		} else if (unlikely(err < 0)) {
 			trace_xdp_exception(dev, xdp_prog, act);
-			return VIRTNET_XDP_RES_DROP;
+			goto drop;
 		}
 
 		*xdp_xmit |= VIRTIO_XDP_TX;
@@ -850,7 +850,7 @@ static int virtnet_xdp_handler(struct bpf_prog *xdp_prog, struct xdp_buff *xdp,
 		stats->xdp_redirects++;
 		err = xdp_do_redirect(dev, xdp, xdp_prog);
 		if (err)
-			return VIRTNET_XDP_RES_DROP;
+			goto drop;
 
 		*xdp_xmit |= VIRTIO_XDP_REDIR;
 		return VIRTNET_XDP_RES_CONSUMED;
@@ -862,8 +862,12 @@ static int virtnet_xdp_handler(struct bpf_prog *xdp_prog, struct xdp_buff *xdp,
 		trace_xdp_exception(dev, xdp_prog, act);
 		fallthrough;
 	case XDP_DROP:
-		return VIRTNET_XDP_RES_DROP;
+		goto drop;
 	}
+
+drop:
+	put_xdp_frags(xdp);
+	return VIRTNET_XDP_RES_DROP;
 }
 
 static unsigned int virtnet_get_headroom(struct virtnet_info *vi)
@@ -1199,7 +1203,7 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
 				 dev->name, *num_buf,
 				 virtio16_to_cpu(vi->vdev, hdr->num_buffers));
 			dev->stats.rx_length_errors++;
-			return -EINVAL;
+			goto err;
 		}
 
 		stats->bytes += len;
@@ -1218,7 +1222,7 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
 			pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
 				 dev->name, len, (unsigned long)(truesize - room));
 			dev->stats.rx_length_errors++;
-			return -EINVAL;
+			goto err;
 		}
 
 		frag = &shinfo->frags[shinfo->nr_frags++];
@@ -1233,6 +1237,10 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
 
 	*xdp_frags_truesize = xdp_frags_truesz;
 	return 0;
+
+err:
+	put_xdp_frags(xdp);
+	return -EINVAL;
 }
 
 static void *mergeable_xdp_prepare(struct virtnet_info *vi,
@@ -1361,7 +1369,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
 						 &num_buf, &xdp_frags_truesz, stats);
 		if (unlikely(err))
-			goto err_xdp_frags;
+			goto err_xdp;
 
 		act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
 
@@ -1369,7 +1377,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		case VIRTNET_XDP_RES_PASS:
 			head_skb = build_skb_from_xdp_buff(dev, vi, &xdp, xdp_frags_truesz);
 			if (unlikely(!head_skb))
-				goto err_xdp_frags;
+				goto err_xdp;
 
 			rcu_read_unlock();
 			return head_skb;
@@ -1379,11 +1387,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			goto xdp_xmit;
 
 		case VIRTNET_XDP_RES_DROP:
-			goto err_xdp_frags;
+			goto err_xdp;
 		}
-err_xdp_frags:
-		put_xdp_frags(&xdp);
-		goto err_xdp;
 	}
 	rcu_read_unlock();
 
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next 7/8] virtio_net: introduce receive_mergeable_xdp()
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
                   ` (5 preceding siblings ...)
  2023-03-22  3:03 ` [PATCH net-next 6/8] virtio_net: auto release xdp shinfo Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-22 23:43   ` kernel test robot
  2023-03-22  3:03 ` [PATCH net-next 8/8] virtio_net: introduce receive_small_xdp() Xuan Zhuo
  2023-03-22  3:34 ` [PATCH net-next 0/8] virtio_net: refactor xdp codes Michael S. Tsirkin
  8 siblings, 1 reply; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

The purpose of this patch is to simplify the receive_mergeable().
Separate all the logic of XDP into a function.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 128 +++++++++++++++++++++++----------------
 1 file changed, 76 insertions(+), 52 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 136131a7868a..6ecb17e972e1 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1316,6 +1316,63 @@ static void *mergeable_xdp_prepare(struct virtnet_info *vi,
 	return page_address(xdp_page) + VIRTIO_XDP_HEADROOM;
 }
 
+static struct sk_buff *receive_mergeable_xdp(struct net_device *dev,
+					     struct virtnet_info *vi,
+					     struct receive_queue *rq,
+					     struct bpf_prog *xdp_prog,
+					     void *buf,
+					     void *ctx,
+					     unsigned int len,
+					     unsigned int *xdp_xmit,
+					     struct virtnet_rq_stats *stats)
+{
+	struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
+	int num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
+	struct page *page = virt_to_head_page(buf);
+	int offset = buf - page_address(page);
+	unsigned int xdp_frags_truesz = 0;
+	struct sk_buff *head_skb;
+	unsigned int frame_sz;
+	struct xdp_buff xdp;
+	void *data;
+	u32 act;
+	int err;
+
+	data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
+				     offset, &len, hdr);
+	if (!data)
+		goto err_xdp;
+
+	err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
+					 &num_buf, &xdp_frags_truesz, stats);
+	if (unlikely(err))
+		goto err_xdp;
+
+	act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
+
+	switch (act) {
+	case VIRTNET_XDP_RES_PASS:
+		head_skb = build_skb_from_xdp_buff(dev, vi, &xdp, xdp_frags_truesz);
+		if (unlikely(!head_skb))
+			goto err_xdp;
+		return head_skb;
+
+	case VIRTNET_XDP_RES_CONSUMED:
+		return NULL;
+
+	case VIRTNET_XDP_RES_DROP:
+		break;
+	}
+
+err_xdp:
+	put_page(page);
+	mergeable_buf_free(rq, num_buf, dev, stats);
+
+	stats->xdp_drops++;
+	stats->drops++;
+	return NULL;
+}
+
 static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 struct virtnet_info *vi,
 					 struct receive_queue *rq,
@@ -1325,18 +1382,16 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 unsigned int *xdp_xmit,
 					 struct virtnet_rq_stats *stats)
 {
-	struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
-	int num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
-	struct page *page = virt_to_head_page(buf);
-	int offset = buf - page_address(page);
-	struct sk_buff *head_skb, *curr_skb;
-	struct bpf_prog *xdp_prog;
 	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
 	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
 	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
 	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
-	unsigned int frame_sz;
-	int err;
+	struct virtio_net_hdr_mrg_rxbuf *hdr;
+	struct sk_buff *head_skb, *curr_skb;
+	struct bpf_prog *xdp_prog;
+	struct page *page;
+	int num_buf;
+	int offset;
 
 	head_skb = NULL;
 	stats->bytes += len - vi->hdr_len;
@@ -1348,51 +1403,24 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		goto err_skb;
 	}
 
-	if (likely(!vi->xdp_enabled)) {
-		xdp_prog = NULL;
-		goto skip_xdp;
-	}
-
-	rcu_read_lock();
-	xdp_prog = rcu_dereference(rq->xdp_prog);
-	if (xdp_prog) {
-		unsigned int xdp_frags_truesz = 0;
-		struct xdp_buff xdp;
-		void *data;
-		u32 act;
-
-		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
-					     offset, &len, hdr);
-		if (!data)
-			goto err_xdp;
-
-		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
-						 &num_buf, &xdp_frags_truesz, stats);
-		if (unlikely(err))
-			goto err_xdp;
-
-		act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
-
-		switch (act) {
-		case VIRTNET_XDP_RES_PASS:
-			head_skb = build_skb_from_xdp_buff(dev, vi, &xdp, xdp_frags_truesz);
-			if (unlikely(!head_skb))
-				goto err_xdp;
-
+	if (likely(vi->xdp_enabled)) {
+		rcu_read_lock();
+		xdp_prog = rcu_dereference(rq->xdp_prog);
+		if (xdp_prog) {
+			head_skb = receive_mergeable_xdp(dev, vi, rq, xdp_prog,
+							 buf, ctx, len, xdp_xmit,
+							 stats);
 			rcu_read_unlock();
 			return head_skb;
-
-		case VIRTNET_XDP_RES_CONSUMED:
-			rcu_read_unlock();
-			goto xdp_xmit;
-
-		case VIRTNET_XDP_RES_DROP:
-			goto err_xdp;
 		}
+		rcu_read_unlock();
 	}
-	rcu_read_unlock();
 
-skip_xdp:
+	hdr = buf;
+	num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
+	page = virt_to_head_page(buf);
+	offset = buf - page_address(page);
+
 	head_skb = page_to_skb(vi, rq, page, offset, len, truesize, headroom);
 	curr_skb = head_skb;
 
@@ -1458,9 +1486,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	ewma_pkt_len_add(&rq->mrg_avg_pkt_len, head_skb->len);
 	return head_skb;
 
-err_xdp:
-	rcu_read_unlock();
-	stats->xdp_drops++;
 err_skb:
 	put_page(page);
 	mergeable_buf_free(rq, num_buf, dev, stats);
@@ -1468,7 +1493,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 err_buf:
 	stats->drops++;
 	dev_kfree_skb(head_skb);
-xdp_xmit:
 	return NULL;
 }
 
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH net-next 8/8] virtio_net: introduce receive_small_xdp()
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
                   ` (6 preceding siblings ...)
  2023-03-22  3:03 ` [PATCH net-next 7/8] virtio_net: introduce receive_mergeable_xdp() Xuan Zhuo
@ 2023-03-22  3:03 ` Xuan Zhuo
  2023-03-23  2:06   ` kernel test robot
  2023-03-22  3:34 ` [PATCH net-next 0/8] virtio_net: refactor xdp codes Michael S. Tsirkin
  8 siblings, 1 reply; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:03 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

The purpose of this patch is to simplify the receive_small().
Separate all the logic of XDP of small into a function.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 165 +++++++++++++++++++++++----------------
 1 file changed, 97 insertions(+), 68 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 6ecb17e972e1..23b1a6fd1224 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -939,6 +939,96 @@ static struct page *xdp_linearize_page(struct receive_queue *rq,
 	return NULL;
 }
 
+static struct sk_buff *receive_small_xdp(struct net_device *dev,
+					 struct virtnet_info *vi,
+					 struct receive_queue *rq,
+					 struct bpf_prog *xdp_prog,
+					 void *buf,
+					 void *ctx,
+					 unsigned int len,
+					 unsigned int *xdp_xmit,
+					 struct virtnet_rq_stats *stats)
+{
+	unsigned int xdp_headroom = (unsigned long)ctx;
+	unsigned int header_offset = VIRTNET_RX_PAD + xdp_headroom;
+	unsigned int headroom = vi->hdr_len + header_offset;
+	struct virtio_net_hdr_mrg_rxbuf *hdr = buf + header_offset;
+	struct page *page = virt_to_head_page(buf);
+	struct page *xdp_page;
+	unsigned int buflen;
+	struct xdp_buff xdp;
+	struct sk_buff *skb;
+	unsigned int delta = 0;
+	unsigned int metasize = 0;
+	void *orig_data;
+	u32 act;
+
+	if (unlikely(hdr->hdr.gso_type))
+		goto err_xdp;
+
+	if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
+		int offset = buf - page_address(page) + header_offset;
+		unsigned int tlen = len + vi->hdr_len;
+		int num_buf = 1;
+
+		xdp_headroom = virtnet_get_headroom(vi);
+		header_offset = VIRTNET_RX_PAD + xdp_headroom;
+		headroom = vi->hdr_len + header_offset;
+		buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
+			SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+		xdp_page = xdp_linearize_page(rq, &num_buf, page,
+					      offset, header_offset,
+					      &tlen);
+		if (!xdp_page)
+			goto err_xdp;
+
+		buf = page_address(xdp_page);
+		put_page(page);
+		page = xdp_page;
+	}
+
+	xdp_init_buff(&xdp, buflen, &rq->xdp_rxq);
+	xdp_prepare_buff(&xdp, buf + VIRTNET_RX_PAD + vi->hdr_len,
+			 xdp_headroom, len, true);
+	orig_data = xdp.data;
+
+	act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
+
+	switch (act) {
+	case VIRTNET_XDP_RES_PASS:
+		/* Recalculate length in case bpf program changed it */
+		delta = orig_data - xdp.data;
+		len = xdp.data_end - xdp.data;
+		metasize = xdp.data - xdp.data_meta;
+		break;
+
+	case VIRTNET_XDP_RES_CONSUMED:
+		goto xdp_xmit;
+
+	case VIRTNET_XDP_RES_DROP:
+		goto err_xdp;
+	}
+
+	skb = build_skb(buf, buflen);
+	if (!skb)
+		goto err;
+
+	skb_reserve(skb, headroom - delta);
+	skb_put(skb, len);
+	if (metasize)
+		skb_metadata_set(skb, metasize);
+
+	return skb;
+
+err_xdp:
+	stats->xdp_drops++;
+err:
+	stats->drops++;
+	put_page(page);
+xdp_xmit:
+	return NULL;
+}
+
 static struct sk_buff *receive_small(struct net_device *dev,
 				     struct virtnet_info *vi,
 				     struct receive_queue *rq,
@@ -949,15 +1039,11 @@ static struct sk_buff *receive_small(struct net_device *dev,
 {
 	struct sk_buff *skb;
 	struct bpf_prog *xdp_prog;
-	unsigned int xdp_headroom = (unsigned long)ctx;
-	unsigned int header_offset = VIRTNET_RX_PAD + xdp_headroom;
+	unsigned int header_offset = VIRTNET_RX_PAD;
 	unsigned int headroom = vi->hdr_len + header_offset;
 	unsigned int buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	struct page *page = virt_to_head_page(buf);
-	unsigned int delta = 0;
-	struct page *xdp_page;
-	unsigned int metasize = 0;
 
 	len -= vi->hdr_len;
 	stats->bytes += len;
@@ -977,57 +1063,9 @@ static struct sk_buff *receive_small(struct net_device *dev,
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (xdp_prog) {
-		struct virtio_net_hdr_mrg_rxbuf *hdr = buf + header_offset;
-		struct xdp_buff xdp;
-		void *orig_data;
-		u32 act;
-
-		if (unlikely(hdr->hdr.gso_type))
-			goto err_xdp;
-
-		if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
-			int offset = buf - page_address(page) + header_offset;
-			unsigned int tlen = len + vi->hdr_len;
-			int num_buf = 1;
-
-			xdp_headroom = virtnet_get_headroom(vi);
-			header_offset = VIRTNET_RX_PAD + xdp_headroom;
-			headroom = vi->hdr_len + header_offset;
-			buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
-				 SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-			xdp_page = xdp_linearize_page(rq, &num_buf, page,
-						      offset, header_offset,
-						      &tlen);
-			if (!xdp_page)
-				goto err_xdp;
-
-			buf = page_address(xdp_page);
-			put_page(page);
-			page = xdp_page;
-		}
-
-		xdp_init_buff(&xdp, buflen, &rq->xdp_rxq);
-		xdp_prepare_buff(&xdp, buf + VIRTNET_RX_PAD + vi->hdr_len,
-				 xdp_headroom, len, true);
-		orig_data = xdp.data;
-
-		act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
-
-		switch (act) {
-		case VIRTNET_XDP_RES_PASS:
-			/* Recalculate length in case bpf program changed it */
-			delta = orig_data - xdp.data;
-			len = xdp.data_end - xdp.data;
-			metasize = xdp.data - xdp.data_meta;
-			break;
-
-		case VIRTNET_XDP_RES_CONSUMED:
-			rcu_read_unlock();
-			goto xdp_xmit;
-
-		case VIRTNET_XDP_RES_DROP:
-			goto err_xdp;
-		}
+		skb = receive_small_xdp(dev, vi, rq, xdp_prog, buf, ctx, len, xdp_xmit, stats);
+		rcu_read_unlock();
+		return skb;
 	}
 	rcu_read_unlock();
 
@@ -1035,25 +1073,16 @@ static struct sk_buff *receive_small(struct net_device *dev,
 	skb = build_skb(buf, buflen);
 	if (!skb)
 		goto err;
-	skb_reserve(skb, headroom - delta);
+	skb_reserve(skb, headroom);
 	skb_put(skb, len);
-	if (!xdp_prog) {
-		buf += header_offset;
-		memcpy(skb_vnet_hdr(skb), buf, vi->hdr_len);
-	} /* keep zeroed vnet hdr since XDP is loaded */
-
-	if (metasize)
-		skb_metadata_set(skb, metasize);
 
+	buf += header_offset;
+	memcpy(skb_vnet_hdr(skb), buf, vi->hdr_len);
 	return skb;
 
-err_xdp:
-	rcu_read_unlock();
-	stats->xdp_drops++;
 err:
 	stats->drops++;
 	put_page(page);
-xdp_xmit:
 	return NULL;
 }
 
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 0/8] virtio_net: refactor xdp codes
  2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
                   ` (7 preceding siblings ...)
  2023-03-22  3:03 ` [PATCH net-next 8/8] virtio_net: introduce receive_small_xdp() Xuan Zhuo
@ 2023-03-22  3:34 ` Michael S. Tsirkin
  2023-03-22  3:40   ` Xuan Zhuo
  8 siblings, 1 reply; 33+ messages in thread
From: Michael S. Tsirkin @ 2023-03-22  3:34 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: netdev, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

On Wed, Mar 22, 2023 at 11:03:00AM +0800, Xuan Zhuo wrote:
> Due to historical reasons, the implementation of XDP in virtio-net is relatively
> chaotic. For example, the processing of XDP actions has two copies of similar
> code. Such as page, xdp_page processing, etc.
> 
> The purpose of this patch set is to refactor these code. Reduce the difficulty
> of subsequent maintenance. Subsequent developers will not introduce new bugs
> because of some complex logical relationships.
> 
> In addition, the supporting to AF_XDP that I want to submit later will also need
> to reuse the logic of XDP, such as the processing of actions, I don't want to
> introduce a new similar code. In this way, I can reuse these codes in the
> future.
> 
> Please review.
> 
> Thanks.

I really want to see that code make progress though.
Would it make sense to merge this one through the virtio tree?

Then you will have all the pieces in one place and try to target
next linux.


> Xuan Zhuo (8):
>   virtio_net: mergeable xdp: put old page immediately
>   virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
>   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
>     run xdp
>   virtio_net: separate the logic of freeing xdp shinfo
>   virtio_net: separate the logic of freeing the rest mergeable buf
>   virtio_net: auto release xdp shinfo
>   virtio_net: introduce receive_mergeable_xdp()
>   virtio_net: introduce receive_small_xdp()
> 
>  drivers/net/virtio_net.c | 615 +++++++++++++++++++++++----------------
>  1 file changed, 357 insertions(+), 258 deletions(-)
> 
> --
> 2.32.0.3.g01195cf9f


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 0/8] virtio_net: refactor xdp codes
  2023-03-22  3:34 ` [PATCH net-next 0/8] virtio_net: refactor xdp codes Michael S. Tsirkin
@ 2023-03-22  3:40   ` Xuan Zhuo
  2023-03-22  3:53     ` Michael S. Tsirkin
  0 siblings, 1 reply; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

On Tue, 21 Mar 2023 23:34:43 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Mar 22, 2023 at 11:03:00AM +0800, Xuan Zhuo wrote:
> > Due to historical reasons, the implementation of XDP in virtio-net is relatively
> > chaotic. For example, the processing of XDP actions has two copies of similar
> > code. Such as page, xdp_page processing, etc.
> >
> > The purpose of this patch set is to refactor these code. Reduce the difficulty
> > of subsequent maintenance. Subsequent developers will not introduce new bugs
> > because of some complex logical relationships.
> >
> > In addition, the supporting to AF_XDP that I want to submit later will also need
> > to reuse the logic of XDP, such as the processing of actions, I don't want to
> > introduce a new similar code. In this way, I can reuse these codes in the
> > future.
> >
> > Please review.
> >
> > Thanks.
>
> I really want to see that code make progress though.

I want to know, you refer to virtio-net + AF_XDP or refactor for XDP.

> Would it make sense to merge this one through the virtio tree?

There are some small problems that we merge this patch-set to Virtio Tree
directly.

Thanks.

>
> Then you will have all the pieces in one place and try to target
> next linux.
>
>
> > Xuan Zhuo (8):
> >   virtio_net: mergeable xdp: put old page immediately
> >   virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
> >   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
> >     run xdp
> >   virtio_net: separate the logic of freeing xdp shinfo
> >   virtio_net: separate the logic of freeing the rest mergeable buf
> >   virtio_net: auto release xdp shinfo
> >   virtio_net: introduce receive_mergeable_xdp()
> >   virtio_net: introduce receive_small_xdp()
> >
> >  drivers/net/virtio_net.c | 615 +++++++++++++++++++++++----------------
> >  1 file changed, 357 insertions(+), 258 deletions(-)
> >
> > --
> > 2.32.0.3.g01195cf9f
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 0/8] virtio_net: refactor xdp codes
  2023-03-22  3:40   ` Xuan Zhuo
@ 2023-03-22  3:53     ` Michael S. Tsirkin
  2023-03-22  3:56       ` Xuan Zhuo
  2023-03-22  3:59       ` Jakub Kicinski
  0 siblings, 2 replies; 33+ messages in thread
From: Michael S. Tsirkin @ 2023-03-22  3:53 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: netdev, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

On Wed, Mar 22, 2023 at 11:40:56AM +0800, Xuan Zhuo wrote:
> On Tue, 21 Mar 2023 23:34:43 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Wed, Mar 22, 2023 at 11:03:00AM +0800, Xuan Zhuo wrote:
> > > Due to historical reasons, the implementation of XDP in virtio-net is relatively
> > > chaotic. For example, the processing of XDP actions has two copies of similar
> > > code. Such as page, xdp_page processing, etc.
> > >
> > > The purpose of this patch set is to refactor these code. Reduce the difficulty
> > > of subsequent maintenance. Subsequent developers will not introduce new bugs
> > > because of some complex logical relationships.
> > >
> > > In addition, the supporting to AF_XDP that I want to submit later will also need
> > > to reuse the logic of XDP, such as the processing of actions, I don't want to
> > > introduce a new similar code. In this way, I can reuse these codes in the
> > > future.
> > >
> > > Please review.
> > >
> > > Thanks.
> >
> > I really want to see that code make progress though.
> 
> I want to know, you refer to virtio-net + AF_XDP or refactor for XDP.
> 
> > Would it make sense to merge this one through the virtio tree?
> 
> There are some small problems that we merge this patch-set to Virtio Tree
> directly.
> 
> Thanks.

what exactly? is there a dependency on net-next?

> >
> > Then you will have all the pieces in one place and try to target
> > next linux.
> >
> >
> > > Xuan Zhuo (8):
> > >   virtio_net: mergeable xdp: put old page immediately
> > >   virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
> > >   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
> > >     run xdp
> > >   virtio_net: separate the logic of freeing xdp shinfo
> > >   virtio_net: separate the logic of freeing the rest mergeable buf
> > >   virtio_net: auto release xdp shinfo
> > >   virtio_net: introduce receive_mergeable_xdp()
> > >   virtio_net: introduce receive_small_xdp()
> > >
> > >  drivers/net/virtio_net.c | 615 +++++++++++++++++++++++----------------
> > >  1 file changed, 357 insertions(+), 258 deletions(-)
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> >


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 0/8] virtio_net: refactor xdp codes
  2023-03-22  3:53     ` Michael S. Tsirkin
@ 2023-03-22  3:56       ` Xuan Zhuo
  2023-03-22  3:59       ` Jakub Kicinski
  1 sibling, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-22  3:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

On Tue, 21 Mar 2023 23:53:52 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Mar 22, 2023 at 11:40:56AM +0800, Xuan Zhuo wrote:
> > On Tue, 21 Mar 2023 23:34:43 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Wed, Mar 22, 2023 at 11:03:00AM +0800, Xuan Zhuo wrote:
> > > > Due to historical reasons, the implementation of XDP in virtio-net is relatively
> > > > chaotic. For example, the processing of XDP actions has two copies of similar
> > > > code. Such as page, xdp_page processing, etc.
> > > >
> > > > The purpose of this patch set is to refactor these code. Reduce the difficulty
> > > > of subsequent maintenance. Subsequent developers will not introduce new bugs
> > > > because of some complex logical relationships.
> > > >
> > > > In addition, the supporting to AF_XDP that I want to submit later will also need
> > > > to reuse the logic of XDP, such as the processing of actions, I don't want to
> > > > introduce a new similar code. In this way, I can reuse these codes in the
> > > > future.
> > > >
> > > > Please review.
> > > >
> > > > Thanks.
> > >
> > > I really want to see that code make progress though.
> >
> > I want to know, you refer to virtio-net + AF_XDP or refactor for XDP.
> >
> > > Would it make sense to merge this one through the virtio tree?
> >
> > There are some small problems that we merge this patch-set to Virtio Tree
> > directly.
> >
> > Thanks.
>
> what exactly? is there a dependency on net-next?

There will be a conflict, I submitted to net before. Now net-next includes it.

[1]. https://lore.kernel.org/netdev/20230315015223.89137-1-xuanzhuo@linux.alibaba.com/

There are no other problems.

Thanks.

>
> > >
> > > Then you will have all the pieces in one place and try to target
> > > next linux.
> > >
> > >
> > > > Xuan Zhuo (8):
> > > >   virtio_net: mergeable xdp: put old page immediately
> > > >   virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
> > > >   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
> > > >     run xdp
> > > >   virtio_net: separate the logic of freeing xdp shinfo
> > > >   virtio_net: separate the logic of freeing the rest mergeable buf
> > > >   virtio_net: auto release xdp shinfo
> > > >   virtio_net: introduce receive_mergeable_xdp()
> > > >   virtio_net: introduce receive_small_xdp()
> > > >
> > > >  drivers/net/virtio_net.c | 615 +++++++++++++++++++++++----------------
> > > >  1 file changed, 357 insertions(+), 258 deletions(-)
> > > >
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > >
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 0/8] virtio_net: refactor xdp codes
  2023-03-22  3:53     ` Michael S. Tsirkin
  2023-03-22  3:56       ` Xuan Zhuo
@ 2023-03-22  3:59       ` Jakub Kicinski
  1 sibling, 0 replies; 33+ messages in thread
From: Jakub Kicinski @ 2023-03-22  3:59 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Xuan Zhuo, netdev, Jason Wang, David S. Miller, Eric Dumazet,
	Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

On Tue, 21 Mar 2023 23:53:52 -0400 Michael S. Tsirkin wrote:
> > > Would it make sense to merge this one through the virtio tree?  
> > 
> > There are some small problems that we merge this patch-set to Virtio Tree
> > directly.
> 
> what exactly? is there a dependency on net-next?

XDP is very actively developed, this seems like a bad idea :(

Is there a problem? Is it because the RFC was not getting much
attention? RFCs are for interesting code, really, nobody was 
interested 🤷️ When it comes to merging - it will be in the tree
in two days if nobody finds bugs...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately
  2023-03-22  3:03 ` [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately Xuan Zhuo
@ 2023-03-22  8:22   ` Yunsheng Lin
  2023-03-23  1:36     ` Xuan Zhuo
  0 siblings, 1 reply; 33+ messages in thread
From: Yunsheng Lin @ 2023-03-22  8:22 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

On 2023/3/22 11:03, Xuan Zhuo wrote:
> In the xdp implementation of virtio-net mergeable, it always checks
> whether two page is used and a page is selected to release. This is
> complicated for the processing of action, and be careful.
> 
> In the entire process, we have such principles:
> * If xdp_page is used (PASS, TX, Redirect), then we release the old
>   page.
> * If it is a drop case, we will release two. The old page obtained from
>   buf is release inside err_xdp, and xdp_page needs be relased by us.
> 
> But in fact, when we allocate a new page, we can release the old page
> immediately. Then just one is using, we just need to release the new
> page for drop case. On the drop path, err_xdp will release the variable
> "page", so we only need to let "page" point to the new xdp_page in
> advance.
> 
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 15 ++++++---------
>  1 file changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e2560b6f7980..4d2bf1ce0730 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1245,6 +1245,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			if (!xdp_page)
>  				goto err_xdp;
>  			offset = VIRTIO_XDP_HEADROOM;
> +
> +			put_page(page);

the error handling of xdp_linearize_page() does not seems self contained.
Does it not seem better:
1. if xdp_linearize_page() succesed, call put_page() for first buffer just
   as put_page() is call for other buffer
2. or call virtqueue_get_buf() and put_page() for all the buffer of the packet
   so the error handling is not needed outside the virtqueue_get_buf().

In that case, it seems we can just do below without xdp_page:
page = xdp_linearize_page(rq, num_buf, page, ...);


> +			page = xdp_page;
>  		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
>  			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
>  						  sizeof(struct skb_shared_info));
> @@ -1259,6 +1262,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			       page_address(page) + offset, len);
>  			frame_sz = PAGE_SIZE;
>  			offset = VIRTIO_XDP_HEADROOM;
> +
> +			put_page(page);
> +			page = xdp_page;

It seems we can limit the scope of xdp_page in this "else if" block.

>  		} else {
>  			xdp_page = page;
>  		}

It seems the above else block is not needed anymore.

> @@ -1278,8 +1284,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			if (unlikely(!head_skb))
>  				goto err_xdp_frags;
>  
> -			if (unlikely(xdp_page != page))
> -				put_page(page);
>  			rcu_read_unlock();
>  			return head_skb;
>  		case XDP_TX:
> @@ -1297,8 +1301,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  				goto err_xdp_frags;
>  			}
>  			*xdp_xmit |= VIRTIO_XDP_TX;
> -			if (unlikely(xdp_page != page))
> -				put_page(page);
>  			rcu_read_unlock();
>  			goto xdp_xmit;
>  		case XDP_REDIRECT:
> @@ -1307,8 +1309,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			if (err)
>  				goto err_xdp_frags;
>  			*xdp_xmit |= VIRTIO_XDP_REDIR;
> -			if (unlikely(xdp_page != page))
> -				put_page(page);
>  			rcu_read_unlock();
>  			goto xdp_xmit;
>  		default:
> @@ -1321,9 +1321,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  			goto err_xdp_frags;
>  		}
>  err_xdp_frags:
> -		if (unlikely(xdp_page != page))
> -			__free_pages(xdp_page, 0);

It seems __free_pages() and put_page() is used interchangeably here.
Perhaps using __free_pages() have performance reason? As the comment below:

https://elixir.bootlin.com/linux/v6.3-rc3/source/net/core/page_pool.c#L500

> -
>  		if (xdp_buff_has_frags(&xdp)) {
>  			shinfo = xdp_get_shared_info_from_buff(&xdp);
>  			for (i = 0; i < shinfo->nr_frags; i++) {
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-22  3:03 ` [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare Xuan Zhuo
@ 2023-03-22 11:52   ` Yunsheng Lin
  2023-03-23  1:45     ` Xuan Zhuo
  0 siblings, 1 reply; 33+ messages in thread
From: Yunsheng Lin @ 2023-03-22 11:52 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

On 2023/3/22 11:03, Xuan Zhuo wrote:
> Separating the logic of preparation for xdp from receive_mergeable.
> 
> The purpose of this is to simplify the logic of execution of XDP.
> 
> The main logic here is that when headroom is insufficient, we need to
> allocate a new page and calculate offset. It should be noted that if
> there is new page, the variable page will refer to the new page.
> 
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
>  1 file changed, 77 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 4d2bf1ce0730..bb426958cdd4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
>  	return 0;
>  }
>  
> +static void *mergeable_xdp_prepare(struct virtnet_info *vi,
> +				   struct receive_queue *rq,
> +				   struct bpf_prog *xdp_prog,
> +				   void *ctx,
> +				   unsigned int *frame_sz,
> +				   int *num_buf,
> +				   struct page **page,
> +				   int offset,
> +				   unsigned int *len,
> +				   struct virtio_net_hdr_mrg_rxbuf *hdr)

The naming convention seems to be xdp_prepare_mergeable().

> +{
> +	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
> +	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> +	struct page *xdp_page;
> +	unsigned int xdp_room;
> +
> +	/* Transient failure which in theory could occur if
> +	 * in-flight packets from before XDP was enabled reach
> +	 * the receive path after XDP is loaded.
> +	 */
> +	if (unlikely(hdr->hdr.gso_type))
> +		return NULL;
> +
> +	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> +	 * with headroom may add hole in truesize, which
> +	 * make their length exceed PAGE_SIZE. So we disabled the
> +	 * hole mechanism for xdp. See add_recvbuf_mergeable().
> +	 */
> +	*frame_sz = truesize;
> +
> +	/* This happens when headroom is not enough because
> +	 * of the buffer was prefilled before XDP is set.
> +	 * This should only happen for the first several packets.
> +	 * In fact, vq reset can be used here to help us clean up
> +	 * the prefilled buffers, but many existing devices do not
> +	 * support it, and we don't want to bother users who are
> +	 * using xdp normally.
> +	 */
> +	if (!xdp_prog->aux->xdp_has_frags &&
> +	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> +		/* linearize data for XDP */
> +		xdp_page = xdp_linearize_page(rq, num_buf,
> +					      *page, offset,
> +					      VIRTIO_XDP_HEADROOM,
> +					      len);
> +
> +		if (!xdp_page)
> +			return NULL;
> +	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> +		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> +					  sizeof(struct skb_shared_info));
> +		if (*len + xdp_room > PAGE_SIZE)
> +			return NULL;
> +
> +		xdp_page = alloc_page(GFP_ATOMIC);
> +		if (!xdp_page)
> +			return NULL;
> +
> +		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> +		       page_address(*page) + offset, *len);

It seems the above 'else if' was not really tested even before this patch,
as there is no "--*num_buf" if xdp_linearize_page() is not called, which
may causes virtnet_build_xdp_buff_mrg() to comsume one more buffer than
expected?

Also, it seems better to split the xdp_linearize_page() to two functions
as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
the other one to do the linearizing.


> +	} else {
> +		return page_address(*page) + offset;
> +	}
> +
> +	*frame_sz = PAGE_SIZE;
> +
> +	put_page(*page);
> +
> +	*page = xdp_page;
> +
> +	return page_address(xdp_page) + VIRTIO_XDP_HEADROOM;
> +}
> +
>  static struct sk_buff *receive_mergeable(struct net_device *dev,
>  					 struct virtnet_info *vi,
>  					 struct receive_queue *rq,
> @@ -1181,7 +1254,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
>  	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
>  	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
> -	unsigned int frame_sz, xdp_room;
> +	unsigned int frame_sz;
>  	int err;
>  
>  	head_skb = NULL;
> @@ -1211,65 +1284,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  		u32 act;
>  		int i;
>  
> -		/* Transient failure which in theory could occur if
> -		 * in-flight packets from before XDP was enabled reach
> -		 * the receive path after XDP is loaded.
> -		 */
> -		if (unlikely(hdr->hdr.gso_type))
> +		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
> +					     offset, &len, hdr);
> +		if (!data)

unlikely().

>  			goto err_xdp;
>  
> -		/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> -		 * with headroom may add hole in truesize, which
> -		 * make their length exceed PAGE_SIZE. So we disabled the
> -		 * hole mechanism for xdp. See add_recvbuf_mergeable().
> -		 */
> -		frame_sz = truesize;
> -
> -		/* This happens when headroom is not enough because
> -		 * of the buffer was prefilled before XDP is set.
> -		 * This should only happen for the first several packets.
> -		 * In fact, vq reset can be used here to help us clean up
> -		 * the prefilled buffers, but many existing devices do not
> -		 * support it, and we don't want to bother users who are
> -		 * using xdp normally.
> -		 */
> -		if (!xdp_prog->aux->xdp_has_frags &&
> -		    (num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> -			/* linearize data for XDP */
> -			xdp_page = xdp_linearize_page(rq, &num_buf,
> -						      page, offset,
> -						      VIRTIO_XDP_HEADROOM,
> -						      &len);
> -			frame_sz = PAGE_SIZE;
> -
> -			if (!xdp_page)
> -				goto err_xdp;
> -			offset = VIRTIO_XDP_HEADROOM;
> -
> -			put_page(page);
> -			page = xdp_page;
> -		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> -			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> -						  sizeof(struct skb_shared_info));
> -			if (len + xdp_room > PAGE_SIZE)
> -				goto err_xdp;
> -
> -			xdp_page = alloc_page(GFP_ATOMIC);
> -			if (!xdp_page)
> -				goto err_xdp;
> -
> -			memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> -			       page_address(page) + offset, len);
> -			frame_sz = PAGE_SIZE;
> -			offset = VIRTIO_XDP_HEADROOM;
> -
> -			put_page(page);
> -			page = xdp_page;
> -		} else {
> -			xdp_page = page;
> -		}
> -
> -		data = page_address(xdp_page) + offset;
>  		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
>  						 &num_buf, &xdp_frags_truesz, stats);
>  		if (unlikely(err))
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 7/8] virtio_net: introduce receive_mergeable_xdp()
  2023-03-22  3:03 ` [PATCH net-next 7/8] virtio_net: introduce receive_mergeable_xdp() Xuan Zhuo
@ 2023-03-22 23:43   ` kernel test robot
  0 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2023-03-22 23:43 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: llvm, oe-kbuild-all, Michael S. Tsirkin, Jason Wang,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	virtualization, bpf

Hi Xuan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Xuan-Zhuo/virtio_net-mergeable-xdp-put-old-page-immediately/20230322-110445
patch link:    https://lore.kernel.org/r/20230322030308.16046-8-xuanzhuo%40linux.alibaba.com
patch subject: [PATCH net-next 7/8] virtio_net: introduce receive_mergeable_xdp()
config: i386-randconfig-a002 (https://download.01.org/0day-ci/archive/20230323/202303230720.XUavHilr-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/c00edb888e239eb9eb468c0e93419f373f5e72a7
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Xuan-Zhuo/virtio_net-mergeable-xdp-put-old-page-immediately/20230322-110445
        git checkout c00edb888e239eb9eb468c0e93419f373f5e72a7
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/net/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303230720.XUavHilr-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/virtio_net.c:1399:6: warning: variable 'page' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
           if (unlikely(len > truesize - room)) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:78:22: note: expanded from macro 'unlikely'
   # define unlikely(x)    __builtin_expect(!!(x), 0)
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1490:11: note: uninitialized use occurs here
           put_page(page);
                    ^~~~
   drivers/net/virtio_net.c:1399:2: note: remove the 'if' if its condition is always false
           if (unlikely(len > truesize - room)) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1392:19: note: initialize the variable 'page' to silence this warning
           struct page *page;
                            ^
                             = NULL
>> drivers/net/virtio_net.c:1399:6: warning: variable 'num_buf' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
           if (unlikely(len > truesize - room)) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:78:22: note: expanded from macro 'unlikely'
   # define unlikely(x)    __builtin_expect(!!(x), 0)
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1491:25: note: uninitialized use occurs here
           mergeable_buf_free(rq, num_buf, dev, stats);
                                  ^~~~~~~
   drivers/net/virtio_net.c:1399:2: note: remove the 'if' if its condition is always false
           if (unlikely(len > truesize - room)) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1393:13: note: initialize the variable 'num_buf' to silence this warning
           int num_buf;
                      ^
                       = 0
   2 warnings generated.


vim +1399 drivers/net/virtio_net.c

  1375	
  1376	static struct sk_buff *receive_mergeable(struct net_device *dev,
  1377						 struct virtnet_info *vi,
  1378						 struct receive_queue *rq,
  1379						 void *buf,
  1380						 void *ctx,
  1381						 unsigned int len,
  1382						 unsigned int *xdp_xmit,
  1383						 struct virtnet_rq_stats *stats)
  1384	{
  1385		unsigned int truesize = mergeable_ctx_to_truesize(ctx);
  1386		unsigned int headroom = mergeable_ctx_to_headroom(ctx);
  1387		unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
  1388		unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
  1389		struct virtio_net_hdr_mrg_rxbuf *hdr;
  1390		struct sk_buff *head_skb, *curr_skb;
  1391		struct bpf_prog *xdp_prog;
  1392		struct page *page;
  1393		int num_buf;
  1394		int offset;
  1395	
  1396		head_skb = NULL;
  1397		stats->bytes += len - vi->hdr_len;
  1398	
> 1399		if (unlikely(len > truesize - room)) {
  1400			pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
  1401				 dev->name, len, (unsigned long)(truesize - room));
  1402			dev->stats.rx_length_errors++;
  1403			goto err_skb;
  1404		}
  1405	
  1406		if (likely(vi->xdp_enabled)) {
  1407			rcu_read_lock();
  1408			xdp_prog = rcu_dereference(rq->xdp_prog);
  1409			if (xdp_prog) {
  1410				head_skb = receive_mergeable_xdp(dev, vi, rq, xdp_prog,
  1411								 buf, ctx, len, xdp_xmit,
  1412								 stats);
  1413				rcu_read_unlock();
  1414				return head_skb;
  1415			}
  1416			rcu_read_unlock();
  1417		}
  1418	
  1419		hdr = buf;
  1420		num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
  1421		page = virt_to_head_page(buf);
  1422		offset = buf - page_address(page);
  1423	
  1424		head_skb = page_to_skb(vi, rq, page, offset, len, truesize, headroom);
  1425		curr_skb = head_skb;
  1426	
  1427		if (unlikely(!curr_skb))
  1428			goto err_skb;
  1429		while (--num_buf) {
  1430			int num_skb_frags;
  1431	
  1432			buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx);
  1433			if (unlikely(!buf)) {
  1434				pr_debug("%s: rx error: %d buffers out of %d missing\n",
  1435					 dev->name, num_buf,
  1436					 virtio16_to_cpu(vi->vdev,
  1437							 hdr->num_buffers));
  1438				dev->stats.rx_length_errors++;
  1439				goto err_buf;
  1440			}
  1441	
  1442			stats->bytes += len;
  1443			page = virt_to_head_page(buf);
  1444	
  1445			truesize = mergeable_ctx_to_truesize(ctx);
  1446			headroom = mergeable_ctx_to_headroom(ctx);
  1447			tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
  1448			room = SKB_DATA_ALIGN(headroom + tailroom);
  1449			if (unlikely(len > truesize - room)) {
  1450				pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
  1451					 dev->name, len, (unsigned long)(truesize - room));
  1452				dev->stats.rx_length_errors++;
  1453				goto err_skb;
  1454			}
  1455	
  1456			num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
  1457			if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
  1458				struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC);
  1459	
  1460				if (unlikely(!nskb))
  1461					goto err_skb;
  1462				if (curr_skb == head_skb)
  1463					skb_shinfo(curr_skb)->frag_list = nskb;
  1464				else
  1465					curr_skb->next = nskb;
  1466				curr_skb = nskb;
  1467				head_skb->truesize += nskb->truesize;
  1468				num_skb_frags = 0;
  1469			}
  1470			if (curr_skb != head_skb) {
  1471				head_skb->data_len += len;
  1472				head_skb->len += len;
  1473				head_skb->truesize += truesize;
  1474			}
  1475			offset = buf - page_address(page);
  1476			if (skb_can_coalesce(curr_skb, num_skb_frags, page, offset)) {
  1477				put_page(page);
  1478				skb_coalesce_rx_frag(curr_skb, num_skb_frags - 1,
  1479						     len, truesize);
  1480			} else {
  1481				skb_add_rx_frag(curr_skb, num_skb_frags, page,
  1482						offset, len, truesize);
  1483			}
  1484		}
  1485	
  1486		ewma_pkt_len_add(&rq->mrg_avg_pkt_len, head_skb->len);
  1487		return head_skb;
  1488	
  1489	err_skb:
  1490		put_page(page);
  1491		mergeable_buf_free(rq, num_buf, dev, stats);
  1492	
  1493	err_buf:
  1494		stats->drops++;
  1495		dev_kfree_skb(head_skb);
  1496		return NULL;
  1497	}
  1498	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately
  2023-03-22  8:22   ` Yunsheng Lin
@ 2023-03-23  1:36     ` Xuan Zhuo
  2023-03-23  3:38       ` Yunsheng Lin
  2023-03-23  5:38       ` Jason Wang
  0 siblings, 2 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-23  1:36 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Wed, 22 Mar 2023 16:22:18 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> On 2023/3/22 11:03, Xuan Zhuo wrote:
> > In the xdp implementation of virtio-net mergeable, it always checks
> > whether two page is used and a page is selected to release. This is
> > complicated for the processing of action, and be careful.
> >
> > In the entire process, we have such principles:
> > * If xdp_page is used (PASS, TX, Redirect), then we release the old
> >   page.
> > * If it is a drop case, we will release two. The old page obtained from
> >   buf is release inside err_xdp, and xdp_page needs be relased by us.
> >
> > But in fact, when we allocate a new page, we can release the old page
> > immediately. Then just one is using, we just need to release the new
> > page for drop case. On the drop path, err_xdp will release the variable
> > "page", so we only need to let "page" point to the new xdp_page in
> > advance.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  drivers/net/virtio_net.c | 15 ++++++---------
> >  1 file changed, 6 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index e2560b6f7980..4d2bf1ce0730 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1245,6 +1245,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  			if (!xdp_page)
> >  				goto err_xdp;
> >  			offset = VIRTIO_XDP_HEADROOM;
> > +
> > +			put_page(page);
>
> the error handling of xdp_linearize_page() does not seems self contained.
> Does it not seem better:
> 1. if xdp_linearize_page() succesed, call put_page() for first buffer just
>    as put_page() is call for other buffer
> 2. or call virtqueue_get_buf() and put_page() for all the buffer of the packet
>    so the error handling is not needed outside the virtqueue_get_buf().
>
> In that case, it seems we can just do below without xdp_page:
> page = xdp_linearize_page(rq, num_buf, page, ...);


This does look better.

In fact, we already have vq reset, we can load XDP based on vq reset.
In this way, we can run without xdp_linearize_page.


>
>
> > +			page = xdp_page;
> >  		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> >  			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> >  						  sizeof(struct skb_shared_info));
> > @@ -1259,6 +1262,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  			       page_address(page) + offset, len);
> >  			frame_sz = PAGE_SIZE;
> >  			offset = VIRTIO_XDP_HEADROOM;
> > +
> > +			put_page(page);
> > +			page = xdp_page;
>
> It seems we can limit the scope of xdp_page in this "else if" block.
>
> >  		} else {
> >  			xdp_page = page;
> >  		}
>
> It seems the above else block is not needed anymore.

Yes, the follow-up patch has this optimization.


>
> > @@ -1278,8 +1284,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  			if (unlikely(!head_skb))
> >  				goto err_xdp_frags;
> >
> > -			if (unlikely(xdp_page != page))
> > -				put_page(page);
> >  			rcu_read_unlock();
> >  			return head_skb;
> >  		case XDP_TX:
> > @@ -1297,8 +1301,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  				goto err_xdp_frags;
> >  			}
> >  			*xdp_xmit |= VIRTIO_XDP_TX;
> > -			if (unlikely(xdp_page != page))
> > -				put_page(page);
> >  			rcu_read_unlock();
> >  			goto xdp_xmit;
> >  		case XDP_REDIRECT:
> > @@ -1307,8 +1309,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  			if (err)
> >  				goto err_xdp_frags;
> >  			*xdp_xmit |= VIRTIO_XDP_REDIR;
> > -			if (unlikely(xdp_page != page))
> > -				put_page(page);
> >  			rcu_read_unlock();
> >  			goto xdp_xmit;
> >  		default:
> > @@ -1321,9 +1321,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  			goto err_xdp_frags;
> >  		}
> >  err_xdp_frags:
> > -		if (unlikely(xdp_page != page))
> > -			__free_pages(xdp_page, 0);
>
> It seems __free_pages() and put_page() is used interchangeably here.
> Perhaps using __free_pages() have performance reason? As the comment below:
>
> https://elixir.bootlin.com/linux/v6.3-rc3/source/net/core/page_pool.c#L500


Yes, but now we don't seem to be very good to distinguish it. But I think
it doesn't matter. This logic is rare under actual situation.

Thanks.


>
> > -
> >  		if (xdp_buff_has_frags(&xdp)) {
> >  			shinfo = xdp_get_shared_info_from_buff(&xdp);
> >  			for (i = 0; i < shinfo->nr_frags; i++) {
> >

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-22 11:52   ` Yunsheng Lin
@ 2023-03-23  1:45     ` Xuan Zhuo
  2023-03-23  4:45       ` Yunsheng Lin
  0 siblings, 1 reply; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-23  1:45 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Wed, 22 Mar 2023 19:52:48 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> On 2023/3/22 11:03, Xuan Zhuo wrote:
> > Separating the logic of preparation for xdp from receive_mergeable.
> >
> > The purpose of this is to simplify the logic of execution of XDP.
> >
> > The main logic here is that when headroom is insufficient, we need to
> > allocate a new page and calculate offset. It should be noted that if
> > there is new page, the variable page will refer to the new page.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
> >  1 file changed, 77 insertions(+), 58 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 4d2bf1ce0730..bb426958cdd4 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
> >  	return 0;
> >  }
> >
> > +static void *mergeable_xdp_prepare(struct virtnet_info *vi,
> > +				   struct receive_queue *rq,
> > +				   struct bpf_prog *xdp_prog,
> > +				   void *ctx,
> > +				   unsigned int *frame_sz,
> > +				   int *num_buf,
> > +				   struct page **page,
> > +				   int offset,
> > +				   unsigned int *len,
> > +				   struct virtio_net_hdr_mrg_rxbuf *hdr)
>
> The naming convention seems to be xdp_prepare_mergeable().

What convention?


>
> > +{
> > +	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
> > +	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> > +	struct page *xdp_page;
> > +	unsigned int xdp_room;
> > +
> > +	/* Transient failure which in theory could occur if
> > +	 * in-flight packets from before XDP was enabled reach
> > +	 * the receive path after XDP is loaded.
> > +	 */
> > +	if (unlikely(hdr->hdr.gso_type))
> > +		return NULL;
> > +
> > +	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> > +	 * with headroom may add hole in truesize, which
> > +	 * make their length exceed PAGE_SIZE. So we disabled the
> > +	 * hole mechanism for xdp. See add_recvbuf_mergeable().
> > +	 */
> > +	*frame_sz = truesize;
> > +
> > +	/* This happens when headroom is not enough because
> > +	 * of the buffer was prefilled before XDP is set.
> > +	 * This should only happen for the first several packets.
> > +	 * In fact, vq reset can be used here to help us clean up
> > +	 * the prefilled buffers, but many existing devices do not
> > +	 * support it, and we don't want to bother users who are
> > +	 * using xdp normally.
> > +	 */
> > +	if (!xdp_prog->aux->xdp_has_frags &&
> > +	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> > +		/* linearize data for XDP */
> > +		xdp_page = xdp_linearize_page(rq, num_buf,
> > +					      *page, offset,
> > +					      VIRTIO_XDP_HEADROOM,
> > +					      len);
> > +
> > +		if (!xdp_page)
> > +			return NULL;
> > +	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> > +		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> > +					  sizeof(struct skb_shared_info));
> > +		if (*len + xdp_room > PAGE_SIZE)
> > +			return NULL;
> > +
> > +		xdp_page = alloc_page(GFP_ATOMIC);
> > +		if (!xdp_page)
> > +			return NULL;
> > +
> > +		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> > +		       page_address(*page) + offset, *len);
>
> It seems the above 'else if' was not really tested even before this patch,
> as there is no "--*num_buf" if xdp_linearize_page() is not called, which
> may causes virtnet_build_xdp_buff_mrg() to comsume one more buffer than
> expected?

Why do you think so?

>
> Also, it seems better to split the xdp_linearize_page() to two functions
> as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
> the other one to do the linearizing.

No skb here.


>
>
> > +	} else {
> > +		return page_address(*page) + offset;
> > +	}
> > +
> > +	*frame_sz = PAGE_SIZE;
> > +
> > +	put_page(*page);
> > +
> > +	*page = xdp_page;
> > +
> > +	return page_address(xdp_page) + VIRTIO_XDP_HEADROOM;
> > +}
> > +
> >  static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  					 struct virtnet_info *vi,
> >  					 struct receive_queue *rq,
> > @@ -1181,7 +1254,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> >  	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
> >  	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
> > -	unsigned int frame_sz, xdp_room;
> > +	unsigned int frame_sz;
> >  	int err;
> >
> >  	head_skb = NULL;
> > @@ -1211,65 +1284,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >  		u32 act;
> >  		int i;
> >
> > -		/* Transient failure which in theory could occur if
> > -		 * in-flight packets from before XDP was enabled reach
> > -		 * the receive path after XDP is loaded.
> > -		 */
> > -		if (unlikely(hdr->hdr.gso_type))
> > +		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
> > +					     offset, &len, hdr);
> > +		if (!data)
>
> unlikely().


Thanks.

>
> >  			goto err_xdp;
> >
> > -		/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> > -		 * with headroom may add hole in truesize, which
> > -		 * make their length exceed PAGE_SIZE. So we disabled the
> > -		 * hole mechanism for xdp. See add_recvbuf_mergeable().
> > -		 */
> > -		frame_sz = truesize;
> > -
> > -		/* This happens when headroom is not enough because
> > -		 * of the buffer was prefilled before XDP is set.
> > -		 * This should only happen for the first several packets.
> > -		 * In fact, vq reset can be used here to help us clean up
> > -		 * the prefilled buffers, but many existing devices do not
> > -		 * support it, and we don't want to bother users who are
> > -		 * using xdp normally.
> > -		 */
> > -		if (!xdp_prog->aux->xdp_has_frags &&
> > -		    (num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> > -			/* linearize data for XDP */
> > -			xdp_page = xdp_linearize_page(rq, &num_buf,
> > -						      page, offset,
> > -						      VIRTIO_XDP_HEADROOM,
> > -						      &len);
> > -			frame_sz = PAGE_SIZE;
> > -
> > -			if (!xdp_page)
> > -				goto err_xdp;
> > -			offset = VIRTIO_XDP_HEADROOM;
> > -
> > -			put_page(page);
> > -			page = xdp_page;
> > -		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> > -			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> > -						  sizeof(struct skb_shared_info));
> > -			if (len + xdp_room > PAGE_SIZE)
> > -				goto err_xdp;
> > -
> > -			xdp_page = alloc_page(GFP_ATOMIC);
> > -			if (!xdp_page)
> > -				goto err_xdp;
> > -
> > -			memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> > -			       page_address(page) + offset, len);
> > -			frame_sz = PAGE_SIZE;
> > -			offset = VIRTIO_XDP_HEADROOM;
> > -
> > -			put_page(page);
> > -			page = xdp_page;
> > -		} else {
> > -			xdp_page = page;
> > -		}
> > -
> > -		data = page_address(xdp_page) + offset;
> >  		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
> >  						 &num_buf, &xdp_frags_truesz, stats);
> >  		if (unlikely(err))
> >

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 8/8] virtio_net: introduce receive_small_xdp()
  2023-03-22  3:03 ` [PATCH net-next 8/8] virtio_net: introduce receive_small_xdp() Xuan Zhuo
@ 2023-03-23  2:06   ` kernel test robot
  0 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2023-03-23  2:06 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: llvm, oe-kbuild-all, Michael S. Tsirkin, Jason Wang,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	virtualization, bpf

Hi Xuan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Xuan-Zhuo/virtio_net-mergeable-xdp-put-old-page-immediately/20230322-110445
patch link:    https://lore.kernel.org/r/20230322030308.16046-9-xuanzhuo%40linux.alibaba.com
patch subject: [PATCH net-next 8/8] virtio_net: introduce receive_small_xdp()
config: i386-randconfig-a002 (https://download.01.org/0day-ci/archive/20230323/202303230953.y0P1e1Fc-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6445923dc4c91e57f98b8356d0bd706e95dafaa1
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Xuan-Zhuo/virtio_net-mergeable-xdp-put-old-page-immediately/20230322-110445
        git checkout 6445923dc4c91e57f98b8356d0bd706e95dafaa1
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/net/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303230953.y0P1e1Fc-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/virtio_net.c:969:6: warning: variable 'buflen' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
           if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:78:22: note: expanded from macro 'unlikely'
   # define unlikely(x)    __builtin_expect(!!(x), 0)
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:990:22: note: uninitialized use occurs here
           xdp_init_buff(&xdp, buflen, &rq->xdp_rxq);
                               ^~~~~~
   drivers/net/virtio_net.c:969:2: note: remove the 'if' if its condition is always true
           if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:958:21: note: initialize the variable 'buflen' to silence this warning
           unsigned int buflen;
                              ^
                               = 0
   drivers/net/virtio_net.c:1428:6: warning: variable 'page' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
           if (unlikely(len > truesize - room)) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:78:22: note: expanded from macro 'unlikely'
   # define unlikely(x)    __builtin_expect(!!(x), 0)
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1519:11: note: uninitialized use occurs here
           put_page(page);
                    ^~~~
   drivers/net/virtio_net.c:1428:2: note: remove the 'if' if its condition is always false
           if (unlikely(len > truesize - room)) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1421:19: note: initialize the variable 'page' to silence this warning
           struct page *page;
                            ^
                             = NULL
   drivers/net/virtio_net.c:1428:6: warning: variable 'num_buf' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
           if (unlikely(len > truesize - room)) {
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:78:22: note: expanded from macro 'unlikely'
   # define unlikely(x)    __builtin_expect(!!(x), 0)
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1520:25: note: uninitialized use occurs here
           mergeable_buf_free(rq, num_buf, dev, stats);
                                  ^~~~~~~
   drivers/net/virtio_net.c:1428:2: note: remove the 'if' if its condition is always false
           if (unlikely(len > truesize - room)) {
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/net/virtio_net.c:1422:13: note: initialize the variable 'num_buf' to silence this warning
           int num_buf;
                      ^
                       = 0
   3 warnings generated.


vim +969 drivers/net/virtio_net.c

4941d472bf95b4 Jason Wang             2017-07-19   941  
6445923dc4c91e Xuan Zhuo              2023-03-22   942  static struct sk_buff *receive_small_xdp(struct net_device *dev,
bb91accf27335c Jason Wang             2016-12-23   943  					 struct virtnet_info *vi,
bb91accf27335c Jason Wang             2016-12-23   944  					 struct receive_queue *rq,
6445923dc4c91e Xuan Zhuo              2023-03-22   945  					 struct bpf_prog *xdp_prog,
6445923dc4c91e Xuan Zhuo              2023-03-22   946  					 void *buf,
6445923dc4c91e Xuan Zhuo              2023-03-22   947  					 void *ctx,
186b3c998c50fc Jason Wang             2017-09-19   948  					 unsigned int len,
7d9d60fd4ab696 Toshiaki Makita        2018-07-23   949  					 unsigned int *xdp_xmit,
d46eeeaf99bcfa Jason Wang             2018-07-31   950  					 struct virtnet_rq_stats *stats)
f121159d72091f Michael S. Tsirkin     2013-11-28   951  {
4941d472bf95b4 Jason Wang             2017-07-19   952  	unsigned int xdp_headroom = (unsigned long)ctx;
f6b10209b90d48 Jason Wang             2017-02-21   953  	unsigned int header_offset = VIRTNET_RX_PAD + xdp_headroom;
f6b10209b90d48 Jason Wang             2017-02-21   954  	unsigned int headroom = vi->hdr_len + header_offset;
6445923dc4c91e Xuan Zhuo              2023-03-22   955  	struct virtio_net_hdr_mrg_rxbuf *hdr = buf + header_offset;
4941d472bf95b4 Jason Wang             2017-07-19   956  	struct page *page = virt_to_head_page(buf);
4941d472bf95b4 Jason Wang             2017-07-19   957  	struct page *xdp_page;
6445923dc4c91e Xuan Zhuo              2023-03-22   958  	unsigned int buflen;
0354e4d19cd5de John Fastabend         2017-02-02   959  	struct xdp_buff xdp;
6445923dc4c91e Xuan Zhuo              2023-03-22   960  	struct sk_buff *skb;
6445923dc4c91e Xuan Zhuo              2023-03-22   961  	unsigned int delta = 0;
6445923dc4c91e Xuan Zhuo              2023-03-22   962  	unsigned int metasize = 0;
f6b10209b90d48 Jason Wang             2017-02-21   963  	void *orig_data;
bb91accf27335c Jason Wang             2016-12-23   964  	u32 act;
bb91accf27335c Jason Wang             2016-12-23   965  
95dbe9e7b3720e Jesper Dangaard Brouer 2018-02-20   966  	if (unlikely(hdr->hdr.gso_type))
bb91accf27335c Jason Wang             2016-12-23   967  		goto err_xdp;
0354e4d19cd5de John Fastabend         2017-02-02   968  
4941d472bf95b4 Jason Wang             2017-07-19  @969  	if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
4941d472bf95b4 Jason Wang             2017-07-19   970  		int offset = buf - page_address(page) + header_offset;
4941d472bf95b4 Jason Wang             2017-07-19   971  		unsigned int tlen = len + vi->hdr_len;
981f14d42a7f16 Heng Qi                2023-01-31   972  		int num_buf = 1;
4941d472bf95b4 Jason Wang             2017-07-19   973  
4941d472bf95b4 Jason Wang             2017-07-19   974  		xdp_headroom = virtnet_get_headroom(vi);
4941d472bf95b4 Jason Wang             2017-07-19   975  		header_offset = VIRTNET_RX_PAD + xdp_headroom;
4941d472bf95b4 Jason Wang             2017-07-19   976  		headroom = vi->hdr_len + header_offset;
4941d472bf95b4 Jason Wang             2017-07-19   977  		buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
4941d472bf95b4 Jason Wang             2017-07-19   978  			SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
4941d472bf95b4 Jason Wang             2017-07-19   979  		xdp_page = xdp_linearize_page(rq, &num_buf, page,
4941d472bf95b4 Jason Wang             2017-07-19   980  					      offset, header_offset,
4941d472bf95b4 Jason Wang             2017-07-19   981  					      &tlen);
4941d472bf95b4 Jason Wang             2017-07-19   982  		if (!xdp_page)
4941d472bf95b4 Jason Wang             2017-07-19   983  			goto err_xdp;
4941d472bf95b4 Jason Wang             2017-07-19   984  
4941d472bf95b4 Jason Wang             2017-07-19   985  		buf = page_address(xdp_page);
4941d472bf95b4 Jason Wang             2017-07-19   986  		put_page(page);
4941d472bf95b4 Jason Wang             2017-07-19   987  		page = xdp_page;
4941d472bf95b4 Jason Wang             2017-07-19   988  	}
4941d472bf95b4 Jason Wang             2017-07-19   989  
43b5169d8355cc Lorenzo Bianconi       2020-12-22   990  	xdp_init_buff(&xdp, buflen, &rq->xdp_rxq);
be9df4aff65f18 Lorenzo Bianconi       2020-12-22   991  	xdp_prepare_buff(&xdp, buf + VIRTNET_RX_PAD + vi->hdr_len,
be9df4aff65f18 Lorenzo Bianconi       2020-12-22   992  			 xdp_headroom, len, true);
f6b10209b90d48 Jason Wang             2017-02-21   993  	orig_data = xdp.data;
4249e276b1ab30 Xuan Zhuo              2023-03-22   994  
4249e276b1ab30 Xuan Zhuo              2023-03-22   995  	act = virtnet_xdp_handler(xdp_prog, &xdp, dev, xdp_xmit, stats);
0354e4d19cd5de John Fastabend         2017-02-02   996  
bb91accf27335c Jason Wang             2016-12-23   997  	switch (act) {
4249e276b1ab30 Xuan Zhuo              2023-03-22   998  	case VIRTNET_XDP_RES_PASS:
2de2f7f40ef923 John Fastabend         2017-02-02   999  		/* Recalculate length in case bpf program changed it */
f6b10209b90d48 Jason Wang             2017-02-21  1000  		delta = orig_data - xdp.data;
6870de435b90c0 Nikita V. Shirokov     2018-04-17  1001  		len = xdp.data_end - xdp.data;
503d539a6e417b Yuya Kusakabe          2020-02-25  1002  		metasize = xdp.data - xdp.data_meta;
bb91accf27335c Jason Wang             2016-12-23  1003  		break;
4249e276b1ab30 Xuan Zhuo              2023-03-22  1004  
4249e276b1ab30 Xuan Zhuo              2023-03-22  1005  	case VIRTNET_XDP_RES_CONSUMED:
bb91accf27335c Jason Wang             2016-12-23  1006  		goto xdp_xmit;
4249e276b1ab30 Xuan Zhuo              2023-03-22  1007  
4249e276b1ab30 Xuan Zhuo              2023-03-22  1008  	case VIRTNET_XDP_RES_DROP:
bb91accf27335c Jason Wang             2016-12-23  1009  		goto err_xdp;
bb91accf27335c Jason Wang             2016-12-23  1010  	}
bb91accf27335c Jason Wang             2016-12-23  1011  
f6b10209b90d48 Jason Wang             2017-02-21  1012  	skb = build_skb(buf, buflen);
053c9e18c6f9cf Wenliang Wang          2021-12-16  1013  	if (!skb)
f6b10209b90d48 Jason Wang             2017-02-21  1014  		goto err;
6445923dc4c91e Xuan Zhuo              2023-03-22  1015  
f6b10209b90d48 Jason Wang             2017-02-21  1016  	skb_reserve(skb, headroom - delta);
6870de435b90c0 Nikita V. Shirokov     2018-04-17  1017  	skb_put(skb, len);
503d539a6e417b Yuya Kusakabe          2020-02-25  1018  	if (metasize)
503d539a6e417b Yuya Kusakabe          2020-02-25  1019  		skb_metadata_set(skb, metasize);
503d539a6e417b Yuya Kusakabe          2020-02-25  1020  
f121159d72091f Michael S. Tsirkin     2013-11-28  1021  	return skb;
bb91accf27335c Jason Wang             2016-12-23  1022  
bb91accf27335c Jason Wang             2016-12-23  1023  err_xdp:
d46eeeaf99bcfa Jason Wang             2018-07-31  1024  	stats->xdp_drops++;
053c9e18c6f9cf Wenliang Wang          2021-12-16  1025  err:
d46eeeaf99bcfa Jason Wang             2018-07-31  1026  	stats->drops++;
4941d472bf95b4 Jason Wang             2017-07-19  1027  	put_page(page);
bb91accf27335c Jason Wang             2016-12-23  1028  xdp_xmit:
bb91accf27335c Jason Wang             2016-12-23  1029  	return NULL;
f121159d72091f Michael S. Tsirkin     2013-11-28  1030  }
f121159d72091f Michael S. Tsirkin     2013-11-28  1031  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately
  2023-03-23  1:36     ` Xuan Zhuo
@ 2023-03-23  3:38       ` Yunsheng Lin
  2023-03-23  3:45         ` Xuan Zhuo
  2023-03-23  5:38       ` Jason Wang
  1 sibling, 1 reply; 33+ messages in thread
From: Yunsheng Lin @ 2023-03-23  3:38 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On 2023/3/23 9:36, Xuan Zhuo wrote:
> On Wed, 22 Mar 2023 16:22:18 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
>> On 2023/3/22 11:03, Xuan Zhuo wrote:
>>> In the xdp implementation of virtio-net mergeable, it always checks
>>> whether two page is used and a page is selected to release. This is
>>> complicated for the processing of action, and be careful.
>>>
>>> In the entire process, we have such principles:
>>> * If xdp_page is used (PASS, TX, Redirect), then we release the old
>>>   page.
>>> * If it is a drop case, we will release two. The old page obtained from
>>>   buf is release inside err_xdp, and xdp_page needs be relased by us.
>>>
>>> But in fact, when we allocate a new page, we can release the old page
>>> immediately. Then just one is using, we just need to release the new
>>> page for drop case. On the drop path, err_xdp will release the variable
>>> "page", so we only need to let "page" point to the new xdp_page in
>>> advance.
>>>
>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>> ---
>>>  drivers/net/virtio_net.c | 15 ++++++---------
>>>  1 file changed, 6 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index e2560b6f7980..4d2bf1ce0730 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -1245,6 +1245,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>>>  			if (!xdp_page)
>>>  				goto err_xdp;
>>>  			offset = VIRTIO_XDP_HEADROOM;
>>> +
>>> +			put_page(page);
>>
>> the error handling of xdp_linearize_page() does not seems self contained.
>> Does it not seem better:
>> 1. if xdp_linearize_page() succesed, call put_page() for first buffer just
>>    as put_page() is call for other buffer
>> 2. or call virtqueue_get_buf() and put_page() for all the buffer of the packet
>>    so the error handling is not needed outside the virtqueue_get_buf().
>>
>> In that case, it seems we can just do below without xdp_page:
>> page = xdp_linearize_page(rq, num_buf, page, ...);
> 
> 
> This does look better.
> 
> In fact, we already have vq reset, we can load XDP based on vq reset.
> In this way, we can run without xdp_linearize_page.

For compatibility, it is still needed, right?

> 
> 
>>
>>
>>> +			page = xdp_page;
>>>  		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
>>>  			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
>>>  						  sizeof(struct skb_shared_info));
>>> @@ -1259,6 +1262,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>>>  			       page_address(page) + offset, len);
>>>  			frame_sz = PAGE_SIZE;
>>>  			offset = VIRTIO_XDP_HEADROOM;
>>> +
>>> +			put_page(page);
>>> +			page = xdp_page;
>>
>> It seems we can limit the scope of xdp_page in this "else if" block.
>>
>>>  		} else {
>>>  			xdp_page = page;
>>>  		}
>>
>> It seems the above else block is not needed anymore.
> 
> Yes, the follow-up patch has this optimization.

Isn't refoctor patch supposed to be self-contianed too, instead of
depending on follow-up patch?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately
  2023-03-23  3:38       ` Yunsheng Lin
@ 2023-03-23  3:45         ` Xuan Zhuo
  0 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-23  3:45 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Thu, 23 Mar 2023 11:38:34 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> On 2023/3/23 9:36, Xuan Zhuo wrote:
> > On Wed, 22 Mar 2023 16:22:18 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >> On 2023/3/22 11:03, Xuan Zhuo wrote:
> >>> In the xdp implementation of virtio-net mergeable, it always checks
> >>> whether two page is used and a page is selected to release. This is
> >>> complicated for the processing of action, and be careful.
> >>>
> >>> In the entire process, we have such principles:
> >>> * If xdp_page is used (PASS, TX, Redirect), then we release the old
> >>>   page.
> >>> * If it is a drop case, we will release two. The old page obtained from
> >>>   buf is release inside err_xdp, and xdp_page needs be relased by us.
> >>>
> >>> But in fact, when we allocate a new page, we can release the old page
> >>> immediately. Then just one is using, we just need to release the new
> >>> page for drop case. On the drop path, err_xdp will release the variable
> >>> "page", so we only need to let "page" point to the new xdp_page in
> >>> advance.
> >>>
> >>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> >>> ---
> >>>  drivers/net/virtio_net.c | 15 ++++++---------
> >>>  1 file changed, 6 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>> index e2560b6f7980..4d2bf1ce0730 100644
> >>> --- a/drivers/net/virtio_net.c
> >>> +++ b/drivers/net/virtio_net.c
> >>> @@ -1245,6 +1245,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >>>  			if (!xdp_page)
> >>>  				goto err_xdp;
> >>>  			offset = VIRTIO_XDP_HEADROOM;
> >>> +
> >>> +			put_page(page);
> >>
> >> the error handling of xdp_linearize_page() does not seems self contained.
> >> Does it not seem better:
> >> 1. if xdp_linearize_page() succesed, call put_page() for first buffer just
> >>    as put_page() is call for other buffer
> >> 2. or call virtqueue_get_buf() and put_page() for all the buffer of the packet
> >>    so the error handling is not needed outside the virtqueue_get_buf().
> >>
> >> In that case, it seems we can just do below without xdp_page:
> >> page = xdp_linearize_page(rq, num_buf, page, ...);
> >
> >
> > This does look better.
> >
> > In fact, we already have vq reset, we can load XDP based on vq reset.
> > In this way, we can run without xdp_linearize_page.
>
> For compatibility, it is still needed, right?

Yes


>
> >
> >
> >>
> >>
> >>> +			page = xdp_page;
> >>>  		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> >>>  			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> >>>  						  sizeof(struct skb_shared_info));
> >>> @@ -1259,6 +1262,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >>>  			       page_address(page) + offset, len);
> >>>  			frame_sz = PAGE_SIZE;
> >>>  			offset = VIRTIO_XDP_HEADROOM;
> >>> +
> >>> +			put_page(page);
> >>> +			page = xdp_page;
> >>
> >> It seems we can limit the scope of xdp_page in this "else if" block.
> >>
> >>>  		} else {
> >>>  			xdp_page = page;
> >>>  		}
> >>
> >> It seems the above else block is not needed anymore.
> >
> > Yes, the follow-up patch has this optimization.
>
> Isn't refoctor patch supposed to be self-contianed too, instead of
> depending on follow-up patch?


I mean that the #2 patch do this.

Thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-23  1:45     ` Xuan Zhuo
@ 2023-03-23  4:45       ` Yunsheng Lin
  2023-03-23  4:52         ` Jakub Kicinski
                           ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Yunsheng Lin @ 2023-03-23  4:45 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On 2023/3/23 9:45, Xuan Zhuo wrote:
> On Wed, 22 Mar 2023 19:52:48 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
>> On 2023/3/22 11:03, Xuan Zhuo wrote:
>>> Separating the logic of preparation for xdp from receive_mergeable.
>>>
>>> The purpose of this is to simplify the logic of execution of XDP.
>>>
>>> The main logic here is that when headroom is insufficient, we need to
>>> allocate a new page and calculate offset. It should be noted that if
>>> there is new page, the variable page will refer to the new page.
>>>
>>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>> ---
>>>  drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
>>>  1 file changed, 77 insertions(+), 58 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 4d2bf1ce0730..bb426958cdd4 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
>>>  	return 0;
>>>  }
>>>
>>> +static void *mergeable_xdp_prepare(struct virtnet_info *vi,
>>> +				   struct receive_queue *rq,
>>> +				   struct bpf_prog *xdp_prog,
>>> +				   void *ctx,
>>> +				   unsigned int *frame_sz,
>>> +				   int *num_buf,
>>> +				   struct page **page,
>>> +				   int offset,
>>> +				   unsigned int *len,
>>> +				   struct virtio_net_hdr_mrg_rxbuf *hdr)
>>
>> The naming convention seems to be xdp_prepare_mergeable().
> 
> What convention?
> 
> 
>>
>>> +{
>>> +	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
>>> +	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
>>> +	struct page *xdp_page;
>>> +	unsigned int xdp_room;
>>> +
>>> +	/* Transient failure which in theory could occur if
>>> +	 * in-flight packets from before XDP was enabled reach
>>> +	 * the receive path after XDP is loaded.
>>> +	 */
>>> +	if (unlikely(hdr->hdr.gso_type))
>>> +		return NULL;
>>> +
>>> +	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
>>> +	 * with headroom may add hole in truesize, which
>>> +	 * make their length exceed PAGE_SIZE. So we disabled the
>>> +	 * hole mechanism for xdp. See add_recvbuf_mergeable().
>>> +	 */
>>> +	*frame_sz = truesize;
>>> +
>>> +	/* This happens when headroom is not enough because
>>> +	 * of the buffer was prefilled before XDP is set.
>>> +	 * This should only happen for the first several packets.
>>> +	 * In fact, vq reset can be used here to help us clean up
>>> +	 * the prefilled buffers, but many existing devices do not
>>> +	 * support it, and we don't want to bother users who are
>>> +	 * using xdp normally.
>>> +	 */
>>> +	if (!xdp_prog->aux->xdp_has_frags &&
>>> +	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
>>> +		/* linearize data for XDP */
>>> +		xdp_page = xdp_linearize_page(rq, num_buf,
>>> +					      *page, offset,
>>> +					      VIRTIO_XDP_HEADROOM,
>>> +					      len);
>>> +
>>> +		if (!xdp_page)
>>> +			return NULL;
>>> +	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
>>> +		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
>>> +					  sizeof(struct skb_shared_info));
>>> +		if (*len + xdp_room > PAGE_SIZE)
>>> +			return NULL;
>>> +
>>> +		xdp_page = alloc_page(GFP_ATOMIC);
>>> +		if (!xdp_page)
>>> +			return NULL;
>>> +
>>> +		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
>>> +		       page_address(*page) + offset, *len);
>>
>> It seems the above 'else if' was not really tested even before this patch,
>> as there is no "--*num_buf" if xdp_linearize_page() is not called, which
>> may causes virtnet_build_xdp_buff_mrg() to comsume one more buffer than
>> expected?
> 
> Why do you think so?


In first 'if' block, there is a "--*num_buf" before gotoing 'err_xdp'
for virtqueue_get_buf() failure in xdp_linearize_page().

But here there is no "--*num_buf" before gotoing 'err_xdp' for
alloc_page() failure.

So one of them has to be wrong, right?

> 
>>
>> Also, it seems better to split the xdp_linearize_page() to two functions
>> as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
>> the other one to do the linearizing.
> 
> No skb here.

I means following the semantics of pskb_expand_head() and __skb_linearize(),
not to combine the headroom expanding and linearizing into one function as
xdp_linearize_page() does now if we want a better refoctor result.

> 
> 
>>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-23  4:45       ` Yunsheng Lin
@ 2023-03-23  4:52         ` Jakub Kicinski
  2023-03-23  5:40         ` Jason Wang
  2023-03-23 10:59         ` Xuan Zhuo
  2 siblings, 0 replies; 33+ messages in thread
From: Jakub Kicinski @ 2023-03-23  4:52 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Xuan Zhuo, Michael S. Tsirkin, Jason Wang, David S. Miller,
	Eric Dumazet, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Thu, 23 Mar 2023 12:45:41 +0800 Yunsheng Lin wrote:
> >> Also, it seems better to split the xdp_linearize_page() to two functions
> >> as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
> >> the other one to do the linearizing.  
> > 
> > No skb here.  
> 
> I means following the semantics of pskb_expand_head() and __skb_linearize(),
> not to combine the headroom expanding and linearizing into one function as
> xdp_linearize_page() does now if we want a better refoctor result.

It's a driver-local function, if I was reading the code and saw
xdp_prepare_mergeable() I'd have thought it's a function from XDP core.
If anything the functions are missing local driver prefix. But it's
not a huge deal (given Michael's ping yesterday asking us if we can
expedite merging..)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately
  2023-03-23  1:36     ` Xuan Zhuo
  2023-03-23  3:38       ` Yunsheng Lin
@ 2023-03-23  5:38       ` Jason Wang
  2023-03-23  5:58         ` Xuan Zhuo
  1 sibling, 1 reply; 33+ messages in thread
From: Jason Wang @ 2023-03-23  5:38 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Yunsheng Lin, Michael S. Tsirkin, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Thu, Mar 23, 2023 at 9:43 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Wed, 22 Mar 2023 16:22:18 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> > On 2023/3/22 11:03, Xuan Zhuo wrote:
> > > In the xdp implementation of virtio-net mergeable, it always checks
> > > whether two page is used and a page is selected to release. This is
> > > complicated for the processing of action, and be careful.
> > >
> > > In the entire process, we have such principles:
> > > * If xdp_page is used (PASS, TX, Redirect), then we release the old
> > >   page.
> > > * If it is a drop case, we will release two. The old page obtained from
> > >   buf is release inside err_xdp, and xdp_page needs be relased by us.
> > >
> > > But in fact, when we allocate a new page, we can release the old page
> > > immediately. Then just one is using, we just need to release the new
> > > page for drop case. On the drop path, err_xdp will release the variable
> > > "page", so we only need to let "page" point to the new xdp_page in
> > > advance.
> > >
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  drivers/net/virtio_net.c | 15 ++++++---------
> > >  1 file changed, 6 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index e2560b6f7980..4d2bf1ce0730 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -1245,6 +1245,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > >                     if (!xdp_page)
> > >                             goto err_xdp;
> > >                     offset = VIRTIO_XDP_HEADROOM;
> > > +
> > > +                   put_page(page);
> >
> > the error handling of xdp_linearize_page() does not seems self contained.
> > Does it not seem better:
> > 1. if xdp_linearize_page() succesed, call put_page() for first buffer just
> >    as put_page() is call for other buffer
> > 2. or call virtqueue_get_buf() and put_page() for all the buffer of the packet
> >    so the error handling is not needed outside the virtqueue_get_buf().
> >
> > In that case, it seems we can just do below without xdp_page:
> > page = xdp_linearize_page(rq, num_buf, page, ...);
>
>
> This does look better.
>
> In fact, we already have vq reset, we can load XDP based on vq reset.
> In this way, we can run without xdp_linearize_page.

The goal is to try our best not to drop packets, so I think it's
better to keep it.

Thanks

>
>
> >
> >
> > > +                   page = xdp_page;
> > >             } else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> > >                     xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> > >                                               sizeof(struct skb_shared_info));
> > > @@ -1259,6 +1262,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > >                            page_address(page) + offset, len);
> > >                     frame_sz = PAGE_SIZE;
> > >                     offset = VIRTIO_XDP_HEADROOM;
> > > +
> > > +                   put_page(page);
> > > +                   page = xdp_page;
> >
> > It seems we can limit the scope of xdp_page in this "else if" block.
> >
> > >             } else {
> > >                     xdp_page = page;
> > >             }
> >
> > It seems the above else block is not needed anymore.
>
> Yes, the follow-up patch has this optimization.
>
>
> >
> > > @@ -1278,8 +1284,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > >                     if (unlikely(!head_skb))
> > >                             goto err_xdp_frags;
> > >
> > > -                   if (unlikely(xdp_page != page))
> > > -                           put_page(page);
> > >                     rcu_read_unlock();
> > >                     return head_skb;
> > >             case XDP_TX:
> > > @@ -1297,8 +1301,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > >                             goto err_xdp_frags;
> > >                     }
> > >                     *xdp_xmit |= VIRTIO_XDP_TX;
> > > -                   if (unlikely(xdp_page != page))
> > > -                           put_page(page);
> > >                     rcu_read_unlock();
> > >                     goto xdp_xmit;
> > >             case XDP_REDIRECT:
> > > @@ -1307,8 +1309,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > >                     if (err)
> > >                             goto err_xdp_frags;
> > >                     *xdp_xmit |= VIRTIO_XDP_REDIR;
> > > -                   if (unlikely(xdp_page != page))
> > > -                           put_page(page);
> > >                     rcu_read_unlock();
> > >                     goto xdp_xmit;
> > >             default:
> > > @@ -1321,9 +1321,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > >                     goto err_xdp_frags;
> > >             }
> > >  err_xdp_frags:
> > > -           if (unlikely(xdp_page != page))
> > > -                   __free_pages(xdp_page, 0);
> >
> > It seems __free_pages() and put_page() is used interchangeably here.
> > Perhaps using __free_pages() have performance reason? As the comment below:
> >
> > https://elixir.bootlin.com/linux/v6.3-rc3/source/net/core/page_pool.c#L500
>
>
> Yes, but now we don't seem to be very good to distinguish it. But I think
> it doesn't matter. This logic is rare under actual situation.
>
> Thanks.
>
>
> >
> > > -
> > >             if (xdp_buff_has_frags(&xdp)) {
> > >                     shinfo = xdp_get_shared_info_from_buff(&xdp);
> > >                     for (i = 0; i < shinfo->nr_frags; i++) {
> > >
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-23  4:45       ` Yunsheng Lin
  2023-03-23  4:52         ` Jakub Kicinski
@ 2023-03-23  5:40         ` Jason Wang
  2023-03-23  7:24           ` Yunsheng Lin
  2023-03-23 10:59         ` Xuan Zhuo
  2 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2023-03-23  5:40 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Xuan Zhuo, Michael S. Tsirkin, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

> >
> >>
> >> Also, it seems better to split the xdp_linearize_page() to two functions
> >> as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
> >> the other one to do the linearizing.
> >
> > No skb here.
>
> I means following the semantics of pskb_expand_head() and __skb_linearize(),
> not to combine the headroom expanding and linearizing into one function as
> xdp_linearize_page() does now if we want a better refoctor result.

Not sure it's worth it, since the use is very specific unless we could
find a case that wants only one of them.

Thanks

>
> >
> >
> >>
>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately
  2023-03-23  5:38       ` Jason Wang
@ 2023-03-23  5:58         ` Xuan Zhuo
  0 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-23  5:58 UTC (permalink / raw)
  To: Jason Wang
  Cc: Yunsheng Lin, Michael S. Tsirkin, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Thu, 23 Mar 2023 13:38:30 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Thu, Mar 23, 2023 at 9:43 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Wed, 22 Mar 2023 16:22:18 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> > > On 2023/3/22 11:03, Xuan Zhuo wrote:
> > > > In the xdp implementation of virtio-net mergeable, it always checks
> > > > whether two page is used and a page is selected to release. This is
> > > > complicated for the processing of action, and be careful.
> > > >
> > > > In the entire process, we have such principles:
> > > > * If xdp_page is used (PASS, TX, Redirect), then we release the old
> > > >   page.
> > > > * If it is a drop case, we will release two. The old page obtained from
> > > >   buf is release inside err_xdp, and xdp_page needs be relased by us.
> > > >
> > > > But in fact, when we allocate a new page, we can release the old page
> > > > immediately. Then just one is using, we just need to release the new
> > > > page for drop case. On the drop path, err_xdp will release the variable
> > > > "page", so we only need to let "page" point to the new xdp_page in
> > > > advance.
> > > >
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > ---
> > > >  drivers/net/virtio_net.c | 15 ++++++---------
> > > >  1 file changed, 6 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index e2560b6f7980..4d2bf1ce0730 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -1245,6 +1245,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > >                     if (!xdp_page)
> > > >                             goto err_xdp;
> > > >                     offset = VIRTIO_XDP_HEADROOM;
> > > > +
> > > > +                   put_page(page);
> > >
> > > the error handling of xdp_linearize_page() does not seems self contained.
> > > Does it not seem better:
> > > 1. if xdp_linearize_page() succesed, call put_page() for first buffer just
> > >    as put_page() is call for other buffer
> > > 2. or call virtqueue_get_buf() and put_page() for all the buffer of the packet
> > >    so the error handling is not needed outside the virtqueue_get_buf().
> > >
> > > In that case, it seems we can just do below without xdp_page:
> > > page = xdp_linearize_page(rq, num_buf, page, ...);
> >
> >
> > This does look better.
> >
> > In fact, we already have vq reset, we can load XDP based on vq reset.
> > In this way, we can run without xdp_linearize_page.
>
> The goal is to try our best not to drop packets, so I think it's
> better to keep it.


Yes. vq reset may drop some packets.

Thanks.

>
> Thanks
>
> >
> >
> > >
> > >
> > > > +                   page = xdp_page;
> > > >             } else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> > > >                     xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> > > >                                               sizeof(struct skb_shared_info));
> > > > @@ -1259,6 +1262,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > >                            page_address(page) + offset, len);
> > > >                     frame_sz = PAGE_SIZE;
> > > >                     offset = VIRTIO_XDP_HEADROOM;
> > > > +
> > > > +                   put_page(page);
> > > > +                   page = xdp_page;
> > >
> > > It seems we can limit the scope of xdp_page in this "else if" block.
> > >
> > > >             } else {
> > > >                     xdp_page = page;
> > > >             }
> > >
> > > It seems the above else block is not needed anymore.
> >
> > Yes, the follow-up patch has this optimization.
> >
> >
> > >
> > > > @@ -1278,8 +1284,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > >                     if (unlikely(!head_skb))
> > > >                             goto err_xdp_frags;
> > > >
> > > > -                   if (unlikely(xdp_page != page))
> > > > -                           put_page(page);
> > > >                     rcu_read_unlock();
> > > >                     return head_skb;
> > > >             case XDP_TX:
> > > > @@ -1297,8 +1301,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > >                             goto err_xdp_frags;
> > > >                     }
> > > >                     *xdp_xmit |= VIRTIO_XDP_TX;
> > > > -                   if (unlikely(xdp_page != page))
> > > > -                           put_page(page);
> > > >                     rcu_read_unlock();
> > > >                     goto xdp_xmit;
> > > >             case XDP_REDIRECT:
> > > > @@ -1307,8 +1309,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > >                     if (err)
> > > >                             goto err_xdp_frags;
> > > >                     *xdp_xmit |= VIRTIO_XDP_REDIR;
> > > > -                   if (unlikely(xdp_page != page))
> > > > -                           put_page(page);
> > > >                     rcu_read_unlock();
> > > >                     goto xdp_xmit;
> > > >             default:
> > > > @@ -1321,9 +1321,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> > > >                     goto err_xdp_frags;
> > > >             }
> > > >  err_xdp_frags:
> > > > -           if (unlikely(xdp_page != page))
> > > > -                   __free_pages(xdp_page, 0);
> > >
> > > It seems __free_pages() and put_page() is used interchangeably here.
> > > Perhaps using __free_pages() have performance reason? As the comment below:
> > >
> > > https://elixir.bootlin.com/linux/v6.3-rc3/source/net/core/page_pool.c#L500
> >
> >
> > Yes, but now we don't seem to be very good to distinguish it. But I think
> > it doesn't matter. This logic is rare under actual situation.
> >
> > Thanks.
> >
> >
> > >
> > > > -
> > > >             if (xdp_buff_has_frags(&xdp)) {
> > > >                     shinfo = xdp_get_shared_info_from_buff(&xdp);
> > > >                     for (i = 0; i < shinfo->nr_frags; i++) {
> > > >
> >
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-23  5:40         ` Jason Wang
@ 2023-03-23  7:24           ` Yunsheng Lin
  2023-03-23 11:04             ` Xuan Zhuo
  0 siblings, 1 reply; 33+ messages in thread
From: Yunsheng Lin @ 2023-03-23  7:24 UTC (permalink / raw)
  To: Jason Wang
  Cc: Xuan Zhuo, Michael S. Tsirkin, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On 2023/3/23 13:40, Jason Wang wrote:
>>>
>>>>
>>>> Also, it seems better to split the xdp_linearize_page() to two functions
>>>> as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
>>>> the other one to do the linearizing.
>>>
>>> No skb here.
>>
>> I means following the semantics of pskb_expand_head() and __skb_linearize(),
>> not to combine the headroom expanding and linearizing into one function as
>> xdp_linearize_page() does now if we want a better refoctor result.
> 
> Not sure it's worth it, since the use is very specific unless we could
> find a case that wants only one of them.

It seems receive_small() only need the headroom expanding one.
For receive_mergeable(), it seems we can split into the below cases:
1. " (!xdp_prog->aux->xdp_has_frags && (num_buf > 1 || headroom < virtnet_get_headroom(vi)))"
   case only need linearizing.
2. other cases only need headroom/tailroom expanding.

Anyway, it is your call to decide if you want to take this
opportunity do a better refoctoring to virtio_net.

> 
> Thanks
> 
>>
>>>
>>>
>>>>
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-23  4:45       ` Yunsheng Lin
  2023-03-23  4:52         ` Jakub Kicinski
  2023-03-23  5:40         ` Jason Wang
@ 2023-03-23 10:59         ` Xuan Zhuo
  2 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-23 10:59 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Thu, 23 Mar 2023 12:45:41 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> On 2023/3/23 9:45, Xuan Zhuo wrote:
> > On Wed, 22 Mar 2023 19:52:48 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >> On 2023/3/22 11:03, Xuan Zhuo wrote:
> >>> Separating the logic of preparation for xdp from receive_mergeable.
> >>>
> >>> The purpose of this is to simplify the logic of execution of XDP.
> >>>
> >>> The main logic here is that when headroom is insufficient, we need to
> >>> allocate a new page and calculate offset. It should be noted that if
> >>> there is new page, the variable page will refer to the new page.
> >>>
> >>> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> >>> ---
> >>>  drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
> >>>  1 file changed, 77 insertions(+), 58 deletions(-)
> >>>
> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>> index 4d2bf1ce0730..bb426958cdd4 100644
> >>> --- a/drivers/net/virtio_net.c
> >>> +++ b/drivers/net/virtio_net.c
> >>> @@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
> >>>  	return 0;
> >>>  }
> >>>
> >>> +static void *mergeable_xdp_prepare(struct virtnet_info *vi,
> >>> +				   struct receive_queue *rq,
> >>> +				   struct bpf_prog *xdp_prog,
> >>> +				   void *ctx,
> >>> +				   unsigned int *frame_sz,
> >>> +				   int *num_buf,
> >>> +				   struct page **page,
> >>> +				   int offset,
> >>> +				   unsigned int *len,
> >>> +				   struct virtio_net_hdr_mrg_rxbuf *hdr)
> >>
> >> The naming convention seems to be xdp_prepare_mergeable().
> >
> > What convention?
> >
> >
> >>
> >>> +{
> >>> +	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
> >>> +	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> >>> +	struct page *xdp_page;
> >>> +	unsigned int xdp_room;
> >>> +
> >>> +	/* Transient failure which in theory could occur if
> >>> +	 * in-flight packets from before XDP was enabled reach
> >>> +	 * the receive path after XDP is loaded.
> >>> +	 */
> >>> +	if (unlikely(hdr->hdr.gso_type))
> >>> +		return NULL;
> >>> +
> >>> +	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> >>> +	 * with headroom may add hole in truesize, which
> >>> +	 * make their length exceed PAGE_SIZE. So we disabled the
> >>> +	 * hole mechanism for xdp. See add_recvbuf_mergeable().
> >>> +	 */
> >>> +	*frame_sz = truesize;
> >>> +
> >>> +	/* This happens when headroom is not enough because
> >>> +	 * of the buffer was prefilled before XDP is set.
> >>> +	 * This should only happen for the first several packets.
> >>> +	 * In fact, vq reset can be used here to help us clean up
> >>> +	 * the prefilled buffers, but many existing devices do not
> >>> +	 * support it, and we don't want to bother users who are
> >>> +	 * using xdp normally.
> >>> +	 */
> >>> +	if (!xdp_prog->aux->xdp_has_frags &&
> >>> +	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> >>> +		/* linearize data for XDP */
> >>> +		xdp_page = xdp_linearize_page(rq, num_buf,
> >>> +					      *page, offset,
> >>> +					      VIRTIO_XDP_HEADROOM,
> >>> +					      len);
> >>> +
> >>> +		if (!xdp_page)
> >>> +			return NULL;
> >>> +	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> >>> +		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> >>> +					  sizeof(struct skb_shared_info));
> >>> +		if (*len + xdp_room > PAGE_SIZE)
> >>> +			return NULL;
> >>> +
> >>> +		xdp_page = alloc_page(GFP_ATOMIC);
> >>> +		if (!xdp_page)
> >>> +			return NULL;
> >>> +
> >>> +		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> >>> +		       page_address(*page) + offset, *len);
> >>
> >> It seems the above 'else if' was not really tested even before this patch,
> >> as there is no "--*num_buf" if xdp_linearize_page() is not called, which
> >> may causes virtnet_build_xdp_buff_mrg() to comsume one more buffer than
> >> expected?
> >
> > Why do you think so?
>
>
> In first 'if' block, there is a "--*num_buf" before gotoing 'err_xdp'
> for virtqueue_get_buf() failure in xdp_linearize_page().
>
> But here there is no "--*num_buf" before gotoing 'err_xdp' for
> alloc_page() failure.

Inside err_xdp, we will get all bufs and free them util num_buf is 0.

Thanks.

>
> So one of them has to be wrong, right?
>
> >
> >>
> >> Also, it seems better to split the xdp_linearize_page() to two functions
> >> as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
> >> the other one to do the linearizing.
> >
> > No skb here.
>
> I means following the semantics of pskb_expand_head() and __skb_linearize(),
> not to combine the headroom expanding and linearizing into one function as
> xdp_linearize_page() does now if we want a better refoctor result.
>
> >
> >
> >>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-23  7:24           ` Yunsheng Lin
@ 2023-03-23 11:04             ` Xuan Zhuo
  0 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-23 11:04 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Michael S. Tsirkin, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev, Jason Wang

On Thu, 23 Mar 2023 15:24:38 +0800, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> On 2023/3/23 13:40, Jason Wang wrote:
> >>>
> >>>>
> >>>> Also, it seems better to split the xdp_linearize_page() to two functions
> >>>> as pskb_expand_head() and __skb_linearize() do, one to expand the headroom,
> >>>> the other one to do the linearizing.
> >>>
> >>> No skb here.
> >>
> >> I means following the semantics of pskb_expand_head() and __skb_linearize(),
> >> not to combine the headroom expanding and linearizing into one function as
> >> xdp_linearize_page() does now if we want a better refoctor result.
> >
> > Not sure it's worth it, since the use is very specific unless we could
> > find a case that wants only one of them.
>
> It seems receive_small() only need the headroom expanding one.
> For receive_mergeable(), it seems we can split into the below cases:
> 1. " (!xdp_prog->aux->xdp_has_frags && (num_buf > 1 || headroom < virtnet_get_headroom(vi)))"
>    case only need linearizing.
> 2. other cases only need headroom/tailroom expanding.
>
> Anyway, it is your call to decide if you want to take this
> opportunity do a better refoctoring to virtio_net.

Compared to the chaotic state of the virtio-net XDP, this is a small point.
And I don’t think this brings any practical optimization. If you think this
division is better. You can submit a new patch on the top of this patch set.
I think the code can be clearer.

Thanks.

>
> >
> > Thanks
> >
> >>
> >>>
> >>>
> >>>>
> >>
> >
> >
> > .
> >

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-31  9:14   ` Jason Wang
@ 2023-04-03  4:11     ` Xuan Zhuo
  0 siblings, 0 replies; 33+ messages in thread
From: Xuan Zhuo @ 2023-04-03  4:11 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf,
	netdev

On Fri, 31 Mar 2023 17:14:33 +0800, Jason Wang <jasowang@redhat.com> wrote:
>
> 在 2023/3/28 20:04, Xuan Zhuo 写道:
> > Separating the logic of preparation for xdp from receive_mergeable.
> >
> > The purpose of this is to simplify the logic of execution of XDP.
> >
> > The main logic here is that when headroom is insufficient, we need to
> > allocate a new page and calculate offset. It should be noted that if
> > there is new page, the variable page will refer to the new page.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >   drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
> >   1 file changed, 77 insertions(+), 58 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 4d2bf1ce0730..bb426958cdd4 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
> >   	return 0;
> >   }
> >
> > +static void *mergeable_xdp_prepare(struct virtnet_info *vi,
> > +				   struct receive_queue *rq,
> > +				   struct bpf_prog *xdp_prog,
> > +				   void *ctx,
> > +				   unsigned int *frame_sz,
> > +				   int *num_buf,
> > +				   struct page **page,
> > +				   int offset,
> > +				   unsigned int *len,
> > +				   struct virtio_net_hdr_mrg_rxbuf *hdr)
> > +{
> > +	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
> > +	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> > +	struct page *xdp_page;
> > +	unsigned int xdp_room;
> > +
> > +	/* Transient failure which in theory could occur if
> > +	 * in-flight packets from before XDP was enabled reach
> > +	 * the receive path after XDP is loaded.
> > +	 */
> > +	if (unlikely(hdr->hdr.gso_type))
> > +		return NULL;
> > +
> > +	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> > +	 * with headroom may add hole in truesize, which
> > +	 * make their length exceed PAGE_SIZE. So we disabled the
> > +	 * hole mechanism for xdp. See add_recvbuf_mergeable().
> > +	 */
> > +	*frame_sz = truesize;
> > +
> > +	/* This happens when headroom is not enough because
> > +	 * of the buffer was prefilled before XDP is set.
> > +	 * This should only happen for the first several packets.
> > +	 * In fact, vq reset can be used here to help us clean up
> > +	 * the prefilled buffers, but many existing devices do not
> > +	 * support it, and we don't want to bother users who are
> > +	 * using xdp normally.
> > +	 */
> > +	if (!xdp_prog->aux->xdp_has_frags &&
> > +	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> > +		/* linearize data for XDP */
> > +		xdp_page = xdp_linearize_page(rq, num_buf,
> > +					      *page, offset,
> > +					      VIRTIO_XDP_HEADROOM,
> > +					      len);
> > +
> > +		if (!xdp_page)
> > +			return NULL;
> > +	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> > +		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> > +					  sizeof(struct skb_shared_info));
> > +		if (*len + xdp_room > PAGE_SIZE)
> > +			return NULL;
> > +
> > +		xdp_page = alloc_page(GFP_ATOMIC);
> > +		if (!xdp_page)
> > +			return NULL;
> > +
> > +		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> > +		       page_address(*page) + offset, *len);
> > +	} else {
> > +		return page_address(*page) + offset;
>
>
> This makes the code a little harder to be read than the original code.
>
> Why not do a verbatim moving without introducing new logic? (Or
> introducing new logic on top?)


Yes. Will fix.

Thanks.

>
> Thanks
>
>
> > +	}
> > +
> > +	*frame_sz = PAGE_SIZE;
> > +
> > +	put_page(*page);
> > +
> > +	*page = xdp_page;
> > +
> > +	return page_address(xdp_page) + VIRTIO_XDP_HEADROOM;
> > +}
> > +
> >   static struct sk_buff *receive_mergeable(struct net_device *dev,
> >   					 struct virtnet_info *vi,
> >   					 struct receive_queue *rq,
> > @@ -1181,7 +1254,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >   	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> >   	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
> >   	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
> > -	unsigned int frame_sz, xdp_room;
> > +	unsigned int frame_sz;
> >   	int err;
> >
> >   	head_skb = NULL;
> > @@ -1211,65 +1284,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
> >   		u32 act;
> >   		int i;
> >
> > -		/* Transient failure which in theory could occur if
> > -		 * in-flight packets from before XDP was enabled reach
> > -		 * the receive path after XDP is loaded.
> > -		 */
> > -		if (unlikely(hdr->hdr.gso_type))
> > +		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
> > +					     offset, &len, hdr);
> > +		if (!data)
> >   			goto err_xdp;
> >
> > -		/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> > -		 * with headroom may add hole in truesize, which
> > -		 * make their length exceed PAGE_SIZE. So we disabled the
> > -		 * hole mechanism for xdp. See add_recvbuf_mergeable().
> > -		 */
> > -		frame_sz = truesize;
> > -
> > -		/* This happens when headroom is not enough because
> > -		 * of the buffer was prefilled before XDP is set.
> > -		 * This should only happen for the first several packets.
> > -		 * In fact, vq reset can be used here to help us clean up
> > -		 * the prefilled buffers, but many existing devices do not
> > -		 * support it, and we don't want to bother users who are
> > -		 * using xdp normally.
> > -		 */
> > -		if (!xdp_prog->aux->xdp_has_frags &&
> > -		    (num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> > -			/* linearize data for XDP */
> > -			xdp_page = xdp_linearize_page(rq, &num_buf,
> > -						      page, offset,
> > -						      VIRTIO_XDP_HEADROOM,
> > -						      &len);
> > -			frame_sz = PAGE_SIZE;
> > -
> > -			if (!xdp_page)
> > -				goto err_xdp;
> > -			offset = VIRTIO_XDP_HEADROOM;
> > -
> > -			put_page(page);
> > -			page = xdp_page;
> > -		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> > -			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> > -						  sizeof(struct skb_shared_info));
> > -			if (len + xdp_room > PAGE_SIZE)
> > -				goto err_xdp;
> > -
> > -			xdp_page = alloc_page(GFP_ATOMIC);
> > -			if (!xdp_page)
> > -				goto err_xdp;
> > -
> > -			memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> > -			       page_address(page) + offset, len);
> > -			frame_sz = PAGE_SIZE;
> > -			offset = VIRTIO_XDP_HEADROOM;
> > -
> > -			put_page(page);
> > -			page = xdp_page;
> > -		} else {
> > -			xdp_page = page;
> > -		}
> > -
> > -		data = page_address(xdp_page) + offset;
> >   		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
> >   						 &num_buf, &xdp_frags_truesz, stats);
> >   		if (unlikely(err))
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-28 12:04 ` [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare Xuan Zhuo
@ 2023-03-31  9:14   ` Jason Wang
  2023-04-03  4:11     ` Xuan Zhuo
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Wang @ 2023-03-31  9:14 UTC (permalink / raw)
  To: Xuan Zhuo, netdev
  Cc: Michael S. Tsirkin, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf


在 2023/3/28 20:04, Xuan Zhuo 写道:
> Separating the logic of preparation for xdp from receive_mergeable.
>
> The purpose of this is to simplify the logic of execution of XDP.
>
> The main logic here is that when headroom is insufficient, we need to
> allocate a new page and calculate offset. It should be noted that if
> there is new page, the variable page will refer to the new page.
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>   drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
>   1 file changed, 77 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 4d2bf1ce0730..bb426958cdd4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
>   	return 0;
>   }
>   
> +static void *mergeable_xdp_prepare(struct virtnet_info *vi,
> +				   struct receive_queue *rq,
> +				   struct bpf_prog *xdp_prog,
> +				   void *ctx,
> +				   unsigned int *frame_sz,
> +				   int *num_buf,
> +				   struct page **page,
> +				   int offset,
> +				   unsigned int *len,
> +				   struct virtio_net_hdr_mrg_rxbuf *hdr)
> +{
> +	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
> +	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> +	struct page *xdp_page;
> +	unsigned int xdp_room;
> +
> +	/* Transient failure which in theory could occur if
> +	 * in-flight packets from before XDP was enabled reach
> +	 * the receive path after XDP is loaded.
> +	 */
> +	if (unlikely(hdr->hdr.gso_type))
> +		return NULL;
> +
> +	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> +	 * with headroom may add hole in truesize, which
> +	 * make their length exceed PAGE_SIZE. So we disabled the
> +	 * hole mechanism for xdp. See add_recvbuf_mergeable().
> +	 */
> +	*frame_sz = truesize;
> +
> +	/* This happens when headroom is not enough because
> +	 * of the buffer was prefilled before XDP is set.
> +	 * This should only happen for the first several packets.
> +	 * In fact, vq reset can be used here to help us clean up
> +	 * the prefilled buffers, but many existing devices do not
> +	 * support it, and we don't want to bother users who are
> +	 * using xdp normally.
> +	 */
> +	if (!xdp_prog->aux->xdp_has_frags &&
> +	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> +		/* linearize data for XDP */
> +		xdp_page = xdp_linearize_page(rq, num_buf,
> +					      *page, offset,
> +					      VIRTIO_XDP_HEADROOM,
> +					      len);
> +
> +		if (!xdp_page)
> +			return NULL;
> +	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> +		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> +					  sizeof(struct skb_shared_info));
> +		if (*len + xdp_room > PAGE_SIZE)
> +			return NULL;
> +
> +		xdp_page = alloc_page(GFP_ATOMIC);
> +		if (!xdp_page)
> +			return NULL;
> +
> +		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> +		       page_address(*page) + offset, *len);
> +	} else {
> +		return page_address(*page) + offset;


This makes the code a little harder to be read than the original code.

Why not do a verbatim moving without introducing new logic? (Or 
introducing new logic on top?)

Thanks


> +	}
> +
> +	*frame_sz = PAGE_SIZE;
> +
> +	put_page(*page);
> +
> +	*page = xdp_page;
> +
> +	return page_address(xdp_page) + VIRTIO_XDP_HEADROOM;
> +}
> +
>   static struct sk_buff *receive_mergeable(struct net_device *dev,
>   					 struct virtnet_info *vi,
>   					 struct receive_queue *rq,
> @@ -1181,7 +1254,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>   	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
>   	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
>   	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
> -	unsigned int frame_sz, xdp_room;
> +	unsigned int frame_sz;
>   	int err;
>   
>   	head_skb = NULL;
> @@ -1211,65 +1284,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>   		u32 act;
>   		int i;
>   
> -		/* Transient failure which in theory could occur if
> -		 * in-flight packets from before XDP was enabled reach
> -		 * the receive path after XDP is loaded.
> -		 */
> -		if (unlikely(hdr->hdr.gso_type))
> +		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
> +					     offset, &len, hdr);
> +		if (!data)
>   			goto err_xdp;
>   
> -		/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
> -		 * with headroom may add hole in truesize, which
> -		 * make their length exceed PAGE_SIZE. So we disabled the
> -		 * hole mechanism for xdp. See add_recvbuf_mergeable().
> -		 */
> -		frame_sz = truesize;
> -
> -		/* This happens when headroom is not enough because
> -		 * of the buffer was prefilled before XDP is set.
> -		 * This should only happen for the first several packets.
> -		 * In fact, vq reset can be used here to help us clean up
> -		 * the prefilled buffers, but many existing devices do not
> -		 * support it, and we don't want to bother users who are
> -		 * using xdp normally.
> -		 */
> -		if (!xdp_prog->aux->xdp_has_frags &&
> -		    (num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
> -			/* linearize data for XDP */
> -			xdp_page = xdp_linearize_page(rq, &num_buf,
> -						      page, offset,
> -						      VIRTIO_XDP_HEADROOM,
> -						      &len);
> -			frame_sz = PAGE_SIZE;
> -
> -			if (!xdp_page)
> -				goto err_xdp;
> -			offset = VIRTIO_XDP_HEADROOM;
> -
> -			put_page(page);
> -			page = xdp_page;
> -		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
> -			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
> -						  sizeof(struct skb_shared_info));
> -			if (len + xdp_room > PAGE_SIZE)
> -				goto err_xdp;
> -
> -			xdp_page = alloc_page(GFP_ATOMIC);
> -			if (!xdp_page)
> -				goto err_xdp;
> -
> -			memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> -			       page_address(page) + offset, len);
> -			frame_sz = PAGE_SIZE;
> -			offset = VIRTIO_XDP_HEADROOM;
> -
> -			put_page(page);
> -			page = xdp_page;
> -		} else {
> -			xdp_page = page;
> -		}
> -
> -		data = page_address(xdp_page) + offset;
>   		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
>   						 &num_buf, &xdp_frags_truesz, stats);
>   		if (unlikely(err))


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare
  2023-03-28 12:04 Xuan Zhuo
@ 2023-03-28 12:04 ` Xuan Zhuo
  2023-03-31  9:14   ` Jason Wang
  0 siblings, 1 reply; 33+ messages in thread
From: Xuan Zhuo @ 2023-03-28 12:04 UTC (permalink / raw)
  To: netdev
  Cc: Michael S. Tsirkin, Jason Wang, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, virtualization, bpf

Separating the logic of preparation for xdp from receive_mergeable.

The purpose of this is to simplify the logic of execution of XDP.

The main logic here is that when headroom is insufficient, we need to
allocate a new page and calculate offset. It should be noted that if
there is new page, the variable page will refer to the new page.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 135 ++++++++++++++++++++++-----------------
 1 file changed, 77 insertions(+), 58 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 4d2bf1ce0730..bb426958cdd4 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1162,6 +1162,79 @@ static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
 	return 0;
 }
 
+static void *mergeable_xdp_prepare(struct virtnet_info *vi,
+				   struct receive_queue *rq,
+				   struct bpf_prog *xdp_prog,
+				   void *ctx,
+				   unsigned int *frame_sz,
+				   int *num_buf,
+				   struct page **page,
+				   int offset,
+				   unsigned int *len,
+				   struct virtio_net_hdr_mrg_rxbuf *hdr)
+{
+	unsigned int truesize = mergeable_ctx_to_truesize(ctx);
+	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
+	struct page *xdp_page;
+	unsigned int xdp_room;
+
+	/* Transient failure which in theory could occur if
+	 * in-flight packets from before XDP was enabled reach
+	 * the receive path after XDP is loaded.
+	 */
+	if (unlikely(hdr->hdr.gso_type))
+		return NULL;
+
+	/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
+	 * with headroom may add hole in truesize, which
+	 * make their length exceed PAGE_SIZE. So we disabled the
+	 * hole mechanism for xdp. See add_recvbuf_mergeable().
+	 */
+	*frame_sz = truesize;
+
+	/* This happens when headroom is not enough because
+	 * of the buffer was prefilled before XDP is set.
+	 * This should only happen for the first several packets.
+	 * In fact, vq reset can be used here to help us clean up
+	 * the prefilled buffers, but many existing devices do not
+	 * support it, and we don't want to bother users who are
+	 * using xdp normally.
+	 */
+	if (!xdp_prog->aux->xdp_has_frags &&
+	    (*num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
+		/* linearize data for XDP */
+		xdp_page = xdp_linearize_page(rq, num_buf,
+					      *page, offset,
+					      VIRTIO_XDP_HEADROOM,
+					      len);
+
+		if (!xdp_page)
+			return NULL;
+	} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
+		xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
+					  sizeof(struct skb_shared_info));
+		if (*len + xdp_room > PAGE_SIZE)
+			return NULL;
+
+		xdp_page = alloc_page(GFP_ATOMIC);
+		if (!xdp_page)
+			return NULL;
+
+		memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
+		       page_address(*page) + offset, *len);
+	} else {
+		return page_address(*page) + offset;
+	}
+
+	*frame_sz = PAGE_SIZE;
+
+	put_page(*page);
+
+	*page = xdp_page;
+
+	return page_address(xdp_page) + VIRTIO_XDP_HEADROOM;
+}
+
 static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 struct virtnet_info *vi,
 					 struct receive_queue *rq,
@@ -1181,7 +1254,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
 	unsigned int tailroom = headroom ? sizeof(struct skb_shared_info) : 0;
 	unsigned int room = SKB_DATA_ALIGN(headroom + tailroom);
-	unsigned int frame_sz, xdp_room;
+	unsigned int frame_sz;
 	int err;
 
 	head_skb = NULL;
@@ -1211,65 +1284,11 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		u32 act;
 		int i;
 
-		/* Transient failure which in theory could occur if
-		 * in-flight packets from before XDP was enabled reach
-		 * the receive path after XDP is loaded.
-		 */
-		if (unlikely(hdr->hdr.gso_type))
+		data = mergeable_xdp_prepare(vi, rq, xdp_prog, ctx, &frame_sz, &num_buf, &page,
+					     offset, &len, hdr);
+		if (!data)
 			goto err_xdp;
 
-		/* Now XDP core assumes frag size is PAGE_SIZE, but buffers
-		 * with headroom may add hole in truesize, which
-		 * make their length exceed PAGE_SIZE. So we disabled the
-		 * hole mechanism for xdp. See add_recvbuf_mergeable().
-		 */
-		frame_sz = truesize;
-
-		/* This happens when headroom is not enough because
-		 * of the buffer was prefilled before XDP is set.
-		 * This should only happen for the first several packets.
-		 * In fact, vq reset can be used here to help us clean up
-		 * the prefilled buffers, but many existing devices do not
-		 * support it, and we don't want to bother users who are
-		 * using xdp normally.
-		 */
-		if (!xdp_prog->aux->xdp_has_frags &&
-		    (num_buf > 1 || headroom < virtnet_get_headroom(vi))) {
-			/* linearize data for XDP */
-			xdp_page = xdp_linearize_page(rq, &num_buf,
-						      page, offset,
-						      VIRTIO_XDP_HEADROOM,
-						      &len);
-			frame_sz = PAGE_SIZE;
-
-			if (!xdp_page)
-				goto err_xdp;
-			offset = VIRTIO_XDP_HEADROOM;
-
-			put_page(page);
-			page = xdp_page;
-		} else if (unlikely(headroom < virtnet_get_headroom(vi))) {
-			xdp_room = SKB_DATA_ALIGN(VIRTIO_XDP_HEADROOM +
-						  sizeof(struct skb_shared_info));
-			if (len + xdp_room > PAGE_SIZE)
-				goto err_xdp;
-
-			xdp_page = alloc_page(GFP_ATOMIC);
-			if (!xdp_page)
-				goto err_xdp;
-
-			memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
-			       page_address(page) + offset, len);
-			frame_sz = PAGE_SIZE;
-			offset = VIRTIO_XDP_HEADROOM;
-
-			put_page(page);
-			page = xdp_page;
-		} else {
-			xdp_page = page;
-		}
-
-		data = page_address(xdp_page) + offset;
 		err = virtnet_build_xdp_buff_mrg(dev, vi, rq, &xdp, data, len, frame_sz,
 						 &num_buf, &xdp_frags_truesz, stats);
 		if (unlikely(err))
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2023-04-03  4:12 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-22  3:03 [PATCH net-next 0/8] virtio_net: refactor xdp codes Xuan Zhuo
2023-03-22  3:03 ` [PATCH net-next 1/8] virtio_net: mergeable xdp: put old page immediately Xuan Zhuo
2023-03-22  8:22   ` Yunsheng Lin
2023-03-23  1:36     ` Xuan Zhuo
2023-03-23  3:38       ` Yunsheng Lin
2023-03-23  3:45         ` Xuan Zhuo
2023-03-23  5:38       ` Jason Wang
2023-03-23  5:58         ` Xuan Zhuo
2023-03-22  3:03 ` [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare Xuan Zhuo
2023-03-22 11:52   ` Yunsheng Lin
2023-03-23  1:45     ` Xuan Zhuo
2023-03-23  4:45       ` Yunsheng Lin
2023-03-23  4:52         ` Jakub Kicinski
2023-03-23  5:40         ` Jason Wang
2023-03-23  7:24           ` Yunsheng Lin
2023-03-23 11:04             ` Xuan Zhuo
2023-03-23 10:59         ` Xuan Zhuo
2023-03-22  3:03 ` [PATCH net-next 3/8] virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdp Xuan Zhuo
2023-03-22  3:03 ` [PATCH net-next 4/8] virtio_net: separate the logic of freeing xdp shinfo Xuan Zhuo
2023-03-22  3:03 ` [PATCH net-next 5/8] virtio_net: separate the logic of freeing the rest mergeable buf Xuan Zhuo
2023-03-22  3:03 ` [PATCH net-next 6/8] virtio_net: auto release xdp shinfo Xuan Zhuo
2023-03-22  3:03 ` [PATCH net-next 7/8] virtio_net: introduce receive_mergeable_xdp() Xuan Zhuo
2023-03-22 23:43   ` kernel test robot
2023-03-22  3:03 ` [PATCH net-next 8/8] virtio_net: introduce receive_small_xdp() Xuan Zhuo
2023-03-23  2:06   ` kernel test robot
2023-03-22  3:34 ` [PATCH net-next 0/8] virtio_net: refactor xdp codes Michael S. Tsirkin
2023-03-22  3:40   ` Xuan Zhuo
2023-03-22  3:53     ` Michael S. Tsirkin
2023-03-22  3:56       ` Xuan Zhuo
2023-03-22  3:59       ` Jakub Kicinski
2023-03-28 12:04 Xuan Zhuo
2023-03-28 12:04 ` [PATCH net-next 2/8] virtio_net: mergeable xdp: introduce mergeable_xdp_prepare Xuan Zhuo
2023-03-31  9:14   ` Jason Wang
2023-04-03  4:11     ` Xuan Zhuo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).