[net-next,2/2] virtio-net: simplify XDP handling in small buffer

Message ID 1519874345-10235-3-git-send-email-jasowang@redhat.com
State New, archived
Series
  • virtio-net: re enable XDP_REDIRECT for mergeable buffer

Commit Message

Jason Wang March 1, 2018, 3:19 a.m. UTC
We used to copy the packet via xdp_linearize_page() when the buffer
did not have sufficient headroom. This adds extra complexity without
helping performance, so remove it and switch to the generic XDP
routine to handle this case.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c | 93 ++++++------------------------------------------
 1 file changed, 10 insertions(+), 83 deletions(-)

Comments

Jesper Dangaard Brouer March 1, 2018, 8:02 a.m. UTC | #1
On Thu,  1 Mar 2018 11:19:05 +0800 Jason Wang <jasowang@redhat.com> wrote:

> We used to do data copy through xdp_linearize_page() for the buffer
> without sufficient headroom, it brings extra complexity without
> helping for the performance. So this patch remove it and switch to use
> generic XDP routine to handle this case.

I don't like where this is going.  I don't like intermixing the native
XDP and generic XDP in this way, for several reasons:

1. XDP generic is not feature complete, e.g. cpumap will drop these
   packets. It might not be possible to implement some features at all;
   think of (AF_XDP) zero-copy.

2. This can easily cause out-of-order packets.

3. It makes it harder to troubleshoot: when diagnosing issues
   around #1, we have a hard time determining which path an XDP packet
   took (the XDP tracepoints don't know).


[...]
> @@ -590,25 +526,14 @@ static struct sk_buff *receive_small(struct net_device *dev,
>  		if (unlikely(hdr->hdr.gso_type))
>  			goto err_xdp;
>  
> +		/* This happnes when headroom is not enough because
> +		 * the buffer was refilled before XDP is set. This
> +		 * only happen for several packets, for simplicity,
> +		 * offload them to generic XDP routine.

In my practical tests, I also saw that sometimes my ping packets were
traveling this code path, even long after XDP was attached.

This worries me a bit for troubleshooting purposes, as it can give
a strange user experience given point #1.


> +		 */
>  		if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
> -			int offset = buf - page_address(page) + header_offset;
> -			unsigned int tlen = len + vi->hdr_len;
> -			u16 num_buf = 1;
Jason Wang March 1, 2018, 8:49 a.m. UTC | #2
On March 1, 2018 at 16:02, Jesper Dangaard Brouer wrote:
> On Thu,  1 Mar 2018 11:19:05 +0800 Jason Wang <jasowang@redhat.com> wrote:
>
>> We used to do data copy through xdp_linearize_page() for the buffer
>> without sufficient headroom, it brings extra complexity without
>> helping for the performance. So this patch remove it and switch to use
>> generic XDP routine to handle this case.
> I don't like where this is going.  I don't like intermixing the native
> XDP and generic XDP in this way, for several reasons:
>
> 1. XDP generic is not feature complete, e.g. cpumap will drop these
>     packets.

Well, then we'd better fix it; otherwise it would be even worse than not 
having it. For cpumap, this can be done by queuing skbs in the pointer 
ring with some encoding/decoding.

>   It might not be possible to implement some features, think
>     of (AF_XDP) zero-copy.

Yes, but can AF_XDP support header adjustment? Considering the 
zero-copy assumption, maybe we need a way for the driver to tell AF_XDP 
whether or not zero-copy is supported.

>
> 2. This can easily cause out-of-order packets.

I may be missing something, but it looks to me like packets are still 
delivered in order? Or do you mean the packets that were dropped by cpumap?

>
> 3. It makes it harder to troubleshoot, when diagnosing issues
>     around #1, we have a hard time determining what path an XDP packet
>     took (the xdp tracepoints doesn't know).

Looks like we can address this by adding tracepoints to the generic code path.

>
>
> [...]
>> @@ -590,25 +526,14 @@ static struct sk_buff *receive_small(struct net_device *dev,
>>   		if (unlikely(hdr->hdr.gso_type))
>>   			goto err_xdp;
>>   
>> +		/* This happnes when headroom is not enough because
>> +		 * the buffer was refilled before XDP is set. This
>> +		 * only happen for several packets, for simplicity,
>> +		 * offload them to generic XDP routine.
> In my practical tests, I also saw that sometime my ping packets were
> traveling this code-path, even after a long time when XDP were attached.

I don't see this; probably a bug somewhere. I would like to reproduce 
it and see. How do you run the test? If only very few packets are 
received after XDP is set, we may still use buffers that were refilled 
before XDP, that's by design.

>
> This worries me a bit, for troubleshooting purposes... as this can give
> a strange user experience given point #1.



I can stick with the xdp_linearize_page() logic if you don't like the 
generic XDP approach, but that may not work for AF_XDP either. 
Considering AF_XDP is still some distance from being merged, I'm not 
sure we even need to care about it yet.

Thanks

>
>
>> +		 */
>>   		if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
>> -			int offset = buf - page_address(page) + header_offset;
>> -			unsigned int tlen = len + vi->hdr_len;
>> -			u16 num_buf = 1;
>
Jesper Dangaard Brouer March 1, 2018, 9:15 a.m. UTC | #3
On Thu, 1 Mar 2018 16:49:24 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > 2. This can easily cause out-of-order packets.  
> 
> I may miss something, but it looks to me packets were still delivered
> in order? Or you mean the packets that was dropped by cpumap?

No. Packets can now travel two code paths to the egress device: (1)
native XDP via ndo_xdp_xmit, with direct delivery into a
lockfree/dedicated TX queue, or (2) the normal network stack, which can
involve being queued in a qdisc. Do you see the possibility of the
reorder now?
Jason Wang March 1, 2018, 9:24 a.m. UTC | #4
On March 1, 2018 at 17:15, Jesper Dangaard Brouer wrote:
> On Thu, 1 Mar 2018 16:49:24 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
>>> 2. This can easily cause out-of-order packets.
>> I may miss something, but it looks to me packets were still delivered
>> in order? Or you mean the packets that was dropped by cpumap?
> No. Packets can now travel two code paths to the egress device. (1) XDP
> native via ndp_xdp_xmit via direct delivery into a lockfree/dedicated
> TX queue, (2) via normal network stack which can involve being queue in
> a qdisc.  Do you see the possibility of the reorder now?
>

I see, thanks. But considering this could only happen for the first few 
packets, I'm not sure it's worth worrying about.

Thanks

Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 81190ba..3f14948 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -474,69 +474,6 @@  static unsigned int virtnet_get_headroom(struct virtnet_info *vi)
 	return vi->xdp_queue_pairs ? VIRTIO_XDP_HEADROOM : 0;
 }
 
-/* We copy the packet for XDP in the following cases:
- *
- * 1) Packet is scattered across multiple rx buffers.
- * 2) Headroom space is insufficient.
- *
- * This is inefficient but it's a temporary condition that
- * we hit right after XDP is enabled and until queue is refilled
- * with large buffers with sufficient headroom - so it should affect
- * at most queue size packets.
- * Afterwards, the conditions to enable
- * XDP should preclude the underlying device from sending packets
- * across multiple buffers (num_buf > 1), and we make sure buffers
- * have enough headroom.
- */
-static struct page *xdp_linearize_page(struct receive_queue *rq,
-				       u16 *num_buf,
-				       struct page *p,
-				       int offset,
-				       int page_off,
-				       unsigned int *len)
-{
-	struct page *page = alloc_page(GFP_ATOMIC);
-
-	if (!page)
-		return NULL;
-
-	memcpy(page_address(page) + page_off, page_address(p) + offset, *len);
-	page_off += *len;
-
-	while (--*num_buf) {
-		unsigned int buflen;
-		void *buf;
-		int off;
-
-		buf = virtqueue_get_buf(rq->vq, &buflen);
-		if (unlikely(!buf))
-			goto err_buf;
-
-		p = virt_to_head_page(buf);
-		off = buf - page_address(p);
-
-		/* guard against a misconfigured or uncooperative backend that
-		 * is sending packet larger than the MTU.
-		 */
-		if ((page_off + buflen) > PAGE_SIZE) {
-			put_page(p);
-			goto err_buf;
-		}
-
-		memcpy(page_address(page) + page_off,
-		       page_address(p) + off, buflen);
-		page_off += buflen;
-		put_page(p);
-	}
-
-	/* Headroom does not contribute to packet length */
-	*len = page_off - VIRTIO_XDP_HEADROOM;
-	return page;
-err_buf:
-	__free_pages(page, 0);
-	return NULL;
-}
-
 static struct sk_buff *virtnet_skb_xdp(struct receive_queue *rq,
 				       struct sk_buff *skb)
 {
@@ -573,8 +510,7 @@  static struct sk_buff *receive_small(struct net_device *dev,
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	struct page *page = virt_to_head_page(buf);
 	unsigned int delta = 0;
-	struct page *xdp_page;
-	bool sent;
+	bool sent, skb_xdp = false;
 	int err;
 
 	len -= vi->hdr_len;
@@ -590,25 +526,14 @@  static struct sk_buff *receive_small(struct net_device *dev,
 		if (unlikely(hdr->hdr.gso_type))
 			goto err_xdp;
 
+		/* This happens when headroom is not enough because
+		 * the buffer was refilled before XDP was set. This
+		 * only happens for a few packets; for simplicity,
+		 * offload them to the generic XDP routine.
+		 */
 		if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
-			int offset = buf - page_address(page) + header_offset;
-			unsigned int tlen = len + vi->hdr_len;
-			u16 num_buf = 1;
-
-			xdp_headroom = virtnet_get_headroom(vi);
-			header_offset = VIRTNET_RX_PAD + xdp_headroom;
-			headroom = vi->hdr_len + header_offset;
-			buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
-				 SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-			xdp_page = xdp_linearize_page(rq, &num_buf, page,
-						      offset, header_offset,
-						      &tlen);
-			if (!xdp_page)
-				goto err_xdp;
-
-			buf = page_address(xdp_page);
-			put_page(page);
-			page = xdp_page;
+			skb_xdp = true;
+			goto skb_xdp;
 		}
 
 		xdp.data_hard_start = buf + VIRTNET_RX_PAD + vi->hdr_len;
@@ -650,6 +575,7 @@  static struct sk_buff *receive_small(struct net_device *dev,
 	}
 	rcu_read_unlock();
 
+skb_xdp:
 	skb = build_skb(buf, buflen);
 	if (!skb) {
 		put_page(page);
@@ -662,6 +588,7 @@  static struct sk_buff *receive_small(struct net_device *dev,
 		memcpy(skb_vnet_hdr(skb), buf, vi->hdr_len);
 	} /* keep zeroed vnet hdr since packet was changed by bpf */
 
+	skb = virtnet_skb_xdp(rq, skb);
 err:
 	return skb;