All of lore.kernel.org
 help / color / mirror / Atom feed
* [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net
@ 2018-02-20 13:31 Jesper Dangaard Brouer
  2018-02-20 13:32 ` [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case Jesper Dangaard Brouer
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2018-02-20 13:31 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, netdev, John Fastabend, Alexei Starovoitov,
	Saeed Mahameed, Jesper Dangaard Brouer, Daniel Borkmann,
	David S. Miller, Tariq Toukan

The virtio_net driver actually violates the original memory model of
XDP causing hard to debug crashes.  Per request of John Fastabend,
instead of removing the XDP feature I'm fixing as much as possible.
While testing virtio_net with XDP_REDIRECT I found 4 different bugs.

Patch-1: not enough tail-room for build_skb in receive_mergeable()
 only option is to disable XDP_REDIRECT in receive_mergeable()

Patch-2: XDP in receive_small() basically never worked (check wrong flag)

Patch-3: fix memory leak for XDP_REDIRECT in error cases

Patch-4: avoid crash when ndo_xdp_xmit is called on dev not ready for XDP

In the longer run, we should consider introducing a separate receive
function when attaching an XDP program, and also change the memory
model to be compatible with XDP when attaching an XDP prog.

---

Jesper Dangaard Brouer (4):
      virtio_net: disable XDP_REDIRECT in receive_mergeable() case
      virtio_net: fix XDP code path in receive_small()
      virtio_net: fix memory leak in XDP_REDIRECT
      virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP


 drivers/net/virtio_net.c |   58 +++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 24 deletions(-)

--

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case
  2018-02-20 13:31 [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net Jesper Dangaard Brouer
@ 2018-02-20 13:32 ` Jesper Dangaard Brouer
  2018-02-20 16:56   ` John Fastabend
  2018-02-27  0:40   ` Michael S. Tsirkin
  2018-02-20 13:32 ` [net PATCH 2/4] virtio_net: fix XDP code path in receive_small() Jesper Dangaard Brouer
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2018-02-20 13:32 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, netdev, John Fastabend, Alexei Starovoitov,
	Saeed Mahameed, Jesper Dangaard Brouer, Daniel Borkmann,
	David S. Miller, Tariq Toukan

The virtio_net code have three different RX code-paths in receive_buf().
Two of these code paths can handle XDP, but one of them is broken for
at least XDP_REDIRECT.

Function(1): receive_big() does not support XDP.
Function(2): receive_small() support XDP fully and uses build_skb().
Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb().

The simple explanation is that receive_mergeable() is broken because
it uses napi_alloc_skb(), which violates XDP given XDP assumes packet
header+data in single page and enough tail room for skb_shared_info.

The longer explaination is that receive_mergeable() tries to
work-around and satisfy these XDP requiresments e.g. by having a
function xdp_linearize_page() that allocates and memcpy RX buffers
around (in case packet is scattered across multiple rx buffers).  This
does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because
we have not implemented bpf_xdp_adjust_tail yet).

The XDP_REDIRECT action combined with cpumap is broken, and cause hard
to debug crashes.  The main issue is that the RX packet does not have
the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing
skb_shared_info to overlap the next packets head-room (in which cpumap
stores info).

Reproducing depend on the packet payload length and if RX-buffer size
happened to have tail-room for skb_shared_info or not.  But to make
this even harder to troubleshoot, the RX-buffer size is runtime
dynamically change based on an Exponentially Weighted Moving Average
(EWMA) over the packet length, when refilling RX rings.

This patch only disable XDP_REDIRECT support in receive_mergeable()
case, because it can cause a real crash.

IMHO we should consider NOT supporting XDP in receive_mergeable() at
all, because the principles behind XDP are to gain speed by (1) code
simplicity, (2) sacrificing memory and (3) where possible moving
runtime checks to setup time.  These principles are clearly being
violated in receive_mergeable(), that e.g. runtime track average
buffer size to save memory consumption.

In the longer run, we should consider introducing a separate receive
function when attaching an XDP program, and also change the memory
model to be compatible with XDP when attaching an XDP prog.

Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/virtio_net.c |    7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 626c27352ae2..0ca91942a884 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -677,7 +677,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	struct bpf_prog *xdp_prog;
 	unsigned int truesize;
 	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
-	int err;
 
 	head_skb = NULL;
 
@@ -754,12 +753,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 				goto err_xdp;
 			rcu_read_unlock();
 			goto xdp_xmit;
-		case XDP_REDIRECT:
-			err = xdp_do_redirect(dev, &xdp, xdp_prog);
-			if (!err)
-				*xdp_xmit = true;
-			rcu_read_unlock();
-			goto xdp_xmit;
 		default:
 			bpf_warn_invalid_xdp_action(act);
 		case XDP_ABORTED:

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net PATCH 2/4] virtio_net: fix XDP code path in receive_small()
  2018-02-20 13:31 [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net Jesper Dangaard Brouer
  2018-02-20 13:32 ` [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case Jesper Dangaard Brouer
@ 2018-02-20 13:32 ` Jesper Dangaard Brouer
  2018-02-20 16:57   ` John Fastabend
  2018-02-20 13:32 ` [net PATCH 3/4] virtio_net: fix memory leak in XDP_REDIRECT Jesper Dangaard Brouer
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2018-02-20 13:32 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, netdev, John Fastabend, Alexei Starovoitov,
	Saeed Mahameed, Jesper Dangaard Brouer, Daniel Borkmann,
	David S. Miller, Tariq Toukan

When configuring virtio_net to use the code path 'receive_small()',
in-order to get correct XDP_REDIRECT support, I discovered TCP packets
would get silently dropped when loading an XDP program action XDP_PASS.

The bug seems to be that receive_small() when XDP is loaded check that
hdr->hdr.flags is zero, which seems wrong as hdr.flags contains the
flags VIRTIO_NET_HDR_F_* :
 #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
 #define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */

TCP got dropped as it had the VIRTIO_NET_HDR_F_DATA_VALID flag set.

The flags that are relevant here are the VIRTIO_NET_HDR_GSO_* flags
stored in hdr->hdr.gso_type. Thus, the fix is just check that none of
the gso_type flags have been set.

Fixes: bb91accf2733 ("virtio-net: XDP support for small buffers")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/virtio_net.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0ca91942a884..10c8fc46b588 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -558,7 +558,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
 		void *orig_data;
 		u32 act;
 
-		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
+		if (unlikely(hdr->hdr.gso_type))
 			goto err_xdp;
 
 		if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net PATCH 3/4] virtio_net: fix memory leak in XDP_REDIRECT
  2018-02-20 13:31 [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net Jesper Dangaard Brouer
  2018-02-20 13:32 ` [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case Jesper Dangaard Brouer
  2018-02-20 13:32 ` [net PATCH 2/4] virtio_net: fix XDP code path in receive_small() Jesper Dangaard Brouer
@ 2018-02-20 13:32 ` Jesper Dangaard Brouer
  2018-02-20 16:59   ` John Fastabend
  2018-02-20 13:32 ` [net PATCH 4/4] virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP Jesper Dangaard Brouer
  2018-02-21 20:09 ` [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net David Miller
  4 siblings, 1 reply; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2018-02-20 13:32 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, netdev, John Fastabend, Alexei Starovoitov,
	Saeed Mahameed, Jesper Dangaard Brouer, Daniel Borkmann,
	David S. Miller, Tariq Toukan

XDP_REDIRECT calling xdp_do_redirect() can fail for multiple reasons
(which can be inspected by tracepoints). The current semantics is that
on failure the driver calling xdp_do_redirect() must handle freeing or
recycling the page associated with this frame.  This can be seen as an
optimization, as drivers usually have an optimized XDP_DROP code path
for frame recycling in place already.

The virtio_net driver didn't handle when xdp_do_redirect() failed.
This caused a memory leak as the page refcnt wasn't decremented on
failures.

The function __virtnet_xdp_xmit() did handle one type of failure,
when the xmit queue virtqueue_add_outbuf() is full, which "hides"
releasing a refcnt on the page.  Instead the function __virtnet_xdp_xmit()
must follow API of xdp_do_redirect(), which on errors leave it up to
the caller to free the page, of the failed send operation.

Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/virtio_net.c |   37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 10c8fc46b588..1e0e0fce3ab2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -443,12 +443,8 @@ static bool __virtnet_xdp_xmit(struct virtnet_info *vi,
 	sg_init_one(sq->sg, xdp->data, xdp->data_end - xdp->data);
 
 	err = virtqueue_add_outbuf(sq->vq, sq->sg, 1, xdp->data, GFP_ATOMIC);
-	if (unlikely(err)) {
-		struct page *page = virt_to_head_page(xdp->data);
-
-		put_page(page);
-		return false;
-	}
+	if (unlikely(err))
+		return false; /* Caller handle free/refcnt */
 
 	return true;
 }
@@ -546,8 +542,11 @@ static struct sk_buff *receive_small(struct net_device *dev,
 	unsigned int buflen = SKB_DATA_ALIGN(GOOD_PACKET_LEN + headroom) +
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	struct page *page = virt_to_head_page(buf);
-	unsigned int delta = 0, err;
+	unsigned int delta = 0;
 	struct page *xdp_page;
+	bool sent;
+	int err;
+
 	len -= vi->hdr_len;
 
 	rcu_read_lock();
@@ -596,16 +595,19 @@ static struct sk_buff *receive_small(struct net_device *dev,
 			delta = orig_data - xdp.data;
 			break;
 		case XDP_TX:
-			if (unlikely(!__virtnet_xdp_xmit(vi, &xdp)))
+			sent = __virtnet_xdp_xmit(vi, &xdp);
+			if (unlikely(!sent)) {
 				trace_xdp_exception(vi->dev, xdp_prog, act);
-			else
-				*xdp_xmit = true;
+				goto err_xdp;
+			}
+			*xdp_xmit = true;
 			rcu_read_unlock();
 			goto xdp_xmit;
 		case XDP_REDIRECT:
 			err = xdp_do_redirect(dev, &xdp, xdp_prog);
-			if (!err)
-				*xdp_xmit = true;
+			if (err)
+				goto err_xdp;
+			*xdp_xmit = true;
 			rcu_read_unlock();
 			goto xdp_xmit;
 		default:
@@ -677,6 +679,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	struct bpf_prog *xdp_prog;
 	unsigned int truesize;
 	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
+	bool sent;
 
 	head_skb = NULL;
 
@@ -745,10 +748,14 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			}
 			break;
 		case XDP_TX:
-			if (unlikely(!__virtnet_xdp_xmit(vi, &xdp)))
+			sent = __virtnet_xdp_xmit(vi, &xdp);
+			if (unlikely(!sent)) {
 				trace_xdp_exception(vi->dev, xdp_prog, act);
-			else
-				*xdp_xmit = true;
+				if (unlikely(xdp_page != page))
+					put_page(xdp_page);
+				goto err_xdp;
+			}
+			*xdp_xmit = true;
 			if (unlikely(xdp_page != page))
 				goto err_xdp;
 			rcu_read_unlock();

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [net PATCH 4/4] virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP
  2018-02-20 13:31 [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net Jesper Dangaard Brouer
                   ` (2 preceding siblings ...)
  2018-02-20 13:32 ` [net PATCH 3/4] virtio_net: fix memory leak in XDP_REDIRECT Jesper Dangaard Brouer
@ 2018-02-20 13:32 ` Jesper Dangaard Brouer
  2018-02-20 17:01   ` John Fastabend
  2018-02-21 20:09 ` [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net David Miller
  4 siblings, 1 reply; 13+ messages in thread
From: Jesper Dangaard Brouer @ 2018-02-20 13:32 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, netdev, John Fastabend, Alexei Starovoitov,
	Saeed Mahameed, Jesper Dangaard Brouer, Daniel Borkmann,
	David S. Miller, Tariq Toukan

When a driver implements the ndo_xdp_xmit() function, there is
(currently) no generic way to determine whether it is safe to call.

It is e.g. unsafe to call the drivers ndo_xdp_xmit, if it have not
allocated the needed XDP TX queues yet.  This is the case for
virtio_net, which first allocates the XDP TX queues once an XDP/bpf
prog is attached (in virtnet_xdp_set()).

Thus, a crash will occur for virtio_net when redirecting to another
virtio_net device's ndo_xdp_xmit, which have not attached a XDP prog.
The sample xdp_redirect_map tries to attach a dummy XDP prog to take
this into account, but it can also easily fail if the virtio_net (or
actually underlying vhost driver) have not allocated enough extra
queues for the device.

Allocating more queue this is currently a manual config.
Hint for libvirt XML add:

  <driver name='vhost' queues='16'>
    <host mrg_rxbuf='off'/>
    <guest tso4='off' tso6='off' ecn='off' ufo='off'/>
  </driver>

The solution in this patch is to check that the device have loaded an
XDP/bpf prog before proceeding.  This is similar to the check
performed in driver ixgbe.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 drivers/net/virtio_net.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1e0e0fce3ab2..9bb9e562b893 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -452,8 +452,18 @@ static bool __virtnet_xdp_xmit(struct virtnet_info *vi,
 static int virtnet_xdp_xmit(struct net_device *dev, struct xdp_buff *xdp)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
-	bool sent = __virtnet_xdp_xmit(vi, xdp);
+	struct receive_queue *rq = vi->rq;
+	struct bpf_prog *xdp_prog;
+	bool sent;
+
+	/* Only allow ndo_xdp_xmit if XDP is loaded on dev, as this
+	 * indicate XDP resources have been successfully allocated.
+	 */
+	xdp_prog = rcu_dereference(rq->xdp_prog);
+	if (!xdp_prog)
+		return -ENXIO;
 
+	sent = __virtnet_xdp_xmit(vi, xdp);
 	if (!sent)
 		return -ENOSPC;
 	return 0;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case
  2018-02-20 13:32 ` [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case Jesper Dangaard Brouer
@ 2018-02-20 16:56   ` John Fastabend
  2018-02-27  0:40   ` Michael S. Tsirkin
  1 sibling, 0 replies; 13+ messages in thread
From: John Fastabend @ 2018-02-20 16:56 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Jason Wang
  Cc: Michael S. Tsirkin, netdev, Alexei Starovoitov, Saeed Mahameed,
	Daniel Borkmann, David S. Miller, Tariq Toukan

On 02/20/2018 05:32 AM, Jesper Dangaard Brouer wrote:
> The virtio_net code have three different RX code-paths in receive_buf().
> Two of these code paths can handle XDP, but one of them is broken for
> at least XDP_REDIRECT.
> 
> Function(1): receive_big() does not support XDP.
> Function(2): receive_small() support XDP fully and uses build_skb().
> Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb().
> 
> The simple explanation is that receive_mergeable() is broken because
> it uses napi_alloc_skb(), which violates XDP given XDP assumes packet
> header+data in single page and enough tail room for skb_shared_info.
> 
> The longer explaination is that receive_mergeable() tries to
> work-around and satisfy these XDP requiresments e.g. by having a
> function xdp_linearize_page() that allocates and memcpy RX buffers
> around (in case packet is scattered across multiple rx buffers).  This
> does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because
> we have not implemented bpf_xdp_adjust_tail yet).
> 
> The XDP_REDIRECT action combined with cpumap is broken, and cause hard
> to debug crashes.  The main issue is that the RX packet does not have
> the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing
> skb_shared_info to overlap the next packets head-room (in which cpumap
> stores info).
> 
> Reproducing depend on the packet payload length and if RX-buffer size
> happened to have tail-room for skb_shared_info or not.  But to make
> this even harder to troubleshoot, the RX-buffer size is runtime
> dynamically change based on an Exponentially Weighted Moving Average
> (EWMA) over the packet length, when refilling RX rings.
> 
> This patch only disable XDP_REDIRECT support in receive_mergeable()
> case, because it can cause a real crash.
> 
> IMHO we should consider NOT supporting XDP in receive_mergeable() at
> all, because the principles behind XDP are to gain speed by (1) code
> simplicity, (2) sacrificing memory and (3) where possible moving
> runtime checks to setup time.  These principles are clearly being
> violated in receive_mergeable(), that e.g. runtime track average
> buffer size to save memory consumption.
> 
> In the longer run, we should consider introducing a separate receive
> function when attaching an XDP program, and also change the memory
> model to be compatible with XDP when attaching an XDP prog.
> 
> Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

Seems reasonable for net, ideally we can fix it in net-next.

Thanks,
John

>  drivers/net/virtio_net.c |    7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 626c27352ae2..0ca91942a884 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -677,7 +677,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  	struct bpf_prog *xdp_prog;
>  	unsigned int truesize;
>  	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> -	int err;
>  
>  	head_skb = NULL;
>  
> @@ -754,12 +753,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  				goto err_xdp;
>  			rcu_read_unlock();
>  			goto xdp_xmit;
> -		case XDP_REDIRECT:
> -			err = xdp_do_redirect(dev, &xdp, xdp_prog);
> -			if (!err)
> -				*xdp_xmit = true;
> -			rcu_read_unlock();
> -			goto xdp_xmit;
>  		default:
>  			bpf_warn_invalid_xdp_action(act);
>  		case XDP_ABORTED:
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [net PATCH 2/4] virtio_net: fix XDP code path in receive_small()
  2018-02-20 13:32 ` [net PATCH 2/4] virtio_net: fix XDP code path in receive_small() Jesper Dangaard Brouer
@ 2018-02-20 16:57   ` John Fastabend
  0 siblings, 0 replies; 13+ messages in thread
From: John Fastabend @ 2018-02-20 16:57 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Jason Wang
  Cc: Michael S. Tsirkin, netdev, Alexei Starovoitov, Saeed Mahameed,
	Daniel Borkmann, David S. Miller, Tariq Toukan

On 02/20/2018 05:32 AM, Jesper Dangaard Brouer wrote:
> When configuring virtio_net to use the code path 'receive_small()',
> in-order to get correct XDP_REDIRECT support, I discovered TCP packets
> would get silently dropped when loading an XDP program action XDP_PASS.
> 
> The bug seems to be that receive_small() when XDP is loaded check that
> hdr->hdr.flags is zero, which seems wrong as hdr.flags contains the
> flags VIRTIO_NET_HDR_F_* :
>  #define VIRTIO_NET_HDR_F_NEEDS_CSUM 1 /* Use csum_start, csum_offset */
>  #define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */
> 
> TCP got dropped as it had the VIRTIO_NET_HDR_F_DATA_VALID flag set.
> 
> The flags that are relevant here are the VIRTIO_NET_HDR_GSO_* flags
> stored in hdr->hdr.gso_type. Thus, the fix is just check that none of
> the gso_type flags have been set.
> 
> Fixes: bb91accf2733 ("virtio-net: XDP support for small buffers")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

Thanks!

>  drivers/net/virtio_net.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0ca91942a884..10c8fc46b588 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -558,7 +558,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
>  		void *orig_data;
>  		u32 act;
>  
> -		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
> +		if (unlikely(hdr->hdr.gso_type))
>  			goto err_xdp;
>  
>  		if (unlikely(xdp_headroom < virtnet_get_headroom(vi))) {
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [net PATCH 3/4] virtio_net: fix memory leak in XDP_REDIRECT
  2018-02-20 13:32 ` [net PATCH 3/4] virtio_net: fix memory leak in XDP_REDIRECT Jesper Dangaard Brouer
@ 2018-02-20 16:59   ` John Fastabend
  0 siblings, 0 replies; 13+ messages in thread
From: John Fastabend @ 2018-02-20 16:59 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Jason Wang
  Cc: Michael S. Tsirkin, netdev, Alexei Starovoitov, Saeed Mahameed,
	Daniel Borkmann, David S. Miller, Tariq Toukan

On 02/20/2018 05:32 AM, Jesper Dangaard Brouer wrote:
> XDP_REDIRECT calling xdp_do_redirect() can fail for multiple reasons
> (which can be inspected by tracepoints). The current semantics is that
> on failure the driver calling xdp_do_redirect() must handle freeing or
> recycling the page associated with this frame.  This can be seen as an
> optimization, as drivers usually have an optimized XDP_DROP code path
> for frame recycling in place already.
> 
> The virtio_net driver didn't handle when xdp_do_redirect() failed.
> This caused a memory leak as the page refcnt wasn't decremented on
> failures.
> 
> The function __virtnet_xdp_xmit() did handle one type of failure,
> when the xmit queue virtqueue_add_outbuf() is full, which "hides"
> releasing a refcnt on the page.  Instead the function __virtnet_xdp_xmit()
> must follow API of xdp_do_redirect(), which on errors leave it up to
> the caller to free the page, of the failed send operation.
> 
> Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [net PATCH 4/4] virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP
  2018-02-20 13:32 ` [net PATCH 4/4] virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP Jesper Dangaard Brouer
@ 2018-02-20 17:01   ` John Fastabend
  0 siblings, 0 replies; 13+ messages in thread
From: John Fastabend @ 2018-02-20 17:01 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Jason Wang
  Cc: Michael S. Tsirkin, netdev, Alexei Starovoitov, Saeed Mahameed,
	Daniel Borkmann, David S. Miller, Tariq Toukan

On 02/20/2018 05:32 AM, Jesper Dangaard Brouer wrote:
> When a driver implements the ndo_xdp_xmit() function, there is
> (currently) no generic way to determine whether it is safe to call.
> 
> It is e.g. unsafe to call the drivers ndo_xdp_xmit, if it have not
> allocated the needed XDP TX queues yet.  This is the case for
> virtio_net, which first allocates the XDP TX queues once an XDP/bpf
> prog is attached (in virtnet_xdp_set()).
> 
> Thus, a crash will occur for virtio_net when redirecting to another
> virtio_net device's ndo_xdp_xmit, which have not attached a XDP prog.
> The sample xdp_redirect_map tries to attach a dummy XDP prog to take
> this into account, but it can also easily fail if the virtio_net (or
> actually underlying vhost driver) have not allocated enough extra
> queues for the device.
> 
> Allocating more queue this is currently a manual config.
> Hint for libvirt XML add:
> 
>   <driver name='vhost' queues='16'>
>     <host mrg_rxbuf='off'/>
>     <guest tso4='off' tso6='off' ecn='off' ufo='off'/>
>   </driver>
> 
> The solution in this patch is to check that the device have loaded an
> XDP/bpf prog before proceeding.  This is similar to the check
> performed in driver ixgbe.
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---

Yep, nice catch. I caught this in ixgbe but didn't manage to
pull it into virtio.

Acked-by: John Fastabend <john.fastabend@gmail.com>

>  drivers/net/virtio_net.c |   12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 1e0e0fce3ab2..9bb9e562b893 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -452,8 +452,18 @@ static bool __virtnet_xdp_xmit(struct virtnet_info *vi,
>  static int virtnet_xdp_xmit(struct net_device *dev, struct xdp_buff *xdp)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> -	bool sent = __virtnet_xdp_xmit(vi, xdp);
> +	struct receive_queue *rq = vi->rq;
> +	struct bpf_prog *xdp_prog;
> +	bool sent;
> +
> +	/* Only allow ndo_xdp_xmit if XDP is loaded on dev, as this
> +	 * indicate XDP resources have been successfully allocated.
> +	 */
> +	xdp_prog = rcu_dereference(rq->xdp_prog);
> +	if (!xdp_prog)
> +		return -ENXIO;
>  
> +	sent = __virtnet_xdp_xmit(vi, xdp);
>  	if (!sent)
>  		return -ENOSPC;
>  	return 0;
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net
  2018-02-20 13:31 [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net Jesper Dangaard Brouer
                   ` (3 preceding siblings ...)
  2018-02-20 13:32 ` [net PATCH 4/4] virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP Jesper Dangaard Brouer
@ 2018-02-21 20:09 ` David Miller
  4 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2018-02-21 20:09 UTC (permalink / raw)
  To: brouer
  Cc: jasowang, mst, netdev, john.fastabend, alexei.starovoitov,
	saeedm, borkmann, tariqt

From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Tue, 20 Feb 2018 14:31:59 +0100

> The virtio_net driver actually violates the original memory model of
> XDP causing hard to debug crashes.  Per request of John Fastabend,
> instead of removing the XDP feature I'm fixing as much as possible.
> While testing virtio_net with XDP_REDIRECT I found 4 different bugs.
> 
> Patch-1: not enough tail-room for build_skb in receive_mergeable()
>  only option is to disable XDP_REDIRECT in receive_mergeable()
> 
> Patch-2: XDP in receive_small() basically never worked (check wrong flag)
> 
> Patch-3: fix memory leak for XDP_REDIRECT in error cases
> 
> Patch-4: avoid crash when ndo_xdp_xmit is called on dev not ready for XDP
> 
> In the longer run, we should consider introducing a separate receive
> function when attaching an XDP program, and also change the memory
> model to be compatible with XDP when attaching an XDP prog.

Series applied, thanks Jesper.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case
  2018-02-20 13:32 ` [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case Jesper Dangaard Brouer
  2018-02-20 16:56   ` John Fastabend
@ 2018-02-27  0:40   ` Michael S. Tsirkin
  2018-02-27  2:25     ` Jason Wang
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2018-02-27  0:40 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jason Wang, netdev, John Fastabend, Alexei Starovoitov,
	Saeed Mahameed, Daniel Borkmann, David S. Miller, Tariq Toukan

On Tue, Feb 20, 2018 at 02:32:04PM +0100, Jesper Dangaard Brouer wrote:
> The virtio_net code have three different RX code-paths in receive_buf().
> Two of these code paths can handle XDP, but one of them is broken for
> at least XDP_REDIRECT.
> 
> Function(1): receive_big() does not support XDP.
> Function(2): receive_small() support XDP fully and uses build_skb().
> Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb().
> 
> The simple explanation is that receive_mergeable() is broken because
> it uses napi_alloc_skb(), which violates XDP given XDP assumes packet
> header+data in single page and enough tail room for skb_shared_info.
> 
> The longer explaination is that receive_mergeable() tries to
> work-around and satisfy these XDP requiresments e.g. by having a
> function xdp_linearize_page() that allocates and memcpy RX buffers
> around (in case packet is scattered across multiple rx buffers).  This
> does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because
> we have not implemented bpf_xdp_adjust_tail yet).
> 
> The XDP_REDIRECT action combined with cpumap is broken, and cause hard
> to debug crashes.  The main issue is that the RX packet does not have
> the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing
> skb_shared_info to overlap the next packets head-room (in which cpumap
> stores info).
> 
> Reproducing depend on the packet payload length and if RX-buffer size
> happened to have tail-room for skb_shared_info or not.  But to make
> this even harder to troubleshoot, the RX-buffer size is runtime
> dynamically change based on an Exponentially Weighted Moving Average
> (EWMA) over the packet length, when refilling RX rings.
> 
> This patch only disable XDP_REDIRECT support in receive_mergeable()
> case, because it can cause a real crash.
> 
> IMHO we should consider NOT supporting XDP in receive_mergeable() at
> all, because the principles behind XDP are to gain speed by (1) code
> simplicity, (2) sacrificing memory and (3) where possible moving
> runtime checks to setup time.  These principles are clearly being
> violated in receive_mergeable(), that e.g. runtime track average
> buffer size to save memory consumption.
> 
> In the longer run, we should consider introducing a separate receive
> function when attaching an XDP program, and also change the memory
> model to be compatible with XDP when attaching an XDP prog.

I agree with a separate function approach.

So each buffer is tagged as xdp/non xdp, we check that
and handle appropriately - where non xdp could be handled
by the generic path.



> Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  drivers/net/virtio_net.c |    7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 626c27352ae2..0ca91942a884 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -677,7 +677,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  	struct bpf_prog *xdp_prog;
>  	unsigned int truesize;
>  	unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> -	int err;
>  
>  	head_skb = NULL;
>  
> @@ -754,12 +753,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
>  				goto err_xdp;
>  			rcu_read_unlock();
>  			goto xdp_xmit;
> -		case XDP_REDIRECT:
> -			err = xdp_do_redirect(dev, &xdp, xdp_prog);
> -			if (!err)
> -				*xdp_xmit = true;
> -			rcu_read_unlock();
> -			goto xdp_xmit;
>  		default:
>  			bpf_warn_invalid_xdp_action(act);
>  		case XDP_ABORTED:

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case
  2018-02-27  0:40   ` Michael S. Tsirkin
@ 2018-02-27  2:25     ` Jason Wang
  2018-02-27  2:28       ` Jason Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Wang @ 2018-02-27  2:25 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jesper Dangaard Brouer
  Cc: netdev, John Fastabend, Alexei Starovoitov, Saeed Mahameed,
	Daniel Borkmann, David S. Miller, Tariq Toukan



On 2018年02月27日 08:40, Michael S. Tsirkin wrote:
>> IMHO we should consider NOT supporting XDP in receive_mergeable() at
>> all, because the principles behind XDP are to gain speed by (1) code
>> simplicity, (2) sacrificing memory and (3) where possible moving
>> runtime checks to setup time.  These principles are clearly being
>> violated in receive_mergeable(), that e.g. runtime track average
>> buffer size to save memory consumption.
>>
>> In the longer run, we should consider introducing a separate receive
>> function when attaching an XDP program, and also change the memory
>> model to be compatible with XDP when attaching an XDP prog.
> I agree with a separate function approach.
>
> So each buffer is tagged as xdp/non xdp, we check that
> and handle appropriately - where non xdp could be handled
> by the generic path.

If we want to have separated function, should we do it for all XDP 
capable drivers instead of virtio-net only?

Thanks

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case
  2018-02-27  2:25     ` Jason Wang
@ 2018-02-27  2:28       ` Jason Wang
  0 siblings, 0 replies; 13+ messages in thread
From: Jason Wang @ 2018-02-27  2:28 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jesper Dangaard Brouer
  Cc: netdev, John Fastabend, Alexei Starovoitov, Saeed Mahameed,
	Daniel Borkmann, David S. Miller, Tariq Toukan



On 2018年02月27日 10:25, Jason Wang wrote:
>
>
> On 2018年02月27日 08:40, Michael S. Tsirkin wrote:
>>> IMHO we should consider NOT supporting XDP in receive_mergeable() at
>>> all, because the principles behind XDP are to gain speed by (1) code
>>> simplicity, (2) sacrificing memory and (3) where possible moving
>>> runtime checks to setup time.  These principles are clearly being
>>> violated in receive_mergeable(), that e.g. runtime track average
>>> buffer size to save memory consumption.
>>>
>>> In the longer run, we should consider introducing a separate receive
>>> function when attaching an XDP program, and also change the memory
>>> model to be compatible with XDP when attaching an XDP prog.
>> I agree with a separate function approach.
>>
>> So each buffer is tagged as xdp/non xdp, we check that
>> and handle appropriately - where non xdp could be handled
>> by the generic path.
>
> If we want to have separated function, should we do it for all XDP 
> capable drivers instead of virtio-net only?
>
> Thanks
>

What's more, since we don't stop device during XDP set. Even if a buffer 
is tagged as xdp/non xdp during refill, we still can't make sure whether 
or not XDP was used at receive function. So we can not handle them 
separately.

Thanks

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-02-27  2:29 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-20 13:31 [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net Jesper Dangaard Brouer
2018-02-20 13:32 ` [net PATCH 1/4] virtio_net: disable XDP_REDIRECT in receive_mergeable() case Jesper Dangaard Brouer
2018-02-20 16:56   ` John Fastabend
2018-02-27  0:40   ` Michael S. Tsirkin
2018-02-27  2:25     ` Jason Wang
2018-02-27  2:28       ` Jason Wang
2018-02-20 13:32 ` [net PATCH 2/4] virtio_net: fix XDP code path in receive_small() Jesper Dangaard Brouer
2018-02-20 16:57   ` John Fastabend
2018-02-20 13:32 ` [net PATCH 3/4] virtio_net: fix memory leak in XDP_REDIRECT Jesper Dangaard Brouer
2018-02-20 16:59   ` John Fastabend
2018-02-20 13:32 ` [net PATCH 4/4] virtio_net: fix ndo_xdp_xmit crash towards dev not ready for XDP Jesper Dangaard Brouer
2018-02-20 17:01   ` John Fastabend
2018-02-21 20:09 ` [net PATCH 0/4] virtio_net: several bugs in XDP code for driver virtio_net David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.