* [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier
@ 2013-07-16  9:46 Jan Beulich
  2013-07-16 10:25 ` Wei Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Jan Beulich @ 2013-07-16  9:46 UTC (permalink / raw)
  To: davem; +Cc: Ian Campbell, Wei Liu, Dion Kant, xen-devel, netdev, stable

Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
linear area is big enough on RX") xennet_fill_frags() may end up
filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
the fragment count subsequently via __pskb_pull_tail(). That's a
result of xennet_get_responses() allowing a maximum of one more slot to
be consumed (and intermediately transformed into a fragment) if the
head slot has a size less than or equal to RX_COPY_THRESHOLD.

Hence we need to adjust xennet_fill_frags() to pull earlier if we
reached the maximum fragment count - due to the described behavior of
xennet_get_responses() this guarantees that at least the first fragment
will get completely consumed, and hence the fragment count reduced.

In order to not needlessly call __pskb_pull_tail() twice, make the
original call conditional upon the pull target not having been reached
yet, and defer the newly added one as much as possible (an alternative
would have been to always call the function right before the call to
xennet_fill_frags(), but that would imply more frequent cases of
needing to call it twice).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: stable@vger.kernel.org (3.6 onwards)
---
v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
    we go (skb->len needing intermediate updating was pointed out by
    Wei Liu and David Miller, shinfo->nr_frags needing updating before
    calling __pskb_pull_tail() was spotted by Dion Kant).

---
 drivers/net/xen-netfront.c |   32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

--- 3.11-rc1/drivers/net/xen-netfront.c
+++ 3.11-rc1-xen-netfront-pull-earlier/drivers/net/xen-netfront.c
@@ -286,8 +286,7 @@ no_skb:
 			break;
 		}
 
-		__skb_fill_page_desc(skb, 0, page, 0, 0);
-		skb_shinfo(skb)->nr_frags = 1;
+		skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
 		__skb_queue_tail(&np->rx_batch, skb);
 	}
 
@@ -831,7 +830,6 @@ static RING_IDX xennet_fill_frags(struct
 				  struct sk_buff_head *list)
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
-	int nr_frags = shinfo->nr_frags;
 	RING_IDX cons = np->rx.rsp_cons;
 	struct sk_buff *nskb;
 
@@ -840,19 +838,21 @@ static RING_IDX xennet_fill_frags(struct
 			RING_GET_RESPONSE(&np->rx, ++cons);
 		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
 
-		__skb_fill_page_desc(skb, nr_frags,
-				     skb_frag_page(nfrag),
-				     rx->offset, rx->status);
+		if (shinfo->nr_frags == MAX_SKB_FRAGS) {
+			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		skb->data_len += rx->status;
+			BUG_ON(pull_to <= skb_headlen(skb));
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		}
+		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
+
+		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
+				rx->offset, rx->status, PAGE_SIZE);
 
 		skb_shinfo(nskb)->nr_frags = 0;
 		kfree_skb(nskb);
-
-		nr_frags++;
 	}
 
-	shinfo->nr_frags = nr_frags;
 	return cons;
 }
 
@@ -933,7 +933,8 @@ static int handle_incoming_queue(struct 
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		if (pull_to > skb_headlen(skb))
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);
@@ -1018,17 +1019,10 @@ err:
 
 		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
 		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
-		skb->data_len = rx->status;
+		skb->len += skb->data_len = rx->status;
 
 		i = xennet_fill_frags(np, skb, &tmpq);
 
-		/*
-                 * Truesize is the actual allocation size, even if the
-                 * allocation is only partially used.
-                 */
-		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
-		skb->len += skb->data_len;
-
 		if (rx->flags & XEN_NETRXF_csum_blank)
 			skb->ip_summed = CHECKSUM_PARTIAL;
 		else if (rx->flags & XEN_NETRXF_data_validated)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-16  9:46 [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
  2013-07-16 10:25 ` Wei Liu
@ 2013-07-16 10:25 ` Wei Liu
  2013-07-16 10:33   ` Dion Kant
                     ` (3 more replies)
  2013-07-16 10:33 ` Ian Campbell
  2013-07-16 10:33 ` Ian Campbell
  3 siblings, 4 replies; 16+ messages in thread
From: Wei Liu @ 2013-07-16 10:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: davem, Ian Campbell, Wei Liu, Dion Kant, xen-devel, netdev, stable

On Tue, Jul 16, 2013 at 10:46:01AM +0100, Jan Beulich wrote:
> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
> linear area is big enough on RX") xennet_fill_frags() may end up
> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
> the fragment count subsequently via __pskb_pull_tail(). That's a
> result of xennet_get_responses() allowing a maximum of one more slot to
> be consumed (and intermediately transformed into a fragment) if the
> head slot has a size less than or equal to RX_COPY_THRESHOLD.
> 
> Hence we need to adjust xennet_fill_frags() to pull earlier if we
> reached the maximum fragment count - due to the described behavior of
> xennet_get_responses() this guarantees that at least the first fragment
> will get completely consumed, and hence the fragment count reduced.
> 
> In order to not needlessly call __pskb_pull_tail() twice, make the
> original call conditional upon the pull target not having been reached
> yet, and defer the newly added one as much as possible (an alternative
> would have been to always call the function right before the call to
> xennet_fill_frags(), but that would imply more frequent cases of
> needing to call it twice).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: stable@vger.kernel.org (3.6 onwards)
> ---
> v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
>     we go (skb->len needing intermediate updating was pointed out by
>     Wei Liu and David Miller, shinfo->nr_frags needing updating before
>     calling __pskb_pull_tail() was spotted by Dion Kant).

Jan and Dion, is this a confirmed fix for the SuSE kernel?

I compiled and tested it; at least it didn't break things for me. The
tests I ran were 1) scp large_file to domU; 2) iperf from Dom0 to DomU;
3) netperf from Dom0 to DomU.

On the basis that this patch 1) fixes the bug for the SuSE kernel (to be
confirmed with Jan) and 2) doesn't break upstream (tested by me):

Acked-by: Wei Liu <wei.liu2@citrix.com>

> 
> ---
>  drivers/net/xen-netfront.c |   32 +++++++++++++-------------------
>  1 file changed, 13 insertions(+), 19 deletions(-)
> 
> --- 3.11-rc1/drivers/net/xen-netfront.c
> +++ 3.11-rc1-xen-netfront-pull-earlier/drivers/net/xen-netfront.c
> @@ -286,8 +286,7 @@ no_skb:
>  			break;
>  		}
>  
> -		__skb_fill_page_desc(skb, 0, page, 0, 0);
> -		skb_shinfo(skb)->nr_frags = 1;
> +		skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
>  		__skb_queue_tail(&np->rx_batch, skb);
>  	}
>  
> @@ -831,7 +830,6 @@ static RING_IDX xennet_fill_frags(struct
>  				  struct sk_buff_head *list)
>  {
>  	struct skb_shared_info *shinfo = skb_shinfo(skb);
> -	int nr_frags = shinfo->nr_frags;
>  	RING_IDX cons = np->rx.rsp_cons;
>  	struct sk_buff *nskb;
>  
> @@ -840,19 +838,21 @@ static RING_IDX xennet_fill_frags(struct
>  			RING_GET_RESPONSE(&np->rx, ++cons);
>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>  
> -		__skb_fill_page_desc(skb, nr_frags,
> -				     skb_frag_page(nfrag),
> -				     rx->offset, rx->status);
> +		if (shinfo->nr_frags == MAX_SKB_FRAGS) {
> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		skb->data_len += rx->status;
> +			BUG_ON(pull_to <= skb_headlen(skb));
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		}
> +		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
> +
> +		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
> +				rx->offset, rx->status, PAGE_SIZE);
>  
>  		skb_shinfo(nskb)->nr_frags = 0;
>  		kfree_skb(nskb);
> -
> -		nr_frags++;
>  	}
>  
> -	shinfo->nr_frags = nr_frags;
>  	return cons;
>  }
>  
> @@ -933,7 +933,8 @@ static int handle_incoming_queue(struct 
>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		if (pull_to > skb_headlen(skb))
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>  
>  		/* Ethernet work: Delayed to here as it peeks the header. */
>  		skb->protocol = eth_type_trans(skb, dev);
> @@ -1018,17 +1019,10 @@ err:
>  
>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> -		skb->data_len = rx->status;
> +		skb->len += skb->data_len = rx->status;
>  
>  		i = xennet_fill_frags(np, skb, &tmpq);
>  
> -		/*
> -                 * Truesize is the actual allocation size, even if the
> -                 * allocation is only partially used.
> -                 */
> -		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> -		skb->len += skb->data_len;
> -
>  		if (rx->flags & XEN_NETRXF_csum_blank)
>  			skb->ip_summed = CHECKSUM_PARTIAL;
>  		else if (rx->flags & XEN_NETRXF_data_validated)
> 

* Re: [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-16  9:46 [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
                   ` (2 preceding siblings ...)
  2013-07-16 10:33 ` Ian Campbell
@ 2013-07-16 10:33 ` Ian Campbell
  2013-07-17  7:09   ` [PATCH v3] " Jan Beulich
  2013-07-17  7:09   ` Jan Beulich
  3 siblings, 2 replies; 16+ messages in thread
From: Ian Campbell @ 2013-07-16 10:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: davem, Wei Liu, Dion Kant, xen-devel, netdev, stable

On Tue, 2013-07-16 at 10:46 +0100, Jan Beulich wrote:
> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
> linear area is big enough on RX") xennet_fill_frags() may end up
> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
> the fragment count subsequently via __pskb_pull_tail(). That's a
> result of xennet_get_responses() allowing a maximum of one more slot to
> be consumed (and intermediately transformed into a fragment) if the
> head slot has a size less than or equal to RX_COPY_THRESHOLD.
> 
> Hence we need to adjust xennet_fill_frags() to pull earlier if we
> reached the maximum fragment count - due to the described behavior of
> xennet_get_responses() this guarantees that at least the first fragment
> will get completely consumed, and hence the fragment count reduced.
> 
> In order to not needlessly call __pskb_pull_tail() twice, make the
> original call conditional upon the pull target not having been reached
> yet, and defer the newly added one as much as possible (an alternative
> would have been to always call the function right before the call to
> xennet_fill_frags(), but that would imply more frequent cases of
> needing to call it twice).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: stable@vger.kernel.org (3.6 onwards)
> ---
> v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
>     we go (skb->len needing intermediate updating was pointed out by
>     Wei Liu and David Miller, shinfo->nr_frags needing updating before
>     calling __pskb_pull_tail() was spotted by Dion Kant).
> 
> ---
>  drivers/net/xen-netfront.c |   32 +++++++++++++-------------------
>  1 file changed, 13 insertions(+), 19 deletions(-)
> 
> --- 3.11-rc1/drivers/net/xen-netfront.c
> +++ 3.11-rc1-xen-netfront-pull-earlier/drivers/net/xen-netfront.c
> @@ -286,8 +286,7 @@ no_skb:
>  			break;
>  		}
>  
> -		__skb_fill_page_desc(skb, 0, page, 0, 0);
> -		skb_shinfo(skb)->nr_frags = 1;
> +		skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
>  		__skb_queue_tail(&np->rx_batch, skb);
>  	}
>  
> @@ -831,7 +830,6 @@ static RING_IDX xennet_fill_frags(struct
>  				  struct sk_buff_head *list)
>  {
>  	struct skb_shared_info *shinfo = skb_shinfo(skb);
> -	int nr_frags = shinfo->nr_frags;
>  	RING_IDX cons = np->rx.rsp_cons;
>  	struct sk_buff *nskb;
>  
> @@ -840,19 +838,21 @@ static RING_IDX xennet_fill_frags(struct
>  			RING_GET_RESPONSE(&np->rx, ++cons);
>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>  
> -		__skb_fill_page_desc(skb, nr_frags,
> -				     skb_frag_page(nfrag),
> -				     rx->offset, rx->status);
> +		if (shinfo->nr_frags == MAX_SKB_FRAGS) {
> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		skb->data_len += rx->status;
> +			BUG_ON(pull_to <= skb_headlen(skb));
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		}
> +		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
> +
> +		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
> +				rx->offset, rx->status, PAGE_SIZE);
>  
>  		skb_shinfo(nskb)->nr_frags = 0;
>  		kfree_skb(nskb);
> -
> -		nr_frags++;
>  	}
>  
> -	shinfo->nr_frags = nr_frags;
>  	return cons;
>  }
>  
> @@ -933,7 +933,8 @@ static int handle_incoming_queue(struct 
>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>  
> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
> +		if (pull_to > skb_headlen(skb))
> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>  
>  		/* Ethernet work: Delayed to here as it peeks the header. */
>  		skb->protocol = eth_type_trans(skb, dev);
> @@ -1018,17 +1019,10 @@ err:
>  
>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
> -		skb->data_len = rx->status;
> +		skb->len += skb->data_len = rx->status;

Off the top of my head I have no confidence in my understanding of what
this will actually end up assigning to ->len or ->data_len after
this.

For my benefit, can we do the dumb version please?

Also ref CodingStyle:
        Don't put multiple assignments on a single line either.  Kernel coding style
        is super simple.  Avoid tricky expressions.

The rest of it looked OK to me.

Ian.


>  
>  		i = xennet_fill_frags(np, skb, &tmpq);
>  
> -		/*
> -                 * Truesize is the actual allocation size, even if the
> -                 * allocation is only partially used.
> -                 */
> -		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
> -		skb->len += skb->data_len;
> -
>  		if (rx->flags & XEN_NETRXF_csum_blank)
>  			skb->ip_summed = CHECKSUM_PARTIAL;
>  		else if (rx->flags & XEN_NETRXF_data_validated)
> 
> 

* Re: [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-16 10:25 ` Wei Liu
  2013-07-16 10:33   ` Dion Kant
@ 2013-07-16 10:33   ` Dion Kant
  2013-07-16 11:26   ` Jan Beulich
  2013-07-16 11:26   ` Jan Beulich
  3 siblings, 0 replies; 16+ messages in thread
From: Dion Kant @ 2013-07-16 10:33 UTC (permalink / raw)
  To: Wei Liu; +Cc: Jan Beulich, davem, Ian Campbell, xen-devel, netdev, stable

On 07/16/2013 12:25 PM, Wei Liu wrote:
> On Tue, Jul 16, 2013 at 10:46:01AM +0100, Jan Beulich wrote:
>> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
>> linear area is big enough on RX") xennet_fill_frags() may end up
>> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
>> the fragment count subsequently via __pskb_pull_tail(). That's a
>> result of xennet_get_responses() allowing a maximum of one more slot to
>> be consumed (and intermediately transformed into a fragment) if the
>> head slot has a size less than or equal to RX_COPY_THRESHOLD.
>>
>> Hence we need to adjust xennet_fill_frags() to pull earlier if we
>> reached the maximum fragment count - due to the described behavior of
>> xennet_get_responses() this guarantees that at least the first fragment
>> will get completely consumed, and hence the fragment count reduced.
>>
>> In order to not needlessly call __pskb_pull_tail() twice, make the
>> original call conditional upon the pull target not having been reached
>> yet, and defer the newly added one as much as possible (an alternative
>> would have been to always call the function right before the call to
>> xennet_fill_frags(), but that would imply more frequent cases of
>> needing to call it twice).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: stable@vger.kernel.org (3.6 onwards)
>> ---
>> v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
>>     we go (skb->len needing intermediate updating was pointed out by
>>     Wei Liu and David Miller, shinfo->nr_frags needing updating before
>>     calling __pskb_pull_tail() was spotted by Dion Kant).
> Jan and Dion, is this a confirmed fix for the SuSE kernel?
Dear Wei,

I did thorough testing on the system which revealed the original problem
and I can confirm that it does fix the issue.

Dion


>
> I compiled and tested it; at least it didn't break things for me. The
> tests I ran were 1) scp large_file to domU; 2) iperf from Dom0 to DomU;
> 3) netperf from Dom0 to DomU.
>
> On the basis that this patch 1) fixes the bug for the SuSE kernel (to be
> confirmed with Jan) and 2) doesn't break upstream (tested by me):
>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
>
>> ---
>>  drivers/net/xen-netfront.c |   32 +++++++++++++-------------------
>>  1 file changed, 13 insertions(+), 19 deletions(-)
>>
>> --- 3.11-rc1/drivers/net/xen-netfront.c
>> +++ 3.11-rc1-xen-netfront-pull-earlier/drivers/net/xen-netfront.c
>> @@ -286,8 +286,7 @@ no_skb:
>>  			break;
>>  		}
>>  
>> -		__skb_fill_page_desc(skb, 0, page, 0, 0);
>> -		skb_shinfo(skb)->nr_frags = 1;
>> +		skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
>>  		__skb_queue_tail(&np->rx_batch, skb);
>>  	}
>>  
>> @@ -831,7 +830,6 @@ static RING_IDX xennet_fill_frags(struct
>>  				  struct sk_buff_head *list)
>>  {
>>  	struct skb_shared_info *shinfo = skb_shinfo(skb);
>> -	int nr_frags = shinfo->nr_frags;
>>  	RING_IDX cons = np->rx.rsp_cons;
>>  	struct sk_buff *nskb;
>>  
>> @@ -840,19 +838,21 @@ static RING_IDX xennet_fill_frags(struct
>>  			RING_GET_RESPONSE(&np->rx, ++cons);
>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>>  
>> -		__skb_fill_page_desc(skb, nr_frags,
>> -				     skb_frag_page(nfrag),
>> -				     rx->offset, rx->status);
>> +		if (shinfo->nr_frags == MAX_SKB_FRAGS) {
>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>>  
>> -		skb->data_len += rx->status;
>> +			BUG_ON(pull_to <= skb_headlen(skb));
>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>> +		}
>> +		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
>> +
>> +		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
>> +				rx->offset, rx->status, PAGE_SIZE);
>>  
>>  		skb_shinfo(nskb)->nr_frags = 0;
>>  		kfree_skb(nskb);
>> -
>> -		nr_frags++;
>>  	}
>>  
>> -	shinfo->nr_frags = nr_frags;
>>  	return cons;
>>  }
>>  
>> @@ -933,7 +933,8 @@ static int handle_incoming_queue(struct 
>>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>>  
>> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>> +		if (pull_to > skb_headlen(skb))
>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>>  
>>  		/* Ethernet work: Delayed to here as it peeks the header. */
>>  		skb->protocol = eth_type_trans(skb, dev);
>> @@ -1018,17 +1019,10 @@ err:
>>  
>>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> -		skb->data_len = rx->status;
>> +		skb->len += skb->data_len = rx->status;
>>  
>>  		i = xennet_fill_frags(np, skb, &tmpq);
>>  
>> -		/*
>> -                 * Truesize is the actual allocation size, even if the
>> -                 * allocation is only partially used.
>> -                 */
>> -		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
>> -		skb->len += skb->data_len;
>> -
>>  		if (rx->flags & XEN_NETRXF_csum_blank)
>>  			skb->ip_summed = CHECKSUM_PARTIAL;
>>  		else if (rx->flags & XEN_NETRXF_data_validated)
>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-16 10:25 ` Wei Liu
@ 2013-07-16 10:33   ` Dion Kant
  2013-07-16 10:33   ` Dion Kant
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Dion Kant @ 2013-07-16 10:33 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, netdev, stable, Jan Beulich, xen-devel, davem

On 07/16/2013 12:25 PM, Wei Liu wrote:
> On Tue, Jul 16, 2013 at 10:46:01AM +0100, Jan Beulich wrote:
>> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
>> linear area is big enough on RX") xennet_fill_frags() may end up
>> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
>> the fragment count subsequently via __pskb_pull_tail(). That's a
>> result of xennet_get_responses() allowing a maximum of one more slot to
>> be consumed (and intermediately transformed into a fragment) if the
>> head slot has a size less than or equal to RX_COPY_THRESHOLD.
>>
>> Hence we need to adjust xennet_fill_frags() to pull earlier if we
>> reached the maximum fragment count - due to the described behavior of
>> xennet_get_responses() this guarantees that at least the first fragment
>> will get completely consumed, and hence the fragment count reduced.
>>
>> In order to not needlessly call __pskb_pull_tail() twice, make the
>> original call conditional upon the pull target not having been reached
>> yet, and defer the newly added one as much as possible (an alternative
>> would have been to always call the function right before the call to
>> xennet_fill_frags(), but that would imply more frequent cases of
>> needing to call it twice).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: stable@vger.kernel.org (3.6 onwards)
>> ---
>> v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
>>     we go (skb->len needing intermediate updating was pointed out by
>>     Wei Liu and David Miller, shinfo->nr_frags needing updating before
>>     calling __pskb_pull_tail() was spotted by Dion Kant).
> Jan and Dion, is this a confirmed fix for SuSE kernel?
Dear Wei,

I did thorough testing on the system which revealed the original problem
and I can confirm that it does fix the issue.

Dion


>
> I compiled and tested it, at least it didn't break things for me. The
> tests I ran were 1) scp large_file to domU; 2) iperf from Dom0 to DomU;
> 3) netperf from Dom0 to DomU.
>
> On the basis that this patch 1) fixes the bug in the SuSE kernel (to be
> confirmed by Jan) and 2) doesn't break upstream (tested by me):
>
> Acked-by: Wei Liu <wei.liu2@citrix.com>
>
>> ---
>>  drivers/net/xen-netfront.c |   32 +++++++++++++-------------------
>>  1 file changed, 13 insertions(+), 19 deletions(-)
>>
>> --- 3.11-rc1/drivers/net/xen-netfront.c
>> +++ 3.11-rc1-xen-netfront-pull-earlier/drivers/net/xen-netfront.c
>> @@ -286,8 +286,7 @@ no_skb:
>>  			break;
>>  		}
>>  
>> -		__skb_fill_page_desc(skb, 0, page, 0, 0);
>> -		skb_shinfo(skb)->nr_frags = 1;
>> +		skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
>>  		__skb_queue_tail(&np->rx_batch, skb);
>>  	}
>>  
>> @@ -831,7 +830,6 @@ static RING_IDX xennet_fill_frags(struct
>>  				  struct sk_buff_head *list)
>>  {
>>  	struct skb_shared_info *shinfo = skb_shinfo(skb);
>> -	int nr_frags = shinfo->nr_frags;
>>  	RING_IDX cons = np->rx.rsp_cons;
>>  	struct sk_buff *nskb;
>>  
>> @@ -840,19 +838,21 @@ static RING_IDX xennet_fill_frags(struct
>>  			RING_GET_RESPONSE(&np->rx, ++cons);
>>  		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
>>  
>> -		__skb_fill_page_desc(skb, nr_frags,
>> -				     skb_frag_page(nfrag),
>> -				     rx->offset, rx->status);
>> +		if (shinfo->nr_frags == MAX_SKB_FRAGS) {
>> +			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>>  
>> -		skb->data_len += rx->status;
>> +			BUG_ON(pull_to <= skb_headlen(skb));
>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>> +		}
>> +		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
>> +
>> +		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
>> +				rx->offset, rx->status, PAGE_SIZE);
>>  
>>  		skb_shinfo(nskb)->nr_frags = 0;
>>  		kfree_skb(nskb);
>> -
>> -		nr_frags++;
>>  	}
>>  
>> -	shinfo->nr_frags = nr_frags;
>>  	return cons;
>>  }
>>  
>> @@ -933,7 +933,8 @@ static int handle_incoming_queue(struct 
>>  	while ((skb = __skb_dequeue(rxq)) != NULL) {
>>  		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
>>  
>> -		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>> +		if (pull_to > skb_headlen(skb))
>> +			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
>>  
>>  		/* Ethernet work: Delayed to here as it peeks the header. */
>>  		skb->protocol = eth_type_trans(skb, dev);
>> @@ -1018,17 +1019,10 @@ err:
>>  
>>  		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
>>  		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
>> -		skb->data_len = rx->status;
>> +		skb->len += skb->data_len = rx->status;
>>  
>>  		i = xennet_fill_frags(np, skb, &tmpq);
>>  
>> -		/*
>> -                 * Truesize is the actual allocation size, even if the
>> -                 * allocation is only partially used.
>> -                 */
>> -		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
>> -		skb->len += skb->data_len;
>> -
>>  		if (rx->flags & XEN_NETRXF_csum_blank)
>>  			skb->ip_summed = CHECKSUM_PARTIAL;
>>  		else if (rx->flags & XEN_NETRXF_data_validated)
>>


* Re: [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-16 10:25 ` Wei Liu
                     ` (2 preceding siblings ...)
  2013-07-16 11:26   ` Jan Beulich
@ 2013-07-16 11:26   ` Jan Beulich
  3 siblings, 0 replies; 16+ messages in thread
From: Jan Beulich @ 2013-07-16 11:26 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, davem, Dion Kant, xen-devel, netdev, stable

>>> On 16.07.13 at 12:25, Wei Liu <wei.liu2@citrix.com> wrote:
> On Tue, Jul 16, 2013 at 10:46:01AM +0100, Jan Beulich wrote:
>> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
>> linear area is big enough on RX") xennet_fill_frags() may end up
>> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
>> the fragment count subsequently via __pskb_pull_tail(). That's a
>> result of xennet_get_responses() allowing a maximum of one more slot to
>> be consumed (and intermediately transformed into a fragment) if the
>> head slot has a size less than or equal to RX_COPY_THRESHOLD.
>> 
>> Hence we need to adjust xennet_fill_frags() to pull earlier if we
>> reached the maximum fragment count - due to the described behavior of
>> xennet_get_responses() this guarantees that at least the first fragment
>> will get completely consumed, and hence the fragment count reduced.
>> 
>> In order to not needlessly call __pskb_pull_tail() twice, make the
>> original call conditional upon the pull target not having been reached
>> yet, and defer the newly added one as much as possible (an alternative
>> would have been to always call the function right before the call to
>> xennet_fill_frags(), but that would imply more frequent cases of
>> needing to call it twice).
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Cc: Wei Liu <wei.liu2@citrix.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: stable@vger.kernel.org (3.6 onwards)
>> ---
>> v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
>>     we go (skb->len needing intermediate updating was pointed out by
>>     Wei Liu and David Miller, shinfo->nr_frags needing updating before
>>     calling __pskb_pull_tail() was spotted by Dion Kant).
> 
> Jan and Dion, is this a confirmed fix for SuSE kernel?
> 
> I compiled and tested it, at least it didn't break things for me. The
> tests I ran were 1) scp large_file to domU; 2) iperf from Dom0 to DomU;
> 3) netperf from Dom0 to DomU.

Yes, the equivalent of this applied to our driver fixes the issue
for Dion.

Jan


* [PATCH v3] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-16 10:33 ` Ian Campbell
  2013-07-17  7:09   ` [PATCH v3] " Jan Beulich
@ 2013-07-17  7:09   ` Jan Beulich
  2013-07-17  8:26     ` Ian Campbell
  2013-07-17  8:26     ` Ian Campbell
  1 sibling, 2 replies; 16+ messages in thread
From: Jan Beulich @ 2013-07-17  7:09 UTC (permalink / raw)
  To: davem; +Cc: Ian Campbell, Wei Liu, Dion Kant, xen-devel, netdev, stable

Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
linear area is big enough on RX") xennet_fill_frags() may end up
filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
the fragment count subsequently via __pskb_pull_tail(). That's a
result of xennet_get_responses() allowing a maximum of one more slot to
be consumed (and intermediately transformed into a fragment) if the
head slot has a size less than or equal to RX_COPY_THRESHOLD.

Hence we need to adjust xennet_fill_frags() to pull earlier if we
reached the maximum fragment count - due to the described behavior of
xennet_get_responses() this guarantees that at least the first fragment
will get completely consumed, and hence the fragment count reduced.

In order to not needlessly call __pskb_pull_tail() twice, make the
original call conditional upon the pull target not having been reached
yet, and defer the newly added one as much as possible (an alternative
would have been to always call the function right before the call to
xennet_fill_frags(), but that would imply more frequent cases of
needing to call it twice).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: stable@vger.kernel.org (3.6 onwards)
---
v3: Break up a "complex" assignment into two simple ones.

v2: Use skb_add_rx_frag() to keep all accounting fields up to date as
    we go (skb->len needing intermediate updating was pointed out by
    Wei Liu and David Miller, shinfo->nr_frags needing updating before
    calling __pskb_pull_tail() was spotted by Dion Kant).

---
 drivers/net/xen-netfront.c |   31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

--- 3.11-rc1/drivers/net/xen-netfront.c
+++ 3.11-rc1-xen-netfront-pull-earlier/drivers/net/xen-netfront.c
@@ -286,8 +286,7 @@ no_skb:
 			break;
 		}
 
-		__skb_fill_page_desc(skb, 0, page, 0, 0);
-		skb_shinfo(skb)->nr_frags = 1;
+		skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
 		__skb_queue_tail(&np->rx_batch, skb);
 	}
 
@@ -831,7 +830,6 @@ static RING_IDX xennet_fill_frags(struct
 				  struct sk_buff_head *list)
 {
 	struct skb_shared_info *shinfo = skb_shinfo(skb);
-	int nr_frags = shinfo->nr_frags;
 	RING_IDX cons = np->rx.rsp_cons;
 	struct sk_buff *nskb;
 
@@ -840,19 +838,21 @@ static RING_IDX xennet_fill_frags(struct
 			RING_GET_RESPONSE(&np->rx, ++cons);
 		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
 
-		__skb_fill_page_desc(skb, nr_frags,
-				     skb_frag_page(nfrag),
-				     rx->offset, rx->status);
+		if (shinfo->nr_frags == MAX_SKB_FRAGS) {
+			unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		skb->data_len += rx->status;
+			BUG_ON(pull_to <= skb_headlen(skb));
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		}
+		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
+
+		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
+				rx->offset, rx->status, PAGE_SIZE);
 
 		skb_shinfo(nskb)->nr_frags = 0;
 		kfree_skb(nskb);
-
-		nr_frags++;
 	}
 
-	shinfo->nr_frags = nr_frags;
 	return cons;
 }
 
@@ -933,7 +933,8 @@ static int handle_incoming_queue(struct
 	while ((skb = __skb_dequeue(rxq)) != NULL) {
 		int pull_to = NETFRONT_SKB_CB(skb)->pull_to;
 
-		__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+		if (pull_to > skb_headlen(skb))
+			__pskb_pull_tail(skb, pull_to - skb_headlen(skb));
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);
@@ -1019,16 +1020,10 @@ err:
 		skb_shinfo(skb)->frags[0].page_offset = rx->offset;
 		skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
 		skb->data_len = rx->status;
+		skb->len += rx->status;
 
 		i = xennet_fill_frags(np, skb, &tmpq);
 
-		/*
-                 * Truesize is the actual allocation size, even if the
-                 * allocation is only partially used.
-                 */
-		skb->truesize += PAGE_SIZE * skb_shinfo(skb)->nr_frags;
-		skb->len += skb->data_len;
-
 		if (rx->flags & XEN_NETRXF_csum_blank)
 			skb->ip_summed = CHECKSUM_PARTIAL;
 		else if (rx->flags & XEN_NETRXF_data_validated)
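The heart of the fix is the ordering inside xennet_fill_frags(): when the fragment array is already full, pull into the linear area first so a slot frees up before the next fragment is added. A stand-alone sketch of that mechanism with a toy skb model (the MAX_SKB_FRAGS value and all sizes here are illustrative assumptions, not the kernel's, and the toy pull ignores frag_list, offsets and cloning):

```c
#include <assert.h>

#define MAX_SKB_FRAGS 17	/* illustrative; the real value is config-dependent */

struct toy_skb {
	unsigned int headlen;			/* bytes in the linear area */
	unsigned int frag_len[MAX_SKB_FRAGS];
	unsigned int nr_frags;
};

/* Models __pskb_pull_tail(): move 'delta' bytes from the fragments into
 * the linear area, dropping any fragment that becomes empty. */
static void toy_pull_tail(struct toy_skb *skb, unsigned int delta)
{
	unsigned int i, kept = 0;

	skb->headlen += delta;
	for (i = 0; i < skb->nr_frags; i++) {
		unsigned int take = skb->frag_len[i] < delta ?
				    skb->frag_len[i] : delta;

		skb->frag_len[i] -= take;
		delta -= take;
		if (skb->frag_len[i])		/* partially used: keep it */
			skb->frag_len[kept++] = skb->frag_len[i];
	}
	skb->nr_frags = kept;
}

/* The pull-early step from the patch: if the frag array is full, pull up
 * to pull_to before adding the next fragment.  Per the commit message,
 * the first fragment is then consumed entirely, freeing a slot. */
static void toy_fill_one(struct toy_skb *skb, unsigned int pull_to,
			 unsigned int frag_size)
{
	if (skb->nr_frags == MAX_SKB_FRAGS) {
		assert(pull_to > skb->headlen);		/* BUG_ON() in the patch */
		toy_pull_tail(skb, pull_to - skb->headlen);
	}
	assert(skb->nr_frags < MAX_SKB_FRAGS);		/* BUG_ON() in the patch */
	skb->frag_len[skb->nr_frags++] = frag_size;
}
```

In the driver the guarantee that the pull empties the first fragment follows from xennet_get_responses() only allowing the extra slot when the head slot is at most RX_COPY_THRESHOLD; the toy test below simply constructs sizes with that property.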


* Re: [PATCH v3] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-17  7:09   ` Jan Beulich
@ 2013-07-17  8:26     ` Ian Campbell
  2013-07-17 19:52       ` David Miller
  2013-07-17 19:52       ` David Miller
  2013-07-17  8:26     ` Ian Campbell
  1 sibling, 2 replies; 16+ messages in thread
From: Ian Campbell @ 2013-07-17  8:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: davem, Wei Liu, Dion Kant, xen-devel, netdev, stable

On Wed, 2013-07-17 at 08:09 +0100, Jan Beulich wrote:
> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
> linear area is big enough on RX") xennet_fill_frags() may end up
> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
> the fragment count subsequently via __pskb_pull_tail(). That's a
> result of xennet_get_responses() allowing a maximum of one more slot to
> be consumed (and intermediately transformed into a fragment) if the
> head slot has a size less than or equal to RX_COPY_THRESHOLD.
> 
> Hence we need to adjust xennet_fill_frags() to pull earlier if we
> reached the maximum fragment count - due to the described behavior of
> xennet_get_responses() this guarantees that at least the first fragment
> will get completely consumed, and hence the fragment count reduced.
> 
> In order to not needlessly call __pskb_pull_tail() twice, make the
> original call conditional upon the pull target not having been reached
> yet, and defer the newly added one as much as possible (an alternative
> would have been to always call the function right before the call to
> xennet_fill_frags(), but that would imply more frequent cases of
> needing to call it twice).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Acked-by: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Ian.


* Re: [PATCH v3] xen-netfront: pull on receive skb may need to happen earlier
  2013-07-17  8:26     ` Ian Campbell
@ 2013-07-17 19:52       ` David Miller
  2013-07-17 19:52       ` David Miller
  1 sibling, 0 replies; 16+ messages in thread
From: David Miller @ 2013-07-17 19:52 UTC (permalink / raw)
  To: Ian.Campbell; +Cc: JBeulich, wei.liu2, g.w.kant, xen-devel, netdev, stable

From: Ian Campbell <Ian.Campbell@citrix.com>
Date: Wed, 17 Jul 2013 09:26:24 +0100

> On Wed, 2013-07-17 at 08:09 +0100, Jan Beulich wrote:
>> Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
>> linear area is big enough on RX") xennet_fill_frags() may end up
>> filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
>> the fragment count subsequently via __pskb_pull_tail(). That's a
>> result of xennet_get_responses() allowing a maximum of one more slot to
>> be consumed (and intermediately transformed into a fragment) if the
>> head slot has a size less than or equal to RX_COPY_THRESHOLD.
>> 
>> Hence we need to adjust xennet_fill_frags() to pull earlier if we
>> reached the maximum fragment count - due to the described behavior of
>> xennet_get_responses() this guarantees that at least the first fragment
>> will get completely consumed, and hence the fragment count reduced.
>> 
>> In order to not needlessly call __pskb_pull_tail() twice, make the
>> original call conditional upon the pull target not having been reached
>> yet, and defer the newly added one as much as possible (an alternative
>> would have been to always call the function right before the call to
>> xennet_fill_frags(), but that would imply more frequent cases of
>> needing to call it twice).
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Acked-by: Wei Liu <wei.liu2@citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Applied.


end of thread, other threads:[~2013-07-17 19:52 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-16  9:46 [PATCH v2] xen-netfront: pull on receive skb may need to happen earlier Jan Beulich
2013-07-16 10:25 ` Wei Liu
2013-07-16 10:25 ` Wei Liu
2013-07-16 10:33   ` Dion Kant
2013-07-16 10:33   ` Dion Kant
2013-07-16 11:26   ` Jan Beulich
2013-07-16 11:26   ` Jan Beulich
2013-07-16 10:33 ` Ian Campbell
2013-07-16 10:33 ` Ian Campbell
2013-07-17  7:09   ` [PATCH v3] " Jan Beulich
2013-07-17  7:09   ` Jan Beulich
2013-07-17  8:26     ` Ian Campbell
2013-07-17 19:52       ` David Miller
2013-07-17 19:52       ` David Miller
2013-07-17  8:26     ` Ian Campbell
2013-07-16  9:46 [PATCH v2] " Jan Beulich
