* [PATCH 1/1] xen/netback: correctly calculate required slots of skb.
@ 2013-07-09  6:15 Annie Li
  2013-07-09 16:16   ` Wei Liu
  2013-07-09 22:14   ` Matt Wilson
  0 siblings, 2 replies; 38+ messages in thread
From: Annie Li @ 2013-07-09  6:15 UTC (permalink / raw)
  To: xen-devel, netdev, Ian.Campbell; +Cc: konrad.wilk, wei.liu2, annie.li

When counting the slots required for an skb, netback uses DIV_ROUND_UP
directly to get the number of slots required by the header data. This is
wrong when the offset of the header data within its page is not zero, and
it is also inconsistent with the subsequent slot calculation in
netbk_gop_skb.

In netbk_gop_skb, the required slots are calculated from the offset and
length of the header data within its page. The slot count computed there
can be larger than the one calculated earlier in netbk_count_requests.
This inconsistency makes the rx_req_cons_peek and xen_netbk_rx_ring_full
judgements wrong.

The ring can then be actually full while netback believes it is not and
keeps creating responses. A response then overlaps a request in the ring,
grant copy picks up a bogus grant reference and reports an error such as
"(XEN) grant_table.c:1763:d0 Bad grant reference 2949120" (the grant
reference is an invalid value here). Netback returns XEN_NETIF_RSP_ERROR
(-1) to netfront when the grant copy status indicates an error; netfront
then reads rx->status (now -1 rather than a data size) and reports
"kernel: net eth1: rx->offset: 0, size: 4294967295". The issue can be
reproduced by running gzip/gunzip on an NFS share with mtu = 9000; the
guest panics after running such a test for a while.

This patch is based on 3.10-rc7.

Signed-off-by: Annie Li <annie.li@oracle.com>
---
 drivers/net/xen-netback/netback.c |   97 +++++++++++++++++++++++++-----------
 1 files changed, 67 insertions(+), 30 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 8c20935..7ff9333 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -359,51 +359,88 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
  * the guest. This function is essentially a dry run of
  * netbk_gop_frag_copy.
  */
-unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
+static void netbk_get_slots(struct xenvif *vif, struct sk_buff *skb,
+			    struct page *page, int *copy_off,
+			    unsigned long size, unsigned long offset,
+			    int *head, int *count)
 {
-	unsigned int count;
-	int i, copy_off;
+	unsigned long bytes;
 
-	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
+	/* Data must not cross a page boundary. */
+	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
 
-	copy_off = skb_headlen(skb) % PAGE_SIZE;
+	/* Skip unused frames from start of page */
+	page += offset >> PAGE_SHIFT;
+	offset &= ~PAGE_MASK;
 
-	if (skb_shinfo(skb)->gso_size)
-		count++;
+	while (size > 0) {
+		BUG_ON(offset >= PAGE_SIZE);
+		BUG_ON(*copy_off > MAX_BUFFER_OFFSET);
 
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
-		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
-		unsigned long bytes;
+		bytes = PAGE_SIZE - offset;
 
-		offset &= ~PAGE_MASK;
+		if (bytes > size)
+			bytes = size;
 
-		while (size > 0) {
-			BUG_ON(offset >= PAGE_SIZE);
-			BUG_ON(copy_off > MAX_BUFFER_OFFSET);
+		if (start_new_rx_buffer(*copy_off, bytes, *head)) {
+			*count = *count + 1;
+			*copy_off = 0;
+		}
 
-			bytes = PAGE_SIZE - offset;
+		if (*copy_off + bytes > MAX_BUFFER_OFFSET)
+			bytes = MAX_BUFFER_OFFSET - *copy_off;
 
-			if (bytes > size)
-				bytes = size;
+		*copy_off += bytes;
 
-			if (start_new_rx_buffer(copy_off, bytes, 0)) {
-				count++;
-				copy_off = 0;
-			}
+		offset += bytes;
+		size -= bytes;
 
-			if (copy_off + bytes > MAX_BUFFER_OFFSET)
-				bytes = MAX_BUFFER_OFFSET - copy_off;
+		/* Next frame */
+		if (offset == PAGE_SIZE && size) {
+			BUG_ON(!PageCompound(page));
+			page++;
+			offset = 0;
+		}
 
-			copy_off += bytes;
+		if (*head)
+			*count = *count + 1;
+		*head = 0; /* There must be something in this buffer now. */
+	}
+}
+
+unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
+{
+	int i, copy_off = 0;
+	int nr_frags = skb_shinfo(skb)->nr_frags;
+	unsigned char *data;
+	int head = 1;
+	unsigned int count = 0;
 
-			offset += bytes;
-			size -= bytes;
+	if (skb_shinfo(skb)->gso_size)
+		count++;
 
-			if (offset == PAGE_SIZE)
-				offset = 0;
-		}
+	data = skb->data;
+	while (data < skb_tail_pointer(skb)) {
+		unsigned int offset = offset_in_page(data);
+		unsigned int len = PAGE_SIZE - offset;
+
+		if (data + len > skb_tail_pointer(skb))
+			len = skb_tail_pointer(skb) - data;
+
+		netbk_get_slots(vif, skb, virt_to_page(data), &copy_off,
+				len, offset, &head, &count);
+		data += len;
+	}
+
+	for (i = 0; i < nr_frags; i++) {
+		netbk_get_slots(vif, skb,
+				skb_frag_page(&skb_shinfo(skb)->frags[i]),
+				&copy_off,
+				skb_frag_size(&skb_shinfo(skb)->frags[i]),
+				skb_shinfo(skb)->frags[i].page_offset,
+				&head, &count);
 	}
+
 	return count;
 }
 
-- 
1.7.3.4


* Re: [PATCH 1/1] xen/netback: correctly calculate required slots of skb.
  2013-07-09  6:15 [PATCH 1/1] xen/netback: correctly calculate required slots of skb Annie Li
@ 2013-07-09 16:16   ` Wei Liu
  2013-07-09 22:14   ` Matt Wilson
  1 sibling, 0 replies; 38+ messages in thread
From: Wei Liu @ 2013-07-09 16:16 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, Ian.Campbell, konrad.wilk, wei.liu2

On Tue, Jul 09, 2013 at 02:15:20PM +0800, Annie Li wrote:
> When counting the slots required for an skb, netback uses DIV_ROUND_UP
> directly to get the number of slots required by the header data. This is
> wrong when the offset of the header data within its page is not zero, and
> it is also inconsistent with the subsequent slot calculation in
> netbk_gop_skb.
> 
> In netbk_gop_skb, the required slots are calculated from the offset and
> length of the header data within its page. The slot count computed there
> can be larger than the one calculated earlier in netbk_count_requests.
> This inconsistency makes the rx_req_cons_peek and xen_netbk_rx_ring_full
> judgements wrong.
> 
> The ring can then be actually full while netback believes it is not and
> keeps creating responses. A response then overlaps a request in the ring,
> grant copy picks up a bogus grant reference and reports an error such as
> "(XEN) grant_table.c:1763:d0 Bad grant reference 2949120" (the grant
> reference is an invalid value here). Netback returns XEN_NETIF_RSP_ERROR
> (-1) to netfront when the grant copy status indicates an error; netfront
> then reads rx->status (now -1 rather than a data size) and reports
> "kernel: net eth1: rx->offset: 0, size: 4294967295". The issue can be
> reproduced by running gzip/gunzip on an NFS share with mtu = 9000; the
> guest panics after running such a test for a while.
> 
> This patch is based on 3.10-rc7.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netback/netback.c |   97 +++++++++++++++++++++++++-----------
>  1 files changed, 67 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 8c20935..7ff9333 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -359,51 +359,88 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
>   * the guest. This function is essentially a dry run of
>   * netbk_gop_frag_copy.
>   */

Just to be clear, now the above comment applies to your new function,
right?

> -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> +static void netbk_get_slots(struct xenvif *vif, struct sk_buff *skb,

A better name would be netbk_count_slots, because it doesn't actually
acquire any slots.

You could also make it return the slot count and add that to the total
in the caller: this function doesn't seem to rely on the previous count,
so there is not much point in passing in *count.

> +			    struct page *page, int *copy_off,
> +			    unsigned long size, unsigned long offset,
> +			    int *head, int *count)
>  {
> -	unsigned int count;
> -	int i, copy_off;
> +	unsigned long bytes;
>  
> -	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> +	/* Data must not cross a page boundary. */
> +	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
>  
> -	copy_off = skb_headlen(skb) % PAGE_SIZE;
> +	/* Skip unused frames from start of page */
> +	page += offset >> PAGE_SHIFT;
> +	offset &= ~PAGE_MASK;
>  
> -	if (skb_shinfo(skb)->gso_size)
> -		count++;
> +	while (size > 0) {
> +		BUG_ON(offset >= PAGE_SIZE);
> +		BUG_ON(*copy_off > MAX_BUFFER_OFFSET);
>  
> -	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> -		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> -		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
> -		unsigned long bytes;
> +		bytes = PAGE_SIZE - offset;
>  
> -		offset &= ~PAGE_MASK;
> +		if (bytes > size)
> +			bytes = size;
>  
> -		while (size > 0) {
> -			BUG_ON(offset >= PAGE_SIZE);
> -			BUG_ON(copy_off > MAX_BUFFER_OFFSET);
> +		if (start_new_rx_buffer(*copy_off, bytes, *head)) {
> +			*count = *count + 1;
> +			*copy_off = 0;
> +		}
>  
> -			bytes = PAGE_SIZE - offset;
> +		if (*copy_off + bytes > MAX_BUFFER_OFFSET)
> +			bytes = MAX_BUFFER_OFFSET - *copy_off;
>  
> -			if (bytes > size)
> -				bytes = size;
> +		*copy_off += bytes;
>  
> -			if (start_new_rx_buffer(copy_off, bytes, 0)) {
> -				count++;
> -				copy_off = 0;
> -			}
> +		offset += bytes;
> +		size -= bytes;
>  
> -			if (copy_off + bytes > MAX_BUFFER_OFFSET)
> -				bytes = MAX_BUFFER_OFFSET - copy_off;
> +		/* Next frame */
> +		if (offset == PAGE_SIZE && size) {
> +			BUG_ON(!PageCompound(page));
> +			page++;
> +			offset = 0;
> +		}
>  
> -			copy_off += bytes;
> +		if (*head)

This seems to differ from netbk_gop_frag_copy. In that function it's
like

    if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)


Wei.

> +			*count = *count + 1;
> +		*head = 0; /* There must be something in this buffer now. */
> +	}
> +}
> +
> +unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> +{
> +	int i, copy_off = 0;
> +	int nr_frags = skb_shinfo(skb)->nr_frags;
> +	unsigned char *data;
> +	int head = 1;
> +	unsigned int count = 0;
>  
> -			offset += bytes;
> -			size -= bytes;
> +	if (skb_shinfo(skb)->gso_size)
> +		count++;
>  
> -			if (offset == PAGE_SIZE)
> -				offset = 0;
> -		}
> +	data = skb->data;
> +	while (data < skb_tail_pointer(skb)) {
> +		unsigned int offset = offset_in_page(data);
> +		unsigned int len = PAGE_SIZE - offset;
> +
> +		if (data + len > skb_tail_pointer(skb))
> +			len = skb_tail_pointer(skb) - data;
> +
> +		netbk_get_slots(vif, skb, virt_to_page(data), &copy_off,
> +				len, offset, &head, &count);
> +		data += len;
> +	}
> +
> +	for (i = 0; i < nr_frags; i++) {
> +		netbk_get_slots(vif, skb,
> +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
> +				&copy_off,
> +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
> +				skb_shinfo(skb)->frags[i].page_offset,
> +				&head, &count);
>  	}
> +
>  	return count;
>  }
>  
> -- 
> 1.7.3.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 1/1] xen/netback: correctly calculate required slots of skb.
  2013-07-09  6:15 [PATCH 1/1] xen/netback: correctly calculate required slots of skb Annie Li
@ 2013-07-09 22:14   ` Matt Wilson
  2013-07-09 22:14   ` Matt Wilson
  1 sibling, 0 replies; 38+ messages in thread
From: Matt Wilson @ 2013-07-09 22:14 UTC (permalink / raw)
  To: Annie Li; +Cc: xen-devel, netdev, Ian.Campbell, wei.liu2

On Tue, Jul 09, 2013 at 02:15:20PM +0800, Annie Li wrote:
> When counting the slots required for an skb, netback uses DIV_ROUND_UP
> directly to get the number of slots required by the header data. This is
> wrong when the offset of the header data within its page is not zero, and
> it is also inconsistent with the subsequent slot calculation in
> netbk_gop_skb.
> 
> In netbk_gop_skb, the required slots are calculated from the offset and
> length of the header data within its page. The slot count computed there
> can be larger than the one calculated earlier in netbk_count_requests.
> This inconsistency makes the rx_req_cons_peek and xen_netbk_rx_ring_full
> judgements wrong.
> 
> The ring can then be actually full while netback believes it is not and
> keeps creating responses. A response then overlaps a request in the ring,
> grant copy picks up a bogus grant reference and reports an error such as
> "(XEN) grant_table.c:1763:d0 Bad grant reference 2949120" (the grant
> reference is an invalid value here). Netback returns XEN_NETIF_RSP_ERROR
> (-1) to netfront when the grant copy status indicates an error; netfront
> then reads rx->status (now -1 rather than a data size) and reports
> "kernel: net eth1: rx->offset: 0, size: 4294967295". The issue can be
> reproduced by running gzip/gunzip on an NFS share with mtu = 9000; the
> guest panics after running such a test for a while.

I have a similar (but smaller) patch in my queue that Wei looked over
at the Dublin hackathon. I don't have time to rebase and test it right
now, but let me post it for a second data point.

--msw

> This patch is based on 3.10-rc7.
> 
> Signed-off-by: Annie Li <annie.li@oracle.com>
> ---
>  drivers/net/xen-netback/netback.c |   97 +++++++++++++++++++++++++-----------
>  1 files changed, 67 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 8c20935..7ff9333 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -359,51 +359,88 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
>   * the guest. This function is essentially a dry run of
>   * netbk_gop_frag_copy.
>   */
> -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> +static void netbk_get_slots(struct xenvif *vif, struct sk_buff *skb,
> +			    struct page *page, int *copy_off,
> +			    unsigned long size, unsigned long offset,
> +			    int *head, int *count)
>  {
> -	unsigned int count;
> -	int i, copy_off;
> +	unsigned long bytes;
>  
> -	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> +	/* Data must not cross a page boundary. */
> +	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
>  
> -	copy_off = skb_headlen(skb) % PAGE_SIZE;
> +	/* Skip unused frames from start of page */
> +	page += offset >> PAGE_SHIFT;
> +	offset &= ~PAGE_MASK;
>  
> -	if (skb_shinfo(skb)->gso_size)
> -		count++;
> +	while (size > 0) {
> +		BUG_ON(offset >= PAGE_SIZE);
> +		BUG_ON(*copy_off > MAX_BUFFER_OFFSET);
>  
> -	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> -		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> -		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
> -		unsigned long bytes;
> +		bytes = PAGE_SIZE - offset;
>  
> -		offset &= ~PAGE_MASK;
> +		if (bytes > size)
> +			bytes = size;
>  
> -		while (size > 0) {
> -			BUG_ON(offset >= PAGE_SIZE);
> -			BUG_ON(copy_off > MAX_BUFFER_OFFSET);
> +		if (start_new_rx_buffer(*copy_off, bytes, *head)) {
> +			*count = *count + 1;
> +			*copy_off = 0;
> +		}
>  
> -			bytes = PAGE_SIZE - offset;
> +		if (*copy_off + bytes > MAX_BUFFER_OFFSET)
> +			bytes = MAX_BUFFER_OFFSET - *copy_off;
>  
> -			if (bytes > size)
> -				bytes = size;
> +		*copy_off += bytes;
>  
> -			if (start_new_rx_buffer(copy_off, bytes, 0)) {
> -				count++;
> -				copy_off = 0;
> -			}
> +		offset += bytes;
> +		size -= bytes;
>  
> -			if (copy_off + bytes > MAX_BUFFER_OFFSET)
> -				bytes = MAX_BUFFER_OFFSET - copy_off;
> +		/* Next frame */
> +		if (offset == PAGE_SIZE && size) {
> +			BUG_ON(!PageCompound(page));
> +			page++;
> +			offset = 0;
> +		}
>  
> -			copy_off += bytes;
> +		if (*head)
> +			*count = *count + 1;
> +		*head = 0; /* There must be something in this buffer now. */
> +	}
> +}
> +
> +unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> +{
> +	int i, copy_off = 0;
> +	int nr_frags = skb_shinfo(skb)->nr_frags;
> +	unsigned char *data;
> +	int head = 1;
> +	unsigned int count = 0;
>  
> -			offset += bytes;
> -			size -= bytes;
> +	if (skb_shinfo(skb)->gso_size)
> +		count++;
>  
> -			if (offset == PAGE_SIZE)
> -				offset = 0;
> -		}
> +	data = skb->data;
> +	while (data < skb_tail_pointer(skb)) {
> +		unsigned int offset = offset_in_page(data);
> +		unsigned int len = PAGE_SIZE - offset;
> +
> +		if (data + len > skb_tail_pointer(skb))
> +			len = skb_tail_pointer(skb) - data;
> +
> +		netbk_get_slots(vif, skb, virt_to_page(data), &copy_off,
> +				len, offset, &head, &count);
> +		data += len;
> +	}
> +
> +	for (i = 0; i < nr_frags; i++) {
> +		netbk_get_slots(vif, skb,
> +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
> +				&copy_off,
> +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
> +				skb_shinfo(skb)->frags[i].page_offset,
> +				&head, &count);
>  	}
> +
>  	return count;
>  }
>  


* [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-09 22:14   ` Matt Wilson
@ 2013-07-09 22:40   ` Matt Wilson
  2013-07-10  8:13     ` Wei Liu
  2013-07-10  8:13     ` Wei Liu
  -1 siblings, 2 replies; 38+ messages in thread
From: Matt Wilson @ 2013-07-09 22:40 UTC (permalink / raw)
  Cc: Xi Xiong, Matt Wilson, Annie Li, Wei Liu, Ian Campbell, netdev,
	xen-devel

From: Xi Xiong <xixiong@amazon.com>

[ note: I've just cherry picked this onto net-next, and only compile
  tested. This a RFC only. -msw ]

Currently the number of RX slots required to transmit an skb to
xen-netfront can be miscalculated when an interface uses an MTU larger
than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
pause the queue indefinitely or reuse slots. The former manifests as a
loss of connectivity to the guest (which can be restored by lowering
the MTU set on the interface). The latter manifests with "Bad grant
reference" messages from Xen such as:

(XEN) grant_table.c:1797:d0 Bad grant reference 264241157

and kernel messages within the guest such as:

[  180.419567] net eth0: Invalid extra type: 112
[  180.868620] net eth0: rx->offset: 0, size: 4294967295
[  180.868629] net eth0: rx->offset: 0, size: 4294967295

BUG_ON() assertions can also be hit if RX slots are exhausted while
handling an skb.

This patch changes xen_netbk_rx_action() to count the number of RX
slots actually consumed by netbk_gop_skb() instead of using nr_frags + 1.
This prevents under-counting the number of RX slots consumed when an
skb has a large linear buffer.

Additionally, we now store the estimated number of RX slots required
to handle an skb in the cb overlay. This value is used to determine
whether the next skb in the queue can be processed.

Finally, the logic in start_new_rx_buffer() can cause RX slots to be
wasted when setting up copy grant table operations for skbs with large
linear buffers. For example, an skb with skb_headlen() equal to 8157
bytes that starts 64 bytes from the start of a page will consume three
RX slots instead of two. This patch changes the "head" parameter of
netbk_gop_frag_copy() to act as a flag. When set,
start_new_rx_buffer() will always place as much data as possible into
each RX slot.

Signed-off-by: Xi Xiong <xixiong@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[ msw: minor code cleanups, rewrote commit message, adjusted code
  to count RX slots instead of meta structures ]
Signed-off-by: Matt Wilson <msw@amazon.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: netdev@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
---
 drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
 1 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 64828de..82dd207 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -110,6 +110,11 @@ union page_ext {
 	void *mapping;
 };
 
+struct skb_cb_overlay {
+	int meta_slots_used;
+	int peek_slots_count;
+};
+
 struct xen_netbk {
 	wait_queue_head_t wq;
 	struct task_struct *task;
@@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
 {
 	unsigned int count;
 	int i, copy_off;
+	struct skb_cb_overlay *sco;
 
 	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
 
@@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
 				offset = 0;
 		}
 	}
+
+	sco = (struct skb_cb_overlay *) skb->cb;
+	sco->peek_slots_count = count;
 	return count;
 }
 
@@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
 }
 
 /*
- * Set up the grant operations for this fragment. If it's a flipping
- * interface, we also set up the unmap request from here.
+ * Set up the grant operations for this fragment.
  */
 static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 				struct netrx_pending_operations *npo,
 				struct page *page, unsigned long size,
-				unsigned long offset, int *head)
+				unsigned long offset, int head, int *first)
 {
 	struct gnttab_copy *copy_gop;
 	struct netbk_rx_meta *meta;
@@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		if (bytes > size)
 			bytes = size;
 
-		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
+		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
 			/*
 			 * Netfront requires there to be some data in the head
 			 * buffer.
 			 */
-			BUG_ON(*head);
+			BUG_ON(*first);
 
 			meta = get_next_rx_buffer(vif, npo);
 		}
@@ -529,10 +537,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
 		}
 
 		/* Leave a gap for the GSO descriptor. */
-		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
+		if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
 			vif->rx.req_cons++;
 
-		*head = 0; /* There must be something in this buffer now. */
+		*first = 0; /* There must be something in this buffer now. */
 
 	}
 }
@@ -558,7 +566,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
 	struct xen_netif_rx_request *req;
 	struct netbk_rx_meta *meta;
 	unsigned char *data;
-	int head = 1;
+	int first = 1;
 	int old_meta_prod;
 
 	old_meta_prod = npo->meta_prod;
@@ -594,16 +602,16 @@ static int netbk_gop_skb(struct sk_buff *skb,
 			len = skb_tail_pointer(skb) - data;
 
 		netbk_gop_frag_copy(vif, skb, npo,
-				    virt_to_page(data), len, offset, &head);
+				virt_to_page(data), len, offset, 1, &first);
 		data += len;
 	}
 
 	for (i = 0; i < nr_frags; i++) {
 		netbk_gop_frag_copy(vif, skb, npo,
-				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
-				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
-				    skb_shinfo(skb)->frags[i].page_offset,
-				    &head);
+				skb_frag_page(&skb_shinfo(skb)->frags[i]),
+				skb_frag_size(&skb_shinfo(skb)->frags[i]),
+				skb_shinfo(skb)->frags[i].page_offset,
+				0, &first);
 	}
 
 	return npo->meta_prod - old_meta_prod;
@@ -661,10 +669,6 @@ static void netbk_add_frag_responses(struct xenvif *vif, int status,
 	}
 }
 
-struct skb_cb_overlay {
-	int meta_slots_used;
-};
-
 static void xen_netbk_rx_action(struct xen_netbk *netbk)
 {
 	struct xenvif *vif = NULL, *tmp;
@@ -690,19 +694,26 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
 	count = 0;
 
 	while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) {
+		RING_IDX old_rx_req_cons;
+ 
 		vif = netdev_priv(skb->dev);
 		nr_frags = skb_shinfo(skb)->nr_frags;
 
+		old_rx_req_cons = vif->rx.req_cons;
 		sco = (struct skb_cb_overlay *)skb->cb;
 		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
 
-		count += nr_frags + 1;
+		count += vif->rx.req_cons - old_rx_req_cons;
 
 		__skb_queue_tail(&rxq, skb);
 
+		skb = skb_peek(&netbk->rx_queue);
+		if (skb == NULL)
+			break;
+		sco = (struct skb_cb_overlay *) skb->cb;
+
 		/* Filled the batch queue? */
-		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
-		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
+		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
 			break;
 	}
 
-- 
1.7.4.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-09 22:14   ` Matt Wilson
  (?)
@ 2013-07-09 22:40   ` Matt Wilson
  -1 siblings, 0 replies; 38+ messages in thread
From: Matt Wilson @ 2013-07-09 22:40 UTC (permalink / raw)
  Cc: Wei Liu, Ian Campbell, netdev, Annie Li, Matt Wilson, xen-devel,
	Xi Xiong


* Re: [PATCH 1/1] xen/netback: correctly calculate required slots of skb.
  2013-07-09 16:16   ` Wei Liu
  (?)
@ 2013-07-10  1:57   ` annie li
  -1 siblings, 0 replies; 38+ messages in thread
From: annie li @ 2013-07-10  1:57 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Ian.Campbell, konrad.wilk


On 2013-7-10 0:16, Wei Liu wrote:
> On Tue, Jul 09, 2013 at 02:15:20PM +0800, Annie Li wrote:
>> When counting required slots for skb, netback directly uses DIV_ROUND_UP to get
>> slots required by header data. This is wrong when offset in the page of header
>> data is not zero, and is also inconsistent with following calculation for
>> required slot in netbk_gop_skb.
>>
>> In netbk_gop_skb, required slots are calculated based on offset and len in page
>> of header data. It is possible that required slots here is larger than the one
>> calculated in earlier netbk_count_requests. This inconsistency directly results
>> in rx_req_cons_peek and xen_netbk_rx_ring_full judgement are wrong.
>>
>> Then it comes to situation the ring is actually full, but netback thinks it is
>> not and continues to create responses. This results in response overlaps request
>> in the ring, then grantcopy gets wrong grant reference and throws out error,
>> for example "(XEN) grant_table.c:1763:d0 Bad grant reference 2949120", the
>> grant reference is invalid value here. Netback returns XEN_NETIF_RSP_ERROR(-1)
>> to netfront when grant copy status is error, then netfront gets rx->status
>> (the status is -1, not really data size now), and throws out error,
>> "kernel: net eth1: rx->offset: 0, size: 4294967295". This issue can be reproduced
>> by doing gzip/gunzip in nfs share with mtu = 9000, the guest would panic after
>> running such test for a while.
>>
>> This patch is based on 3.10-rc7.
>>
>> Signed-off-by: Annie Li <annie.li@oracle.com>
>> ---
>>   drivers/net/xen-netback/netback.c |   97 +++++++++++++++++++++++++-----------
>>   1 files changed, 67 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 8c20935..7ff9333 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -359,51 +359,88 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
>>    * the guest. This function is essentially a dry run of
>>    * netbk_gop_frag_copy.
>>    */
> Just to be clear, now the above comment applies to your new function,
> right?

Oh, this should apply to xen_netbk_count_skb_slots instead of this one.

>
>> -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>> +static void netbk_get_slots(struct xenvif *vif, struct sk_buff *skb,
> A better name would be netbk_count_slots because it doesn't actually
> acquire any slots.

OK, this looks fine.

>
> And you can make it return the slot count and adds that to the total
> number of slots in the caller, as this function doesn't seem to rely on
> previous count so there is not much point passing in *count.

Correct!

>
>> +			    struct page *page, int *copy_off,
>> +			    unsigned long size, unsigned long offset,
>> +			    int *head, int *count)
>>   {
>> -	unsigned int count;
>> -	int i, copy_off;
>> +	unsigned long bytes;
>>   
>> -	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>> +	/* Data must not cross a page boundary. */
>> +	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
>>   
>> -	copy_off = skb_headlen(skb) % PAGE_SIZE;
>> +	/* Skip unused frames from start of page */
>> +	page += offset >> PAGE_SHIFT;
>> +	offset &= ~PAGE_MASK;
>>   
>> -	if (skb_shinfo(skb)->gso_size)
>> -		count++;
>> +	while (size > 0) {
>> +		BUG_ON(offset >= PAGE_SIZE);
>> +		BUG_ON(*copy_off > MAX_BUFFER_OFFSET);
>>   
>> -	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
>> -		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>> -		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
>> -		unsigned long bytes;
>> +		bytes = PAGE_SIZE - offset;
>>   
>> -		offset &= ~PAGE_MASK;
>> +		if (bytes > size)
>> +			bytes = size;
>>   
>> -		while (size > 0) {
>> -			BUG_ON(offset >= PAGE_SIZE);
>> -			BUG_ON(copy_off > MAX_BUFFER_OFFSET);
>> +		if (start_new_rx_buffer(*copy_off, bytes, *head)) {
>> +			*count = *count + 1;
>> +			*copy_off = 0;
>> +		}
>>   
>> -			bytes = PAGE_SIZE - offset;
>> +		if (*copy_off + bytes > MAX_BUFFER_OFFSET)
>> +			bytes = MAX_BUFFER_OFFSET - *copy_off;
>>   
>> -			if (bytes > size)
>> -				bytes = size;
>> +		*copy_off += bytes;
>>   
>> -			if (start_new_rx_buffer(copy_off, bytes, 0)) {
>> -				count++;
>> -				copy_off = 0;
>> -			}
>> +		offset += bytes;
>> +		size -= bytes;
>>   
>> -			if (copy_off + bytes > MAX_BUFFER_OFFSET)
>> -				bytes = MAX_BUFFER_OFFSET - copy_off;
>> +		/* Next frame */
>> +		if (offset == PAGE_SIZE && size) {
>> +			BUG_ON(!PageCompound(page));
>> +			page++;
>> +			offset = 0;
>> +		}
>>   
>> -			copy_off += bytes;
>> +		if (*head)
> This seems to differ from netbk_gop_frag_copy. In that function it's
> like
>
>      if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)

Sorry, my original patch is based on an older version which does not have
this; I need to change that.

Thanks
Annie
>
> Wei.
>
>> +			*count = *count + 1;
>> +		*head = 0; /* There must be something in this buffer now. */
>> +	}
>> +}
>> +
>> +unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>> +{
>> +	int i, copy_off = 0;
>> +	int nr_frags = skb_shinfo(skb)->nr_frags;
>> +	unsigned char *data;
>> +	int head = 1;
>> +	unsigned int count = 0;
>>   
>> -			offset += bytes;
>> -			size -= bytes;
>> +	if (skb_shinfo(skb)->gso_size)
>> +		count++;
>>   
>> -			if (offset == PAGE_SIZE)
>> -				offset = 0;
>> -		}
>> +	data = skb->data;
>> +	while (data < skb_tail_pointer(skb)) {
>> +		unsigned int offset = offset_in_page(data);
>> +		unsigned int len = PAGE_SIZE - offset;
>> +
>> +		if (data + len > skb_tail_pointer(skb))
>> +			len = skb_tail_pointer(skb) - data;
>> +
>> +		netbk_get_slots(vif, skb, virt_to_page(data), &copy_off,
>> +				len, offset, &head, &count);
>> +		data += len;
>> +	}
>> +
>> +	for (i = 0; i < nr_frags; i++) {
>> +		netbk_get_slots(vif, skb,
>> +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
>> +				&copy_off,
>> +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
>> +				skb_shinfo(skb)->frags[i].page_offset,
>> +				&head, &count);
>>   	}
>> +
>>   	return count;
>>   }
>>   
>> -- 
>> 1.7.3.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 1/1] xen/netback: correctly calculate required slots of skb.
  2013-07-09 16:16   ` Wei Liu
  (?)
  (?)
@ 2013-07-10  2:31   ` annie li
  2013-07-10  7:48       ` Wei Liu
  -1 siblings, 1 reply; 38+ messages in thread
From: annie li @ 2013-07-10  2:31 UTC (permalink / raw)
  To: Wei Liu; +Cc: netdev, xen-devel, Ian.Campbell


On 2013-7-10 0:16, Wei Liu wrote:
> On Tue, Jul 09, 2013 at 02:15:20PM +0800, Annie Li wrote:
>> When counting required slots for skb, netback directly uses DIV_ROUND_UP to get
>> slots required by header data. This is wrong when offset in the page of header
>> data is not zero, and is also inconsistent with following calculation for
>> required slot in netbk_gop_skb.
>>
>> In netbk_gop_skb, required slots are calculated based on offset and len in page
>> of header data. It is possible that required slots here is larger than the one
>> calculated in earlier netbk_count_requests. This inconsistency directly results
>> in rx_req_cons_peek and xen_netbk_rx_ring_full judgement are wrong.
>>
>> Then it comes to situation the ring is actually full, but netback thinks it is
>> not and continues to create responses. This results in response overlaps request
>> in the ring, then grantcopy gets wrong grant reference and throws out error,
>> for example "(XEN) grant_table.c:1763:d0 Bad grant reference 2949120", the
>> grant reference is invalid value here. Netback returns XEN_NETIF_RSP_ERROR(-1)
>> to netfront when grant copy status is error, then netfront gets rx->status
>> (the status is -1, not really data size now), and throws out error,
>> "kernel: net eth1: rx->offset: 0, size: 4294967295". This issue can be reproduced
>> by doing gzip/gunzip in nfs share with mtu = 9000, the guest would panic after
>> running such test for a while.
>>
>> This patch is based on 3.10-rc7.
>>
>> Signed-off-by: Annie Li <annie.li@oracle.com>
>> ---
>>   drivers/net/xen-netback/netback.c |   97 +++++++++++++++++++++++++-----------
>>   1 files changed, 67 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 8c20935..7ff9333 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -359,51 +359,88 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
>>    * the guest. This function is essentially a dry run of
>>    * netbk_gop_frag_copy.
>>    */
> Just to be clear, now the above comment applies to your new function,
> right?
>
>> -unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>> +static void netbk_get_slots(struct xenvif *vif, struct sk_buff *skb,
> A better name would be netbk_count_slots because it doesn't actually
> acquire any slots.
>
> And you can make it return the slot count and adds that to the total
> number of slots in the caller, as this function doesn't seem to rely on
> previous count so there is not much point passing in *count.
>
>> +			    struct page *page, int *copy_off,
>> +			    unsigned long size, unsigned long offset,
>> +			    int *head, int *count)
>>   {
>> -	unsigned int count;
>> -	int i, copy_off;
>> +	unsigned long bytes;
>>   
>> -	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>> +	/* Data must not cross a page boundary. */
>> +	BUG_ON(size + offset > PAGE_SIZE<<compound_order(page));
>>   
>> -	copy_off = skb_headlen(skb) % PAGE_SIZE;
>> +	/* Skip unused frames from start of page */
>> +	page += offset >> PAGE_SHIFT;
>> +	offset &= ~PAGE_MASK;
>>   
>> -	if (skb_shinfo(skb)->gso_size)
>> -		count++;
>> +	while (size > 0) {
>> +		BUG_ON(offset >= PAGE_SIZE);
>> +		BUG_ON(*copy_off > MAX_BUFFER_OFFSET);
>>   
>> -	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
>> -		unsigned long size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>> -		unsigned long offset = skb_shinfo(skb)->frags[i].page_offset;
>> -		unsigned long bytes;
>> +		bytes = PAGE_SIZE - offset;
>>   
>> -		offset &= ~PAGE_MASK;
>> +		if (bytes > size)
>> +			bytes = size;
>>   
>> -		while (size > 0) {
>> -			BUG_ON(offset >= PAGE_SIZE);
>> -			BUG_ON(copy_off > MAX_BUFFER_OFFSET);
>> +		if (start_new_rx_buffer(*copy_off, bytes, *head)) {
>> +			*count = *count + 1;
>> +			*copy_off = 0;
>> +		}
>>   
>> -			bytes = PAGE_SIZE - offset;
>> +		if (*copy_off + bytes > MAX_BUFFER_OFFSET)
>> +			bytes = MAX_BUFFER_OFFSET - *copy_off;
>>   
>> -			if (bytes > size)
>> -				bytes = size;
>> +		*copy_off += bytes;
>>   
>> -			if (start_new_rx_buffer(copy_off, bytes, 0)) {
>> -				count++;
>> -				copy_off = 0;
>> -			}
>> +		offset += bytes;
>> +		size -= bytes;
>>   
>> -			if (copy_off + bytes > MAX_BUFFER_OFFSET)
>> -				bytes = MAX_BUFFER_OFFSET - copy_off;
>> +		/* Next frame */
>> +		if (offset == PAGE_SIZE && size) {
>> +			BUG_ON(!PageCompound(page));
>> +			page++;
>> +			offset = 0;
>> +		}
>>   
>> -			copy_off += bytes;
>> +		if (*head)
> This seems to differ from netbk_gop_frag_copy. In that function it's
> like
>
>      if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)

I mixed up netbk_gop_frag_copy and xen_netbk_count_skb_slots just now, so
please ignore the comments in my last email.
In xen_netbk_count_skb_slots, we have the following code already, so it is
not necessary to add the gso condition here.

         if (skb_shinfo(skb)->gso_size)
                 count++;


Thanks
Annie
>
> Wei.
>
>> +			*count = *count + 1;
>> +		*head = 0; /* There must be something in this buffer now. */
>> +	}
>> +}
>> +
>> +unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>> +{
>> +	int i, copy_off = 0;
>> +	int nr_frags = skb_shinfo(skb)->nr_frags;
>> +	unsigned char *data;
>> +	int head = 1;
>> +	unsigned int count = 0;
>>   
>> -			offset += bytes;
>> -			size -= bytes;
>> +	if (skb_shinfo(skb)->gso_size)
>> +		count++;
>>   
>> -			if (offset == PAGE_SIZE)
>> -				offset = 0;
>> -		}
>> +	data = skb->data;
>> +	while (data < skb_tail_pointer(skb)) {
>> +		unsigned int offset = offset_in_page(data);
>> +		unsigned int len = PAGE_SIZE - offset;
>> +
>> +		if (data + len > skb_tail_pointer(skb))
>> +			len = skb_tail_pointer(skb) - data;
>> +
>> +		netbk_get_slots(vif, skb, virt_to_page(data), &copy_off,
>> +				len, offset, &head, &count);
>> +		data += len;
>> +	}
>> +
>> +	for (i = 0; i < nr_frags; i++) {
>> +		netbk_get_slots(vif, skb,
>> +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
>> +				&copy_off,
>> +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
>> +				skb_shinfo(skb)->frags[i].page_offset,
>> +				&head, &count);
>>   	}
>> +
>>   	return count;
>>   }
>>   
>> -- 
>> 1.7.3.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 1/1] xen/netback: correctly calculate required slots of skb.
  2013-07-10  2:31   ` annie li
@ 2013-07-10  7:48       ` Wei Liu
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Liu @ 2013-07-10  7:48 UTC (permalink / raw)
  To: annie li; +Cc: Wei Liu, xen-devel, netdev, Ian.Campbell, konrad.wilk

On Wed, Jul 10, 2013 at 10:31:49AM +0800, annie li wrote:
[...]
> >This seems to differ from netbk_gop_frag_copy. In that function it's
> >like
> >
> >     if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> 
> I mixed netbk_gop_frag_copy and xen_netbk_count_skb_slots just now,
> so please ignore the comments here in my last email.
> In xen_netbk_count_skb_slots, we have following code already, so it
> is not necessary to add gso condition here.
> 
>         if (skb_shinfo(skb)->gso_size)
>                 count++;
> 

Oh I see. But the condition still seems to be different from the
original one -- you missed !vif->gso_prefix.


Wei.



* Re: [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-09 22:40   ` Matt Wilson
@ 2013-07-10  8:13     ` Wei Liu
  2013-07-10 19:37       ` Wei Liu
  2013-07-10 19:37       ` Wei Liu
  2013-07-10  8:13     ` Wei Liu
  1 sibling, 2 replies; 38+ messages in thread
From: Wei Liu @ 2013-07-10  8:13 UTC (permalink / raw)
  To: Matt Wilson; +Cc: Xi Xiong, Annie Li, Wei Liu, Ian Campbell, netdev, xen-devel

On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
> From: Xi Xiong <xixiong@amazon.com>
> 
> [ note: I've just cherry picked this onto net-next, and only compile
>   tested. This a RFC only. -msw ]
> 

Should probably rebase it on net.git because it is a bug fix. Let's
worry about that later...

> Currently the number of RX slots required to transmit a SKB to
> xen-netfront can be miscalculated when an interface uses a MTU larger
> than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
> pause the queue indefinitely or reuse slots. The former manifests as a
> loss of connectivity to the guest (which can be restored by lowering
> the MTU set on the interface). The latter manifests with "Bad grant
> reference" messages from Xen such as:
> 
> (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
> 
> and kernel messages within the guest such as:
> 
> [  180.419567] net eth0: Invalid extra type: 112
> [  180.868620] net eth0: rx->offset: 0, size: 4294967295
> [  180.868629] net eth0: rx->offset: 0, size: 4294967295
> 
> BUG_ON() assertions can also be hit if RX slots are exhausted while
> handling a SKB.
> 
> This patch changes xen_netbk_rx_action() to count the number of RX
> slots actually consumed by netbk_gop_skb() instead of using nr_frags + 1.
> This prevents under-counting the number of RX slots consumed when a
> SKB has a large linear buffer.
> 
> Additionally, we now store the estimated number of RX slots required
> to handle a SKB in the cb overlay. This value is used to determine if
> the next SKB in the queue can be processed.
> 
> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
> wasted when setting up copy grant table operations for SKBs with large
> linear buffers. For example, a SKB with skb_headlen() equal to 8157
> bytes that starts 64 bytes 64 bytes from the start of the page will

Duplicated "64 bytes". And this change looks like an improvement, not a
bug fix. Probably submit a separate patch for this?

> consume three RX slots instead of two. This patch changes the "head"
> parameter to netbk_gop_frag_copy() to act as a flag. When set,
> start_new_rx_buffer() will always place as much data as possible into
> each RX slot.
> 
> Signed-off-by: Xi Xiong <xixiong@amazon.com>
> Reviewed-by: Matt Wilson <msw@amazon.com>
> [ msw: minor code cleanups, rewrote commit message, adjusted code
>   to count RX slots instead of meta structures ]
> Signed-off-by: Matt Wilson <msw@amazon.com>
> Cc: Annie Li <annie.li@oracle.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Ian Campbell <Ian.Campbell@citrix.com>
> Cc: netdev@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org
> ---
>  drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
>  1 files changed, 31 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 64828de..82dd207 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -110,6 +110,11 @@ union page_ext {
>  	void *mapping;
>  };
>  
> +struct skb_cb_overlay {
> +	int meta_slots_used;
> +	int peek_slots_count;
> +};
> +
>  struct xen_netbk {
>  	wait_queue_head_t wq;
>  	struct task_struct *task;
> @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>  {
>  	unsigned int count;
>  	int i, copy_off;
> +	struct skb_cb_overlay *sco;
>  
>  	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>  
> @@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>  				offset = 0;
>  		}
>  	}
> +
> +	sco = (struct skb_cb_overlay *) skb->cb;
> +	sco->peek_slots_count = count;
>  	return count;
>  }
>  
> @@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
>  }
>  
>  /*
> - * Set up the grant operations for this fragment. If it's a flipping
> - * interface, we also set up the unmap request from here.
> + * Set up the grant operations for this fragment.
>   */
>  static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>  				struct netrx_pending_operations *npo,
>  				struct page *page, unsigned long size,
> -				unsigned long offset, int *head)
> +				unsigned long offset, int head, int *first)
>  {
>  	struct gnttab_copy *copy_gop;
>  	struct netbk_rx_meta *meta;
> @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>  		if (bytes > size)
>  			bytes = size;
>  
> -		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
> +		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
>  			/*
>  			 * Netfront requires there to be some data in the head
>  			 * buffer.
>  			 */
> -			BUG_ON(*head);
> +			BUG_ON(*first);
>  
>  			meta = get_next_rx_buffer(vif, npo);
>  		}
> @@ -529,10 +537,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>  		}
>  
>  		/* Leave a gap for the GSO descriptor. */
> -		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> +		if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
>  			vif->rx.req_cons++;
>  
> -		*head = 0; /* There must be something in this buffer now. */
> +		*first = 0; /* There must be something in this buffer now. */
>  
>  	}
>  }
> @@ -558,7 +566,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
>  	struct xen_netif_rx_request *req;
>  	struct netbk_rx_meta *meta;
>  	unsigned char *data;
> -	int head = 1;
> +	int first = 1;
>  	int old_meta_prod;
>  
>  	old_meta_prod = npo->meta_prod;
> @@ -594,16 +602,16 @@ static int netbk_gop_skb(struct sk_buff *skb,
>  			len = skb_tail_pointer(skb) - data;
>  
>  		netbk_gop_frag_copy(vif, skb, npo,
> -				    virt_to_page(data), len, offset, &head);
> +				virt_to_page(data), len, offset, 1, &first);
>  		data += len;
>  	}
>  
>  	for (i = 0; i < nr_frags; i++) {
>  		netbk_gop_frag_copy(vif, skb, npo,
> -				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
> -				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
> -				    skb_shinfo(skb)->frags[i].page_offset,
> -				    &head);
> +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
> +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
> +				skb_shinfo(skb)->frags[i].page_offset,
> +				0, &first);
>  	}
>  
>  	return npo->meta_prod - old_meta_prod;
> @@ -661,10 +669,6 @@ static void netbk_add_frag_responses(struct xenvif *vif, int status,
>  	}
>  }
>  
> -struct skb_cb_overlay {
> -	int meta_slots_used;
> -};
> -
>  static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  {
>  	struct xenvif *vif = NULL, *tmp;
> @@ -690,19 +694,26 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>  	count = 0;
>  
>  	while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) {
> +		RING_IDX old_rx_req_cons;
> + 
>  		vif = netdev_priv(skb->dev);
>  		nr_frags = skb_shinfo(skb)->nr_frags;
>  
> +		old_rx_req_cons = vif->rx.req_cons;
>  		sco = (struct skb_cb_overlay *)skb->cb;
>  		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
>  
> -		count += nr_frags + 1;
> +		count += vif->rx.req_cons - old_rx_req_cons;
>  
>  		__skb_queue_tail(&rxq, skb);
>  
> +		skb = skb_peek(&netbk->rx_queue);
> +		if (skb == NULL)
> +			break;
> +		sco = (struct skb_cb_overlay *) skb->cb;
> +
>  		/* Filled the batch queue? */
> -		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
> -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
> +		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
>  			break;

Using req_cons to count is OK, but since you've already stored the number
of slots in sco->peek_slots_count before actually queueing the packet,
why don't you use that directly?


Wei.

>  	}
>  
> -- 
> 1.7.4.5

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/1] xen/netback: correctly calculate required slots of skb.
  2013-07-10  7:48       ` Wei Liu
  (?)
@ 2013-07-10  8:34       ` annie li
  -1 siblings, 0 replies; 38+ messages in thread
From: annie li @ 2013-07-10  8:34 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, netdev, Ian.Campbell, konrad.wilk


On 2013-7-10 15:48, Wei Liu wrote:
> On Wed, Jul 10, 2013 at 10:31:49AM +0800, annie li wrote:
> [...]
>>> This seems to differ from netbk_gop_frag_copy. In that function it's
>>> like
>>>
>>>      if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
>> I mixed up netbk_gop_frag_copy and xen_netbk_count_skb_slots just now,
>> so please ignore the comments in my last email.
>> In xen_netbk_count_skb_slots, we already have the following code, so it
>> is not necessary to add the gso condition here.
>>
>>          if (skb_shinfo(skb)->gso_size)
>>                  count++;
>>
> Oh I see. But the condition still seems to be different from the
> original one -- you missed !vif->gso_prefix.

The original code in xen_netbk_count_skb_slots does not have
!vif->gso_prefix. But I think you are right; it is better to keep it
consistent with the check in netbk_gop_frag_copy.

Thanks
Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-10  8:13     ` Wei Liu
@ 2013-07-10 19:37       ` Wei Liu
  2013-07-11  1:25         ` annie li
                           ` (3 more replies)
  2013-07-10 19:37       ` Wei Liu
  1 sibling, 4 replies; 38+ messages in thread
From: Wei Liu @ 2013-07-10 19:37 UTC (permalink / raw)
  To: Matt Wilson; +Cc: Xi Xiong, Annie Li, Wei Liu, Ian Campbell, netdev, xen-devel

On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
> > From: Xi Xiong <xixiong@amazon.com>
> > 
> > [ note: I've just cherry picked this onto net-next, and only compile
> >   tested. This a RFC only. -msw ]
> > 
> 
> Should probably rebase it on net.git because it is a bug fix. Let's
> worry about that later...
> 
> > Currently the number of RX slots required to transmit a SKB to
> > xen-netfront can be miscalculated when an interface uses a MTU larger
> > than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
> > pause the queue indefinitely or reuse slots. The former manifests as a
> > loss of connectivity to the guest (which can be restored by lowering
> > the MTU set on the interface). The latter manifests with "Bad grant
> > reference" messages from Xen such as:
> > 
> > (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
> > 
> > and kernel messages within the guest such as:
> > 
> > [  180.419567] net eth0: Invalid extra type: 112
> > [  180.868620] net eth0: rx->offset: 0, size: 4294967295
> > [  180.868629] net eth0: rx->offset: 0, size: 4294967295
> > 
> > BUG_ON() assertions can also be hit if RX slots are exhausted while
> > handling a SKB.
> > 
> > This patch changes xen_netbk_rx_action() to count the number of RX
> > slots actually consumed by netbk_gop_skb() instead of using nr_frags + 1.
> > This prevents under-counting the number of RX slots consumed when a
> > SKB has a large linear buffer.
> > 
> > Additionally, we now store the estimated number of RX slots required
> > to handle a SKB in the cb overlay. This value is used to determine if
> > the next SKB in the queue can be processed.
> > 
> > Finally, the logic in start_new_rx_buffer() can cause RX slots to be
> > wasted when setting up copy grant table operations for SKBs with large
> > linear buffers. For example, a SKB with skb_headlen() equal to 8157
> > bytes that starts 64 bytes 64 bytes from the start of the page will
> 
> Duplicated "64 bytes". Also, this change looks like an improvement, not a
> bug fix. Probably best to submit a separate patch for it?
> 
> > consume three RX slots instead of two. This patch changes the "head"
> > parameter to netbk_gop_frag_copy() to act as a flag. When set,
> > start_new_rx_buffer() will always place as much data as possible into
> > each RX slot.
> > 
> > Signed-off-by: Xi Xiong <xixiong@amazon.com>
> > Reviewed-by: Matt Wilson <msw@amazon.com>
> > [ msw: minor code cleanups, rewrote commit message, adjusted code
> >   to count RX slots instead of meta structures ]
> > Signed-off-by: Matt Wilson <msw@amazon.com>
> > Cc: Annie Li <annie.li@oracle.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: Ian Campbell <Ian.Campbell@citrix.com>
> > Cc: netdev@vger.kernel.org
> > Cc: xen-devel@lists.xenproject.org
> > ---
> >  drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
> >  1 files changed, 31 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > index 64828de..82dd207 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -110,6 +110,11 @@ union page_ext {
> >  	void *mapping;
> >  };
> >  
> > +struct skb_cb_overlay {
> > +	int meta_slots_used;
> > +	int peek_slots_count;
> > +};
> > +
> >  struct xen_netbk {
> >  	wait_queue_head_t wq;
> >  	struct task_struct *task;
> > @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> >  {
> >  	unsigned int count;
> >  	int i, copy_off;
> > +	struct skb_cb_overlay *sco;
> >  
> >  	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> >  
> > @@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> >  				offset = 0;
> >  		}
> >  	}
> > +
> > +	sco = (struct skb_cb_overlay *) skb->cb;
> > +	sco->peek_slots_count = count;
> >  	return count;
> >  }
> >  
> > @@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
> >  }
> >  
> >  /*
> > - * Set up the grant operations for this fragment. If it's a flipping
> > - * interface, we also set up the unmap request from here.
> > + * Set up the grant operations for this fragment.
> >   */
> >  static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> >  				struct netrx_pending_operations *npo,
> >  				struct page *page, unsigned long size,
> > -				unsigned long offset, int *head)
> > +				unsigned long offset, int head, int *first)
> >  {
> >  	struct gnttab_copy *copy_gop;
> >  	struct netbk_rx_meta *meta;
> > @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> >  		if (bytes > size)
> >  			bytes = size;
> >  
> > -		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
> > +		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
> >  			/*
> >  			 * Netfront requires there to be some data in the head
> >  			 * buffer.
> >  			 */
> > -			BUG_ON(*head);
> > +			BUG_ON(*first);
> >  
> >  			meta = get_next_rx_buffer(vif, npo);
> >  		}
> > @@ -529,10 +537,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> >  		}
> >  
> >  		/* Leave a gap for the GSO descriptor. */
> > -		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> > +		if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> >  			vif->rx.req_cons++;
> >  
> > -		*head = 0; /* There must be something in this buffer now. */
> > +		*first = 0; /* There must be something in this buffer now. */
> >  
> >  	}
> >  }
> > @@ -558,7 +566,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
> >  	struct xen_netif_rx_request *req;
> >  	struct netbk_rx_meta *meta;
> >  	unsigned char *data;
> > -	int head = 1;
> > +	int first = 1;
> >  	int old_meta_prod;
> >  
> >  	old_meta_prod = npo->meta_prod;
> > @@ -594,16 +602,16 @@ static int netbk_gop_skb(struct sk_buff *skb,
> >  			len = skb_tail_pointer(skb) - data;
> >  
> >  		netbk_gop_frag_copy(vif, skb, npo,
> > -				    virt_to_page(data), len, offset, &head);
> > +				virt_to_page(data), len, offset, 1, &first);
> >  		data += len;
> >  	}
> >  
> >  	for (i = 0; i < nr_frags; i++) {
> >  		netbk_gop_frag_copy(vif, skb, npo,
> > -				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
> > -				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
> > -				    skb_shinfo(skb)->frags[i].page_offset,
> > -				    &head);
> > +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
> > +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
> > +				skb_shinfo(skb)->frags[i].page_offset,
> > +				0, &first);
> >  	}
> >  
> >  	return npo->meta_prod - old_meta_prod;
> > @@ -661,10 +669,6 @@ static void netbk_add_frag_responses(struct xenvif *vif, int status,
> >  	}
> >  }
> >  
> > -struct skb_cb_overlay {
> > -	int meta_slots_used;
> > -};
> > -
> >  static void xen_netbk_rx_action(struct xen_netbk *netbk)
> >  {
> >  	struct xenvif *vif = NULL, *tmp;
> > @@ -690,19 +694,26 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
> >  	count = 0;
> >  
> >  	while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) {
> > +		RING_IDX old_rx_req_cons;
> > + 
> >  		vif = netdev_priv(skb->dev);
> >  		nr_frags = skb_shinfo(skb)->nr_frags;
> >  
> > +		old_rx_req_cons = vif->rx.req_cons;
> >  		sco = (struct skb_cb_overlay *)skb->cb;
> >  		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
> >  
> > -		count += nr_frags + 1;
> > +		count += vif->rx.req_cons - old_rx_req_cons;
> >  
> >  		__skb_queue_tail(&rxq, skb);
> >  
> > +		skb = skb_peek(&netbk->rx_queue);
> > +		if (skb == NULL)
> > +			break;
> > +		sco = (struct skb_cb_overlay *) skb->cb;
> > +
> >  		/* Filled the batch queue? */
> > -		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
> > -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
> > +		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
> >  			break;
> 
> Using req_cons to count is OK, but since you've already stored the number
> of slots in sco->peek_slots_count before actually queueing the packet,
> why don't you use that directly?
> 

Ah, I must be off my head here. I mixed up several patches: 1) my RFC
patch that removes MAX_SKB_FRAGS in the RX path, 2) Annie's patch, and
3) your patch.

The reason your patch uses req_cons instead of the pre-calculated value
is that the value returned by xen_netbk_count_skb_slots is actually
*wrong* -- that's what Annie tried to fix in her patch.

After fixing xen_netbk_count_skb_slots, we would still need the above
snippet (or the snippet I proposed in my RFC patch) to prevent
overrunning the ring.

So a proper fix for this issue (RX slots not being counted correctly,
causing the ring to be overrun) would combine the two aspects above.

Comments?


Wei.

> 
> Wei.
> 
> >  	}
> >  
> > -- 
> > 1.7.4.5

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-10 19:37       ` Wei Liu
@ 2013-07-11  1:25         ` annie li
  2013-07-11  1:25         ` annie li
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 38+ messages in thread
From: annie li @ 2013-07-11  1:25 UTC (permalink / raw)
  To: Wei Liu; +Cc: Matt Wilson, Xi Xiong, Ian Campbell, netdev, xen-devel


On 2013-7-11 3:37, Wei Liu wrote:
> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
>> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
>>> From: Xi Xiong <xixiong@amazon.com>
>>>
>>> [ note: I've just cherry picked this onto net-next, and only compile
>>>    tested. This is an RFC only. -msw ]
>>>
>> Should probably rebase it on net.git because it is a bug fix. Let's
>> worry about that later...
>>
>>> Currently the number of RX slots required to transmit a SKB to
>>> xen-netfront can be miscalculated when an interface uses a MTU larger
>>> than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
>>> pause the queue indefinitely or reuse slots. The former manifests as a
>>> loss of connectivity to the guest (which can be restored by lowering
>>> the MTU set on the interface). The latter manifests with "Bad grant
>>> reference" messages from Xen such as:
>>>
>>> (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
>>>
>>> and kernel messages within the guest such as:
>>>
>>> [  180.419567] net eth0: Invalid extra type: 112
>>> [  180.868620] net eth0: rx->offset: 0, size: 4294967295
>>> [  180.868629] net eth0: rx->offset: 0, size: 4294967295
>>>
>>> BUG_ON() assertions can also be hit if RX slots are exhausted while
>>> handling a SKB.
>>>
>>> This patch changes xen_netbk_rx_action() to count the number of RX
>>> slots actually consumed by netbk_gop_skb() instead of using nr_frags + 1.
>>> This prevents under-counting the number of RX slots consumed when a
>>> SKB has a large linear buffer.
>>>
>>> Additionally, we now store the estimated number of RX slots required
>>> to handle a SKB in the cb overlay. This value is used to determine if
>>> the next SKB in the queue can be processed.
>>>
>>> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
>>> wasted when setting up copy grant table operations for SKBs with large
>>> linear buffers. For example, a SKB with skb_headlen() equal to 8157
>>> bytes that starts 64 bytes 64 bytes from the start of the page will
>> Duplicated "64 bytes". And this change looks like an improvement, not a
>> bug fix. Probably submit a separate patch for this?
>>
>>> consume three RX slots instead of two. This patch changes the "head"
>>> parameter to netbk_gop_frag_copy() to act as a flag. When set,
>>> start_new_rx_buffer() will always place as much data as possible into
>>> each RX slot.
>>>
>>> Signed-off-by: Xi Xiong <xixiong@amazon.com>
>>> Reviewed-by: Matt Wilson <msw@amazon.com>
>>> [ msw: minor code cleanups, rewrote commit message, adjusted code
>>>    to count RX slots instead of meta structures ]
>>> Signed-off-by: Matt Wilson <msw@amazon.com>
>>> Cc: Annie Li <annie.li@oracle.com>
>>> Cc: Wei Liu <wei.liu2@citrix.com>
>>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
>>> Cc: netdev@vger.kernel.org
>>> Cc: xen-devel@lists.xenproject.org
>>> ---
>>>   drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
>>>   1 files changed, 31 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>>> index 64828de..82dd207 100644
>>> --- a/drivers/net/xen-netback/netback.c
>>> +++ b/drivers/net/xen-netback/netback.c
>>> @@ -110,6 +110,11 @@ union page_ext {
>>>   	void *mapping;
>>>   };
>>>   
>>> +struct skb_cb_overlay {
>>> +	int meta_slots_used;
>>> +	int peek_slots_count;
>>> +};
>>> +
>>>   struct xen_netbk {
>>>   	wait_queue_head_t wq;
>>>   	struct task_struct *task;
>>> @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>   {
>>>   	unsigned int count;
>>>   	int i, copy_off;
>>> +	struct skb_cb_overlay *sco;
>>>   
>>>   	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>>>   
>>> @@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>   				offset = 0;
>>>   		}
>>>   	}
>>> +
>>> +	sco = (struct skb_cb_overlay *) skb->cb;
>>> +	sco->peek_slots_count = count;
>>>   	return count;
>>>   }
>>>   
>>> @@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
>>>   }
>>>   
>>>   /*
>>> - * Set up the grant operations for this fragment. If it's a flipping
>>> - * interface, we also set up the unmap request from here.
>>> + * Set up the grant operations for this fragment.
>>>    */
>>>   static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>   				struct netrx_pending_operations *npo,
>>>   				struct page *page, unsigned long size,
>>> -				unsigned long offset, int *head)
>>> +				unsigned long offset, int head, int *first)
>>>   {
>>>   	struct gnttab_copy *copy_gop;
>>>   	struct netbk_rx_meta *meta;
>>> @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>   		if (bytes > size)
>>>   			bytes = size;
>>>   
>>> -		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
>>> +		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
>>>   			/*
>>>   			 * Netfront requires there to be some data in the head
>>>   			 * buffer.
>>>   			 */
>>> -			BUG_ON(*head);
>>> +			BUG_ON(*first);
>>>   
>>>   			meta = get_next_rx_buffer(vif, npo);
>>>   		}
>>> @@ -529,10 +537,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>   		}
>>>   
>>>   		/* Leave a gap for the GSO descriptor. */
>>> -		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
>>> +		if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
>>>   			vif->rx.req_cons++;
>>>   
>>> -		*head = 0; /* There must be something in this buffer now. */
>>> +		*first = 0; /* There must be something in this buffer now. */
>>>   
>>>   	}
>>>   }
>>> @@ -558,7 +566,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
>>>   	struct xen_netif_rx_request *req;
>>>   	struct netbk_rx_meta *meta;
>>>   	unsigned char *data;
>>> -	int head = 1;
>>> +	int first = 1;
>>>   	int old_meta_prod;
>>>   
>>>   	old_meta_prod = npo->meta_prod;
>>> @@ -594,16 +602,16 @@ static int netbk_gop_skb(struct sk_buff *skb,
>>>   			len = skb_tail_pointer(skb) - data;
>>>   
>>>   		netbk_gop_frag_copy(vif, skb, npo,
>>> -				    virt_to_page(data), len, offset, &head);
>>> +				virt_to_page(data), len, offset, 1, &first);
>>>   		data += len;
>>>   	}
>>>   
>>>   	for (i = 0; i < nr_frags; i++) {
>>>   		netbk_gop_frag_copy(vif, skb, npo,
>>> -				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
>>> -				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
>>> -				    skb_shinfo(skb)->frags[i].page_offset,
>>> -				    &head);
>>> +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
>>> +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
>>> +				skb_shinfo(skb)->frags[i].page_offset,
>>> +				0, &first);
>>>   	}
>>>   
>>>   	return npo->meta_prod - old_meta_prod;
>>> @@ -661,10 +669,6 @@ static void netbk_add_frag_responses(struct xenvif *vif, int status,
>>>   	}
>>>   }
>>>   
>>> -struct skb_cb_overlay {
>>> -	int meta_slots_used;
>>> -};
>>> -
>>>   static void xen_netbk_rx_action(struct xen_netbk *netbk)
>>>   {
>>>   	struct xenvif *vif = NULL, *tmp;
>>> @@ -690,19 +694,26 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
>>>   	count = 0;
>>>   
>>>   	while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) {
>>> +		RING_IDX old_rx_req_cons;
>>> +
>>>   		vif = netdev_priv(skb->dev);
>>>   		nr_frags = skb_shinfo(skb)->nr_frags;
>>>   
>>> +		old_rx_req_cons = vif->rx.req_cons;
>>>   		sco = (struct skb_cb_overlay *)skb->cb;
>>>   		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
>>>   
>>> -		count += nr_frags + 1;
>>> +		count += vif->rx.req_cons - old_rx_req_cons;
>>>   
>>>   		__skb_queue_tail(&rxq, skb);
>>>   
>>> +		skb = skb_peek(&netbk->rx_queue);
>>> +		if (skb == NULL)
>>> +			break;
>>> +		sco = (struct skb_cb_overlay *) skb->cb;
>>> +
>>>   		/* Filled the batch queue? */
>>> -		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
>>> -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
>>> +		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
>>>   			break;
>> Using req_cons to count is OK, but since you've already stored the number
>> of slots in sco->peek_slots_count before actually queueing the packet,
>> why don't you use that directly?
>>
> Ah, I must be off my head here. I mixed several patches 1) my RFC patch
> that removes MAX_SKB_FRAGS in RX path, 2) Annie's patch and 3) your
> patch.
>
> The reason that your patch uses req_cons instead of the pre-calculated
> value is that the value returned by xen_netbk_count_skb_slots is actually
> *wrong* -- that's what Annie tried to fix in her patch.
>
> After fixing xen_netbk_skb_count_slots, we would need the above snippet
> (or the snippet I proposed in my RFC patch) to prevent overrunning the
> ring.
>
> So a proper patch to this issue (not counting RX slots correctly, causing
> overrun of the ring) would be the above two aspects combined.
>
> Comments?

I agree, two patches look fine:
xen_netbk_count_skb_slots returning a wrong value could be fixed by my
patch, and protection from overrunning the ring could be implemented by
Wei's/Matt's patch.

Thanks
Annie
>
>
> Wei.
>
>> Wei.
>>
>>>   	}
>>>   
>>> -- 
>>> 1.7.4.5
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread


* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-10 19:37       ` Wei Liu
  2013-07-11  1:25         ` annie li
  2013-07-11  1:25         ` annie li
@ 2013-07-11  5:14         ` Matt Wilson
  2013-07-11  6:01           ` annie li
                             ` (5 more replies)
  2013-07-11  5:14         ` Matt Wilson
  3 siblings, 6 replies; 38+ messages in thread
From: Matt Wilson @ 2013-07-11  5:14 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, netdev, Annie Li, xen-devel, Xi Xiong

On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
> > On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
> > > From: Xi Xiong <xixiong@amazon.com>
> > > 
> > > [ note: I've just cherry picked this onto net-next, and only compile
> > >   tested. This is an RFC only. -msw ]
> > > 
> > 
> > Should probably rebase it on net.git because it is a bug fix. Let's
> > worry about that later...

*nod*

> > > Currently the number of RX slots required to transmit a SKB to
> > > xen-netfront can be miscalculated when an interface uses a MTU larger
> > > than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
> > > pause the queue indefinitely or reuse slots. The former manifests as a
> > > loss of connectivity to the guest (which can be restored by lowering
> > > the MTU set on the interface). The latter manifests with "Bad grant
> > > reference" messages from Xen such as:
> > > 
> > > (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
> > > 
> > > and kernel messages within the guest such as:
> > > 
> > > [  180.419567] net eth0: Invalid extra type: 112
> > > [  180.868620] net eth0: rx->offset: 0, size: 4294967295
> > > [  180.868629] net eth0: rx->offset: 0, size: 4294967295
> > > 
> > > BUG_ON() assertions can also be hit if RX slots are exhausted while
> > > handling a SKB.
> > > 
> > > This patch changes xen_netbk_rx_action() to count the number of RX
> > > slots actually consumed by netbk_gop_skb() instead of using nr_frags + 1.
> > > This prevents under-counting the number of RX slots consumed when a
> > > SKB has a large linear buffer.
> > > 
> > > Additionally, we now store the estimated number of RX slots required
> > > to handle a SKB in the cb overlay. This value is used to determine if
> > > the next SKB in the queue can be processed.
> > > 
> > > Finally, the logic in start_new_rx_buffer() can cause RX slots to be
> > > wasted when setting up copy grant table operations for SKBs with large
> > > linear buffers. For example, a SKB with skb_headlen() equal to 8157
> > > bytes that starts 64 bytes 64 bytes from the start of the page will
> > 
> > Duplicated "64 bytes". And this change looks like an improvement, not a
> > bug fix. Probably submit a separate patch for this?

Argh, I knew it was in there somewhere (since you pointed it out in
Dublin :-).

Maybe it could be a separate patch. I think the description is also a
bit confusing. I'll work on rewording it.

> > > consume three RX slots instead of two. This patch changes the "head"
> > > parameter to netbk_gop_frag_copy() to act as a flag. When set,
> > > start_new_rx_buffer() will always place as much data as possible into
> > > each RX slot.
> > > 
> > > Signed-off-by: Xi Xiong <xixiong@amazon.com>
> > > Reviewed-by: Matt Wilson <msw@amazon.com>
> > > [ msw: minor code cleanups, rewrote commit message, adjusted code
> > >   to count RX slots instead of meta structures ]
> > > Signed-off-by: Matt Wilson <msw@amazon.com>
> > > Cc: Annie Li <annie.li@oracle.com>
> > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > Cc: Ian Campbell <Ian.Campbell@citrix.com>
> > > Cc: netdev@vger.kernel.org
> > > Cc: xen-devel@lists.xenproject.org
> > > ---
> > >  drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
> > >  1 files changed, 31 insertions(+), 20 deletions(-)
> > > 
> > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > > index 64828de..82dd207 100644
> > > --- a/drivers/net/xen-netback/netback.c
> > > +++ b/drivers/net/xen-netback/netback.c
> > > @@ -110,6 +110,11 @@ union page_ext {
> > >  	void *mapping;
> > >  };
> > >  
> > > +struct skb_cb_overlay {
> > > +	int meta_slots_used;
> > > +	int peek_slots_count;
> > > +};
> > > +
> > >  struct xen_netbk {
> > >  	wait_queue_head_t wq;
> > >  	struct task_struct *task;
> > > @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> > >  {
> > >  	unsigned int count;
> > >  	int i, copy_off;
> > > +	struct skb_cb_overlay *sco;
> > >  
> > >  	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> > >  
> > > @@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> > >  				offset = 0;
> > >  		}
> > >  	}
> > > +
> > > +	sco = (struct skb_cb_overlay *) skb->cb;
> > > +	sco->peek_slots_count = count;
> > >  	return count;
> > >  }
> > >  
> > > @@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
> > >  }
> > >  
> > >  /*
> > > - * Set up the grant operations for this fragment. If it's a flipping
> > > - * interface, we also set up the unmap request from here.
> > > + * Set up the grant operations for this fragment.
> > >   */
> > >  static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> > >  				struct netrx_pending_operations *npo,
> > >  				struct page *page, unsigned long size,
> > > -				unsigned long offset, int *head)
> > > +				unsigned long offset, int head, int *first)
> > >  {
> > >  	struct gnttab_copy *copy_gop;
> > >  	struct netbk_rx_meta *meta;
> > > @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> > >  		if (bytes > size)
> > >  			bytes = size;
> > >  
> > > -		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
> > > +		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
> > >  			/*
> > >  			 * Netfront requires there to be some data in the head
> > >  			 * buffer.
> > >  			 */
> > > -			BUG_ON(*head);
> > > +			BUG_ON(*first);
> > >  
> > >  			meta = get_next_rx_buffer(vif, npo);
> > >  		}
> > > @@ -529,10 +537,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> > >  		}
> > >  
> > >  		/* Leave a gap for the GSO descriptor. */
> > > -		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> > > +		if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> > >  			vif->rx.req_cons++;
> > >  
> > > -		*head = 0; /* There must be something in this buffer now. */
> > > +		*first = 0; /* There must be something in this buffer now. */
> > >  
> > >  	}
> > >  }
> > > @@ -558,7 +566,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
> > >  	struct xen_netif_rx_request *req;
> > >  	struct netbk_rx_meta *meta;
> > >  	unsigned char *data;
> > > -	int head = 1;
> > > +	int first = 1;
> > >  	int old_meta_prod;
> > >  
> > >  	old_meta_prod = npo->meta_prod;
> > > @@ -594,16 +602,16 @@ static int netbk_gop_skb(struct sk_buff *skb,
> > >  			len = skb_tail_pointer(skb) - data;
> > >  
> > >  		netbk_gop_frag_copy(vif, skb, npo,
> > > -				    virt_to_page(data), len, offset, &head);
> > > +				virt_to_page(data), len, offset, 1, &first);
> > >  		data += len;
> > >  	}
> > >  
> > >  	for (i = 0; i < nr_frags; i++) {
> > >  		netbk_gop_frag_copy(vif, skb, npo,
> > > -				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
> > > -				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
> > > -				    skb_shinfo(skb)->frags[i].page_offset,
> > > -				    &head);
> > > +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
> > > +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
> > > +				skb_shinfo(skb)->frags[i].page_offset,
> > > +				0, &first);
> > >  	}
> > >  
> > >  	return npo->meta_prod - old_meta_prod;
> > > @@ -661,10 +669,6 @@ static void netbk_add_frag_responses(struct xenvif *vif, int status,
> > >  	}
> > >  }
> > >  
> > > -struct skb_cb_overlay {
> > > -	int meta_slots_used;
> > > -};
> > > -
> > >  static void xen_netbk_rx_action(struct xen_netbk *netbk)
> > >  {
> > >  	struct xenvif *vif = NULL, *tmp;
> > > @@ -690,19 +694,26 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
> > >  	count = 0;
> > >  
> > >  	while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) {
> > > +		RING_IDX old_rx_req_cons;
> > > + 
> > >  		vif = netdev_priv(skb->dev);
> > >  		nr_frags = skb_shinfo(skb)->nr_frags;
> > >  
> > > +		old_rx_req_cons = vif->rx.req_cons;
> > >  		sco = (struct skb_cb_overlay *)skb->cb;
> > >  		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
> > >  
> > > -		count += nr_frags + 1;
> > > +		count += vif->rx.req_cons - old_rx_req_cons;
> > >  
> > >  		__skb_queue_tail(&rxq, skb);
> > >  
> > > +		skb = skb_peek(&netbk->rx_queue);
> > > +		if (skb == NULL)
> > > +			break;
> > > +		sco = (struct skb_cb_overlay *) skb->cb;
> > > +
> > >  		/* Filled the batch queue? */
> > > -		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
> > > -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
> > > +		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
> > >  			break;
> > 
> > Using req_cons to count is OK, but since you've already stored the number
> > of slots in sco->peek_slots_count before actually queueing the packet,
> > why don't you use that directly?
> > 
> 
> Ah, I must be off my head here. I mixed several patches 1) my RFC patch
> that removes MAX_SKB_FRAGS in RX path, 2) Annie's patch and 3) your
> patch.

Yes, I wanted to get this out since we're talking about multiple
changes in similar areas.

> The reason that your patch uses req_cons instead of the pre-calculated
> value is that the value returned by xen_netbk_count_skb_slots is actually
> *wrong* -- that's what Annie tried to fix in her patch.

Yes and no. With this patch, xen_netbk_count_skb_slots() is now
correct; it was under-counting when slots
were inefficiently consumed. With this patch they match.

Since we've had trouble with the counting in the past, I felt like
using the "real" number of slots here was safer and future-proof.

> After fixing xen_netbk_skb_count_slots, we would need the above snippet
> (or the snippet I proposed in my RFC patch) to prevent overrunning the
> ring.

Agreed.

> So a proper patch to this issue (not counting RX slots correctly, causing
> overrun of the ring) would be the above two aspects combined.

Yes, I think so.

> Comments?

I think that this patch addresses the problem more completely. Annie?

See also the thread from last August:
   http://lists.xen.org/archives/html/xen-devel/2012-12/msg00274.html

--msw

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-10 19:37       ` Wei Liu
                           ` (2 preceding siblings ...)
  2013-07-11  5:14         ` [Xen-devel] " Matt Wilson
@ 2013-07-11  5:14         ` Matt Wilson
  3 siblings, 0 replies; 38+ messages in thread
From: Matt Wilson @ 2013-07-11  5:14 UTC (permalink / raw)
  To: Wei Liu; +Cc: netdev, Annie Li, Xi Xiong, Ian Campbell, xen-devel

On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
> > On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
> > > From: Xi Xiong <xixiong@amazon.com>
> > > 
> > > [ note: I've just cherry picked this onto net-next, and only compile
> > >   tested. This a RFC only. -msw ]
> > > 
> > 
> > Should probably rebase it on net.git because it is a bug fix. Let's
> > worry about that later...

*nod*

> > > Currently the number of RX slots required to transmit a SKB to
> > > xen-netfront can be miscalculated when an interface uses a MTU larger
> > > than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
> > > pause the queue indefinitely or reuse slots. The former manifests as a
> > > loss of connectivity to the guest (which can be restored by lowering
> > > the MTU set on the interface). The latter manifests with "Bad grant
> > > reference" messages from Xen such as:
> > > 
> > > (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
> > > 
> > > and kernel messages within the guest such as:
> > > 
> > > [  180.419567] net eth0: Invalid extra type: 112
> > > [  180.868620] net eth0: rx->offset: 0, size: 4294967295
> > > [  180.868629] net eth0: rx->offset: 0, size: 4294967295
> > > 
> > > BUG_ON() assertions can also be hit if RX slots are exhausted while
> > > handling a SKB.
> > > 
> > > This patch changes xen_netbk_rx_action() to count the number of RX
> > > slots actually consumed by netbk_gop_skb() instead of using nr_frags + 1.
> > > This prevents under-counting the number of RX slots consumed when a
> > > SKB has a large linear buffer.
> > > 
> > > Additionally, we now store the estimated number of RX slots required
> > > to handle a SKB in the cb overlay. This value is used to determine if
> > > the next SKB in the queue can be processed.
> > > 
> > > Finally, the logic in start_new_rx_buffer() can cause RX slots to be
> > > wasted when setting up copy grant table operations for SKBs with large
> > > linear buffers. For example, a SKB with skb_headlen() equal to 8157
> > > bytes that starts 64 bytes 64 bytes from the start of the page will
> > 
> > Duplicated "64 bytes". And this change looks like an improvement not a
> > bug fix. Probably submit a separate patch for this?

Argh, I knew it was in there somewhere (since you pointed it out in
Dublin :-).

Maybe it could be a separate patch. I think the description is also a
bit confusing. I'll work on rewording it.

> > > consume three RX slots instead of two. This patch changes the "head"
> > > parameter to netbk_gop_frag_copy() to act as a flag. When set,
> > > start_new_rx_buffer() will always place as much data as possible into
> > > each RX slot.
> > > 
> > > Signed-off-by: Xi Xiong <xixiong@amazon.com>
> > > Reviewed-by: Matt Wilson <msw@amazon.com>
> > > [ msw: minor code cleanups, rewrote commit message, adjusted code
> > >   to count RX slots instead of meta structures ]
> > > Signed-off-by: Matt Wilson <msw@amazon.com>
> > > Cc: Annie Li <annie.li@oracle.com>
> > > Cc: Wei Liu <wei.liu2@citrix.com>
> > > Cc: Ian Campbell <Ian.Campbell@citrix.com>
> > > Cc: netdev@vger.kernel.org
> > > Cc: xen-devel@lists.xenproject.org
> > > ---
> > >  drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
> > >  1 files changed, 31 insertions(+), 20 deletions(-)
> > > 
> > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > > index 64828de..82dd207 100644
> > > --- a/drivers/net/xen-netback/netback.c
> > > +++ b/drivers/net/xen-netback/netback.c
> > > @@ -110,6 +110,11 @@ union page_ext {
> > >  	void *mapping;
> > >  };
> > >  
> > > +struct skb_cb_overlay {
> > > +	int meta_slots_used;
> > > +	int peek_slots_count;
> > > +};
> > > +
> > >  struct xen_netbk {
> > >  	wait_queue_head_t wq;
> > >  	struct task_struct *task;
> > > @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> > >  {
> > >  	unsigned int count;
> > >  	int i, copy_off;
> > > +	struct skb_cb_overlay *sco;
> > >  
> > >  	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> > >  
> > > @@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> > >  				offset = 0;
> > >  		}
> > >  	}
> > > +
> > > +	sco = (struct skb_cb_overlay *) skb->cb;
> > > +	sco->peek_slots_count = count;
> > >  	return count;
> > >  }
> > >  
> > > @@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
> > >  }
> > >  
> > >  /*
> > > - * Set up the grant operations for this fragment. If it's a flipping
> > > - * interface, we also set up the unmap request from here.
> > > + * Set up the grant operations for this fragment.
> > >   */
> > >  static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> > >  				struct netrx_pending_operations *npo,
> > >  				struct page *page, unsigned long size,
> > > -				unsigned long offset, int *head)
> > > +				unsigned long offset, int head, int *first)
> > >  {
> > >  	struct gnttab_copy *copy_gop;
> > >  	struct netbk_rx_meta *meta;
> > > @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> > >  		if (bytes > size)
> > >  			bytes = size;
> > >  
> > > -		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
> > > +		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
> > >  			/*
> > >  			 * Netfront requires there to be some data in the head
> > >  			 * buffer.
> > >  			 */
> > > -			BUG_ON(*head);
> > > +			BUG_ON(*first);
> > >  
> > >  			meta = get_next_rx_buffer(vif, npo);
> > >  		}
> > > @@ -529,10 +537,10 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
> > >  		}
> > >  
> > >  		/* Leave a gap for the GSO descriptor. */
> > > -		if (*head && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> > > +		if (*first && skb_shinfo(skb)->gso_size && !vif->gso_prefix)
> > >  			vif->rx.req_cons++;
> > >  
> > > -		*head = 0; /* There must be something in this buffer now. */
> > > +		*first = 0; /* There must be something in this buffer now. */
> > >  
> > >  	}
> > >  }
> > > @@ -558,7 +566,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
> > >  	struct xen_netif_rx_request *req;
> > >  	struct netbk_rx_meta *meta;
> > >  	unsigned char *data;
> > > -	int head = 1;
> > > +	int first = 1;
> > >  	int old_meta_prod;
> > >  
> > >  	old_meta_prod = npo->meta_prod;
> > > @@ -594,16 +602,16 @@ static int netbk_gop_skb(struct sk_buff *skb,
> > >  			len = skb_tail_pointer(skb) - data;
> > >  
> > >  		netbk_gop_frag_copy(vif, skb, npo,
> > > -				    virt_to_page(data), len, offset, &head);
> > > +				virt_to_page(data), len, offset, 1, &first);
> > >  		data += len;
> > >  	}
> > >  
> > >  	for (i = 0; i < nr_frags; i++) {
> > >  		netbk_gop_frag_copy(vif, skb, npo,
> > > -				    skb_frag_page(&skb_shinfo(skb)->frags[i]),
> > > -				    skb_frag_size(&skb_shinfo(skb)->frags[i]),
> > > -				    skb_shinfo(skb)->frags[i].page_offset,
> > > -				    &head);
> > > +				skb_frag_page(&skb_shinfo(skb)->frags[i]),
> > > +				skb_frag_size(&skb_shinfo(skb)->frags[i]),
> > > +				skb_shinfo(skb)->frags[i].page_offset,
> > > +				0, &first);
> > >  	}
> > >  
> > >  	return npo->meta_prod - old_meta_prod;
> > > @@ -661,10 +669,6 @@ static void netbk_add_frag_responses(struct xenvif *vif, int status,
> > >  	}
> > >  }
> > >  
> > > -struct skb_cb_overlay {
> > > -	int meta_slots_used;
> > > -};
> > > -
> > >  static void xen_netbk_rx_action(struct xen_netbk *netbk)
> > >  {
> > >  	struct xenvif *vif = NULL, *tmp;
> > > @@ -690,19 +694,26 @@ static void xen_netbk_rx_action(struct xen_netbk *netbk)
> > >  	count = 0;
> > >  
> > >  	while ((skb = skb_dequeue(&netbk->rx_queue)) != NULL) {
> > > +		RING_IDX old_rx_req_cons;
> > > + 
> > >  		vif = netdev_priv(skb->dev);
> > >  		nr_frags = skb_shinfo(skb)->nr_frags;
> > >  
> > > +		old_rx_req_cons = vif->rx.req_cons;
> > >  		sco = (struct skb_cb_overlay *)skb->cb;
> > >  		sco->meta_slots_used = netbk_gop_skb(skb, &npo);
> > >  
> > > -		count += nr_frags + 1;
> > > +		count += vif->rx.req_cons - old_rx_req_cons;
> > >  
> > >  		__skb_queue_tail(&rxq, skb);
> > >  
> > > +		skb = skb_peek(&netbk->rx_queue);
> > > +		if (skb == NULL)
> > > +			break;
> > > +		sco = (struct skb_cb_overlay *) skb->cb;
> > > +
> > >  		/* Filled the batch queue? */
> > > -		/* XXX FIXME: RX path dependent on MAX_SKB_FRAGS */
> > > -		if (count + MAX_SKB_FRAGS >= XEN_NETIF_RX_RING_SIZE)
> > > +		if (count + sco->peek_slots_count >= XEN_NETIF_RX_RING_SIZE)
> > >  			break;
> > 
> > Using req_cons to count is OK, but since you've already store the number
> > of slots in sco->peek_slots_count before actually queueing the packet,
> > why don't you use that directly?
> > 
> 
> Ah, I must be off my head here. I mixed several patches 1) my RFC patch
> that removes MAX_SKB_FRAGS in RX path, 2) Annie's patch and 3) your
> patch.

Yes, I wanted to get this out since we're talking about multiple
changes in similar areas.

> The reason that your patches uses req_cons instead of pre-calculated
> value is, that value returned by xen_netbk_skb_count_slots is actually
> *wrong* -- that's what Annie tried to fix in her patch.

Yes and no. With this patch, the xen_netbk_count_skb_slots() is now
correct. xen_netbk_count_skb_slots() was under-counting when slots
were inefficiently consumed. With this patch they match.

Since we've had trouble with the counting in the past, I felt like
using the "real" number of slots here was safer and future-proof.

> After fixing xen_netbk_skb_count_slots, we would need the above snippet
> (or the snippet I proposed in my RFC patch) to prevent overruning the
> ring.

Agreed.

> So a proper patch to this issue (not couting RX slots correctly causing
> overrun of the ring) would be the above two aspects combined.

Yes, I think so.

> Comments?

I think that this patch addresses the problem more completely. Annie?

See also the thread from last August:
   http://lists.xen.org/archives/html/xen-devel/2012-12/msg00274.html

--msw

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-11  5:14         ` [Xen-devel] " Matt Wilson
  2013-07-11  6:01           ` annie li
@ 2013-07-11  6:01           ` annie li
  2013-07-11 13:52             ` annie li
  2013-07-11 13:52             ` [Xen-devel] " annie li
  2013-07-11  8:46           ` [Xen-devel] " Wei Liu
                             ` (3 subsequent siblings)
  5 siblings, 2 replies; 38+ messages in thread
From: annie li @ 2013-07-11  6:01 UTC (permalink / raw)
  To: Matt Wilson; +Cc: Wei Liu, Ian Campbell, netdev, xen-devel, Xi Xiong


On 2013-7-11 13:14, Matt Wilson wrote:
> On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
>> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
>>> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
>>>> From: Xi Xiong <xixiong@amazon.com>
>>>>
>>>> [ note: I've just cherry picked this onto net-next, and only compile
>>>>    tested. This a RFC only. -msw ]
>>>>
>>> Should probably rebase it on net.git because it is a bug fix. Let's
>>> worry about that later...
> *nod*
>
>>>> Currently the number of RX slots required to transmit a SKB to
>>>> xen-netfront can be miscalculated when an interface uses a MTU larger
>>>> than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
>>>> pause the queue indefinitely or reuse slots. The former manifests as a
>>>> loss of connectivity to the guest (which can be restored by lowering
>>>> the MTU set on the interface). The latter manifests with "Bad grant
>>>> reference" messages from Xen such as:
>>>>
>>>> (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
>>>>
>>>> and kernel messages within the guest such as:
>>>>
>>>> [  180.419567] net eth0: Invalid extra type: 112
>>>> [  180.868620] net eth0: rx->offset: 0, size: 4294967295
>>>> [  180.868629] net eth0: rx->offset: 0, size: 4294967295
>>>>
>>>> BUG_ON() assertions can also be hit if RX slots are exhausted while
>>>> handling a SKB.
>>>>
>>>> This patch changes xen_netbk_rx_action() to count the number of RX
>>>> slots actually consumed by netbk_gop_skb() instead of using nr_frags + 1.
>>>> This prevents under-counting the number of RX slots consumed when a
>>>> SKB has a large linear buffer.
>>>>
>>>> Additionally, we now store the estimated number of RX slots required
>>>> to handle a SKB in the cb overlay. This value is used to determine if
>>>> the next SKB in the queue can be processed.
>>>>
>>>> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
>>>> wasted when setting up copy grant table operations for SKBs with large
>>>> linear buffers. For example, a SKB with skb_headlen() equal to 8157
>>>> bytes that starts 64 bytes 64 bytes from the start of the page will
>>> Duplicated "64 bytes". And this change looks like an improvement not a
>>> bug fix. Probably submit a separate patch for this?
> Argh, I knew it was in there somewhere (since you pointed it out in
> Dublin :-).
>
> Maybe it could be a separate patch. I think the description is also a
> bit confusing. I'll work on rewording it.
>
>>>> consume three RX slots instead of two. This patch changes the "head"
>>>> parameter to netbk_gop_frag_copy() to act as a flag. When set,
>>>> start_new_rx_buffer() will always place as much data as possible into
>>>> each RX slot.
>>>>
>>>> Signed-off-by: Xi Xiong <xixiong@amazon.com>
>>>> Reviewed-by: Matt Wilson <msw@amazon.com>
>>>> [ msw: minor code cleanups, rewrote commit message, adjusted code
>>>>    to count RX slots instead of meta structures ]
>>>> Signed-off-by: Matt Wilson <msw@amazon.com>
>>>> Cc: Annie Li <annie.li@oracle.com>
>>>> Cc: Wei Liu <wei.liu2@citrix.com>
>>>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
>>>> Cc: netdev@vger.kernel.org
>>>> Cc: xen-devel@lists.xenproject.org
>>>> ---
>>>>   drivers/net/xen-netback/netback.c |   51 ++++++++++++++++++++++--------------
>>>>   1 files changed, 31 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>>>> index 64828de..82dd207 100644
>>>> --- a/drivers/net/xen-netback/netback.c
>>>> +++ b/drivers/net/xen-netback/netback.c
>>>> @@ -110,6 +110,11 @@ union page_ext {
>>>>   	void *mapping;
>>>>   };
>>>>   
>>>> +struct skb_cb_overlay {
>>>> +	int meta_slots_used;
>>>> +	int peek_slots_count;
>>>> +};
>>>> +
>>>>   struct xen_netbk {
>>>>   	wait_queue_head_t wq;
>>>>   	struct task_struct *task;
>>>> @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>>   {
>>>>   	unsigned int count;
>>>>   	int i, copy_off;
>>>> +	struct skb_cb_overlay *sco;
>>>>   
>>>>   	count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>>>>   
>>>> @@ -411,6 +417,9 @@ unsigned int xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>>   				offset = 0;
>>>>   		}
>>>>   	}
>>>> +
>>>> +	sco = (struct skb_cb_overlay *) skb->cb;
>>>> +	sco->peek_slots_count = count;
>>>>   	return count;
>>>>   }
>>>>   
>>>> @@ -443,13 +452,12 @@ static struct netbk_rx_meta *get_next_rx_buffer(struct xenvif *vif,
>>>>   }
>>>>   
>>>>   /*
>>>> - * Set up the grant operations for this fragment. If it's a flipping
>>>> - * interface, we also set up the unmap request from here.
>>>> + * Set up the grant operations for this fragment.
>>>>    */
>>>>   static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>>   				struct netrx_pending_operations *npo,
>>>>   				struct page *page, unsigned long size,
>>>> -				unsigned long offset, int *head)
>>>> +				unsigned long offset, int head, int *first)
>>>>   {
>>>>   	struct gnttab_copy *copy_gop;
>>>>   	struct netbk_rx_meta *meta;
>>>> @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff *skb,
>>>>   		if (bytes > size)
>>>>   			bytes = size;
>>>>   
>>>> -		if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
>>>> +		if (start_new_rx_buffer(npo->copy_off, bytes, head)) {

head is always 1 here when processing the header data, so it is 
impossible to start a new buffer for header data with a larger header 
size, and get_next_rx_buffer() is never called. I assume this would not 
work well for an skb with a large header. Have you run network 
tests with this patch?
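
For reference, the check in question can be modelled in isolation. This
is a simplified sketch of the 3.10-era start_new_rx_buffer() with the
RFC's by-value flag semantics, assuming MAX_BUFFER_OFFSET equals
PAGE_SIZE; it is not the kernel code itself:

```c
#include <stdbool.h>

#define MAX_BUFFER_OFFSET 4096UL

/* Standalone model of start_new_rx_buffer() with "head" passed by value
 * as a flag, as in the RFC patch. */
bool start_new_rx_buffer(unsigned long copy_off, unsigned long size, int head)
{
	/* Simple case: the current slot is completely full.  This fires
	 * even when head stays 1 for the whole linear area. */
	if (copy_off == MAX_BUFFER_OFFSET)
		return true;

	/* Complex case: start a fresh slot for a chunk that would
	 * straddle a slot boundary -- but only for frag data (head ==
	 * 0), so header data is packed as tightly as possible. */
	if (copy_off + size > MAX_BUFFER_OFFSET &&
	    size <= MAX_BUFFER_OFFSET && copy_off && !head)
		return true;

	return false;
}
```

In this model, even with head fixed at 1 for the whole linear buffer,
the copy_off == MAX_BUFFER_OFFSET case still returns true once a slot
fills, so get_next_rx_buffer() is still reached for headers larger than
one slot.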

>>>>   			/*
>>>>   			 * Netfront requires there to be some data in the head
>>>>   			 * buffer.
>>>>   			 */
>>>> -			BUG_ON(*head);
>>>> +			BUG_ON(*first);
>>>>   
>>>>   			meta = get_next_rx_buffer(vif, npo);
>>>>   		}
snip...
>>>> Comments?
> I think that this patch addresses the problem more completely. Annie?

See my comments above, I am curious whether the header data is processed 
correctly.:-)

Thanks
Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-11  5:14         ` [Xen-devel] " Matt Wilson
  2013-07-11  6:01           ` annie li
  2013-07-11  6:01           ` [Xen-devel] " annie li
@ 2013-07-11  8:46           ` Wei Liu
  2013-07-11  8:46           ` Wei Liu
                             ` (2 subsequent siblings)
  5 siblings, 0 replies; 38+ messages in thread
From: Wei Liu @ 2013-07-11  8:46 UTC (permalink / raw)
  To: Matt Wilson; +Cc: Wei Liu, Ian Campbell, netdev, Annie Li, xen-devel, Xi Xiong

On Wed, Jul 10, 2013 at 10:14:43PM -0700, Matt Wilson wrote:
[...]
> Yes, I wanted to get this out since we're talking about multiple
> changes in similar areas.
> 
> > The reason that your patches uses req_cons instead of pre-calculated
> > value is, that value returned by xen_netbk_skb_count_slots is actually
> > *wrong* -- that's what Annie tried to fix in her patch.
> 
> Yes and no. With this patch, the xen_netbk_count_skb_slots() is now
> correct. xen_netbk_count_skb_slots() was under-counting when slots
> were inefficiently consumed. With this patch they match.
> 

This is worth mentioning in the commit message IMHO.

> Since we've had trouble with the counting in the past, I felt like
> using the "real" number of slots here was safer and future-proof.
> 

Fair enough. Worth a comment in the code.

> > After fixing xen_netbk_skb_count_slots, we would need the above snippet
> > (or the snippet I proposed in my RFC patch) to prevent overruning the
> > ring.
> 
> Agreed.
> 
> > So a proper patch to this issue (not couting RX slots correctly causing
> > overrun of the ring) would be the above two aspects combined.
> 
> Yes, I think so.
> 
> > Comments?
> 
> I think that this patch addresses the problem more completely. Annie?
> 
> See also the thread from last August:
>    http://lists.xen.org/archives/html/xen-devel/2012-12/msg00274.html
> 

So now it seems that the "turn head in start_new_rx_buffer() into a
flag" part should also be in the patch, as it is actually a fix that
makes the counting in the two functions match. The side effect of this
change is improving the efficiency of the ring. This would also be
worth mentioning in the commit message IMHO.


Wei.

> --msw

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-11  6:01           ` [Xen-devel] " annie li
  2013-07-11 13:52             ` annie li
@ 2013-07-11 13:52             ` annie li
  2013-07-12  8:49               ` Wei Liu
  2013-07-12  8:49               ` Wei Liu
  1 sibling, 2 replies; 38+ messages in thread
From: annie li @ 2013-07-11 13:52 UTC (permalink / raw)
  To: Matt Wilson; +Cc: netdev, Wei Liu, Ian Campbell, Xi Xiong, xen-devel


On 2013-7-11 14:01, annie li wrote:
>
> On 2013-7-11 13:14, Matt Wilson wrote:
>> On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
>>> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
>>>> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
>>>>> From: Xi Xiong <xixiong@amazon.com>
>>>>>
>>>>> [ note: I've just cherry picked this onto net-next, and only compile
>>>>>    tested. This a RFC only. -msw ]
>>>>>
>>>> Should probably rebase it on net.git because it is a bug fix. Let's
>>>> worry about that later...
>> *nod*
>>
>>>>> Currently the number of RX slots required to transmit a SKB to
>>>>> xen-netfront can be miscalculated when an interface uses a MTU larger
>>>>> than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
>>>>> pause the queue indefinitely or reuse slots. The former manifests 
>>>>> as a
>>>>> loss of connectivity to the guest (which can be restored by lowering
>>>>> the MTU set on the interface). The latter manifests with "Bad grant
>>>>> reference" messages from Xen such as:
>>>>>
>>>>> (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
>>>>>
>>>>> and kernel messages within the guest such as:
>>>>>
>>>>> [  180.419567] net eth0: Invalid extra type: 112
>>>>> [  180.868620] net eth0: rx->offset: 0, size: 4294967295
>>>>> [  180.868629] net eth0: rx->offset: 0, size: 4294967295
>>>>>
>>>>> BUG_ON() assertions can also be hit if RX slots are exhausted while
>>>>> handling a SKB.
>>>>>
>>>>> This patch changes xen_netbk_rx_action() to count the number of RX
>>>>> slots actually consumed by netbk_gop_skb() instead of using 
>>>>> nr_frags + 1.
>>>>> This prevents under-counting the number of RX slots consumed when a
>>>>> SKB has a large linear buffer.
>>>>>
>>>>> Additionally, we now store the estimated number of RX slots required
>>>>> to handle a SKB in the cb overlay. This value is used to determine if
>>>>> the next SKB in the queue can be processed.
>>>>>
>>>>> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
>>>>> wasted when setting up copy grant table operations for SKBs with 
>>>>> large
>>>>> linear buffers. For example, a SKB with skb_headlen() equal to 8157
>>>>> bytes that starts 64 bytes 64 bytes from the start of the page will
>>>> Duplicated "64 bytes". And this change looks like an improvement not a
>>>> bug fix. Probably submit a separate patch for this?
>> Argh, I knew it was in there somewhere (since you pointed it out in
>> Dublin :-).
>>
>> Maybe it could be a separate patch. I think the description is also a
>> bit confusing. I'll work on rewording it.
>>
>>>>> consume three RX slots instead of two. This patch changes the "head"
>>>>> parameter to netbk_gop_frag_copy() to act as a flag. When set,
>>>>> start_new_rx_buffer() will always place as much data as possible into
>>>>> each RX slot.
>>>>>
>>>>> Signed-off-by: Xi Xiong <xixiong@amazon.com>
>>>>> Reviewed-by: Matt Wilson <msw@amazon.com>
>>>>> [ msw: minor code cleanups, rewrote commit message, adjusted code
>>>>>    to count RX slots instead of meta structures ]
>>>>> Signed-off-by: Matt Wilson <msw@amazon.com>
>>>>> Cc: Annie Li <annie.li@oracle.com>
>>>>> Cc: Wei Liu <wei.liu2@citrix.com>
>>>>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
>>>>> Cc: netdev@vger.kernel.org
>>>>> Cc: xen-devel@lists.xenproject.org
>>>>> ---
>>>>>   drivers/net/xen-netback/netback.c |   51 
>>>>> ++++++++++++++++++++++--------------
>>>>>   1 files changed, 31 insertions(+), 20 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/xen-netback/netback.c 
>>>>> b/drivers/net/xen-netback/netback.c
>>>>> index 64828de..82dd207 100644
>>>>> --- a/drivers/net/xen-netback/netback.c
>>>>> +++ b/drivers/net/xen-netback/netback.c
>>>>> @@ -110,6 +110,11 @@ union page_ext {
>>>>>       void *mapping;
>>>>>   };
>>>>>   +struct skb_cb_overlay {
>>>>> +    int meta_slots_used;
>>>>> +    int peek_slots_count;
>>>>> +};
>>>>> +
>>>>>   struct xen_netbk {
>>>>>       wait_queue_head_t wq;
>>>>>       struct task_struct *task;
>>>>> @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct 
>>>>> xenvif *vif, struct sk_buff *skb)
>>>>>   {
>>>>>       unsigned int count;
>>>>>       int i, copy_off;
>>>>> +    struct skb_cb_overlay *sco;
>>>>>         count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>>>>>   @@ -411,6 +417,9 @@ unsigned int 
>>>>> xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>>>                   offset = 0;
>>>>>           }
>>>>>       }
>>>>> +
>>>>> +    sco = (struct skb_cb_overlay *) skb->cb;
>>>>> +    sco->peek_slots_count = count;
>>>>>       return count;
>>>>>   }
>>>>>   @@ -443,13 +452,12 @@ static struct netbk_rx_meta 
>>>>> *get_next_rx_buffer(struct xenvif *vif,
>>>>>   }
>>>>>     /*
>>>>> - * Set up the grant operations for this fragment. If it's a flipping
>>>>> - * interface, we also set up the unmap request from here.
>>>>> + * Set up the grant operations for this fragment.
>>>>>    */
>>>>>   static void netbk_gop_frag_copy(struct xenvif *vif, struct 
>>>>> sk_buff *skb,
>>>>>                   struct netrx_pending_operations *npo,
>>>>>                   struct page *page, unsigned long size,
>>>>> -                unsigned long offset, int *head)
>>>>> +                unsigned long offset, int head, int *first)
>>>>>   {
>>>>>       struct gnttab_copy *copy_gop;
>>>>>       struct netbk_rx_meta *meta;
>>>>> @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct 
>>>>> xenvif *vif, struct sk_buff *skb,
>>>>>           if (bytes > size)
>>>>>               bytes = size;
>>>>>   -        if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
>>>>> +        if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
>
> head is always 1 here when processing the header data, so it is 
> impossible to start a new buffer for header data with larger header 
> size, and get_next_rx_buffer is never called. I assume this would not 
> work well for skb with larger header data size. Have you run network 
> test with this patch?

Sorry, I forgot the offset == MAX_BUFFER_OFFSET case and misunderstand 
your patch, please ignore my last comments. Your patch keeps the 
original DIV_ROUND_UP and changes the mechanism in netbk_gop_frag_copy 
to make slots same with xen_netbk_count_skb_slots. All Roads Lead to 
Rome!:-)

Thanks
Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-11  6:01           ` [Xen-devel] " annie li
@ 2013-07-11 13:52             ` annie li
  2013-07-11 13:52             ` [Xen-devel] " annie li
  1 sibling, 0 replies; 38+ messages in thread
From: annie li @ 2013-07-11 13:52 UTC (permalink / raw)
  To: Matt Wilson; +Cc: netdev, xen-devel, Wei Liu, Ian Campbell, Xi Xiong


On 2013-7-11 14:01, annie li wrote:
>
> On 2013-7-11 13:14, Matt Wilson wrote:
>> On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
>>> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
>>>> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
>>>>> From: Xi Xiong <xixiong@amazon.com>
>>>>>
>>>>> [ note: I've just cherry picked this onto net-next, and only compile
>>>>>    tested. This is an RFC only. -msw ]
>>>>>
>>>> Should probably rebase it on net.git because it is a bug fix. Let's
>>>> worry about that later...
>> *nod*
>>
>>>>> Currently the number of RX slots required to transmit a SKB to
>>>>> xen-netfront can be miscalculated when an interface uses a MTU larger
>>>>> than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
>>>>> pause the queue indefinitely or reuse slots. The former manifests 
>>>>> as a
>>>>> loss of connectivity to the guest (which can be restored by lowering
>>>>> the MTU set on the interface). The latter manifests with "Bad grant
>>>>> reference" messages from Xen such as:
>>>>>
>>>>> (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
>>>>>
>>>>> and kernel messages within the guest such as:
>>>>>
>>>>> [  180.419567] net eth0: Invalid extra type: 112
>>>>> [  180.868620] net eth0: rx->offset: 0, size: 4294967295
>>>>> [  180.868629] net eth0: rx->offset: 0, size: 4294967295
>>>>>
>>>>> BUG_ON() assertions can also be hit if RX slots are exhausted while
>>>>> handling a SKB.
>>>>>
>>>>> This patch changes xen_netbk_rx_action() to count the number of RX
>>>>> slots actually consumed by netbk_gop_skb() instead of using 
>>>>> nr_frags + 1.
>>>>> This prevents under-counting the number of RX slots consumed when a
>>>>> SKB has a large linear buffer.
>>>>>
>>>>> Additionally, we now store the estimated number of RX slots required
>>>>> to handle a SKB in the cb overlay. This value is used to determine if
>>>>> the next SKB in the queue can be processed.
>>>>>
>>>>> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
>>>>> wasted when setting up copy grant table operations for SKBs with 
>>>>> large
>>>>> linear buffers. For example, a SKB with skb_headlen() equal to 8157
>>>>> bytes that starts 64 bytes 64 bytes from the start of the page will
>>>> Duplicated "64 bytes". And this change looks like an improvement not a
>>>> bug fix. Probably submit a separate patch for this?
>> Argh, I knew it was in there somewhere (since you pointed it out in
>> Dublin :-).
>>
>> Maybe it could be a separate patch. I think the description is also a
>> bit confusing. I'll work on rewording it.
>>
>>>>> consume three RX slots instead of two. This patch changes the "head"
>>>>> parameter to netbk_gop_frag_copy() to act as a flag. When set,
>>>>> start_new_rx_buffer() will always place as much data as possible into
>>>>> each RX slot.
>>>>>
>>>>> Signed-off-by: Xi Xiong <xixiong@amazon.com>
>>>>> Reviewed-by: Matt Wilson <msw@amazon.com>
>>>>> [ msw: minor code cleanups, rewrote commit message, adjusted code
>>>>>    to count RX slots instead of meta structures ]
>>>>> Signed-off-by: Matt Wilson <msw@amazon.com>
>>>>> Cc: Annie Li <annie.li@oracle.com>
>>>>> Cc: Wei Liu <wei.liu2@citrix.com>
>>>>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
>>>>> Cc: netdev@vger.kernel.org
>>>>> Cc: xen-devel@lists.xenproject.org
>>>>> ---
>>>>>   drivers/net/xen-netback/netback.c |   51 
>>>>> ++++++++++++++++++++++--------------
>>>>>   1 files changed, 31 insertions(+), 20 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/xen-netback/netback.c 
>>>>> b/drivers/net/xen-netback/netback.c
>>>>> index 64828de..82dd207 100644
>>>>> --- a/drivers/net/xen-netback/netback.c
>>>>> +++ b/drivers/net/xen-netback/netback.c
>>>>> @@ -110,6 +110,11 @@ union page_ext {
>>>>>       void *mapping;
>>>>>   };
>>>>>   +struct skb_cb_overlay {
>>>>> +    int meta_slots_used;
>>>>> +    int peek_slots_count;
>>>>> +};
>>>>> +
>>>>>   struct xen_netbk {
>>>>>       wait_queue_head_t wq;
>>>>>       struct task_struct *task;
>>>>> @@ -370,6 +375,7 @@ unsigned int xen_netbk_count_skb_slots(struct 
>>>>> xenvif *vif, struct sk_buff *skb)
>>>>>   {
>>>>>       unsigned int count;
>>>>>       int i, copy_off;
>>>>> +    struct skb_cb_overlay *sco;
>>>>>         count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>>>>>   @@ -411,6 +417,9 @@ unsigned int 
>>>>> xen_netbk_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
>>>>>                   offset = 0;
>>>>>           }
>>>>>       }
>>>>> +
>>>>> +    sco = (struct skb_cb_overlay *) skb->cb;
>>>>> +    sco->peek_slots_count = count;
>>>>>       return count;
>>>>>   }
>>>>>   @@ -443,13 +452,12 @@ static struct netbk_rx_meta 
>>>>> *get_next_rx_buffer(struct xenvif *vif,
>>>>>   }
>>>>>     /*
>>>>> - * Set up the grant operations for this fragment. If it's a flipping
>>>>> - * interface, we also set up the unmap request from here.
>>>>> + * Set up the grant operations for this fragment.
>>>>>    */
>>>>>   static void netbk_gop_frag_copy(struct xenvif *vif, struct 
>>>>> sk_buff *skb,
>>>>>                   struct netrx_pending_operations *npo,
>>>>>                   struct page *page, unsigned long size,
>>>>> -                unsigned long offset, int *head)
>>>>> +                unsigned long offset, int head, int *first)
>>>>>   {
>>>>>       struct gnttab_copy *copy_gop;
>>>>>       struct netbk_rx_meta *meta;
>>>>> @@ -479,12 +487,12 @@ static void netbk_gop_frag_copy(struct 
>>>>> xenvif *vif, struct sk_buff *skb,
>>>>>           if (bytes > size)
>>>>>               bytes = size;
>>>>>   -        if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
>>>>> +        if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
>
> head is always 1 here when processing the header data, so it is 
> impossible to start a new buffer for header data with larger header 
> size, and get_next_rx_buffer is never called. I assume this would not 
> work well for skb with larger header data size. Have you run network 
> test with this patch?

Sorry, I forgot the offset == MAX_BUFFER_OFFSET case and misunderstood 
your patch, please ignore my last comments. Your patch keeps the 
original DIV_ROUND_UP and changes the mechanism in netbk_gop_frag_copy 
so that the slot count matches xen_netbk_count_skb_slots. All roads 
lead to Rome! :-)

Thanks
Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-11 13:52             ` [Xen-devel] " annie li
@ 2013-07-12  8:49               ` Wei Liu
  2013-07-12  9:19                 ` annie li
  2013-07-12  9:19                 ` [Xen-devel] " annie li
  2013-07-12  8:49               ` Wei Liu
  1 sibling, 2 replies; 38+ messages in thread
From: Wei Liu @ 2013-07-12  8:49 UTC (permalink / raw)
  To: annie li; +Cc: Matt Wilson, netdev, Wei Liu, Ian Campbell, Xi Xiong, xen-devel

On Thu, Jul 11, 2013 at 09:52:52PM +0800, annie li wrote:
> 
> On 2013-7-11 14:01, annie li wrote:
> >
> >On 2013-7-11 13:14, Matt Wilson wrote:
> >>On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
> >>>On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
> >>>>On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
> >>>>>From: Xi Xiong <xixiong@amazon.com>
> >>>>>
> >>>>>[ note: I've just cherry picked this onto net-next, and only compile
> >>>>>   tested. This is an RFC only. -msw ]
> >>>>>
> >>>>Should probably rebase it on net.git because it is a bug fix. Let's
> >>>>worry about that later...
> >>*nod*
> >>
> >>>>>Currently the number of RX slots required to transmit a SKB to
> >>>>>xen-netfront can be miscalculated when an interface uses a MTU larger
> >>>>>than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
> >>>>>pause the queue indefinitely or reuse slots. The former
> >>>>>manifests as a
> >>>>>loss of connectivity to the guest (which can be restored by lowering
> >>>>>the MTU set on the interface). The latter manifests with "Bad grant
> >>>>>reference" messages from Xen such as:
> >>>>>
> >>>>>(XEN) grant_table.c:1797:d0 Bad grant reference 264241157
> >>>>>
> >>>>>and kernel messages within the guest such as:
> >>>>>
> >>>>>[  180.419567] net eth0: Invalid extra type: 112
> >>>>>[  180.868620] net eth0: rx->offset: 0, size: 4294967295
> >>>>>[  180.868629] net eth0: rx->offset: 0, size: 4294967295
> >>>>>
> >>>>>BUG_ON() assertions can also be hit if RX slots are exhausted while
> >>>>>handling a SKB.
> >>>>>
> >>>>>This patch changes xen_netbk_rx_action() to count the number of RX
> >>>>>slots actually consumed by netbk_gop_skb() instead of
> >>>>>using nr_frags + 1.
> >>>>>This prevents under-counting the number of RX slots consumed when a
> >>>>>SKB has a large linear buffer.
> >>>>>
> >>>>>Additionally, we now store the estimated number of RX slots required
> >>>>>to handle a SKB in the cb overlay. This value is used to determine if
> >>>>>the next SKB in the queue can be processed.
> >>>>>
> >>>>>Finally, the logic in start_new_rx_buffer() can cause RX slots to be
> >>>>>wasted when setting up copy grant table operations for
> >>>>>SKBs with large
> >>>>>linear buffers. For example, a SKB with skb_headlen() equal to 8157
> >>>>>bytes that starts 64 bytes 64 bytes from the start of the page will
> >>>>Duplicated "64 bytes". And this change looks like an improvement not a
> >>>>bug fix. Probably submit a separate patch for this?
> >>Argh, I knew it was in there somewhere (since you pointed it out in
> >>Dublin :-).
> >>
> >>Maybe it could be a separate patch. I think the description is also a
> >>bit confusing. I'll work on rewording it.
> >>
> >>>>>consume three RX slots instead of two. This patch changes the "head"
> >>>>>parameter to netbk_gop_frag_copy() to act as a flag. When set,
> >>>>>start_new_rx_buffer() will always place as much data as possible into
> >>>>>each RX slot.
> >>>>>
> >>>>>Signed-off-by: Xi Xiong <xixiong@amazon.com>
> >>>>>Reviewed-by: Matt Wilson <msw@amazon.com>
> >>>>>[ msw: minor code cleanups, rewrote commit message, adjusted code
> >>>>>   to count RX slots instead of meta structures ]
> >>>>>Signed-off-by: Matt Wilson <msw@amazon.com>
> >>>>>Cc: Annie Li <annie.li@oracle.com>
> >>>>>Cc: Wei Liu <wei.liu2@citrix.com>
> >>>>>Cc: Ian Campbell <Ian.Campbell@citrix.com>
> >>>>>Cc: netdev@vger.kernel.org
> >>>>>Cc: xen-devel@lists.xenproject.org
> >>>>>---
> >>>>>  drivers/net/xen-netback/netback.c |   51
> >>>>>++++++++++++++++++++++--------------
> >>>>>  1 files changed, 31 insertions(+), 20 deletions(-)
> >>>>>
> >>>>>diff --git a/drivers/net/xen-netback/netback.c
> >>>>>b/drivers/net/xen-netback/netback.c
> >>>>>index 64828de..82dd207 100644
> >>>>>--- a/drivers/net/xen-netback/netback.c
> >>>>>+++ b/drivers/net/xen-netback/netback.c
> >>>>>@@ -110,6 +110,11 @@ union page_ext {
> >>>>>      void *mapping;
> >>>>>  };
> >>>>>  +struct skb_cb_overlay {
> >>>>>+    int meta_slots_used;
> >>>>>+    int peek_slots_count;
> >>>>>+};
> >>>>>+
> >>>>>  struct xen_netbk {
> >>>>>      wait_queue_head_t wq;
> >>>>>      struct task_struct *task;
> >>>>>@@ -370,6 +375,7 @@ unsigned int
> >>>>>xen_netbk_count_skb_slots(struct xenvif *vif, struct
> >>>>>sk_buff *skb)
> >>>>>  {
> >>>>>      unsigned int count;
> >>>>>      int i, copy_off;
> >>>>>+    struct skb_cb_overlay *sco;
> >>>>>        count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> >>>>>  @@ -411,6 +417,9 @@ unsigned int
> >>>>>xen_netbk_count_skb_slots(struct xenvif *vif, struct
> >>>>>sk_buff *skb)
> >>>>>                  offset = 0;
> >>>>>          }
> >>>>>      }
> >>>>>+
> >>>>>+    sco = (struct skb_cb_overlay *) skb->cb;
> >>>>>+    sco->peek_slots_count = count;
> >>>>>      return count;
> >>>>>  }
> >>>>>  @@ -443,13 +452,12 @@ static struct netbk_rx_meta
> >>>>>*get_next_rx_buffer(struct xenvif *vif,
> >>>>>  }
> >>>>>    /*
> >>>>>- * Set up the grant operations for this fragment. If it's a flipping
> >>>>>- * interface, we also set up the unmap request from here.
> >>>>>+ * Set up the grant operations for this fragment.
> >>>>>   */
> >>>>>  static void netbk_gop_frag_copy(struct xenvif *vif,
> >>>>>struct sk_buff *skb,
> >>>>>                  struct netrx_pending_operations *npo,
> >>>>>                  struct page *page, unsigned long size,
> >>>>>-                unsigned long offset, int *head)
> >>>>>+                unsigned long offset, int head, int *first)
> >>>>>  {
> >>>>>      struct gnttab_copy *copy_gop;
> >>>>>      struct netbk_rx_meta *meta;
> >>>>>@@ -479,12 +487,12 @@ static void
> >>>>>netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff
> >>>>>*skb,
> >>>>>          if (bytes > size)
> >>>>>              bytes = size;
> >>>>>  -        if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
> >>>>>+        if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
> >
> >head is always 1 here when processing the header data, so it is
> >impossible to start a new buffer for header data with larger
> >header size, and get_next_rx_buffer is never called. I assume this
> >would not work well for skb with larger header data size. Have you
> >run network test with this patch?
> 
> Sorry, I forgot the offset == MAX_BUFFER_OFFSET case and
> misunderstand your patch, please ignore my last comments. Your patch
> keeps the original DIV_ROUND_UP and changes the mechanism in
> netbk_gop_frag_copy to make slots same with
> xen_netbk_count_skb_slots. All Roads Lead to Rome!:-)
> 

Actually I'm now in favor of Matt's approach as it a) fixes the bug, and
b) makes efficient use of the ring.

Annie, Ian, what's your opinion?


Wei.

> Thanks
> Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-12  8:49               ` Wei Liu
  2013-07-12  9:19                 ` annie li
@ 2013-07-12  9:19                 ` annie li
  2013-07-18 11:47                   ` Ian Campbell
  2013-07-18 11:47                   ` [Xen-devel] " Ian Campbell
  1 sibling, 2 replies; 38+ messages in thread
From: annie li @ 2013-07-12  9:19 UTC (permalink / raw)
  To: Wei Liu; +Cc: Matt Wilson, netdev, Ian Campbell, Xi Xiong, xen-devel


On 2013-7-12 16:49, Wei Liu wrote:
> On Thu, Jul 11, 2013 at 09:52:52PM +0800, annie li wrote:
>> On 2013-7-11 14:01, annie li wrote:
>>> On 2013-7-11 13:14, Matt Wilson wrote:
>>>> On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
>>>>> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
>>>>>> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
>>>>>>> From: Xi Xiong <xixiong@amazon.com>
>>>>>>>
>>>>>>> [ note: I've just cherry picked this onto net-next, and only compile
>>>>>>>    tested. This is an RFC only. -msw ]
>>>>>>>
>>>>>> Should probably rebase it on net.git because it is a bug fix. Let's
>>>>>> worry about that later...
>>>> *nod*
>>>>
>>>>>>> Currently the number of RX slots required to transmit a SKB to
>>>>>>> xen-netfront can be miscalculated when an interface uses a MTU larger
>>>>>>> than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
>>>>>>> pause the queue indefinitely or reuse slots. The former
>>>>>>> manifests as a
>>>>>>> loss of connectivity to the guest (which can be restored by lowering
>>>>>>> the MTU set on the interface). The latter manifests with "Bad grant
>>>>>>> reference" messages from Xen such as:
>>>>>>>
>>>>>>> (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
>>>>>>>
>>>>>>> and kernel messages within the guest such as:
>>>>>>>
>>>>>>> [  180.419567] net eth0: Invalid extra type: 112
>>>>>>> [  180.868620] net eth0: rx->offset: 0, size: 4294967295
>>>>>>> [  180.868629] net eth0: rx->offset: 0, size: 4294967295
>>>>>>>
>>>>>>> BUG_ON() assertions can also be hit if RX slots are exhausted while
>>>>>>> handling a SKB.
>>>>>>>
>>>>>>> This patch changes xen_netbk_rx_action() to count the number of RX
>>>>>>> slots actually consumed by netbk_gop_skb() instead of
>>>>>>> using nr_frags + 1.
>>>>>>> This prevents under-counting the number of RX slots consumed when a
>>>>>>> SKB has a large linear buffer.
>>>>>>>
>>>>>>> Additionally, we now store the estimated number of RX slots required
>>>>>>> to handle a SKB in the cb overlay. This value is used to determine if
>>>>>>> the next SKB in the queue can be processed.
>>>>>>>
>>>>>>> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
>>>>>>> wasted when setting up copy grant table operations for
>>>>>>> SKBs with large
>>>>>>> linear buffers. For example, a SKB with skb_headlen() equal to 8157
>>>>>>> bytes that starts 64 bytes 64 bytes from the start of the page will
>>>>>> Duplicated "64 bytes". And this change looks like an improvement not a
>>>>>> bug fix. Probably submit a separate patch for this?
>>>> Argh, I knew it was in there somewhere (since you pointed it out in
>>>> Dublin :-).
>>>>
>>>> Maybe it could be a separate patch. I think the description is also a
>>>> bit confusing. I'll work on rewording it.
>>>>
>>>>>>> consume three RX slots instead of two. This patch changes the "head"
>>>>>>> parameter to netbk_gop_frag_copy() to act as a flag. When set,
>>>>>>> start_new_rx_buffer() will always place as much data as possible into
>>>>>>> each RX slot.
>>>>>>>
>>>>>>> Signed-off-by: Xi Xiong <xixiong@amazon.com>
>>>>>>> Reviewed-by: Matt Wilson <msw@amazon.com>
>>>>>>> [ msw: minor code cleanups, rewrote commit message, adjusted code
>>>>>>>    to count RX slots instead of meta structures ]
>>>>>>> Signed-off-by: Matt Wilson <msw@amazon.com>
>>>>>>> Cc: Annie Li <annie.li@oracle.com>
>>>>>>> Cc: Wei Liu <wei.liu2@citrix.com>
>>>>>>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
>>>>>>> Cc: netdev@vger.kernel.org
>>>>>>> Cc: xen-devel@lists.xenproject.org
>>>>>>> ---
>>>>>>>   drivers/net/xen-netback/netback.c |   51
>>>>>>> ++++++++++++++++++++++--------------
>>>>>>>   1 files changed, 31 insertions(+), 20 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/xen-netback/netback.c
>>>>>>> b/drivers/net/xen-netback/netback.c
>>>>>>> index 64828de..82dd207 100644
>>>>>>> --- a/drivers/net/xen-netback/netback.c
>>>>>>> +++ b/drivers/net/xen-netback/netback.c
>>>>>>> @@ -110,6 +110,11 @@ union page_ext {
>>>>>>>       void *mapping;
>>>>>>>   };
>>>>>>>   +struct skb_cb_overlay {
>>>>>>> +    int meta_slots_used;
>>>>>>> +    int peek_slots_count;
>>>>>>> +};
>>>>>>> +
>>>>>>>   struct xen_netbk {
>>>>>>>       wait_queue_head_t wq;
>>>>>>>       struct task_struct *task;
>>>>>>> @@ -370,6 +375,7 @@ unsigned int
>>>>>>> xen_netbk_count_skb_slots(struct xenvif *vif, struct
>>>>>>> sk_buff *skb)
>>>>>>>   {
>>>>>>>       unsigned int count;
>>>>>>>       int i, copy_off;
>>>>>>> +    struct skb_cb_overlay *sco;
>>>>>>>         count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
>>>>>>>   @@ -411,6 +417,9 @@ unsigned int
>>>>>>> xen_netbk_count_skb_slots(struct xenvif *vif, struct
>>>>>>> sk_buff *skb)
>>>>>>>                   offset = 0;
>>>>>>>           }
>>>>>>>       }
>>>>>>> +
>>>>>>> +    sco = (struct skb_cb_overlay *) skb->cb;
>>>>>>> +    sco->peek_slots_count = count;
>>>>>>>       return count;
>>>>>>>   }
>>>>>>>   @@ -443,13 +452,12 @@ static struct netbk_rx_meta
>>>>>>> *get_next_rx_buffer(struct xenvif *vif,
>>>>>>>   }
>>>>>>>     /*
>>>>>>> - * Set up the grant operations for this fragment. If it's a flipping
>>>>>>> - * interface, we also set up the unmap request from here.
>>>>>>> + * Set up the grant operations for this fragment.
>>>>>>>    */
>>>>>>>   static void netbk_gop_frag_copy(struct xenvif *vif,
>>>>>>> struct sk_buff *skb,
>>>>>>>                   struct netrx_pending_operations *npo,
>>>>>>>                   struct page *page, unsigned long size,
>>>>>>> -                unsigned long offset, int *head)
>>>>>>> +                unsigned long offset, int head, int *first)
>>>>>>>   {
>>>>>>>       struct gnttab_copy *copy_gop;
>>>>>>>       struct netbk_rx_meta *meta;
>>>>>>> @@ -479,12 +487,12 @@ static void
>>>>>>> netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff
>>>>>>> *skb,
>>>>>>>           if (bytes > size)
>>>>>>>               bytes = size;
>>>>>>>   -        if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
>>>>>>> +        if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
>>> head is always 1 here when processing the header data, so it is
>>> impossible to start a new buffer for header data with larger
>>> header size, and get_next_rx_buffer is never called. I assume this
>>> would not work well for skb with larger header data size. Have you
>>> run network test with this patch?
>> Sorry, I forgot the offset == MAX_BUFFER_OFFSET case and
>> misunderstand your patch, please ignore my last comments. Your patch
>> keeps the original DIV_ROUND_UP and changes the mechanism in
>> netbk_gop_frag_copy to make slots same with
>> xen_netbk_count_skb_slots. All Roads Lead to Rome!:-)
>>
> Actually I'm now in favor of Matt's approach as it a) fix the bug, b)
> make efficient use of the ring.
>
> Annie, Ian, what's your opinion?

Me too. His approach not only fixes this bug but also reduces the 
ring usage for header data, which is good.
Combining my patch with his is not a good idea, so I think we can use 
his patch.

Thanks
Annie
>
>
> Wei.
>
>> Thanks
>> Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-11  5:14         ` [Xen-devel] " Matt Wilson
                             ` (4 preceding siblings ...)
  2013-07-16 10:08           ` annie li
@ 2013-07-16 10:08           ` annie li
  2013-07-16 10:27             ` Wei Liu
  2013-07-16 10:27             ` [Xen-devel] " Wei Liu
  5 siblings, 2 replies; 38+ messages in thread
From: annie li @ 2013-07-16 10:08 UTC (permalink / raw)
  To: Matt Wilson; +Cc: Wei Liu, netdev, Xi Xiong, Ian Campbell, xen-devel


On 2013-7-11 13:14, Matt Wilson wrote:
> I think that this patch addresses the problem more completely. Annie?

Agreed, it is better to take your patch. Would you like to go forward
with it? This issue is obvious and easy to hit with a larger MTU size.

Thanks
Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-16 10:08           ` [Xen-devel] " annie li
  2013-07-16 10:27             ` Wei Liu
@ 2013-07-16 10:27             ` Wei Liu
  2013-07-16 17:40               ` Matt Wilson
  2013-07-16 17:40               ` [Xen-devel] " Matt Wilson
  1 sibling, 2 replies; 38+ messages in thread
From: Wei Liu @ 2013-07-16 10:27 UTC (permalink / raw)
  To: annie li; +Cc: Matt Wilson, Wei Liu, netdev, Xi Xiong, Ian Campbell, xen-devel

On Tue, Jul 16, 2013 at 06:08:03PM +0800, annie li wrote:
> 
> On 2013-7-11 13:14, Matt Wilson wrote:
> >I think that this patch addresses the problem more completely. Annie?
> 
> Agree, it is better to take your patch. Would you like to go forward
> with it? This issue is obvious and easy to be hit with larger MTU
> size.
> 

I'm going to be on vacation tomorrow. Matt, feel free to add my Acked-by to
your patch.


Wei.

> Thanks
> Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-16 10:27             ` [Xen-devel] " Wei Liu
  2013-07-16 17:40               ` Matt Wilson
@ 2013-07-16 17:40               ` Matt Wilson
  1 sibling, 0 replies; 38+ messages in thread
From: Matt Wilson @ 2013-07-16 17:40 UTC (permalink / raw)
  To: Wei Liu; +Cc: annie li, netdev, Xi Xiong, Ian Campbell, xen-devel

On Tue, Jul 16, 2013 at 11:27:19AM +0100, Wei Liu wrote:
> On Tue, Jul 16, 2013 at 06:08:03PM +0800, annie li wrote:
> > 
> > On 2013-7-11 13:14, Matt Wilson wrote:
> > >I think that this patch addresses the problem more completely. Annie?
> > 
> > Agree, it is better to take your patch. Would you like to go forward
> > with it? This issue is obvious and easy to be hit with larger MTU
> > size.
> > 
> 
> I'm going to be on vacation tomorrow. Matt, feel free to add my Acked-by to
> your patch.

Thanks, Wei. I'm on vacation now and I'll return to the office
tomorrow. Please accept my apologies for the delay.

I should have time to clean up a few things with this patch (no
functional changes) and post a final version this week.

--msw

> Wei.
> 
> > Thanks
> > Annie

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Xen-devel] [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs
  2013-07-12  9:19                 ` [Xen-devel] " annie li
  2013-07-18 11:47                   ` Ian Campbell
@ 2013-07-18 11:47                   ` Ian Campbell
  1 sibling, 0 replies; 38+ messages in thread
From: Ian Campbell @ 2013-07-18 11:47 UTC (permalink / raw)
  To: annie li; +Cc: Wei Liu, Matt Wilson, netdev, Xi Xiong, xen-devel

On Fri, 2013-07-12 at 17:19 +0800, annie li wrote:
> On 2013-7-12 16:49, Wei Liu wrote:
> > On Thu, Jul 11, 2013 at 09:52:52PM +0800, annie li wrote:
> >> On 2013-7-11 14:01, annie li wrote:
> >>> On 2013-7-11 13:14, Matt Wilson wrote:
> >>>> On Wed, Jul 10, 2013 at 08:37:03PM +0100, Wei Liu wrote:
> >>>>> On Wed, Jul 10, 2013 at 09:13:33AM +0100, Wei Liu wrote:
> >>>>>> On Tue, Jul 09, 2013 at 10:40:59PM +0000, Matt Wilson wrote:
> >>>>>>> From: Xi Xiong <xixiong@amazon.com>
> >>>>>>>
> >>>>>>> [ note: I've just cherry picked this onto net-next, and only compile
> >>>>>>>    tested. This a RFC only. -msw ]
> >>>>>>>
> >>>>>> Should probably rebase it on net.git because it is a bug fix. Let's
> >>>>>> worry about that later...
> >>>> *nod*
> >>>>
> >>>>>>> Currently the number of RX slots required to transmit a SKB to
> >>>>>>> xen-netfront can be miscalculated when an interface uses a MTU larger
> >>>>>>> than PAGE_SIZE. If the slot calculation is wrong, xen-netback can
> >>>>>>> pause the queue indefinitely or reuse slots. The former
> >>>>>>> manifests as a
> >>>>>>> loss of connectivity to the guest (which can be restored by lowering
> >>>>>>> the MTU set on the interface). The latter manifests with "Bad grant
> >>>>>>> reference" messages from Xen such as:
> >>>>>>>
> >>>>>>> (XEN) grant_table.c:1797:d0 Bad grant reference 264241157
> >>>>>>>
> >>>>>>> and kernel messages within the guest such as:
> >>>>>>>
> >>>>>>> [  180.419567] net eth0: Invalid extra type: 112
> >>>>>>> [  180.868620] net eth0: rx->offset: 0, size: 4294967295
> >>>>>>> [  180.868629] net eth0: rx->offset: 0, size: 4294967295
> >>>>>>>
> >>>>>>> BUG_ON() assertions can also be hit if RX slots are exhausted while
> >>>>>>> handling a SKB.
> >>>>>>>
> >>>>>>> This patch changes xen_netbk_rx_action() to count the number of RX
> >>>>>>> slots actually consumed by netbk_gop_skb() instead of
> >>>>>>> using nr_frags + 1.
> >>>>>>> This prevents under-counting the number of RX slots consumed when a
> >>>>>>> SKB has a large linear buffer.
> >>>>>>>
> >>>>>>> Additionally, we now store the estimated number of RX slots required
> >>>>>>> to handle a SKB in the cb overlay. This value is used to determine if
> >>>>>>> the next SKB in the queue can be processed.
> >>>>>>>
> >>>>>>> Finally, the logic in start_new_rx_buffer() can cause RX slots to be
> >>>>>>> wasted when setting up copy grant table operations for
> >>>>>>> SKBs with large
> >>>>>>> linear buffers. For example, a SKB with skb_headlen() equal to 8157
> >>>>>>> bytes that starts 64 bytes 64 bytes from the start of the page will
> >>>>>> Duplicated "64 bytes". And this change looks like an improvement not a
> >>>>>> bug fix. Probably submit a separate patch for this?
> >>>> Argh, I knew it was in there somewhere (since you pointed it out in
> >>>> Dubin :-).
> >>>>
> >>>> Maybe it could be a separate patch. I think the description is also a
> >>>> bit confusing. I'll work on rewording it.
> >>>>
> >>>>>>> consume three RX slots instead of two. This patch changes the "head"
> >>>>>>> parameter to netbk_gop_frag_copy() to act as a flag. When set,
> >>>>>>> start_new_rx_buffer() will always place as much data as possible into
> >>>>>>> each RX slot.
> >>>>>>>
> >>>>>>> Signed-off-by: Xi Xiong <xixiong@amazon.com>
> >>>>>>> Reviewed-by: Matt Wilson <msw@amazon.com>
> >>>>>>> [ msw: minor code cleanups, rewrote commit message, adjusted code
> >>>>>>>    to count RX slots instead of meta structures ]
> >>>>>>> Signed-off-by: Matt Wilson <msw@amazon.com>
> >>>>>>> Cc: Annie Li <annie.li@oracle.com>
> >>>>>>> Cc: Wei Liu <wei.liu2@citrix.com>
> >>>>>>> Cc: Ian Campbell <Ian.Campbell@citrix.com>
> >>>>>>> Cc: netdev@vger.kernel.org
> >>>>>>> Cc: xen-devel@lists.xenproject.org
> >>>>>>> ---
> >>>>>>>   drivers/net/xen-netback/netback.c |   51
> >>>>>>> ++++++++++++++++++++++--------------
> >>>>>>>   1 files changed, 31 insertions(+), 20 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/net/xen-netback/netback.c
> >>>>>>> b/drivers/net/xen-netback/netback.c
> >>>>>>> index 64828de..82dd207 100644
> >>>>>>> --- a/drivers/net/xen-netback/netback.c
> >>>>>>> +++ b/drivers/net/xen-netback/netback.c
> >>>>>>> @@ -110,6 +110,11 @@ union page_ext {
> >>>>>>>       void *mapping;
> >>>>>>>   };
> >>>>>>>   +struct skb_cb_overlay {
> >>>>>>> +    int meta_slots_used;
> >>>>>>> +    int peek_slots_count;
> >>>>>>> +};
> >>>>>>> +
> >>>>>>>   struct xen_netbk {
> >>>>>>>       wait_queue_head_t wq;
> >>>>>>>       struct task_struct *task;
> >>>>>>> @@ -370,6 +375,7 @@ unsigned int
> >>>>>>> xen_netbk_count_skb_slots(struct xenvif *vif, struct
> >>>>>>> sk_buff *skb)
> >>>>>>>   {
> >>>>>>>       unsigned int count;
> >>>>>>>       int i, copy_off;
> >>>>>>> +    struct skb_cb_overlay *sco;
> >>>>>>>         count = DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);
> >>>>>>>   @@ -411,6 +417,9 @@ unsigned int
> >>>>>>> xen_netbk_count_skb_slots(struct xenvif *vif, struct
> >>>>>>> sk_buff *skb)
> >>>>>>>                   offset = 0;
> >>>>>>>           }
> >>>>>>>       }
> >>>>>>> +
> >>>>>>> +    sco = (struct skb_cb_overlay *) skb->cb;
> >>>>>>> +    sco->peek_slots_count = count;
> >>>>>>>       return count;
> >>>>>>>   }
> >>>>>>>   @@ -443,13 +452,12 @@ static struct netbk_rx_meta
> >>>>>>> *get_next_rx_buffer(struct xenvif *vif,
> >>>>>>>   }
> >>>>>>>     /*
> >>>>>>> - * Set up the grant operations for this fragment. If it's a flipping
> >>>>>>> - * interface, we also set up the unmap request from here.
> >>>>>>> + * Set up the grant operations for this fragment.
> >>>>>>>    */
> >>>>>>>   static void netbk_gop_frag_copy(struct xenvif *vif,
> >>>>>>> struct sk_buff *skb,
> >>>>>>>                   struct netrx_pending_operations *npo,
> >>>>>>>                   struct page *page, unsigned long size,
> >>>>>>> -                unsigned long offset, int *head)
> >>>>>>> +                unsigned long offset, int head, int *first)
> >>>>>>>   {
> >>>>>>>       struct gnttab_copy *copy_gop;
> >>>>>>>       struct netbk_rx_meta *meta;
> >>>>>>> @@ -479,12 +487,12 @@ static void
> >>>>>>> netbk_gop_frag_copy(struct xenvif *vif, struct sk_buff
> >>>>>>> *skb,
> >>>>>>>           if (bytes > size)
> >>>>>>>               bytes = size;
> >>>>>>>   -        if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
> >>>>>>> +        if (start_new_rx_buffer(npo->copy_off, bytes, head)) {
> >>> head is always 1 here when processing the header data, so it is
> >>> impossible to start a new buffer for header data with larger
> >>> header size, and get_next_rx_buffer is never called. I assume this
> >>> would not work well for skb with larger header data size. Have you
> >>> run network test with this patch?
> >> Sorry, I forgot the offset == MAX_BUFFER_OFFSET case and
> >> misunderstood your patch, please ignore my last comments. Your patch
> >> keeps the original DIV_ROUND_UP and changes the mechanism in
> >> netbk_gop_frag_copy to make slots same with
> >> xen_netbk_count_skb_slots. All Roads Lead to Rome!:-)
> >>
> > Actually I'm now in favor of Matt's approach as it a) fix the bug, b)
> > make efficient use of the ring.
> >
> > Annie, Ian, what's your opinion?
> 
> Me too. Looks his approach not only fix this bug, but also saves the 
> ring usage for header data, and this is good.
> Combining my patch with his is not a good idea, so I think we can use 
> his patch.

Sounds good to me. I didn't review the patch in
<1373409659-22383-1-git-send-email-msw@amazon.com> in much detail
because I think we are expecting a respin with some of Wei & Annie's
comments addressed? If I'm wrong and I've just missed that posting please
cluebat me.

Ian.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2013-07-18 11:47 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-09  6:15 [PATCH 1/1] xen/netback: correctly calculate required slots of skb Annie Li
2013-07-09 16:16 ` Wei Liu
2013-07-09 16:16   ` Wei Liu
2013-07-10  1:57   ` annie li
2013-07-10  2:31   ` annie li
2013-07-10  7:48     ` Wei Liu
2013-07-10  7:48       ` Wei Liu
2013-07-10  8:34       ` annie li
2013-07-09 22:14 ` Matt Wilson
2013-07-09 22:14   ` Matt Wilson
2013-07-09 22:40   ` [PATCH RFC] xen-netback: calculate the number of slots required for large MTU vifs Matt Wilson
2013-07-09 22:40   ` Matt Wilson
2013-07-10  8:13     ` Wei Liu
2013-07-10 19:37       ` Wei Liu
2013-07-11  1:25         ` annie li
2013-07-11  1:25         ` annie li
2013-07-11  5:14         ` [Xen-devel] " Matt Wilson
2013-07-11  6:01           ` annie li
2013-07-11  6:01           ` [Xen-devel] " annie li
2013-07-11 13:52             ` annie li
2013-07-11 13:52             ` [Xen-devel] " annie li
2013-07-12  8:49               ` Wei Liu
2013-07-12  9:19                 ` annie li
2013-07-12  9:19                 ` [Xen-devel] " annie li
2013-07-18 11:47                   ` Ian Campbell
2013-07-18 11:47                   ` [Xen-devel] " Ian Campbell
2013-07-12  8:49               ` Wei Liu
2013-07-11  8:46           ` [Xen-devel] " Wei Liu
2013-07-11  8:46           ` Wei Liu
2013-07-16 10:08           ` annie li
2013-07-16 10:08           ` [Xen-devel] " annie li
2013-07-16 10:27             ` Wei Liu
2013-07-16 10:27             ` [Xen-devel] " Wei Liu
2013-07-16 17:40               ` Matt Wilson
2013-07-16 17:40               ` [Xen-devel] " Matt Wilson
2013-07-11  5:14         ` Matt Wilson
2013-07-10 19:37       ` Wei Liu
2013-07-10  8:13     ` Wei Liu
