* [PATCH net v2 0/3] xen-netback: fix rx slot estimation
@ 2014-03-27 12:56 Paul Durrant
  2014-03-27 12:56 ` [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement Paul Durrant
                   ` (5 more replies)
  0 siblings, 6 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 12:56 UTC (permalink / raw)
  To: xen-devel, netdev

Sander Eikelenboom reported an issue with ring overflow in netback in
3.14-rc3. This turns out to be because of a bug in the ring slot estimation
code. This patch series fixes the slot estimation, fixes the BUG_ON() that
was supposed to catch the issue that Sander ran into, and also makes a small
fix to start_new_rx_buffer().

v2:
 - Added BUG_ON() to patch #1
 - Added more explanation to patch #3

* [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 12:56 [PATCH net v2 0/3] xen-netback: fix rx slot estimation Paul Durrant
  2014-03-27 12:56 ` [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement Paul Durrant
@ 2014-03-27 12:56 ` Paul Durrant
  2014-03-27 13:45   ` Sander Eikelenboom
  2014-03-27 13:45   ` Sander Eikelenboom
  2014-03-27 12:56 ` [PATCH net v2 2/3] xen-netback: worse-case estimate in xenvif_rx_action is underestimating Paul Durrant
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 12:56 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Ian Campbell, Wei Liu, Sander Eikelenboom

This patch removes a test in start_new_rx_buffer() that checks whether
a copy operation is less than MAX_BUFFER_OFFSET in length, since
MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
---

v2:
 - Add BUG_ON() as suggested by Ian Campbell

 drivers/net/xen-netback/netback.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 438d0c0..72314c7 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
 	 * into multiple copies tend to give large frags their
 	 * own buffers as before.
 	 */
-	if ((offset + size > MAX_BUFFER_OFFSET) &&
-	    (size <= MAX_BUFFER_OFFSET) && offset && !head)
+	BUG_ON(size > MAX_BUFFER_OFFSET);
+	if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
 		return true;
 
 	return false;
-- 
1.7.10.4
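
For readers without the driver source to hand: the only caller, xenvif_gop_frag_copy(), clamps every chunk it passes down to the remainder of the current source page, which is why the removed (size <= MAX_BUFFER_OFFSET) clause was always true and why the new BUG_ON() should never fire. A toy userspace model of that chunking loop (illustrative values and simplified logic, not the driver code itself):

#include <assert.h>
#include <stdio.h>

#define PAGE_SIZE         4096u
#define MAX_BUFFER_OFFSET PAGE_SIZE  /* defined to be PAGE_SIZE in xen-netback */

int main(void)
{
	unsigned int size = 10000;   /* made-up frag length          */
	unsigned int offset = 3000;  /* made-up start offset in page */

	while (size > 0) {
		/* clamp each chunk to the remainder of the source page */
		unsigned int bytes = PAGE_SIZE - offset;

		if (bytes > size)
			bytes = size;

		/* 'bytes' is what start_new_rx_buffer() sees as 'size',
		 * so it can never exceed MAX_BUFFER_OFFSET */
		assert(bytes <= MAX_BUFFER_OFFSET);
		printf("copy chunk: %u bytes\n", bytes);

		offset = (offset + bytes) % PAGE_SIZE;
		size -= bytes;
	}
	return 0;
}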

* [PATCH net v2 2/3] xen-netback: worse-case estimate in xenvif_rx_action is underestimating
  2014-03-27 12:56 [PATCH net v2 0/3] xen-netback: fix rx slot estimation Paul Durrant
                   ` (2 preceding siblings ...)
  2014-03-27 12:56 ` [PATCH net v2 2/3] xen-netback: worse-case estimate in xenvif_rx_action is underestimating Paul Durrant
@ 2014-03-27 12:56 ` Paul Durrant
  2014-03-27 12:56 ` [PATCH net v2 3/3] xen-netback: BUG_ON in xenvif_rx_action() not catching overflow Paul Durrant
  2014-03-27 12:56 ` Paul Durrant
  5 siblings, 0 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 12:56 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Ian Campbell, Wei Liu, Sander Eikelenboom

The worse-case estimate for skb ring slot usage in xenvif_rx_action()
fails to take fragment page_offset into account. The page_offset does,
however, affect the number of times the fragmentation code calls
start_new_rx_buffer() (i.e. consumes another slot), and the worse-case
estimate should assume that it will always return true. This patch adds
the page_offset into the DIV_ROUND_UP for each frag.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
---
 drivers/net/xen-netback/netback.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 72314c7..aa87ae8 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -493,8 +493,18 @@ static void xenvif_rx_action(struct xenvif *vif)
 						PAGE_SIZE);
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			unsigned int size;
+			unsigned int offset;
+
 			size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
-			max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
+			offset = skb_shinfo(skb)->frags[i].page_offset;
+
+			/* For a worse-case estimate we need to factor in
+			 * the fragment page offset as this will affect the
+			 * number of times xenvif_gop_frag_copy() will
+			 * call start_new_rx_buffer().
+			 */
+			max_slots_needed += DIV_ROUND_UP(offset + size,
+							 PAGE_SIZE);
 		}
 		if (skb_is_gso(skb) &&
 		   (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4 ||
-- 
1.7.10.4
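
A worked example of the undercount being fixed here, as a toy userspace calculation (PAGE_SIZE assumed to be 4096; the frag numbers are made up): a 100-byte frag that starts at page_offset 4000 straddles a page boundary, so it can trigger an extra call to start_new_rx_buffer() and occupy two ring slots, yet the old estimate charged it only one.

#include <stdio.h>

#define PAGE_SIZE 4096u
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
	unsigned int size = 100;     /* frag length      */
	unsigned int offset = 4000;  /* frag page_offset */

	/* old per-frag estimate: page_offset ignored -> prints 1 */
	printf("old estimate: %u slot(s)\n", DIV_ROUND_UP(size, PAGE_SIZE));

	/* patched estimate: page_offset folded in -> prints 2 */
	printf("new estimate: %u slot(s)\n",
	       DIV_ROUND_UP(offset + size, PAGE_SIZE));
	return 0;
}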

* [PATCH net v2 3/3] xen-netback: BUG_ON in xenvif_rx_action() not catching overflow
  2014-03-27 12:56 [PATCH net v2 0/3] xen-netback: fix rx slot estimation Paul Durrant
                   ` (4 preceding siblings ...)
  2014-03-27 12:56 ` [PATCH net v2 3/3] xen-netback: BUG_ON in xenvif_rx_action() not catching overflow Paul Durrant
@ 2014-03-27 12:56 ` Paul Durrant
  5 siblings, 0 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 12:56 UTC (permalink / raw)
  To: xen-devel, netdev; +Cc: Paul Durrant, Ian Campbell, Wei Liu, Sander Eikelenboom

The BUG_ON to catch ring overflow in xenvif_rx_action() makes the assumption
that meta_slots_used == ring slots used. This is not necessarily the case
for GSO packets, because the non-prefix GSO protocol consumes one more ring
slot than it does meta slots, to carry the 'extra_info'. This patch changes
the test to actually check ring slots.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
---

v2:
 - Add explanation as to why meta_slots_used != ring slots used

 drivers/net/xen-netback/netback.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index aa87ae8..cbe6d6a 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -482,6 +482,8 @@ static void xenvif_rx_action(struct xenvif *vif)
 
 	while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
 		RING_IDX max_slots_needed;
+		RING_IDX old_req_cons;
+		RING_IDX ring_slots_used;
 		int i;
 
 		/* We need a cheap worse case estimate for the number of
@@ -521,8 +523,12 @@ static void xenvif_rx_action(struct xenvif *vif)
 			vif->rx_last_skb_slots = 0;
 
 		sco = (struct skb_cb_overlay *)skb->cb;
+
+		old_req_cons = vif->rx.req_cons;
 		sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
-		BUG_ON(sco->meta_slots_used > max_slots_needed);
+		ring_slots_used = vif->rx.req_cons - old_req_cons;
+
+		BUG_ON(ring_slots_used > max_slots_needed);
 
 		__skb_queue_tail(&rxq, skb);
 	}
-- 
1.7.10.4
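
The off-by-one the old check could miss is easy to see with made-up numbers (a toy model, not traced from the driver): the 'extra_info' segment of a non-prefix GSO packet consumes a ring request but produces no meta slot, so the ring consumption is meta_slots_used + 1 and is only visible in the req_cons delta.

#include <stdio.h>

int main(void)
{
	unsigned int meta_slots_used = 4; /* say, header plus three frag buffers */
	unsigned int extra_slots = 1;     /* GSO 'extra_info': ring slot only    */

	unsigned int old_req_cons = 100;  /* consumer index before xenvif_gop_skb() */
	unsigned int req_cons = old_req_cons + meta_slots_used + extra_slots;

	unsigned int ring_slots_used = req_cons - old_req_cons;

	/* the old BUG_ON() compared 4 against the estimate while the ring
	 * actually consumed 5 slots, so a one-slot overflow slipped past */
	printf("meta slots: %u, ring slots: %u\n",
	       meta_slots_used, ring_slots_used);
	return 0;
}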

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 12:56 ` Paul Durrant
  2014-03-27 13:45   ` Sander Eikelenboom
@ 2014-03-27 13:45   ` Sander Eikelenboom
  2014-03-27 13:54     ` Paul Durrant
                       ` (3 more replies)
  1 sibling, 4 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 13:45 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu


Thursday, March 27, 2014, 1:56:11 PM, you wrote:

> This patch removes a test in start_new_rx_buffer() that checks whether
> a copy operation is less than MAX_BUFFER_OFFSET in length, since
> MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
> start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less.

> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Sander Eikelenboom <linux@eikelenboom.it>
> ---

> v2:
>  - Add BUG_ON() as suggested by Ian Campbell

>  drivers/net/xen-netback/netback.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 438d0c0..72314c7 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset, unsigned long size, int head)
>          * into multiple copies tend to give large frags their
>          * own buffers as before.
>          */
> -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> +       BUG_ON(size > MAX_BUFFER_OFFSET);
> +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
>                 return true;
>  
>         return false;

Hi Paul,

Unfortunately .. no good ..

With these patches (v2) applied to 3.14-rc8 it all seems to work well,
until I run my test case .. it still chokes and now effectively permanently stalls network traffic to that guest.

No error messages or anything in either xl dmesg or dmesg on the host .. and nothing in dmesg in the guest either.

But in the guest the TX byte count ifconfig reports for eth0 still increases while the RX count does nothing, so it seems only the RX path is affected.

So it now seems I have the situation which you described in the commit message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c":
"Without this patch I can trivially stall netback permanently by just doing a large guest to guest file copy between two Windows Server 2008R2 VMs on a single host."

--
Sander

* RE: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 13:45   ` Sander Eikelenboom
  2014-03-27 13:54     ` Paul Durrant
@ 2014-03-27 13:54     ` Paul Durrant
  2014-03-27 14:02       ` Sander Eikelenboom
  2014-03-27 14:02       ` Sander Eikelenboom
  2014-03-27 14:00     ` Paul Durrant
  2014-03-27 14:00     ` Paul Durrant
  3 siblings, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 13:54 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 13:46
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
> statement
> 
> 
> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
> 
> > This patch removes a test in start_new_rx_buffer() that checks whether
> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less.
> 
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Cc: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
> > ---
> 
> > v2:
> >  - Add BUG_ON() as suggested by Ian Campbell
> 
> >  drivers/net/xen-netback/netback.c |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback.c
> > index 438d0c0..72314c7 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
> unsigned long size, int head)
> >          * into multiple copies tend to give large frags their
> >          * own buffers as before.
> >          */
> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
> >                 return true;
> >
> >         return false;
> 
> Hi Paul,
> 
> Unfortunately .. no good ..
> 
> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
> until i do my test case .. it still chokes and now effectively permanently stalls
> network traffic to that guest.
> 
> No error messages or anything in either xl dmesg or dmesg on the host .. and
> nothing in dmesg in the guest either.
> 
> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
> bytes does nothing, so it seems only the RX path is effected)
> 

But you're not getting ring overflow, right? So that suggests this series is working and you're now hitting another problem? I don't see how these patches could directly cause the new behaviour you're seeing.

  Paul

> So it now seems i now have the situation which you described in the commit
> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
> "Without this patch I can trivially stall netback permanently by just doing a
> large guest to guest file copy between two Windows Server 2008R2 VMs on a
> single host."
> 
> --
> Sander

* RE: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 13:45   ` Sander Eikelenboom
  2014-03-27 13:54     ` Paul Durrant
  2014-03-27 13:54     ` Paul Durrant
@ 2014-03-27 14:00     ` Paul Durrant
  2014-03-27 14:05       ` Sander Eikelenboom
  2014-03-27 14:05       ` Sander Eikelenboom
  2014-03-27 14:00     ` Paul Durrant
  3 siblings, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 14:00 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 13:46
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
> statement
> 
> 
> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
> 
> > This patch removes a test in start_new_rx_buffer() that checks whether
> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less.
> 
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Cc: Ian Campbell <ian.campbell@citrix.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
> > ---
> 
> > v2:
> >  - Add BUG_ON() as suggested by Ian Campbell
> 
> >  drivers/net/xen-netback/netback.c |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback.c
> > index 438d0c0..72314c7 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
> unsigned long size, int head)
> >          * into multiple copies tend to give large frags their
> >          * own buffers as before.
> >          */
> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
> >                 return true;
> >
> >         return false;
> 
> Hi Paul,
> 
> Unfortunately .. no good ..
> 
> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
> until i do my test case .. it still chokes and now effectively permanently stalls
> network traffic to that guest.
> 
> No error messages or anything in either xl dmesg or dmesg on the host .. and
> nothing in dmesg in the guest either.
> 
> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
> bytes does nothing, so it seems only the RX path is effected)
> 
> So it now seems i now have the situation which you described in the commit
> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
> "Without this patch I can trivially stall netback permanently by just doing a
> large guest to guest file copy between two Windows Server 2008R2 VMs on a
> single host."
> 

Assuming it is similar, the tx queue should be stalled. It's possible the frontend is not posting enough rx requests into the ring to take the packet. Can you tell if this is happening?

  Paul

> --
> Sander

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 13:54     ` Paul Durrant
@ 2014-03-27 14:02       ` Sander Eikelenboom
  2014-03-27 14:09         ` Paul Durrant
  2014-03-27 14:09         ` Paul Durrant
  2014-03-27 14:02       ` Sander Eikelenboom
  1 sibling, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 14:02 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu


Thursday, March 27, 2014, 2:54:46 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 27 March 2014 13:46
>> To: Paul Durrant
>> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
>> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
>> statement
>> 
>> 
>> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
>> 
>> > This patch removes a test in start_new_rx_buffer() that checks whether
>> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
>> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
>> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less.
>> 
>> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> > Cc: Ian Campbell <ian.campbell@citrix.com>
>> > Cc: Wei Liu <wei.liu2@citrix.com>
>> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
>> > ---
>> 
>> > v2:
>> >  - Add BUG_ON() as suggested by Ian Campbell
>> 
>> >  drivers/net/xen-netback/netback.c |    4 ++--
>> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
>> netback/netback.c
>> > index 438d0c0..72314c7 100644
>> > --- a/drivers/net/xen-netback/netback.c
>> > +++ b/drivers/net/xen-netback/netback.c
>> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
>> unsigned long size, int head)
>> >          * into multiple copies tend to give large frags their
>> >          * own buffers as before.
>> >          */
>> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
>> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
>> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
>> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
>> >                 return true;
>> >
>> >         return false;
>> 
>> Hi Paul,
>> 
>> Unfortunately .. no good ..
>> 
>> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
>> until i do my test case .. it still chokes and now effectively permanently stalls
>> network traffic to that guest.
>> 
>> No error messages or anything in either xl dmesg or dmesg on the host .. and
>> nothing in dmesg in the guest either.
>> 
>> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
>> bytes does nothing, so it seems only the RX path is effected)
>> 

> But you're not getting ring overflow, right? So that suggests this series is working and you're now hitting another problem? I don't see how these patches could directly cause the new behaviour you're seeing.

Don't know .. however .. I previously tested:
        - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()", and that circumvented the problem reliably without causing anything else
        - reverting the calculation of "max_slots_needed + 1"  in "net_rx_action()" to what it was before:
                int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
                if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
                        max += MAX_SKB_FRAGS + 1; /* extra_info + frags */

So that leads me to think it's something caused by this patch set.
Patch 1 could be a candidate .. perhaps that check was needed for some reason .. I will see what not applying that one does

--
Sander


>   Paul

>> So it now seems i now have the situation which you described in the commit
>> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
>> "Without this patch I can trivially stall netback permanently by just doing a
>> large guest to guest file copy between two Windows Server 2008R2 VMs on a
>> single host."
>> 
>> --
>> Sander

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 14:00     ` Paul Durrant
  2014-03-27 14:05       ` Sander Eikelenboom
@ 2014-03-27 14:05       ` Sander Eikelenboom
  1 sibling, 0 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 14:05 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu


Thursday, March 27, 2014, 3:00:14 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 27 March 2014 13:46
>> To: Paul Durrant
>> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
>> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
>> statement
>> 
>> 
>> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
>> 
>> > This patch removes a test in start_new_rx_buffer() that checks whether
>> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
>> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
>> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less.
>> 
>> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> > Cc: Ian Campbell <ian.campbell@citrix.com>
>> > Cc: Wei Liu <wei.liu2@citrix.com>
>> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
>> > ---
>> 
>> > v2:
>> >  - Add BUG_ON() as suggested by Ian Campbell
>> 
>> >  drivers/net/xen-netback/netback.c |    4 ++--
>> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
>> netback/netback.c
>> > index 438d0c0..72314c7 100644
>> > --- a/drivers/net/xen-netback/netback.c
>> > +++ b/drivers/net/xen-netback/netback.c
>> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
>> unsigned long size, int head)
>> >          * into multiple copies tend to give large frags their
>> >          * own buffers as before.
>> >          */
>> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
>> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
>> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
>> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
>> >                 return true;
>> >
>> >         return false;
>> 
>> Hi Paul,
>> 
>> Unfortunately .. no good ..
>> 
>> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
>> until i do my test case .. it still chokes and now effectively permanently stalls
>> network traffic to that guest.
>> 
>> No error messages or anything in either xl dmesg or dmesg on the host .. and
>> nothing in dmesg in the guest either.
>> 
>> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
>> bytes does nothing, so it seems only the RX path is effected)
>> 
>> So it now seems i now have the situation which you described in the commit
>> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
>> "Without this patch I can trivially stall netback permanently by just doing a
>> large guest to guest file copy between two Windows Server 2008R2 VMs on a
>> single host."
>> 

> Assuming it is similar, then the tx queue should be stalled. It's possible the frontend is not posting enough rx requests into the ring to take the packet. Can you tell if this is happening?

How would I do that?
(and it seems unlikely, since the test case that triggers it is swamping the interface with both inbound and outbound traffic)

>   Paul

>> --
>> Sander

* RE: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 14:02       ` Sander Eikelenboom
@ 2014-03-27 14:09         ` Paul Durrant
  2014-03-27 14:29           ` Sander Eikelenboom
                             ` (3 more replies)
  2014-03-27 14:09         ` Paul Durrant
  1 sibling, 4 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 14:09 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 14:03
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
> statement
> 
> 
> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 27 March 2014 13:46
> >> To: Paul Durrant
> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
> Liu
> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> from if
> >> statement
> >>
> >>
> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
> >>
> >> > This patch removes a test in start_new_rx_buffer() that checks whether
> >> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
> >> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or
> less.
> >>
> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
> >> > Cc: Wei Liu <wei.liu2@citrix.com>
> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
> >> > ---
> >>
> >> > v2:
> >> >  - Add BUG_ON() as suggested by Ian Campbell
> >>
> >> >  drivers/net/xen-netback/netback.c |    4 ++--
> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> >> netback/netback.c
> >> > index 438d0c0..72314c7 100644
> >> > --- a/drivers/net/xen-netback/netback.c
> >> > +++ b/drivers/net/xen-netback/netback.c
> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
> >> unsigned long size, int head)
> >> >          * into multiple copies tend to give large frags their
> >> >          * own buffers as before.
> >> >          */
> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
> >> >                 return true;
> >> >
> >> >         return false;
> >>
> >> Hi Paul,
> >>
> >> Unfortunately .. no good ..
> >>
> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
> >> until i do my test case .. it still chokes and now effectively permanently
> stalls
> >> network traffic to that guest.
> >>
> >> No error messages or anything in either xl dmesg or dmesg on the host ..
> and
> >> nothing in dmesg in the guest either.
> >>
> >> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
> >> bytes does nothing, so it seems only the RX path is effected)
> >>
> 
> > But you're not getting ring overflow, right? So that suggests this series is
> working and you're now hitting another problem? I don't see how these
> patches could directly cause the new behaviour you're seeing.
> 
> Don't know  .. how ever .. i previously tested:
>         - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()",
> and that circumvented the problem reliably without causing anything else
>         - reverting the calculation of "max_slots_needed + 1"  in
> "net_rx_action()" to what it was before :
>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> 

So, it may be that the worse-case estimate is now too pessimistic. In the case where it's failing for you it would be nice to know what the estimate was. Looking at netfront, we could be in trouble if it ever goes above 64.

  Paul

> So that leads me to think it's something caused by this patch set.
> Patch1 could be a candidate .. perhaps that check was needed for some
> reason .. will see what not applying that one does
> 
> --
> Sander
> 
> 
> >   Paul
> 
> >> So it now seems i now have the situation which you described in the
> commit
> >> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
> >> "Without this patch I can trivially stall netback permanently by just doing a
> >> large guest to guest file copy between two Windows Server 2008R2 VMs
> on a
> >> single host."
> >>
> >> --
> >> Sander
> 
> 
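
A toy model of the stall mode being hypothesized here (the predicate name and the frontend limit are assumptions for illustration, not netback or netfront code): if the worst-case estimate can exceed the number of rx requests the frontend will ever have outstanding, the availability check never succeeds and the packet waits forever with nothing logged, matching the symptoms Sander describes.

#include <stdbool.h>
#include <stdio.h>

/* illustrative predicate: netback only dequeues the skb once the
 * frontend has posted at least 'needed' rx requests */
static bool rx_slots_available(unsigned int posted, unsigned int needed)
{
	return posted >= needed;
}

int main(void)
{
	unsigned int frontend_posted = 64;  /* hypothetical frontend ceiling  */
	unsigned int max_slots_needed = 65; /* estimate just over the ceiling */

	printf("can deliver: %s\n",
	       rx_slots_available(frontend_posted, max_slots_needed)
	       ? "yes" : "never");
	return 0;
}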

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 14:02       ` Sander Eikelenboom
  2014-03-27 14:09         ` Paul Durrant
@ 2014-03-27 14:09         ` Paul Durrant
  1 sibling, 0 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 14:09 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 14:03
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
> statement
> 
> 
> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 27 March 2014 13:46
> >> To: Paul Durrant
> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
> Liu
> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> from if
> >> statement
> >>
> >>
> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
> >>
> >> > This patch removes a test in start_new_rx_buffer() that checks whether
> >> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
> >> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or
> less.
> >>
> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
> >> > Cc: Wei Liu <wei.liu2@citrix.com>
> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
> >> > ---
> >>
> >> > v2:
> >> >  - Add BUG_ON() as suggested by Ian Campbell
> >>
> >> >  drivers/net/xen-netback/netback.c |    4 ++--
> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> >> netback/netback.c
> >> > index 438d0c0..72314c7 100644
> >> > --- a/drivers/net/xen-netback/netback.c
> >> > +++ b/drivers/net/xen-netback/netback.c
> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
> >> unsigned long size, int head)
> >> >          * into multiple copies tend to give large frags their
> >> >          * own buffers as before.
> >> >          */
> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
> >> >                 return true;
> >> >
> >> >         return false;
> >>
> >> Hi Paul,
> >>
> >> Unfortunately .. no good ..
> >>
> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
> >> until i do my test case .. it still chokes and now effectively permanently
> stalls
> >> network traffic to that guest.
> >>
> >> No error messages or anything in either xl dmesg or dmesg on the host ..
> and
> >> nothing in dmesg in the guest either.
> >>
> >> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
> >> bytes does nothing, so it seems only the RX path is effected)
> >>
> 
> > But you're not getting ring overflow, right? So that suggests this series is
> working and you're now hitting another problem? I don't see how these
> patches could directly cause the new behaviour you're seeing.
> 
> Don't know  .. how ever .. i previously tested:
>         - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()",
> and that circumvented the problem reliably without causing anything else
>         - reverting the calculation of "max_slots_needed + 1"  in
> "net_rx_action()" to what it was before :
>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> 

So, it may be that the worse-case estimate is now too bad. In the case where it's failing for you it would be nice to know what the estimate was. Looking at netfront, we could be in trouble if it ever goes above 64.

  Paul
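
For reference, the estimate under discussion is the max_slots_needed calculation in xenvif_rx_action(). A minimal sketch of what the v2 logic amounts to, assuming the per-frag term charges for every page spanned from the frag's actual offset (identifiers follow the 3.14 netback source, but this is a reconstruction, not the exact patch):

	unsigned int max_slots_needed;
	int i;

	/* The linear area may start at any offset within its first page. */
	max_slots_needed = DIV_ROUND_UP(offset_in_page(skb->data) +
					skb_headlen(skb), PAGE_SIZE);

	/* Worst case per frag: every page boundary between offset 0 and
	 * offset + size costs a slot. Compound-page frags can have
	 * offset > PAGE_SIZE, so this term can grow large. */
	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		unsigned int size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
		unsigned int offset = skb_shinfo(skb)->frags[i].page_offset;

		max_slots_needed += DIV_ROUND_UP(offset + size, PAGE_SIZE);
	}

	/* One extra slot for the GSO extra_info. */
	if (skb_is_gso(skb))
		max_slots_needed++;
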

> So that leads me to think it's something caused by this patch set.
> Patch1 could be a candidate .. perhaps that check was needed for some
> reason .. will see what not applying that one does
> 
> --
> Sander
> 
> 
> >   Paul
> 
> >> So it now seems I have the situation which you described in the
> commit
> >> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
> >> "Without this patch I can trivially stall netback permanently by just doing a
> >> large guest to guest file copy between two Windows Server 2008R2 VMs
> on a
> >> single host."
> >>
> >> --
> >> Sander
> 
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 14:09         ` Paul Durrant
  2014-03-27 14:29           ` Sander Eikelenboom
@ 2014-03-27 14:29           ` Sander Eikelenboom
  2014-03-27 14:38             ` Paul Durrant
  2014-03-27 14:38             ` Paul Durrant
  2014-03-27 16:46           ` Sander Eikelenboom
  2014-03-27 16:46           ` Sander Eikelenboom
  3 siblings, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 14:29 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu


Thursday, March 27, 2014, 3:09:32 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 27 March 2014 14:03
>> To: Paul Durrant
>> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
>> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
>> statement
>> 
>> 
>> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 27 March 2014 13:46
>> >> To: Paul Durrant
>> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
>> Liu
>> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
>> from if
>> >> statement
>> >>
>> >>
>> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
>> >>
>> >> > This patch removes a test in start_new_rx_buffer() that checks whether
>> >> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
>> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
>> >> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or
>> less.
>> >>
>> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
>> >> > Cc: Wei Liu <wei.liu2@citrix.com>
>> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
>> >> > ---
>> >>
>> >> > v2:
>> >> >  - Add BUG_ON() as suggested by Ian Campbell
>> >>
>> >> >  drivers/net/xen-netback/netback.c |    4 ++--
>> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >>
>> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
>> >> netback/netback.c
>> >> > index 438d0c0..72314c7 100644
>> >> > --- a/drivers/net/xen-netback/netback.c
>> >> > +++ b/drivers/net/xen-netback/netback.c
>> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
>> >> unsigned long size, int head)
>> >> >          * into multiple copies tend to give large frags their
>> >> >          * own buffers as before.
>> >> >          */
>> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
>> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
>> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
>> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
>> >> >                 return true;
>> >> >
>> >> >         return false;
>> >>
>> >> Hi Paul,
>> >>
>> >> Unfortunately .. no good ..
>> >>
>> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
>> >> until i do my test case .. it still chokes and now effectively permanently
>> stalls
>> >> network traffic to that guest.
>> >>
>> >> No error messages or anything in either xl dmesg or dmesg on the host ..
>> and
>> >> nothing in dmesg in the guest either.
>> >>
>> >> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
>> >> bytes do nothing, so it seems only the RX path is affected)
>> >>
>> 
>> > But you're not getting ring overflow, right? So that suggests this series is
>> working and you're now hitting another problem? I don't see how these
>> patches could directly cause the new behaviour you're seeing.
>> 
>> Don't know .. however .. I previously tested:
>>         - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()",
>> and that circumvented the problem reliably without causing anything else
>>         - reverting the calculation of "max_slots_needed + 1"  in
>> "net_rx_action()" to what it was before :
>>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>> 

> So, it may be that the worst-case estimate is now too bad. In the case where it's failing for you it would be nice to know what the estimate was. Looking at netfront, we could be in trouble if it ever goes above 64.

It probably isn't .. from what I've previously seen .. the max was around 13 if I recall correctly, but I could put a check on that.
And since I don't know *why* it fails .. it's hard to put a warn on it.
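
Such a check could be as simple as this hypothetical snippet in xenvif_rx_action() (not part of this series; the threshold of 64 is taken from Paul's remark above):

	/* Hypothetical sanity check: warn when the estimate exceeds the
	 * level flagged as dangerous for netfront. */
	if (unlikely(max_slots_needed > 64))
		netdev_warn(vif->dev, "max_slots_needed: %d\n",
			    max_slots_needed);
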

>   Paul

>> So that leads me to think it's something caused by this patch set.
>> Patch1 could be a candidate .. perhaps that check was needed for some
>> reason .. will see what not applying that one does
>> 
>> --
>> Sander
>> 
>> 
>> >   Paul
>> 
>> >> So it now seems I have the situation which you described in the
>> commit
>> >> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
>> >> "Without this patch I can trivially stall netback permanently by just doing a
>> >> large guest to guest file copy between two Windows Server 2008R2 VMs
>> on a
>> >> single host."
>> >>
>> >> --
>> >> Sander
>> 
>> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 14:29           ` Sander Eikelenboom
  2014-03-27 14:38             ` Paul Durrant
@ 2014-03-27 14:38             ` Paul Durrant
  1 sibling, 0 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 14:38 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 14:29
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
> statement
> 
> 
> Thursday, March 27, 2014, 3:09:32 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 27 March 2014 14:03
> >> To: Paul Durrant
> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
> Liu
> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> from if
> >> statement
> >>
> >>
> >> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
> >>
> >> >> -----Original Message-----
> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> Sent: 27 March 2014 13:46
> >> >> To: Paul Durrant
> >> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell;
> Wei
> >> Liu
> >> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> >> from if
> >> >> statement
> >> >>
> >> >>
> >> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
> >> >>
> >> >> > This patch removes a test in start_new_rx_buffer() that checks
> whether
> >> >> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
> >> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller
> of
> >> >> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE
> or
> >> less.
> >> >>
> >> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
> >> >> > Cc: Wei Liu <wei.liu2@citrix.com>
> >> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
> >> >> > ---
> >> >>
> >> >> > v2:
> >> >> >  - Add BUG_ON() as suggested by Ian Campbell
> >> >>
> >> >> >  drivers/net/xen-netback/netback.c |    4 ++--
> >> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >> >>
> >> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> >> >> netback/netback.c
> >> >> > index 438d0c0..72314c7 100644
> >> >> > --- a/drivers/net/xen-netback/netback.c
> >> >> > +++ b/drivers/net/xen-netback/netback.c
> >> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
> >> >> unsigned long size, int head)
> >> >> >          * into multiple copies tend to give large frags their
> >> >> >          * own buffers as before.
> >> >> >          */
> >> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> >> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> >> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
> >> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
> >> >> >                 return true;
> >> >> >
> >> >> >         return false;
> >> >>
> >> >> Hi Paul,
> >> >>
> >> >> Unfortunately .. no good ..
> >> >>
> >> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
> >> >> until i do my test case .. it still chokes and now effectively permanently
> >> stalls
> >> >> network traffic to that guest.
> >> >>
> >> >> No error messages or anything in either xl dmesg or dmesg on the host
> ..
> >> and
> >> >> nothing in dmesg in the guest either.
> >> >>
> >> >> But in the guest the TX bytes ifconfig reports for eth0 still increase but
> RX
> >> >> bytes do nothing, so it seems only the RX path is affected)
> >> >>
> >>
> >> > But you're not getting ring overflow, right? So that suggests this series is
> >> working and you're now hitting another problem? I don't see how these
> >> patches could directly cause the new behaviour you're seeing.
> >>
> >> Don't know .. however .. I previously tested:
> >>         - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()",
> >> and that circumvented the problem reliably without causing anything else
> >>         - reverting the calculation of "max_slots_needed + 1"  in
> >> "net_rx_action()" to what it was before :
> >>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
> >>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
> >>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> >>
> 
> > So, it may be that the worst-case estimate is now too bad. In the case
> where it's failing for you it would be nice to know what the estimate was.
> Looking at netfront, we could be in trouble if it ever goes above 64.
> 
> It probably isn't .. from what I've previously seen .. the max was around 13 if
> I recall correctly, but I could put a check on that.
> And since I don't know *why* it fails .. it's hard to put a warn on it.
> 

Well, if you stick a printk of rx_last_skb_slots in rx_work_todo() in the case where there are not enough slots available, then we may be able to see how many slots it's waiting for.

  Paul
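
A sketch of that instrumentation, assuming the 3.14 shape of rx_work_todo() (the net_ratelimit() guard matches the "callbacks suppressed" lines in the logs later in the thread; this is a reconstruction, not an actual diff from the thread):

	static bool rx_work_todo(struct xenvif *vif)
	{
		if (skb_queue_empty(&vif->rx_queue))
			return false;

		/* Not enough free ring slots for the stalled skb yet. */
		if (!xenvif_rx_ring_slots_available(vif, vif->rx_last_skb_slots)) {
			if (net_ratelimit())
				netdev_err(vif->dev,
					   "rx_work_todo waiting for rx_last_skb_slots: %d\n",
					   vif->rx_last_skb_slots);
			return false;
		}

		return true;
	}
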

> >   Paul
> 
> >> So that leads me to think it's something caused by this patch set.
> >> Patch1 could be a candidate .. perhaps that check was needed for some
> >> reason .. will see what not applying that one does
> >>
> >> --
> >> Sander
> >>
> >>
> >> >   Paul
> >>
> >> >> So it now seems I have the situation which you described in the
> >> commit
> >> >> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
> >> >> "Without this patch I can trivially stall netback permanently by just
> doing a
> >> >> large guest to guest file copy between two Windows Server 2008R2
> VMs
> >> on a
> >> >> single host."
> >> >>
> >> >> --
> >> >> Sander
> >>
> >>
> 
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 14:09         ` Paul Durrant
  2014-03-27 14:29           ` Sander Eikelenboom
  2014-03-27 14:29           ` Sander Eikelenboom
@ 2014-03-27 16:46           ` Sander Eikelenboom
  2014-03-27 16:54             ` Paul Durrant
  2014-03-27 16:54             ` Paul Durrant
  2014-03-27 16:46           ` Sander Eikelenboom
  3 siblings, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 16:46 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu


Thursday, March 27, 2014, 3:09:32 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 27 March 2014 14:03
>> To: Paul Durrant
>> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
>> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
>> statement
>> 
>> 
>> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 27 March 2014 13:46
>> >> To: Paul Durrant
>> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
>> Liu
>> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
>> from if
>> >> statement
>> >>
>> >>
>> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
>> >>
>> >> > This patch removes a test in start_new_rx_buffer() that checks whether
>> >> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
>> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of
>> >> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE or
>> less.
>> >>
>> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
>> >> > Cc: Wei Liu <wei.liu2@citrix.com>
>> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
>> >> > ---
>> >>
>> >> > v2:
>> >> >  - Add BUG_ON() as suggested by Ian Campbell
>> >>
>> >> >  drivers/net/xen-netback/netback.c |    4 ++--
>> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >>
>> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
>> >> netback/netback.c
>> >> > index 438d0c0..72314c7 100644
>> >> > --- a/drivers/net/xen-netback/netback.c
>> >> > +++ b/drivers/net/xen-netback/netback.c
>> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
>> >> unsigned long size, int head)
>> >> >          * into multiple copies tend to give large frags their
>> >> >          * own buffers as before.
>> >> >          */
>> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
>> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
>> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
>> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
>> >> >                 return true;
>> >> >
>> >> >         return false;
>> >>
>> >> Hi Paul,
>> >>
>> >> Unfortunately .. no good ..
>> >>
>> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
>> >> until i do my test case .. it still chokes and now effectively permanently
>> stalls
>> >> network traffic to that guest.
>> >>
>> >> No error messages or anything in either xl dmesg or dmesg on the host ..
>> and
>> >> nothing in dmesg in the guest either.
>> >>
>> >> But in the guest the TX bytes ifconfig reports for eth0 still increase but RX
>> >> bytes do nothing, so it seems only the RX path is affected)
>> >>
>> 
>> > But you're not getting ring overflow, right? So that suggests this series is
>> working and you're now hitting another problem? I don't see how these
>> patches could directly cause the new behaviour you're seeing.
>> 
>> Don't know .. however .. I previously tested:
>>         - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()",
>> and that circumvented the problem reliably without causing anything else
>>         - reverting the calculation of "max_slots_needed + 1"  in
>> "net_rx_action()" to what it was before :
>>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>> 

> So, it may be that the worst-case estimate is now too bad. In the case where it's failing for you it would be nice to know what the estimate was. Looking at netfront, we could be in trouble if it ever goes above 64.

With your patches + some extra printk's

Ok you seem to be right ..

[  967.957014] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  968.164711] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  968.310899] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  968.674412] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  968.928398] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  969.105993] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  969.434961] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  969.719368] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  969.729606] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  970.195451] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  970.493106] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  970.581056] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  970.594934] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  970.754355] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  970.991755] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26
[  976.978261] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 26 frag:8 size:1460 offset:15478
[  976.980183] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 31 frag:9 size:313 offset:17398
[  976.982154] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 32
[  976.984078] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 27 frag:3 size:1460 offset:25846
[  976.986466] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 35 frag:4 size:1460 offset:27766
[  976.988540] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 43 frag:5 size:1460 offset:29686
[  976.990809] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 44 frag:6 size:1460 offset:118
[  976.993038] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 45 frag:7 size:1460 offset:2038
[  976.994966] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 47 frag:8 size:1460 offset:3958
[  976.996987] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 49 frag:9 size:1460 offset:5878
[  976.998947] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots needed: 52
[  982.624852] net_ratelimit: 97 callbacks suppressed
[  982.627279] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  982.637833] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  983.432482] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  983.640017] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  983.642974] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  983.656408] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  983.779142] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  984.644546] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  984.657728] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  985.459147] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  987.668407] net_ratelimit: 8 callbacks suppressed
[  987.671661] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  987.678483] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  988.671510] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  988.681210] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  989.472372] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  989.685166] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  989.700220] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  990.058987] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  990.192480] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  990.687626] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for rx_last_skb_slots: 52
[  992.707878] net_ratelimit: 5 callbacks suppressed

But it already gets stuck at 52, well before reaching 64.

And this worst case is a *lot* larger (and probably needless .. since I previously could do with only one slot extra).
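
Assuming the v2 estimate adds DIV_ROUND_UP(offset + size, PAGE_SIZE) per frag (which matches the increments in the log above), the inflation is easy to see on frag 5, for example:

	size 1460, offset 29686:
	DIV_ROUND_UP(29686 + 1460, 4096) = DIV_ROUND_UP(31146, 4096) = 8 slots

Eight slots charged for a 1460-byte frag that can occupy at most two once it has been copied into ring buffers.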




>   Paul

>> So that leads me to think it's something caused by this patch set.
>> Patch1 could be a candidate .. perhaps that check was needed for some
>> reason .. will see what not applying that one does
>> 
>> --
>> Sander
>> 
>> 
>> >   Paul
>> 
>> >> So it now seems I have the situation which you described in the
>> commit
>> >> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
>> >> "Without this patch I can trivially stall netback permanently by just doing a
>> >> large guest to guest file copy between two Windows Server 2008R2 VMs
>> on a
>> >> single host."
>> >>
>> >> --
>> >> Sander
>> 
>> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 16:46           ` Sander Eikelenboom
@ 2014-03-27 16:54             ` Paul Durrant
  2014-03-27 17:15               ` Sander Eikelenboom
  2014-03-27 17:15               ` Sander Eikelenboom
  2014-03-27 16:54             ` Paul Durrant
  1 sibling, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 16:54 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 16:46
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
> statement
> 
> 
> Thursday, March 27, 2014, 3:09:32 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 27 March 2014 14:03
> >> To: Paul Durrant
> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
> Liu
> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> from if
> >> statement
> >>
> >>
> >> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
> >>
> >> >> -----Original Message-----
> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> Sent: 27 March 2014 13:46
> >> >> To: Paul Durrant
> >> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell;
> Wei
> >> Liu
> >> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> >> from if
> >> >> statement
> >> >>
> >> >>
> >> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
> >> >>
> >> >> > This patch removes a test in start_new_rx_buffer() that checks
> whether
> >> >> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
> >> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller
> of
> >> >> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE
> or
> >> less.
> >> >>
> >> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
> >> >> > Cc: Wei Liu <wei.liu2@citrix.com>
> >> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
> >> >> > ---
> >> >>
> >> >> > v2:
> >> >> >  - Add BUG_ON() as suggested by Ian Campbell
> >> >>
> >> >> >  drivers/net/xen-netback/netback.c |    4 ++--
> >> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >> >>
> >> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> >> >> netback/netback.c
> >> >> > index 438d0c0..72314c7 100644
> >> >> > --- a/drivers/net/xen-netback/netback.c
> >> >> > +++ b/drivers/net/xen-netback/netback.c
> >> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
> >> >> unsigned long size, int head)
> >> >> >          * into multiple copies tend to give large frags their
> >> >> >          * own buffers as before.
> >> >> >          */
> >> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> >> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> >> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
> >> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
> >> >> >                 return true;
> >> >> >
> >> >> >         return false;
> >> >>
> >> >> Hi Paul,
> >> >>
> >> >> Unfortunately .. no good ..
> >> >>
> >> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
> >> >> until i do my test case .. it still chokes and now effectively permanently
> >> stalls
> >> >> network traffic to that guest.
> >> >>
> >> >> No error messages or anything in either xl dmesg or dmesg on the host
> ..
> >> and
> >> >> nothing in dmesg in the guest either.
> >> >>
> >> >> But in the guest the TX bytes ifconfig reports for eth0 still increase but
> RX
> >> >> bytes do nothing, so it seems only the RX path is affected)
> >> >>
> >>
> >> > But you're not getting ring overflow, right? So that suggests this series is
> >> working and you're now hitting another problem? I don't see how these
> >> patches could directly cause the new behaviour you're seeing.
> >>
> >> Don't know .. however .. I previously tested:
> >>         - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()",
> >> and that circumvented the problem reliably without causing anything else
> >>         - reverting the calculation of "max_slots_needed + 1"  in
> >> "net_rx_action()" to what it was before :
> >>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
> >>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
> >>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> >>
> 
> > So, it may be that the worst-case estimate is now too bad. In the case
> where it's failing for you it would be nice to know what the estimate was.
> Looking at netfront, we could be in trouble if it ever goes above 64.
> 
> With your patches + some extra printk's
> 
> Ok you seem to be right ..
> 
> [  967.957014] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  968.164711] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  968.310899] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  968.674412] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  968.928398] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  969.105993] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  969.434961] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  969.719368] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  969.729606] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  970.195451] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  970.493106] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  970.581056] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  970.594934] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  970.754355] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  970.991755] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26
> [  976.978261] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 26 frag:8 size:1460 offset:15478
> [  976.980183] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 31 frag:9 size:313 offset:17398
> [  976.982154] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 32
> [  976.984078] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 27 frag:3 size:1460 offset:25846
> [  976.986466] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 35 frag:4 size:1460 offset:27766
> [  976.988540] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 43 frag:5 size:1460 offset:29686
> [  976.990809] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 44 frag:6 size:1460 offset:118
> [  976.993038] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 45 frag:7 size:1460 offset:2038
> [  976.994966] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 47 frag:8 size:1460 offset:3958
> [  976.996987] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 49 frag:9 size:1460 offset:5878
> [  976.998947] vif vif-7-0 vif7.0: ?!?!? xenvif_rx_action Insane numer of slots
> needed: 52
> [  982.624852] net_ratelimit: 97 callbacks suppressed
> [  982.627279] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  982.637833] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  983.432482] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  983.640017] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  983.642974] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  983.656408] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  983.779142] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  984.644546] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  984.657728] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  985.459147] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  987.668407] net_ratelimit: 8 callbacks suppressed
> [  987.671661] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  987.678483] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  988.671510] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  988.681210] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  989.472372] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  989.685166] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  989.700220] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  990.058987] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  990.192480] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  990.687626] vif vif-7-0 vif7.0: ?!?!? rx_work_todo waiting for
> rx_last_skb_slots: 52
> [  992.707878] net_ratelimit: 5 callbacks suppressed
> 
> But it already gets stuck at 52, well before reaching 64.
> 
> And this worst case is a *lot* larger (and probably needless .. since I
> previously could do with only one slot extra).
> 

Ok, so we cannot be too pessimistic. In that case I don't see there's a lot of choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume start_new_rx_buffer() returns true every time) and just add the extra 1.

  Paul
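
In code, that proposal would look roughly like this (a sketch assuming only the per-frag term and a single slot of trailing slack change; not a tested patch):

	max_slots_needed = DIV_ROUND_UP(offset_in_page(skb->data) +
					skb_headlen(skb), PAGE_SIZE);
	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		unsigned int size = skb_frag_size(&skb_shinfo(skb)->frags[i]);

		/* Count pages by size only; don't charge for worst-case
		 * start offsets. */
		max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
	}
	if (skb_is_gso(skb))
		max_slots_needed++;

	/* The "extra 1": slack for one boundary crossing where
	 * start_new_rx_buffer() opens a fresh buffer mid-copy. */
	max_slots_needed++;
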

> 
> 
> 
> >   Paul
> 
> >> So that leads me to think it's something caused by this patch set.
> >> Patch1 could be a candidate .. perhaps that check was needed for some
> >> reason .. will see what not applying that one does
> >>
> >> --
> >> Sander
> >>
> >>
> >> >   Paul
> >>
> >> >> So it now seems i now have the situation which you described in the
> >> commit
> >> >> message from "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c",
> >> >> "Without this patch I can trivially stall netback permanently by just
> doing a
> >> >> large guest to guest file copy between two Windows Server 2008R2
> VMs
> >> on a
> >> >> single host."
> >> >>
> >> >> --
> >> >> Sander
> >>
> >>
> 
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 16:54             ` Paul Durrant
@ 2014-03-27 17:15               ` Sander Eikelenboom
  2014-03-27 17:26                 ` Paul Durrant
  2014-03-27 17:26                 ` Paul Durrant
  2014-03-27 17:15               ` Sander Eikelenboom
  1 sibling, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 17:15 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu


Thursday, March 27, 2014, 5:54:05 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 27 March 2014 16:46
>> To: Paul Durrant
>> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
>> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
>> statement
>> 
>> 
>> Thursday, March 27, 2014, 3:09:32 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 27 March 2014 14:03
>> >> To: Paul Durrant
>> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
>> Liu
>> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
>> from if
>> >> statement
>> >>
>> >>
>> >> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> >> Sent: 27 March 2014 13:46
>> >> >> To: Paul Durrant
>> >> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell;
>> Wei
>> >> Liu
>> >> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
>> >> from if
>> >> >> statement
>> >> >>
>> >> >>
>> >> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
>> >> >>
>> >> >> > This patch removes a test in start_new_rx_buffer() that checks
>> whether
>> >> >> > a copy operation is less than MAX_BUFFER_OFFSET in length, since
>> >> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller
>> of
>> >> >> > start_new_rx_buffer() already limits copy operations to PAGE_SIZE
>> or
>> >> less.
>> >> >>
>> >> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
>> >> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
>> >> >> > Cc: Wei Liu <wei.liu2@citrix.com>
>> >> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
>> >> >> > ---
>> >> >>
>> >> >> > v2:
>> >> >> >  - Add BUG_ON() as suggested by Ian Campbell
>> >> >>
>> >> >> >  drivers/net/xen-netback/netback.c |    4 ++--
>> >> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >> >>
>> >> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
>> >> >> netback/netback.c
>> >> >> > index 438d0c0..72314c7 100644
>> >> >> > --- a/drivers/net/xen-netback/netback.c
>> >> >> > +++ b/drivers/net/xen-netback/netback.c
>> >> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
>> >> >> unsigned long size, int head)
>> >> >> >          * into multiple copies tend to give large frags their
>> >> >> >          * own buffers as before.
>> >> >> >          */
>> >> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
>> >> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
>> >> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
>> >> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset && !head)
>> >> >> >                 return true;
>> >> >> >
>> >> >> >         return false;
>> >> >>
>> >> >> Hi Paul,
>> >> >>
>> >> >> Unfortunately .. no good ..
>> >> >>
>> >> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
>> >> >> until i do my test case .. it still chokes and now effectively permanently
>> >> stalls
>> >> >> network traffic to that guest.
>> >> >>
>> >> >> No error messages or anything in either xl dmesg or dmesg on the host
>> ..
>> >> and
>> >> >> nothing in dmesg in the guest either.
>> >> >>
>> >> >> But in the guest the TX bytes ifconfig reports for eth0 still increase but
>> RX
>> >> >> bytes does nothing, so it seems only the RX path is effected)
>> >> >>
>> >>
>> >> > But you're not getting ring overflow, right? So that suggests this series is
>> >> working and you're now hitting another problem? I don't see how these
>> >> patches could directly cause the new behaviour you're seeing.
>> >>
>> >> Don't know  .. how ever .. i previously tested:
>> >>         - unconditionally doing "max_slots_needed + 1"  in "net_rx_action()",
>> >> and that circumvented the problem reliably without causing anything else
>> >>         - reverting the calculation of "max_slots_needed + 1"  in
>> >> "net_rx_action()" to what it was before :
>> >>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>> >>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>> >>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>> >>
>> 
>> > So, it may be that the worse-case estimate is now too bad. In the case
>> where it's failing for you it would be nice to know what the estimate was


> Ok, so we cannot be too pessimistic. In that case I don't see there's a lot of
> choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
> start_new_rx_buffer() returns true every time) and just add the extra 1.

Hrmm, i don't like a "magic" 1 bonus slot; there must be some theoretical backing.
And since the original problem always seemed to occur on a packet with a single large frag, i'm wondering
if this 1 would actually be correct in other cases.

Well this is what i said earlier on .. it's hard to estimate upfront if "start_new_rx_buffer()" will return true,
and how many times that is possible per frag .. and if that is possible for only 1 frag or for all frags.

The problem has now shifted from packets with 1 large frag (for which it didn't account properly, leading to a too small estimate) .. to packets
with a large number of (smaller) frags .. leading to a too large overestimation.

So would there be a theoretical maximum of how often that path could be hit, based on a combination of sizes (total size of all frags, nr_frags, size per frag) ?
- if you hit "start_new_rx_buffer()" == true  in the first frag .. could you hit it in a next frag ?
- could it be limited due to something like the packet_size / nr_frags / page_size ?

And what was wrong with the previous calculation ?
                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
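
For concreteness, with 4 KiB pages (so MAX_SKB_FRAGS = 65536/4096 + 2 = 18) that old formula gives, if i compute it right:

                 max  = DIV_ROUND_UP(1500, 4096);   /* = 1 for a 1500 byte MTU */
                 max += MAX_SKB_FRAGS + 1;          /* = 1 + 19 = 20 slots */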

That perhaps also misses some theoretical backing: what if it would have (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
fit in a slot? Or is the total size of frags a skb can carry limited to MAX_SKB_FRAGS * PAGE_SIZE ? .. then you would expect that MAX_SKB_FRAGS is an upper limit.
(and you could do the new check capped by MAX_SKB_FRAGS so it doesn't get to a too large, non-reachable estimate).

But as a side question .. is the whole "get_next_rx_buffer()" path only needed for when a frag cannot fit in a slot as a whole ?



--
Sander

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 17:15               ` Sander Eikelenboom
  2014-03-27 17:26                 ` Paul Durrant
@ 2014-03-27 17:26                 ` Paul Durrant
  2014-03-27 18:34                   ` Sander Eikelenboom
  2014-03-27 18:34                   ` Sander Eikelenboom
  1 sibling, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-27 17:26 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 17:15
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei Liu
> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if
> statement
> 
> 
> Thursday, March 27, 2014, 5:54:05 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 27 March 2014 16:46
> >> To: Paul Durrant
> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell; Wei
> Liu
> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> from if
> >> statement
> >>
> >>
> >> Thursday, March 27, 2014, 3:09:32 PM, you wrote:
> >>
> >> >> -----Original Message-----
> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> Sent: 27 March 2014 14:03
> >> >> To: Paul Durrant
> >> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian Campbell;
> Wei
> >> Liu
> >> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless clause
> >> from if
> >> >> statement
> >> >>
> >> >>
> >> >> Thursday, March 27, 2014, 2:54:46 PM, you wrote:
> >> >>
> >> >> >> -----Original Message-----
> >> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> >> Sent: 27 March 2014 13:46
> >> >> >> To: Paul Durrant
> >> >> >> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Ian
> Campbell;
> >> Wei
> >> >> Liu
> >> >> >> Subject: Re: [PATCH net v2 1/3] xen-netback: remove pointless
> clause
> >> >> from if
> >> >> >> statement
> >> >> >>
> >> >> >>
> >> >> >> Thursday, March 27, 2014, 1:56:11 PM, you wrote:
> >> >> >>
> >> >> >> > This patch removes a test in start_new_rx_buffer() that checks
> >> whether
> >> >> >> > a copy operation is less than MAX_BUFFER_OFFSET in length,
> since
> >> >> >> > MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only
> caller
> >> of
> >> >> >> > start_new_rx_buffer() already limits copy operations to
> PAGE_SIZE
> >> or
> >> >> less.
> >> >> >>
> >> >> >> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >> >> >> > Cc: Ian Campbell <ian.campbell@citrix.com>
> >> >> >> > Cc: Wei Liu <wei.liu2@citrix.com>
> >> >> >> > Cc: Sander Eikelenboom <linux@eikelenboom.it>
> >> >> >> > ---
> >> >> >>
> >> >> >> > v2:
> >> >> >> >  - Add BUG_ON() as suggested by Ian Campbell
> >> >> >>
> >> >> >> >  drivers/net/xen-netback/netback.c |    4 ++--
> >> >> >> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >> >> >>
> >> >> >> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> >> >> >> netback/netback.c
> >> >> >> > index 438d0c0..72314c7 100644
> >> >> >> > --- a/drivers/net/xen-netback/netback.c
> >> >> >> > +++ b/drivers/net/xen-netback/netback.c
> >> >> >> > @@ -192,8 +192,8 @@ static bool start_new_rx_buffer(int offset,
> >> >> >> unsigned long size, int head)
> >> >> >> >          * into multiple copies tend to give large frags their
> >> >> >> >          * own buffers as before.
> >> >> >> >          */
> >> >> >> > -       if ((offset + size > MAX_BUFFER_OFFSET) &&
> >> >> >> > -           (size <= MAX_BUFFER_OFFSET) && offset && !head)
> >> >> >> > +       BUG_ON(size > MAX_BUFFER_OFFSET);
> >> >> >> > +       if ((offset + size > MAX_BUFFER_OFFSET) && offset &&
> !head)
> >> >> >> >                 return true;
> >> >> >> >
> >> >> >> >         return false;
> >> >> >>
> >> >> >> Hi Paul,
> >> >> >>
> >> >> >> Unfortunately .. no good ..
> >> >> >>
> >> >> >> With these patches (v2) applied to 3.14-rc8 it all seems to work well,
> >> >> >> until i do my test case .. it still chokes and now effectively
> permanently
> >> >> stalls
> >> >> >> network traffic to that guest.
> >> >> >>
> >> >> >> No error messages or anything in either xl dmesg or dmesg on the
> host
> >> ..
> >> >> and
> >> >> >> nothing in dmesg in the guest either.
> >> >> >>
> >> >> >> But in the guest the TX bytes ifconfig reports for eth0 still increase
> but
> >> RX
> >> >> >> bytes does nothing, so it seems only the RX path is effected)
> >> >> >>
> >> >>
> >> >> > But you're not getting ring overflow, right? So that suggests this
> series is
> >> >> working and you're now hitting another problem? I don't see how
> these
> >> >> patches could directly cause the new behaviour you're seeing.
> >> >>
> >> >> Don't know  .. how ever .. i previously tested:
> >> >>         - unconditionally doing "max_slots_needed + 1"  in
> "net_rx_action()",
> >> >> and that circumvented the problem reliably without causing anything
> else
> >> >>         - reverting the calculation of "max_slots_needed + 1"  in
> >> >> "net_rx_action()" to what it was before :
> >> >>                 int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
> >> >>                 if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
> >> >>                         max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> >> >>
> >>
> >> > So, it may be that the worse-case estimate is now too bad. In the case
> >> where it's failing for you it would be nice to know what the estimate was
> 
> 
> > Ok, so we cannot be too pessimistic. In that case I don't see there's a lot of
> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
> > start_new_rx_buffer() returns true every time) and just add the extra 1.
> 
> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
> backing.

I don't like it either, but theory suggested each frag should take no more space than the original DIV_ROUND_UP and that proved to be wrong, but I cannot figure out why.

> And since the original problem always seemed to occur on a packet with a
> single large frag, i'm wondering
> if this 1 would actually be correct in other cases.

That's why I went for an extra 1 per frag... pessimal slot packing, i.e. a 2 byte frag may span 2 slots, a frag of PAGE_SIZE + 2 bytes may span 3, etc. etc.
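
I.e. the per-frag worst case I assumed was (just a sketch):

	/* A frag of 'size' bytes starting at a pessimal offset within a
	 * slot can touch one more slot than its size alone implies:
	 * size 2 at offset PAGE_SIZE - 1 spans 2 slots, and
	 * size PAGE_SIZE + 2 at that offset spans 3. */
	slots = DIV_ROUND_UP(size, PAGE_SIZE) + 1;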

> 
> Well this is what i said earlier on .. it's hard to estimate upfront if
> "start_new_rx_buffer()" will return true,
> and how many times that is possible per frag .. and if that is possible for only
> 1 frag or for all frags.
> 
> The problem is now replaced from packets with 1 large frag (for which it
> didn't account properly leading to a too small estimate) .. to packets
> with a large number of (smaller) frags .. leading to a too large over
> estimation.
> 
> So would there be a theoretical maximum how often that path could hit
> based on a combination of sizes (total size of all frags, nr_frags, size per frag)
> ?
> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you hit it
> in a next frag ?
> - could it be limited due to something like the packet_size / nr_frags /
> page_size ?
> 
> And what was wrong with the previous calculation ?
>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> 

This is not safe if frag size can be > PAGE_SIZE.

> That perhaps also misses some theoretical backing, what if it would have
> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
> fit in a slot. Or is the total size of frags a skb can carry limited to
> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
> MAX_SKB_FRAGS is a upper limit.
> (and you could do the new check maxed by MAX_SKB_FRAGS so it doesn't
> get to a too large non reachable estimate).
> 
> But as a side question .. the whole "get_next_rx_buffer()" path is needed
> for when a frag could not fit in a slot
> as a whole ?
> 

Perhaps it would be best to take the hit on copy_ops and just tightly pack, so we only start a new slot when the current one is completely full; then actual slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the extra if it's a GSO).
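
I.e. something like this (again, only a sketch):

	/* Tightly packed: no slot is left partially empty between copies,
	 * so the slot count depends only on the total length. */
	max_slots_needed = DIV_ROUND_UP(skb->len, PAGE_SIZE);
	if (skb_is_gso(skb))
		max_slots_needed++;	/* extra_info slot */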

  Paul

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 17:26                 ` Paul Durrant
@ 2014-03-27 18:34                   ` Sander Eikelenboom
  2014-03-27 19:22                     ` [Xen-devel] " Sander Eikelenboom
                                       ` (3 more replies)
  2014-03-27 18:34                   ` Sander Eikelenboom
  1 sibling, 4 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 18:34 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, netdev, Ian Campbell, Wei Liu

<big snip>

>> >>
>> >> > So, it may be that the worse-case estimate is now too bad. In the case
>> >> where it's failing for you it would be nice to know what the estimate was
>>
>>
>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a lot
>> > of
>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
>> > start_new_rx_buffer() returns true every time) and just add the extra 1.
>>
>> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
>> backing.

> I don't like it either, but theory suggested each frag should take no more
> space than the original DIV_ROUND_UP and that proved to be wrong, but I cannot
> figure out why.

>> And since the original problem always seemed to occur on a packet with a
>> single large frag, i'm wondering
>> if this 1 would actually be correct in other cases.

> That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2
> byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.

In what situation may a 2 byte frag span 2 slots ?

At least there must be a theoretical cap to the number of slots needed ..
- assuming an SKB can contain only 65535 bytes
- assuming a slot can take max PAGE_SIZE and frags are split into PAGE_SIZE pieces ..

- it could only max contain 16 PAGE_SIZE slots if nicely aligned ..
- double it .. and even at 32 we still wouldn't reach that 52 estimate, and i don't know the ring size
  but wasn't that 32 ? So if the ring gets fully drained we shouldn't stall there.
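
In numbers (assuming 4 KiB pages):

        DIV_ROUND_UP(65535, 4096) = 16 slots when everything is nicely aligned
        16 * 2                    = 32 slots with pessimal offsets everywhere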


>> Well this is what i said earlier on .. it's hard to estimate upfront if
>> "start_new_rx_buffer()" will return true,
>> and how many times that is possible per frag .. and if that is possible for
>> only
>> 1 frag or for all frags.
>>
>> The problem is now replaced from packets with 1 large frag (for which it
>> didn't account properly leading to a too small estimate) .. to packets
>> with a large number of (smaller) frags .. leading to a too large over
>> estimation.
>>
>> So would there be a theoretical maximum how often that path could hit
>> based on a combination of sizes (total size of all frags, nr_frags, size per
>> frag)
>> ?
>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you
>> hit it
>> in a next frag ?
>> - could it be limited due to something like the packet_size / nr_frags /
>> page_size ?
>>
>> And what was wrong with the previous calculation ?
>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>>

> This is not safe if frag size can be > PAGE_SIZE.

#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)

So if one of the frags is > PAGE_SIZE ..
wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because we are limited by the total packet size ?
(so we would spare a slot since we have a frag less .. but spend one more because we have a frag that needs 2 slots ?)

(and that this should even be pessimistic since we didn't subtract the header etc from the max total packet size ?)
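
For reference, with 4 KiB pages that define works out to:

        MAX_SKB_FRAGS = 65536/4096 + 2 = 18

        e.g. one 8192 byte frag takes 2 slots, but it also carries the
        payload of two page-sized frags, so the slot total is unchanged.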


So from what i said earlier, you could probably do the pessimistic estimate (that would help when packets have a small skb->data_len
(space occupied by frags)) so the estimate would be less than the old one based on MAX_SKB_FRAGS, causing the packet to be processed earlier.
And CAP it using the old way, since a packet should never be able to use more slots than that theoretical max_slots (which hopefully is less than
the ring size, so a packet can always be processed if the ring is finally emptied).


>> That perhaps also misses some theoretical backing, what if it would have
>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
>> fit in a slot. Or is the total size of frags a skb can carry limited to
>> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
>> MAX_SKB_FRAGS is a upper limit.
>> (and you could do the new check maxed by MAX_SKB_FRAGS so it doesn't
>> get to a too large non reachable estimate).
>>
>> But as a side question .. the whole "get_next_rx_buffer()" path is needed
>> for when a frag could not fit in a slot
>> as a whole ?


> Perhaps it would be best to take the hit on copy_ops and just tightly pack, so
> we only start a new slot when the current one is completely full; then actual
> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the extra if
> it's a GSO).

Don't know if, and how much of, a performance penalty that would be.

>  Paul

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 18:34                   ` Sander Eikelenboom
@ 2014-03-27 19:22                     ` Sander Eikelenboom
  2014-03-28  9:30                       ` Paul Durrant
  2014-03-28  9:30                       ` [Xen-devel] " Paul Durrant
  2014-03-27 19:22                     ` Sander Eikelenboom
                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 19:22 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel


Thursday, March 27, 2014, 7:34:54 PM, you wrote:

> <big snip>

>>> >>
>>> >> > So, it may be that the worse-case estimate is now too bad. In the case
>>> >> where it's failing for you it would be nice to know what the estimate was
>>>
>>>
>>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a lot
>>> > of
>>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
>>> > start_new_rx_buffer() returns true every time) and just add the extra 1.
>>>
>>> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
>>> backing.

> I don't like it either, but theory suggested each frag should take no more 
> space than the original DIV_ROUND_UP and that proved to be wrong, but I cannot 
> figure out why.

>>> And since the original problem always seemed to occur on a packet with a
>>> single large frag, i'm wondering
>>> if this 1 would actually be correct in other cases.

> That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2 
> byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.

> In what situation my a 2 byte frag span 2 slots ?

> At least there must be a theoretical cap to the number of slots needed ..
> - assuming and SKB can contain only 65535 bytes
> - assuming a slot can take max PAGE_SIZE and frags are slit into PAGE_SIZE pieces ..

> - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
> - double it ..  and at 30 we wouldn't still be near that 52 estimate and i don't know the ring size
>   but wasn't that 32 ? So if the ring get's fully drained we shouldn't stall there.


>>> Well this is what i said earlier on .. it's hard to estimate upfront if
>>> "start_new_rx_buffer()" will return true,
>>> and how many times that is possible per frag .. and if that is possible for
>>> only
>>> 1 frag or for all frags.
>>>
>>> The problem is now replaced from packets with 1 large frag (for which it
>>> didn't account properly leading to a too small estimate) .. to packets
>>> with a large number of (smaller) frags .. leading to a too large over
>>> estimation.
>>>
>>> So would there be a theoretical maximum how often that path could hit
>>> based on a combination of sizes (total size of all frags, nr_frags, size per
>>> frag)
>>> ?
>>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you
>>> hit it
>>> in a next frag ?
>>> - could it be limited due to something like the packet_size / nr_frags /
>>> page_size ?
>>>
>>> And what was wrong with the previous calculation ?
>>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>>>

>> This is not safe if frag size can be > PAGE_SIZE.

> #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)

> So if one of the frags is > PAGE_SIZE ..
> wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because we are limited by the total packet size ?
> (so we would spare a slot since we have a frag less .. but spend one more because we have a frag that needs 2 slots ?)

> (and that this should even be pessimistic since we didn't substract the header etc from the max total packet size ?)


> So from what i said early, you could probably do the pessimistic estimate (that would help when packets have a small skb->data_len
> (space occupied by frags)) so the estimate would be less then the old one based on MAX_SKB_FRAGS causing the packet to be processed earlier.
> And CAP it using the old way since a packet should never be able to use more slots than that theoretical max_slots (which hopefully is less than
> the ring size, so a packet can always be processed if the ring is finally emptied.


>>> That perhaps also misses some theoretical backing, what if it would have
>>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
>>> fit in a slot. Or is the total size of frags a skb can carry limited to
>>> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
>>> MAX_SKB_FRAGS is a upper limit.
>>> (and you could do the new check maxed by MAX_SKB_FRAGS so it doesn't
>>> get to a too large non reachable estimate).
>>>
>>> But as a side question .. the whole "get_next_rx_buffer()" path is needed
>>> for when a frag could not fit in a slot
>>> as a whole ?


>> Perhaps it would be best to take the hit on copy_ops and just tightly pack, so
>> we only start a new slot when the current one is completely full; then actual
>> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the extra if
>> it's a GSO).

> Don't know if and how much a performance penalty that would be.

>>  Paul

Hmm, since i have now started to dig around a bit more ..

The ring size seems to be determined by netfront and not by netback ?
Couldn't this lead to problems when PAGE_SIZE dom0 != PAGE_SIZE domU (and potentially lead to an overrun and therefore problems on the HOST) ?

And about the commit message from ca2f09f2b2c6c25047cfc545d057c4edfcfe561c ...
Do i understand it correctly that you saw the original problem (stall on large file copy) only on a "Windows Server 2008R2", probably with PV drivers ?

I don't see why the original calculation wouldn't work, so what kind of packets (nr_frags, frag size and PAGE_SIZE) caused it ?

And could you retest if that "Windows Server 2008R2" works with a netback with your latest patch series (pessimistic estimate) plus a cap on max_slots_needed like:

if (max_slots_needed > MAX_SKB_FRAGS + 1)
        max_slots_needed = MAX_SKB_FRAGS + 1;
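
or equivalently, if i have the kernel idiom right:

        max_slots_needed = min_t(unsigned int, max_slots_needed,
                                 MAX_SKB_FRAGS + 1);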

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 18:34                   ` Sander Eikelenboom
                                       ` (2 preceding siblings ...)
  2014-03-28  0:55                     ` Sander Eikelenboom
@ 2014-03-28  0:55                     ` Sander Eikelenboom
  2014-03-28  9:36                       ` Paul Durrant
  2014-03-28  9:36                       ` [Xen-devel] " Paul Durrant
  3 siblings, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28  0:55 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel


Thursday, March 27, 2014, 7:34:54 PM, you wrote:

> <big snip>

>>> >>
>>> >> > So, it may be that the worse-case estimate is now too bad. In the case
>>> >> where it's failing for you it would be nice to know what the estimate was
>>>
>>>
>>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a lot
>>> > of
>>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
>>> > start_new_rx_buffer() returns true every time) and just add the extra 1.
>>>
>>> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
>>> backing.

> I don't like it either, but theory suggested each frag should take no more 
> space than the original DIV_ROUND_UP and that proved to be wrong, but I cannot 
> figure out why.

>>> And since the original problem always seemed to occur on a packet with a
>>> single large frag, i'm wondering
>>> if this 1 would actually be correct in other cases.

> That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2 
> byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.

> In what situation may a 2 byte frag span 2 slots ?

> At least there must be a theoretical cap to the number of slots needed ..
> - assuming an SKB can contain only 65535 bytes
> - assuming a slot can take max PAGE_SIZE and frags are split into PAGE_SIZE pieces ..

> - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
> - double it ..  and at 30 we still wouldn't be near that 52 estimate, and I don't know the ring size
>   but wasn't that 32 ? So if the ring gets fully drained we shouldn't stall there.


>>> Well this is what i said earlier on .. it's hard to estimate upfront if
>>> "start_new_rx_buffer()" will return true,
>>> and how many times that is possible per frag .. and if that is possible for
>>> only
>>> 1 frag or for all frags.
>>>
>>> The problem is now replaced from packets with 1 large frag (for which it
>>> didn't account properly leading to a too small estimate) .. to packets
>>> with a large number of (smaller) frags .. leading to a too large over
>>> estimation.
>>>
>>> So would there be a theoretical maximum how often that path could hit
>>> based on a combination of sizes (total size of all frags, nr_frags, size per
>>> frag)
>>> ?
>>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you
>>> hit it
>>> in a next frag ?
>>> - could it be limited due to something like the packet_size / nr_frags /
>>> page_size ?
>>>
>>> And what was wrong with the previous calculation ?
>>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>>>

>> This is not safe if frag size can be > PAGE_SIZE.

> #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)

> So if one of the frags is > PAGE_SIZE ..
> wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because we are limited by the total packet size ?
> (so we would spare a slot since we have a frag less .. but spend one more because we have a frag that needs 2 slots ?)

> (and that this should even be pessimistic since we didn't subtract the header etc from the max total packet size ?)


> So from what I said earlier, you could probably do the pessimistic estimate (that would help when packets have a small skb->data_len
> (space occupied by frags)), so the estimate would be less than the old one based on MAX_SKB_FRAGS, causing the packet to be processed earlier.
> And cap it using the old way, since a packet should never be able to use more slots than that theoretical max_slots (which hopefully is less than
> the ring size, so a packet can always be processed if the ring is finally emptied).

OK .. annotated "xenvif_gop_frag_copy()" to print what it did when we end up with (vif->rx.req_cons - req_cons_start) > estimatedcost for that frag,
where estimatedcost = DIV_ROUND_UP(size, PAGE_SIZE).

So the calculation indeed didn't take the offset used into account.

vif vif-7-0 vif7.0: ?!?!? xenvif_gop_frag_copy: frag costed more than est. 3>2 | start i:0 size:7120 offset:1424 estimatedcost: 2
begin i:0 size:7120 offset:1424 bytes:308159856 head:1282276652
 d2 d4 d5
begin i:1 size:4448 offset:0 bytes:2672 head:1282276652
 d2 d4 d5
begin i:2 size:352 offset:0 bytes:4096 head:1282276652
 d1 d2 d5
end i:3 size:0 offset:352

In the first round we only process 2672 bytes (instead of the full 4096 that could fit in a slot),
which raises the question whether it's actually necessary to use the same offset in the slots as in the frags ?

And this printk hits quite often for me ..
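
A userspace toy model of the chunking inferred from the trace above (an assumption about the copy loop, not the netback code itself) reproduces the 2672/4096/352 split:

#include <stdio.h>

#define SLOT_SIZE 4096UL /* protocol slot size */

int main(void)
{
	unsigned long size = 7120, offset = 1424; /* from the trace */
	int slots = 0;

	while (size > 0) {
		/* never cross a 4k boundary in the source frag */
		unsigned long bytes = SLOT_SIZE - (offset % SLOT_SIZE);

		if (bytes > size)
			bytes = size;
		printf("slot %d: %lu bytes\n", slots, bytes);
		offset += bytes;
		size -= bytes;
		slots++;
	}
	printf("total: %d slots (DIV_ROUND_UP estimated %lu)\n",
	       slots, (7120 + SLOT_SIZE - 1) / SLOT_SIZE);
	return 0;
}

With size = 2 and offset = 4095 the same loop yields two slots, which would also answer the earlier question of how a 2 byte frag can span 2 slots.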

>>> That perhaps also misses some theoretical backing: what if it would have
>>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
>>> fit in a slot? Or is the total size of frags a skb can carry limited to
>>> MAX_SKB_FRAGS * PAGE_SIZE ? .. then you would expect that
>>> MAX_SKB_FRAGS is an upper limit.
>>> (and you could do the new check maxed by MAX_SKB_FRAGS so it doesn't
>>> get to a too large, non-reachable estimate).
>>>
>>> But as a side question .. is the whole "get_next_rx_buffer()" path needed
>>> for when a frag cannot fit in a slot
>>> as a whole ?


>> Perhaps it would be best to take the hit on copy_ops and just tightly pack, so
>> we only start a new slot when the current one is completely full; then actual
>> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the extra if
>> it's a GSO).

> Don't know if, and how much of, a performance penalty that would be.

>>  Paul

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-27 19:22                     ` [Xen-devel] " Sander Eikelenboom
  2014-03-28  9:30                       ` Paul Durrant
@ 2014-03-28  9:30                       ` Paul Durrant
  2014-03-28  9:39                         ` Sander Eikelenboom
  2014-03-28  9:39                         ` Sander Eikelenboom
  1 sibling, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-28  9:30 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 27 March 2014 19:23
> To: Paul Durrant
> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> 
> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
> 
> > <big snip>
> 
> >>> >>
> >>> >> > So, it may be that the worse-case estimate is now too bad. In the
> case
> >>> >> where it's failing for you it would be nice to know what the estimate
> was
> >>>
> >>>
> >>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a
> lot
> >>> > of
> >>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
> >>> > start_new_rx_buffer() returns true every time) and just add the extra
> 1.
> >>>
> >>> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
> >>> backing.
> 
> > I don't like it either, but theory suggested each frag should take no more
> > space than the original DIV_ROUND_UP and that proved to be wrong, but I
> cannot
> > figure out why.
> 
> >>> And since the original problem always seemed to occur on a packet with
> a
> >>> single large frag, i'm wondering
> >>> if this 1 would actually be correct in other cases.
> 
> > That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2
> > byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.
> 
> > In what situation my a 2 byte frag span 2 slots ?
> 
> > At least there must be a theoretical cap to the number of slots needed ..
> > - assuming and SKB can contain only 65535 bytes
> > - assuming a slot can take max PAGE_SIZE and frags are slit into PAGE_SIZE
> pieces ..
> 
> > - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
> > - double it ..  and at 30 we wouldn't still be near that 52 estimate and i don't
> know the ring size
> >   but wasn't that 32 ? So if the ring get's fully drained we shouldn't stall
> there.
> 
> 
> >>> Well this is what i said earlier on .. it's hard to estimate upfront if
> >>> "start_new_rx_buffer()" will return true,
> >>> and how many times that is possible per frag .. and if that is possible for
> >>> only
> >>> 1 frag or for all frags.
> >>>
> >>> The problem is now replaced from packets with 1 large frag (for which it
> >>> didn't account properly leading to a too small estimate) .. to packets
> >>> with a large number of (smaller) frags .. leading to a too large over
> >>> estimation.
> >>>
> >>> So would there be a theoretical maximum how often that path could hit
> >>> based on a combination of sizes (total size of all frags, nr_frags, size per
> >>> frag)
> >>> ?
> >>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you
> >>> hit it
> >>> in a next frag ?
> >>> - could it be limited due to something like the packet_size / nr_frags /
> >>> page_size ?
> >>>
> >>> And what was wrong with the previous calculation ?
> >>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
> >>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
> >>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> >>>
> 
> >> This is not safe if frag size can be > PAGE_SIZE.
> 
> > #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
> 
> > So if one of the frags is > PAGE_SIZE ..
> > wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because we
> are limited by the total packet size ?
> > (so we would spare a slot since we have a frag less .. but spend one more
> because we have a frag that needs 2 slots ?)
> 
> > (and that this should even be pessimistic since we didn't substract the
> header etc from the max total packet size ?)
> 
> 
> > So from what i said early, you could probably do the pessimistic estimate
> (that would help when packets have a small skb->data_len
> > (space occupied by frags)) so the estimate would be less then the old one
> based on MAX_SKB_FRAGS causing the packet to be processed earlier.
> > And CAP it using the old way since a packet should never be able to use
> more slots than that theoretical max_slots (which hopefully is less than
> > the ring size, so a packet can always be processed if the ring is finally
> emptied.
> 
> 
> >>> That perhaps also misses some theoretical backing, what if it would have
> >>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
> >>> fit in a slot. Or is the total size of frags a skb can carry limited to
> >>> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
> >>> MAX_SKB_FRAGS is a upper limit.
> >>> (and you could do the new check maxed by MAX_SKB_FRAGS so it
> doesn't
> >>> get to a too large non reachable estimate).
> >>>
> >>> But as a side question .. the whole "get_next_rx_buffer()" path is
> needed
> >>> for when a frag could not fit in a slot
> >>> as a whole ?
> 
> 
> >> Perhaps it would be best to take the hit on copy_ops and just tightly pack,
> so
> >> we only start a new slot when the current one is completely full; then
> actual
> >> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the
> extra if
> >> it's a GSO).
> 
> > Don't know if and how much a performance penalty that would be.
> 
> >>  Paul
> 
> Hmm since i now started to dig around a bit more ..
> 
> The ring size seems to be determined by netfront and not by netback ?
> Couldn't this lead to problems when PAGE_SIZE dom0 != PAGE_SIZE domU
> (and potentially lead to a overrun and therefore problems on the HOST) ?
> 
> And about the commit message from
> ca2f09f2b2c6c25047cfc545d057c4edfcfe561c ...
> Do i understand it correctly that you saw the original problem (stall on large
> file copy) only on a "Windows Server 2008R2", probably with PV drivers ?
> 

Yes, with PV drivers as you say.

> I don't see why the original calculation wouldn't work, so what kind of
> packets (nr_frags, frag size and PAGE_SIZE ) caused it ?
> 
> And could you retest if that "Windows Server 2008R2" works with a netback
> with you latest patch series (pessimistic estimate) plus a cap on
> max_slots_needed like:
> 
> if(max_slots_needed > MAX_SKB_FRAGS + 1){
>         max_slots_needed = MAX_SKB_FRAGS + 1;
> }

The behaviour of the Windows frontend is different to netfront; it tries to keep the shared ring as full as possible, so the estimate could be as pessimistic as you like (as long as it doesn't exceed the ring size ;-)) and you'd never see the lock-up. For some reason (best known to the originator of the code, I suspect) the Linux netfront driver limits the number of requests it posts into the shared ring, leading to the possibility of lock-up in the case where the backend needs more slots than the frontend 'thinks' it should.
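
The lock-up condition being described can be stated compactly (a sketch with illustrative names, not the actual variables used by netback or netfront):

#include <stdbool.h>

/* Backend waits until the frontend has posted max_slots_needed
 * requests; if the frontend caps what it posts below that, neither
 * side makes progress. */
static bool rx_stalled(unsigned int posted_reqs,
		       unsigned int max_slots_needed,
		       unsigned int frontend_post_limit)
{
	return max_slots_needed > posted_reqs &&
	       posted_reqs >= frontend_post_limit;
}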

  Paul

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28  0:55                     ` [Xen-devel] " Sander Eikelenboom
  2014-03-28  9:36                       ` Paul Durrant
@ 2014-03-28  9:36                       ` Paul Durrant
  2014-03-28  9:46                         ` Sander Eikelenboom
  2014-03-28  9:46                         ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-28  9:36 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 28 March 2014 00:55
> To: Paul Durrant
> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> 
> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
> 
> > <big snip>
> 
> >>> >>
> >>> >> > So, it may be that the worse-case estimate is now too bad. In the
> case
> >>> >> where it's failing for you it would be nice to know what the estimate
> was
> >>>
> >>>
> >>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a
> lot
> >>> > of
> >>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
> >>> > start_new_rx_buffer() returns true every time) and just add the extra
> 1.
> >>>
> >>> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
> >>> backing.
> 
> > I don't like it either, but theory suggested each frag should take no more
> > space than the original DIV_ROUND_UP and that proved to be wrong, but I
> cannot
> > figure out why.
> 
> >>> And since the original problem always seemed to occur on a packet with
> a
> >>> single large frag, i'm wondering
> >>> if this 1 would actually be correct in other cases.
> 
> > That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2
> > byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.
> 
> > In what situation my a 2 byte frag span 2 slots ?
> 
> > At least there must be a theoretical cap to the number of slots needed ..
> > - assuming and SKB can contain only 65535 bytes
> > - assuming a slot can take max PAGE_SIZE and frags are slit into PAGE_SIZE
> pieces ..
> 
> > - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
> > - double it ..  and at 30 we wouldn't still be near that 52 estimate and i don't
> know the ring size
> >   but wasn't that 32 ? So if the ring get's fully drained we shouldn't stall
> there.
> 
> 
> >>> Well this is what i said earlier on .. it's hard to estimate upfront if
> >>> "start_new_rx_buffer()" will return true,
> >>> and how many times that is possible per frag .. and if that is possible for
> >>> only
> >>> 1 frag or for all frags.
> >>>
> >>> The problem is now replaced from packets with 1 large frag (for which it
> >>> didn't account properly leading to a too small estimate) .. to packets
> >>> with a large number of (smaller) frags .. leading to a too large over
> >>> estimation.
> >>>
> >>> So would there be a theoretical maximum how often that path could hit
> >>> based on a combination of sizes (total size of all frags, nr_frags, size per
> >>> frag)
> >>> ?
> >>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you
> >>> hit it
> >>> in a next frag ?
> >>> - could it be limited due to something like the packet_size / nr_frags /
> >>> page_size ?
> >>>
> >>> And what was wrong with the previous calculation ?
> >>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
> >>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
> >>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> >>>
> 
> >> This is not safe if frag size can be > PAGE_SIZE.
> 
> > #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
> 
> > So if one of the frags is > PAGE_SIZE ..
> > wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because we
> are limited by the total packet size ?
> > (so we would spare a slot since we have a frag less .. but spend one more
> because we have a frag that needs 2 slots ?)
> 
> > (and that this should even be pessimistic since we didn't substract the
> header etc from the max total packet size ?)
> 
> 
> > So from what i said early, you could probably do the pessimistic estimate
> (that would help when packets have a small skb->data_len
> > (space occupied by frags)) so the estimate would be less then the old one
> based on MAX_SKB_FRAGS causing the packet to be processed earlier.
> > And CAP it using the old way since a packet should never be able to use
> more slots than that theoretical max_slots (which hopefully is less than
> > the ring size, so a packet can always be processed if the ring is finally
> emptied.
> 
> ok .. annotated "xenvif_gop_frag_copy()" to print what it did when we end
> up with (vif->rx.req_cons - req_cons_start) > estimatedcost for that frag
> where estimatedcost = DIV_ROUND_UP(size, PAGE_SIZE).
> 
> So the calculation indeed didn't take the offset used into account.
> 
> vif vif-7-0 vif7.0: ?!?!? xenvif_gop_frag_copy: frag costed more than est. 3>2
> | start i:0 size:7120 offset:1424 estimatedcost: 2
> begin i:0 size:7120 offset:1424 bytes:308159856 head:1282276652
>  d2 d4 d5
> begin i:1 size:4448 offset:0 bytes:2672 head:1282276652
>  d2 d4 d5
> begin i:2 size:352 offset:0 bytes:4096 head:1282276652
>  d1 d2 d5
> end i:3 size:0 offset:352
> 
> In the first round we only process 2672 bytes (instead of a full 4096 that could
> fit in a slot),
> which begs the question if it's actually needed to use the same offset from
> the frags in the slots ?
> 
> And this printk hits quite often for me ..

The reason for paying attention to the offset is that the grant copy hypercall cannot cross a 4k boundary. (Note that the requirement is that slots are limited to 4k, so the actual page size in the frontend or backend is irrelevant to the protocol.) So, I think the start_new_rx_buffer() logic is there to try to limit the number of grant copy operations - i.e. the code is supposed to copy enough to align the source and dest buffers on a 4k boundary and then copy in 4k chunks. I seriously wonder whether it's worth limiting the number of copy ops in this way though; if we did not, we'd have the benefit of knowing exactly how many slots each skb needs.
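
A minimal sketch of that constraint (assuming only that a single grant copy op may not cross a 4k boundary on either the source or the destination side; the helper name is made up):

/* Largest length one grant copy op may take, given the current source
 * and destination offsets within their 4k frames. */
static unsigned long grant_copy_chunk_len(unsigned long src_off,
					  unsigned long dst_off,
					  unsigned long remaining)
{
	unsigned long src_space = 4096 - (src_off & 4095);
	unsigned long dst_space = 4096 - (dst_off & 4095);
	unsigned long len = src_space < dst_space ? src_space : dst_space;

	return len < remaining ? len : remaining;
}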

  Paul

> 
> >>> That perhaps also misses some theoretical backing, what if it would have
> >>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
> >>> fit in a slot. Or is the total size of frags a skb can carry limited to
> >>> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
> >>> MAX_SKB_FRAGS is a upper limit.
> >>> (and you could do the new check maxed by MAX_SKB_FRAGS so it
> doesn't
> >>> get to a too large non reachable estimate).
> >>>
> >>> But as a side question .. the whole "get_next_rx_buffer()" path is
> needed
> >>> for when a frag could not fit in a slot
> >>> as a whole ?
> 
> 
> >> Perhaps it would be best to take the hit on copy_ops and just tightly pack,
> so
> >> we only start a new slot when the current one is completely full; then
> actual
> >> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the
> extra if
> >> it's a GSO).
> 
> > Don't know if and how much a performance penalty that would be.
> 
> >>  Paul
> 
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28  9:30                       ` [Xen-devel] " Paul Durrant
@ 2014-03-28  9:39                         ` Sander Eikelenboom
  2014-03-28  9:47                           ` Paul Durrant
  2014-03-28  9:47                           ` Paul Durrant
  2014-03-28  9:39                         ` Sander Eikelenboom
  1 sibling, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28  9:39 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel


Friday, March 28, 2014, 10:30:27 AM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 27 March 2014 19:23
>> To: Paul Durrant
>> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
>> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
>> clause from if statement
>> 
>> 
>> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
>> 
>> > <big snip>
>> 
>> >>> >>
>> >>> >> > So, it may be that the worse-case estimate is now too bad. In the
>> case
>> >>> >> where it's failing for you it would be nice to know what the estimate
>> was
>> >>>
>> >>>
>> >>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a
>> lot
>> >>> > of
>> >>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
>> >>> > start_new_rx_buffer() returns true every time) and just add the extra
>> 1.
>> >>>
>> >>> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
>> >>> backing.
>> 
>> > I don't like it either, but theory suggested each frag should take no more
>> > space than the original DIV_ROUND_UP and that proved to be wrong, but I
>> cannot
>> > figure out why.
>> 
>> >>> And since the original problem always seemed to occur on a packet with
>> a
>> >>> single large frag, i'm wondering
>> >>> if this 1 would actually be correct in other cases.
>> 
>> > That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2
>> > byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.
>> 
>> > In what situation my a 2 byte frag span 2 slots ?
>> 
>> > At least there must be a theoretical cap to the number of slots needed ..
>> > - assuming and SKB can contain only 65535 bytes
>> > - assuming a slot can take max PAGE_SIZE and frags are slit into PAGE_SIZE
>> pieces ..
>> 
>> > - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
>> > - double it ..  and at 30 we wouldn't still be near that 52 estimate and i don't
>> know the ring size
>> >   but wasn't that 32 ? So if the ring get's fully drained we shouldn't stall
>> there.
>> 
>> 
>> >>> Well this is what i said earlier on .. it's hard to estimate upfront if
>> >>> "start_new_rx_buffer()" will return true,
>> >>> and how many times that is possible per frag .. and if that is possible for
>> >>> only
>> >>> 1 frag or for all frags.
>> >>>
>> >>> The problem is now replaced from packets with 1 large frag (for which it
>> >>> didn't account properly leading to a too small estimate) .. to packets
>> >>> with a large number of (smaller) frags .. leading to a too large over
>> >>> estimation.
>> >>>
>> >>> So would there be a theoretical maximum how often that path could hit
>> >>> based on a combination of sizes (total size of all frags, nr_frags, size per
>> >>> frag)
>> >>> ?
>> >>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you
>> >>> hit it
>> >>> in a next frag ?
>> >>> - could it be limited due to something like the packet_size / nr_frags /
>> >>> page_size ?
>> >>>
>> >>> And what was wrong with the previous calculation ?
>> >>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>> >>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>> >>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>> >>>
>> 
>> >> This is not safe if frag size can be > PAGE_SIZE.
>> 
>> > #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
>> 
>> > So if one of the frags is > PAGE_SIZE ..
>> > wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because we
>> are limited by the total packet size ?
>> > (so we would spare a slot since we have a frag less .. but spend one more
>> because we have a frag that needs 2 slots ?)
>> 
>> > (and that this should even be pessimistic since we didn't substract the
>> header etc from the max total packet size ?)
>> 
>> 
>> > So from what i said early, you could probably do the pessimistic estimate
>> (that would help when packets have a small skb->data_len
>> > (space occupied by frags)) so the estimate would be less then the old one
>> based on MAX_SKB_FRAGS causing the packet to be processed earlier.
>> > And CAP it using the old way since a packet should never be able to use
>> more slots than that theoretical max_slots (which hopefully is less than
>> > the ring size, so a packet can always be processed if the ring is finally
>> emptied.
>> 
>> 
>> >>> That perhaps also misses some theoretical backing, what if it would have
>> >>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
>> >>> fit in a slot. Or is the total size of frags a skb can carry limited to
>> >>> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
>> >>> MAX_SKB_FRAGS is a upper limit.
>> >>> (and you could do the new check maxed by MAX_SKB_FRAGS so it
>> doesn't
>> >>> get to a too large non reachable estimate).
>> >>>
>> >>> But as a side question .. the whole "get_next_rx_buffer()" path is
>> needed
>> >>> for when a frag could not fit in a slot
>> >>> as a whole ?
>> 
>> 
>> >> Perhaps it would be best to take the hit on copy_ops and just tightly pack,
>> so
>> >> we only start a new slot when the current one is completely full; then
>> actual
>> >> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the
>> extra if
>> >> it's a GSO).
>> 
>> > Don't know if and how much a performance penalty that would be.
>> 
>> >>  Paul
>> 
>> Hmm since i now started to dig around a bit more ..
>> 
>> The ring size seems to be determined by netfront and not by netback ?
>> Couldn't this lead to problems when PAGE_SIZE dom0 != PAGE_SIZE domU
>> (and potentially lead to a overrun and therefore problems on the HOST) ?
>> 
>> And about the commit message from
>> ca2f09f2b2c6c25047cfc545d057c4edfcfe561c ...
>> Do i understand it correctly that you saw the original problem (stall on large
>> file copy) only on a "Windows Server 2008R2", probably with PV drivers ?
>> 

> Yes, with PV drivers as you say.

>> I don't see why the original calculation wouldn't work, so what kind of
>> packets (nr_frags, frag size and PAGE_SIZE ) caused it ?
>> 
>> And could you retest if that "Windows Server 2008R2" works with a netback
>> with you latest patch series (pessimistic estimate) plus a cap on
>> max_slots_needed like:
>> 
>> if(max_slots_needed > MAX_SKB_FRAGS + 1){
>>         max_slots_needed = MAX_SKB_FRAGS + 1;
>> }

> The behaviour of the Windows frontend is different to netfront; it tries to keep the shared ring as full as possible, so the estimate could be as pessimistic as you like (as long as it doesn't exceed the ring size ;-)) and you'd never see the lock-up. For some reason (best known to the originator of the code, I suspect) the Linux netfront driver limits the number of requests it posts into the shared ring, leading to the possibility of lock-up in the case where the backend needs more slots than the frontend 'thinks' it should.
But from what I read, the ring size is determined by the frontend .. so that PV driver should be able to guarantee that itself ..

Which begs the question .. was that change of the max_slots_needed calculation *needed* to prevent the problem you saw on "Windows Server 2008R2",
or was that just changed for correctness ?

>   Paul

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28  9:36                       ` [Xen-devel] " Paul Durrant
  2014-03-28  9:46                         ` Sander Eikelenboom
@ 2014-03-28  9:46                         ` Sander Eikelenboom
  1 sibling, 0 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28  9:46 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel


Friday, March 28, 2014, 10:36:42 AM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 28 March 2014 00:55
>> To: Paul Durrant
>> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
>> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
>> clause from if statement
>> 
>> 
>> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
>> 
>> > <big snip>
>> 
>> >>> >>
>> >>> >> > So, it may be that the worse-case estimate is now too bad. In the
>> case
>> >>> >> where it's failing for you it would be nice to know what the estimate
>> was
>> >>>
>> >>>
>> >>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a
>> lot
>> >>> > of
>> >>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
>> >>> > start_new_rx_buffer() returns true every time) and just add the extra
>> 1.
>> >>>
>> >>> Hrmm i don't like a "magic" 1 bonus slot, there must be some theoretical
>> >>> backing.
>> 
>> > I don't like it either, but theory suggested each frag should take no more
>> > space than the original DIV_ROUND_UP and that proved to be wrong, but I
>> cannot
>> > figure out why.
>> 
>> >>> And since the original problem always seemed to occur on a packet with
>> a
>> >>> single large frag, i'm wondering
>> >>> if this 1 would actually be correct in other cases.
>> 
>> > That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2
>> > byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.
>> 
>> > In what situation my a 2 byte frag span 2 slots ?
>> 
>> > At least there must be a theoretical cap to the number of slots needed ..
>> > - assuming and SKB can contain only 65535 bytes
>> > - assuming a slot can take max PAGE_SIZE and frags are slit into PAGE_SIZE
>> pieces ..
>> 
>> > - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
>> > - double it ..  and at 30 we wouldn't still be near that 52 estimate and i don't
>> know the ring size
>> >   but wasn't that 32 ? So if the ring get's fully drained we shouldn't stall
>> there.
>> 
>> 
>> >>> Well this is what i said earlier on .. it's hard to estimate upfront if
>> >>> "start_new_rx_buffer()" will return true,
>> >>> and how many times that is possible per frag .. and if that is possible for
>> >>> only
>> >>> 1 frag or for all frags.
>> >>>
>> >>> The problem is now replaced from packets with 1 large frag (for which it
>> >>> didn't account properly leading to a too small estimate) .. to packets
>> >>> with a large number of (smaller) frags .. leading to a too large over
>> >>> estimation.
>> >>>
>> >>> So would there be a theoretical maximum how often that path could hit
>> >>> based on a combination of sizes (total size of all frags, nr_frags, size per
>> >>> frag)
>> >>> ?
>> >>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could you
>> >>> hit it
>> >>> in a next frag ?
>> >>> - could it be limited due to something like the packet_size / nr_frags /
>> >>> page_size ?
>> >>>
>> >>> And what was wrong with the previous calculation ?
>> >>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
>> >>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
>> >>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
>> >>>
>> 
>> >> This is not safe if frag size can be > PAGE_SIZE.
>> 
>> > #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
>> 
>> > So if one of the frags is > PAGE_SIZE ..
>> > wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because we
>> are limited by the total packet size ?
>> > (so we would spare a slot since we have a frag less .. but spend one more
>> because we have a frag that needs 2 slots ?)
>> 
>> > (and that this should even be pessimistic since we didn't substract the
>> header etc from the max total packet size ?)
>> 
>> 
>> > So from what i said early, you could probably do the pessimistic estimate
>> (that would help when packets have a small skb->data_len
>> > (space occupied by frags)) so the estimate would be less then the old one
>> based on MAX_SKB_FRAGS causing the packet to be processed earlier.
>> > And CAP it using the old way since a packet should never be able to use
>> more slots than that theoretical max_slots (which hopefully is less than
>> > the ring size, so a packet can always be processed if the ring is finally
>> emptied.
>> 
>> ok .. annotated "xenvif_gop_frag_copy()" to print what it did when we end
>> up with (vif->rx.req_cons - req_cons_start) > estimatedcost for that frag
>> where estimatedcost = DIV_ROUND_UP(size, PAGE_SIZE).
>> 
>> So the calculation indeed didn't take the offset used into account.
>> 
>> vif vif-7-0 vif7.0: ?!?!? xenvif_gop_frag_copy: frag costed more than est. 3>2
>> | start i:0 size:7120 offset:1424 estimatedcost: 2
>> begin i:0 size:7120 offset:1424 bytes:308159856 head:1282276652
>>  d2 d4 d5
>> begin i:1 size:4448 offset:0 bytes:2672 head:1282276652
>>  d2 d4 d5
>> begin i:2 size:352 offset:0 bytes:4096 head:1282276652
>>  d1 d2 d5
>> end i:3 size:0 offset:352
>> 
>> In the first round we only process 2672 bytes (instead of a full 4096 that could
>> fit in a slot),
>> which begs the question if it's actually needed to use the same offset from
>> the frags in the slots ?
>> 
>> And this printk hits quite often for me ..

> The reason for paying attention to the offset is that the grant copy hypercall cannot cross a 4k boundary. (Note that the requirement is that slots are limited to 4k, so the actual page size in the front or backend is irrelevant to the protocol).
> So, I think the start_new_rx_buffer() logic is there to try to limit the number of grant copy operations - i.e. the code is supposed to copy enough to align the source and dest buffers on a 4k boundary and then copy in 4k chunks.
> I seriously wonder whether it's worth limiting the number of copy ops in this way though as, if we did not do this, we'd have the benefit of knowing exactly how many slots each skb needs.

Well, both the original calculation and the pessimistic one seem to overestimate by a large number of slots (far more than required in the end), so they can be extremely wasteful.
The new calculation introduced in ca2f09f2b2c6c25047cfc545d057c4edfcfe561c misses the offset and can be too small.

So clearly it's not easy to produce an estimate that isn't wasteful; it seems a precise calculation is needed. If that could be achieved more easily
that way, I think it would probably be worth it (and it would simplify things at the same time).
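
To make that concrete: a minimal sketch of the worst case for a single frag when grant copies cannot cross a 4k boundary and the source offset is carried into the slot (the helper name is invented for illustration; this is not existing netback code):

static unsigned int frag_worst_case_slots(unsigned int offset,
					  unsigned long size)
{
	/* The first copy chunk ends at the next 4k boundary of the
	 * source, so a frag can straddle one slot more than
	 * DIV_ROUND_UP(size, PAGE_SIZE) suggests.
	 */
	return DIV_ROUND_UP((offset & ~PAGE_MASK) + size, PAGE_SIZE);
}

Plugging in the trace above (offset 1424, size 7120) gives DIV_ROUND_UP(8544, 4096) = 3, matching the observed cost of 3 against the estimate of 2.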

>   Paul

>> 
>> >>> That perhaps also misses some theoretical backing, what if it would have
>> >>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
>> >>> fit in a slot. Or is the total size of frags a skb can carry limited to
>> >>> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
>> >>> MAX_SKB_FRAGS is a upper limit.
>> >>> (and you could do the new check maxed by MAX_SKB_FRAGS so it
>> doesn't
>> >>> get to a too large non reachable estimate).
>> >>>
>> >>> But as a side question .. the whole "get_next_rx_buffer()" path is
>> needed
>> >>> for when a frag could not fit in a slot
>> >>> as a whole ?
>> 
>> 
>> >> Perhaps it would be best to take the hit on copy_ops and just tightly pack,
>> so
>> >> we only start a new slot when the current one is completely full; then
>> actual
>> >> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for the
>> extra if
>> >> it's a GSO).
>> 
>> > Don't know if and how much a performance penalty that would be.
>> 
>> >>  Paul
>> 
>> 
>> 
>> 
>> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28  9:39                         ` Sander Eikelenboom
@ 2014-03-28  9:47                           ` Paul Durrant
  2014-03-28  9:59                             ` Sander Eikelenboom
                                               ` (3 more replies)
  2014-03-28  9:47                           ` Paul Durrant
  1 sibling, 4 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-28  9:47 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 28 March 2014 09:39
> To: Paul Durrant
> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> 
> Friday, March 28, 2014, 10:30:27 AM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 27 March 2014 19:23
> >> To: Paul Durrant
> >> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
> devel@lists.xen.org
> >> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove
> pointless
> >> clause from if statement
> >>
> >>
> >> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
> >>
> >> > <big snip>
> >>
> >> >>> >>
> >> >>> >> > So, it may be that the worse-case estimate is now too bad. In
> the
> >> case
> >> >>> >> where it's failing for you it would be nice to know what the
> estimate
> >> was
> >> >>>
> >> >>>
> >> >>> > Ok, so we cannot be too pessimistic. In that case I don't see there's
> a
> >> lot
> >> >>> > of
> >> >>> > choice but to stick with the existing DIV_ROUND_UP (i.e. don't
> assume
> >> >>> > start_new_rx_buffer() returns true every time) and just add the
> extra
> >> 1.
> >> >>>
> >> >>> Hrmm i don't like a "magic" 1 bonus slot, there must be some
> theoretical
> >> >>> backing.
> >>
> >> > I don't like it either, but theory suggested each frag should take no more
> >> > space than the original DIV_ROUND_UP and that proved to be wrong,
> but I
> >> cannot
> >> > figure out why.
> >>
> >> >>> And since the original problem always seemed to occur on a packet
> with
> >> a
> >> >>> single large frag, i'm wondering
> >> >>> if this 1 would actually be correct in other cases.
> >>
> >> > That's why I went for an extra 1 per frag... a pessimal slot packing i.e. 2
> >> > byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.
> >>
> >> > In what situation my a 2 byte frag span 2 slots ?
> >>
> >> > At least there must be a theoretical cap to the number of slots needed ..
> >> > - assuming and SKB can contain only 65535 bytes
> >> > - assuming a slot can take max PAGE_SIZE and frags are slit into
> PAGE_SIZE
> >> pieces ..
> >>
> >> > - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
> >> > - double it ..  and at 30 we wouldn't still be near that 52 estimate and i
> don't
> >> know the ring size
> >> >   but wasn't that 32 ? So if the ring get's fully drained we shouldn't stall
> >> there.
> >>
> >>
> >> >>> Well this is what i said earlier on .. it's hard to estimate upfront if
> >> >>> "start_new_rx_buffer()" will return true,
> >> >>> and how many times that is possible per frag .. and if that is possible
> for
> >> >>> only
> >> >>> 1 frag or for all frags.
> >> >>>
> >> >>> The problem is now replaced from packets with 1 large frag (for which
> it
> >> >>> didn't account properly leading to a too small estimate) .. to packets
> >> >>> with a large number of (smaller) frags .. leading to a too large over
> >> >>> estimation.
> >> >>>
> >> >>> So would there be a theoretical maximum how often that path could
> hit
> >> >>> based on a combination of sizes (total size of all frags, nr_frags, size
> per
> >> >>> frag)
> >> >>> ?
> >> >>> - if you hit "start_new_rx_buffer()" == true  in the first frag .. could
> you
> >> >>> hit it
> >> >>> in a next frag ?
> >> >>> - could it be limited due to something like the packet_size / nr_frags /
> >> >>> page_size ?
> >> >>>
> >> >>> And what was wrong with the previous calculation ?
> >> >>>                  int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
> >> >>>                  if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
> >> >>>                          max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> >> >>>
> >>
> >> >> This is not safe if frag size can be > PAGE_SIZE.
> >>
> >> > #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
> >>
> >> > So if one of the frags is > PAGE_SIZE ..
> >> > wouldn't that imply that we have nr_frags < MAX_SKB_FRAGS because
> we
> >> are limited by the total packet size ?
> >> > (so we would spare a slot since we have a frag less .. but spend one
> more
> >> because we have a frag that needs 2 slots ?)
> >>
> >> > (and that this should even be pessimistic since we didn't substract the
> >> header etc from the max total packet size ?)
> >>
> >>
> >> > So from what i said early, you could probably do the pessimistic estimate
> >> (that would help when packets have a small skb->data_len
> >> > (space occupied by frags)) so the estimate would be less then the old
> one
> >> based on MAX_SKB_FRAGS causing the packet to be processed earlier.
> >> > And CAP it using the old way since a packet should never be able to use
> >> more slots than that theoretical max_slots (which hopefully is less than
> >> > the ring size, so a packet can always be processed if the ring is finally
> >> emptied.
> >>
> >>
> >> >>> That perhaps also misses some theoretical backing, what if it would
> have
> >> >>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split
> to
> >> >>> fit in a slot. Or is the total size of frags a skb can carry limited to
> >> >>> MAX_SKB_FRAGS / PAGE_SIZE ? .. than you would expect that
> >> >>> MAX_SKB_FRAGS is a upper limit.
> >> >>> (and you could do the new check maxed by MAX_SKB_FRAGS so it
> >> doesn't
> >> >>> get to a too large non reachable estimate).
> >> >>>
> >> >>> But as a side question .. the whole "get_next_rx_buffer()" path is
> >> needed
> >> >>> for when a frag could not fit in a slot
> >> >>> as a whole ?
> >>
> >>
> >> >> Perhaps it would be best to take the hit on copy_ops and just tightly
> pack,
> >> so
> >> >> we only start a new slot when the current one is completely full; then
> >> actual
> >> >> slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE) (+ 1 for
> the
> >> extra if
> >> >> it's a GSO).
> >>
> >> > Don't know if and how much a performance penalty that would be.
> >>
> >> >>  Paul
> >>
> >> Hmm since i now started to dig around a bit more ..
> >>
> >> The ring size seems to be determined by netfront and not by netback ?
> >> Couldn't this lead to problems when PAGE_SIZE dom0 != PAGE_SIZE
> domU
> >> (and potentially lead to a overrun and therefore problems on the HOST) ?
> >>
> >> And about the commit message from
> >> ca2f09f2b2c6c25047cfc545d057c4edfcfe561c ...
> >> Do i understand it correctly that you saw the original problem (stall on
> large
> >> file copy) only on a "Windows Server 2008R2", probably with PV drivers ?
> >>
> 
> > Yes, with PV drivers as you say.
> 
> >> I don't see why the original calculation wouldn't work, so what kind of
> >> packets (nr_frags, frag size and PAGE_SIZE ) caused it ?
> >>
> >> And could you retest if that "Windows Server 2008R2" works with a
> netback
> >> with you latest patch series (pessimistic estimate) plus a cap on
> >> max_slots_needed like:
> >>
> >> if(max_slots_needed > MAX_SKB_FRAGS + 1){
> >>         max_slots_needed = MAX_SKB_FRAGS + 1;
> >> }
> 
> > The behaviour of the Windows frontend is different to netfront; it tries to
> keep the shared ring as full as possible so the estimate could be as
> pessimistic as you like (as long as it doesn't exceed ring size ;-)) and you'd
> never see the lock-up. For some reason (best known to the originator of the
> code I suspect) the Linux netfront driver limits the number of requests it
> posts into the shared ring leading to the possibility of lock-up in the case
> where the backend needs more slots than the frontend 'thinks' it should.
> But from what i read the ring size is determined by the frontend .. so that PV
> driver should be able to guarantee that itself ..
> 

The ring size is 256 - that's baked in. The number of pending requests available to the backend *is* determined by the frontend.
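
(For reference, a sketch of where the 256 comes from, assuming the standard shared-ring layout in include/xen/interface/io/ring.h: the ring and its producer/consumer indexes share a single 4k page, each rx request/response union is 8 bytes, and the entry count is rounded down to a power of two - hence 256. Netfront sizes it with something like:)

#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)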

> Which begs for the question .. was that change of max_slots_needed
> calculation *needed* to prevent the problem you saw on "Windows Server
> 2008R2",
> or was that just changed for correctness ?
> 

It was changed for correctness. As I understand it, use of MAX_SKB_FRAGS is incorrect if compound pages are in use, as the page size is no longer the slot size. It's also wasteful to always wait for space for a maximal packet if the packet you have is smaller, so the intention of the max estimate was that it should be at least the number of slots required, but not excessive. I think you've proved that making such an estimate is just too hard, and since we don't want to fall back to the old dry-run style of slot counting (which meant you had two codepaths that *must* arrive at the same number - and they didn't, which is why I was getting the lock-up with Windows guests), I think we should just go with full packing so that we don't need to estimate.
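
(A minimal sketch of the slot count under full packing - the function name is invented, and hanging the extra slot off skb_is_gso() is an assumption about where the extra_info segment would be accounted:)

static unsigned int xenvif_slots_full_pack(struct sk_buff *skb)
{
	/* With tight packing a new slot starts only when the current
	 * one is completely full, so this count is exact rather than
	 * an estimate.
	 */
	unsigned int slots = DIV_ROUND_UP(skb->len, PAGE_SIZE);

	if (skb_is_gso(skb))
		slots++;	/* extra_info slot */

	return slots;
}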

  Paul

> >   Paul
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28  9:47                           ` Paul Durrant
@ 2014-03-28  9:59                             ` Sander Eikelenboom
  2014-03-28 10:12                               ` Paul Durrant
  2014-03-28 10:12                               ` Paul Durrant
  2014-03-28  9:59                             ` Sander Eikelenboom
                                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28  9:59 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel


Friday, March 28, 2014, 10:47:20 AM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 28 March 2014 09:39
>> To: Paul Durrant
>> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
>> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
>> clause from if statement
>> 
>> 
>> Friday, March 28, 2014, 10:30:27 AM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 27 March 2014 19:23
>> >> To: Paul Durrant
>> >> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
>> devel@lists.xen.org
>> >> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove
>> pointless
>> >> clause from if statement
>> >>
>> >>
>> >> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
>> >>
>> >> > <big snip>
>> >>
>> >> >>> >>
>> >> >>> >> > So, it may be that the worse-case estimate is now too bad. In
>> the
>> >> case
>> >> >>> >> where it's failing for you it would be nice to know what the
>> estimate
>> >> was
>> >> >>>
>> >> >>>
>> >> >>> > Ok, so we cannot be too pessimistic. In that case I don't see there's
>> a
> > The behaviour of the Windows frontend is different to netfront; it tries to
> keep the shared ring as full as possible so the estimate could be as
> pessimistic as you like (as long as it doesn't exceed ring size ;-)) and you'd
> never see the lock-up. For some reason (best known to the originator of the
> code I suspect) the Linux netfront driver limits the number of requests it
> posts into the shared ring leading to the possibility of lock-up in the case
> where the backend needs more slots than the frontend 'thinks' it should.
> But from what i read the ring size is determined by the frontend .. so that PV
> driver should be able to guarantee that itself ..
> 

> The ring size is 256 - that's baked in. The number of pending requests
> available to backend *is* determined by the frontend.

Ah OK, does it also reserve that space?
(If so, why not use it to allow multiple complete packets to be shoveled in?)

> Which begs for the question .. was that change of max_slots_needed
> calculation *needed* to prevent the problem you saw on "Windows Server
> 2008R2",
> or was that just changed for correctness ?
> 

> It was changed for correctness. As I understand it, use of MAX_SKB_FRAGS is
> incorrect if compound pages are in use as the page size is no longer the slot
> size. It's also wasteful to always wait for space for a maximal packet if the
> packet you have is smaller so the intention of the max estimate was that it
> should be at least the number of slots required but not excessive. I think
> you've proved that making such an estimate is just too hard and since we don't
> want to fall back to the old dry-run style of slot counting (which meant you
> had two codepaths that *must* arrive at the same number - and they didn't,
> which is why I was getting the lock-up with Windows guests) I think we should
> just go with full-packing so that we don't need to estimate.

OK, I asked this question since the about-to-be-released 3.14 now underestimates,
and that causes a regression.
So if that part of your patches is not involved in fixing the stated problem / regression, I think
just that calculation change should be reverted to the MAX_SKB_FRAGS variant again.
It's more wasteful (as it always has been), but that is better than being incorrect and inducing a buffer overrun IMHO.

That would give time to think, revise and test this for 3.15.

BTW: if a slot is always 4k, should the code be checking against PAGE_SIZE in so many places, or against the
hardcoded 4k slot size? (At the moment you only have x86 dom0, so PAGE_SIZE == 4k is probably guaranteed that way,
but nevertheless.)
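
A minimal sketch of making the slot size explicit (XEN_NETBK_SLOT_SIZE is an invented name, purely illustrative):

/* The slot size is fixed at 4k by the protocol, independent of
 * either side's PAGE_SIZE.
 */
#define XEN_NETBK_SLOT_SIZE 4096U

static unsigned int bytes_to_slots(unsigned long len)
{
	return DIV_ROUND_UP(len, XEN_NETBK_SLOT_SIZE);
}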

> Paul

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28  9:47                           ` Paul Durrant
                                               ` (2 preceding siblings ...)
  2014-03-28 10:01                             ` David Laight
@ 2014-03-28 10:01                             ` David Laight
  2014-03-28 10:20                               ` Paul Durrant
  2014-03-28 10:20                               ` Paul Durrant
  3 siblings, 2 replies; 71+ messages in thread
From: David Laight @ 2014-03-28 10:01 UTC (permalink / raw)
  To: 'Paul Durrant', Sander Eikelenboom
  Cc: netdev, Wei Liu, Ian Campbell, xen-devel

From: Paul Durrant
...
> > The behaviour of the Windows frontend is different to netfront; it tries to
> > keep the shared ring as full as possible so the estimate could be as
> > pessimistic as you like (as long as it doesn't exceed ring size ;-)) and you'd
> > never see the lock-up. For some reason (best known to the originator of the
> > code I suspect) the Linux netfront driver limits the number of requests it
> > posts into the shared ring leading to the possibility of lock-up in the case
> > where the backend needs more slots than the frontend 'thinks' it should.
> > But from what i read the ring size is determined by the frontend .. so that PV
> > driver should be able to guarantee that itself ..
> >
> 
> The ring size is 256 - that's baked in. The number of pending requests available to backend *is*
> determined by the frontend.
> 
> > Which begs for the question .. was that change of max_slots_needed
> > calculation *needed* to prevent the problem you saw on "Windows Server
> > 2008R2",
> > or was that just changed for correctness ?
> >
> 
> It was changed for correctness. As I understand it, use of MAX_SKB_FRAGS is incorrect if compound
> pages are in use as the page size is no longer the slot size. It's also wasteful to always wait for
> space for a maximal packet if the packet you have is smaller so the intention of the max estimate was
> that it should be at least the number of slots required but not excessive. I think you've proved that
> making such an estimate is just too hard and since we don't want to fall back to the old dry-run style
> of slot counting (which meant you had two codepaths that *must* arrive at the same number - and they
> didn't, which is why I was getting the lock-up with Windows guests) I think we should just go with
> full-packing so that we don't need to estimate.

A reasonable high estimate for the number of slots required for a specific
message is 'frag_count + total_size/4096'.
So if there are that many slots free, it is definitely OK to add the message.
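
As a sketch (the name is invented; it mirrors the formula above, on the reasoning that each frag can add at most one slot of misalignment on top of the 4k chunks its bytes occupy):

static unsigned int rx_slots_high_estimate(struct sk_buff *skb)
{
	/* frag_count + total_size/4096, rounded up */
	return skb_shinfo(skb)->nr_frags +
	       DIV_ROUND_UP(skb->len, 4096);
}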

I can see a more general problem for transmits.
I believe a NAPI driver is supposed to indicate that it can't accept
a tx packet in advance of being given a specific packet to transmit.
This means it has to keep enough tx ring space for a worst-case packet
(which in some cases can be larger than 1 + MAX_SKB_FRAGS) even though
such a packet is unlikely.
I would be tempted to save the skb that 'doesn't fit' within the driver
rather than try to second-guess the number of fragments the next packet
will need.

FWIW the USB3 'bulk' driver has the same problem, fragments can't cross
64k boundaries.

	David

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28  9:59                             ` Sander Eikelenboom
@ 2014-03-28 10:12                               ` Paul Durrant
  2014-03-28 10:36                                 ` Sander Eikelenboom
  2014-03-28 10:36                                 ` [Xen-devel] " Sander Eikelenboom
  2014-03-28 10:12                               ` Paul Durrant
  1 sibling, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-28 10:12 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Sander Eikelenboom
> Sent: 28 March 2014 10:00
> To: Paul Durrant
> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> 
> Friday, March 28, 2014, 10:47:20 AM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 28 March 2014 09:39
> >> To: Paul Durrant
> >> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
> devel@lists.xen.org
> >> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove
> pointless
> >> clause from if statement
> >>
> >>
> >> Friday, March 28, 2014, 10:30:27 AM, you wrote:
> >>
> >> >> -----Original Message-----
> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> Sent: 27 March 2014 19:23
> >> >> To: Paul Durrant
> >> >> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
> >> devel@lists.xen.org
> >> >> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove
> >> pointless
> >> >> clause from if statement
> >> >>
> >> >>
> >> >> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
> >> >>
> >> >> > <big snip>
> >> >>
> >> >> >>> >>
> >> >> >>> >> > So, it may be that the worse-case estimate is now too bad. In
> >> the
> >> >> case
> >> >> >>> >> where it's failing for you it would be nice to know what the
> >> estimate
> >> >> was
> >> >> >>>
> >> >> >>>
> >> >> >>> > Ok, so we cannot be too pessimistic. In that case I don't see
> there's
> >> a
> > > The behaviour of the Windows frontend is different to netfront; it tries to
> > keep the shared ring as full as possible so the estimate could be as
> > pessimistic as you like (as long as it doesn't exceed ring size ;-)) and you'd
> > never see the lock-up. For some reason (best known to the originator of
> the
> > code I suspect) the Linux netfront driver limits the number of requests it
> > posts into the shared ring leading to the possibility of lock-up in the case
> > where the backend needs more slots than the frontend 'thinks' it should.
> > But from what i read the ring size is determined by the frontend .. so that
> PV
> > driver should be able to guarantee that itself ..
> >
> 
> > The ring size is 256 - that's baked in. The number of pending requests
> > available to backend *is* determined by the frontend.
> 
> Ah ok, does it also reserve that space ?
> (if so .. why not use it to allow multiple complete packets to be shoveled in)
> 
> > Which begs the question .. was that change of max_slots_needed
> > calculation *needed* to prevent the problem you saw on "Windows Server
> > 2008R2",
> > or was that just changed for correctness ?
> >
> 
> > It was changed for correctness. As I understand it, use of MAX_SKB_FRAGS
> is
> > incorrect if compound pages are in use as the page size is no longer the slot
> > size. It's also wasteful to always wait for space for a maximal packet if the
> > packet you have is smaller so the intention of the max estimate was that it
> > should be at least the number of slots required but not excessive. I think
> > you've proved that making such an estimate is just too hard and since we
> don't
> > want to fall back to the old dry-run style of slot counting (which meant you
> > had two codepaths that *must* arrive at the same number - and they
> didn't,
> > which is why I was getting the lock-up with Windows guests) I think we
> should
> > just go with full-packing so that we don't need to estimate.
> 
> Ok i asked this question since the about-to-be-released 3.14 now
> underestimates and
> it causes a regression.
> So if that part of your patches is not involved in fixing the stated problem /
> regression i think
> just that calculation change should be reverted to the MAX_SKB_FRAGS
> variant again.
> It's more wasteful (as it always has been) but that is better than incorrect and
> inducing buffer overrun IMHO.

But I'm not sure even that is correct. Are you?

> 
> That would give time to think, revise and test this for 3.15.
> 
> BTW: if a slot is always 4k, should it check with PAGE_SIZE then on a lot of
> occasions or just with the
> hardcoded 4k slot size ? (at the moment you only have x86 dom0 so probably
> the page_size==4k is guaranteed that way,
> but nevertheless.)
> 

Well, it's 4k because that's the smallest x86 page size and that's what Xen uses in its ABI, so I guess the slot size should really be acquired from Xen to be architecture-agnostic.
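
(For illustration, a minimal sketch of that decoupling, with hypothetical
constant and helper names - this is not the actual netback code:)

/* The slot size is fixed by the Xen ABI, independent of the host's
 * PAGE_SIZE; the names here are assumptions for illustration. */
#define XEN_SLOT_SHIFT	12
#define XEN_SLOT_SIZE	(1u << XEN_SLOT_SHIFT)	/* 4096 */

/* Slots needed for 'len' bytes packed flat, with no frag overhead. */
static unsigned int slots_for_len(unsigned int len)
{
	return (len + XEN_SLOT_SIZE - 1) >> XEN_SLOT_SHIFT;
}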

  Paul

> > Paul
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:01                             ` [Xen-devel] " David Laight
@ 2014-03-28 10:20                               ` Paul Durrant
  2014-03-28 10:35                                 ` David Laight
  2014-03-28 10:35                                 ` [Xen-devel] " David Laight
  2014-03-28 10:20                               ` Paul Durrant
  1 sibling, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-28 10:20 UTC (permalink / raw)
  To: David Laight, Sander Eikelenboom; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of David Laight
> Sent: 28 March 2014 10:01
> To: Paul Durrant; Sander Eikelenboom
> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
> Subject: RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> From: Paul Durrant
> ...
> > > The behaviour of the Windows frontend is different to netfront; it tries to
> > > keep the shared ring as full as possible so the estimate could be as
> > > pessimistic as you like (as long as it doesn't exceed ring size ;-)) and you'd
> > > never see the lock-up. For some reason (best known to the originator of
> the
> > > code I suspect) the Linux netfront driver limits the number of requests it
> > > posts into the shared ring leading to the possibility of lock-up in the case
> > > where the backend needs more slots than the frontend 'thinks' it should.
> > > But from what i read the ring size is determined by the frontend .. so that
> PV
> > > driver should be able to guarantee that itself ..
> > >
> >
> > The ring size is 256 - that's baked in. The number of pending requests
> available to backend *is*
> > determined by the frontend.
> >
> > > Which begs the question .. was that change of max_slots_needed
> > > calculation *needed* to prevent the problem you saw on "Windows
> Server
> > > 2008R2",
> > > or was that just changed for correctness ?
> > >
> >
> > It was changed for correctness. As I understand it, use of MAX_SKB_FRAGS
> is incorrect if compound
> > pages are in use as the page size is no longer the slot size. It's also wasteful
> to always wait for
> > space for a maximal packet if the packet you have is smaller so the
> intention of the max estimate was
> > that it should be at least the number of slots required but not excessive. I
> think you've proved that
> > making such an estimate is just too hard and since we don't want to fall
> back to the old dry-run style
> > of slot counting (which meant you had two codepaths that *must* arrive at
> the same number - and they
> > didn't, which is why I was getting the lock-up with Windows guests) I think
> we should just go with
> > full-packing so that we don't need to estimate.
> 
> A reasonable high estimate for the number of slots required for a specific
> message is 'frag_count + total_size/4096'.
> So if there are that many slots free it is definitely ok to add the message.
> 

Hmm, that may work. By total_size, I assume you mean skb->len, so that calculation is based on an overhead of 1 non-optimally packed slot per frag. There'd still need to be a +1 for the GSO 'extra' though.
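
(As a sketch of the estimate being discussed - a hypothetical helper, assuming
a 4k slot and a single 'extra' slot for the GSO metadata:)

/* One non-optimally packed slot per frag, plus the data itself,
 * plus the GSO 'extra' slot when the packet carries GSO metadata. */
static unsigned int estimate_slots(unsigned int nr_frags,
				   unsigned int len, int gso)
{
	return nr_frags + len / 4096 + (gso ? 1 : 0);
}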

> I can see a more general problem for transmits.
> I believe a NAPI driver is supposed to indicate that it can't accept
> a tx packet in advance of being given a specific packet to transmit.
> This means it has to keep enough tx ring space for a worst case packet
> (which in some cases can be larger than 1+MAX_SKB_FRAGS) even though
> such a packet is unlikely.
> I would be tempted to save the skb that 'doesn't fit' within the driver
> rather than try to second guess the number of fragments the next packet
> will need.
> 

Well, we avoid that by having an internal queue and then only stopping the tx queue if the skb we were just handed will definitely not fit. TBH, though, I think this internal queue is problematic, as we require a context switch to get the skbs into the shared ring and I think the extra latency caused by this is hitting performance. If we do get rid of it then we do need to worry about the size of a maximal skb again.
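
(Reduced to a sketch, the check described above looks roughly like this; the
struct and helper names are invented for illustration, not driver code:)

struct ring_state {
	unsigned int free_slots;	/* slots left in the shared ring */
};

/* Packets go onto an internal queue first; the tx queue is only
 * stopped when the skb just handed to us definitely cannot fit. */
static int must_stop_queue(const struct ring_state *ring,
			   unsigned int slots_needed)
{
	return ring->free_slots < slots_needed;
}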

  Paul

> FWIW the USB3 'bulk' driver has the same problem, fragments can't cross
> 64k boundaries.
> 
> 	David
> 
> 
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:20                               ` Paul Durrant
  2014-03-28 10:35                                 ` David Laight
@ 2014-03-28 10:35                                 ` David Laight
  2014-03-28 10:42                                   ` Sander Eikelenboom
                                                     ` (3 more replies)
  1 sibling, 4 replies; 71+ messages in thread
From: David Laight @ 2014-03-28 10:35 UTC (permalink / raw)
  To: 'Paul Durrant', Sander Eikelenboom
  Cc: netdev, Wei Liu, Ian Campbell, xen-devel

From: Paul Durrant
> > A reasonable high estimate for the number of slots required for a specific
> > message is 'frag_count + total_size/4096'.
> > So if there are that many slots free it is definitely ok to add the message.
> >
> 
> Hmm, that may work. By total_size, I assume you mean skb->len, so that calculation is based on an
> overhead of 1 non-optimally packed slot per frag. There'd still need to be a +1 for the GSO 'extra'
> though.

Except I meant '2 * frag_count + size/4096' :-(

You have to assume that every fragment starts at n*4096-1 (so needs
at least two slots). A third slot is only needed for fragments
longer than 1+4096+2 - but an extra one is needed for every
4096 bytes after that.
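
(A sketch of that per-fragment count, assuming a 4k slot; the helper name is
hypothetical:)

/* Worst-case slots touched by one frag: ceil((offset % 4096 + size) / 4096).
 * A frag starting at offset n*4096-1 spills into a second slot as soon as
 * it is 2 bytes long, and each further 4096 bytes costs one more slot;
 * summed over a packet this is bounded by 2 * frag_count + size / 4096. */
static unsigned int frag_slots(unsigned int offset, unsigned int size)
{
	return (offset % 4096 + size + 4095) / 4096;
}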

	David

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:12                               ` Paul Durrant
  2014-03-28 10:36                                 ` Sander Eikelenboom
@ 2014-03-28 10:36                                 ` Sander Eikelenboom
  1 sibling, 0 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28 10:36 UTC (permalink / raw)
  To: Paul Durrant; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel


Friday, March 28, 2014, 11:12:26 AM, you wrote:

>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Sander Eikelenboom
>> Sent: 28 March 2014 10:00
>> To: Paul Durrant
>> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
>> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
>> clause from if statement
>> 
>> 
>> Friday, March 28, 2014, 10:47:20 AM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 28 March 2014 09:39
>> >> To: Paul Durrant
>> >> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
>> devel@lists.xen.org
>> >> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove
>> pointless
>> >> clause from if statement
>> >>
>> >>
>> >> Friday, March 28, 2014, 10:30:27 AM, you wrote:
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> >> Sent: 27 March 2014 19:23
>> >> >> To: Paul Durrant
>> >> >> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
>> >> devel@lists.xen.org
>> >> >> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove
>> >> pointless
>> >> >> clause from if statement
>> >> >>
>> >> >>
>> >> >> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
>> >> >>
>> >> >> > <big snip>
>> >> >>
>> >> >> >>> >>
>> >> >> >>> >> > So, it may be that the worse-case estimate is now too bad. In
>> >> the
>> >> >> case
>> >> >> >>> >> where it's failing for you it would be nice to know what the
>> >> estimate
>> >> >> was
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> > Ok, so we cannot be too pessimistic. In that case I don't see
>> there's
>> >> a
>> > > The behaviour of the Windows frontend is different to netfront; it tries to
>> > keep the shared ring as full as possible so the estimate could be as
>> > pessimistic as you like (as long as it doesn't exceed ring size ;-)) and you'd
>> > never see the lock-up. For some reason (best known to the originator of
>> the
>> > code I suspect) the Linux netfront driver limits the number of requests it
>> > posts into the shared ring leading to the possibility of lock-up in the case
>> > where the backend needs more slots than the frontend 'thinks' it should.
>> > But from what i read the ring size is determined by the frontend .. so that
>> PV
>> > driver should be able to guarantee that itself ..
>> >
>> 
>> > The ring size is 256 - that's baked in. The number of pending requests
>> > available to backend *is* determined by the frontend.
>> 
>> Ah ok, does it also reserve that space ?
>> (if so .. why not use it to allow multiple complete packets to be shoveled in)
>> 
>> > Which begs the question .. was that change of max_slots_needed
>> > calculation *needed* to prevent the problem you saw on "Windows Server
>> > 2008R2",
>> > or was that just changed for correctness ?
>> >
>> 
>> > It was changed for correctness. As I understand it, use of MAX_SKB_FRAGS
>> is
>> > incorrect if compound pages are in use as the page size is no longer the slot
>> > size. It's also wasteful to always wait for space for a maximal packet if the
>> > packet you have is smaller so the intention of the max estimate was that it
>> > should be at least the number of slots required but not excessive. I think
>> > you've proved that making such an estimate is just too hard and since we
>> don't
>> > want to fall back to the old dry-run style of slot counting (which meant you
>> > had two codepaths that *must* arrive at the same number - and they
>> didn't,
>> > which is why I was getting the lock-up with Windows guests) I think we
>> should
>> > just go with full-packing so that we don't need to estimate.
>> 
>> Ok i asked this question since the about-to-be-released 3.14 now
>> underestimates and
>> it causes a regression.
>> So if that part of your patches is not involved in fixing the stated problem /
>> regression i think
>> just that calculation change should be reverted to the MAX_SKB_FRAGS
>> variant again.
>> It's more wasteful (as it always has been) but that is better than incorrect and
>> inducing buffer overrun IMHO.

> But I'm not sure even that is correct. Are you?

Well i didn't see reports that it wasn't .. so empirical evidence says yes ..
that's why i asked you a few emails before if you would be able to test the revert of just this calculation with the "windows2k8" test case ..

Theoretically ..
- if we have MAX_SKB_FRAGS worth of FRAGS in an SKB .. is it still possible to have offsets ?
- if we have less .. do we trade the slot needed for the offset by having smaller frags or fewer frags ?
- because the complete packet size is limited and MAX_SKB_FRAGS already does a worst-case estimate for that, (64k / page_size) + 2, so that comes to needing 18 slots for a maxed-out 1-frag packet (see the sketch below).
  which should be less than the ring size, so the chance of stalling also shouldn't be there.

- Perhaps leaving only things like the compound page issue ?
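
(For reference, the fixed worst case mentioned in the list above, assuming 4k
pages; the macro name is hypothetical:)

/* What MAX_SKB_FRAGS worked out to with 4k pages in kernels of this
 * era: (65536 / 4096) + 2 = 18 slots for any single packet. */
#define OLD_WORST_CASE_SLOTS	((65536 / 4096) + 2)	/* == 18 */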

So i do think it is safe, and at least it's much more on the safe side than the change in "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c" made it.

That combined with:
a) the "Don't introduce new regressions policy"
b) that this part of the commit wasn't necessary to fix the problem at hand
c) correctness before trying to be less wasteful

I think this together should be a compelling argument for reverting that part of the commit and taking the time to work out and test something new for 3.15.


>> 
>> That would give time to think, revise and test this for 3.15.
>> 
>> BTW: if a slot is always 4k, should it check with PAGE_SIZE then on a lot of
>> occasions or just with the
>> hardcoded 4k slot size ? (at the moment you only have x86 dom0 so probably
>> the page_size==4k is guaranteed that way,
>> but nevertheless.)
>> 

> Well, it's 4k because that's the smallest x86 page size and that's what Xen uses in its ABI so I guess the slot size should really be acquired from Xen to be architecture agnostic.

>   Paul

>> > Paul
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:35                                 ` [Xen-devel] " David Laight
  2014-03-28 10:42                                   ` Sander Eikelenboom
@ 2014-03-28 10:42                                   ` Sander Eikelenboom
  2014-03-28 10:47                                     ` Paul Durrant
                                                       ` (3 more replies)
  2014-03-28 10:44                                   ` Paul Durrant
  2014-03-28 10:44                                   ` [Xen-devel] " Paul Durrant
  3 siblings, 4 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28 10:42 UTC (permalink / raw)
  To: David Laight
  Cc: 'Paul Durrant', netdev, Wei Liu, Ian Campbell, xen-devel


Friday, March 28, 2014, 11:35:58 AM, you wrote:

> From: Paul Durrant
>> > A reasonable high estimate for the number of slots required for a specific
>> > message is 'frag_count + total_size/4096'.
>> > So if there are that many slots free it is definitely ok to add the message.
>> >
>> 
>> Hmm, that may work. By total_size, I assume you mean skb->len, so that calculation is based on an
>> overhead of 1 non-optimally packed slot per frag. There'd still need to be a +1 for the GSO 'extra'
>> though.

> Except I meant '2 * frag_count + size/4096' :-(

> You have to assume that every fragment starts at n*4096-1 (so needs
> at least two slots). A third slot is only needed for fragments
> longer than 1+4096+2 - but an extra one is needed for every
> 4096 bytes after that.

He did that in his followup patch series .. that works .. for small packets.
But for larger ones it's an extremely wasteful estimate and it quickly gets larger than the MAX_SKB_FRAGS
we had before, and even too large, causing stalls. I tried doing this type of calculation with a CAP of
the old MAX_SKB_FRAGS calculation and that works.

However, since the calculated max_needed_slots grows so fast (most of the time unnecessarily; i put a printk in there and it was quite often more than 5 slots off),
it is also wasteful, and it uses a more complex calculation.
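
(A sketch of the capped calculation described above - a hypothetical helper,
assuming a 4k slot, a +1 for the GSO extra, and the old MAX_SKB_FRAGS-style
ceiling:)

/* Pessimistic per-frag bound, clamped to the old fixed worst case so
 * the estimate can never grow past what the ring can ever satisfy. */
static unsigned int capped_slots(unsigned int nr_frags, unsigned int len)
{
	unsigned int est = 2 * nr_frags + len / 4096 + 1;	/* +1 for GSO */
	unsigned int cap = (65536 / 4096) + 2;		/* old MAX_SKB_FRAGS */

	return est < cap ? est : cap;
}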


>         David

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:35                                 ` [Xen-devel] " David Laight
                                                     ` (2 preceding siblings ...)
  2014-03-28 10:44                                   ` Paul Durrant
@ 2014-03-28 10:44                                   ` Paul Durrant
  3 siblings, 0 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-28 10:44 UTC (permalink / raw)
  To: David Laight, Sander Eikelenboom; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: David Laight [mailto:David.Laight@ACULAB.COM]
> Sent: 28 March 2014 10:36
> To: Paul Durrant; Sander Eikelenboom
> Cc: netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-devel@lists.xen.org
> Subject: RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> From: Paul Durrant
> > > A reasonable high estimate for the number of slots required for a specific
> > > message is 'frag_count + total_size/4096'.
> > > So if there are that many slots free it is definitely ok to add the message.
> > >
> >
> > Hmm, that may work. By total_size, I assume you mean skb->len, so that
> calculation is based on an
> > overhead of 1 non-optimally packed slot per frag. There'd still need to be a
> +1 for the GSO 'extra'
> > though.
> 
> Except I meant '2 * frag_count + size/4096' :-(
> 

And that's the pessimal estimation that's failing for Sander :-(

  Paul

> You have to assume that every fragment starts at n*4096-1 (so needs
> at least two slots). A third slot is only needed for fragments
> longer than 1+4096+2 - but an extra one is needed for every
> 4096 bytes after that.
> 
> 	David
> 
> 

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:42                                   ` [Xen-devel] " Sander Eikelenboom
  2014-03-28 10:47                                     ` Paul Durrant
@ 2014-03-28 10:47                                     ` Paul Durrant
  2014-03-28 10:53                                       ` Sander Eikelenboom
  2014-03-28 10:53                                       ` [Xen-devel] " Sander Eikelenboom
  2014-03-28 11:11                                     ` David Laight
  2014-03-28 11:11                                     ` David Laight
  3 siblings, 2 replies; 71+ messages in thread
From: Paul Durrant @ 2014-03-28 10:47 UTC (permalink / raw)
  To: Sander Eikelenboom, David Laight; +Cc: netdev, Wei Liu, Ian Campbell, xen-devel

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Sander Eikelenboom
> Sent: 28 March 2014 10:43
> To: David Laight
> Cc: Paul Durrant; netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
> devel@lists.xen.org
> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> 
> Friday, March 28, 2014, 11:35:58 AM, you wrote:
> 
> > From: Paul Durrant
> >> > A reasonable high estimate for the number of slots required for a
> specific
> >> > message is 'frag_count + total_size/4096'.
> >> > So if there are that many slots free it is definitely ok to add the message.
> >> >
> >>
> >> Hmm, that may work. By total_size, I assume you mean skb->len, so that
> calculation is based on an
> >> overhead of 1 non-optimally packed slot per frag. There'd still need to be
> a +1 for the GSO 'extra'
> >> though.
> 
> > Except I meant '2 * frag_count + size/4096' :-(
> 
> > You have to assume that every fragment starts at n*4096-1 (so needs
> > at least two slots). A third slot is only needed for fragments
> > longer than 1+4096+2 - but an extra one is needed for every
> > 4096 bytes after that.
> 
> He did that in his followup patch series .. that works .. for small packets.
> But for larger ones it's an extremely wasteful estimate and it quickly gets
> larger than the MAX_SKB_FRAGS
> we had before, and even too large, causing stalls. I tried doing this type of
> calculation with a CAP of
> the old MAX_SKB_FRAGS calculation and that works.
> 

Given that works for you and caps the estimate at the old constant value, I guess that's the modification to go for to handle this regression. I'll try to come up with something better for net-next.

  Paul

> However, since the calculated max_needed_slots grows so fast (most of the
> time unnecessarily; i put a printk in there and it was quite often more
> than 5 slots off),
> it is also wasteful, and it uses a more complex calculation.
> 
> 
> >         David
> 
> 
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:47                                     ` [Xen-devel] " Paul Durrant
  2014-03-28 10:53                                       ` Sander Eikelenboom
@ 2014-03-28 10:53                                       ` Sander Eikelenboom
  1 sibling, 0 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28 10:53 UTC (permalink / raw)
  To: Paul Durrant; +Cc: David Laight, netdev, Wei Liu, Ian Campbell, xen-devel


Friday, March 28, 2014, 11:47:04 AM, you wrote:

>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>> owner@vger.kernel.org] On Behalf Of Sander Eikelenboom
>> Sent: 28 March 2014 10:43
>> To: David Laight
>> Cc: Paul Durrant; netdev@vger.kernel.org; Wei Liu; Ian Campbell; xen-
>> devel@lists.xen.org
>> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
>> clause from if statement
>> 
>> 
>> Friday, March 28, 2014, 11:35:58 AM, you wrote:
>> 
>> > From: Paul Durrant
>> >> > A reasonable high estimate for the number of slots required for a
>> specific
>> >> > message is 'frag_count + total_size/4096'.
>> >> > So if there are that many slots free it is definitely ok to add the message.
>> >> >
>> >>
>> >> Hmm, that may work. By total_size, I assume you mean skb->len, so that
>> calculation is based on an
>> >> overhead of 1 non-optimally packed slot per frag. There'd still need to be
>> a +1 for the GSO 'extra'
>> >> though.
>> 
>> > Except I meant '2 * frag_count + size/4096' :-(
>> 
>> > You have to assume that every fragment starts at n*4096-1 (so needs
>> > at least two slots). A third slot is only needed for fragments
>> > longer than 1+4096+2 - but an extra one is needed for every
>> > 4096 bytes after that.
>> 
>> He did that in his followup patch series .. that works .. for small packets.
>> But for larger ones it's an extremely wasteful estimate and it quickly gets
>> larger than the MAX_SKB_FRAGS
>> we had before, and even too large, causing stalls. I tried doing this type of
>> calculation with a CAP of
>> the old MAX_SKB_FRAGS calculation and that works.
>> 

> Given that works for you and caps the estimate at the old constant value I guess that's the modification to go for to handle this regression. I'll try to come up with something better for net-next.

OK, send the patch and i will retest immediately, just to make sure.

>   Paul

>> However, since the calculated max_needed_slots grows so fast (most of the
>> time unnecessarily; i put a printk in there and it was quite often more
>> than 5 slots off),
>> it is also wasteful, and it uses a more complex calculation.
>> 
>> 
>> >         David
>> 
>> 
>> 
>> 
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 10:42                                   ` [Xen-devel] " Sander Eikelenboom
  2014-03-28 10:47                                     ` Paul Durrant
  2014-03-28 10:47                                     ` [Xen-devel] " Paul Durrant
@ 2014-03-28 11:11                                     ` David Laight
  2014-03-28 11:35                                       ` Sander Eikelenboom
  2014-03-28 11:35                                       ` Sander Eikelenboom
  2014-03-28 11:11                                     ` David Laight
  3 siblings, 2 replies; 71+ messages in thread
From: David Laight @ 2014-03-28 11:11 UTC (permalink / raw)
  To: 'Sander Eikelenboom'
  Cc: 'Paul Durrant', netdev, Wei Liu, Ian Campbell, xen-devel

From: Sander Eikelenboom
> Friday, March 28, 2014, 11:35:58 AM, you wrote:
> 
> > From: Paul Durrant
> >> > A reasonable high estimate for the number of slots required for a specific
> >> > message is 'frag_count + total_size/4096'.
> >> > So if there are that many slots free it is definitely ok to add the message.
> >> >
> >>
> >> Hmm, that may work. By total_size, I assume you mean skb->len, so that calculation is based on an
> >> overhead of 1 non-optimally packed slot per frag. There'd still need to be a +1 for the GSO 'extra'
> >> though.
> 
> > Except I meant '2 * frag_count + size/4096' :-(
> 
> > You have to assume that every fragment starts at n*4096-1 (so needs
> > at least two slots). A third slot is only needed for fragments
> > longer than 1+4096+2 - but an extra one is needed for every
> > 4096 bytes after that.
> 
> He did that in his follow-up patch series .. that works .. for small packets.
> But for larger ones it's an extremely wasteful estimate: it quickly gets
> larger than the MAX_SKB_FRAGS-based estimate we had before, and even too
> large, causing stalls. I tried doing this type of calculation with a CAP of
> the old MAX_SKB_FRAGS calculation and that works.

I'm confused (easily done).
If you are trying to guess at the number of packets to queue while waiting
for the thread that sets things up to run, then you want an underestimate,
since any packets that can't actually be transferred will stay on the queue.
A suitable estimate might be max(frag_count, size/4096).

The '2*frag_count + size/4096' is right for checking if there is enough
space for the current packet - since it gets corrected as soon as the
packet is transferred to the ring slots.

	David
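
As a minimal C sketch of the two estimates above (illustrative only; the
helper names are assumptions, not code from this thread):

#define SLOT_SIZE 4096	/* one ring slot maps one 4096-byte page */

/* Lower bound, matching max(frag_count, size/4096) above: suitable for
 * deciding how much to queue, since packets that don't fit simply stay
 * on the queue. */
static unsigned int min_slots_estimate(unsigned int frag_count,
				       unsigned int total_size)
{
	unsigned int by_size = total_size / SLOT_SIZE;

	return frag_count > by_size ? frag_count : by_size;
}

/* Upper bound, matching 2*frag_count + size/4096 above: assumes every
 * fragment may start at offset n*4096-1 and therefore straddle a slot
 * boundary.  The +1 for the GSO 'extra' slot mentioned earlier in the
 * thread would still come on top of this. */
static unsigned int max_slots_estimate(unsigned int frag_count,
				       unsigned int total_size)
{
	return 2 * frag_count + total_size / SLOT_SIZE;
}

The lower bound errs towards queuing too little, the upper bound towards
reserving too much; the rest of the thread is about how much overestimation
the ring can tolerate.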

^ permalink raw reply	[flat|nested] 71+ messages in thread


* Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement
  2014-03-28 11:11                                     ` David Laight
@ 2014-03-28 11:35                                       ` Sander Eikelenboom
  2014-03-28 11:35                                       ` Sander Eikelenboom
  1 sibling, 0 replies; 71+ messages in thread
From: Sander Eikelenboom @ 2014-03-28 11:35 UTC (permalink / raw)
  To: David Laight
  Cc: 'Paul Durrant', netdev, Wei Liu, Ian Campbell, xen-devel


Friday, March 28, 2014, 12:11:24 PM, you wrote:

> From: Sander Eikelenboom
>> Friday, March 28, 2014, 11:35:58 AM, you wrote:
>> 
>> > From: Paul Durrant
>> >> > A reasonable high estimate for the number of slots required for a specific
>> >> > message is 'frag_count + total_size/4096'.
>> >> > So if there are that many slots free it is definitely ok to add the message.
>> >> >
>> >>
>> >> Hmm, that may work. By total_size, I assume you mean skb->len, so that calculation is based on an
>> >> overhead of 1 non-optimally packed slot per frag. There'd still need to be a +1 for the GSO 'extra'
>> >> though.
>> 
>> > Except I meant '2 * frag_count + size/4096' :-(
>> 
>> > You have to assume that every fragment starts at n*4096-1 (so needs
>> > at least two slots). A third slot is only needed for fragments
>> > longer than 1+4096+2 - but an extra one is needed for every
>> > 4096 bytes after that.
>> 
>> He did that in his follow-up patch series .. that works .. for small packets.
>> But for larger ones it's an extremely wasteful estimate: it quickly gets
>> larger than the MAX_SKB_FRAGS-based estimate we had before, and even too
>> large, causing stalls. I tried doing this type of calculation with a CAP of
>> the old MAX_SKB_FRAGS calculation and that works.

> I'm confused (easily done).
> If you are trying to guess at the number of packets to queue while waiting
> for the thread that sets things up to run, then you want an underestimate,
> since any packets that can't actually be transferred will stay on the queue.

We want to overestimate max_slots_needed, so that if we check and the ring
hasn't got that many slots free, we don't dequeue the SKB and instead wait
until more space becomes available on the ring.

This is done with a very cheap minimum estimate and a slightly more costly
maximum estimate.

The maximum estimate was changed (in said commit) and believed to be the
worst case. But it didn't take the offset into account, so it could lead to
an underestimation, which then leads to trying to overrun the ring. That
fails in the grant copy code: the grant reference is bogus, so the
hypervisor refuses to do the copy.
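
In code terms, the check being described is roughly the following (a sketch
with illustrative names, not the exact 3.14 source):

while ((skb = skb_peek(&vif->rx_queue)) != NULL) {
	/* the estimate at issue: it must never undershoot */
	unsigned int needed = max_slots_estimate_for(skb);

	if (!rx_ring_slots_available(vif, needed))
		break;	/* keep the skb queued until the frontend frees slots */

	__skb_unlink(skb, &vif->rx_queue);
	/* ... build the grant copy ops; if 'needed' undershot, the ops
	 * run past the granted slots and the hypervisor rejects the
	 * bogus grant reference, as described above ... */
}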

But if you do take the offset into account in the worst case, you end up
with a gross overestimation that could even be larger than the ring size,
leading to a stall, since the packet can never be processed: the ring can't
possibly free more slots than it has. But I think the old calculation is the
theoretical max (due to the limit on total packet size, not *all* frags can
have an offset and be so large that they would cost extra slots).

So you could use the old calc as a CAP, so you don't overestimate to the
extent that you would stall.
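
A minimal sketch of that capped estimate (the exact form of the old
MAX_SKB_FRAGS-based bound is an assumption here; min() is the kernel macro):

unsigned int worst = 2 * skb_shinfo(skb)->nr_frags +
		     DIV_ROUND_UP(skb->len, PAGE_SIZE);
/* assumed old-style bound; per the packet-size argument above it is
 * the theoretical max, so capping at it cannot starve the packet */
unsigned int cap = DIV_ROUND_UP(skb->len, PAGE_SIZE) + MAX_SKB_FRAGS;
unsigned int max_slots_needed = min(worst, cap);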

^ permalink raw reply	[flat|nested] 71+ messages in thread


end of thread, other threads:[~2014-03-28 11:36 UTC | newest]

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-27 12:56 [PATCH net v2 0/3] xen-netback: fix rx slot estimation Paul Durrant
2014-03-27 12:56 ` [PATCH net v2 1/3] xen-netback: remove pointless clause from if statement Paul Durrant
2014-03-27 12:56 ` Paul Durrant
2014-03-27 13:45   ` Sander Eikelenboom
2014-03-27 13:45   ` Sander Eikelenboom
2014-03-27 13:54     ` Paul Durrant
2014-03-27 13:54     ` Paul Durrant
2014-03-27 14:02       ` Sander Eikelenboom
2014-03-27 14:09         ` Paul Durrant
2014-03-27 14:29           ` Sander Eikelenboom
2014-03-27 14:29           ` Sander Eikelenboom
2014-03-27 14:38             ` Paul Durrant
2014-03-27 14:38             ` Paul Durrant
2014-03-27 16:46           ` Sander Eikelenboom
2014-03-27 16:54             ` Paul Durrant
2014-03-27 17:15               ` Sander Eikelenboom
2014-03-27 17:26                 ` Paul Durrant
2014-03-27 17:26                 ` Paul Durrant
2014-03-27 18:34                   ` Sander Eikelenboom
2014-03-27 19:22                     ` [Xen-devel] " Sander Eikelenboom
2014-03-28  9:30                       ` Paul Durrant
2014-03-28  9:30                       ` [Xen-devel] " Paul Durrant
2014-03-28  9:39                         ` Sander Eikelenboom
2014-03-28  9:47                           ` Paul Durrant
2014-03-28  9:59                             ` Sander Eikelenboom
2014-03-28 10:12                               ` Paul Durrant
2014-03-28 10:36                                 ` Sander Eikelenboom
2014-03-28 10:36                                 ` [Xen-devel] " Sander Eikelenboom
2014-03-28 10:12                               ` Paul Durrant
2014-03-28  9:59                             ` Sander Eikelenboom
2014-03-28 10:01                             ` David Laight
2014-03-28 10:01                             ` [Xen-devel] " David Laight
2014-03-28 10:20                               ` Paul Durrant
2014-03-28 10:35                                 ` David Laight
2014-03-28 10:35                                 ` [Xen-devel] " David Laight
2014-03-28 10:42                                   ` Sander Eikelenboom
2014-03-28 10:42                                   ` [Xen-devel] " Sander Eikelenboom
2014-03-28 10:47                                     ` Paul Durrant
2014-03-28 10:47                                     ` [Xen-devel] " Paul Durrant
2014-03-28 10:53                                       ` Sander Eikelenboom
2014-03-28 10:53                                       ` [Xen-devel] " Sander Eikelenboom
2014-03-28 11:11                                     ` David Laight
2014-03-28 11:35                                       ` Sander Eikelenboom
2014-03-28 11:35                                       ` Sander Eikelenboom
2014-03-28 11:11                                     ` David Laight
2014-03-28 10:44                                   ` Paul Durrant
2014-03-28 10:44                                   ` [Xen-devel] " Paul Durrant
2014-03-28 10:20                               ` Paul Durrant
2014-03-28  9:47                           ` Paul Durrant
2014-03-28  9:39                         ` Sander Eikelenboom
2014-03-27 19:22                     ` Sander Eikelenboom
2014-03-28  0:55                     ` Sander Eikelenboom
2014-03-28  0:55                     ` [Xen-devel] " Sander Eikelenboom
2014-03-28  9:36                       ` Paul Durrant
2014-03-28  9:36                       ` [Xen-devel] " Paul Durrant
2014-03-28  9:46                         ` Sander Eikelenboom
2014-03-28  9:46                         ` [Xen-devel] " Sander Eikelenboom
2014-03-27 18:34                   ` Sander Eikelenboom
2014-03-27 17:15               ` Sander Eikelenboom
2014-03-27 16:54             ` Paul Durrant
2014-03-27 16:46           ` Sander Eikelenboom
2014-03-27 14:09         ` Paul Durrant
2014-03-27 14:02       ` Sander Eikelenboom
2014-03-27 14:00     ` Paul Durrant
2014-03-27 14:05       ` Sander Eikelenboom
2014-03-27 14:05       ` Sander Eikelenboom
2014-03-27 14:00     ` Paul Durrant
2014-03-27 12:56 ` [PATCH net v2 2/3] xen-netback: worse-case estimate in xenvif_rx_action is underestimating Paul Durrant
2014-03-27 12:56 ` Paul Durrant
2014-03-27 12:56 ` [PATCH net v2 3/3] xen-netback: BUG_ON in xenvif_rx_action() not catching overflow Paul Durrant
2014-03-27 12:56 ` Paul Durrant
