* [PATCH] fix netfront alloc_page error handling bug, need to raise up rx_refill timer if rx request buffer not big enough for backend
@ 2013-11-13 16:15 Ma JieYue
  2013-11-13 17:27 ` Wei Liu
  2013-11-14  2:39 ` annie li
  0 siblings, 2 replies; 5+ messages in thread
From: Ma JieYue @ 2013-11-13 16:15 UTC (permalink / raw)
  To: xen-devel; +Cc: Ma JieYue

Hi,

We found a netfront driver bug when a user downloaded a large 8 GB file using wget, in a CentOS guest on Xen. The virtual machine has only 512 MB of memory, and the system was under high IO stress while wget was running. After some time, wget hung and netfront could not receive any packets from netback at all. We also observed the netback vif device dropping packets at that time.

By debugging we found lots of alloc_page errors occurring in the netfront driver, in function xennet_alloc_rx_buffers, after wget had been running for a while. The alloc_page errors did not stop until wget hung for good. After that, the rx IO ring parameters from the frontend and backend drivers indicated that netfront did not provide enough rx request buffers to netback, so the netback driver considered the ring full and dropped packets.

The rx IO rings looked as below:

[netback]
rsp_prod_pvt 666318, req_cons 666318, nr_ents 256
sring req_prod 666337, req_event 666338, rsp_prod 666318, rsp_event 666319

[netfront]
rsp_cons 666318, req_prod_pvt 666337
sring req_prod 666337, req_event 666338, rsp_prod 666318, rsp_event 666319

So the rx request ring held only 19 requests for the backend, while the backend needs at least 20 (MAX_SKB_FRAGS + 2) request buffers to send a single skb to netfront when SG and TSO are enabled.

We also gathered some info about the alloc_page failures, for the last 6 occurrences before netfront/netback stopped communicating. At the last alloc_page failure, the rx request ring had only 18 request buffers left, and the driver started to allocate 46 pages to refill it. Unfortunately, it failed to allocate the second page, so after that round it had refilled only one page and the rx request ring had only 19 request buffers left.

The debug log looked as below:

alloc_rx_buffers, alloc_page failed at 2 page, try to alloc 26 pages, rx_target 64, req_prod 552445, rsp_cons 552407
alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 36 pages, rx_target 64, req_prod 552447, rsp_cons 552419
alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 46 pages, rx_target 64, req_prod 552447, rsp_cons 552429
alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 46 pages, rx_target 64, req_prod 552447, rsp_cons 552429
alloc_rx_buffers, alloc_page failed at 4 page, try to alloc 28 pages, rx_target 64, req_prod 639240, rsp_cons 639204
alloc_rx_buffers, alloc_page failed at 1 page, try to alloc 46 pages, rx_target 64, req_prod 639244, rsp_cons 639226

The problem is that, if the alloc_page failure does not occur at the first page, the rx_refill_timer is not set, even though it would later help extend the rx request buffer. So our fix is to set the rx_refill_timer whenever the rx request buffer may not be big enough. The patch worked well during our stress test.


Signed-off-by: Ma JieYue <jieyue.majy@aliyun-inc.com>
---
 drivers/net/xen-netfront.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 1db10141..680e959 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -243,10 +243,16 @@ static void xennet_alloc_rx_buffers(struct net_device *dev)
 	unsigned long pfn;
 	void *vaddr;
 	struct xen_netif_rx_request *req;
+	int needed;
 
 	if (unlikely(!netif_carrier_ok(dev)))
 		return;
 
+	if (dev->features & (NETIF_F_SG|NETIF_F_TSO))
+		needed = MAX_SKB_FRAGS + 2;
+	else
+		needed = 1;
+
 	/*
 	 * Allocate skbuffs greedily, even though we batch updates to the
 	 * receive ring. This creates a less bursty demand on the memory
@@ -327,6 +333,9 @@ no_skb:
 
 	/* Above is a suitable barrier to ensure backend will see requests. */
 	np->rx.req_prod_pvt = req_prod + i;
+	if (i && (np->rx.req_prod_pvt - np->rx.sring->rsp_prod < needed))
+		mod_timer(&np->rx_refill_timer, jiffies + (HZ/10));
+
  push:
 	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->rx, notify);
 	if (notify)
-- 
1.8.4


* Re: [PATCH] fix netfront alloc_page error handling bug, need to raise up rx_refill timer if rx request buffer not big enough for backend
  2013-11-13 16:15 [PATCH] fix netfront alloc_page error handling bug, need to raise up rx_refill timer if rx request buffer not big enough for backend Ma JieYue
@ 2013-11-13 17:27 ` Wei Liu
  2013-11-14  2:39 ` annie li
  1 sibling, 0 replies; 5+ messages in thread
From: Wei Liu @ 2013-11-13 17:27 UTC (permalink / raw)
  To: Ma JieYue; +Cc: wei.liu2, xen-devel

First thing, a network-driver-related patch should also be sent to the
netdev@ list. As a bug fix it should be prefixed with [PATCH net],
targeting the net tree.

A good reading on how netdev@ works can be found in kernel's
documentation directory, called netdev-FAQ.txt.

Second, the analysis is good, but the commit message is not in the
expected form. Please have a look at SubmittingPatches.txt for how
commit messages should be written.

On Thu, Nov 14, 2013 at 12:15:48AM +0800, Ma JieYue wrote:
> Hi,
> 
> We found a netfront driver bug when a user downloaded a large 8 GB file using wget, in a CentOS guest on Xen. The virtual machine has only 512 MB of memory, and the system was under high IO stress while wget was running. After some time, wget hung and netfront could not receive any packets from netback at all. We also observed the netback vif device dropping packets at that time.
> 
> By debugging we found lots of alloc_page errors occurring in the netfront driver, in function xennet_alloc_rx_buffers, after wget had been running for a while. The alloc_page errors did not stop until wget hung for good. After that, the rx IO ring parameters from the frontend and backend drivers indicated that netfront did not provide enough rx request buffers to netback, so the netback driver considered the ring full and dropped packets.
> 
> The rx IO rings looked as below:
> 
> [netback]
> rsp_prod_pvt 666318, req_cons 666318, nr_ents 256
> sring req_prod 666337, req_event 666338, rsp_prod 666318, rsp_event 666319
> 
> [netfront]
> rsp_cons 666318, req_prod_pvt 666337
> sring req_prod 666337, req_event 666338, rsp_prod 666318, rsp_event 666319
> 
> So the rx request ring held only 19 requests for the backend, while the backend needs at least 20 (MAX_SKB_FRAGS + 2) request buffers to send a single skb to netfront when SG and TSO are enabled.
> 

FWIW newer kernel has MAX_SKB_FRAGS = 17, not 18.

> We also gathered some info about the alloc_page failures, for the last 6 occurrences before netfront/netback stopped communicating. At the last alloc_page failure, the rx request ring had only 18 request buffers left, and the driver started to allocate 46 pages to refill it. Unfortunately, it failed to allocate the second page, so after that round it had refilled only one page and the rx request ring had only 19 request buffers left.
> 

I just looked at the code snippet; would it be simpler to move the
mod_timer before "goto refill" in that part?

> The debug log looked as below:
> 
> alloc_rx_buffers, alloc_page failed at 2 page, try to alloc 26 pages, rx_target 64, req_prod 552445, rsp_cons 552407
> alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 36 pages, rx_target 64, req_prod 552447, rsp_cons 552419
> alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 46 pages, rx_target 64, req_prod 552447, rsp_cons 552429
> alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 46 pages, rx_target 64, req_prod 552447, rsp_cons 552429
> alloc_rx_buffers, alloc_page failed at 4 page, try to alloc 28 pages, rx_target 64, req_prod 639240, rsp_cons 639204
> alloc_rx_buffers, alloc_page failed at 1 page, try to alloc 46 pages, rx_target 64, req_prod 639244, rsp_cons 639226
> 
> The problem is that, if the alloc_page failure does not occur at the first page, the rx_refill_timer is not set, even though it would later help extend the rx request buffer. So our fix is to set the rx_refill_timer whenever the rx request buffer may not be big enough. The patch worked well during our stress test.
> 
> 
> Signed-off-by: Ma JieYue <jieyue.majy@aliyun-inc.com>
> ---
>  drivers/net/xen-netfront.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 1db10141..680e959 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -243,10 +243,16 @@ static void xennet_alloc_rx_buffers(struct net_device *dev)
>  	unsigned long pfn;
>  	void *vaddr;
>  	struct xen_netif_rx_request *req;
> +	int needed;
>  
>  	if (unlikely(!netif_carrier_ok(dev)))
>  		return;
>  
> +	if (dev->features & (NETIF_F_SG|NETIF_F_TSO))

This is the guest receive path, why does NETIF_F_TSO matter?

Wei.

> +		needed = MAX_SKB_FRAGS + 2;
> +	else
> +		needed = 1;
> +
>  	/*
>  	 * Allocate skbuffs greedily, even though we batch updates to the
>  	 * receive ring. This creates a less bursty demand on the memory
> @@ -327,6 +333,9 @@ no_skb:
>  
>  	/* Above is a suitable barrier to ensure backend will see requests. */
>  	np->rx.req_prod_pvt = req_prod + i;
> +	if (i && (np->rx.req_prod_pvt - np->rx.sring->rsp_prod < needed))
> +		mod_timer(&np->rx_refill_timer, jiffies + (HZ/10));
> +
>   push:
>  	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->rx, notify);
>  	if (notify)
> -- 
> 1.8.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH] fix netfront alloc_page error handling bug, need to raise up rx_refill timer if rx request buffer not big enough for backend
  2013-11-13 16:15 [PATCH] fix netfront alloc_page error handling bug, need to raise up rx_refill timer if rx request buffer not big enough for backend Ma JieYue
  2013-11-13 17:27 ` Wei Liu
@ 2013-11-14  2:39 ` annie li
  2013-11-14 11:38   ` David Vrabel
  1 sibling, 1 reply; 5+ messages in thread
From: annie li @ 2013-11-14  2:39 UTC (permalink / raw)
  To: Ma JieYue; +Cc: xen-devel


On 2013-11-14 0:15, Ma JieYue wrote:
> Hi,
>
> We found a netfront driver bug when a user downloaded a large 8 GB file using wget, in a CentOS guest on Xen. The virtual machine has only 512 MB of memory, and the system was under high IO stress while wget was running. After some time, wget hung and netfront could not receive any packets from netback at all. We also observed the netback vif device dropping packets at that time.
>
> By debugging we found lots of alloc_page errors occurring in the netfront driver, in function xennet_alloc_rx_buffers, after wget had been running for a while. The alloc_page errors did not stop until wget hung for good. After that, the rx IO ring parameters from the frontend and backend drivers indicated that netfront did not provide enough rx request buffers to netback, so the netback driver considered the ring full and dropped packets.
>
> The rx IO rings looked as below:
>
> [netback]
> rsp_prod_pvt 666318, req_cons 666318, nr_ents 256
> sring req_prod 666337, req_event 666338, rsp_prod 666318, rsp_event 666319
>
> [netfront]
> rsp_cons 666318, req_prod_pvt 666337
> sring req_prod 666337, req_event 666338, rsp_prod 666318, rsp_event 666319
>
> So the rx request ring held only 19 requests for the backend, while the backend needs at least 20 (MAX_SKB_FRAGS + 2) request buffers to send a single skb to netfront when SG and TSO are enabled.
>
> We also gathered some info about the alloc_page failures, for the last 6 occurrences before netfront/netback stopped communicating. At the last alloc_page failure, the rx request ring had only 18 request buffers left, and the driver started to allocate 46 pages to refill it. Unfortunately, it failed to allocate the second page, so after that round it had refilled only one page and the rx request ring had only 19 request buffers left.
>
> The debug log looked as below:
>
> alloc_rx_buffers, alloc_page failed at 2 page, try to alloc 26 pages, rx_target 64, req_prod 552445, rsp_cons 552407
> alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 36 pages, rx_target 64, req_prod 552447, rsp_cons 552419
> alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 46 pages, rx_target 64, req_prod 552447, rsp_cons 552429
> alloc_rx_buffers, alloc_page failed at 0 page, try to alloc 46 pages, rx_target 64, req_prod 552447, rsp_cons 552429
> alloc_rx_buffers, alloc_page failed at 4 page, try to alloc 28 pages, rx_target 64, req_prod 639240, rsp_cons 639204
> alloc_rx_buffers, alloc_page failed at 1 page, try to alloc 46 pages, rx_target 64, req_prod 639244, rsp_cons 639226
>
> The problem is that, if the alloc_page failure does not occur at the first page, the rx_refill_timer is not set, even though it would later help extend the rx request buffer. So our fix is to set the rx_refill_timer whenever the rx request buffer may not be big enough. The patch worked well during our stress test.

This issue happens on a VM with low memory, where netfront fails to
create new requests. That easily makes netback fail to create RX
responses because there are not enough RX requests, hence the condition
np->rx.req_prod_pvt - np->rx.sring->rsp_prod < needed is easily reached.
With this in mind, would it be easier to fix this issue by moving the
mod_timer call
             mod_timer(&np->rx_refill_timer,
                   jiffies + (HZ/10));
above
             if (i != 0)
                 goto refill;

in xennet_alloc_rx_buffers?

Thanks
Annie
>
>
> Signed-off-by: Ma JieYue <jieyue.majy@aliyun-inc.com>
> ---
>   drivers/net/xen-netfront.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
>
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 1db10141..680e959 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -243,10 +243,16 @@ static void xennet_alloc_rx_buffers(struct net_device *dev)
>   	unsigned long pfn;
>   	void *vaddr;
>   	struct xen_netif_rx_request *req;
> +	int needed;
>   
>   	if (unlikely(!netif_carrier_ok(dev)))
>   		return;
>   
> +	if (dev->features & (NETIF_F_SG|NETIF_F_TSO))
> +		needed = MAX_SKB_FRAGS + 2;
> +	else
> +		needed = 1;
> +
>   	/*
>   	 * Allocate skbuffs greedily, even though we batch updates to the
>   	 * receive ring. This creates a less bursty demand on the memory
> @@ -327,6 +333,9 @@ no_skb:
>   
>   	/* Above is a suitable barrier to ensure backend will see requests. */
>   	np->rx.req_prod_pvt = req_prod + i;
> +	if (i && (np->rx.req_prod_pvt - np->rx.sring->rsp_prod < needed))
> +		mod_timer(&np->rx_refill_timer, jiffies + (HZ/10));
> +
>    push:
>   	RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&np->rx, notify);
>   	if (notify)


* Re: [PATCH] fix netfront alloc_page error handling bug, need to raise up rx_refill timer if rx request buffer not big enough for backend
  2013-11-14  2:39 ` annie li
@ 2013-11-14 11:38   ` David Vrabel
  2013-11-15  3:09     ` annie li
  0 siblings, 1 reply; 5+ messages in thread
From: David Vrabel @ 2013-11-14 11:38 UTC (permalink / raw)
  To: annie li; +Cc: Ma JieYue, xen-devel

On 14/11/13 02:39, annie li wrote:
> 
> With this in mind, would it be easier to fix this issue by moving the
> mod_timer call
>             mod_timer(&np->rx_refill_timer,
>                   jiffies + (HZ/10));
> above
>             if (i != 0)
>                 goto refill;
> 
> in xennet_alloc_rx_buffers?

Yes, always setting the timer if we didn't reach the target is a better
fix I think.

However, we probably want to reduce the fill target if we are under
memory pressure.

Something like:

no_skb:
               /* Memory pressure, reduce fill target. */
	       if (--np->rx_target < np->rx_target_min)
                   np->rx_target = np->rx_target_min;
               mod_timer(...);
               goto refill;

perhaps?

David


* Re: [PATCH] fix netfront alloc_page error handling bug, need to raise up rx_refill timer if rx request buffer not big enough for backend
  2013-11-14 11:38   ` David Vrabel
@ 2013-11-15  3:09     ` annie li
  0 siblings, 0 replies; 5+ messages in thread
From: annie li @ 2013-11-15  3:09 UTC (permalink / raw)
  To: David Vrabel; +Cc: Ma JieYue, xen-devel


On 2013-11-14 19:38, David Vrabel wrote:
> On 14/11/13 02:39, annie li wrote:
>> With this in mind, would it be easier to fix this issue by moving the
>> mod_timer call
>>              mod_timer(&np->rx_refill_timer,
>>                    jiffies + (HZ/10));
>> above
>>              if (i != 0)
>>                  goto refill;
>>
>> in xennet_alloc_rx_buffers?
> Yes, always setting the timer if we didn't reach the target is a better
> fix I think.
>
> However, we probably want to reduce the fill target if we are under
> memory pressure.
>
> Something like:
>
> no_skb:
>                 /* Memory pressure, reduce fill target. */
> 	       if (--np->rx_target < np->rx_target_min)
>                     np->rx_target = np->rx_target_min;
>                 mod_timer(...);
>                 goto refill;
>
> perhaps?

Sounds reasonable.
When less memory is available, reduce rx_target to decrease the memory 
requirement. rx_target can then be adjusted back up once memory is 
sufficient again, by the following code:

     if (((req_prod - np->rx.sring->rsp_prod) < (np->rx_target / 4)) &&
         ((np->rx_target *= 2) > np->rx_max_target))
         np->rx_target = np->rx_max_target;

Thanks
Annie
>
> David

