linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Vineeth Remanan Pillai <vineethp@amazon.com>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Wei Liu <wei.liu2@citrix.com>,
	Paul Durrant <paul.durrant@citrix.com>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH v2] xen-netfront: Fix Rx stall during network stress and OOM
Date: Sun, 29 Jan 2017 18:09:21 -0500	[thread overview]
Message-ID: <38ccfaea-0a65-a6f3-c19a-e6f9c0d4ef76@oracle.com> (raw)
In-Reply-To: <66b10c64-936a-8001-6855-2ff1ed626642@amazon.com>



On 01/19/2017 11:35 AM, Vineeth Remanan Pillai wrote:
> From: Vineeth Remanan Pillai <vineethp@amazon.com>
>
> During an OOM scenario, request slots could not be created as skb
> allocation fails. So the netback cannot pass in packets and netfront
> wrongly assumes that there is no more work to be done and it disables
> polling. This causes Rx to stall.
>
> The issue is with the retry logic which schedules the timer if the
> created slots are less than NET_RX_SLOTS_MIN. The count of new request
> slots to be pushed are calculated as a difference between new req_prod
> and rsp_cons which could be more than the actual slots, if there are
> unconsumed responses.
>
> The fix is to calculate the count of newly created slots as the
> difference between new req_prod and old req_prod.
>
> Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com>
> Reviewed-by: Juergen Gross <jgross@suse.com>
> ---
> Changes in v2:
>     - Removed the old implementation of enabling polling on
>       skb allocation error.
>     - Corrected the refill timer logic to schedule when newly
>       created slots since last push is less than NET_RX_SLOTS_MIN.


There are couple of problems with this patch.
1. The 'if' clause now evaluates to true on pretty much every call to 
xennet_alloc_rx_buffers().
2. It tickles a latent bug during resume where the timer triggers before 
we re-connect. The trouble is that we now try to dereference 
queue->rx.sring which is NULL since we disconnect in netfront_resume(). 
(Curiously, I only observe it with 32-bit guests)

I'll send a patch later that will delete the timer since it looks like a 
bug to me in any case but the first problem seems to be more serious 
than the problem that this patch addresses.

-boris

>
>  drivers/net/xen-netfront.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 40f26b6..2c7c29f 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -321,7 +321,7 @@ static void xennet_alloc_rx_buffers(struct
> netfront_queue *queue)
>      queue->rx.req_prod_pvt = req_prod;
>
>      /* Not enough requests? Try again later. */
> -    if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN) {
> +    if (req_prod - queue->rx.sring->req_prod < NET_RX_SLOTS_MIN) {
>          mod_timer(&queue->rx_refill_timer, jiffies + (HZ/10));
>          return;
>      }

  parent reply	other threads:[~2017-01-29 23:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1484771149-12699-1-git-send-email-vineethp@u480fcf3b67f557f68df1.ant.amazon.com>
2017-01-19 16:35 ` [PATCH v2] xen-netfront: Fix Rx stall during network stress and OOM Vineeth Remanan Pillai
2017-01-19 17:11   ` David Miller
2017-01-19 17:17     ` Vineeth Remanan Pillai
2017-01-19 18:10       ` David Miller
2017-01-20 19:09   ` David Miller
2017-01-29 23:09   ` Boris Ostrovsky [this message]
2017-01-30 16:47     ` Vineeth Remanan Pillai
2017-01-30 17:06       ` Boris Ostrovsky
2017-01-30 17:13         ` Vineeth Remanan Pillai
2017-01-31 16:47       ` Vineeth Remanan Pillai
2017-01-12 23:09 [PATCH] " Vineeth Remanan Pillai
     [not found] ` <1484330100-3960-1-git-send-email-vineethp@u480fcf3b67f557f68df1.ant.amazon.com>
2017-01-16  6:24   ` [PATCH v2] " Juergen Gross
2017-01-18 17:02     ` Vineeth Remanan Pillai
2017-01-18 20:08       ` David Miller
2017-01-18 20:10       ` David Miller
2017-01-18 20:24         ` Vineeth Remanan Pillai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=38ccfaea-0a65-a6f3-c19a-e6f9c0d4ef76@oracle.com \
    --to=boris.ostrovsky@oracle.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=paul.durrant@citrix.com \
    --cc=vineethp@amazon.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).