From: Boris Ostrovsky <boris.ostrovsky@oracle.com> To: Vineeth Remanan Pillai <vineethp@amazon.com>, David Miller <davem@davemloft.net>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Wei Liu <wei.liu2@citrix.com>, Paul Durrant <paul.durrant@citrix.com>, xen-devel <xen-devel@lists.xen.org> Subject: Re: [PATCH v2] xen-netfront: Fix Rx stall during network stress and OOM Date: Sun, 29 Jan 2017 18:09:21 -0500 [thread overview] Message-ID: <38ccfaea-0a65-a6f3-c19a-e6f9c0d4ef76@oracle.com> (raw) In-Reply-To: <66b10c64-936a-8001-6855-2ff1ed626642@amazon.com> On 01/19/2017 11:35 AM, Vineeth Remanan Pillai wrote: > From: Vineeth Remanan Pillai <vineethp@amazon.com> > > During an OOM scenario, request slots could not be created as skb > allocation fails. So the netback cannot pass in packets and netfront > wrongly assumes that there is no more work to be done and it disables > polling. This causes Rx to stall. > > The issue is with the retry logic which schedules the timer if the > created slots are less than NET_RX_SLOTS_MIN. The count of new request > slots to be pushed are calculated as a difference between new req_prod > and rsp_cons which could be more than the actual slots, if there are > unconsumed responses. > > The fix is to calculate the count of newly created slots as the > difference between new req_prod and old req_prod. > > Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com> > Reviewed-by: Juergen Gross <jgross@suse.com> > --- > Changes in v2: > - Removed the old implementation of enabling polling on > skb allocation error. > - Corrected the refill timer logic to schedule when newly > created slots since last push is less than NET_RX_SLOTS_MIN. There are couple of problems with this patch. 1. The 'if' clause now evaluates to true on pretty much every call to xennet_alloc_rx_buffers(). 2. It tickles a latent bug during resume where the timer triggers before we re-connect. The trouble is that we now try to dereference queue->rx.sring which is NULL since we disconnect in netfront_resume(). (Curiously, I only observe it with 32-bit guests) I'll send a patch later that will delete the timer since it looks like a bug to me in any case but the first problem seems to be more serious than the problem that this patch addresses. -boris > > drivers/net/xen-netfront.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index 40f26b6..2c7c29f 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -321,7 +321,7 @@ static void xennet_alloc_rx_buffers(struct > netfront_queue *queue) > queue->rx.req_prod_pvt = req_prod; > > /* Not enough requests? Try again later. */ > - if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN) { > + if (req_prod - queue->rx.sring->req_prod < NET_RX_SLOTS_MIN) { > mod_timer(&queue->rx_refill_timer, jiffies + (HZ/10)); > return; > }
WARNING: multiple messages have this Message-ID (diff)
From: Boris Ostrovsky <boris.ostrovsky@oracle.com> To: Vineeth Remanan Pillai <vineethp@amazon.com>, David Miller <davem@davemloft.net>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Wei Liu <wei.liu2@citrix.com>, Paul Durrant <paul.durrant@citrix.com>, xen-devel <xen-devel@lists.xen.org> Subject: Re: [PATCH v2] xen-netfront: Fix Rx stall during network stress and OOM Date: Sun, 29 Jan 2017 18:09:21 -0500 [thread overview] Message-ID: <38ccfaea-0a65-a6f3-c19a-e6f9c0d4ef76@oracle.com> (raw) In-Reply-To: <66b10c64-936a-8001-6855-2ff1ed626642@amazon.com> On 01/19/2017 11:35 AM, Vineeth Remanan Pillai wrote: > From: Vineeth Remanan Pillai <vineethp@amazon.com> > > During an OOM scenario, request slots could not be created as skb > allocation fails. So the netback cannot pass in packets and netfront > wrongly assumes that there is no more work to be done and it disables > polling. This causes Rx to stall. > > The issue is with the retry logic which schedules the timer if the > created slots are less than NET_RX_SLOTS_MIN. The count of new request > slots to be pushed are calculated as a difference between new req_prod > and rsp_cons which could be more than the actual slots, if there are > unconsumed responses. > > The fix is to calculate the count of newly created slots as the > difference between new req_prod and old req_prod. > > Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com> > Reviewed-by: Juergen Gross <jgross@suse.com> > --- > Changes in v2: > - Removed the old implementation of enabling polling on > skb allocation error. > - Corrected the refill timer logic to schedule when newly > created slots since last push is less than NET_RX_SLOTS_MIN. There are couple of problems with this patch. 1. The 'if' clause now evaluates to true on pretty much every call to xennet_alloc_rx_buffers(). 2. It tickles a latent bug during resume where the timer triggers before we re-connect. The trouble is that we now try to dereference queue->rx.sring which is NULL since we disconnect in netfront_resume(). (Curiously, I only observe it with 32-bit guests) I'll send a patch later that will delete the timer since it looks like a bug to me in any case but the first problem seems to be more serious than the problem that this patch addresses. -boris > > drivers/net/xen-netfront.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c > index 40f26b6..2c7c29f 100644 > --- a/drivers/net/xen-netfront.c > +++ b/drivers/net/xen-netfront.c > @@ -321,7 +321,7 @@ static void xennet_alloc_rx_buffers(struct > netfront_queue *queue) > queue->rx.req_prod_pvt = req_prod; > > /* Not enough requests? Try again later. */ > - if (req_prod - queue->rx.rsp_cons < NET_RX_SLOTS_MIN) { > + if (req_prod - queue->rx.sring->req_prod < NET_RX_SLOTS_MIN) { > mod_timer(&queue->rx_refill_timer, jiffies + (HZ/10)); > return; > } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
next prev parent reply other threads:[~2017-01-29 23:11 UTC|newest] Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-01-18 20:25 [PATCH v2] xen-netfront: Fix Rx stall during network stress and OOM Remanan Pillai 2017-01-18 20:25 ` Remanan Pillai 2017-01-19 16:35 ` Vineeth Remanan Pillai 2017-01-19 17:11 ` David Miller 2017-01-19 17:17 ` Vineeth Remanan Pillai 2017-01-19 18:10 ` David Miller 2017-01-20 19:09 ` David Miller 2017-01-29 23:09 ` Boris Ostrovsky [this message] 2017-01-29 23:09 ` Boris Ostrovsky 2017-01-30 16:47 ` Vineeth Remanan Pillai 2017-01-30 16:47 ` Vineeth Remanan Pillai 2017-01-30 16:47 ` Vineeth Remanan Pillai 2017-01-30 17:06 ` Boris Ostrovsky 2017-01-30 17:06 ` Boris Ostrovsky 2017-01-30 17:13 ` Vineeth Remanan Pillai 2017-01-30 17:13 ` Vineeth Remanan Pillai 2017-01-31 16:47 ` Vineeth Remanan Pillai 2017-01-31 16:47 ` Vineeth Remanan Pillai -- strict thread matches above, loose matches on Subject: below -- 2017-01-12 23:09 [PATCH] " Vineeth Remanan Pillai 2017-01-13 17:55 ` [PATCH v2] " Remanan Pillai 2017-01-13 17:55 ` Remanan Pillai 2017-01-16 6:24 ` Juergen Gross 2017-01-16 6:24 ` Juergen Gross 2017-01-18 17:02 ` Vineeth Remanan Pillai 2017-01-18 17:02 ` Vineeth Remanan Pillai 2017-01-18 17:02 ` Vineeth Remanan Pillai 2017-01-18 20:08 ` David Miller 2017-01-18 20:08 ` David Miller 2017-01-18 20:10 ` David Miller 2017-01-18 20:10 ` David Miller 2017-01-18 20:24 ` Vineeth Remanan Pillai 2017-01-18 20:24 ` Vineeth Remanan Pillai
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=38ccfaea-0a65-a6f3-c19a-e6f9c0d4ef76@oracle.com \ --to=boris.ostrovsky@oracle.com \ --cc=davem@davemloft.net \ --cc=linux-kernel@vger.kernel.org \ --cc=netdev@vger.kernel.org \ --cc=paul.durrant@citrix.com \ --cc=vineethp@amazon.com \ --cc=wei.liu2@citrix.com \ --cc=xen-devel@lists.xen.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.