From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sander Eikelenboom
Subject: Re: Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
Date: Tue, 11 Mar 2014 14:00:41 +0100
Message-ID: <1068180642.20140311140041@eikelenboom.it>
References: <529743590.20140227154351@eikelenboom.it>
 <20140227151538.GG16241@zion.uk.xensource.com>
 <1982379440.20140227162655@eikelenboom.it>
 <20140227155726.GI16241@zion.uk.xensource.com>
 <716618617.20140307113321@eikelenboom.it>
 <20140307111929.GL19620@zion.uk.xensource.com>
 <1554992598.20140307125518@eikelenboom.it>
 <9610144106.20140311000026@eikelenboom.it>
 <20140311101948.GX19620@zion.uk.xensource.com>
 <191977479.20140311133142@eikelenboom.it>
 <20140311123807.GB19620@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20140311123807.GB19620@zion.uk.xensource.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu
Cc: annie li , Paul Durrant , Zoltan Kiss , xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

Tuesday, March 11, 2014, 1:38:07 PM, you wrote:

> On Tue, Mar 11, 2014 at 01:31:42PM +0100, Sander Eikelenboom wrote:
>>
>> Tuesday, March 11, 2014, 11:19:48 AM, you wrote:
>>
>> > On Tue, Mar 11, 2014 at 12:00:26AM +0100, Sander Eikelenboom wrote:
>> > [...]
>> >>
>> >> >> Wei.
>> >>
>> >> Hi Paul,
>> >>
>> >> It seems a commit by you: "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c xen-netback: improve guest-receive-side flow control"
>> >> is the first that gives the Bad grant references.
>> >> It seems later patches partly prevent or mask the issue, so it is less easy to trigger it.
>> >> With only this commit applied i can trigger it quite fast.
>> >>
>> >> This is the result of:
>> >> - First testing a baseline that worked o.k. for several days (3.13.6 for both dom0 and domU)
>> >> - Testing domU 3.14-rc5 and dom0 3.13.6, this worked ok.
>> >> - Testing dom0 3.14-rc5 and domU 3.13.6, this failed.
>> >> - After that took 3.13.6 as base and first applied all the general xen related patches for the dom0 kernel, that works ok.
>> >> - After that started to apply the netback changes for 3.14 and that failed after the commit stated above.
>> >>
>> >> So i'm quite confident i'm reporting the right thing now :-)
>> >> If you would like me to run debug patches on top of this commit, don't hesitate to send them !
>> >>
>>
>> > Hmm.... I just looked at the commit, something that's obvious wrong is
>> > the use of gso_type to determine whether an extra slot is required,
>> > which is fixed by Annie yesterday. Annie fixed that for netback.c but
>> > missed interface.c.
>>
>> > This can probably fix the problem you're seeing. I will submit a proper
>> > patch if you confirm that...
>>
>> Although the patch seems correct in it's own right .. it doesn't seem to fix

> I will submit that patch anyway...

>> the issue when using 3.13.6 as a base and ..
>> - pull all 3.14 patches from the git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git tree
>> - apply paul's commit "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c xen-netback: improve guest-receive-side flow control"
>> - applying annie's v2 patch
>> - applying your patch
>> as dom0 and using a 3.14-rc5 as domU kernel.
>>
>> Unfortunately i'm still getting the Bad grant references ..
>>

> :-( That's bad news.

> I guess you always have the same DomU kernel when testing? That means we
> can narrow down the bug to netback only.

Yes, my previous tests (from my previous mail):
- First testing a baseline that worked o.k. for several days (3.13.6 for both dom0 and domU)
- Testing domU 3.14-rc5 and dom0 3.13.6, this worked ok.
- Testing dom0 3.14-rc5 and domU 3.13.6, this failed.
- After that took 3.13.6 as base and first applied all the general xen related patches for the dom0 kernel, that works ok.
- After that started to apply the netback changes for 3.14 and that failed after the commit "ca2f09f2b2c6c25047cfc545d057c4edfcfe561c xen-netback: improve guest-receive-side flow control".

also seem to indicate just that, although it could also be something in this netback commit that triggers a latent bug in netfront, can't rule that one out completely.

But the trigger is in that commit && annie's and your patch seem to have no effect at all (on this issue) && later commits in 3.14 do seem to mask it / make it less likely to trigger, but do not fix it.

> Paul, do you have any idea what might go wrong?

> Wei.
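
For readers following the thread, the narrowing-down described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, under stated assumptions: the names `RESULTS`, `implicated_side` and `first_bad_patch` and the data layout are hypothetical and not part of any Xen or kernel tooling. The pass/fail values are the ones listed in the test summary above.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# (dom0 kernel, domU kernel) -> True if the run was OK, False if it showed
# "Bad grant references". Values are the three combinations listed in the mail.
RESULTS: Dict[Tuple[str, str], bool] = {
    ("3.13.6", "3.13.6"): True,     # baseline, ran OK for several days
    ("3.13.6", "3.14-rc5"): True,   # only domU upgraded: OK
    ("3.14-rc5", "3.13.6"): False,  # only dom0 upgraded: failed
}


def implicated_side(results: Dict[Tuple[str, str], bool],
                    good: str, new: str) -> str:
    """Say which side's upgrade correlates with the failure.

    This only shows where the trigger is; a dom0-side trigger could still
    be exposing a latent bug on the domU (netfront) side.
    """
    dom0_breaks = not results[(new, good)]
    domu_breaks = not results[(good, new)]
    if dom0_breaks and domu_breaks:
        return "both"
    if dom0_breaks:
        return "dom0"
    if domu_breaks:
        return "domU"
    return "neither"


def first_bad_patch(patches: Sequence[str],
                    is_ok: Callable[[List[str]], bool]) -> Optional[str]:
    """Apply patches one by one on a known-good base and return the first
    patch after which the test fails, or None if every step passes."""
    applied: List[str] = []
    for patch in patches:
        applied.append(patch)
        if not is_ok(applied):
            return patch
    return None
```

Here `is_ok` stands in for "build this dom0 kernel, boot it and run the network load", which is the slow manual step. The linear walk mirrors applying the netback changes one at a time on top of 3.13.6, rather than a `git bisect` over the full history.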