From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [PATCH] xen mmu: fix a race window causing leave_mm BUG()
Date: Wed, 11 May 2011 11:44:21 -0400
Message-ID: <20110511154421.GA16510@dumpdata.com>
References: <625BA99ED14B2D499DC4E29D8138F1505C843BB4B6@shsmsx502.ccr.corp.intel.com>
 <20110510202701.GA18283@dumpdata.com>
 <625BA99ED14B2D499DC4E29D8138F1505C8F009055@shsmsx502.ccr.corp.intel.com>
 <1305107063.26692.378.camel@zakaz.uk.xensource.com>
 <625BA99ED14B2D499DC4E29D8138F1505C8F00952B@shsmsx502.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <625BA99ED14B2D499DC4E29D8138F1505C8F00952B@shsmsx502.ccr.corp.intel.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Tian, Kevin"
Cc: "jeremy@goop.org", xen devel, Ian Campbell, MaoXiaoyun
List-Id: xen-devel@lists.xenproject.org

On Wed, May 11, 2011 at 08:34:46PM +0800, Tian, Kevin wrote:
> > From: Ian Campbell [mailto:Ian.Campbell@citrix.com]
> > Sent: Wednesday, May 11, 2011 5:44 PM
> >
> > On Wed, 2011-05-11 at 02:20 +0100, Tian, Kevin wrote:
> > > > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> > > > Sent: Wednesday, May 11, 2011 4:27 AM
> > > >
> > > > On Fri, Apr 29, 2011 at 12:10:57PM +0800, Tian, Kevin wrote:
> > > > > xen mmu: fix a race window causing leave_mm BUG()
> > > >
> > > > I've this in mailbox and I am wondering whether this still an issue
> > > > with the 2.6.39 type kernels?
> > > > How do you reproduce the failure? When using LVM?
> > >
> > > this issue is reported by Xiaoyun when he did extensive test which
> > > happened occasionally after dozen of hours running. From the
> > > phenomenon and info provided by Xiaoyun, I found this potential race
> > > window and Xiaoyun has verified this patch solving his stability issue.
> > >
> > > the original thread is at:
> > > http://lists.xensource.com/archives/html/xen-devel/2011-04/msg01186.html
> > >
> > > his kernel is based on 2.6.38, and I checked latest 2.6.39 from your
> > > maintained repo, and same issue still exists.
> > >
> > > btw, I didn't reproduce it myself, and not sure whether Xiaoyun uses
> > > LVM. But I think it has nothing to do with storage type, and a pure mmu
> > > design issue.
> >
> > Is there a specific stack trace (or two) which is associated with this bug? I'm
> > wondering if http://bugs.debian.org/613073 might be the same thing...
> >
>
> If you look into above thread:
>
> http://lists.xensource.com/archives/html/xen-devel/2011-04/msg00657.html
>
> [] drop_other_mm_ref+0x2a/0x53
> [] generic_smp_call_function_single_interrupt+0xd8/0xfc
> [] xen_call_function_single_interrupt+0x13/0x28
> [] handle_IRQ_event+0x66/0x120
> [] handle_percpu_irq+0x41/0x6e
> [] __xen_evtchn_do_upcall+0x1ab/0x27d
> [] xen_evtchn_do_upcall+0x33/0x46
> [] xen_do_hypervisor_callback+0x1e/0x30

Can you resend the patch to me, based on top of v2.6.39-rc7, with the above
stack dump? And please resend it as an attachment. Your mailer mangles the patch.