From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeremy Fitzhardinge Subject: Re: RE: Kernel BUG at arch/x86/mm/tlb.c:61 Date: Thu, 28 Apr 2011 16:29:09 -0700 Message-ID: <4DB9F845.6020204@goop.org> References: , , , , , , , <4DA3438A.6070503@goop.org>, , , <20110412100000.GA15647@dumpdata.com>, , , , , <4DA8B715.9080508@goop.org> <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Return-path: In-Reply-To: <625BA99ED14B2D499DC4E29D8138F1505C7F2C5185@shsmsx502.ccr.corp.intel.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: "Tian, Kevin" Cc: MaoXiaoyun , xen devel , "giamteckchoon@gmail.com" , "konrad.wilk@oracle.com" List-Id: xen-devel@lists.xenproject.org On 04/25/2011 10:52 PM, Tian, Kevin wrote: >> From: MaoXiaoyun >> Sent: Monday, April 25, 2011 11:15 AM >>> Date: Fri, 15 Apr 2011 14:22:29 -0700 >>> From: jeremy@goop.org >>> To: tinnycloud@hotmail.com >>> CC: giamteckchoon@gmail.com; xen-devel@lists.xensource.com; konrad.wi= lk@oracle.com >>> Subject: Re: Kernel BUG at arch/x86/mm/tlb.c:61 >>> >>> On 04/15/2011 05:23 AM, MaoXiaoyun wrote: >>>> Hi=EF=BC=9A >>>> >>>> Could the crash related to this patch ? >>>> http://git.kernel.org/?p=3Dlinux/kernel/git/jeremy/xen.git;a=3Dcommi= tdiff;h=3D45bfd7bfc6cf32f8e60bb91b32349f0b5090eea3 >>>> >>>> Since now TLB state change to TLBSTATE_OK(mmu_context.h:40) is befor= e >>>> cpumask_clear_cpu(line 49). >>>> Could it possible that right after execute line 40 of mmu_context.h, >>>> CPU revice IPI from other CPU to >>>> flush the mm, and when in interrupt, find the TLB state happened to = be >>>> TLBSTATE_OK. Which conflicts. >>> Does reverting it help? >>> >>> J >> =20 >> Hi Jeremy: >> =20 >> The lastest test result shows the reverting didn't help. >> Kernel panic exactly at the same place in tlb.c. >> =20 >> I have question about TLB state, from the stack,=20 >> xen_do_hypervisor_callback-> xen_evtchn_do_upcall->... ->drop_othe= r_mm_ref >> =20 >> What cpu_tlbstate.state should be, could TLBSTATE_OK or TLBSTAT= E_LAZY all be possible?=20 >> That is after a hypercall from userspace, state will be TLBSTATE_O= K, and >> if from kernel space, state will be TLBSTATE_LAZE ?=20 >> =20 >> thanks. > it looks a bug in drop_other_mm_ref implementation, that current TLB st= ate should be checked > before invoking leave_mm(). There's a window between below lines of cod= e: > > > /* Get the "official" set of cpus referring to our pagetable. */ > if (!alloc_cpumask_var(&mask, GFP_ATOMIC)) { > for_each_online_cpu(cpu) { > if (!cpumask_test_cpu(cpu, mm_cpumask(mm)) > && per_cpu(xen_current_cr3, cpu) !=3D __pa(= mm->pgd)) > continue; > smp_call_function_single(cpu, drop_other_mm_ref= , mm, 1); > } > return; > } > > there's chance that when smp_call_function_single is invoked, actual TL= B state has been > updated in the other cpu. The upstream kernel patch you referred to ear= lier just makes > this bug exposed more easily. But even without this patch, you may stil= l suffer such issue > which is why reverting the patch doesn't help. > > Could you try adding a check in drop_other_mm_ref? > > if (active_mm =3D=3D mm && percpu_read(cpu_tlbstate.state) !=3D= TLBSTATE_OK) > leave_mm(smp_processor_id()); > > once the interrupted context has TLBSTATE_OK, it implicates that later = it will handle=20 > the TLB flush and thus no need for leave_mm from interrupt handler, and= that's the > assumption of doing leave_mm. That seems reasonable. MaoXiaoyun, does it fix the bug for you? Kevin, could you submit this as a proper patch? Thanks, J