From: Vegard Nossum
To: "Ted Ts'o", "Rafael J. Wysocki", Linux Kernel Mailing List, Kernel Testers List, Maciej Rutecki, Florian Mickler, Christian Casteyde
Cc: Frederic Weisbecker, Ingo Molnar, Mathieu Desnoyers
Subject: Re: [Bug #17361] Watchdog detected hard LOCKUP in jbd2_journal_get_write_access
Date: Sun, 3 Oct 2010 13:21:03 +0200
In-Reply-To: <20101002165215.GK21129@thunk.org>
References: <20101002165215.GK21129@thunk.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2 October 2010 18:52, Ted Ts'o wrote:
> On Sun, Sep 26, 2010 at 10:04:13PM +0200, Rafael J. Wysocki wrote:
>>
>> Bug-Entry     : http://bugzilla.kernel.org/show_bug.cgi?id=17361
>> Subject       : Watchdog detected hard LOCKUP in jbd2_journal_get_write_access
>> Submitter     : Christian Casteyde
>> Date          : 2010-08-29 19:59 (29 days old)
>
> See my latest comment here:
>
>    https://bugzilla.kernel.org/show_bug.cgi?id=17361#c14
>
> This subject line is highly misleading, since after -rc4, the stack
> traces are in places all over the kernel, in other places other than
> ext4/jbd2.  So I fear no one is looking at this bug report given the
> highly misleading subject line.
>
> It looks like you have spinlock debugging, and yet there wan't any
> spinlocks listed on the initial ext4 might_sleep() warning.  So
> something looks highly confused.
>
> The fact that you closed other bugs as duplicates of this one that
> relate to kmemcheck makes me wonder if this is really a kmemcheck bug.
> (If so, the subject line here is doubly, doubly misleading.)
>
> Do you see any symptoms if you turn off kmemcheck?  Are you sure this
> isn't just only a kmemcheck bug?

I just had a quick glance at the report, and here's my gut feeling:

I see perf symbols in the stack trace. I don't think kmemcheck and
perf play nicely together (for example if perf uses NMIs to write data
to its buffers, it could get a page fault inside the NMI handler,
which is not so nice, I think). Isn't this exactly what Frederic
Weisbecker tried to detect and warn about in a patch that I saw
recently?

Please do as Ted suggested and try to turn kmemcheck off.

Vegard