From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751914AbWAEDwr (ORCPT ); Wed, 4 Jan 2006 22:52:47 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751916AbWAEDwr (ORCPT ); Wed, 4 Jan 2006 22:52:47 -0500 Received: from smtp-7.smtp.ucla.edu ([169.232.46.137]:65174 "EHLO smtp-7.smtp.ucla.edu") by vger.kernel.org with ESMTP id S1751914AbWAEDwq (ORCPT ); Wed, 4 Jan 2006 22:52:46 -0500 Date: Wed, 4 Jan 2006 19:52:36 -0800 (PST) From: Chris Stromsoe To: Willy Tarreau cc: Alan Cox , Marcelo Tosatti , linux-kernel@vger.kernel.org Subject: Re: bad pmd filemap.c, oops; 2.4.30 and 2.4.32 In-Reply-To: <20051231130151.GA15993@alpha.home.local> Message-ID: References: <20051228001047.GA3607@dmt.cnet> <1136030901.28365.51.camel@localhost.localdomain> <20051231130151.GA15993@alpha.home.local> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Probable-Spam: no X-Spam-Report: none Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Sat, 31 Dec 2005, Willy Tarreau wrote: > On Sat, Dec 31, 2005 at 12:08:21PM +0000, Alan Cox wrote: >> On Gwe, 2005-12-30 at 17:48 -0800, Chris Stromsoe wrote: >>> scsi0:0:0:0: Attempting to queue an ABORT message CDB: 0x12 0x0 0x0 >>> 0x0 0xff 0x0 scsi0:0:0:0: Command already completed aic7xxx_abort >>> returns 0x2002 >> >> IRQ routing by the look of that trace. Make sure that if you are using >> 2.4.x you have ACPI disabled and see it looks any better > > Correct, and I came to the same conclusion ; Chris told us he booted > with the "nosmp" option. I've checked his config, and he has > CONFIG_ACPI_BOOT=y. I've just tried the same here, and I confirm that my > machine (dual athlon) does not boot with "nosmp" unless I also add > "acpi=off". Mine even stops ealier, while scanning IDE devices. 2.6.14.4 has been running stable for 4 days. For the long term, I'll probably migrate the box to 2.6 and leave it there. > So now we're back to the original problem, i.e. why does he get bad pmd > that often on 2.4. It leaves us with the following possible next steps > after the problem occurs again (if it still happens with 2.6.14 or if > Chris is OK for a few more tests) : > - 2.4.32 nosmp acpi=off => the easiest one > - 2.4.32 + aic7xxx+20040522 => the more interesting one I booted 2.4.32 with the aic7xxx patch you pointed me at last week. It's been up for a few hours. I'll let it run for at least a week or two and will report back positive or negative results. After that, I'll try 2.4.32 with nosmp and acpi=off. -Chris