Date: Thu, 16 Mar 2017 20:06:29 +0100
From: Ralf Baechle
To: Joshua Kinard
Cc: linux-mips@linux-mips.org
Subject: Re: ARCS can't load CONFIG_DEBUG_LOCK_ALLOC kernel
Message-ID: <20170316190629.GP5512@linux-mips.org>
References: <8b2d7473-ba4d-f2c9-27e7-b1a30b95c4f8@gentoo.org> <20170316140918.GH5512@linux-mips.org> <86da6dd2-7b02-cd0d-f152-00dfb134a3ec@gentoo.org>
In-Reply-To: <86da6dd2-7b02-cd0d-f152-00dfb134a3ec@gentoo.org>

On Thu, Mar 16, 2017 at 01:50:42PM -0400, Joshua Kinard wrote:

> On 03/16/2017 10:09, Ralf Baechle wrote:
> > On Wed, Mar 15, 2017 at 11:50:44PM -0400, Joshua Kinard wrote:
> >
> >> On 03/15/2017 16:11, Joshua Kinard wrote:
> >>> I've reported in the past that turning on CONFIG_DEBUG_LOCK_ALLOC produces a
> >>> kernel that can't boot on several SGI platforms. It turns out that using
> >>> arcload (Stan's bootloader originally written for IP30), I can get some
> >>> debugging out on why. I am still puzzled, but maybe this information can be
> >>> interpreted by someone else into something meaningful?
> >>>
> >>> All addresses printed out of arcload are physical address.
> >>>
> >>> ARCS Memory Map as printed by some debugging I added to the arcload binary:
> >>>
> >>> 0x00000000 - 0x00001000 ExceptionBlock
> >>> 0x00001000 - 0x00002000 SystemParameterBlock
> >>> 0x00002000 - 0x00004000 FirmwarePermanent
> >>> 0x20004000 - 0x20f00000 FreeMemory***
> >>> 0x20f00000 - 0x21000000 FirmwareTemporary
> >>> 0x21000000 - 0x5fff0000 FreeMemory
> >>> 0x5fff0000 - 0x5ffff000 LoadedProgram
> >>> 0x5ffff000 - 0x60000000 FreeMemory
> >>> 0x60000000 - 0xa0000000 FirmwarePermanent
> >>
> >> So it turns out I can get away, on Octane at least, by changing the load
> >> address from 0x20004000 to an arbitrary value in the other FreeMemory segment
> >> from 0x21000000 - 0x5fff0000. Specifically, using 0x21004000 appears to work
> >> without any ill effects.
> >>
> >> The 0x20004000 value is the address used by IRIX to load (with symon, it
> >> becomes 0x200800000 instead). I'll have to try this on the IP27 later on as
> >> well. On Octane, CONFIG_DEBUG_LOCK_ALLOC didn't toss up any major locking
> >> issues yet. Probably need to hammer the disks with bonnie++ or such. At least
> >> I can get back to the BRIDGE/PCI mess now...
> >
> > I'm wondering where the ARC stack is on kernel entry if maybe the
> > ARC stack has corrupted the kernel? If possible, can you get your
> > kernel or a test program to compute a checksum over itself to see
> > if it has been corrupted?
>
> As far as I can tell, it really does seem that it is a sizing issue. I don't
> have the time to dive into what CONFIG_DEBUG_LOCK_ALLOC is exactly doing, but I
> found one hit on LKML (lost the URL) that indicates it fluffs up a particular
> struct that is very common and so introduces a fair bit of bloat, and it seems
> possible that the 0x20004000-0x20f00000 really is too small. I wouldn't rule
> out the possibility that SGI designed ARCS on the Octane to allow only IRIX to
> load at this particular address and Linux has just gotten lucky thus far.
>
> As for whether loading at the next FreeMemory segment in 0x21000000-0x5fff0000
> smashes any ARCS segments, that I am not sure about. A kernel booting in that
> segment does boot, and seems to behave no differently than a kernel booting in
> the other segment, including exhibiting the same bugs. Like IP27, Octane
> doesn't have a need for ARCS after the kernel boots, as resetting the system
> can be done by flipping a bit in HEART, and power down is handled by the RTC
> driver (this feature broke, though, and I haven't chased down why yet). So if
> we're clobbering ARCS using this load address...well, it can't be all that bad
>
>
> I'll see what IP27 does, assuming it even has a large enough FreeMemory segment
> to work with.
>
>
> > Let me repeat my ARC(S) mantra again, ARC(S) is broken, ARC(S) lies.
> > Trust is futile. Even if ARC(S) claims something is free I'd rather
> > not rely on it.
>
> Apparently, and only on Octane, ARCS detects and maps out only the first 1GB of
> RAM. All remaining RAM installed in the system is marked as FirmwarePermanent
> and mapped into 0x60000000 on up.

I think on IP27 it was only the first 32MB that are somehow used by ARCS.
Everything else is entirely ignored and the OS is supposed to use klconfig
to query the hardware configuration.

That said, klconfig is infinitely better than ARCS, it actually works
and is easy to use. What it does not provide is information on how firmware
or other loaded programs are using memory - it's really just a hardware
inventory.

  Ralf
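
The sizing theory discussed above can be checked with a little arithmetic on the quoted memory map. The following is only a minimal sketch and not something posted in the thread: the helper names `room_at` and `fits` are made up for illustration, the table is copied by hand from the arcload debug output, and the 16 MiB image size in the example is an arbitrary placeholder, since no actual kernel size was given.

```python
# Memory map copied from the arcload debug output quoted in this message.
# Physical addresses; each end address is treated as exclusive (assumption).
ARCS_MAP = [
    (0x00000000, 0x00001000, "ExceptionBlock"),
    (0x00001000, 0x00002000, "SystemParameterBlock"),
    (0x00002000, 0x00004000, "FirmwarePermanent"),
    (0x20004000, 0x20F00000, "FreeMemory"),
    (0x20F00000, 0x21000000, "FirmwareTemporary"),
    (0x21000000, 0x5FFF0000, "FreeMemory"),
    (0x5FFF0000, 0x5FFFF000, "LoadedProgram"),
    (0x5FFFF000, 0x60000000, "FreeMemory"),
    (0x60000000, 0xA0000000, "FirmwarePermanent"),
]


def room_at(load_addr, memory_map=ARCS_MAP):
    """Return the bytes between load_addr and the end of the FreeMemory
    segment that contains it, or 0 if load_addr is not in FreeMemory."""
    for start, end, kind in memory_map:
        if kind == "FreeMemory" and start <= load_addr < end:
            return end - load_addr
    return 0


def fits(load_addr, image_size, memory_map=ARCS_MAP):
    """True if an image of image_size bytes loaded at load_addr stays
    inside a single FreeMemory segment (hypothetical helper)."""
    return 0 < image_size <= room_at(load_addr, memory_map)


if __name__ == "__main__":
    hypothetical_image = 16 * 2**20  # placeholder size, not a measured kernel
    for addr in (0x20004000, 0x21004000):
        print(f"{addr:#010x}: {room_at(addr)} bytes free, "
              f"16 MiB image fits: {fits(addr, hypothetical_image)}")
```

By this arithmetic the segment starting at 0x20004000 offers 0xefc000 bytes, just under 15 MiB, so any image larger than that would run into FirmwareTemporary. This only tests the map as ARCS reports it; it says nothing about whether ARCS really leaves that memory alone.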